Building a Twitter Reader
Using pyTwerp and TWiki
DISCLAIMER - Before you read this... If you want to implement what I have here, you will need a computer capable of running Python, Perl, and the Unix cron command.
Capable systems include Linux, BSD, Mac OS X, and Windows (if properly configured, probably with cygwin installed). For any other system, your mileage will vary widely. I have no idea if Windows has something approximating cron.
I use Mac OS X, which is based on BSD Unix. I cannot help you with any other system!
Background
One of the features of Twitter is that it runs 24/7/365. (Un)fortunately, I don't. So, I miss things. I didn't want to miss things, so I looked for a solution.
Being a programmer myself, I wanted a solution I could control and tweak if necessary. However, I didn't want to write something from scratch if I didn't have to!
Twitter has a popular, published API, so I figured someone would have written what I wanted. Someone did. I found pyTwerp (written in Python). *
From the pyTwerp documentation:
jdhore on the #twitter channel (irc.wyldryde.org) was talking about the lack of a simple linux command line utility to post Twitter updates so I asked him what features he wanted and created pyTwerp.
...
The whole concept of Twerp is to allow your Twitter data stream to be pulled from Twitter, formatted via a template you can control and then output to the console. It also allows you to post a status message or send a direct message.
That's it in a nutshell.
You can access the following data streams on Twitter:
- Friends Timeline
- Your Timeline
- Your Replies
- Direct Messages sent to you
I'm not a Python programmer, but pyTwerp looked like it does pretty much everything I need, out-of-the-box. It turns out that this is exactly the case. In fact, pyTwerp does some things I didn't even know I wanted until after I had it!
The Front End
I could easily read the pyTwerp output files with any text editor, or as plain .txt files in my web browser. But I wanted something that looked a little nicer. Enter TWiki.
TWiki is a structured enterprise wiki with a lot of programmable features. By using TWiki, I don't need to convert the output to html in order to read it on the web and I get the following useful features:
- Blank lines are retained as "paragraph" breaks.
- Anything that begins with http:// is rendered as a clickable link.
- *word* is converted to bold; _word_ is converted to italic.
- I can "script" the results, using TWiki to hide everything I'm not currently reading, or only show me this month's output.
- I can easily adjust the look and feel with CSS.
How I Built My Twitter Reader
Steps
- Install and Configure pyTwerp
- Get the Tweets
- Plan For TWiki
- Wrap it up (Packaging)
Install and Configure pyTwerp
Start by downloading pyTwerp from code.google.com/p/pytwerp/. You will also need two required libraries:
- setuptools http://cheeseshop.python.org/pypi/setuptools
- simplejson http://cheeseshop.python.org/pypi/simplejson
Installation
- setuptools is easy to install. Follow the directions on the download page.
- simplejson doesn't come with instructions. Run
easy_install simplejson
- Install pyTwerp by running
python setup.py install
This will install the pyTwerp library into your Python's site-packages location. A utility script named twerp will be installed in /usr/local/bin (by default); this lets you invoke pyTwerp using twerp <options>.
Configuration
The default configuration file ~/.twerp.cfg is created and populated the first time you run twerp. Configure twerp for your Twitter account by running:
twerp -U twitter-username -P twitter-password
You'll only need to do this once.
Get the Tweets
First, I wrote a small shell script I named gotwerp. This does some housekeeping tasks and runs twerp. (Full gotwerp code can be seen at the end of this article.)
/usr/local/bin/twerp -f ...
... >> TwitterLog${DATE}.txt
gotwerp creates files with date-stamped names, for example: TwitterLog2008Jun26.txt. Every day at midnight, a new file is created.
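A minimal sketch of how such a date-stamped name could be built in a shell script, assuming the date format %Y%b%d. The function name log_name is hypothetical, and the real gotwerp may well do this differently.

```shell
#!/bin/sh
# Hypothetical helper: build today's log file name, e.g. TwitterLog2008Jun26.txt.
# LC_ALL=C keeps the month abbreviation in English regardless of locale.
log_name() {
    printf 'TwitterLog%s.txt\n' "$(LC_ALL=C date +%Y%b%d)"
}

# Appending with >> creates the file on the first run after midnight
# and extends it on every later run that day.
LOGFILE=$(log_name)
echo "would append to: $LOGFILE"
```

Because the name is recomputed on every run, no separate "start a new file" step is needed at midnight.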
I configured cron to run gotwerp every 10 minutes, appending to my Log file every time it runs.
0,10,20,30,40,50 * * * * $HOME/bin/gotwerp
I also configured gotwerp to time-stamp the output every 30 minutes.
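One way the half-hourly time stamp could be done, sketched under the assumption that the stamp is the current time as HHMM and that it should only appear on the runs cron starts at minute 00 or 30. The function name stamp_if_due is made up for illustration and is not taken from gotwerp.

```shell
#!/bin/sh
# Hypothetical helper: print an HHMM stamp only when the minute is 00 or 30.
# $1 is the time as HHMM; it defaults to the current time.
stamp_if_due() {
    now=${1:-$(date +%H%M)}
    case "$now" in
        ??00|??30) printf '%s\n' "$now" ;;
    esac
}

# On the 10-minute cron schedule this prints a stamp on two runs per hour.
stamp_if_due
```

Calling stamp_if_due 0030 prints 0030, while stamp_if_due 0010 prints nothing.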
Output file format
Using the pyTwerp defaults, the output (TwitterLog) files are formatted like this:
0000
mdy: Devoting a hot and humid afternoon to home and electrical repairs.
dlpasco: George Carlin - Jammin' in New York Still totally brilliant. ...
megfowler: just scratched my own face with a piece of fruit. i am epic.
vdichev: I'm amused... NOT. I'm annoyed that I cannot login to ...
MaryHodder: GirlGeekRevolution tomorrow night at Sugar Cafe/SF 6-9pm...
0030
Suw: Oh! Email says I've won 500k from Google UK to ...
al3x: Missed textures.
Reversing The Order
perl -e 'print reverse <>'
Now each chunk of output is internally ordered by the time of posting. (Actually, they're ordered by the time each posting reached Twitter, but that's close enough.)
As of pyTwerp 0.4 this is no longer necessary.
TWikification of Output
Knowing I would be using TWiki, I made a few tweaks to add a little bit of TWiki markup code. pyTwerp has a -T template option that provides more control of the output format. So, I changed the template like this:
TWiki uses __ to signify bold italic type. I have defined Twitter: as shorthand using the TWiki Interwiki Plugin. This will cause [[Twitter:al3x]] to expand to http://twitter.com/al3x when viewed in TWiki.
The result (in the TwitterLog file):
__al3x__: Missed textures. [[Twitter:al3x/statuses/843948918][view]]
Viewed in TWiki, I'll see something like this:
al3x: Missed textures. view
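To make the structure of that line explicit, here is a sketch that assembles it from its three parts: user name, tweet text, and status id. This is plain shell printf, not pyTwerp's template syntax, and the function name twiki_line is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: format one tweet as a TWiki line.
#   $1 = user name, $2 = tweet text, $3 = status id
# __name__ renders as bold italic; [[Twitter:...][view]] relies on the
# Twitter: Interwiki shorthand being defined in the wiki.
twiki_line() {
    printf '__%s__: %s [[Twitter:%s/statuses/%s][view]]\n' "$1" "$2" "$1" "$3"
}

twiki_line al3x 'Missed textures.' 843948918
```

This prints the same line shown in the TwitterLog example above.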
(Note: I'm a bit embarrassed to admit that I had totally missed the Template feature of pyTwerp until I had been running my Twitter reader for a few weeks! My earlier output simply set the "view" link to the person's Twitter page, not directly to the tweet in question. Duh.)
I can also (and actually have) put the template string into my .twerprc file. However, for purposes of this article, we'll pretend it's still in gotwerp.
Prettification With CSS
Now I wanted to be even trickier. Under normal circumstances, TWiki would merge and wrap lines that aren't otherwise separated by a blank line or an explicit HTML <br> tag. I'd get
0030 Suw: Oh! Email says I've won ... view al3x: Missed textures. view
unless I use <pre> to preformat the text. And if I used <pre>, I'd run into other constraints: fixed-width fonts and lines approximately 140 characters long (no wrapping).
I did a little investigating and found the CSS white-space: pre-wrap; directive. This is a relatively new directive. It's not yet supported by all browsers (but there are workarounds for those).
pre-wrap is supported in Firefox 3 (and a variant is available for pre-Firefox-3 Mozilla browsers). Personally, that's all I care about. If you use a different browser, check this workaround or do a web search for "white-space pre-wrap".
As long as I was including CSS, I made a few tweaks to the look of the output, increasing the font size slightly, highlighting italic (<em>) in green...
Here's what a Twitter log file looks like when viewed in TWiki:
Additional Tweaks...
Just when you think everything is working, someone drops a monkeywrench into the soup...
In the first case, one of the people I follow has been reworking his webpage with CSS. He's twittering about it:
... Experimenting by deleting and moving and renaming random <div> tags until something happens...
Oops. TWiki processed that <div>. That is, it tried to. (Without a matching </div>, the results were "unexpected".)
So gotwerp now includes one more filter:
s/</&lt;/g
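The idea of the filter is to turn every < into the HTML entity &lt; so that TWiki displays it as text instead of trying to process a tag. A minimal sketch using sed, which is an assumption on my part since the article does not say which tool runs the substitution; the function name escape_lt is hypothetical.

```shell
#!/bin/sh
# Replace every "<" with the entity "&lt;" so TWiki shows it literally.
# In sed the replacement "&" means "the matched text", so it is escaped as "\&".
# (In Perl, s/</&lt;/g works without the backslash.)
escape_lt() {
    sed 's/</\&lt;/g'
}

printf '%s\n' 'renaming random <div> tags' | escape_lt
```

The sample line comes out as "renaming random &lt;div> tags". The > is left alone, which is enough to stop the text from being read as a tag.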
Then I discovered that some feeders can send newlines to Twitter! That is, my expectation that all tweets were on one line was not 100% correct. Heresy! I added an end-of-tweet marker to my template and added a call to paste to gotwerp.
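A minimal sketch of that kind of repair, assuming a hypothetical marker %%EOT%% at the end of every tweet. The actual marker and the exact paste invocation in gotwerp are not shown in this article, and the function name join_tweets is made up. paste -s joins all lines into one, and awk then splits the result at each marker so that every tweet ends up on a single line.

```shell
#!/bin/sh
# Hypothetical end-of-tweet marker; the real template may use something else.
# Join every input line into one long line, then split at each marker.
join_tweets() {
    paste -s -d ' ' - |
    awk '{
        n = split($0, tweet, / *%%EOT%% */)
        for (i = 1; i <= n; i++)
            if (tweet[i] != "") print tweet[i]
    }'
}

# The first made-up tweet contains an embedded newline.
printf '%s\n' 'alice: first line' 'second line %%EOT%%' 'bob: hello %%EOT%%' | join_tweets
```

The embedded newline becomes a space, so the sample prints two lines, one per tweet.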
Wrap It Up (Packaging)
Finally, I wanted to make my Twitter Reader into a handy application. I wanted:
- An easy way to choose which Twitter logs to read
- A reasonable default choice of log (today)
- Hiding of any logs I'm not currently reading
- A table of contents with quick links into the list of choices.
You can view my Twitter Reader in action in my TWiki.
The Code
Show Code for gotwerp
Reference
pyTwerp man page
* Thank you to bear and decklin for pyTwerp; special thanks to decklin for answering bozo novice questions and pointing me toward seeing the template feature which I somehow had missed the first time out!
