Nov. 14th, 2009

The standard unit

While I was talking with a doctor at one of the sites I work at, she told this story:

During my residency, a prescription pad was stolen from the office. A few days later I got a call from a pharmacist. "Now, I know you didn't write this prescription, but I just had to call and tell you about it. Someone just gave me a prescription for 'mophine...one pound'."

Nov. 8th, 2009

A simple metric

If the name of the food product takes up less space on the packaging than the list of things that it doesn't have in it, I don't want it.

Oct. 31st, 2009

Changes to rawdog-rss

I've finally "finished" my work on the rawdog rss plugin (which should realistically be called "syndication.py" or something similar). You can view the results at the rawdog-rss github repository. The plugin was originally authored by Jonathan Riddell, and was later updated by Adam Sampson. While I may leave the git repository online, it should not be considered the new plugin homepage. Now, with that out of the way, the big changes I made:

  1. There's now a unit test framework built up around the plugin.
  2. The plugin outputs well-formed XML (in RSS, OPML, and FOAF formats).
  3. FOAF output is now more better.

I also added a hack: after noticing that the Planet KDE configuration file included HTML character references instead of Unicode characters (e.g. "&aacute;" instead of just "á"), I decided to make the plugin do what was intended, not what was actually written.

The only other thing worth noting is that you can't run the test suite without modifying the rawdog source code. Comments, questions, and criticisms are welcome. Bug reports should be filed at github.

Oct. 17th, 2009

Easy subscriptions and filtering

While perusing the website of Adam Sampson (the author of rawdog), I noticed that he maintains a list of ideas he's had. I think that's a great idea, so I'm going to take note of something that occurred to me a few moments ago.

I often glance at Planet Gnome (PGO). I'm already subscribed to many of its syndicated authors, but every so often someone who hasn't posted in a while shows up on its front page, and I find someone new to subscribe to. But therein lies the problem: I'm wasting significant time maintaining a separate list of feeds. PGO has a blogroll in two formats, but my current feed reader doesn't give me any tools for filtering out the authors and content that are irrelevant to me.

Linus Torvalds once noted that "If you can do something really fast, really well, people start using it differently" (skip to 50:38). He was talking about merging in git, but I think it applies to feed reading as well.

Earlier this week I found an author who was posting about a bit of software he had written. I was discouraged from subscribing to his blog because he used tags and categories so disjointedly that I knew that by subscribing to a tag- or category-specific feed I would either be inundated with unrelated posts, or I would miss relevant content that I really wanted to see. And that's when it hit me (again) that my feed reader doesn't (and all feed readers I've ever used don't) give me the tools to insta-subscribe, and insta-filter, and insta-unsubscribe (heaven forbid) when it's clear that the content doesn't interest me any longer.

If and when I write my own feed reader, I hope I remember to make it incredibly easy to subscribe to feeds, but brain-dead easy to filter content on my terms.

Patching rawdog's rss.py plugin

After previously noting that listparser chokes on invalid XML (as it should, but eventually won't), I've taken it upon myself to patch the software that was creating invalid XML in the first place. I found that Planet KDE's FOAF blogroll was being created by a plugin for rawdog.

I wrote to both the plugin author and the rawdog author and got the go-ahead to update the plugin, so I downloaded the latest version of the plugin and set about creating unit tests. Right off the bat I discovered a bug in rawdog: it was reading -- but ignoring -- my custom configuration files! The solution is to add a single line near the bottom of rawdoglib/rawdog.py:

  elif o in ("-c", "--config"):
      try:
          config.load(a)
      except ConfigError, err:
          print >>sys.stderr, "In " + a + ":"
          print >>sys.stderr, err
          return 1
+     rawdog.sync_from_config(config)

With that fixed, I don't think it'll be long before the plugin will be outputting valid FOAF and OPML files.

Oct. 3rd, 2009

listparser 0.9 - "Celery wolves"

I'm very pleased to announce that listparser v0.9 is now available! There are some big changes in this release.

RDF+FOAF

listparser now supports RDF+FOAF, an XML-based subscription list format that the vast majority of Planet-based websites produce. If you like reading what the Pidgin developers are up to, keep in mind that Pidgin Planet has a FOAF blogroll. Or if you're an RDF junkie, Planet RDF has a FOAF blogroll. So do Planet Gnome, Planet Ubuntu, and Planet Mozilla.

Adding support for another format was exciting, but it prompted me to finally fix an issue that had been bothering me.

Opportunities

A while back I discovered that Wordpress-based blogs can output an OPML version of the links in the blog sidebar. What I noticed, however, was that most blog owners only put in the linked blog's homepage, and not its feed URL.

While implementing RDF+FOAF support I discovered that the same problem could exist in that format (see Planet Freedesktop's FOAF blogroll for an example). Thus, I decided to save those homepage URLs.

The way I see it, software can either choose to beat users over the head for problems they neither understand nor have any way to fix, or it can do its best to handle the problems gracefully. By storing those homepage URLs, feed readers that use listparser have the opportunity to run feed and subscription list autodiscovery software against those homepage URLs. Obviously they don't have to, but it's an opportunity for the feed reader to help the user, rather than hinder them. It's for that reason that I called the set of homepage URLs "opportunities".
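To make the autodiscovery idea concrete, here is a minimal sketch of what a feed reader could do with one of those homepage URLs. It is not part of listparser; the names FeedLinkFinder and discover_feeds and the example.com page are assumptions made up for illustration. It only looks for the common <link rel="alternate"> convention in HTML that has already been downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types that conventionally advertise a feed.
FEED_TYPES = {'application/rss+xml', 'application/atom+xml'}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs advertised by <link rel="alternate"> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != 'link':
            return
        attrs = dict(attrs)
        rel = (attrs.get('rel') or '').lower().split()
        kind = (attrs.get('type') or '').lower().strip()
        href = attrs.get('href')
        if 'alternate' in rel and kind in FEED_TYPES and href:
            # Resolve relative links against the homepage URL.
            self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(homepage_url, html):
    """Return the feed URLs advertised in an already-downloaded page."""
    finder = FeedLinkFinder(homepage_url)
    finder.feed(html)
    return finder.feeds


# A made-up homepage, standing in for one of the stored opportunities.
PAGE = '''<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/atom+xml" href="/feed.atom">
</head><body>Hello</body></html>'''

print(discover_feeds('http://blog.example.com/', PAGE))
```

A real reader would also need to fetch the page and cope with sites that advertise nothing at all, which is exactly why these URLs are only opportunities.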

distutils support

At long last, listparser can be easily installed using the standard Python distutils. Rather than just copying listparser.py somewhere in your path, you can now just install it like any other piece of Python software:

$ python setup.py install

It's just as easy as that.

In closing

listparser has become much more mature; there is exactly one more significant shortcoming that I'm aware needs to be addressed (you can look at Planet KDE's broken FOAF blogroll to see exactly what listparser -- like Firefox -- will not handle). However, I think I'm going to take a break from listparser and recharge my batteries.

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds and subscription lists that it finds. It currently supports OPML, RDF+FOAF, and the iGoogle exported settings format.

[listparser homepage]
[listparser downloads]

Sep. 26th, 2009

Do you want a cookie?

While working on September 11th, I had a number of computer boxes to open and tear down for a lady. I whipped out my trusty box cutter, made short work of the boxes, and commented to the lady how useful the tool was. She turned her eyes down and raised her eyebrows while shaking her head, saying, "I just don't know if that word will ever be the same after 9-11."

I was surprised; her tone didn't suggest that she was reprimanding me for saying "box cutter"; it just seemed like she was hollowly acting out a part. I replied, "Well, box cutters can be useful no matter what day of the year it is."

"I just think that, I don't know, maybe some words should be treated differently on the anniversary of 9-11."

As the conversation continued I became irritated with her. There aren't merit badges for NEVER FORGETting, but this lady seemed determined to demonstrate that she could be more plaintive than anyone else. While honoring the event doesn't bother me, putting on a show about it does, and that was the impression she gave me.

Sep. 20th, 2009

Replacing magical code

listparser has had some magical code in it for a while, and today I removed the two most obvious magics.

If you've ever looked at the code and have seen the expect-related code, you've probably said "wat". Yes, the code was absurd, but it's now gone. If you don't know what I'm talking about, just appreciate that the stage has been set for support for another subscription list format.

The other big change (and the one I'm most pleased with) is the replacement of the tag and category code. When I first started writing it, list comprehensions seemed like an obvious solution. Unfortunately, as I discovered corner cases I continued to solve the problem with more list comprehensions. Eventually I ended up with this ugly mess:

def or_strip(x, y):
    return x.strip() or y.strip()
tags = [x.strip() for x in attrs[(None, 'category')].split(',') if x.strip() and '/' not in x]
cats = (x.strip() for x in attrs[(None, 'category')].split(',') if '/' in x)
cats = (x.split('/') for x in cats if reduce(or_strip, x.split('/')))
cats = (xlist for xlist in cats if reduce(or_strip, xlist))
cats = [[y.strip() for y in xlist if y.strip()] for xlist in cats]

Splits and strips and lists, oh my! All told, I replaced 15 lines of code with 16 lines but gained shorter line lengths and more maintainable code.
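For comparison, here is a rough sketch of what a loop-based version of the same logic might look like. This is not the code that actually went into listparser; the function name split_categories is an assumption, and it is written for Python 3 rather than the Python 2 of the snippet above.

```python
def split_categories(value):
    """Split a comma-separated category attribute into tags and categories.

    Items without a slash become tags; items with slashes become
    hierarchical categories, with empty segments dropped.
    """
    tags = []
    categories = []
    for item in value.split(','):
        item = item.strip()
        if not item:
            continue
        if '/' not in item:
            tags.append(item)
            continue
        parts = [part.strip() for part in item.split('/') if part.strip()]
        if parts:
            # An item such as "/" has no usable segments and is skipped.
            categories.append(parts)
    return tags, categories


print(split_categories('/friend, python , /a/b/, /, '))
```

It is longer than the comprehensions, but each corner case gets its own visible line.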

I'm looking forward to releasing the next version of listparser.

Sep. 17th, 2009

Regarding reboots

I've been hired on at a company that provides contract technical support to a number of organizations, and one of the beliefs that I've heard both from users and coworkers is that a reboot is the first solution to try. I want to address what I think about that.

I hate magical solutions.

When a user experiences a problem midday that clears up after a reboot, there's neither a diagnosis nor a documented solution. The problem presented itself and then cleared up magically. There is nothing to suggest that the problem won't return, and intermittent problems irk me like no other.

Case in point: while on-site one day I was called in because the X-ray machine suddenly couldn't send images to the server, and doctors consequently couldn't view patients' X-rays. It turns out that the X-ray machine (running Windows XP, natch) had a duplicate IP address. The X-ray tech told me that the duplicate IP address message had been popping up for months, but always cleared up with a quick reboot.

Long story short, the X-ray machine had to have a static IP address, but someone gave it an address smack dab in the middle of the DHCP server's configured address range. For months the server had been handing out a duplicate IP address and somewhere in the building a user would reboot their computer because they couldn't access the network, solving both the X-ray machine's and their own problem. That particular day, though, one fateful user turned on their computer but had to leave suddenly. Consequently, the computer was left on and hogging that miserable IP address. It was only after much gnashing of teeth that the problem got resolved.
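The underlying misconfiguration is easy to check for once you know to look. Here is a minimal sketch of such a check; the function name in_dhcp_pool and every address below are made up for illustration, not taken from the site in question.

```python
import ipaddress


def in_dhcp_pool(static_ip, pool_start, pool_end):
    """Return True if a statically assigned address sits inside the DHCP pool."""
    address = ipaddress.ip_address(static_ip)
    first = ipaddress.ip_address(pool_start)
    last = ipaddress.ip_address(pool_end)
    return first <= address <= last


# Hypothetical pool of 192.168.1.100 through 192.168.1.200.
print(in_dhcp_pool('192.168.1.150', '192.168.1.100', '192.168.1.200'))  # True: conflict waiting to happen
print(in_dhcp_pool('192.168.1.20', '192.168.1.100', '192.168.1.200'))   # False: safely outside the pool
```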

I actually have to get ready for work, so I'll just close by reiterating my earlier statement, but slightly tempered for the benefit of the people who've read this far:

I hate magical solutions (but acknowledge their utility).

Sep. 3rd, 2009

listparser v0.8 - "Three day weekend"

This past week has been really exciting for a number of reasons, not the least of which is the work that's gone into listparser. Almost immediately after releasing the last version I started working on the next big thing: listparser now supports the iGoogle exported settings format!

For everyone who's using iGoogle, it's possible to export your iGoogle homepage settings to an XML file. If you've added any feeds to your iGoogle homepage, those will be included in the XML file, and listparser will find them. It will even keep track of which tab each feed is in!

On a side note, I'm also very happy that listparser can now find subscription lists in Liferea's feedlist.opml configuration file.

Next up: packaging listparser for easier distribution and installation.

[listparser homepage]
[listparser downloads]

Aug. 28th, 2009

listparser v0.7 - "The Codex"

It's that time again! I just uploaded the latest version of listparser, and this one's an important milestone: listparser now has documentation!

There's really not a whole lot else to be said about this release, but I can safely project that the next release is going to come hot on the heels of v0.7 as I've got several ideas banging around in my brain.

[listparser homepage]
[listparser downloads]

Aug. 18th, 2009

A life achievement

One of my friends came to help me unload when I arrived home, and he showed up while I was having difficulty backing the car and the trailer onto the driveway.

David: Wow, you're doing a great job with that. Just to let you know, I've driven with one of those before, so if you want any help...
Kurt: Yeah, well I just drove over 500 miles with this trailer today.
David: Backwards?
Kurt: Yes, backwards. I was trying to unlock an achievement.

Aug. 10th, 2009

Writing documentation

I know that one of the most challenging things about writing software is writing documentation. Usually this is chalked up to "programmers prefer to program" and "writing documentation isn't glamorous" and so forth, but as I rely on high-quality documentation, I want to provide others with the same.

To that end, I'm in the process of writing just such documentation for listparser. Surprisingly, as precise as I think that I've been while writing it, codifying listparser's behavior is revealing that it has a few inconsistencies. As an example, the OPML version is supposed to be a string (either an empty string or one of opml, opml1, or opml2); under at least one circumstance, however, the version may be reported as None. In each case, I'm making a mental note to go back and amend the behavior.

Since I'm moving next weekend and still looking for a job, none of the above is a high priority, but watch for a release in relatively short order.

Aug. 7th, 2009

listparser v0.6 - "Hatchet Hotel"

I'm pleased to announce listparser v0.6 is now available! This release primarily fixes some problems that I uncovered while parsing Wordpress OPML files produced by wp-links-opml.php.

listparser is a Python library that parses OPML subscription lists into a sanitized and standardized format.

[listparser homepage]
[download listparser]
[New! listparser @ ohloh]

Aug. 1st, 2009

listparser v0.5 - "Going green"

listparser v0.5 is now available. I spent time improving its HTTP support, so now it supports HTTP ETag and Last-Modified headers, HTTP redirects, and it sends a configurable User-Agent header. Additionally, you can now parse strings and file-like objects, which means that if you already have the OPML file in a str or unicode variable, or have already opened a file on-disk, listparser will happily parse the thing.
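For anyone unfamiliar with what ETag and Last-Modified support buys you, here is a minimal sketch of a conditional request built with Python 3's standard library. It is not listparser's own implementation; the function name build_request, the user agent string, and the example.com URL are all assumptions for illustration.

```python
import urllib.request


def build_request(url, etag=None, last_modified=None,
                  user_agent='example-reader/0.1'):
    """Build a GET request that asks the server to skip unchanged content."""
    headers = {'User-Agent': user_agent}
    if etag:
        # Echo back the ETag the server sent with the previous response.
        headers['If-None-Match'] = etag
    if last_modified:
        # Echo back the Last-Modified value from the previous response.
        headers['If-Modified-Since'] = last_modified
    return urllib.request.Request(url, headers=headers)


request = build_request(
    'http://example.com/subscriptions.opml',
    etag='"abc123"',
    last_modified='Sat, 01 Aug 2009 00:00:00 GMT',
)
print(request.header_items())
```

A server that honors these headers answers with 304 Not Modified when nothing has changed, which urllib.request.urlopen reports by raising an HTTPError whose code is 304, so the caller can keep using its cached copy.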

You can download v0.5 at github.

[listparser homepage]
[listparser downloads]

Jul. 30th, 2009

Penny Arcade powered by Django

This evening while checking my feeds, I discovered that I couldn't click through to Penny Arcade's site...apparently their MySQL server was down. However, I discovered that they're using Django:

[a bunch of traceback text]

File "wang/legacy/../../vendor/django/db/models/manager.py", line 82, in get

[more traceback text]

OperationalError: (2003, "Can't connect to MySQL server on '[REDACTED]' (61)")

I refreshed a minute later and the site was operational again, but I was pleased to see that they have moved from their old PHP site to Python and Django.

Jul. 26th, 2009

Bash programmable completion

A long time ago I read a blog entry about a shell called zsh that supported command line completion like I had never seen before. Don't get me wrong, I've been using filename completion for years, but I had never seen a shell perform filename completion across the network over an SSH connection before, or complete a manpage name, or complete a long option to a program (like mplayer's obscure -noflip-hebrew-commas option). The powers of zsh impressed me, but I never got around to switching to it. Years later, now, I've discovered that my current shell, bash, actually does support programmable command line completion, and I've become so impressed that I wanted to share some of the things that I've found.

Directory-only completion

First, you need to know what command to use to actually set up programmable completion in bash. It's called complete, surprise, and I'll give you a simple example of how to use it. Let's say that you've just downloaded that totally rockin' piece of software in a zipfile. It's named "baloo-kit-latest.zip", and when it's extracted it goes into a single directory named "baloo-kit-5.0". Now, bash isn't too bright, so when you try to complete the directory name, bash unhelpfully fails to match only the directory:

$ cd b <tab>
$ cd baloo-kit- <tab> <tab>
baloo-kit-5.0/        baloo-kit-latest.zip

Thanks for nothing, bash! But just wait a second, it gets awesome:

$ complete -d cd
$ cd b <tab>
$ cd baloo-kit-5.0/

complete's -d option restricts filename completion to directories.

Filtering filenames

Suppose you're dealing with a command that deals with documents with certain extensions, such as unzip. complete has a switch, -X, with which you can specify filtering. The -X switch will exclude whatever you specify, but you can invert the matching (so that instead of excluding you're including) by prefacing the pattern with "!".

$ complete -f -X '!*.zip' unzip
$ unzip <tab>
$ unzip baloo-kit-latest.zip

The -f switch specifies filename completion, and -X excludes everything from the results that doesn't match the pattern specified. Be careful to quote everything properly!

Matching usernames

complete has many built-in completion possibility sources, all of which can be specified using the -A switch. The -f and -d switches are actually convenient shorthand equivalents for -A file and -A directory, respectively. As another simple example, I know that the usermod command deals not with files but users, which is something bash can provide completion for:

$ complete -A user usermod
$ usermod <tab>
alpha     bravo     charlie   root

Limitless possibilities

While bash provides a lot of built-in possibility sources, it can't possibly cover everything, which is why complete also allows you to run a command (the -C switch) or a shell function (the -F switch). With these, it's possible to complete hostnames from the SSH known_hosts file, to complete remote pathnames over an SSH connection, to complete git branch and tag names, or anything else. You can find a bunch of smart completions already written over at the Bash-Completion project, and git has a thorough set of completions included in its distribution.
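As a taste of the -C switch, here is a minimal sketch of an external completion command written in Python 3. It assumes the usual ~/.ssh/known_hosts location and only handles plain-text entries; the function names are made up for illustration. bash runs the command with three arguments (the command name, the word being completed, and the preceding word) and expects one completion per line on standard output.

```python
#!/usr/bin/env python3
"""Print SSH hostnames that start with the word being completed."""
import os
import sys


def parse_known_hosts(lines):
    """Return the sorted plain-text hostnames found in known_hosts lines."""
    hosts = set()
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith('#'):
            continue
        if fields[0].startswith('@'):
            # Skip markers such as @cert-authority.
            fields = fields[1:]
            if not fields:
                continue
        for name in fields[0].split(','):
            if name.startswith('[') and ']' in name:
                # "[host]:port" entries: keep only the host part.
                name = name[1:name.index(']')]
            # Hashed entries, negations, and wildcards can't be completed.
            if name and name[0] not in '|!' and '*' not in name and '?' not in name:
                hosts.add(name)
    return sorted(hosts)


def completions(prefix, lines):
    """Return the known hostnames that begin with prefix."""
    return [host for host in parse_known_hosts(lines) if host.startswith(prefix)]


if __name__ == '__main__':
    # bash passes: $1 command name, $2 word being completed, $3 previous word.
    prefix = sys.argv[2] if len(sys.argv) > 2 else ''
    try:
        path = os.path.expanduser('~/.ssh/known_hosts')
        with open(path, errors='replace') as handle:
            for host in completions(prefix, handle):
                print(host)
    except OSError:
        pass  # No readable known_hosts file: offer no completions.
```

Saved somewhere as an executable file (the path here is hypothetical), it could be hooked up with something like: complete -C ~/bin/complete-ssh-hosts ssh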

Smart command line completion can make you faster and more effective, so get after it!

Jul. 24th, 2009

Commanding bash

While reading a book about my Linux shell, bash, I've been working to incorporate the incredibly powerful tools it offers into my normal workflow. In addition to memorizing useful hotkeys (like ^r to search backward through the command history), I'm starting to use history expansion, and I've more thoroughly customized my initialization file. As an example, I've set the bash command history to store 10,000 entries (and to always append to, rather than overwrite, the history file). I'm already looking through my command history and identifying areas where I'm wasting typing energy.

Here's to knowing how to use your tools as efficiently as possible!

Jul. 18th, 2009

listparser v0.4 - "07/18,29"

I just released listparser v0.4. It now supports categories and tags, which can be specified in two ways:

  1. The category attribute of the <outline> element.
  2. Nested <outline> elements.

While the OPML specification uses the category attribute, I also wanted to support nested <outline> elements because that's how Google Reader indicates what tags (what they call "labels") are associated with a feed. Thus, the following two samples are equivalently categorized:

<outline text="Alice" category="/friend" type="rss" ... />

<outline text="friend">
    <outline text="Alice" type="rss" ... />
</outline>

I've chosen to handle categories and tags in different ways in listparser: categories are inherently hierarchical. Tags are not. Thus, if a feed is nested one level deep (as is the case with Google Reader OPML exports), listparser will give it both a tag and a category. If it's nested more than one level deep (as can be the case with Liferea OPML exports), then only categories will be assigned.
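The nesting rule can be sketched in a few lines. This is only an illustration of the rule as described, not listparser's actual code: the function name collect_feeds, the dictionary layout, and the sample OPML are assumptions, and the category attribute is ignored to keep the example short.

```python
import xml.etree.ElementTree as ET

# Made-up OPML: Alice is nested one level deep, Bob two levels deep.
SAMPLE = '''<opml version="2.0"><body>
  <outline text="friend">
    <outline text="Alice" type="rss" xmlUrl="http://example.com/alice.xml"/>
  </outline>
  <outline text="news">
    <outline text="tech">
      <outline text="Bob" type="rss" xmlUrl="http://example.com/bob.xml"/>
    </outline>
  </outline>
</body></opml>'''


def collect_feeds(element, ancestors=()):
    """Walk nested <outline> elements, turning ancestor names into categories."""
    feeds = []
    for outline in element.findall('outline'):
        if outline.get('type') == 'rss':
            feeds.append({
                'title': outline.get('text', ''),
                # The full ancestor path is one hierarchical category.
                'categories': [list(ancestors)] if ancestors else [],
                # Only a single level of nesting also counts as a tag.
                'tags': list(ancestors) if len(ancestors) == 1 else [],
            })
        else:
            name = outline.get('text', '')
            feeds.extend(collect_feeds(outline, ancestors + (name,)))
    return feeds


for feed in collect_feeds(ET.fromstring(SAMPLE).find('body')):
    print(feed)
```

Alice ends up with both the tag "friend" and the category ["friend"], while Bob gets only the category ["news", "tech"].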

Try it out!

[download listparser]
[listparser homepage]

Jul. 4th, 2009

Quantifying unit test coverage

I've really been working hard to make sure that every piece of code that I commit to listparser is backed by unit tests to ensure that the code does what I expect it to do. But a while back I had the idea that it would be great to have some kind of program that could watch the unit tests run and produce a webpage showing me exactly what lines of code did and did not get tested.

Today the thought occurred to me again, and within minutes I found coverage.py, which does exactly what I had envisioned. It ran my unit tests, produced a highlighted webpage, and showed me that I hadn't tested eight lines of code. Minutes later I was down to five lines of untested code.

Is a metric like "97% code coverage" a good indicator of the strength of the unit tests? No. It's trivially easy to produce code that doesn't account for things:

if everything_is_okay:
    print "sweet"

(In this example, what if not everything_is_okay?!) So, code coverage isn't a guarantee that all your bases are...um, covered. It is, however, another useful tool.
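Here is a slightly fuller sketch of the same trap, written for Python 3. The function status and its test are made up for illustration: the single test executes every line of the function, so a line-based report would show full coverage, yet the untested path still fails.

```python
def status(everything_is_okay):
    message = None
    if everything_is_okay:
        message = "sweet"
    return message.upper()


def test_status():
    # This single call executes every line of status()...
    assert status(True) == "SWEET"


test_status()

# ...yet the path the test never exercised still blows up.
try:
    status(False)
except AttributeError as error:
    print("uncovered bug:", error)
```

Measuring branches rather than lines, which coverage.py can also do, would flag the missing False path, but no coverage number can say whether the logic itself is right.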
