Copyright (c) 2001 by Rich Morin
published in Silicon Carny, February 2001
The Meta Project's file-tree browser is supposed
to recognize path names and supply descriptive information,
but in cases like
As I mentioned last month, I'm working on a file-tree browser for the Meta Project. One of the
interesting sub-problems I've encountered has to do with characterizing
the device names in
There are literally thousands of possible device names, so a brute-force
approach is out of the question. Even when the name space is folded down
by unit numbers and such, there are hundreds of device families (e.g.
My solution to this nightmare is based on three components: a set of device family descriptions, a set of parsing macros, and some supporting Perl code. Both the descriptions and the macros use XML syntax.
By matching each family's base name (e.g.,
Device families
Here's the XML description for the
<driver>
<name>sa</name>
<desc>(SCSI) Sequential Access devices</desc>
<man_page section="4" status="primary">sa</man_page>
<parse>[EN]?R?BU.M</parse>
</driver>
The parse entry looks pretty complex, but it's actually just a mnemonic name for the parsing macro. Any unique text string would serve to identify the macro, but this one gives a hint to the nature of the required parsing. The rest of the description should be pretty self-explanatory.
I should note, in passing, that the description above is written in something I call Ostensible Mark-up Language (OML). That is, it looks enough like XML to pass muster, but it doesn't have a style sheet or other niceties. It may also contain things, such as Perl regular expressions, that aren't really kosher by normal XML standards.
Parsing macros
Assuming that the entered device name contains the device family's base
name (
<macro parse="[EN]?R?BU.M">
<regexp>([en]?)(r?)$name([0-9]+)(?:\.([0-9]+))?</regexp>
<redisp>[en]?r?$name[0-9]+</redisp>
<redisp>[en]?r?$name[0-9]+\.[0-9]+</redisp>
<substr>rewind,root,unit,mode</substr>
</macro>
If the entered device name matches the regular expression specified in
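To make the matching step concrete, here is a minimal sketch of how the macro's regular expression and substring names might be applied to an entered device name. The helper name parse_device, the sample device name nrsa0.1, and the \Q...\E quoting of the base name are assumptions added for illustration; the hard-coded strings simply copy the macro entry above instead of reading it from the stored XML.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Values copied from the <macro> entry above; hard-coded here only to
# keep the sketch self-contained.
my $regexp = '([en]?)(r?)$name([0-9]+)(?:\.([0-9]+))?';
my @fields = split /,/, 'rewind,root,unit,mode';

# parse_device() is a hypothetical helper, not part of the original code.
# It returns a hash of field name => captured text, or an empty list if
# the entered name does not match the family's pattern.
sub parse_device {
    my ($qname, $dname) = @_;

    # Replace the literal "$name" token with the family's base name.
    (my $re = $regexp) =~ s/\$name/\Q$dname\E/;

    # Anchor the pattern so the whole device name must match.
    my @caps = $qname =~ /^$re$/ or return;

    # Optional groups that did not take part in the match come back
    # undefined; store them as empty strings.
    my %parsed;
    @parsed{@fields} = map { defined $_ ? $_ : '' } @caps;
    return %parsed;
}

my %p = parse_device('nrsa0.1', 'sa');
print join(', ', map { "$_=$p{$_}" } @fields), "\n";
```

Keeping the captures in a hash keyed by the macro's own substring names means the calling code never needs to know how many groups a given family's pattern contains.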
Supporting Perl code
Fortunately, the really hard parts of the job are accomplished by some
handy Perl utility modules. For instance, the XML text is stored in a
tied hash, using
With these nasty parts under control, we only need to fiddle the
returned values into English. Here's a simplified version of the
relevant code. The device family name (
# Substitute the family's base name for the "$name" placeholder.
$regexp =~ s/\$name/$dname/;
if ($qname =~ /^$regexp$/) {
  print "\n /dev/$qname is a device node ";
  # Label each captured substring with its name from the macro.
  @substr = split(/,/, $substr);
  $paren  = '';
  $paren .= "($substr[0] $1"  if ($#substr >= 0);
  $paren .= ", $substr[1] $2" if ($#substr >= 1);
  ...
  $paren .= ") "              if ($#substr >= 0);
  # Family-specific touch-ups: an empty rewind flag becomes "rewind on close".
  if ($parse eq '[EN]?R?BU.M') {
    $paren =~ s/\(rewind , /(rewind on close, /;
    ...
  }
  ...
  print $paren if ($paren ne '');
  print "\n for device family $dname ($desc).\n";
  ...
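For readers who want something they can run, the following is a self-contained sketch in the spirit of the simplified code above; it is not the Meta Project's actual code. The describe function and its argument list are hypothetical stand-ins for values the real code takes from the stored XML. The sketch also swaps in a loop that skips empty captures, rather than reproducing the elided per-field lines and family-specific rewrites, so its wording differs from what the original would print.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# describe() is a hypothetical wrapper; its arguments stand in for the
# values the real code pulls from the family and macro descriptions.
sub describe {
    my ($qname, $dname, $desc, $regexp, $substr) = @_;

    # Substitute the family's base name for the "$name" placeholder.
    $regexp =~ s/\$name/$dname/;
    my @caps = $qname =~ /^$regexp$/
        or return "/dev/$qname is not in device family $dname.\n";

    # Pair each label with its captured text. Skipping empty or missing
    # captures is a simplification made for this sketch.
    my @labels = split /,/, $substr;
    my @parts;
    for my $i (0 .. $#labels) {
        my $val = $caps[$i];
        next unless defined $val && length $val;
        push @parts, "$labels[$i] $val";
    }
    my $paren = @parts ? ' (' . join(', ', @parts) . ')' : '';

    return "/dev/$qname is a device node$paren\n"
         . "  for device family $dname ($desc).\n";
}

print describe('nrsa0.1', 'sa', '(SCSI) Sequential Access devices',
               '([en]?)(r?)$name([0-9]+)(?:\.([0-9]+))?',
               'rewind,root,unit,mode');
```

Returning the text instead of printing it directly makes the function easier to check in isolation, at the cost of drifting a little further from the original listing.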
I won't pretend this code is elegant, but it gets the job done in a small and reasonably simple way. Part of the reason for this brevity lies in Perl and its very handy modules. Another part, however, comes from using XML as a tool to build a little language. By creating XML-based parsing macros (complete with embedded regular expressions), I was able to encode some fairly complex notions in a very compact form.
I'm not sure what other applications could benefit from this approach, but I think that it is one that will stay in my coding arsenal. Here's hoping it will find a place in yours...
About the author
Rich Morin (rdm@cfcl.com) operates Prime Time Freeware (www.ptf.com), a publisher of books about Open Source software. Rich lives in San Bruno, on the San Francisco peninsula.