<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Rich Morin :: tchotchkes</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/" />
    <link rel="self" type="application/atom+xml" href="http://www.cfcl.com/rdm/weblog/atom.xml" />
   <id>tag:www.cfcl.com,2008:/rdm/weblog//3</id>
    <link rel="service.post" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3" title="Rich Morin :: tchotchkes" />
    <updated>2006-06-21T03:39:52Z</updated>
    
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.1</generator>
 

<entry>
    <title>Multiple sets of Terminal windows in Mac OS X</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/001108.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=1108" title="Multiple sets of Terminal windows in Mac OS X" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.1108</id>
    
    <published>2006-05-28T01:09:58Z</published>
    <updated>2006-06-21T03:39:52Z</updated>
    
    <summary>Having three screens up 24/7, it&apos;s easy to accumulate windows. I typically have several dozen available on my Mac, spread across a few dozen apps. In general, this is quite convenient. When I&apos;m not using an app, I hide it (via cmd-H) and all of its windows disappear. If I only want to get a few windows out of the way, I WindowShade them or send them to the Dock. Unfortunately, these approaches don&apos;t work very well for applications that...</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[Having three screens up 24/7, it's easy to accumulate windows.
I typically have several dozen available on my Mac,
spread across a few dozen apps.
In general, this is quite convenient.
When I'm not using an app,
I hide it (via cmd-H) and all of its windows disappear.
If I only want to get a few windows out of the way,
I <a href="http://www.unsanity.com/haxies/wsx">WindowShade</a> them
or send them to the Dock.
<p>
Unfortunately, these approaches don't work very well
for applications that have <i>lots</i> of windows.
Terminal programs, for example, may have dozens of windows,
each presenting a different shell and/or application context.
Mac OS X doesn't let me hide <i>some</i> of an app's windows, so that's out.
The Dock and WindowShade are awkward ways to manage dozens of windows;
simply dismissing and retrieving a set of several windows becomes quite a hassle.]]>
        <![CDATA[<p>
Fortunately, it turns out that there is a (relatively) simple workaround.
By using multiple <i>copies</i> of an app (e.g., Terminal),
I can hide and display each copy's windows separately.
I currently have four copies of Terminal available.
One is used for generic tasks, two are used for specific projects,
and one is idle.
<p>
The arrangement is working well.
Any new Terminal window I create (via cmd-N) becomes part
of the current copy's set.
I even found a way to let a script know which Terminal it's running under.
Finally, although it might be nice to be able to migrate windows between sets,
this has not been a serious deficiency in practice.

<h4>HOWTO</h4>
<p>
Here's a basic walkthrough of the procedure.
Feel free to season to taste; it's your machine, after all!
However, if you're a bit paranoid,
you may want to quit the relevant copies of Terminal before performing these steps.

<ul>
  <p><li>
    Copy the preference file and the Terminal app.
  <p>
    Using the Finder, go to <code>~/Library/Preferences</code>
    and copy <code>com.apple.Terminal.plist</code>
    to <code>com.apple.Terminal_2.plist</code>.
    Then, go to <code>/Applications/Utilities</code>
    and copy <code>Terminal.app</code> to <code>Terminal_2.app</code>.

  <p><li>
    Link up the app to the preference file.
  <p>
    Control-click on <code>Terminal_2.app</code>
    and select "Show Package Contents".
    In the Contents folder,
    double-click on the <code>Info.plist</code> file.
    In the resulting Property List Editor,
    change <code>CFBundleIdentifier</code>
    from <code>com.apple.Terminal</code>
    to   <code>com.apple.Terminal_2</code>.
    Save and Quit to make the changes permanent.

  <p><li>
    Trim the foreign language support files (optional).
  <p>
    In the Resources folder,
    remove any <b>superfluous</b> <code>*.lproj</code> files.
    This can reduce the copy's disk storage by more than 60%.
</ul>
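
<p>
  For readers who would rather script the first two steps,
  here is a minimal Python sketch of the same procedure.
  The function name <code>clone_app</code> and the <code>_2</code> suffix
  default are my own choices for illustration;
  it assumes that <code>Info.plist</code> is in a format
  that Python's <code>plistlib</code> can read,
  and it writes the file back as XML.

```python
import plistlib
import shutil
from pathlib import Path


def clone_app(app: Path, prefs: Path, suffix: str = "_2") -> tuple[Path, Path]:
    """Copy an app bundle and its preference file, then retarget the copy.

    `app` would be something like /Applications/Utilities/Terminal.app and
    `prefs` something like ~/Library/Preferences/com.apple.Terminal.plist.
    """
    # Terminal.app -> Terminal_2.app
    new_app = app.with_name(app.stem + suffix + app.suffix)
    # com.apple.Terminal.plist -> com.apple.Terminal_2.plist
    new_prefs = prefs.with_name(prefs.stem + suffix + prefs.suffix)

    shutil.copytree(app, new_app, symlinks=True)
    shutil.copy2(prefs, new_prefs)

    # Link the copied app to the copied preference file by changing
    # CFBundleIdentifier in the copy's Info.plist (the original is untouched).
    info_path = new_app / "Contents" / "Info.plist"
    with info_path.open("rb") as fh:
        info = plistlib.load(fh)
    info["CFBundleIdentifier"] = info["CFBundleIdentifier"] + suffix
    with info_path.open("wb") as fh:
        plistlib.dump(info, fh)
    return new_app, new_prefs
```

<p>
  Calling <code>clone_app</code> with the two paths named above
  should leave the original app alone and produce the renamed copies.
  Trimming the <code>*.lproj</code> folders is left as a manual step,
  since only you know which ones are superfluous.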

<p>
  You can now start up and play with the copied app.
  Some customization is also a reasonable idea at this point.
  The following sections explain how to change the windows' titles,
  the app's name in the Menu Bar,
  and the value of the <code>TERM_PROGRAM</code> environment variable.
  The techniques get increasingly scary as we go along;
  feel free to bail out at any time...

<h4>Window Titles</h4>
<p>
  It's easy to change the "Title" information
  that gets used on each window.
  Pull down Terminal > Window Settings from the Menu Bar,
  then select the "Window" category.
  Change the "Title" fill-in to "TS2" (Terminal Set 2).
  Note that each copy of the app has its own preferences,
  window settings, etc.
  So, if you want to make a global change,
  you'll have to adjust each copy.

<h4>Menu Bar</h4>
<p>
  If you have Xcode installed, you can change the Menu Bar
  to say "Terminal_2" (or whatever).
  Navigate to the <code>Contents/Resources/English.lproj</code> folder
  and double-click on <code>Terminal.nib</code>.
  After Interface Builder has started up,
  double-click on the "Terminal" string in "Terminal.nib (English) - MainMenu"
  and fill in the desired value (e.g., <code>Terminal_2</code>).
  Save and Quit.

<p>
  Now, double-click on <code>InfoPlist.strings</code>.
  When a text editor window appears,
  change the value of <code>CFBundleName</code> to the same text string.
  Save and Quit.
  When you next start up Terminal_2, the Menu Bar will read correctly.

<h4>Shell Support</h4>
<p>
  It's handy for shell scripts to be able
  to find out which terminal set they are in.
  This can be used, for example,
  in dynamically setting window titles or command-line prompts.
  My current solution, which is a bit scary,
  is to edit the executable binary for the application.
  If you're comfortable with this idea, read on...

<p>
  This hack is based on the fact that Terminal sets up a pair
  of environment variables for its sessions.
  One of these, <code>TERM_PROGRAM</code>,
  is normally set to <code>Apple_Terminal</code>.
  By editing Terminal's executable,
  we can change this value to a different text string.

<p>
  Navigate to the <code>Contents/MacOS</code> folder.
  Using a binary-capable text editor
  (e.g., <a href="http://www.barebones.com/products/bbedit/index.shtml"
           >BBEdit</a>),
  open the original executable.
  Search for the text string <code>Apple_Terminal</code>.
  In my executable, it is preceded by <code>TERM_PROGRAM</code>
  and followed by the string <code>TERM_PROGRAM_VERSION</code>,
  but Your Mileage May Vary.

<p>
  Select the part of the string you wish to change
  and replace it with an equal number of characters.
  (I use <code>Apple_Term_TS2</code> for <code>Terminal_2</code>.)
  Save and Quit.
  When you next start up the app,
  this string will be available in <code>TERM_PROGRAM</code>.
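
<p>
  The one rule that matters in this edit is that the replacement
  must be exactly as long as the text it replaces.
  The following Python sketch illustrates that rule on a byte string;
  the helper names (<code>patch_same_length</code>, <code>terminal_set</code>)
  are hypothetical, and the sketch works on bytes in memory
  rather than touching any real executable.

```python
import os


def patch_same_length(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace every occurrence of `old` with `new`, refusing length changes."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same number of bytes")
    if old not in data:
        raise ValueError("string not found; nothing to patch")
    return data.replace(old, new)


def terminal_set(env=os.environ) -> str:
    """Return the TERM_PROGRAM value, or "unknown" when it is not set."""
    return env.get("TERM_PROGRAM", "unknown")
```

<p>
  A script can then branch on the value returned by <code>terminal_set()</code>
  when it builds a window title or a prompt.
  Note that <code>Apple_Terminal</code> and <code>Apple_Term_TS2</code>
  are both 14 characters long, which is why the substitution is safe.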

<h4>Final Comments</h4>
<p>
  Although this exercise used Terminal as its target app,
  the same sorts of changes could be made to any app.
  By (a) breaking apps up into phalanxes of files
  and (b) giving us the tools to edit these files,
  Apple has given its users quite a bit of freedom.
  Not as much as an Open Source application would provide,
  to be sure, but still very handy on occasion.

<p>
  Although he bears no blame for any of my ideas or mistakes,
  David Hill was splendidly helpful in my effort to change
  the app's title in the Menu Bar.
  Having worked closely with David in co-authoring the
  <a href="http://www.spiderworks.com/books/spotlight.php"
    >Mac OS X Technology Guide to Spotlight</a> for
   <a href="http://www.spiderworks.com">SpiderWorks</a>,
  I expected no less.
  Nonetheless, it's nice to have one's positive expectations confirmed.]]>
    </content>
</entry>

<entry>
    <title>Mechanical augmentation of Wikipedia</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/001053.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=1053" title="Mechanical augmentation of Wikipedia" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.1053</id>
    
    <published>2006-04-17T22:13:21Z</published>
    <updated>2007-01-09T15:25:07Z</updated>
    
    <summary> I&apos;m a big fan of Wikipedia. I use it both as a personal reference tool and as an easy way to add depth to web-based documents. However, I think that its utility might be improved by a bit more mechanical augmentation. This augmentation could take (at least :-) three forms: generated pages, automatic content, and requested content....</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[<p>
  I'm a big fan of <a href="http://en.wikipedia.org/">Wikipedia</a>.
  I use it both as a personal reference tool
  and as an easy way to add depth to web-based documents.
  However, I think that its utility might be improved
  by a bit more
  <a href="http://www.cfcl.com/rdm/weblog/archives/001002.html"
    >mechanical augmentation</a>.
  This augmentation could take (at least :-) three forms:
  generated pages, automatic content, and requested content.]]>
        <![CDATA[<h4>Generated Pages</h4>
<p>
  Wikipedia (or, more generally,
  <a href="http://en.wikipedia.org/wiki/MediaWiki">MediaWiki</a>)
  provides a variety of mechanically-generated pages.
  Several dozen
  <a href="http://en.wikipedia.org/wiki/Special:Specialpages">special pages</a>,
  for example, are used for administrative content
  (e.g., indexes, searches, reports).
  There are also some "magic" pages
  that support administration of regular pages,
  documentation of uploaded images, etc.

<p>
  This idea could easily be extended to cover general content.
  For example, astronomers might enjoy having a wiki page
  for each cataloged celestial object.
  Aside from giving basic information about the object,
  each page would be cross-referenced to pages
  that describe its discoverer(s),
  neighbors (both apparent and actual),
  relevant literature, spectral class, etc.
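
<p>
  As a rough illustration of what "mechanically generated" could mean here,
  this Python sketch turns one catalog record into MediaWiki markup.
  The record layout (<code>name</code>, <code>spectral_class</code>,
  <code>discoverers</code>, <code>neighbors</code>, <code>literature</code>)
  is an assumption made for the example, not a real catalog format.

```python
def render_object_page(obj: dict) -> str:
    """Render one catalog record as MediaWiki markup with cross-reference links."""
    lines = [
        f"'''{obj['name']}''' is a cataloged object of spectral class "
        f"[[{obj['spectral_class']}]]."
    ]
    sections = [
        ("Discoverers", obj.get("discoverers", [])),
        ("Neighbors", obj.get("neighbors", [])),
        ("Literature", obj.get("literature", [])),
    ]
    for heading, items in sections:
        if not items:
            continue  # omit sections that have no entries
        lines.append("")
        lines.append(f"=={heading}==")
        # Each item becomes a wiki link, i.e., a cross-reference to another page.
        lines.extend(f"* [[{item}]]" for item in items)
    return "\n".join(lines) + "\n"
```

<p>
  A bot could run a function like this over every record
  and upload the results, one page per object.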

<p>
  Although such a collection might look quite a bit
  like a web-based catalog of astronomical objects,
  the fact that it was embedded in a wiki
  would give it some extra features.
  Users could post comments and/or questions on the "discussion" page,
  engaging other users in dialogs about the object.
  Suggestions for additional cross-references (etc.)
  could also be made in a convenient, interactive manner.

<p>
  Some types of material (e.g., biased or commercial content)
  are not appropriate for Wikipedia.
  However, there is no reason that other wiki-based venues
  could not publish any material they wished.
  Some possible mechanized wikis might cover:

<ul>
  <p><li>
    politicians, impending laws, etc.
    (see <a href="http://www.theyworkforyou.com/"
           >TheyWorkForYou</a>)

  <p><li>
    passages of the Bible, Koran, Talmud, etc.

  <p><li>
    a software package's
    <a href="http://en.wikipedia.org/wiki/application_programming_interface"
      >application programming interface</a>, etc.

  <p><li>
    a class of standardized products (e.g., fittings)

  <p><li>
    every file, interface, and other object of interest
    in the <a href="http://en.wikipedia.org/wiki/Unix">Unix</a> OS.
</ul>

Of course, this is not a "one size fits all" solution.
Before creating a set of mechanically-generated wiki pages,
one should consider whether the content:

<ul>
  <p><li>
    can be generated in an automated (or at least semi-automated) manner

  <p><li>
    would benefit from the features (e.g., discussion pages) provided by wikis

  <p><li>
    would benefit from the social interaction provided by wikis
</ul>

<h4>Automatic Content</h4>
<p>
  MediaWiki generates some page content automagically.
  For example, it generates pop-up messages for links,
  links to versions of the page in other languages,
  a "table of contents" for larger pages,
  and a variety of "toolbox" links.
  However, there are many other possibilities.

<p>
  Existing category and link information
  could be used to create 
  <a href="http://en.wikipedia.org/wiki/Concept_map"
    >concept maps</a> or other forms of diagrams
  that could provide context and ease navigation.
  Mechanized analysis (e.g., a
  <a href="http://en.wikipedia.org/wiki/Naive_Bayes_classifier"
    >naive Bayes classifier</a>)
  can also be used to detect relationships between articles.
  However, the most interesting source of information
  may come from enhanced links.
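
<p>
  To make the diagram idea a bit more concrete,
  here is a small Python sketch that turns link information
  into a Graphviz DOT description of one page's neighborhood.
  The input format (a dictionary of page titles and their outgoing links)
  is an assumption; a real wiki would pull this from its link tables.

```python
def links_to_dot(links: dict[str, list[str]], center: str) -> str:
    """Emit a Graphviz DOT graph of `center` and the pages it links to or from."""
    edges = set()
    for source, targets in links.items():
        for target in targets:
            # Keep only the edges that touch the page of interest.
            if center in (source, target):
                edges.add((source, target))
    lines = ["digraph concept_map {"]
    for source, target in sorted(edges):
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)
```

<p>
  The resulting text can be handed to any DOT renderer
  to produce an image (or an image map) for the page.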

<p>
  The <a href="http://en.wikipedia.org/wiki/Semantic_MediaWiki">Semantic MediaWiki</a>
  folks are working on adding
  <a href="http://en.wikipedia.org/wiki/Semantic_Wiki">Semantic Wiki</a>
  extensions to MediaWiki.
  Once these are in place,
  editors will be able to add "type" information to wiki links.
  By harvesting and analyzing this information,
  the wiki software can provide context and navigation diagrams,
  "intelligent" search, and other useful features.

<p>
  Semantic Wikis can take advantage of
  all of the tooling that is being developed
  for the <a href="http://en.wikipedia.org/wiki/Semantic_Web"
            >Semantic Web</a>,
  so it would be foolish to try to predict all
  of the ways in which this information could be used.
  Expect the unexpected...
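
<p>
  As a sketch of the "harvesting" step,
  the following Python code pulls typed links out of page text
  and returns them as simple triples.
  It assumes a <code>[[type::target]]</code> notation,
  modeled on the annotation style used by Semantic MediaWiki;
  the function name and the triple layout are my own.

```python
import re

# Matches [[type::target]] and [[type::target|label]]; plain [[links]] are ignored.
TYPED_LINK = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def harvest_triples(page_title: str, wikitext: str) -> list[tuple[str, str, str]]:
    """Return (subject, link type, target) triples for each typed link."""
    return [
        (page_title, m.group(1).strip(), m.group(2).strip())
        for m in TYPED_LINK.finditer(wikitext)
    ]
```

<p>
  Once the triples are collected,
  they can feed diagrams, searches, or any other tool
  that understands subject/relation/object data.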

<h4>Requested Content</h4>
<p>
  The most interesting possibilities, however,
  may be in the area of "requested content".
  Using a facility such as 
  <a href="http://meta.wikimedia.org/wiki/Transwiki:Wikimania05/Workshop-TG3">Getlets</a>,
  a wiki page can request arbitrary content from arbitrary servers.
  The content may be cached (e.g., static, with occasional updates)
  or generated and included on a dynamic basis.

<p>
  Let's say that a user is editing an article on a favorite sports team.
  By creating a link to the appropriate server,
  s/he could include the current standings of the team
  in the middle of the current web page.
  Something like:

<pre>
  ==Current Standings==

  {{sports_team_standings:Wombats | current}}
</pre>

<p>
  The resulting content might be a formatted table,
  a generated image, an interactive
  (e.g., <a href="http://en.wikipedia.org/wiki/Ajax_(programming)"
           >AJAX</a>) region,
  or even a whiz-bang, multimedia presentation.
  After all, (most) wiki software is based on the web,
  so any web technology can be used on a wiki!
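
<p>
  Here is a minimal Python sketch of how such requests might be expanded,
  including the "cached, with occasional updates" case.
  Everything in it is hypothetical: the request syntax is simply the one
  shown in the example above, and the handlers are local functions
  standing in for calls to remote servers.

```python
import re
import time

# Matches {{name:subject}} and {{name:subject | option}}.
REQUEST = re.compile(r"\{\{\s*(\w+)\s*:\s*([^|{}]+?)\s*(?:\|\s*([^{}]*?)\s*)?\}\}")


class RequestedContent:
    """Expand content requests through registered handlers, with a TTL cache."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self.handlers = {}
        self.ttl = ttl_seconds
        self.clock = clock
        self.cache = {}  # (name, subject, option) -> (expiry time, content)

    def register(self, name, handler):
        """Associate a request name with a function(subject, option)."""
        self.handlers[name] = handler

    def expand(self, wikitext: str) -> str:
        """Replace each recognized request with its (possibly cached) content."""
        return REQUEST.sub(self._resolve, wikitext)

    def _resolve(self, match) -> str:
        name, subject = match.group(1), match.group(2)
        option = match.group(3) or ""
        handler = self.handlers.get(name)
        if handler is None:
            return match.group(0)  # unknown request: leave the text alone
        key = (name, subject, option)
        now = self.clock()
        hit = self.cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, so skip the handler
        content = handler(subject, option)
        self.cache[key] = (now + self.ttl, content)
        return content
```

<p>
  Setting the time-to-live to zero gives fully dynamic content;
  a large value approximates the static case.
  The questions raised below (trust, security, resource use)
  would all land in the handler and in the policy for registering it.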

<p>
  To be sure, there are many issues to consider
  in allowing this sort of activity.
  Is the server reliable and trustworthy?
  Does the facility open up security holes?
  Will it be a resource sink or degrade the user experience?
  Can users be trusted to honor copyrights and other legal provisions?

<p>
  However, as Wikipedia's
  <a href="http://en.wikipedia.org/wiki/Jimmy_Wales">Jimmy Wales</a> notes,
  we don't put restaurant patrons in cages,
  for fear that they will attack each other with their steak knives.
  So, let's start looking into ways that this technology can serve the world!

<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/Getlet"
                               rel="tag">Getlet</a>,

  <a href="http://www.technorati.com/tag/MediaWiki"
                               rel="tag">MediaWiki</a>,

  <a href="http://www.technorati.com/tag/Model-based+Documentation"
                               rel="tag">Model-based Documentation</a>,

  <a href="http://www.technorati.com/tag/semantic+wiki"
                               rel="tag">semantic wiki</a>,

  <a href="http://www.technorati.com/tag/wiki"
                               rel="tag">wiki</a>,

  <a href="http://www.technorati.com/tag/Wikipedia"
                               rel="tag">Wikipedia</a>
</p>
]]>
    </content>
</entry>

<entry>
    <title>Ontiki: first steps</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/001045.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=1045" title="Ontiki: first steps" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.1045</id>
    
    <published>2006-04-14T04:35:55Z</published>
    <updated>2006-05-28T01:14:07Z</updated>
    
    <summary>Previous weblog entries (Ontiki: an ontology-aware wiki, Mechanically-augmented wikis) have discussed the possibility of creating structured wikis, using mechanical (i.e., software) augmentation. This entry is a very early status report, discussing my initial experiments and early progress in this effort....</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[Previous weblog entries
(<a href="http://www.cfcl.com/rdm/weblog/archives/000902.html"
   >Ontiki: an ontology-aware wiki</a>,
 <a href="http://www.cfcl.com/rdm/weblog/archives/001002.html"
   >Mechanically-augmented wikis</a>)
have discussed the possibility of creating structured wikis,
using mechanical (i.e., software) augmentation.
This entry is a very early status report,
discussing my initial experiments and early progress
in this effort.
]]>
        <![CDATA[<h4>Apologia</h4>

<p>
I subscribe to a number of mailing lists that discuss
<a href="http://en.wikipedia.org/wiki/Knowledge_engineering"
  >knowledge engineering</a> and related topics.
I've also skimmed through many related books and papers.
The main thing that has become clear to me, from all of this study,
is that knowledge engineering is a complex, subtle,
and (as yet) ill-defined discipline.

<p>
If the experts can't agree on how to approach the creation of
<a href="http://en.wikipedia.org/wiki/Ontology_%28computer_science%29"
  >ontologies</a>, or even how to define the relevant terminology,
can an application programmer such as myself hope to do anything useful?
Well, <a href="http://en.wikipedia.org/wiki/Larry_Wall">Larry Wall</a> says
that <i>hubris</i>, <i>laziness</i>, and <i>impatience</i>
are the three great virtues of programmers.
If I can harness the latter two properly, I may be able to justify the first...

<h4>Getting Started</h4>

<p>
A full Ontiki system will require the integration
of a variety of technologies,
but I have to start <i>somewhere</i>.
So, following the techie commandment to
"<a href="http://www.cfcl.com/rdm/MBD/Eat_one's_own_dog_food"
     >eat your own dog food</a>",
I decided to use a mechanically-augmented wiki
to develop and display a naive ontology for the Unix operating system.
See <a href="http://www.cfcl.com/rdm/MBD/mbd_cs_unix.php"
      >MBD: Case Study (Unix)</a> for background information, etc.

<p>
The underlying wiki technology is provided by
<a href="http://en.wikipedia.org/wiki/MediaWiki">MediaWiki</a>,
a <a href="http://en.wikipedia.org/wiki/PHP">PHP</a>-based wiki
that is used by <a href="http://en.wikipedia.org">Wikipedia</a>
and many other substantial wikis.
MediaWiki is convenient, portable, full-featured, and robust;
it also has active user and developer communities.

<h4>Wiki Access</h4>

<p>
Although I am intrigued by the notion of
<a href="http://www.cfcl.com/rdm/weblog/archives/000999.html"
   >using DBMS tables for inter-application communication</a>,
adding a page to MediaWiki involves updating about a dozen tables.
This requires a much closer relationship with MediaWiki's logic
than I have any interest in having.

<p>
Fortunately, others have blazed a trail that I can follow.
<a href="http://meta.wikimedia.org/wiki/Pywikipediabot">Pywikipedia</a>
is a <a href="http://en.wikipedia.org/wiki/Python_programming_language"
       >Python</a>-based "bot" (robot) framework
for <a href="http://en.wikipedia.org/wiki/HyperText_Transfer_Protocol"
      >HTTP</a> client scripts.
Although it was originally developed for accessing
<a href="http://en.wikipedia.org/">Wikipedia</a>,
it can access any MediaWiki-based wiki.
Using this framework, I can get and put the text of wiki pages,
upload images, etc.

<h4>Data Storage</h4>

<p>
Of course, this raises the question of where to store the information
that is used to <i>create</i> the wiki pages.
I hope to migrate, in time, to an
<a href="http://en.wikipedia.org/wiki/Relational_database_management_system"
  >RDBMS</a>, a knowledge base framework such as
<a href="http://protege.stanford.edu">Prot&eacute;g&eacute;</a>,
and/or a knowledge representation and reasoning system such as
<a href="http://www.isi.edu/isd/LOOM/PowerLoom/">PowerLoom</a>.
However, I'm still not sure which way to jump.

<p>
So, for the moment, I'm using an informal approach:
a directory tree of several dozen hand-edited
<a href="http://en.wikipedia.org/wiki/YAML">YAML</a> files.
Each directory corresponds to a class;
a YAML file in each directory defines the class,
lists its relationships, etc.
The YAML files are readable, self-documenting,
and extremely flexible in structure.
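
<p>
  A short Python sketch of how such a tree might be scanned follows.
  The file name <code>class.yaml</code> is an assumption
  (the actual name used in each directory is not important here),
  and the sketch only locates the files; it does not parse them.

```python
from pathlib import Path


def find_class_files(root: Path, filename: str = "class.yaml") -> dict[str, Path]:
    """Map each class name (taken from its directory name) to its definition file."""
    found = {}
    for path in sorted(root.rglob(filename)):
        name = path.parent.name
        if name in found:
            # Two directories with the same name would define the same class twice.
            raise ValueError(f"duplicate class directory name: {name}")
        found[name] = path
    return found
```

<p>
  Because the class name comes from the directory,
  moving a directory changes where a class sits in the tree
  without any edits to the file itself.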

<p>
Editing an extensive hierarchy of files is quite awkward
using command-line tools (e.g.,
<a href="http://en.wikipedia.org/wiki/Vi">vi</a>).
Fortunately, my explorations into
<a href="http://en.wikipedia.org/wiki/Ruby_on_Rails">Ruby on Rails</a>
introduced me to the excellent
<a href="http://macromates.com/">TextMate</a> editor.
Using its Project drawer,
I can view the directory hierarchy,
disclose and hide sub-trees,
and jump into any desired file at the click of a mouse.

<p>
Every so often,
I use a <a href="http://en.wikipedia.org/wiki/Perl">Perl</a> script
to concatenate the files, process the definitions,
and load the web pages.
The script looks for problems (e.g., missing definitions)
and generates assorted cross-references, indexes, etc.
This run takes about five seconds per page, however,
so I don't do it all that often.
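
<p>
  The checking and cross-referencing step can be sketched in a few lines.
  This is Python rather than Perl, and it is not the script described above;
  the keys <code>is_a</code> and <code>related</code> are hypothetical names
  for whatever relationships the definitions actually list.

```python
def check_and_index(classes: dict[str, dict]) -> tuple[list[str], dict[str, list[str]]]:
    """Return (names that are referenced but undefined, reverse cross-reference)."""
    missing = set()
    referenced_by = {name: [] for name in classes}
    for name, definition in classes.items():
        targets = definition.get("is_a", []) + definition.get("related", [])
        for target in targets:
            if target not in classes:
                missing.add(target)  # a problem to report
            else:
                referenced_by[target].append(name)
    return sorted(missing), {k: sorted(v) for k, v in referenced_by.items()}
```

<p>
  The reverse index is what lets each generated page
  list the classes that refer to it,
  even though the source files only record links in one direction.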

<h4>The Ontology</h4>

<p>
As mentioned above, this is a "naive" ontology.
I don't expect it to support much in the way of deduction;
it simply has to provide a way
to organize classes and instances of entities and relationships.
At this point, I'm only working with "abstract classes";
once I get these sketched in,
I can look at lower levels
(e.g., "concrete classes", "instances").

<p>
For simplicity and flexibility,
I treat everything as entities.
This includes not only conventional entities (e.g., 
<a href="http://www.cfcl.com/rdm/mediawiki/index.php/File_node_%28AC%29"
>file node</a>),
but also attributes (e.g.,
<a href="http://www.cfcl.com/rdm/mediawiki/index.php/Is_interpreted_%28AC%29"
>is interpreted</a>) and relationships (e.g.,
<a href="http://www.cfcl.com/rdm/mediawiki/index.php/May_include_%28AC%29"
>may include</a>).
Attributes may have zero or one value;
relationships may have two or more roles.
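
<p>
  One way to express these rules in code is shown below.
  This is only a sketch in Python, with class and field names of my choosing;
  the point is that attributes and relationships are themselves entities,
  with the cardinality rules checked when an object is created.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str


@dataclass
class Attribute(Entity):
    """An entity that carries zero or one value."""
    values: list = field(default_factory=list)

    def __post_init__(self):
        if len(self.values) > 1:
            raise ValueError("an attribute may have zero or one value")


@dataclass
class Relationship(Entity):
    """An entity that connects two or more roles."""
    roles: list = field(default_factory=list)

    def __post_init__(self):
        if len(self.roles) < 2:
            raise ValueError("a relationship needs two or more roles")
```

<p>
  Since everything derives from <code>Entity</code>,
  the same page generator can handle all three kinds of object.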

<p>
The top level of the ontology
(<a href="http://www.cfcl.com/rdm/mediawiki/index.php/Thing_%28AC%29"
   >thing</a>) is currently a bit of a muddle.
The reason for this is that I'm mostly just entering concepts at this point.
I suspect that a bit more order will emerge
as I actually try to <i>use</i> these classes for something!

<p>
Anyway, feel free to take a look.
The <a href="http://www.cfcl.com/rdm/mediawiki/index.php/AC_Index"
      >Abstract Class</a> (AC) Index is structured (roughly :-)
as an "is a" tree
(i.e., <a href="http://en.wikipedia.org/wiki/taxonomy"
         >taxonomy</a>).
However, because it allows multiple inheritance,
it's really a <a href="http://en.wikipedia.org/wiki/Directed_acyclic_graph"
         >directed acyclic graph</a> (DAG).
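
<p>
  The practical difference between a tree and a DAG
  shows up when you ask for a class's ancestors.
  The Python sketch below assumes the "is a" links are available
  as a dictionary from each class to its parents
  (the class names in any example data are made up);
  it uses the standard <code>graphlib</code> module to check for cycles.

```python
from graphlib import CycleError, TopologicalSorter


def ancestors(parents: dict[str, list[str]], name: str) -> set[str]:
    """All classes reachable from `name` by following "is a" links upward."""
    seen = set()
    stack = list(parents.get(name, []))
    while stack:
        current = stack.pop()
        if current not in seen:  # with multiple inheritance, paths can rejoin
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def is_acyclic(parents: dict[str, list[str]]) -> bool:
    """True when the "is a" links contain no cycle, i.e., they form a DAG."""
    try:
        TopologicalSorter(parents).prepare()
    except CycleError:
        return False
    return True
```

<p>
  A class with two parents simply gets the union of both ancestor sets;
  a cycle, on the other hand, would mean that the "is a" data is broken.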

<p>
If you are interested in helping me fill in the ontology,
and are knowledgeable about some part of Unix,
please <a href="mailto:Rich Morin &lt;rdm@cfcl.com&gt;?Subject=Ontiki"
         >contact me</a>.
There are some substantial parts of the ontology (e.g., networking)
that I've simply punted on, for lack of knowledge and/or time.


<h4>What's Next</h4>

<p>
Now that I can generate and upload wiki pages,
I get to choose between enhancing their content
and improving their appearance and usability.
In all likelihood, I'll work on all of these areas.
Comments, suggestions, and help are all welcome...

<p>
Image-mapped diagrams work well for context and navigation,
but they aren't supported by MediaWiki "out of the box".
So, I'm looking into ways to provide this capability.
For examples of how I've used these diagrams in the past,
see <a href="http://www.cfcl.com/rdm/MBD/mbd_cs_fsw.php"
      >MBD: Case Study (FSW)</a>.

<p>
<a href="http://upload.wikimedia.org/wikibooks/en/a/a9/Wikimania05_Workshop_TG3.pdf"
   rel="nofollow">Getlets</a> extend
<a href="http://en.wikipedia.org/wiki/InterWiki">InterWiki</a> notation,
allowing wiki pages to draw upon arbitrary dynamic content.
This could allow users to create their own dynamic reports,
simply by editing a wiki page and asking for the right content.

<p>
<a href="http://www.isi.edu/isd/LOOM/PowerLoom/">PowerLoom</a>
is a well-regarded knowledge representation and reasoning system.
I have downloaded a copy and am trying to get up to speed on it,
as well as evaluate its suitability
as a back-end server for Ontiki.

<p>
Clearly, there is no shortage of interesting research directions.
Stay tuned; I'll let you know what I find out...

<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/knowledge+representation"
                               rel="tag">knowledge representation</a>,

  <a href="http://www.technorati.com/tag/MediaWiki"
                               rel="tag">MediaWiki</a>,

  <a href="http://www.technorati.com/tag/Model-based+Documentation"
                               rel="tag">Model-based Documentation</a>,

  <a href="http://www.technorati.com/tag/Ontiki"
                               rel="tag">Ontiki</a>,

  <a href="http://www.technorati.com/tag/ontology"
                               rel="tag">ontology</a>,

  <a href="http://www.technorati.com/tag/PowerLoom"
                               rel="tag">PowerLoom</a>,

  <a href="http://www.technorati.com/tag/Pywikipedia"
                               rel="tag">Pywikipedia</a>,

  <a href="http://www.technorati.com/tag/semantic+wiki"
                               rel="tag">semantic wiki</a>,

  <a href="http://www.technorati.com/tag/TextMate"
                               rel="tag">TextMate</a>,

  <a href="http://www.technorati.com/tag/Unix"
                               rel="tag">Unix</a>,

  <a href="http://www.technorati.com/tag/wiki"
                               rel="tag">wiki</a>]]>
    </content>
</entry>

<entry>
    <title>Our Spotlight book is out!</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/001010.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=1010" title="Our Spotlight book is out!" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.1010</id>
    
    <published>2006-03-13T01:21:28Z</published>
    <updated>2006-04-13T05:22:09Z</updated>
    
    <summary>Spotlight, introduced in Mac OS X 10.4 (Tiger), is Apple&apos;s new desktop search feature. Although it isn&apos;t perfect, it&apos;s quite a useful addition to other forms of file-system navigation. So, when SpiderWorks asked me to write a book on the topic, I jumped at the chance. Now, after a year of off-and-on effort, the book is available for purchase....</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Books, Movies, Music" />
    
        <category term="Computers" />
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[<a href="http://www.apple.com/macosx/features/spotlight/"
  >Spotlight</a>, introduced in
<a href="http://www.apple.com/macosx">Mac OS X</a> 10.4 (Tiger),
is <a href="http://www.apple.com">Apple</a>'s
new desktop search feature.
Although it isn't perfect,
it's quite a useful addition to other forms of file-system navigation.
So, when <a href="http://www.spiderworks.com">SpiderWorks</a>
asked me to write a book on the topic, I jumped at the chance.
Now, after a year of off-and-on effort,
the book is available for
<a href="http://www.spiderworks.com/books/spotlight.php">purchase</a>.]]>
        <![CDATA[<p>
As the proprietor of Prime Time Freeware,
I was the credited author or editor for a few dozen books
and the internal editor for several others.
I've also written some 200 articles for trade magazines (e.g.,
<a href="http://www.mactech.com/">MacTech</a>,
SunExpert, SunWorld,
<a href="http://www.unixreview.com/">Unix Review</a>).
In short, I'm reasonably familiar
with the writing, editing, and publication process.


<p>
However, this was my first experience writing a book for another publisher.
It was also my first experience at being a (recognized) co-author.
I'm pleased to report that the experience was remarkably painless.
If you're interested in writing a technical book,
especially on an Apple-related topic,
SpiderWorks is well worth considering.

<h2>Collaboration</h2>
<p>
One of my concerns, when this book was first proposed,
was that I'm not really an Apple developer.
That is, I haven't made extensive use of
<a href="http://developer.apple.com/cocoa">Cocoa</a> frameworks, the
<a href="http://www.apple.com/macosx/features/xcode">Xcode</a>
<a href="http://en.wikipedia.org/wiki/Integrated_development_environment"
  >IDE</a> (Integrated Development Environment), etc.
Most of my programming, in fact, is done in
<a href="http://en.wikipedia.org/wiki/Perl">Perl</a>
and other <strike>scripting</strike> agile languages.

<p>
Fortunately, SpiderWorks had another author on tap.
<a href="http://www.spiderworks.com/authors/davidhill.php"
  >David Hill</a> had already written one book for them,
so he was familiar with their procedures.
He was also well qualified to research and write the chapters
on "Accessing Spotlight Using Core Services",
"Accessing Spotlight Using Foundation", and
"Writing a Spotlight Importer".

<p>
This allowed me to concentrate on higher-level topics:
"Introduction", "Overview", "Data and Metadata",
"Query Strings", and "Saved Searches".
I also dug through the relevant files and documentation,
generating an annotated appendix of "Spotlight Keys".

<p>
Finally, I developed some AppleScript and Perl code
that allowed Spotlight to index
(and Personal Web Sharing to present)
assorted "include files", "man pages", etc.
I covered this work in an appendix named "Indexing Darwin Files".
Although the code is admittedly preliminary,
it demonstrates some interesting and useful tricks.

<p>
Having done a number of long-distance projects,
I was comfortable with the fact that David
was (only) two time zones away.
Calling (say) Australia from California can be awkward,
because of the large time difference;
calling Texas is comparatively simple.
That said, we used email for most of our interaction.

<h2>Publication</h2>
<p>
Having nursed dozens of books through the publication process,
it was interesting (and a great relief :-)
to have that chore handled by SpiderWorks.
I also found it interesting to see my pedestrian formatting efforts
turned into an artistic and attractive layout.

<p>
Peculiarly, I don't actually <i>know</i> what tools
SpiderWorks uses to generate the downloadable PDF files.
I do know that all of us passed Microsoft Word files back and forth,
using a set of SpiderWorks templates to define the basic styles.

<p>
These templates covered all of our basic needs,
but couldn't handle those of the "Spotlight Keys" appendix.
However, SpiderWorks crafted a special template
just for this appendix, resolving that issue.
In summary, the tooling was quite adequate to the job.

<p>
Given that the Spotlight ebook sells for a fraction of the cost
of a conventional printed volume,
you might expect that my royalty wouldn't be worth mentioning.
However, three factors serve to mitigate this:

<ul>
  <p><li>
    SpiderWorks sells directly to its customers,
    so nothing is lost to distributors and retailers.

  <p><li>
    Publication is electronic, by default,
    so nothing is spent on "production costs".

  <p><li>
    SpiderWorks isn't rapacious,
    so authors get a reasonable portion of the proceeds.
</ul>

<p>
So, at the risk of repeating myself,
I would cheerfully recommend SpiderWorks to any author
who is considering writing a technical book,
particularly on an Apple-related topic.
I would also (all modesty aside :-) recommend this book
to any developer who wants to get started using Spotlight.


<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/Apple"
                               rel="tag">Apple</a>,

  <a href="http://www.technorati.com/tag/desktop+search"
                               rel="tag">desktop search</a>,

  <a href="http://www.technorati.com/tag/Mac+OS+X"
                               rel="tag">Mac OS X</a>,

  <a href="http://www.technorati.com/tag/Macintosh"
                               rel="tag">Macintosh</a>,

  <a href="http://www.technorati.com/tag/SpiderWorks"
                               rel="tag">SpiderWorks</a>]]>
    </content>
</entry>

<entry>
    <title>Mechanically-augmented wikis</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/001002.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=1002" title="Mechanically-augmented wikis" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.1002</id>
    
    <published>2006-03-07T05:31:11Z</published>
    <updated>2006-04-13T19:24:38Z</updated>
    
    <summary>I&apos;ve been thinking about ways to augment wikis
with mechanically-harvested information, navigation aids, etc.</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[I've been thinking about ways to augment
<a href="http://en.wikipedia.org/wiki/Wiki">wikis</a>
with mechanically-harvested information, navigation aids, etc.
The result would have the convenience and flexibility of wikis,
but wouldn't depend on humans to provide all of the content, links, etc.
As an example, let's consider the problem
of generating detailed documentation for large collections of software.
<!-- technorati tags start --><p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/knowledge representation" rel="tag">knowledge representation</a>, <a href="http://www.technorati.com/tag/model-based documentation" rel="tag">model-based documentation</a>, <a href="http://www.technorati.com/tag/ontiki" rel="tag">ontiki</a>, <a href="http://www.technorati.com/tag/ontology" rel="tag">ontology</a>, <a href="http://www.technorati.com/tag/semantic wiki" rel="tag">semantic wiki</a>, <a href="http://www.technorati.com/tag/wiki" rel="tag">wiki</a></p><!-- technorati tags end -->]]>
        <![CDATA[<p>
Following the <a href="http://www.cfcl.com/rdm/MBD"
                >Model-based Documentation</a> approach,
a documentation system should reflect (at least, in part)
the major entities and relationships of the system being documented.
This consistency eases both development and use,
because the same "mental model" works
for both the system and the documentation.

<h2>Entities and Relationships</h2>

<p>
In the case of a software system,
most of the entities will be files
(e.g., documents, libraries, programs, source code)
or programming constructs
(e.g., data structures, methods, modules, objects).
Some additional entities
(e.g., bug reports, developers, requirements, tests, users)
can provide useful context for the system model.

<p>
Given this list of entities,
it's easy to think of relationships that we might find.
Data structures and methods, for example,
may be defined within the context of objects.
The definitions reside in source code files,
which are written and maintained by developers.
Similarly, there are dependency relationships
between source and object files, functions, etc.

<p>
A software system may have millions of entities
(and many more relationships), but a workable
<a href="http://en.wikipedia.org/wiki/Ontology_%28computer_science%29"
  >ontology</a> (i.e., collection of class definitions)
can be quite compact.
Indeed, if you have to define more than a few dozen classes,
you're probably doing something wrong!
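
<p>
To make the "compact ontology" idea concrete, here is a minimal sketch in Python.
The class names (<tt>Developer</tt>, <tt>SourceFile</tt>, etc.)
and the <tt>check()</tt> helper are hypothetical, chosen only for illustration;
a real ontology would be derived from the system being documented.

```python
from dataclasses import dataclass

# Hypothetical entity classes, for illustration only.
ENTITY_CLASSES = {"Developer", "Function", "ObjectFile", "SourceFile"}

# Each relationship class constrains the classes of its two endpoints.
RELATIONSHIP_CLASSES = {
    "defined_in":    ("Function", "SourceFile"),
    "maintained_by": ("SourceFile", "Developer"),
    "built_from":    ("ObjectFile", "SourceFile"),
}

@dataclass(frozen=True)
class Entity:
    cls: str
    name: str

@dataclass(frozen=True)
class Relationship:
    cls: str
    subject: Entity
    obj: Entity

def check(rel):
    """Return True if the relationship instance conforms to the ontology."""
    expected = RELATIONSHIP_CLASSES.get(rel.cls)
    return expected == (rel.subject.cls, rel.obj.cls)

main_c = Entity("SourceFile", "main.c")
parse = Entity("Function", "parse_args")
alice = Entity("Developer", "alice")

# Millions of instances could hang off this handful of class definitions.
instances = [
    Relationship("defined_in", parse, main_c),
    Relationship("maintained_by", main_c, alice),
]
```

<p>
Note that the class tables stay tiny, regardless of how many instances are loaded.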

<p>
Once the classes of entities and relationships are defined,
you get to collect the instance data.
Any program that deals with entities and relationships
will have to store information about them.
Harvesting this information may be tedious,
but it is seldom challenging.
See <a href="http://www.cfcl.com/rdm/MBD/mbd_extraction.php"
      >MBD: The Extraction Phase</a>
for an overview of this process.

<h2>Presentation</h2>

<p>
Once the instance data has been collected,
it must be presented in a way
that can be navigated and absorbed by humans.
It surely isn't much use, just sitting in the database!
Fortunately, this is reasonably well-trodden territory.

<p>
Documentation systems can present relationships as cross-references,
use them to generate indexes and context diagrams, etc.
They can also use the information
to generate other charts, diagrams, and tables.
Diagrams of class inheritance, data flows, and module dependencies
can be very useful to programmers.
Managers can use charts or tables that show how many bugs
are getting past the development and testing teams.

<p>
With a bit more analysis,
a documentation system can detect second-order attributes,
such as "hot spots" in the code base.
These might be defined by numerical measures,
such as the amount of check-in activity
or the number of bug reports they appear in.
In short, any desired level of analysis can be performed,
on a continuous and painstaking basis.
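
<p>
As a rough sketch of such a numerical measure,
the following Python adds check-in counts to bug-report mentions
and reports any file that crosses a threshold.
The scoring rule, the threshold, and the sample data are all assumptions
made for the sake of the example.

```python
from collections import Counter

def hot_spots(checkins, bug_reports, threshold=3):
    """Return (file, score) pairs meeting the threshold, busiest first."""
    score = Counter(checkins)            # one point per check-in
    for report in bug_reports:
        score.update(set(report))        # one point per bug report mentioning the file
    return [(path, n) for path, n in score.most_common() if n >= threshold]

# Made-up activity data.
checkins = ["parser.c", "parser.c", "util.c", "parser.c", "main.c"]
bug_reports = [["parser.c", "main.c"], ["parser.c"]]

result = hot_spots(checkins, bug_reports)
```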

<p>
The biggest apparent challenge comes
from the fact that we may have millions of entities to present,
with many more relationships.
How can we present all of that, usefully, to the users?
Gosh, I thought you'd never ask!

<h2>How about a Wiki?</h2>

<p>
<a href="http://en.wikipedia.org">Wikipedia</a> has already demonstrated
that <a href="http://en.wikipedia.org/wiki/MediaWiki">MediaWiki</a>
can handle a million inter-related pages.
In fact, because each Wikipedia page
has a shadowing "discussion" page,
the number is arguably two million!
If humans can navigate Wikipedia,
they can certainly navigate a (properly designed :-) documentation suite
of the same scale.

<p>
As a useful first step,
we could populate a MediaWiki database
with machine-generated entries.
Some simple ground rules and a modicum of care
would keep the human- and machine-generated content
from interfering with each other.
For example, manual notations could be restricted to selected parts
of machine-generated pages and/or to the related discussion pages.
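
<p>
One possible ground rule can be sketched as follows:
the generator owns a clearly marked region of each page
and never touches anything outside it.
The marker strings and the <tt>regenerate()</tt> function are hypothetical;
MediaWiki itself defines no such convention.

```python
import re

# Hypothetical markers that delimit the machine-owned part of a page.
BEGIN, END = "%%MACHINE-BEGIN%%", "%%MACHINE-END%%"
_REGION = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)

def regenerate(page_text, generated):
    """Replace the machine-owned region, leaving human notes untouched."""
    block = f"{BEGIN}\n{generated}\n{END}"
    if _REGION.search(page_text):
        # A function is used as the replacement so backslashes stay literal.
        return _REGION.sub(lambda m: block, page_text, count=1)
    return block + "\n" + page_text      # first run: put the region at the top

page = f"{BEGIN}\nold listing\n{END}\n\n== Notes ==\nSee Fred about this module.\n"
updated = regenerate(page, "* depends on [[libfoo]]")
```

<p>
Human annotations below the region survive each regeneration,
which is the property the ground rules are meant to guarantee.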

<p>
It would be better, of course, to allow finer-grained mixing of content.
I don't know how this should be done,
but I'm quite confident that solutions will emerge
if the general approach proves useful...

<h2>Adding Reports, etc.</h2>

<p>
It would also be nice to let users request the inclusion
of particular "reports" in selected pages.
It's not hard to imagine a declarative or functional notation
that would safely let the user ask
for a particular table or graph
to be displayed.
(<i>Implementing</i> it might be a challenge;
 imagining it is not. :-)

<p>
Given that the mechanically-generated wiki pages are organized
around classes and instances of entities and relationships,
the report definitions could also reside in the wiki.
Some sort of <a href="http://en.wikipedia.org/wiki/Object-oriented_programming"
               >object-oriented</a> approach
might allow a report to "do the right thing"
when requested by a particular wiki page.

<p>
Following the same logic,
we could allow users to add instances of relationships
that they happen to know about.
For example, "program <tt>foo</tt> creates log file <tt>/var/log/foo</tt>".
Ideally, of course, there would be a way to integrate this information
into the structure of the wiki.

<p>
My essay on <a href="http://www.cfcl.com/rdm/Ontiki/"
  >Ontiki: an ontology-aware wiki</a>
sketches out some ideas about ways that this interaction might be supported.
<a href="http://www.cfcl.com/rdm/MBD/mbd_sem_wiki.php">Semantic wikis</a>,
which add the strengths of semantically-aware
(e.g., ontology-based) systems to wikis,
are already a good start in this direction.

<p>
There are some interesting coordination problems, to be sure.
How should the mechanized documentation system retrieve requests
from the wiki, return reports, edit pages, etc?
My current thinking is that this should all be done
through MediaWiki's underlying database,
but that's the subject of <a href="000999.html">another essay</a>.


<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/knowledge+representation"
                               rel="tag">knowledge representation</a>,

  <a href="http://www.technorati.com/tag/Model-based+Documentation"
                               rel="tag">Model-based Documentation</a>,

  <a href="http://www.technorati.com/tag/Ontiki"
                               rel="tag">Ontiki</a>,

  <a href="http://www.technorati.com/tag/ontology"
                               rel="tag">ontology</a>,

  <a href="http://www.technorati.com/tag/semantic+wiki"
                               rel="tag">semantic wiki</a>,

  <a href="http://www.technorati.com/tag/wiki"
                               rel="tag">wiki</a>]]>
    </content>
</entry>

<entry>
    <title>Using DBMS tables for inter-application communication</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000999.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=999" title="Using DBMS tables for inter-application communication" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.999</id>
    
    <published>2006-03-05T21:35:26Z</published>
    <updated>2006-04-13T19:27:18Z</updated>
    
    <summary>I have been thinking about ways to integrate some large applications and frameworks into an even larger system. In line with the Perl virtue of Laziness, I&apos;d like to write as little code as possible, particularly if it means making changes to the apps themselves. At the same time, I&apos;d like to avoid supporting a plethora of interfaces and protocols. Fortunately, I may have hit upon a useful approach. Technorati Tags: DBI-Link, DBI-Link, DBMS, ontology, Perl DBI, PostgreSQL, RDBMS, semantic...</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[I have been thinking about ways to integrate
some large applications and frameworks into an even larger system.
In line with the Perl virtue of Laziness,
I'd like to write as little code as possible,
particularly if it means making changes to the apps themselves.
At the same time,
I'd like to avoid supporting a plethora
of interfaces and protocols.
Fortunately, I may have hit upon a useful approach.
<!-- technorati tags start --><p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/DBI-Link" rel="tag">DBI-Link</a>, <a href="http://www.technorati.com/tag/DBI-Link" rel="tag">DBI-Link</a>, <a href="http://www.technorati.com/tag/DBMS" rel="tag">DBMS</a>, <a href="http://www.technorati.com/tag/ontology" rel="tag">ontology</a>, <a href="http://www.technorati.com/tag/Perl DBI" rel="tag">Perl DBI</a>, <a href="http://www.technorati.com/tag/PostgreSQL" rel="tag">PostgreSQL</a>, <a href="http://www.technorati.com/tag/RDBMS" rel="tag">RDBMS</a>, <a href="http://www.technorati.com/tag/semantic wiki" rel="tag">semantic wiki</a>, <a href="http://www.technorati.com/tag/wiki" rel="tag">wiki</a></p><!-- technorati tags end -->]]>
        <![CDATA[<p>
The components in question are quite diverse,
including a Business Intelligence Platform
(<a href="http://www.pentaho.org">Pentaho</a>),
an Ontology Editor and Knowledge Base Framework
(<a href="http://protege.stanford.edu/">Prot&eacute;g&eacute;</a>)
and a Semantic Wiki
(<a href="http://meta.wikimedia.org/wiki/Semantic_MediaWiki"
   >Semantic MediaWiki</a>),
just for starters.
They are written in assorted languages (e.g., C, Java, PHP)
and support wildly diverse communication protocols.

<p>
However, each of them uses an
<a href="http://en.wikipedia.org/wiki/RDBMS">RDBMS</a>
(Relational Data Base Management System) for its persistent storage.
So, if I could find a way to transform and exchange database tables,
I might be able to effect the particular kinds of communication I need.
However, there are two obvious problems with this approach.

<p>
First, there is no consensus on <i>which</i> RDBMSs to support.
Some apps support more than one, but there is no "universal" choice.
Modifying all of the apps to use a common DBMS (e.g.,
<a href="http://en.wikipedia.org/wiki/MySQL"     >MySQL</a>,
<a href="http://en.wikipedia.org/wiki/PostgreSQL">PostgreSQL</a>)
is a non-starter, as it is insufficiently Lazy.

<p>
Second, there is the problem of reconciling the schemata (etc)
of the different databases.
Transforming complex data structures is a daunting task.
Doing it <i>inside</i> the apps is a total non-starter.
Even if I were willing to deal
with different programming languages and tool sets,
the learning curves and maintenance issues would be killers.

<p>
However, it may be possible to "step around" these problems,
by means of an inter-DBMS "communications server" daemon.
Such a program could read and write assorted databases,
storing and transforming the data as needed.
However, this still raises the question
of how to access the databases.


<h2>DBMS Access</h2>

<p>
There are strong arguments for using each DBMS's specific interfaces.
For instance, this guarantees access to all of the DBMS's features.
However, this is an expensive approach
in terms of development and maintenance.
Each DBMS has its own flavor of SQL, specialized capabilities, etc.
Writing code that can handle all of these variations
is a mind-boggling prospect.

<p>
Consequently, many cross-database projects use abstract interfaces, such as 
<a href="http://en.wikipedia.org/wiki/Open_Database_Connectivity">ODBC</a>
(Open Database Connectivity), possibly in a language-specific form such as
<a href="http://en.wikipedia.org/wiki/Java_Database_Connectivity">JDBC</a>
(Java Database Connectivity).
In the Perl community, this role is commonly filled by the
<a href="http://en.wikipedia.org/wiki/Perl_DBI">Perl DBI</a>
(Perl Database Interface) module,
which can use JDBC, ODBC, DBMS-specific,
and even flat-file "drivers":

<ul>
The DBI is the standard database interface module for Perl.
It defines a set of methods, variables, and conventions
that provide a consistent database interface,
independent of the actual database being used.
<p>
-- Tim Bunce
</ul>

<p>
Unfortunately, the consistency of the Perl DBI interface
is limited by the features available in the underlying databases.
So, one can either use
<a href="http://en.wikipedia.org/wiki/Lowest_common_denominator"
  >lowest common denominator</a> approaches
or pay close attention to which features are available in which DBMS.
In addition, settling on Perl DBI
would actively hinder <i>language</i> portability.
What if I wanted to recode the server in Ruby?

<p>
Although none of these issues is a show-stopper for exploratory coding,
I can well imagine them ganging up on me as the project develops.
So, it would be nice to have a strategy that has growth potential.

<h2>Enter DBI-Link</h2>

<p>
Consequently, I'm quite pleased to know about 
<a href="http://fetter.org/DBI-Link.pdf">DBI-Link</a>,
David Fetter's imaginative and powerful bit of PL/Perl hackery.
By installing <a href="http://dbi.perl.org/">Perl DBI</a> (via <a
href="http://www.postgresql.org/docs/8.1/static/plperl-trusted.html"
 >PL/PerlU</a>) as a
<a href="http://www.postgresql.org/">PostgreSQL</a> extension,
DBI-Link provides the wide connectivity of Perl DBI
<i>and</i> the advanced features of PostgreSQL.

<p>
The application (e.g., my server daemon) interacts directly with PostgreSQL,
so it can be written in any desired programming language.
Because PostgreSQL supports a wide range of database features,
the application can use simple queries
to perform complicated operations on the "target" databases.

<p>
Synchronization is clearly an issue.
We cannot expect a flat file (or even all RDBMSs)
to inform PostgreSQL when data has been changed.
If the timing requirements are loose,
a polling loop may suffice.
A small "flag" file or table can also be used to indicate
that a particular query should be made.
In the worst case, it may be necessary to make some small additions
to the target application.

<h2>Layering</h2>

<p>
Like many "layered" approaches,
DBI-Link can look rather convoluted in practice.
However, this also contributes greatly to its flexibility.
Here is an informal sketch of how my communications daemon might work,
with an expanded view of the DBI-Link "chain":

<p>
<img src="http://www.cfcl.com/rdm/weblog/images/db_daemon.png">
<img src="http://www.cfcl.com/rdm/weblog/images/dbi_link.png" height="300">

<h2>Requirements, Gotchas, etc.</h2>

<p>
DBI-Link has a relatively small set of requirements:

<ul>
  <p><li><a href="http://www.postgresql.org/"
           >PostgreSQL</a> (8.0 or better)

  <p><li><a
href="http://www.postgresql.org/docs/8.1/static/plperl-trusted.html"
           >PL/PerlU</a> (Untrusted PL/Perl)

  <p><li><a href="http://en.wikipedia.org/wiki/Perl"
           >Perl</a>       (5.8.5 or better)

  <p><li><a href="http://dbi.perl.org/"
           >Perl DBI</a> (aka, <tt>DBI.pm</tt>)

  <p><li>DBD modules for each data source
</ul>

<p>
Because DBI-Link consists of a small set of additions to Perl and PostgreSQL,
it does not impose any significant maintenance burden.
However, it should be noted that DBI-Link has <b>not</b> been written
with security in mind.
Indeed, the very idea of one database system making arbitrary queries
into other database systems is antithetical to basic ideas of secure design.

<p>
However, it should serve very well as a way to let me create my prototype.
And, if the results seem useful, a re-implementation can always be considered...


<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/DBI-Link"
                               rel="tag">DBI-Link</a>,

  <a href="http://www.technorati.com/tag/DBMS"
                               rel="tag">DBMS</a>,

  <a href="http://www.technorati.com/tag/MySQL"
                               rel="tag">MySQL</a>,

  <a href="http://www.technorati.com/tag/ontology"
                               rel="tag">ontology</a>,

  <a href="http://www.technorati.com/tag/Perl+DBI"
                               rel="tag">Perl DBI</a>,

  <a href="http://www.technorati.com/tag/PostgreSQL"
                               rel="tag">PostgreSQL</a>,

  <a href="http://www.technorati.com/tag/RDBMS"
                               rel="tag">RDBMS</a>,

  <a href="http://www.technorati.com/tag/semantic+wiki"
                               rel="tag">semantic wiki</a>,

  <a href="http://www.technorati.com/tag/wiki"
                               rel="tag">wiki</a>]]>
    </content>
</entry>

<entry>
    <title>Polyglot Programming</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000998.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=998" title="Polyglot Programming" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.998</id>
    
    <published>2006-03-05T07:00:03Z</published>
    <updated>2006-04-13T05:22:11Z</updated>
    
    <summary>Programmers who are facile with multiple languages
frequently combine them in single projects, to great effect.</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[Programmers who are facile with multiple languages frequently combine them,
to great effect, in single projects.
In <a href="http://www.cfcl.com/~rdm/weblog/archives/000910.html"
>Using PHP as a Macro Pre-processor</a>,
I only used two languages (HTML and PHP),
but others (e.g., CSS, JavaScript) could easily have been added.]]>
        <![CDATA[<p>
In a recent programming project,
I found myself combining Perl, PHP, SQL, and YAML.
Making things a bit more interesting,
three of these languages were used in a single file.
The results were a bit baroque,
but the approach allowed me to get a lot of flexibility
from a very small amount of code.

<p>
My client wanted a DBMS-driven report generator.
Users would "subscribe" to specified reports,
to be emailed on selected days.
Early each morning,
the program would perform the necessary SQL queries (etc),
generate the reports, and email them to the appropriate users.

<p>
Although a CGI-based subscription page may be added at some point,
an editable control file was all that I needed for the prototype.
So, I sketched up a <a href="http://www.yaml.org">YAML</a>
(YAML Ain't Markup Language) configuration file, of the following form:

<pre>
  # Describe each _query_, specifying any needed SQL statements
  # and/or executable scripts (e.g., output filters).

  Queries:
    _query_:
      sql:          |
        SELECT  foo
        FROM    Bar;
      script:       _script_

  # Describe each _report_, specifying each included _query_
  # and any surrounding text.

  Reports:
    _report_:
      subject:      'This is the subject line.'
      title:        'This is the title text.'
      description:  >
        This is a short (wrapped) description.
      content:
        - b:        |
            This is some block-formatted text.
        - t:        >
            This is some normal (wrapped) text.
        - q:        _query_

  # Describe each _user_, specifying each desired _report_
  # and when it should be delivered.

  Users:
    _user_:
      address:      'rdm@cfcl.com'
      full_name:    'Rich Morin'
      reports:
        - name:     _report_
          when:     'Mon Wed Fri'
</pre>

<p>
Even if you don't know YAML or SQL, this should be pretty easy to read.
As in Python, indentation is used to indicate structure.
However, unlike Python, YAML disallows tabs for indentation (whew!).

<p>
Labels that end in colons (e.g., <tt>Reports:</tt>) are hash keys.
A dash (<tt>-</tt>) indicates the start of a list element.
Values may be strings (e.g., <tt>'rdm@cfcl.com'</tt>), hashes, or lists.

<p>
The <tt>|</tt> and <tt>></tt> operators indicate that the following lines
will contain, respectively, block-formatted or line-wrapped text.
The next hash key or list sigil terminates this text.

<p>
Understanding the generated data structure is, however, a bit more complicated.
Using the <tt>Load()</tt> function
from Brian Ingerson's pure-Perl <tt>YAML.pm</tt> module,
my script obtains a reference (<tt>$r</tt>) to the data structure.
It can then access the enclosed content, as:

<pre>
  $r->{Queries}{$query }{sql}             # SQL command for $query

  $r->{Reports}{$report}{content}[0]{b}   # some block-formatted text
  $r->{Reports}{$report}{content}[2]{q}   # the name of a _query_

  $r->{Users  }{$user  }{reports}[0]{name}   # the name of a _report_
  $r->{Users  }{$user  }{reports}[0]{when}   # a list (string, really) of days
</pre>

<p>
The <tt>content</tt> area has a rather odd structure,
in that each list element is a hash with only one element.
This hack lets me add a content type flag
(e.g., <tt>b</tt>, <tt>q</tt>, <tt>t</tt>),
at a small cost in obscurity.
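
<p>
For readers who think more easily in Python than in Perl,
here is a sketch of how a program might walk this structure.
The dictionary is a hand-written stand-in for what a YAML loader would return,
and the function names, the sample report, and the fake database call
are all assumptions; my actual script was written in Perl.

```python
# Stand-in for the loaded configuration file.
config = {
    "Queries": {"event_count": {"sql": "SELECT COUNT(*) FROM Events;"}},
    "Reports": {
        "daily": {
            "subject": "Daily numbers",
            "content": [
                {"b": "Totals\n======\n"},
                {"t": "Events recorded so far.\n"},
                {"q": "event_count"},
            ],
        }
    },
    "Users": {
        "rdm": {
            "address": "rdm@cfcl.com",
            "reports": [{"name": "daily", "when": "Mon Wed Fri"}],
        }
    },
}

def due_reports(config, day):
    """Yield (address, report_name) for every subscription that lists this day."""
    for user in config["Users"].values():
        for sub in user["reports"]:
            if day in sub["when"].split():
                yield user["address"], sub["name"]

def render(config, report_name, run_query):
    """Build the report body by dispatching on each item's one-key type flag."""
    parts = []
    for item in config["Reports"][report_name]["content"]:
        (kind, value), = item.items()     # each list element is a single-key hash
        if kind in ("b", "t"):
            parts.append(value)
        elif kind == "q":
            parts.append(run_query(config["Queries"][value]["sql"]) + "\n")
        else:
            raise ValueError(f"unknown content type: {kind}")
    return "".join(parts)

fake_db = lambda sql: "42"                # stand-in for a real database call
monday = list(due_reports(config, "Mon"))
body = render(config, "daily", fake_db)
```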

<h2>Enter PHP...</h2>

<p>
Given that I wrote the application and the "output filter" scripts in Perl,
I was already using three languages (four, if you count the English text :-).
However, things had only <i>started</i> to get <strike>ugly</strike> interesting.
You see, many SQL queries are very similar in structure,
varying only in specific details.

<p>
According to the DRY (Don't Repeat Yourself) principle,
this is a Bad Thing:

<ul>
  Every piece of knowledge must have a single, unambiguous,<br>
  authoritative representation within a system.
  <p>
  - <a href="http://www.pragmaticprogrammer.com/ppbook/extracts/rule_list.html"
      >The Pragmatic Programmers' List of Tips</a>
</ul>

<p>
I wanted to "boil out" this repetition,
but I didn't want to embed a macro facility (or whatever)
in the report generator.
Recognizing that laziness can be a virtue,
I decided to use command-line PHP as a macro pre-processor.

<p>
The first steps were strictly administrative.
I made the file executable, changed its extension to <tt>.php</tt>,
and put a "shebang" line at the start of the file:

<pre>
  #!/usr/bin/env php
</pre>

<p>
I then had an executable file that would <i>emit</i> its own content
(except for the shebang line),
having first performed any requested PHP operations.
One of the pleasant side benefits of this
is that I no longer had to worry about the location of the file.
Just put it into <tt>~/bin</tt> and let the shell find it!

<p>
More critically, I could now mix PHP code into my YAML:

<pre>
  &lt;? # Loop over desired intervals.

    $i_hash  = array('today'     => 'NOW()',
                     'yesterday' => 'NOW() - INTERVAL 1 DAY',
                     'last_week' => 'NOW() - INTERVAL 7 DAY',
                    );

    $t_list  = array('foo', 'bar');

    foreach ($i_hash as $i_key => $i_val) {

      foreach ($t_list as $t_item) {
  ?>
      &lt;?= $t_item ?>_entries_for_&lt;?= $i_key ?>:
          sql:            |
            SELECT    COUNT(*)
            FROM      Events
            WHERE     date       =  &lt;?= "$i_val\n"  ?>
            AND       type       =  '&lt;?= $t_item ?>';
  &lt;? } } ?>
</pre>

<p>
This generates six queries, as follows:

<pre>
      foo_entries_for_today:
          sql:            |
            SELECT    COUNT(*)
            FROM      Events
            WHERE     date       =  NOW()
            AND       type       =  'foo';
      bar_entries_for_today:
      ...
</pre>

<p>
I should probably note that, unlike Perl's hashes,
PHP's associative arrays are traversed in a predictable order,
normally based on the order in which elements are defined.
Alternatively, the array's keys can be sorted
before the <tt>foreach</tt> loop traverses them:

<pre>
    ksort($i_hash);
</pre>
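
<p>
For comparison, the same expansion can be sketched in Python.
This is not what I used (the whole point was to avoid a separate generator),
but it may make the loop structure easier to follow.
The <tt>TEMPLATE</tt> string and <tt>expand()</tt> function are illustrative names.
Python dictionaries (3.7 and later) also preserve insertion order,
and <tt>sorted()</tt> could play the role of <tt>ksort()</tt>.

```python
intervals = {
    "today":     "NOW()",
    "yesterday": "NOW() - INTERVAL 1 DAY",
    "last_week": "NOW() - INTERVAL 7 DAY",
}
types = ["foo", "bar"]

# One YAML stanza; the {names} are filled in by str.format().
TEMPLATE = """\
{t}_entries_for_{i}:
    sql: |
      SELECT    COUNT(*)
      FROM      Events
      WHERE     date = {expr}
      AND       type = '{t}';
"""

def expand(intervals, types):
    """Return one stanza per (interval, type) pair, with intervals outermost."""
    return [TEMPLATE.format(t=t, i=i, expr=expr)
            for i, expr in intervals.items()
            for t in types]

stanzas = expand(intervals, types)
names = [s.split(":", 1)[0] for s in stanzas]
```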

<p>
In the actual application, a 200-line PHP/YAML/SQL file
generated 500 lines of YAML/SQL.
Although the file was admittedly more arcane-looking,
it certainly met the DRY requirement
and I felt more confident that I wasn't going
to have copy-and-paste errors, etc.

<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/Perl"
                               rel="tag">Perl</a>,

  <a href="http://www.technorati.com/tag/PHP"
                               rel="tag">PHP</a>,

  <a href="http://www.technorati.com/tag/report+generator"
                               rel="tag">report generator</a>,

  <a href="http://www.technorati.com/tag/SQL"
                               rel="tag">SQL</a>,

  <a href="http://www.technorati.com/tag/YAML"
                               rel="tag">YAML</a>]]>
    </content>
</entry>

<entry>
    <title>Graph-related notions about LinkedIn</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000950.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=950" title="Graph-related notions about LinkedIn" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.950</id>
    
    <published>2006-02-02T18:18:26Z</published>
    <updated>2006-04-13T05:15:36Z</updated>
    
    <summary>LinkedIn bills itself as &quot;an online network of more than 4.8 million experienced professionals from around the world, representing 130 industries&quot;. My spouse Vicki Brown has a weblog entry that gives a general introduction, but it says little about the graph-related aspects of the network. So, here are some initial observations, based on a day or two of my own explorations......</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[<a href="https://www.linkedin.com">LinkedIn</a> bills itself as
"an online network of more than 4.8 million experienced professionals
from around the world, representing 130 industries".
My spouse
<a href="http://www.cfcl.com/vlb">Vicki Brown</a> has a
<a href="http://www.cfcl.com/vlb/weblog/archives/000948.html">weblog entry</a>
that gives a general introduction,
but it says little about the graph-related aspects of the network.
So, here are some initial observations,
based on a day or two of my own explorations...]]>
        <![CDATA[<p>
LinkedIn's  "online network of professionals"
is supported by a dynamic (and presumably, database-backed) web site.
As a member of the network,
I get several pages, listing my "contacts",
showing my "profile", etc.
The links between these pages are the usual ones supported by
<a href="http://en.wikipedia.org/wiki/HTML">HTML</a>,
but the underlying structure is rather different.
Specifically, it's a binary, sparsely-connected, typeless, undirected graph:
<ul>
  <p><li>binary
  <p>
    Connections (read, edges) are made between pairs of members.
    There is no provision for defining triads, etc.
    However, this isn't a problem,
    partially because members can link to groups, etc.

  <p><li>sparsely-connected
  <p>
    The LinkedIn graph is very large (five million and growing),
    but each member only connects to a tiny fraction of it.
    Some members have hundreds or even thousands of links,
    but most have only a few dozen.

  <p><li>typeless
  <p>
    Edges have no "type",
    aside from that conferred by the nodes they connect.
    So, for example, there is no way
    to differentiate an "acquaintance" from a "friend".
    There is, however, a way to give "endorsements".

  <p><li>undirected
  <p>
    Edges can be traversed in either direction.
    Thus, if I have a connection to Fred,
    he will have a connection to me.

</ul>
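
<p>
As a rough illustration (a sketch only, not anything LinkedIn has published),
the "binary", "typeless", and "undirected" properties
can be modeled in Python with adjacency sets.
The <code>MemberGraph</code> class and the member names are hypothetical:

<pre><code>
from collections import defaultdict

class MemberGraph:
    """Undirected, untyped graph of pairwise connections (hypothetical)."""

    def __init__(self):
        # Maps each member to the set of members they are connected to.
        self.links = defaultdict(set)

    def connect(self, a, b):
        # Binary: an edge joins exactly two members.
        # Undirected: record it on both endpoints.
        self.links[a].add(b)
        self.links[b].add(a)

    def connected(self, a, b):
        # Typeless: the only thing an edge records is that it exists.
        return b in self.links.get(a, set())

graph = MemberGraph()
graph.connect("me", "Fred")
print(graph.connected("Fred", "me"))    # True
</code></pre>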

<p>
The connections go to other members,
but it is also possible to "link" to institutions
(e.g., schools, companies), groups, and interests (e.g., astronomy),
by mentioning them in the profile.
This allows me to search for people in an abstract manner
(e.g., people who worked at XYZ Company at the same time I did,
people that work in the ABC industry).

<p>
By adding connections,
I increase the size and/or connectivity of my portion of the graph.
This increases the
<a href="http://en.wikipedia.org/wiki/Network_effect">network effects</a>,
but the sparsity of the graph limits this to less than that predicted by
<a href="http://en.wikipedia.org/wiki/Metcalfe%27s_law">Metcalfe's Law</a>.

That is, unlike the Internet,
a member can't (trivially) connect to any other arbitrary member.
So, the fact that someone I don't know has joined a distantly connected subgraph
will have little bearing on my own connectivity.
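
<p>
To put rough numbers on the sparsity argument,
the following Python sketch compares the number of possible member pairs
with the number of connections that actually exist.
The member count comes from the text above;
the average of 36 connections per member is only an assumed stand-in
for "a few dozen", not a measured value:

<pre><code>
def potential_pairs(n):
    # Every distinct pair of members: n choose 2.
    return n * (n - 1) // 2

def actual_edges(n, avg_degree):
    # Each edge is counted once per endpoint, hence the division by two.
    return n * avg_degree // 2

members = 5_000_000   # "five million and growing"
avg_degree = 36       # ASSUMPTION: stand-in for "a few dozen"

density = actual_edges(members, avg_degree) / potential_pairs(members)
print(f"{density:.2e}")   # 7.20e-06
</code></pre>

Under these assumptions, fewer than one in a hundred thousand
of the possible pairs is actually connected.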

<p>
So much for half-baked theory.  In practice, LinkedIn seems to
work well for re-connecting with folks.  Connecting with folks
you've never met is also possible, but your mileage may vary.
The recipient of a contact request may evaluate it by looking
over your profile, the intervening contacts, etc.
If they aren't convinced that you're "worthy",
you won't get the connection.


<p>
Even though this is billed as a professional network, contacts
tend to be a mix of family, friends, peers, etc.  Links to peers
are probably going to be more useful most of the time, but links
to family and friends connect you "outside of your own circle",
which can be useful when you want to make a contact.

<p>
Despite its strongly graph-based architecture,
LinkedIn has few resources for graph exploration, visualization, etc.
However, I expect to see this deficiency remedied soon:
if LinkedIn doesn't provide these facilities, others will
(e.g., by means of Firefox plug-ins).


<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/graph"
                               rel="tag">graph</a>,

  <a href="http://www.technorati.com/tag/LinkedIn"
                               rel="tag">LinkedIn</a>,

  <a href="http://www.technorati.com/tag/network+effect"
                               rel="tag">network effect</a>
]]>
    </content>
</entry>

<entry>
    <title>Checking out Benford&apos;s Law...</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000942.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=942" title="Checking out Benford's Law..." />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.942</id>
    
    <published>2006-01-29T06:19:52Z</published>
    <updated>2006-04-13T05:24:27Z</updated>
    
    <summary>I spend far too much time following links in www.digg.com.</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[I spend far too much time following links on
<a href="http://www.boingboing.net">Boing Boing</a>,
<a href="http://www.digg.com">Digg</a>, and
<a href="http://slashdot.org">Slashdot</a>,
but they're really rather addictive
(and make a nice break from my other activities :-).

Anyway, after reading a fascinating article on
<a href="http://www.rexswain.com/benford.html">Benford's Law</a>
(courtesy of Digg),
I decided to check its assertions for myself...]]>
        <![CDATA[<p>
Benford's Law asserts that the distribution of the first digits
in many sets of numbers will follow the function "log10(1 + 1/d)",
where 'd' is the value of the digit.
So, for example, '1' will be the leading digit about 30% of the time.
Although the article gives a plausible explanation,
the assertion sounds so odd that I wanted to check it out for myself.

<p>
As it happens, I have a plausible data set very close at hand:
the byte counts of the files on my Mac OS X system.
So, I hacked up a bit of Perl code to walk the file tree,
tallying the file sizes.
Here's the code:

<pre><code>
#!/usr/bin/env perl
#
# fdf - first digit frequency (in file sizes)
#
#  Usage: sudo fdf
#
# Benford's Law predicts the probability of 'd' occurring as the first
# digit of a number to be log10(1 + 1/d), assuming that the number has
# at least four digits.  For details, see:
#
#  "Following Benford's Law, or Looking Out for No. 1"
#  http://www.rexswain.com/benford.html
#
# Written by Rich Morin, rdm@cfcl.com

use strict;
use warnings;

use File::Find;
use POSIX qw(log10);

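our @cnt = (0) x 10;  # tallies, indexed by leading digit (index 0 unused)
our $cnt = 0;         # total number of files tallied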

{
  my ($d, $data, $hist, $path, $pred);

  $path = '/';

  finddepth(\&#38;wanted, $path);

  print "cnt: $cnt\n\n";
  print "d  pred.  data           1         2         3         4\n";
  print "                123456789 123456789 123456789 123456789\n";

  foreach $d (1 .. 9) {

    $pred = 100 * log10(1 + 1.0/$d);
    $data = (100.0 * $cnt[$d]) / $cnt;
    $hist = '*' x int($data + 0.5);

    printf("%d (%4.1f): %4.1f  %s\n",
           $d, $pred, $data, $hist);
} }


sub wanted {

  my $s = -s $_;

  if (-f $_) {
    if ($s > 999) {
      $cnt++;
      $cnt[ substr($s, 0, 1) ]++;
} } }
</code></pre>

<p>
There's no real magic in the code above.
It just does a depth-first traversal of the file tree,
calling the "<code>wanted</code>" function for each node.
After the tree walk finishes,
the code prints the results:

<pre><code>
cnt: 766281

d  pred.  data           1         2         3         4
                123456789 123456789 123456789 123456789
1 (30.1): 32.1  ********************************
2 (17.6): 19.7  ********************
3 (12.5): 13.4  *************
4 ( 9.7):  9.4  *********
5 ( 7.9):  7.3  *******
6 ( 6.7):  6.0  ******
7 ( 5.8):  4.8  *****
8 ( 5.1):  4.1  ****
9 ( 4.6):  3.1  ***
</code></pre>

Although there is certainly some variation from the pred(icted) values,
the file system data follows the expected curve about as well as some
of the data shown in the article.
I also suspect that the file size distribution may be skewed
by certain "systematic" effects
(e.g., integral numbers of blocks in some types of binary files).
In any case, it's enough to convince me...


<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/Benford's+Law"
                               rel="tag">Benford's Law</a>,

  <a href="http://www.technorati.com/tag/mathematics"
                               rel="tag">mathematics</a>,

  <a href="http://www.technorati.com/tag/Perl"
                               rel="tag">Perl</a>]]>
    </content>
</entry>

<entry>
    <title>Using PHP as a Macro Pre-processor</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000910.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=910" title="Using PHP as a Macro Pre-processor" />
    <id>tag:www.cfcl.com,2006:/~rdm/weblog//3.910</id>
    
    <published>2006-01-02T06:44:24Z</published>
    <updated>2006-04-13T05:11:23Z</updated>
    
    <summary>I&apos;ve been using both markup languages (e.g., troff, HTML) and macro pre-processors (e.g., cpp, m4) for a couple of decades now.</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[I've been using macro pre-processors (e.g., cpp, m4)
since I switched to Unix a couple of decades ago.
I've also used assorted <i>ad hoc</i> processors.
The shell makes a handy runtime pre-processor for awk and sed scripts.
"Little language" processors can be written using Perl, YAML, etc.
In short, macro processing is not a new concept for me. 

<p>
Nor, for that matter,
is its application to document markup languages (e.g.,
<a href="http://en.wikipedia.org/wiki/Troff">Troff</a>,
<a href="http://en.wikipedia.org/wiki/HTML">HTML</a>).
In fact, I've written several special-purpose processors
to generate and/or massage documents.
Consequently, I'm a bit embarrassed to realize that I've been missing out
on a nifty way to use macros in web pages.]]>
        <![CDATA[<h2>Background</h2>

<p>
I use raw HTML for most of my hand-edited web pages.
I like the control that this gives me
and don't mind the extra typing (or mousing) it requires.
I also take the trouble to format my HTML "code"
(e.g., indenting, folding lines),
so that it is easy to read, edit, etc.

<p>
Unfortunately, the occasional humongous
<a href="http://en.wikipedia.org/wiki/Uniform_Resource_Locator">URL</a>
really gets in the way of my formatting preferences.
URLs can't be folded;
about the best I've been able to do is to put long URLs
on lines by themselves, so the damage is localized.
Repeating common portions of URLs
(e.g., <code>http://en.wikipedia.org/wiki</code>) is also a pain.
Grumble.

<h2>Enter PHP, by way of Ruby</h2>

<p>
Despite <a href="http://en.wikipedia.org/wiki/PHP">PHP</a>'s growing popularity,
I've never been all that interested in trying it.
Looks kind of like <a href="http://en.wikipedia.org/wiki/Perl">Perl</a>,
only broken (e.g., no support for variable argument lists!).
On the other hand, it certainly isn't <i>difficult</i>.

<p>
Meanwhile, I've been trying to wrap my
<a href="http://en.wikipedia.org/wiki/AARP">AARP</a>-qualified,
procedurally-oriented brain around the wild and crazy world
of <a href="http://en.wikipedia.org/wiki/Ruby">Ruby</a> and
<a href="http://en.wikipedia.org/wiki/Ruby_on_Rails">Rails</a>.
One of the niftier tricks that Ruby offers (and Rails uses)
is something called Embedded Ruby (aka ERb, eRuby).

<p>
The basic idea is that you can interleave arbitrary Ruby code
with declarative languages such as
<a href="http://en.wikipedia.org/wiki/HTML">HTML</a>,
<a href="http://en.wikipedia.org/wiki/XML">XML</a>,
<a href="http://en.wikipedia.org/wiki/YAML">YAML</a>, etc.
So, you can expand variables, make method calls,
iteratively generate blocks of text, etc.
Quite cool, really...

<p>
Unfortunately, ERb isn't (yet) part of the typical web server offerings.
So, as I was editing a bunch of HTML,
I started thinking about PHP.
Yeah, it's not Ruby (or even Perl), but it isn't <i>bad</i>.
Maybe I could use it to spare my poor, tired fingers.

<h2>Show Us the Code!</h2>

<p>
Ok, Ok...
The following code defines a string (<code>$WP</code>)
and a function (<code>ah</code>) that generates a typical HTML link.
I added a "<code>target</code>" attribute to the link,
to keep the original page from going away,
but still, it's all pretty vanilla:

<!-- <pre class="pretext"> -->
<pre>
  $WP = 'http://en.wikipedia.org/wiki';

  # ah - generate an "a href" link 
  #
  function ah($url, $text) {

    print "&lt;a href='$url'\n"
        . "   target='_blank'\n"
        . "  &gt;$text&lt;/a&gt;"
        ;
  }
</pre>

<p>
Here is some HTML/PHP source code, drawn from one of my
<a href="http://www.cfcl.com/rdm/MBD"
  >Model-based Documentation</a> pages:

<pre>
   About half of the pages are generated by 
   &lt;?= ah("$WP/Doxygen",                 'Doxygen'); ?&gt;, a well-known
   &lt;?= ah("$WP/Documentation_generator", 'documentation generator'); ?&gt;;
   the rest are generated by custom Perl scripts.
</pre>

I won't claim that this is exactly <i>pretty</i>,
but it's a lot shorter and easier to read (IMHO) than the pure HTML form:

<pre>
   About half of the pages are generated by 
   &lt;a href='http://en.wikipedia.org/wiki/Doxygen'
      target='_blank'
     &gt;Doxygen&lt;/a&gt;, a well-known
   &lt;a href='http://en.wikipedia.org/wiki/Documentation_generator'
      target='_blank'
     &gt;documentation generator&lt;/a&gt;;
   the rest are generated by custom Perl scripts.
</pre>

One of the real joys of this approach
is that I can format the calls as desired,
break URLs any place I like, etc:

<pre>
  &lt;? $T1 = 'Some Really Enormous URL...'; ?&gt;

  &lt;p&gt;
    &lt;?= ah($T1, 'yada yada...'); ?&gt;&lt;br&gt;
    &lt;?= ah('http://www.clues_are_optional.xyz'
         . '?tag1=1234567890123456789012345678901234567890'
         . '&#38;tag2=1234567890123456789012345678901234567890',
           'Clues are optional...'); ?&gt;
</pre>

Also note that, if I need to fiddle with the link format,
I may get lucky and be able to edit just the function definition!

<h3>It Gets Worse...</h3>

<p>
Because I wanted a navigation sidebar on each page,
and the ability to generate a "printable version" of the pages,
I actually ended up writing 200+ lines of function definitions, etc.
However, this meant that each web page needed very little code:

<pre>
  &lt;? include 'mbd_defines.php';
     do_page('Case Study');
  ?&gt;
    ...
    &lt;?= sect_head('Design Goals'); ?&gt;
    ...
    &lt;?= sect_head('Implementation'); ?&gt;
    ...
    &lt;?= next_link('mbd_advice.php', 'Advice'); ?&gt;
  &lt;? page_footer(); ?&gt;
</pre>

<p>
Although all of this may be old hat to most PHP programmers
(let alone those
<a href="http://en.wikipedia.org/wiki/Agile_software_development"
  >Agile</a> Rubyists :-),
I've found it to be a very useful addition to my bag of tricks.
Now, if I could only convince
<a href="http://ecto.kung-foo.tv">ecto</a>
to expand my PHP to HTML, on its way to
<a href="http://en.wikipedia.org/wiki/Movable_Type">Movable Type</a>...


<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/HTML"
                               rel="tag">HTML</a>,

  <a href="http://www.technorati.com/tag/macro+pre-processor"
                               rel="tag">macro pre-processor</a>,

  <a href="http://www.technorati.com/tag/PHP"
                               rel="tag">PHP</a>]]>
    </content>
</entry>

<entry>
    <title>Peirce&apos;s semeiotic as a foundation for ontology</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000903.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=903" title="Peirce's semeiotic as a foundation for ontology" />
    <id>tag:www.cfcl.com,2005:/~rdm/weblog//3.903</id>
    
    <published>2005-12-29T07:47:47Z</published>
    <updated>2006-02-02T18:01:57Z</updated>
    
    <summary>Conceptual graphs (CGs) are &quot;a system of logic based on the existential graphs of Charles Sanders Peirce and the semantic networks of artificial intelligence.&quot;...  The notions map quite well to the ones in Jeff Hawkins&apos; book &quot;On Intelligence&quot;, modulo differences in perspective, terminology, etc. FYI, John Sowa is an expert on knowledge representation systems.</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[<p>
John Sowa, an expert on knowledge representation systems,
developed <a href="http://conceptualgraphs.org/" target="_blank"
            >Conceptual Graphs</a> (CGs)
as a notation for First-Order Logic (a form of Predicate Calculus).
CGs are "a system of logic based on the existential graphs
of Charles Sanders Peirce
and the semantic networks of artificial intelligence".
The Conceptual Graphs Interchange Format (CGIF)
is being proposed to ISO as part of
<a href="http://cl.tamu.edu/" target="_blank">Common Logic</a> (CL),
which seeks to define logic-based formats for knowledge interchange.


<p>
I have been following the discussions on the CG and CL mailing
lists with great interest (if not always complete understanding :-).]]>
        <![CDATA[John Sowa's postings tend to be quite interesting, if a bit chewy.
The ideas in this recent posting map quite well to the ones in Jeff Hawkins' book
"<a href="http://www.onintelligence.org" target="_blank">On Intelligence</a>",
despite some differences in perspective, terminology, etc:

<blockquote>
From: "John F. Sowa"<BR>
Subject: CG: Peirce's semeiotic as a foundation for ontology

<p>
In many messages, I have claimed that Peirce's writings are fundamental to
the problems of ontology.  These remarks triggered several responses in
another forum, and I have excerpted, revised, and assembled my replies in the
following summary.
<HR width=100 align="left">
Everything that is perceived is perceived by means of a sign, which may be
just a sign of itself.  But more likely it is a sign of just some aspect of
the thing, such as an image, a feeling, a change in temperature, pressure,
sweetness, salinity, etc.
...
</blockquote>
<a href="http://suo.ieee.org/email/msg13283.html" target="_blank"
  >Read the rest of Dr. Sowa's posting</a> ...]]>
    </content>
</entry>

<entry>
    <title>Ontiki: an ontology-aware wiki</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000902.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=902" title="Ontiki: an ontology-aware wiki" />
    <id>tag:www.cfcl.com,2005:/~rdm/weblog//3.902</id>
    
    <published>2005-12-29T04:06:35Z</published>
    <updated>2006-04-13T19:27:20Z</updated>
    
    <summary>Ontiki is a proposed design for an ontology-aware wiki, combining a wiki&apos;s convenience and freedom with the strengths of ontology-based systems.</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[As discussed in my notes on
<a href="http://www.cfcl.com/rdm/MBD" target="_blank"
  >model-based documentation</a>,
I'm quite interested in tools that make it easier
to combine human-edited and machine-generated content.
Ontiki is a proposed design for one such tool,
based on currently available
<a href="http://www.opensource.org" target="_blank"
  >Open Source</a> software.

<p>
As an <a href="http://en.wikipedia.org/wiki/Ontology_%28computer_science%29" target="_blank"
        >ontology</a>-aware
<a href="http://en.wikipedia.org/wiki/Wiki" target="_blank">wiki</a>,
Ontiki would allow pages to represent classes and instances
of entities and relationships.
The hope is that it could combine a wiki's convenience and freedom
with the strengths of ontology-based systems,
allowing a graceful merging of human-edited
and mechanically generated content.
<!-- technorati tags start --><p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/knowledge representation" rel="tag">knowledge representation</a>, <a href="http://www.technorati.com/tag/model-based documentation" rel="tag">model-based documentation</a>, <a href="http://www.technorati.com/tag/ontiki" rel="tag">ontiki</a>, <a href="http://www.technorati.com/tag/ontology" rel="tag">ontology</a>, <a href="http://www.technorati.com/tag/semantic wiki" rel="tag">semantic wiki</a>, <a href="http://www.technorati.com/tag/wiki" rel="tag">wiki</a></p><!-- technorati tags end -->]]>
        <![CDATA[<h2>Background</h2>

<p>
An <a href="http://en.wikipedia.org/wiki/Ontology_%28computer_science%29" target="_blank"
     >ontology</a>, in computer science parlance,
is a set of statements (primarily, definitions) concerning a
<a href="http://en.wikipedia.org/wiki/Domain_of_discourse" target="_blank"
     >domain of discourse</a>.
Thus, an ontology defines the things we are talking about
and makes assertions about their attributes and relationships.
For example, the statement "control files may be read by processes"
might be found in an ontology for computer software.

<p>
Ontologies are very popular these days.
They are used extensively in
<a href="http://en.wikipedia.org/wiki/Knowledge_engineering" target="_blank"
     >knowledge engineering</a>
and form a critical part of the proposed 
<a href="http://en.wikipedia.org/wiki/Semantic_Web" target="_blank"
  >semantic web</a>.
Unfortunately, I haven't found any ontology-based tools
that strike me as being as flexible and easy to use as the typical wiki.

<p>
<a href="http://en.wikipedia.org/wiki/Wiki" target="_blank">Wikis</a>
are extremely easy to edit.
Links can be created by the simple act of typing in a
<a href="http://en.wikipedia.org/wiki/CamelCase" target="_blank"
  >CamelCase</a> word.
If a page doesn't already exist,
the act of clicking on its link will create one.
Simplified
<a href="http://en.wikipedia.org/wiki/Markup_language" target="_blank"
  >markup languages</a> are also available,
easing the process of page creation.

<p>
The web's basic architecture reduces the apparent complexity
of web site (and thus wiki) generation.
Although collections of pages and links form a
<a href="http://en.wikipedia.org/wiki/Graph_%28data_structure%29" target="_blank"
  >graph-based data structure</a>,
few users think about this fact.
Looking at any given page, the user sees only content and links;
the global structure can be (and usually is) ignored.

<p>
In Ontiki, a similar simplification should apply.
Each page will only describe a given class or instance
of an entity or relationship.
So, although the user will get the benefits of a page's relationships
(e.g., displays of deduced information,
       clickable diagrams for context and navigation),
s/he will not need to keep the entire ontology in mind.

<p>
By allowing wiki pages to have precisely specified attributes and relationships,
Ontiki should be able to provide improved context and navigation,
generate and display deduced content, etc.
At least, I think it's worth a try!

<h3>Web Limitations</h3>

<p>
Web links (e.g., <code>&lt;A HREF="..."&gt;...&lt;/A&gt;</code>)
have limitations that greatly hamper automated analysis.
By augmenting a wiki with better structure and more metadata,
we can "get a better grip" on the relationships involved.

<p>
Because a web link only goes <i>to</i> a given page,
the entire graph must be traversed in order to find
<a href="http://en.wikipedia.org/wiki/Backlink" target="_blank"
  >backlinks</a>
(links that come <i>from</i> other pages).
For search engines such as
<a href="http://www.google.com" target="_blank">Google</a>,
this can be a massive problem,
because the "graph" in question is the entire web.
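
<p>
On a small scale, the traversal looks something like the following Python sketch.
It assumes that the forward links are already available
as a map from each page to its targets;
the function name and the three-page site are hypothetical:

<pre><code>
from collections import defaultdict

def find_backlinks(forward_links):
    """Invert a page-to-targets map; every page must be visited."""
    back = defaultdict(set)
    for page, targets in forward_links.items():
        for target in targets:
            back[target].add(page)
    return dict(back)

# Hypothetical three-page site.
site = {
    "HomePage": {"AboutPage", "NewsPage"},
    "NewsPage": {"AboutPage"},
    "AboutPage": set(),
}

print(sorted(find_backlinks(site)["AboutPage"]))   # ['HomePage', 'NewsPage']
</code></pre>

The cost is proportional to the total number of links,
which is trivial for one wiki and daunting for the whole web.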

<p>
Most wikis do not bother to track backlink information.
Even fewer can display clickable context diagrams,
showing a page's "local neighborhood". 
<a href="http://pimki.rubyforge.org" target="_blank"
  >Pimki</a>
(an experimental "Personal Information Management" wiki)
does both, but it is a conspicuous exception.

<p>
Even Pimki, however, is constrained by the limitations of HTML links.
Although a link can have many attributes,
most only contain the
<a href="http://en.wikipedia.org/wiki/URL" target="_blank"
  >URL</a> for the target page
and the text content to be highlighted and displayed.
Nothing, in any case, indicates which links are of what "type".

<p>
Without typed links
(e.g., <code>Is_A</code>, <code>Has_A</code>, <code>Used_By</code>),
Pimki has very little information to work with.
It cannot, for example, filter by link type
or assess the "strength" of given links,
much less make deductions (e.g., inherited characteristics)
based on link types.
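
<p>
To make this concrete, here is a minimal Python sketch
of what typed links would allow.
It is only an illustration, not part of Pimki or of any existing wiki;
the page names (<code>backup.sh</code>, <code>Script</code>, etc.) are made up,
and the deduction assumes single inheritance with no cycles:

<pre><code>
# Each link is a (source, link_type, target) tuple.
LINKS = [
    ("Script", "Is_A", "File"),
    ("backup.sh", "Is_A", "Script"),
    ("backup.sh", "Used_By", "cron"),
]

def links_of_type(links, link_type):
    """Filter by link type, which untyped HTML links cannot support."""
    return [(src, dst) for (src, kind, dst) in links if kind == link_type]

def inherited_classes(links, page):
    """Follow Is_A links upward (assumes single inheritance, no cycles)."""
    parent = dict(links_of_type(links, "Is_A"))
    chain = []
    while page in parent:
        page = parent[page]
        chain.append(page)
    return chain

print(links_of_type(LINKS, "Used_By"))         # [('backup.sh', 'cron')]
print(inherited_classes(LINKS, "backup.sh"))   # ['Script', 'File']
</code></pre>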

<h3>Ontology Awareness</h3>

<p>
By letting users add ontological information,
Ontiki would overcome these limitations,
as well as provide a convenient framework
for mechanical generation and/or augmentation of pages.
This should work particularly well
for documenting the details of highly-structured systems
such as collections of computer software.

<p>
In an Ontiki web about a Unixish operating system,
pages might represent classes
such as <code>File</code> or <code>Control_File</code>,
instances such as <code>/etc/passwd</code>, etc.
Relationships such as <code>Read_By</code> and <code>Written_By</code>
could be used to connect entity pages in well-defined ways.

<p>
Because relationships (and the roles within them)
would be defined in terms of class definitions,
instances would only be allowed to take on "legal" roles.
As a <code>Control_File</code>, <code>/etc/passwd</code>
would not be allowed to occupy the <code>Process</code> role
of a <code>Read_By</code> relationship.
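
<p>
A role check of this kind might be sketched in Python as follows.
This is only one possible shape for the idea, not a settled design:
the <code>login</code> process, the name of the <code>File</code> role,
and both tables are assumptions made for the example:

<pre><code>
# Maps each page to the class it is an instance or subclass of.
IS_A = {
    "Control_File": "File",
    "/etc/passwd": "Control_File",
    "login": "Process",
}

# For each relationship, the class required of whatever fills each role.
ROLE_CLASS = {
    "Read_By": {"File": "File", "Process": "Process"},
}

def is_a(page, cls):
    """True if page is cls or reaches it by following IS_A upward."""
    while page is not None:
        if page == cls:
            return True
        page = IS_A.get(page)
    return False

def may_fill(page, relationship, role):
    """Check whether a page is allowed to occupy the given role."""
    return is_a(page, ROLE_CLASS[relationship][role])

print(may_fill("/etc/passwd", "Read_By", "File"))      # True
print(may_fill("/etc/passwd", "Read_By", "Process"))   # False
print(may_fill("login", "Read_By", "Process"))         # True
</code></pre>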

<p>
Given a suitable ontology,
mechanized harvesting could be used to populate many instance pages
with attribute and relationship information.
For example, a scan of the Unix man pages
could fill in details on related documentation, files, etc.

<p>
Human participants, meanwhile, could make arbitrary links
and post comments or ask questions about any portion of any page.
By specifying interest in particular topics (e.g., Control_File),
they could also receive notification of changes, questions, etc.

<h3>The Bad News</h3>

<p>
Defining ontologies is tricky,
even for experts who are dealing with limited and well-defined domains.
Defining a consistent ontology for an unbounded domain,
full of fuzzy definitions (e.g., the World Wide Web)
is well beyond our current capabilities.

<p>
If a topic is highly structured and well understood,
defining an ontology for it may seem rather trivial.
Even so, there are many opportunities for confusion.
William Kent's short book,
<a href="http://www.authorhouse.com/BookStore/ItemDetail~bookid~2713.aspx"
  >Data and Reality</a>,
is a very readable introduction to these sorts of problems.

<p>
As topics get fuzzier, categorization can become difficult or even impossible.
George Lakoff's book,
<a href="http://search.barnesandnoble.com/booksearch/isbnInquiry.asp?isbn=0226468046"
  >Women, Fire, and Dangerous Things</a>
is a fascinating introduction to the theory of categorization,
drawing on disciplines such as anthropology, cognitive science, linguistics, and philosophy.

<p>
John Sowa's slide sets,
<a href="http://www.jfsowa.com/talks/challenge.pdf"
  >The Challenge of Knowledge Soup</a> and
<a href="http://www.jfsowa.com/talks/souprepr.htm"
  >Representing Knowledge Soup In Language and Logic</a>,
are entertaining introductions to knowledge engineering.
His introductory textbook,
<a href="http://search.barnesandnoble.com/booksearch/isbnInquiry.asp?isbn=0534949657"
  >Knowledge Representation</a>,
is more daunting, but very worthwhile.

<p>
Finally, Clay Shirky's essay,
<a href="http://www.shirky.com/writings/ontology_overrated.html"
  >Ontology is Overrated</a>,
is an amusing and informative (if quite informal) overview and critique
of ontology and the semantic web.

<h3>The Good News</h3>

<p>
If Ontiki were intended as a full-scale
<a href="http://en.wikipedia.org/wiki/Expert_system" target="_blank"
  >expert system</a>,
the difficulties noted above would be far more worrisome.
However, Ontiki is more like a "wiki on steroids",
keeping track of ontological assertions
and (occasionally) making trivial deductions.
So, we can live with a bit of error and imprecision.

<p>
In creating or editing Ontiki pages,
users may assert things (e.g., attributes or relationships)
that aren't useful or even "true".
However, other users are perfectly free to ignore these assertions.
In short, relax...

<h2>Implementation</h2>

<p>
Even if I were in a position to create such a system from scratch,
it would be silly to do so.
By basing Ontiki on technologies with
<a href="http://www.opensource.org" target="_blank"
  >Open Source</a> implementations,
I can take advantage of existing code, interfaces, user communities, etc.

<p>
Ontiki's "front end" will probably be based on a
<a href="http://en.wikipedia.org/wiki/Ruby_on_Rails" target="_blank"
  >Rails</a>-based wiki, such as
<a href="http://www.instiki.org/show/HomePage" target="_blank"
  >Instiki</a> or
<a href="http://pimki.rubyforge.org" target="_blank"
  >Pimki</a>.
This should give me a nice
<a href="http://en.wikipedia.org/wiki/Model-view-controller" target="_blank"
  >model-view-controller</a> architecture for my base wiki,
allowing great flexibility in adding new functionality.

<p>
I also need to decide on a
<a href="http://en.wikipedia.org/wiki/Knowledge_representation" target="_blank"
  >knowledge representation</a> scheme for Ontiki's "back end".
For obvious reasons, I'd like this to allow interoperability
with other semantic web and knowledge representation projects.
I'd also like to leverage existing work in related areas.
I've found a number of promising technologies, including:

<ul>
  <p><li><a href="http://conceptualgraphs.org/" target="_blank"
           >Conceptual Graphs</a> (CG)

  <p><li><a href="http://en.wikipedia.org/wiki/Object_Role_Modeling"
           >Object Role Modeling</a> (ORM)

  <p><li><a href="http://en.wikipedia.org/wiki/Resource_Description_Framework" target="_blank"
           >Resource Description Framework</a> (RDF)

  <p><li><a href="http://en.wikipedia.org/wiki/Topic_Maps" target="_blank"
           >Topic Maps</a> (TM)

  <p><li><a href="http://en.wikipedia.org/wiki/Unified_Modeling_Language" target="_blank"
           >Unified Modeling Language</a> (UML) class diagrams
</ul> 

<p>
CG comes from the
<a href="http://en.wikipedia.org/wiki/Expert_system" target="_blank"
  >expert systems</a> side of the
<a href="http://en.wikipedia.org/wiki/Artificial_intelligence" target="_blank"
  >artificial intelligence</a> (AI) community.
ORM was created as a design technique for
<a href="http://en.wikipedia.org/wiki/Database_management_system" target="_blank"
  >database management systems</a>.
RDF and TM, aimed at indexing documents,
are emerging standards for the
<a href="http://en.wikipedia.org/wiki/Semantic_Web" target="_blank"
  >semantic web</a>.
UML was created as a "standard" set of diagramming notations
for software design.

<h3>Separated at Birth?</h3>

<p>
Despite their differing origins,
there are strong similarities between these technologies.
For example, CG, ORM, TM, and UML all provide variations on
<a href="http://en.wikipedia.org/wiki/Entity-relationship_model" target="_blank"
  >entity-relationship diagrams</a> (ERDs).
So, it's not inconceivable that any or all of them
could be used as differing "views" of a given set of knowledge.

<p>
The appeal of this, from my perspective,
is that it could let me exploit the differing strengths
of each representation.
CG, for example, is based on a form of
<a href="http://en.wikipedia.org/wiki/Predicate_calculus" target="_blank"
  >predicate calculus</a> known as
<a href="http://en.wikipedia.org/wiki/First-order_logic" target="_blank"
  >first-order logic</a> (FOL).
In fact, this allows CG to be used as one of the syntactic variants of
<a href="http://en.wikipedia.org/wiki/Common_Logic" target="_blank"
  >Common Logic</a> (CL), a proposed standard for knowledge interchange.

<p> 
ORM's notation is a bit different from CG's,
but it shares many common aspects.
For example, both systems describe collections of entities,
playing specified roles in multi-way (i.e., N-ary) relationships.
The big difference, with ORM,
is that ancillary notations can be added
to help in the definition of a supporting database schema.
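
<p>
As a rough illustration of "entities playing specified roles in an N-ary relationship",
here is a minimal Python sketch.
The <code>Fact</code> class, the role names, and the sample data are all hypothetical;
they are not part of CG, ORM, or any Ontiki design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One instance of an N-ary relationship (hypothetical structure)."""
    relation: str
    roles: tuple  # pairs of (role name, entity)

    def player(self, role):
        """Return the entity that plays the given role in this fact."""
        return dict(self.roles)[role]


# A three-way (ternary) relationship; the data is invented for illustration.
enrollment = Fact(
    "enrollment",
    (("student", "Alice"), ("course", "Logic 101"), ("term", "Fall")),
)

print(enrollment.player("course"))
```

<p>
Naming each role, rather than relying on argument position,
is what lets the same fact be drawn as a CG or an ORM diagram.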

<p>
RDF is, comparatively speaking, a very low-level representation
(based on subject / predicate / object "triples"):
sort of an "assembly language" for knowledge representation.
Nonetheless, RDF is gaining adherents (and supporting software)
at a rapid rate,
so it's clearly a technology to watch.
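
<p>
To make the "triples" idea concrete, here is a minimal sketch in plain Python.
It uses ordinary tuples rather than a real RDF library,
and the facts and predicate names are invented for illustration.

```python
from collections import defaultdict

# Each fact is one (subject, predicate, object) triple.
# These particular facts and predicate names are hypothetical.
triples = [
    ("Ontiki", "is_a", "semantic wiki"),
    ("Ontiki", "might_use", "Protege"),
    ("Protege", "is_a", "knowledge-base framework"),
]


def objects_of(subject, predicate):
    """Return every object asserted for a (subject, predicate) pair."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


# Group the triples by subject, roughly as a simple triple store might.
by_subject = defaultdict(list)
for s, p, o in triples:
    by_subject[s].append((p, o))

print(objects_of("Ontiki", "is_a"))
print(by_subject["Ontiki"])
```

<p>
Everything is reduced to three-part statements,
so any richer structure (N-ary relationships, constraints, etc.)
has to be assembled from many small triples; hence the "assembly language" feel.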

<p>
It's not inconceivable that Ontiki could use (or borrow from)
multiple notations and representation schemes,
taking advantage of their respective strengths.
Unfortunately, each of these technologies
has its own supporting software, user communities, etc.
What to do; what to do...

<h3>Prot&eacute;g&eacute;, Amine, or ???</h3>

<p>
My current thought is to use
<a href="http://protege.stanford.edu/" target="_blank"
  >Prot&eacute;g&eacute;</a> as the back end.
Initially created as an ontology editor,
Prot&eacute;g&eacute; is now a substantial and very extensible knowledge-base framework.
It can define and use fairly arbitrary knowledge bases,
either interactively or by means of a
<a href="http://en.wikipedia.org/wiki/Web_service" target="_blank"
  >web services</a> interface.

<p>
Using Prot&eacute;g&eacute; as Ontiki's knowledge base
would let me take advantage of a powerful system and dozens of "plug-ins".
New plug-ins (e.g., for CG or ORM) are also a possibility.
Thus, it might be possible for Prot&eacute;g&eacute;
to support assorted diagramming notations as input and editing modes.

<p>
Initially,
Prot&eacute;g&eacute; can serve as an interactive tool
for defining and experimenting with ontologies.
Over time, some of this activity could migrate to the wiki,
though it would probably be limited to "administrative" users.

<p>
Better yet, Prot&eacute;g&eacute; isn't the only game in town.
The <a href="http://amine-platform.sourceforge.net" target="_blank"
  >Amine Platform</a>,
for example, offers a roughly equivalent set of capabilities.
(Please feel free to direct me to other possibilities!)

<h3>Back to Reality</h3>

<p>
Although I have prototyped some relevant technology,
Ontiki is entirely vaporware at this point.
Thus, even the design comments are very speculative
(after all, I'm still looking for interesting technologies to "borrow").
Stay tuned, however; I might eventually produce something...


<p style="text-align:right;font-size:10px;">
  Technorati Tags:
  <a href="http://www.technorati.com/tag/knowledge+representation"
                               rel="tag">knowledge representation</a>,

  <a href="http://www.technorati.com/tag/Model-based+Documentation"
                               rel="tag">Model-based Documentation</a>,

  <a href="http://www.technorati.com/tag/Ontiki"
                               rel="tag">Ontiki</a>,

  <a href="http://www.technorati.com/tag/ontology"
                               rel="tag">ontology</a>,

  <a href="http://www.technorati.com/tag/semantic+wiki"
                               rel="tag">semantic wiki</a>,

  <a href="http://www.technorati.com/tag/wiki"
                               rel="tag">wiki</a>]]>
    </content>
</entry>

<entry>
    <title>A PC/FreeBSD War Story</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000681.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=681" title="A PC/FreeBSD War Story" />
    <id>tag:www.cfcl.com,2005:/~rdm/weblog//3.681</id>
    
    <published>2005-02-12T20:00:24Z</published>
    <updated>2006-01-02T07:55:44Z</updated>
    
    <summary><![CDATA[ Cfcl's copy of FreeBSD 4.7 is starting to seem a bit dated. It doesn't have the latest Sendmail, never has supported DMA access on this mobo&sup1;, etc. Also, I have some interest in playing with Kirk McKusick's latest file system hacks, etc. So, I decided to install FreeBSD 5.3 on a "spare" machine, so that I could (eventually) migrate cfcl to it. The machine already had a 20 GB IDE drive, which is plenty big enough for the OS....]]></summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[<P>
Cfcl's copy of FreeBSD 4.7 is starting to seem a bit dated.
It doesn't have the latest Sendmail,
never has supported DMA access on this mobo&sup1;, etc.
Also, I have some interest in playing with Kirk McKusick's latest file system hacks, etc.
So, I decided to install FreeBSD 5.3 on a "spare" machine,
so that I could (eventually) migrate cfcl to it.
<P>
The machine already had a 20 GB IDE drive,
which is plenty big enough for the OS. 
I added a 200 GB drive for home directories, etc.
This may have been a mistake,
but it seemed reasonable at the time
(where are we going and what are we doing in this hand basket?).
<P>
Then I went to the FreeBSD site,
downloaded a bunch of CD images,
put them on discs (using the Mac OS X Disk Utility's "burn" feature)
and tried an install.
Didn't work.
Complained about the CD format.
Sigh.]]>
        <![CDATA[<P>
To save pain, I ordered a FreeBSD 5.3 distribution from BSDMall.
When it arrived, I tried again.
This got further, but didn't really succeed.
The 200 GB drive wasn't mountable, the Ethernet card didn't work, etc.
<P>
So, I went to a user group meeting and whined.  A lot.
Eventually, two helpful folks looked at the BIOS, tried things out,
and reported that my motherboard was simply too old and crufty
to support the 200 GB drive.
<P>
At this point, I _could_ have gone in the direction
of using a PCI disk controller card, an older disk, etc.
Being somewhat tired of the pain, however, I decided to do a massive upgrade.
So, this morning I went to the local branch of a chain computer store.
<P>
There, I was greeted by a friendly fellow who was pleased
to help me pick out some components.
He knew lots about this stuff and was quite happy to fill me in.
Unfortunately, much of what he knew was wrong.
<P>
I started to worry when he informed me
that the DB-9 connector on one board was a video connector.
I gently informed him that, although it looked similar,
it didn't have enough (rows of) pins.
He wasn't sure he believed me, but he was polite about it.
Later, he told me that a CPU part number (3200) indicated its speed in MHz.
That turned out to be off by 50%.
<P>
By this time, I was checking things out for myself, as well as I could.
Eventually, I found a mobo (MSI K8M Neo-V) / CPU (Athlon 64) combo
that looked attractive and reasonably priced.
I purchased it, along with a 512 MB stick of memory, for $400.
I had some trepidation,
based on the fact that the FreeBSD "Hardware Requirements" document
didn't mention the Athlon 64 and a web search found varying indications,
but I was feeling brave.
<P>
So, I went home, eviscerated the chassis, and played Dr. F&sup2; for a while.
All went well until I attempted to put in the CPU.
Hmmm; doesn't seem to drop in.  Look closely.
Yep, it has more rows of pins than the ZIF socket has holes.
Even when I rotate it 90 degrees.
This is not the CPU chip I'm looking for.
<P>
Went down to the store and explained the problem to the friendly fellow.
He convinced himself (eventually) that I might be right,
then got some techies to look things over.
Well, it was the right CPU, but not the right packaging for this motherboard.
However, after some fiddling and paperwork,
a replacement chip was socketed and I drove home.
<P>
Now I just need to plug in the RAM stick and...
Ulp; it doesn't seem to be snapping into place.
And the notches don't match the socket, in either count or placement.
Look in the manual.
Yep, this seems to be the wrong kind of RAM.
GRUMBLE; back to the store again...
<P>
More paperwork, but now I have the right RAM (I hope).
After dinner, I come home and snap in the RAM.
Check all the connections.  Power it up.
Doesn't seem to be powering up.  Examine assorted things.  Aha.
There appears to be a power jack just for the CPU.
My power supply has no plugs of this type.
See if Vicki's <a href="http://www.cfcl.com/vlb/weblog/archives/000682.html" target="_blank">birthday-present power supply</a>
(with the blue fan LEDs) has one of these plugs.
Indeed it does.  Steal it.
<P>
Now things light up, but it still won't boot off the DVD.
Mess with the BIOS (am I hot or what?).
It doesn't see the USB keyboard; jack in an old PC keyboard.  Rinse, repeat.
Eventually, I do (AFAICT) all of the right things
and the installation proceeds to an error-free finish.
<P>
I'm still not out of the woods,
because I need to migrate assorted files from the FreeBSD 4.7 system,
make miscellaneous changes, etc.
But the system boots, says all the right things in dmesg(8), and seems solid.
So, I'm past the hardware issues (I hope!).
A few hours of messing about with system configuration files
and a few days of stomping out brush fires
should see me through to a stable, upgraded system.

<P>
Given that we have several Mac OS X systems around the house,
none of which have ever caused me this much grief,
you might ask why I didn't just move cfcl to one of them.
My only excuse is that FreeBSD is open and configurable
(albeit a bit TOO configurable, at times)
and the migration from FB4.7 to FB5.3
shouldn't run into any real show-stoppers.
That is, it should support all the same applications, etc.
<P>
Nor can I blame the FreeBSD folks for the madness of PC "design".
On the other hand, I can and do blame them for the !@#$% installer
and the absence of reasonable support for either upgrading or patching
(as in Mac OS X) the OS.
These have been open issues for years,
but support for multiprocessors must be more important
than ease of installation and administration.  Or something.
Anyway, that's the news from San Bruno.
<P><HR width=50 align="left"><P>
&sup1;motherboard
<BR>
&sup2;"Frahnken steen! <I>Frahnk</I>en steen!"</I>]]>
    </content>
</entry>

<entry>
    <title>A Truly Technoid Gear Shift Knob</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000626.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=626" title="A Truly Technoid Gear Shift Knob" />
    <id>tag:www.cfcl.com,2004:/~rdm/weblog//3.626</id>
    
    <published>2004-11-22T06:48:55Z</published>
    <updated>2006-01-02T07:56:13Z</updated>
    
    <summary> Yesterday, I noticed that my right hand hurt when I tried to pick things up. I couldn&apos;t remember doing anything to it, so I started having disturbing thoughts about possible &quot;creeping disabilities&quot;. I make my living typing and using a mouse, so this is not an inconsequential issue to me. I mentioned this to Vicki, so today she asked me how my hand was. It seemed better and I said so. Then, a bit later, I was driving our...</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Computers" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[<P>
Yesterday, I noticed that my right hand hurt when I tried to pick things up. I couldn't remember doing anything to it, so I started having disturbing thoughts about possible "creeping disabilities".  I make my living typing and using a mouse, so this is not an inconsequential issue to me.
<P>
I mentioned this to Vicki, so today she asked me how my hand was. It seemed better and I said so. Then, a bit later, I was driving our Scion xB and I said "This is what's hurting my hand. The gear shift knob is pressing on the inside of the joint".
<P>
We both agreed that this wasn't good. Vicki suggested that we stop by Kragen Auto after breakfast, to see if they had replacement shift knobs or soft covers. We went; we looked. They didn't have much, but we did try a small ball-shaped replacement knob. I liked it; my hand held it differently, so it didn't cause the problem. But Vicki didn't like it, because it didn't fit her hand.]]>
        <![CDATA[<P>
We tried the "Dive Shop" next, looking for a scrap of neoprene wet-suit material - maybe I could make a soft cover... Unfortunately, the shop buys all of their suits from other providers, so they didn't have any scraps. Then Vicki mentioned that she had a spare ball from a <a href="http://us.kensington.com/html/1159.html" target="_blank">Kensington Turbo Mouse</a> (trackball), sitting in a drawer at home. Hmmmm...
<P>
So, we went home and Vicki dug out the trackball. We played with it a bit, confirming that it felt good to both of us.  Now, I just needed to mount it on the shift lever...
<P>
I made a few measurements, dug out some tools, etc.  I thought I was ready to drill the hole (I wasn't), but I couldn't find any taps.  In retrospect, I think I don't actually have any.  So, I called my friend and neighbor, Rich Pastor.  Rich has some tools that I don't have, because he works on car engines, etc.
<P>
Not too surprisingly, he had a set of taps and was willing to loan me one.  OK; I headed down the hill with assorted pieces and parts.  Once I got there, the question became both easier and harder.  Easier, because Rich was doing most of the work and knew what he was doing.  Harder, because his measurements and concerns complicated the problem.
<P>
By his measure, for instance, the shaft wasn't 1/2".  Rather, it was 15/64", a size that wasn't included in his set of taps.  He also indicated that the local OSH probably wouldn't carry this size and the local tool store wasn't open (on Sunday).
<P>
He also speculated that the trackball might be hollow; it seemed to flex a bit when squeezed hard and felt lighter than it really "should".  This could seriously complicate the issue of tapping; I might have to fill the innards with epoxy, first!
<P>
After some discussion, we tackled the "hollow ball" question by drilling a small hole into the ball.  It never "punched through", so we breathed a sigh of relief and went on.
<P>
We tried drilling and tapping a piece of plywood, to see how well it worked on the shaft.  Felt rather loose, but then, plywood isn't really a machinable material.  Also, I expected to have a deeper set of threads than the (rather thin) piece of plywood allowed.
<P>
Looking at the tag for the shift knob that Vicki and I had purchased, I saw that it was listed as 1/2".  And it had worked.  So, we decided to proceed, using some teflon tape to fill in any incidental looseness in the threads.
<P>
We drilled the ball about four or five times, working our way up to the desired bit size.  This, with Rich's light touch, kept the brittle plastic of the ball from "chipping".  It also kept the drill from "grabbing" the ball, melting it, etc.  Tedious, but definitely the right approach!
<P>
Once the hole was ready, Rich started the tapping process.  After he had it well started, I asked for a chance to finish it up.  This let me feel like I had actually "done something", although I knew that Rich had done all the real work.
<P>
Once the ball was tapped, I went out to the car, wrapped some teflon tape on the shift lever, and - very carefully - screwed on the ball.  I was also cautious about how hard I tightened it; the last thing I wanted to do was strip out the threads.
<P>
Fortunately, all went well.  Vicki and I both like the feel of the ball, the light grey color matches the interior color scheme of the Scion, and it gives us a bit of "techiana" to go along with the "cutesy" cloth lizards that Vicki had already attached to the fuzzy ceiling, using Velcro "hook" material.
<P>
<a href="http://www.cfcl.com/~rdm/weblog/images/DSCN0017.JPG" onclick="window.open('http://www.cfcl.com/~rdm/weblog/images/DSCN0017.JPG','popup','width=1024,height=768,scrollbars=no,resizable=yes,toolbar=no,directories=no,location=no,menubar=no,status=yes,left=0,top=0');return false"><img src="http://www.cfcl.com/~rdm/weblog/images/DSCN0017-tm.jpg" height="100" width="133" border="1" hspace="4" vspace="4" alt="Dscn0017" /></a>

<a href="http://www.cfcl.com/~rdm/weblog/images/DSCN0018.JPG" onclick="window.open('http://www.cfcl.com/~rdm/weblog/images/DSCN0018.JPG','popup','width=768,height=1024,scrollbars=no,resizable=yes,toolbar=no,directories=no,location=no,menubar=no,status=yes,left=0,top=0');return false"><img src="http://www.cfcl.com/~rdm/weblog/images/DSCN0018-tm.jpg" height="100" width="75" border="1" hspace="4" vspace="4" alt="Dscn0018" /></a>]]>
    </content>
</entry>

<entry>
    <title>A Little Bit o&apos; Nuthin&apos;</title>
    <link rel="alternate" type="text/html" href="http://www.cfcl.com/rdm/weblog/archives/000517.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.cfcl.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=3/entry_id=517" title="A Little Bit o' Nuthin'" />
    <id>tag:www.cfcl.com,2004:/~rdm/weblog//3.517</id>
    
    <published>2004-07-08T02:39:13Z</published>
    <updated>2006-04-13T19:21:37Z</updated>
    
    <summary> One of the cool things about working at SLAC is being around &quot;Big Science&quot;. I had lunch the other day with a physicist and his grad student (or so I surmised) and learned something new and interesting. I had picked a nice shady picnic table for my lunch. The scientist and his colleague joined me and started chatting about their experiments. I listened in on their conversation and, eventually, got brave enough to ask a few questions. The scientist...</summary>
    <author>
        <name></name>
        <uri>http://www.cfcl.com/~rdm/weblog</uri>
    </author>
    
        <category term="Science" />
    
        <category term="Technology" />
    
    <content type="html" xml:lang="en" xml:base="http://www.cfcl.com/rdm/weblog/">
        <![CDATA[<P>
   One of the cool things about working at <A HREF="http://www.slac.stanford.edu">SLAC</A>
   is being around "Big Science".
   I had lunch the other day with a physicist and his grad student
   (or so I surmised) and learned something new and interesting.
<P>
   I had picked a nice shady picnic table for my lunch.
   The scientist and his colleague joined me
   and started chatting about their experiments.
   I listened in on their conversation and, eventually,
   got brave enough to ask a few questions.
   The scientist was happy to indulge me,
   telling me about an experiment he had done with neutrinos.
<P>
   Neutrinos are about as close to nothing as it is possible to be
   (and still be something :-).
   They have no charge and <I>very</I> little mass.
   They go right through most matter, with few noticeable effects.
   With no charge, you can't grab onto them, let alone fling them around.
   All of this makes them rather, erm, challenging to work with!
<P>
   Nonetheless, this scientist needed to do so.
   Specifically, he needed to generate a stream of neutrinos.
   And, because neutrinos are hard to detect,
   he needed rather a large stream.
   Here's how he did it.
<!-- technorati tags start --><p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/big science" rel="tag">big science</a>, <a href="http://www.technorati.com/tag/neutrino" rel="tag">neutrino</a>, <a href="http://www.technorati.com/tag/physics" rel="tag">physics</a></p><!-- technorati tags end -->]]>
        <![CDATA[<P>
   He started with a bunch of protons.
   Protons are easy to come by;
   just ionize some hydrogen and throw away the electrons.
   They are charged, so you can accelerate them with microwaves,
   steer them with magnets, etc.
   Consequently, generating a stream of protons is duck soup (quark!).
<P>
   As things approach the speed of light, however,
   any "acceleration" only produces tiny increments in speed.
   The energy doesn't go away; it just increases the particle's mass.
   So, these protons are now packing quite a punch.
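<P>
   A small Python sketch shows how little the speed changes
   once a proton is already moving fast.
   It uses the standard relations gamma = 1 + KE / (m c^2)
   and v/c = sqrt(1 - 1/gamma^2);
   the kinetic energies are illustrative values only,
   not figures from the experiment.

```python
import math

PROTON_REST_ENERGY_MEV = 938.272  # proton rest energy, m * c^2, in MeV


def lorentz_factor(kinetic_energy_mev):
    """gamma = total energy / rest energy = 1 + KE / (m c^2)."""
    return 1.0 + kinetic_energy_mev / PROTON_REST_ENERGY_MEV


def speed_fraction(kinetic_energy_mev):
    """Speed as a fraction of c: v/c = sqrt(1 - 1/gamma^2)."""
    gamma = lorentz_factor(kinetic_energy_mev)
    return math.sqrt(1.0 - 1.0 / gamma**2)


# Illustrative kinetic energies in MeV (assumed, not from the experiment).
for ke in (1_000, 10_000, 100_000):
    gamma = lorentz_factor(ke)
    print(f"{ke:>7} MeV  gamma = {gamma:7.2f}  v/c = {speed_fraction(ke):.6f}")
```

<P>
   Each tenfold increase in energy raises gamma roughly tenfold,
   while v/c only creeps closer to 1.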
<P>
   When a stream of accelerated protons slams into a "target"
   (e.g., a thin metal plate),
   a large number of particles and photons comes out the other side.
   One of these particles, a "muon", has a very short lifetime.
   When it "decays", it produces a neutrino and some other junk.
<P>
   Because of the way relativistic interactions work,
   all of this junk is traveling
   on (more or less) the same path as the original proton.
   Thus, he now has a stream of neutrinos,
   mixed in with a whole lot of other things.
   Here is where the neutrino's ability to zip through matter comes in handy.
   To get rid of the junk (but not the neutrinos),
   he put up a barricade of dense matter (e.g., steel plates).
<P>
   As the junk particles and photons go through each plate,
   some of them interact with the steel,
   producing still more particles and photons.
   With each interaction, however, some energy gets drained off,
   so each generation is less energetic.
   Eventually, none of the junk is energetic enough
   to get through the plates.
   This leaves us with a nice, clean stream of neutrinos.
<P>
   This technique requires quite a bit of steel, to be sure.
   In this experiment, the scientist used 300 meters (!) of steel plates,
   stacked up like playing cards.
   The plates, as it happens, were carved up
   from decommissioned battleships (swords into plowshares :-).
   Because they were slightly curved,
   they didn't stack up entirely neatly,
   but that didn't keep them from doing their job!
<P>
   So, having generated a stream of neutrinos,
   the scientist only needs to detect them.
   Unfortunately, the neutrino's ability to zip through matter
   is now a problem.
   Even with another hundred meters of dense matter (e.g., iron),
   only one neutrino in a million will interact with anything.
<P>
   But, since he has as many protons as he needs,
   that's not a real problem.
   Generate tens of millions of neutrinos per second,
   detecting every millionth one (or so).
   This gives him dozens of "events" per second,
   which is plenty enough to work with.
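<P>
   The arithmetic can be checked with a few lines of Python.
   The one-in-a-million interaction probability comes from the story above;
   the specific beam rates are assumed values
   chosen to fall within "tens of millions".

```python
INTERACTION_PROBABILITY = 1e-6  # "one neutrino in a million"


def expected_events_per_second(neutrinos_per_second, probability):
    """Expected detections per second = beam rate * interaction probability."""
    return neutrinos_per_second * probability


# Assumed beam rates, in neutrinos per second.
for rate in (20e6, 50e6):
    events = expected_events_per_second(rate, INTERACTION_PROBABILITY)
    print(f"{rate:.0e} neutrinos/s gives about {events:.0f} events/s")
```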
<P>
   Ain't science wonderful?


]]>
    </content>
</entry>

</feed> 

