Wednesday December 28, 2005

Ontiki: an ontology-aware wiki

As discussed in my notes on model-based documentation, I'm quite interested in tools that make it easier to combine human-edited and machine-generated content. Ontiki is a proposed design for one such tool, based on currently available Open Source software.

As an ontology-aware wiki, Ontiki would allow pages to represent classes and instances of entities and relationships. The hope is that it could combine a wiki's convenience and freedom with the strengths of ontology-based systems, allowing a graceful merging of human-edited and mechanically generated content.

Technorati Tags: knowledge representation, model-based documentation, ontiki, ontology, semantic wiki, wiki

Background

An ontology, in computer science parlance, is a set of statements (primarily, definitions) concerning a domain of discourse. Thus, an ontology defines the things we are talking about and makes assertions about their attributes and relationships. For example, the statement "control files may be read by processes" might be found in an ontology for computer software.

Ontologies are very popular these days. They are used extensively in knowledge engineering and form a critical part of the proposed semantic web. Unfortunately, I haven't found any ontology-based tools that strike me as being as flexible and easy to use as the typical wiki.

Wikis are extremely easy to edit. Links can be created by the simple act of typing in a CamelCase word. If a page doesn't already exist, the act of clicking on its link will create one. Simplified markup languages are also available, easing the process of page creation.

The web's basic architecture reduces the apparent complexity of web site (and thus wiki) generation. Although collections of pages and links form a graph-based data structure, few users think about this fact. Looking at any given page, the user sees only content and links; the global structure can be (and usually is) ignored.
In Ontiki, a similar simplification should apply. Each page will only describe a given class or instance of an entity or relationship. So, although the user will get the benefits of a page's relationships (e.g., displays of deduced information, clickable diagrams for context and navigation), s/he will not need to keep the entire ontology in mind.

By allowing wiki pages to have precisely specified attributes and relationships, Ontiki should be able to provide improved context and navigation, generate and display deduced content, etc. At least, I think it's worth a try!

Web Limitations
Web links (e.g.,

Because a web link only goes to a given page, the entire graph must be traversed in order to find backlinks (links that come from other pages). For search engines such as Google, this can be a massive problem, because the "graph" in question is the entire web.

Most wikis do not bother to track backlink information. Even fewer can display clickable context diagrams, showing a page's "local neighborhood". Pimki (an experimental "Personal Information Management" wiki) does both, but it is a conspicuous exception.

Even Pimki, however, is constrained by the limitations of HTML links. Although a link can have many attributes, most only contain the URL for the target page and the text content to be highlighted and displayed. Nothing, in any case, indicates which links are of what "type".
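The backlink problem described above can be sketched in a few lines of Python. This is only an illustration, not part of any Ontiki design: the `pages` dictionary and the `build_backlinks` helper are hypothetical, and the page names are made up. The point is that backlinks cannot be read off any single page; every page's forward links must be visited once to build the inverted index.

```python
from collections import defaultdict

# Hypothetical wiki: each page name maps to the pages it links to.
pages = {
    "HomePage": ["ControlFile", "Process"],
    "Process": ["ControlFile"],
    "ControlFile": [],
}

def build_backlinks(pages):
    """Invert the forward-link graph by visiting every page once."""
    backlinks = defaultdict(set)
    for source, targets in pages.items():
        for target in targets:
            backlinks[target].add(source)
    return backlinks

backlinks = build_backlinks(pages)
print(sorted(backlinks["ControlFile"]))  # ['HomePage', 'Process']
```

A wiki that maintains such an index as pages are saved can answer "what links here?" without rescanning its whole graph, which is roughly what backlink-tracking wikis have to do in some form.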
Without typed links
(e.g.,

Ontology Awareness

By letting users add ontological information, Ontiki would overcome these limitations, as well as provide a convenient framework for mechanical generation and/or augmentation of pages. This should work particularly well for documenting the details of highly-structured systems such as collections of computer software.
In an Ontiki web about a Unixish operating system,
pages might represent classes
such as
Because relationships (and the roles within them)
would be defined in terms of class definitions,
instances would only be allowed to take on "legal" roles.
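As a rough sketch of what "legal" roles could mean in practice, the following Python checks proposed role bindings against class definitions. Everything here is an assumption made for illustration: the `PARENT`, `RELATIONSHIPS`, and `INSTANCES` tables, the `reads` relationship with its `reader` and `item` roles, and the helper names are hypothetical (only the `Control_File` class name and the idea that control files may be read by processes come from the text).

```python
# Hypothetical class hierarchy: child class -> parent class.
PARENT = {"Control_File": "File", "File": "Entity", "Process": "Entity"}

# Hypothetical relationship definition: role name -> class required for that role.
RELATIONSHIPS = {"reads": {"reader": "Process", "item": "File"}}

# Hypothetical instance pages: instance name -> class.
INSTANCES = {"/etc/passwd": "Control_File", "init": "Process"}

def is_a(cls, ancestor):
    """True if cls is ancestor or inherits from it via PARENT links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False

def check_roles(relationship, bindings):
    """Return a list of problems; an empty list means every role is legal."""
    roles = RELATIONSHIPS[relationship]
    problems = []
    for role, instance in bindings.items():
        if role not in roles:
            problems.append(f"{relationship} has no role {role!r}")
        elif not is_a(INSTANCES[instance], roles[role]):
            problems.append(f"{instance} cannot play {role!r}")
    return problems

print(check_roles("reads", {"reader": "init", "item": "/etc/passwd"}))  # []
print(check_roles("reads", {"reader": "/etc/passwd", "item": "init"}))
```

The first call passes because `init` is a Process and `/etc/passwd`, as a Control_File, inherits from File. The second call reports both bindings as illegal.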
As a

Given a suitable ontology, mechanized harvesting could be used to populate many instance pages with attribute and relationship information. For example, a scan of the Unix man pages could fill in details on related documentation, files, etc.

Human participants, meanwhile, could make arbitrary links and post comments or ask questions about any portion of any page. By specifying interest in particular topics (e.g., Control_File), they could also receive notification of changes, questions, etc.

The Bad News

Defining ontologies is tricky, even for experts who are dealing with limited and well-defined domains. Defining a consistent ontology for an unbounded domain, full of fuzzy definitions (e.g., the World Wide Web) is well beyond our current capabilities.

If a topic is highly structured and well understood, defining an ontology for it may seem rather trivial. Even so, there are many opportunities for confusion. William Kent's short book, Data and Reality, is a very readable introduction to these sorts of problems.

As topics get fuzzier, categorization can become difficult or even impossible. George Lakoff's book, Women, Fire, and Dangerous Things, is a fascinating introduction to the study of categorization, drawing on disciplines such as anthropology, cognitive science, linguistics, and philosophy.

John Sowa's slide sets, The Challenge of Knowledge Soup and Representing Knowledge Soup In Language and Logic, are entertaining introductions to knowledge engineering. His introductory textbook, Knowledge Representation, is more daunting, but very worthwhile.

Finally, Clay Shirky's essay, Ontology is Overrated, is an amusing and informative (if quite informal) overview and critique of ontology and the semantic web.

The Good News

If Ontiki were intended as a full-scale expert system, the difficulties noted above would be far more worrisome. However, Ontiki is more like a "wiki on steroids", keeping track of ontological assertions and (occasionally) making trivial deductions.
So, we can live with a bit of error and imprecision. In creating or editing Ontiki pages, users may assert things (e.g., attributes or relationships) that aren't useful or even "true". However, other users are perfectly free to ignore these assertions. In short, relax...

Implementation

Even if I were in a position to create such a system from scratch, it would be silly to do so. By basing Ontiki on technologies with Open Source implementations, I can take advantage of existing code, interfaces, user communities, etc.

Ontiki's "front end" will probably be based on a Rails-based wiki, such as Instiki or Pimki. This should give me a nice model-view-controller architecture for my base wiki, allowing great flexibility in adding new functionality.

I also need to decide on a knowledge representation scheme for Ontiki's "back end". For obvious reasons, I'd like this to allow interoperability with other semantic web and knowledge representation projects. I'd also like to leverage existing work in related areas. I've found a number of promising technologies, including Conceptual Graphs (CG), Object Role Modeling (ORM), the Resource Description Framework (RDF), Topic Maps (TM), and the Unified Modeling Language (UML).
CG comes from the expert systems side of the artificial intelligence (AI) community. ORM was created as a design technique for database management systems. RDF and TM, aimed at indexing documents, are emerging standards for the semantic web. UML was created as a "standard" set of diagramming notations for software design.

Separated at Birth?

Despite their differing origins, there are strong similarities between these technologies. For example, CG, ORM, TM, and UML all provide variations on entity-relationship diagrams (ERDs). So, it's not inconceivable that any or all of them could be used as differing "views" of a given set of knowledge.

The advantage of this, from my perspective, is that it could allow me to take advantage of the differing strengths of given representations. CG, for example, is based on a form of predicate calculus known as first-order logic (FOL). In fact, this allows CG to be used as one of the syntactic variants of Common Logic (CL), a proposed standard for knowledge interchange.

ORM's notation is a bit different from CG's, but it shares many common aspects. For example, both systems describe collections of entities, playing specified roles in multi-way (i.e., N-ary) relationships. The big difference, with ORM, is that ancillary notations can be added to help in the definition of a supporting database schema.

RDF is, comparatively speaking, a very low-level representation (based on subject / predicate / object "triples"): sort of an "assembly language" for knowledge representation. Nonetheless, RDF is gaining adherents (and supporting software) at a rapid rate, so it's clearly a technology to watch.

It's not inconceivable that Ontiki could use (or borrow from) multiple notations and representation schemes, taking advantage of their respective strengths. Unfortunately, each of these technologies has its own supporting software, user communities, etc. What to do; what to do...
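To make the subject / predicate / object idea concrete, here is a minimal sketch in plain Python. It does not use RDF syntax or any RDF library; the triples, the predicate names (`subclass_of`, `instance_of`, `readable_by`), and the helpers `match` and `classes_of` are all hypothetical, chosen only to show how statements stored as triples can be queried by pattern and used for one trivial deduction.

```python
# Hypothetical knowledge base: each statement is a (subject, predicate, object) triple.
triples = {
    ("Control_File", "subclass_of", "File"),
    ("File", "readable_by", "Process"),
    ("passwd", "instance_of", "Control_File"),
}

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def classes_of(triples, instance):
    """Trivial deduction: follow instance_of, then subclass_of links upward."""
    found = []
    frontier = [t[2] for t in match(triples, s=instance, p="instance_of")]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.append(cls)
            frontier.extend(t[2] for t in match(triples, s=cls, p="subclass_of"))
    return found

print(classes_of(triples, "passwd"))  # ['Control_File', 'File']
```

Nothing in the data says directly that `passwd` is a File; that follows from combining two triples. This is the kind of small, mechanical inference a wiki could display alongside human-edited text.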
Protégé, Amine, or ???

My current thought is to use Protégé as the back end. Initially created as an ontology editor, Protégé is now a substantial and very extensible knowledge-base framework. It can define and use fairly arbitrary knowledge bases, either interactively or by means of a web services interface.

Using Protégé as Ontiki's knowledge base would let me take advantage of a powerful system and dozens of "plug-ins". New plug-ins (e.g., for CG or ORM) are also a possibility. Thus, it might be possible for Protégé to support assorted diagramming notations as input and editing modes.

Initially, Protégé can serve as an interactive tool for defining and experimenting with ontologies. Over time, some of this activity could migrate to the wiki, though it would probably be limited to "administrative" users.

Better yet, Protégé isn't the only game in town. The Amine Platform, for example, offers a roughly equivalent set of capabilities. (Please feel free to direct me to other possibilities!)

Back to Reality

Although I have prototyped some relevant technology, Ontiki is entirely vaporware at this point. Thus, even the design comments are very speculative (after all, I'm still looking for interesting technologies to "borrow"). Stay tuned, however; I might eventually produce something...
Posted in Computers, Technology at Wed, 28 Dec, 20:06 Pacific

Comments
Jack Park kindly pointed me to IkeWiki, which has many of the attributes I propose for Ontiki. Although some of the details aren't what I have imagined, I'll certainly enjoy "kicking the tires" on it.
And, when I have something relatively coherent to say about IkeWiki, I'll be sure to write it up!
Posted by: Rich Morin | December 29, 2005 4:01 PM
I've been directed to more resources, including the Semantic Wiki Interest Group and the Wikipedia entry for Semantic Wiki. Thanks, folks!
Posted by: Rich Morin | December 30, 2005 11:51 AM
You may want to take a look at HyperDE. It is not a semantic wiki system, but a more general hypermedia application development environment that uses ontologies as models. It is based on the SHDM method. The implementation is based on Ruby on Rails, where the ActiveRecord component was replaced by a SemanticRecord component, using the Sesame RDF(S) database.
More details can be found at
http://server2.tecweb.inf.puc-rio.br:8000/projects/hyperde/trac.cgi/wiki.
Regards,
Daniel Schwabe
Posted by: Daniel Schwabe | January 1, 2006 5:48 PM