Re: Ideas for World Hierarchy Example

From: dawn <dawnwolthuis_at_gmail.com>
Date: 11 Jan 2007 08:03:36 -0800
Message-ID: <1168531416.080371.303440_at_i39g2000hsf.googlegroups.com>


Marshall wrote:
> On Jan 10, 4:27 am, "dawn" <dawnwolth..._at_gmail.com> wrote:
> > Marshall wrote:
> > > On Jan 8, 9:43 pm, "Neo" <neo55..._at_hotmail.com> wrote:
> > > > What important things should a hierarchy of the world store? I am
> > > > thinking of adding planets, continents, countries, US states, most
> > > > populous cities, oceans, longest rivers, highest mountains and most
> > > > spoken languages for starters. Can someone suggest other items they
> > > > would find interesting? Or websites similar to http://worldatlas.com
> > > > for such data?
> >
> > > One easy source of info: wikipedia. Put all that stuff into a
> > > knowledgebase and you've got something.
> >
> > Isn't it already in a "knowledgebase"?
>
> Not in the sense I was thinking of. Not in the sense of Cyc, for
> example.

Yes, so it is a matter of definition, just as "database" is.

> Almost none of the data is encoded in a way a machine can do
> any useful semantic processing on it.
>
> What it is is a wiki: a document storage, editing and retrieval
> system, with a data model that is roughly a map:
>
> title -> document
>
> I'm not aware that the data model is any more sophisticated
> than that. And with that as your data model, about the only
> kinds of questions the system can answer are
>
> What is the document whose title is "Futurama"?
>
> and not much more.

When you click on a word in a wiki, much as you might "click" on a foreign key value in a database (an actual instance of a database), you navigate to another node. That node is called a "document" (think "record") and is set up as a tree (specified in xhtml, for example). It holds more foreign key values, by which you can find more documents (records). A wiki is a web. It can be modeled as a digraph with trees on the nodes. This is pretty much the model for many databases that are not RDBMSs by design (e.g. UniData, UniVerse, OpenQM, Revelation, jBASE, D3, Cache', UniVision).
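
As a rough illustration of "a digraph with trees on the nodes", here is a minimal Python sketch. The names `Node`, `links`, `wiki` and `follow` are made up for this example and do not come from any of the products listed above. Each page is a small tree, and the link targets found in that tree are the edges of the digraph.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a page's tree; `href` plays the role of a foreign key value."""
    tag: str
    text: str = ""
    href: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def links(node):
    """Collect every link target in a tree, in document order."""
    found = [node.href] if node.href else []
    for child in node.children:
        found.extend(links(child))
    return found


# The web itself: title -> tree (hypothetical sample pages).
wiki = {
    "Futurama": Node("page", children=[
        Node("p", "An animated series created by "),
        Node("a", "Matt Groening", href="Matt Groening"),
    ]),
    "Matt Groening": Node("page", children=[
        Node("p", "Creator of "),
        Node("a", "Futurama", href="Futurama"),
    ]),
}


def follow(title):
    """Navigate one step: titles of the pages reachable from `title`."""
    return [target for target in links(wiki[title]) if target in wiki]
```

Calling `follow("Futurama")` walks the tree of that page and returns the titles it links to, which is the "click on a foreign key value" step described above.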

> Add a secondary index and
> you can ask questions like "what are the titles of
> documents that contain the words "face" and "plate"
>
> http://www.google.com/search?domains=en.wikipedia.org&q=face+plate

While this is not how a typical wiki would read, suppose you put a proposition like this in a wiki document:

The person with ID 13425, John Smith, is a male born on 4/6/56 who works for the company with the ID 43999.

This "Person document" would have 43999 as a hyperlink to the "Organization document" whose key is 43999.

The name of this document might be Person 13425 and the name of the one you can navigate to could be Organization 43999. (Alternatively, the first might be Party_13425 and the second Party_43999)

You can logically (not with any tools of which I am aware) take such a database (Cache', for example) and view it as a wiki where the proposition above would look more like PersonId=13425 FullName=John Smith.... The reverse would take only slightly more design effort and would also be easy to do.
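
To make the "view a record as a wiki page" idea concrete, here is a small Python sketch. It is only an illustration: the `records` dictionary, the field names other than PersonId and FullName, and the `[[...]]` link notation (borrowed from common wiki markup) are assumptions of mine, not features of Cache' or any other product.

```python
# Hypothetical records keyed by document name; field names are illustrative.
records = {
    "Person 13425": {
        "PersonId": 13425,
        "FullName": "John Smith",
        "Sex": "M",
        "Born": "4/6/56",
        "Employer": "Organization 43999",  # a "foreign key" naming another document
    },
    "Organization 43999": {"OrgId": 43999},
}


def as_wiki(name):
    """Render a record as 'Field=value' text; values naming another record become links."""
    parts = []
    for field_name, value in records[name].items():
        shown = f"[[{value}]]" if value in records else str(value)
        parts.append(f"{field_name}={shown}")
    return " ".join(parts)
```

With this sample data, `as_wiki("Person 13425")` yields a line starting with `PersonId=13425 FullName=John Smith` and ending with a link to the Organization 43999 document.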

Similarly, every SQL-DBMS could be reflected as a wiki, but each node would be a list of the header and value information rather than a tree (a list because the columns need to appear in some order). Going the other direction, however, requires a list of "links" from each node to be modeled for the SQL-DBMS, which is not as trivial (it brings in that OO-RM mismatch).
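
The SQL-to-wiki direction can be sketched with Python's built-in sqlite3 module. The `person` table, its columns, and the `row_as_page` helper are hypothetical, chosen only to mirror the John Smith example. Each row comes back as an ordered list of (header, value) pairs, which is the "list rather than tree" node described above.

```python
import sqlite3


def row_as_page(conn, table, key_column, key):
    """Return one row as an ordered list of (header, value) pairs, or [] if absent.

    `table` and `key_column` are interpolated into the SQL text, so this sketch
    assumes they are trusted identifiers; only `key` is passed as a parameter.
    """
    cur = conn.execute(f"SELECT * FROM {table} WHERE {key_column} = ?", (key,))
    headers = [column[0] for column in cur.description]
    row = cur.fetchone()
    return list(zip(headers, row)) if row is not None else []


# Hypothetical in-memory table holding the example proposition.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, full_name TEXT, org_id INTEGER)"
)
conn.execute("INSERT INTO person VALUES (13425, 'John Smith', 43999)")

page = row_as_page(conn, "person", "person_id", 13425)
```

Note that nothing in the row itself says `org_id` is a link. That knowledge lives in the schema (or only in the application), which is part of why going the other direction takes more modeling.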

> This is quite useful, if the point is to generate reading material
> for a human. If the point is to capture knowledge such that
> the computer can operate on it and reason about it the way
> a human can, well, it simply doesn't do that.
>
>
> > > I wouldn't suggest
> > > a hierarchy is a good way to go, though.
> > Agreed. It seems like a di-graph (aka web) with trees on the nodes (eg
> > xhtml) is working well, however, right?
>
> Again, working well for what? The web is used for so many things,
> some of which it is good at and some not so much.
>
> The web is used as a hypertext document retrieval system, and
> it's excellent at that. Which should not be surprising, since that's
> what it's designed for.

But prior to the web, databases were also designed with a similar data model -- di-graph with trees on the nodes.

> The web is used as an application platform, and it succeeds at
> that, but mostly on the basis of the fact that it has a universal
> client (a powerful idea) that has near-total penetration on the
> basis of its killer app, which is hypertext document retrieval.

I would suggest that the "data model" is also a contributing factor to its success. It is (roughly) the same data model that has held up for decades, in spite of the RDBMS attempts to label it "legacy" and to disregard it as having been proven to be a bad model (labeled "network" or sometimes even called "hierarchical" because the nodes are trees).

> If you judge it solely on its merits as a development platform,
> and compare it to other development platforms, then I do not
> think I am exaggerating if I say in my best Comic Book Guy
> imitation: "Worst. Development. Platform. Evar."
>
> The web is used as a dbms. Put together HTTP, HTML,
> XML, XHTML, XPath, XQuery, XSLT, and pick only one
> of DTD, XMLSchema, or RELAX-NG, and you've got ...
> a frickin' nightmare.

Now that is where I can wholeheartedly agree with you. I would much prefer to work with the structure of a "real" dbms that has the same data model.

> Here I will only quote Wadler:
>
> "So the essence of XML is this: the problem it solves is
> not hard, and it does not solve the problem well."

I've used that quote before and agree with it. The data model, however, is one I prefer working with over the RDBMS model, which has castrated databases by putting them in "the form once known as 1NF", removed lists (insisting that all ordering be explicitly handled by the user/developer), and required 3VL.

In lieu of other good names for the data model I prefer, and given that it is a web where each page provides a definition/description of its "title", we could call it the wiki data model.

Other than the opinions I stated above about preferring the wiki data model to the relational data model, is there anything else that is not clear or with which you disagree? --dawn

> http://homepages.inf.ed.ac.uk/wadler/topics/xml.html#xml-essence
>
>
> Marshall
Received on Thu Jan 11 2007 - 17:03:36 CET
