Re: native xml processing vs what Postgres and Oracle offer

From: rpost <rpost_at_pcwin518.campus.tue.nl>
Date: Wed, 7 Jan 2009 17:03:22 +0000 (UTC)
Message-ID: <gk2n8q$ulk$1_at_mud.stack.nl>



Keith H Duggar wrote:

>On Nov 10 2008, 9:17 am, salmobytes <salmoby..._at_closenuf.org> wrote:
>> I'm thinking about starting a hobby project.
>> I wrote a files-based Bulletin Board years ago.
>> I'd like to convert it to a more database-like system, so
>> password-identified users could edit old posts.
>>
>> Forums are inherently hierarchical
>
>Discussions that evolve in forums are in fact not hierarchal.
>Claims that they are arise, I believe, chiefly from a lack of
>imagination and brainwashing by current interfaces.

I strongly doubt it.

>For example, one often finds the need to respond, with one
>post, to many prior posts across multiple levels in a typical
>hierarchal view such as the "tree" view Google groups creates.

Indeed, sometimes I do; but not often. Is this due to an arbitrary restriction in the interfaces, or to a more fundamental restriction in how discussions proceed? I think the latter. Replies to multiple postings would be more complex in character: quoted material would have to be marked with its originating posting in some way, and it's not clear whether such replies would be understandable to readers who arrive at them having read only one or a few of the postings they answer. Will readers be prepared to back up all the time into threads they haven't read in order to make sense of the exchange? Won't the result be the 'lost in hyperspace' problem that has caused pretty much every hypertext and website to structure its material into a hierarchy full of crosslinks, even when there is little or no technological support for doing so? I think it will.

But you have a good point: has it even been tried?

>That is what I am doing write now. This paragraph responds to
>several posts at different levels in the google tree that all
>claim forums are hierarchies. However, since google provides
>the capability to "reply" to but a single message I had to
>choose one thus perpetuating this false structuring.

In my posting software I can arbitrarily edit the References: header, but you're right, all the viewers I know only present threads as trees, never as arbitrary directed acyclic graphs.

>What's more, a forum post may respond to content from
>other forum topics, other forums or even entirely different
>sources such as articles, emails, books, television, etc.

Cross-linking in discussion happens a lot in web-based writing, of course. E.g. blogs responding to each other, with trackbacks/pings to create the forward links. This approaches what you have in mind, I think. Yet, while blogs are full of hyperlinks, their internal organization is nearly always linear or hierarchical. This is not because of necessary technological limitations, but because of limitations in their users: if blogs were not organized that way, postings would be much harder to find, to read and to write. E.g. I find editing and organizing Wikis pretty difficult.

>Even more amusing is that posts can actually preemptively
>respond to posts from the future! This most often happens
>when ignorant or lazy or time constrained or just plain
>stupid participants blurt out their two cents without having
>comprehended or read or cared (respectively) about said prior
>post that already address their belched vociferous reply.

Yes, but we can't preemptively guess NNTP Message-IDs. This is of course an implementation restriction, not a fundamental one.

>Furthermore, different parts of single post may reply to
>different subsets of prior posts, topics, forums, external,
>or future sources. Likewise those parts may respond only
>to parts of said sources.

Yes, this happens all the time, and in USENET well-established conventions exist for keeping this manageable (that I'm using here). A strong point is that they are really simple and expressed in plain text. Can something equally simple suffice for a discussion environment in which multi-replying is the norm?

>Thus, often in a general and very useful sense a post does
>not have a "parent" post in the narrow sense of a hierarchal
>tree as some have claimed here.

No, but the question is how useful it would be for the discussion environment to allow postings with *multiple* parents (meaning, I suppose, that we can navigate the postings as a DAG rather than just a tree).
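
A minimal sketch, in Python, of what navigating postings as a DAG rather than a tree could look like. The message IDs and the `parents` mapping are made up for illustration; nothing here reflects how any existing newsreader stores threads.

```python
from collections import defaultdict

# Hypothetical postings: message-id -> list of parent message-ids.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],  # one reply answering two postings at once
}

def children_index(parents):
    """Invert the parent lists so replies can be followed forwards."""
    index = defaultdict(list)
    for post, post_parents in parents.items():
        for parent in post_parents:
            index[parent].append(post)
    return index

def descendants(root, parents):
    """Every posting reachable from root via replies, each listed once."""
    index = children_index(parents)
    seen, order, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        for child in index[node]:
            if child not in seen:  # a multi-parent posting is visited only once
                seen.add(child)
                order.append(child)
                stack.append(child)
    return order

def is_tree(parents):
    """True when no posting has more than one parent."""
    return all(len(post_parents) <= 1 for post_parents in parents.values())
```

The only change from a tree is that a posting's parent becomes a list, and the traversal has to remember what it has already visited. The harder question raised above, whether readers can follow the result, is not something the data structure answers.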

>To improve the design flaws or your (and most or all other
>forums) I would humbly (because am and certainly not expert
>enough to claim this as a very "good" set of requirements)
>suggest that you aim to achieve at least the following:
>
>Phase 1 : Basic
> For every post the ability to:
> 1) refer to multiple posts (including THIS post and
> posts in other threads and forums)
> 2) refer to external sources

What does this mean, exactly? That we can follow the reference? Then just hyperlink to it, quote it, or attach a copy. That we can have multiple documents open while browsing? In my web browser I have this all the time. That we can quickly select a specific set of documents that become parents when initiating a reply? This is harder. Hypertext systems of the past supported stuff like this, but I don't know how user-friendly it is.

> 3) denote that a referent REPLIES to a referent

What does this mean? That when at the referenced source we can follow the reference backwards to arrive at the reply? This is also hard, because the software controlling the creation of the reply doesn't usually control how the referenced sources are presented (they are usually written by others and presented to others). But e.g. trackbacks/pings address it.

Anything more?

>Phase 2 : Content Parts
> For arbitrary parts of posts the ability to:
> 4) refer to multiple arbitrary parts of multiple posts

How can this be done in a sufficiently usable way?

>Phase 3 : Temporal Correction
> For arbitrary content parts the ability to
> 5) edit the content part to add or remove referents

Some forum software allows this. Replies may become invalid. What you end up with is not a discussion forum, but a Wiki: writing for Wikis is very different.

>Phase 4 : Semantic Enrichment
> 6) In addition to the basic REPLIES, the ability to
> denote that a referent SUPPORTS, DISPUTES, REBUTS,
> AGREES, CLARIFIES, CALLS-UTTER-BULLSHIT, etc a
> referent (possibility including THIS).

The problem with this idea, as with any semantic enrichment, is that the labels, even when users can be trained to apply them, will rarely be accurate, unambiguous or complete. E.g. I may agree with your premise, but disagree that it supports your conclusion. Do I get to modify your SUPPORTS to CALLS-INTO-QUESTION?
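
One way to make the disagreement concrete, as a rough Python sketch: record each label as an assertion made by a particular labeler instead of as a fact about the link, then look for links whose labelers disagree. The labeler names, posting IDs and the `contested` helper are all hypothetical.

```python
from collections import defaultdict

# (labeler, source posting, target posting, label); all values are invented.
assertions = [
    ("alice", "p2", "p1", "SUPPORTS"),
    ("bob", "p2", "p1", "CALLS-INTO-QUESTION"),
    ("alice", "p3", "p1", "REPLIES"),
    ("bob", "p3", "p1", "REPLIES"),
]

def contested(assertions):
    """Return the links that carry more than one distinct label."""
    labels = defaultdict(set)
    for _labeler, source, target, label in assertions:
        labels[(source, target)].add(label)
    return {edge: sorted(found) for edge, found in labels.items() if len(found) > 1}
```

This only detects disagreement; it does not decide whose label wins, which is exactly the open question.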

>I think you would find that the above far more advanced forum
>fits nicely into a relational model

It doesn't make any difference.

The basic issue is the need to traverse along the discussion threads, which relational systems aren't usually optimized for, if they can express it at all.
Whether the reply relation forms trees or arbitrary DAGs doesn't change this.

The resolution, I think, is to optimize this type of use, either within the query engine or in some other way.
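
For illustration, here is how such a traversal can be expressed inside the query engine, assuming one that supports recursive common table expressions (SQLite, driven from Python, is used here only because it is easy to run). The `reply` table, its columns and the `thread_below` helper are assumptions made for this sketch, and it says nothing about how well any given engine optimizes the query.

```python
import sqlite3

def thread_below(conn, root):
    """All postings reachable from root by following reply links."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(id) AS (
            SELECT child FROM reply WHERE parent = ?
            UNION  -- UNION (not UNION ALL) drops postings reached twice
            SELECT reply.child FROM reply JOIN reach ON reply.parent = reach.id
        )
        SELECT id FROM reach ORDER BY id
        """,
        (root,),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reply ("
    " child TEXT NOT NULL, parent TEXT NOT NULL,"
    " PRIMARY KEY (child, parent))"
)
# Hypothetical data: posting d replies to both b and c, so this is a DAG.
conn.executemany(
    "INSERT INTO reply VALUES (?, ?)",
    [("b", "a"), ("c", "a"), ("d", "b"), ("d", "c"), ("e", "d")],
)
```

The same query works unchanged whether each posting has one parent or several, which is the sense in which trees versus DAGs makes no difference to the relational formulation.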

>and would support more
>efficient and productive discussion. For example, imagine how
>much easier it would be to refute a vociferous ignoramus when
>they continue to repeat the same bullshit. You can simply edit
>one of your prior responses adding a CALLS-UTTER-BULLSHIT
>reference to their latest post and immediately it could appear
>in various forum views.

You can; but will you? And where do you stop? E.g. why not label with specific logical fallacies? (STRAW-MAN, AD-NAUSEAM, BEGS-QUESTION). I'll tell you: the labelers won't agree on when to use which labels.

>KHD

-- 
Reinier
Received on Wed Jan 07 2009 - 11:03:22 CST
