Re: Xquery might have some things right

From: Eric Kaun <ekaun_at_yahoo.com>
Date: Mon, 08 Mar 2004 14:32:10 GMT
Message-ID: <Kt%2c.21999$Ao4.10506_at_newssvr31.news.prodigy.com>


"Corey Brown" <corey_at_spectrumsoftware.net> wrote in message news:V5m2c.66912$Tn.41083_at_bignews5.bellsouth.net...
>
> "Mikito Harakiri" <mikharakiri_at_iahu.com> wrote in message news:52a2c.37$zW4.263_at_news.oracle.com...
> > [SNIP]
> > Sorry, but no, there is no logical redundancy in my approach. Every client
> > queries only the info they want to know. It is extremely rare when client
> > wants the whole database. Usually, client query filters most of the data out
> > according to some criteria. The other common case is aggregated report,
> > where the output is small as well. In both cases extra information is
> > quickly discarded on server and would never cross the network boundary.
> > Whereas in your broadcasting proposal you flood the network with spam that
> > clients rarely care about.
>
> Sorry, you're wrong again. I described a publish and subscribe mechanism
> that broadcasts information only to those clients that have "subscribed" to
> that particular publication.

Here's the key word in the entire argument: "publication." This is where the XML / text publishing camp and the relational camp diverge greatly. Perhaps there are different needs that favor the different camps, but I think the relational camp simply subsumes the others because of its ability to represent hierarchies simply and cleanly, whereas the converse is not true.

What is a "publication"? If that's a simple thing to identify, then fine - you're sending a doc identified by some ID or other. What about when a client wants only part of a document? What about when they want a small collection of documents? What about when they want pieces of various documents matching certain criteria?

If their needs are simple, then the simple XML solution might work. When needs increase in complexity (i.e. specificity), then the "specification" that the clients use to indicate what "publication" they want approaches a relational query. And trying to accomplish that via an XML request is doomed to overcomplication, for all the reasons spelled out in Fabian Pascal's and Chris Date's various books.
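To make that concrete, here is a minimal sketch of the three kinds of request mentioned above (part of a document, a small collection, pieces matching criteria) written as plain queries. It uses Python's sqlite3 module; the `section` table, its columns, and the sample rows are all invented for illustration and are not anyone's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: each "publication" is stored as rows of sections.
conn.executescript("""
    CREATE TABLE section (doc_id TEXT, seq INTEGER, topic TEXT, body TEXT);
    INSERT INTO section VALUES
        ('D1', 1, 'prices', 'ABC closed at 10.5'),
        ('D1', 2, 'news',   'ABC names new CEO'),
        ('D2', 1, 'prices', 'XYZ closed at 20.0'),
        ('D3', 1, 'news',   'XYZ opens new plant');
""")

def ask(sql, params=()):
    """Run one client request and return all matching rows."""
    return conn.execute(sql, params).fetchall()

# 1. Only part of one document.
part_of_doc = ask(
    "SELECT body FROM section WHERE doc_id = ? AND seq = ?", ("D1", 2))

# 2. A small collection of documents.
small_collection = ask(
    "SELECT doc_id, seq FROM section WHERE doc_id IN (?, ?) "
    "ORDER BY doc_id, seq", ("D1", "D2"))

# 3. Pieces of various documents matching some criterion.
by_criteria = ask(
    "SELECT doc_id, body FROM section WHERE topic = ? ORDER BY doc_id",
    ("news",))
```

Each "subscription specification" ends up being a restriction plus a projection, which is exactly what a query language already provides.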

Again, there are applications where the simplistic need is the only need... but they're subsumed by relations.

> In your solution, every client interested in a particular
> piece of information would have to query the server, essentially at the same
> time, to retrieve it. Unless you can think of some way of throttling your clients,
> I don't see how your solution would ever scale.

You're assuming a remote DB access that works the same way as existing JDBC and ODBC libraries: creating a connection, sending various queries, closing the connection. The query could just as easily work like an HTTP request, for which there are many varied "scalable" solutions. Or it could use Jabber protocols, or some asynchronous protocol. There are many options. The point is this: when client needs vary widely, trying to "push" to each of them involves a lot of extra complexity on the server side, plus you have to reinvent a pseudo-relational query mechanism anyway. Another wheel!

And as Mikito pointed out, doesn't the client have to know a schema anyway? Two, in fact: the query schema and the response schema.

> Allowing clients to directly query the server via SQL makes your design
> too fragile for real world application. Server side schema or application
> changes will immediately break every client in your system. Your solution
> requires that clients have intimate knowledge of the servers table structures
> in order to properly formulate a "query" that would allow the client to pull
> the information that they're interested in.

Depends on what you mean by "intimate." Knowing relation names and attribute names isn't very fragile to me. Knowing schema nesting is. And views hide most of this problem. If your data changes so much that the views can't keep up, then you'd have a corresponding mangling of your existing schemas anyway, so in either case the client needs to change.
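As a rough illustration of the point about views, the sketch below restructures the base tables behind a view while the client's query text stays unchanged. It uses Python's sqlite3 module; the names `quote_v1`, `security`, `last_price` and the view `quotes` are hypothetical, chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical original schema: one base table, published through a view.
conn.executescript("""
    CREATE TABLE quote_v1 (symbol TEXT, price REAL);
    INSERT INTO quote_v1 VALUES ('ABC', 10.5), ('XYZ', 20.0);
    CREATE VIEW quotes AS SELECT symbol, price FROM quote_v1;
""")

# The only thing the client knows: a view name and two attribute names.
CLIENT_QUERY = "SELECT symbol, price FROM quotes ORDER BY symbol"
before = conn.execute(CLIENT_QUERY).fetchall()

# Server-side change: split the data into two tables and redefine the view.
conn.executescript("""
    DROP VIEW quotes;
    CREATE TABLE security (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE last_price (security_id INTEGER, price REAL);
    INSERT INTO security (id, symbol) VALUES (1, 'ABC'), (2, 'XYZ');
    INSERT INTO last_price VALUES (1, 10.5), (2, 20.0);
    DROP TABLE quote_v1;
    CREATE VIEW quotes AS
        SELECT s.symbol, p.price
        FROM security s JOIN last_price p ON p.security_id = s.id;
""")

# Same client query, unchanged, against the restructured database.
after = conn.execute(CLIENT_QUERY).fetchall()
```

The client breaks only if the view itself can no longer be defined, which is the case where any published schema, XML or otherwise, would have to change too.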

Another advantage of relational: you know the structure of the answer from the question you ask. Not so in any XML/SOAP examples I've seen: you submit a query XML doc, and you receive a response XML doc, with no necessary correspondence between the two. You just have to know what to expect. And if there's one thing that kills integration, it's when you Just Have To Know.
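A small sketch of what "knowing the structure of the answer from the question" looks like in practice: the heading of the result is fixed by the SELECT list, before any data is examined. This uses Python's sqlite3 module, and the `trade` table and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of trades.
conn.executescript("""
    CREATE TABLE trade (symbol TEXT, qty INTEGER, price REAL);
    INSERT INTO trade VALUES
        ('ABC', 100, 10.5), ('ABC', 50, 11.0), ('XYZ', 10, 20.0);
""")

query = ("SELECT symbol, SUM(qty) AS total_qty "
         "FROM trade GROUP BY symbol ORDER BY symbol")
cur = conn.execute(query)

# The result heading follows directly from the SELECT list of the query.
heading = [col[0] for col in cur.description]
rows = cur.fetchall()
```

Here `heading` contains exactly the two names written in the query, so the client needs no separate "response schema" beyond the question it asked.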

> Interfaces between systems must form an abstraction layer between the
> server and the client in order to insulate one from the other and to allow
> for application flexibility.

And that's precisely what relational was designed for and gives you! Even SQL does it.

> Not to mention, I don't know of too many application
> owners that are going to allow untrusted clients to have direct database access
> to their systems.

What do you mean by "direct database access"? The ability to submit queries doesn't imply a need for "direct database access."

> Even if the access is in a read-only mode, there may be information
> stored on the server that is considered sensitive and cannot be shared with other systems.
> For instance let's say your bank account numbers or social security number was stored along
> with your stock information. Would you want external clients to be able to randomly
> query for such information?

You already know the answer to that, and relationally-specified security constraints give a solution. Allowing the submission of relational queries doesn't imply that you open the entire schema!
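As a hedged sketch of accepting queries without opening the whole schema: a full DBMS would normally do this declaratively (for example with GRANT on selected views), but SQLite has no GRANT, so the example below stands in with the authorizer callback that Python's sqlite3 module exposes via `set_authorizer`. The `account` table, its columns, and the `SENSITIVE` set are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical list of (table, column) pairs that clients may never read.
SENSITIVE = {("account", "ssn"), ("account", "account_no")}

def authorizer(action, arg1, arg2, db_name, source):
    """Deny any read of a sensitive column; allow everything else."""
    if action == sqlite3.SQLITE_READ and (arg1, arg2) in SENSITIVE:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (owner TEXT, ssn TEXT, account_no TEXT,
                          symbol TEXT, shares INTEGER);
    INSERT INTO account VALUES ('pat', '000-00-0000', 'A-1', 'ABC', 100);
""")
# From here on, every statement is checked against the constraint above.
conn.set_authorizer(authorizer)

# A query over the permitted attributes works as usual.
allowed = conn.execute("SELECT owner, symbol, shares FROM account").fetchall()

# A query touching a sensitive attribute is rejected when it is compiled.
try:
    conn.execute("SELECT owner, ssn FROM account")
    denied = False
except sqlite3.DatabaseError:
    denied = True
```

The client can still submit arbitrary queries; the constraint decides which parts of the schema those queries are allowed to see.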

> An AAIS is the only real world solution, and the AAIS has to be agreed upon
> by both parties.

I'd appreciate a definition of "AAIS." Not just what the acronym stands for.

> The message protocol used between the server and client needs
> to be robust enough

What does "robust enough" mean? I hate the word "robust" - it's just motherhood and apple pie (unless you can clarify it for me).

> to allow the server to be able to service multiple clients with
> the same message protocol.

In other words, your protocol has to be complete and allow some degree of flexibility?

> This means that the messaging layer needs to be wordy
> enough for the clients to be able to distinguish between the parts of the message
> that is meant for them specifically and what is part of the message is really only meant
> for another client.

Oh that sounds phenomenally bad. In other words, client A is going to see stuff "intended" (whatever that means) for client B, whether it wants it or not? Isn't that wasted bandwidth?

> XML fits that need fairly well.

If the "need" is to filter out noise, then maybe (I'm being generous). But I thought the need was for client X (any client) to get a coherent answer to its questions?

> But what I am saying is
> that it fits a particular class of computing problem very nicely and should not be discredited
> simply because it can't be queried as nicely as a relational model.

And I'm saying that that class is both vanishingly small (at least with an app that grows in any meaningful way), and completely subsumed by the relational model.

- erk
Received on Mon Mar 08 2004 - 15:32:10 CET
