Re: Question on Structuring Product Attributes
From: Jan Hidders <hidders_at_gmail.com>
Date: Tue, 8 Oct 2013 19:51:35 +0200
Message-ID: <52544627$0$1659$e4fe514c_at_dreader35.news.xs4all.nl>
>>> In article <524d7370$0$618$e4fe514c_at_dreader34.news.xs4all.nl>,
>>> Jan Hidders <hidders_at_gmail.com> wrote:
>>>
>>>
>>> Loosely speaking, the complexity, measured as a function of the size of the
>>> database (the so-called “data complexity”), of answering a query of
>>> relational
>>> calculus is in L (solvable in deterministic logarithmic space), so it is
>>> tractable. If the query language is extended with existential second-order
>>> quantification, the problem becomes NP-complete. Adding only transitive
>>> closures
>>> makes the problem complete for NL (still in P). These results (and others)
>>> can
>>> be found in “The complexity of relational query languages” by M. Vardi.
>>> This
>>> should partially answer your latter question.
Date: Tue, 8 Oct 2013 19:51:35 +0200
Message-ID: <52544627$0$1659$e4fe514c_at_dreader35.news.xs4all.nl>
On 2013-10-08 12:17:15 +0000, Nicola said:
> In article <5251dfef$0$574$e4fe514c_at_dreader34.news.xs4all.nl>, > Jan Hidders <hidders_at_gmail.com> wrote: > >> On 2013-10-05 15:48:36 +0000, Nicola said: >>
>>> In article <524d7370$0$618$e4fe514c_at_dreader34.news.xs4all.nl>,
>>> Jan Hidders <hidders_at_gmail.com> wrote:
>>>
>>>> On 2013-10-02 18:55:36 +0000, Nicola said: >>>> >>>>> In article <1601143370402385874.040013hidders-gmail.com_at_news.xs4all.nl>, >>>>> Jan Hidders <hidders_at_gmail.com> wrote: >>>>> >>>>>> Derek Asirvadem <derek.asirvadem_at_gmail.com> wrote: >>>>>>> This one should not pass unnoticed. >>>>>>> >>>>>>> On Monday, 18 February 2013 09:09:15 UTC+11, Eric wrote: >>>>>>>> >>>>>>>> To quote from Codd[1970]: >>>>>>>> >>>>>>>> "A firstorder predicate calculus suffices >>>>>>>> if the collection of relations is in normal form." >>>>>>> >>>>>>> Codd specifies what that Normal Form is, that reference is not merely a >>>>>>> generic or general term. >>>>>> >>>>>> Indeed. Also note that Codd's claim is not unproblematic. Even if you >>>>>> restrict yourself to the flat relational model, first order logic cannot >>>>>> express many practical queries. So what exactly does he mean by >>>>>> "sufficient" here? >>>>> >>>>> I interpret that quote as follows: "If we allow relations as attribute >>>>> values, >>>>> we need (at least) a second-order relational calculus". In this sense, >>>>> first-order logic "suffices" when relations are "flat" (we don't need to >>>>> quantify over relations). Of course I agree with you, many interesting >>>>> queries >>>>> cannot be expressed in FOL, but you can always design a query language >>>>> with >>>>> additional power (SQL is an example). >>>> >>>> Agreed, that is indeed probably what he meant. But I guess I'm trying >>>> to argue how weak that argument actually is. Sure, keeping things >>>> first-order looks like something that keeps it simple, but does it >>>> really? If you need extensions anyway, how bad then, relatively >>>> speaking, is it to move to a higher-order setting?
>>>
>>> Loosely speaking, the complexity, measured as a function of the size of the
>>> database (the so-called “data complexity”), of answering a query of
>>> relational
>>> calculus is in L (solvable in deterministic logarithmic space), so it is
>>> tractable. If the query language is extended with existential second-order
>>> quantification, the problem becomes NP-complete. Adding only transitive
>>> closures
>>> makes the problem complete for NL (still in P). These results (and others)
>>> can
>>> be found in “The complexity of relational query languages” by M. Vardi.
>>> This
>>> should partially answer your latter question.
>> >> I'm a little bit insulted that you think I did not know that. :-) > > Sorry, I've realized that I was not adding much to the discussion at the same > time as I've hit "Post"... That's what happens when one replies without > forethought :) > >> There are similar results for the Nested Relational Calculus, so I do not see >> that as a very convincing reason to stick to a first order data model. >> But maybe that's not what you meant. >> […] >> Computational complexity is not the only thing that determines where a >> database query or integrity language is an effective one. I'm not >> saying these results do not matter or should not be taken into account >> when desiging these languages, but ultimately the proof of the pudding >> is in the eating, not in theorizing about it. > > If you mean that theoretical bounds on the complexity do not say much about the > actual behavior of an algorithm, I thoroughly agree (if not, I'm afraid I don't > understand what “effective” means to you).
Close. I define a good query language as one that makes the DBMS a better DBMS(*). Computational complexity is only one aspect, and not necessarily the decisive one. A language can be computationally expensive but still good because it is easy to use and understand, and it is implemented such that a well-defined subset is efficient, and still the more expensive queries can be delegated to the DBMS if you want, et cetera.
I know this is comp.databases.theory, but let us not forget that theory is an approximation of reality, simplified enough so the theoreticians can have their theorems. :-)
(*) Not a very well-defined concept, I know, but that's besides the point.
- Jan Hidders