Re: Dreaming About Redesigning SQL

From: Lauri Pietarinen <lauri.pietarinen_at_atbusiness.com>
Date: Sun, 12 Oct 2003 00:15:21 +0300
Message-ID: <bm9rvi$les$1_at_nyytiset.pp.htv.fi>


Patrick K. O'Brien wrote:

>lauri.pietarinen_at_atbusiness.com (Lauri Pietarinen) writes:
>
>Please allow me to take exception to a few of your points.
>
Sure!

>>If you take a standard SQL-query with sorting and grouping and some
>>joins and compare it with hand written "navigational" code you will
>>notice that what can be expressed in SQL in 5-10 lines of code will
>>require several pages of hand written code for equivalent result.
>>
>>
>
>That really depends on two things: the specific SQL query, the
>"navigational" programming language, and the capabilities of the
>ODBMS. In general, I would agree that a declarative language is going
>to be able to express an operation more concisely than a procedural or
>object-oriented language. But a language like Python is very powerful
>and expressive. I can do the equivalent of a simple SQL query in
>about as many lines of code in Python using the PyPerSyst ODBMS. It
>would take an incredibly complex SQL query before I'd get to "several
>pages of hand written code". PyPerSyst itself is only several pages
>of hand written code. ;-)
>
Well, just to make things a bit more concrete, how would you write the following query

SELECT c.custname, p.prodname, sum(od.order_qty*p.prod_price) as part_cust_total
  from customer c,

     order o,
      order_detail od,
      part p

where c.custid= o.custid and
o.orderid = od.orderid and
od.partid = p.partid
group by c.custname, p.partname
order by part_cust_total

with obvious tables:
customer(custid, custname)
order(custid, orderid, orderdate, etc...) order_detail(orderid,partid,order_qty)
product(prodid, prodname, prodprice)

>>The third issue is that the SQL query can be optimised to a much
>>higher degree than procedural code. Say you add a new index. The
>>DBMS will/can immediately start using it without any user
>>intervention (if it decides to). In the procedural alternative you
>>would have to recode your query.
>>
>>
>
>You wouldn't necessarily *have to*. PyPerSyst does not yet have a
>declarative query capability. But if you add an index, the same
>procedural code will now make use of that index (or rather, the
>procedural code is calling methods of objects that are smart enough to
>look for indexes to optimize themselves without requiring any changes
>in application code).
>
If there are several indexes available, how does it choose which one to use?

>>If the parameter given at run time is 'M' then it could be
>>advantageous to use an index, but not if parameter is 'F'. This
>>decision could be made by the optimiser "on the fly". I hope you get
>>the picture...
>>
>>
>
>I hope you get the picture that every example you have given can be
>true about procedural code. Here is a very simple example of the
>source code for the "find" method of the PyPerSyst Extent class:
>
> def find(self, **criteria):
> """Return list of instances exactly matching all criteria."""
> instances = []
> names = criteria.keys()
> spec = self._makekeyspec(names)
> if spec in self._altkeys:
> # Use alternate keys as a shortcut.
> key = self._makekey(criteria, spec)
> d = self._altkeys[spec]
> if key in d:
> instance = d[key]
> instances.append(instance)
> else:
> # Scan all instances for a match.
> for instance in self._instances.values():
> match = True
> for name, value in criteria.items():
> if getattr(instance, name) != value:
> match = False
> break
> if match:
> instances.append(instance)
> return instances
>
This find method returns a list of instances matching some criteria

>supplied by the application code. A specific example of its use would
>look like this:
>
> people = db.root['Person'].find(name='Lauri Pietarinen')
>
>If there is an alternate key index on the name attribute, the find
>method will use that as an internal optimization. If one does not
>exist, then it will scan all instance of the Person class looking for
>a match. Let's assume that today there is no index, and all instances
>are scanned. If tomorrow the schema is changed to declare that there
>should be an alternate key enforced on the name attribute, then
>PyPerSyst will build and maintain an index that will be used to
>enforce that constraint. As a side benefit, the find method will also
>use that index to optimize its performance. The application code will
>not have changed.
>
What if this method was to return, say 1 million rows, but I only wanted to show the first
10 on screen? Would I have to wait for all the rows to be read? What if I want the rows
in a spesific order (not necessarily the same as the search criteria)?

>Thanks for listening. :-)
>
>

No problem!

best regards,
Lauri Pietarinen Received on Sat Oct 11 2003 - 23:15:21 CEST

Original text of this message