Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re[4]: to_number question

Re[4]: to_number question

From: Jonathan Gennick <jonathan_at_gennick.com>
Date: Mon, 19 Jul 2004 08:00:33 -0400
Message-ID: <1273679070.20040719080033@gennick.com>

NL> Thank you for the post. It seems to me that we always get further when
NL> we post what we understand, even when we are sure that our
NL> understanding is incomplete/ subject to error, rather than being
NL> afraid to expose our own ignorance.

Thanks Niall. I've been pressing the issue quite a bit actually, but only because I want to work towards an explanation that satisfies my inner soul. It's taken me awhile to get to that point. However, I had a bit of an epiphany over the weekend, and now I've got a draft of an article that I hope will make the whole issue clearer to at least a few people besides myself.

NL> It seems to me that what you describe below is a perfectly normal way NL> of reading a given sql statement - it just isn't correct :) . ...

NL> It took me a long time (I don't have a comp.sci background) to decide
NL> that a sql statement was not a set of instructions to the database to
NL> tell it how to find the data I was looking for, but a description of
NL> the set of data that I wanted back. This may or may not help your
NL> reflection.

Yes. Someone else said something that got me to thinking along those lines. What you say here is a key point. Unfortunately, I believe the SQL language designers have not done us any favors when it comes to implementing a language and syntax that reflects the proper way of thinking about queries. Verbs such as SELECT and keywords such as FROM seem to work against a correct interpretation of a query by humans.

Bear with me a moment, and let me repeat your two examples. There's more from me down below.

NL>  SELECT to_number(a.a), b.b
NL>   FROM 
NL> ( SELECT a
NL>          FROM a
NL>          WHERE a.id = 1 ),
NL>  b 
NL> where b.id=a.id;

NL> as compared with

NL> select to_number(a.a), b.b
NL> from a,b
NL> where a.id=b.id
NL> and a.id=1;

NL> I hope that you'll agree that the second query doesn't
NL> limit the optimizer to going straight to a, finding all records with NL> an id of 1 and then doing some sort of join to b.

Nor does the first query make that limitation.

NL> Your mental model however imposes this execution order NL> on the database for the first query.

No. Not at all, and this is where I think many people misunderstand me. I expect (or used to expect) that the results would be "as if" the subquery had been materialized first. However, I never required that the subquery actually be materialized. Nor did I require anything else. I only required that, no matter what the database really did, that the end results be the same "as if" the outer query had never seen rows other than those returned from the inner query.

Dan Tow pointed out in one of his earlier notes that it's at least possible to conceive of ways in which the database might operate in the manner I've just described.

The question seems to boil down to whether a row source should be treated as opaque with respect to the rest of the query. Interestingly, joins, as in your example, are part of the epiphany I had over the weekend, which has led to me the "correct" way of thinking. We allow the evaluation of WHERE clause predicates either before or after rows have been joined. This is the same issue, really, as the subquery issue we've been discussing.

If you can accept that a WHERE clause predicate can be applied to a row before that same row is tossed out by a JOIN predicate, and thus the WHERE clause predicate was applied to a row not returned by the join, and thus not in the row source for the query, then you must also accept that a predicate outside an inline view may be applied to a row that is not ultimately returned by that view.

NL> These two queries should be identical

I still don't quite accept this statement. The results are identical. The queries themselves are not. I can only accept the rewritten query as a valid interpretation of the first if I can also rationalize that the first query might result in "where b.id=a.id" being applied before the subquery result set is fully defined. Fortunately<grin>, I'm now able to accept this.

Let me try and be a bit more clear on what I just said:

Query_1 -> optimization -> execution plan

Query_1 -> transformed to query_2 -> execution plan

I can only accept the transformation to query_2 if the same, ultimate execution plan can legitimately result from query_1. I don't like to think in terms of queries being rewritten. This is probably another topic, but I believe the intent of SQL has always been:

Query -> some optimization process -> execution plan

If it helps those who program the optimizer to manipulate and rewrite the SQL in order to come up with various execution plans, that's fine, but such rewriting is, to me, an implementation detail. The execution plan in the cases we've been discussing needs to be (and is, I now believe) reachable without necessarily rewriting the SQL.

I hope I have not driven you all to distraction by the way I think about things. Rest assured, I've managed to adjust my mental model enough to allow for the behavior we've all been discussing.

Best regards,

Jonathan Gennick --- Brighten the corner where you are http://Gennick.com * 906.387.1698 * mailto:jonathan@gennick.com

Join the Oracle-article list and receive one article on Oracle technologies per month by email. To join, visit http://five.pairlist.net/mailman/listinfo/oracle-article, or send email to Oracle-article-request_at_gennick.com and include the word "subscribe" in either the subject or body.

Monday, July 19, 2004, 5:55:13 AM, Niall Litchfield (niall.litchfield_at_gmail.com) wrote: NL> Hi Jonathan

NL> Thank you for the post. It seems to me that we always get further when
NL> we post what we understand, even when we are sure that our
NL> understanding is incomplete/ subject to error, rather than being
NL> afraid to expose our own ignorance. My ignorance will as usual be
NL> interposed with yours.

NL> On Fri, 16 Jul 2004 15:54:09 -0400, Jonathan Gennick NL> <jonathan_at_gennick.com> wrote:
>> I've been on the fence about whether to post the note below.
>> I think I'll go ahead and post it, because it sums up as
>> best I can why I'm so flummoxed over this issue. It'll give
>> you an idea of where my head is at, though you might feel
>> that I have a rather strange way of looking at things.

NL> It seems to me that what you describe below is a perfectly normal way NL> of reading a given sql statement - it just isn't correct :) .

>> To get any resolution, I'm just going to have to go off in a
>> corner and reflect for awhile.

NL> It took me a long time (I don't have a comp.sci background) to decide
NL> that a sql statement was not a set of instructions to the database to
NL> tell it how to find the data I was looking for, but a description of
NL> the set of data that I wanted back. This may or may not help your
NL> reflection.


>> Neither Stephen, nor I are trying to impose execution plans
>> on the optimizer.

NL> I think you are - see below

>>
>> With SQL you have the conceptual execution of a query versus
>> the actual execution performed by the database. Having the
>> proper conceptual model in mind is critical, because that's
>> what you use when you actually write a SQL statement.
>>
>> My mental model for subqueries in the FROM clause has
>> always been:
>>
>> The outer SELECT operates against the rows
>> and columns returned by the subquery

NL> here is an attempt at an example

NL>  SELECT to_number(a.a), b.b
NL>   FROM 
NL> ( SELECT a
NL>          FROM a
NL>          WHERE a.id = 1 ),
NL>  b 
NL> where b.id=a.id;

NL> as compared with

NL> select to_number(a.a), b.b
NL> from a,b
NL> where a.id=b.id
NL> and a.id=1;


NL> These two queries should be identical (if I haven't screwed up the
NL> translation to a straight join). In both cases I'm trying to say that
NL> I want records from a and b where the id's are identical (and  are
NL> equal to 1). I hope that you'll agree that the second query doesn't
NL> limit the optimizer to going straight to a, finding all records with
NL> an id of 1 and then doing some sort of join to b. Your mental model
NL> however imposes this execution order on the database for the first NL> query.

>> Perhaps the correct mental model is:
>>
>> The outer SELECT operates against the rows
>> and columns returned by the subquery, except
>> when the optimizer surprises you by doing
>> otherwise.
>>
>> This is certainly not very satisfactory. At least, I'm not
>> at all happy with it. Better might be:

NL> Indeed, how horrible.

>> The outer SELECT and subquery together describe the state of
>> the data to be returned by the statement, but to get that
>> state the database will perform various operations in some
>> indeterminate order.

NL> This I like a lot better (it being close to my understanding :) ). Why
NL> have the clause about the database operations at all. What the
NL> database does to return the data is an implementation detail
NL> (admittedly its kind of an important detail, but consider issuing the
NL> equivalent statement against - for example - sql server, or informix.
NL> The available operations would be different, the set of data to be
NL> returned would be the same.

 

>> This I like better, but still leaves me vaguely unsatisfied.
>> For one thing, the "state of the data" cannot be properly
>> understood without also visualizing the results from the
>> subquery, so I'm left with a bit of a chicken/egg problem.

NL> That depends on the query. I think you can visualize my example above
NL> with either a subquery, or not. In *most* cases the query could be
NL> rewritten without a subquery.

----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.
--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
Received on Mon Jul 19 2004 - 06:57:27 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US