Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge)

From: Nick Landsberg <hukolau_at_NOSPAM.att.net>
Date: Sat, 29 May 2004 01:55:52 GMT
Message-ID: <I4Stc.85971$hH.1561236_at_bgtnsc04-news.ops.worldnet.att.net>


Neo wrote:

>>I changed my model ...
>>the average execution time is now down to 11.0 ms.

>
>
> XDb1 v4.5.4 takes 0.973 msec to generate the report in RAM only (does
> not include writing the report to hard disk). With SQL Server, can one
> generate the report to RAM only and measure the time? When running the
> report via Query Analyzer, does the reported time include writing
> table NCancestor to hard disk?
>
> Data: Small hierarchy of 8 things (as published on web page).
> OS: Windows NT 4, sp6a
> PC: Off-line, 500 Mhz, 512 MB, Dell PowerEdge,
> Dual Pentium, Dual Ultra SCSC II 10K RPM HDs.
> Setup: Restart PC, shut down all apps possible.
> Apps: XDb1 (v4.5.4) and Notepad.
>
> Run Time Elapsed
> --- -------------
> 1 0.972470 msec
> 2 0.973493 msec
> 3 0.974164 msec
> 4 0.972988 msec
> 5 0.972986 msec
> 6 0.973224 msec
> 7 0.969972 msec
> 8 0.977150 msec
> 9 0.971688 msec
> 10 0.971067 msec
> -----------------
> Avg 0.973 msec

Ahh... I've got you both thinking about performance. Good. (I've seen the other posts about performance by Neo and Hugo, but didn't have time to respond until tonight.)

(I will post no opinions as to whether the challenge was answered or not. Not my bag.)

First of all, you have both (Neo and Hugo) done a good thing, so that those of us in who are anal retentive about performance can try to extrapolate apples to apples, if possible. Thanks.

Also, by shutting down all other apps, your elapsed time approximates the CPU time. (In my environment I can measure the actual CPU time by several means, but I won't belabor that point.) For the sake of argument, let's call the above results 1 ms. for this query. Since the application(s) *I* deal with don't usually deal with extensive "reports", rather they deal with queries for specific items within a rather large universe. (Hash tables preferred for individual items, multi-table queries discouraged. We've found that out by testing.)

Let me try to do some extrapolation based on relative CPU speeds and other factors.

Our typical single record query on a 1.28 MHz machine (most efficient mode) is about 100 microseconds for an in memory database. The typical query is not a "join" but extracting the foreign key from the first record and then accessing the next table, and so on. (Relational purists, please don't look! :) ) There are typically 5-6 of these during the processing of a transaction, so that this would take approximaely 500-600 usec.

Your results for a 500 MHz machine would extrapolate to about 400-500 usec on our box (assuming the query would have to be single-threaded). So, if we ever had to formulate a query of that kind, your method would be equivalent or marginally faster if our hierarchy was small.

Now, the kickers. (Not meant with malice)

  1. - You, Neo, know this implementation, inside and out. How long would it take for the average programmer to become proficient with the database layout to get the same results you did? Note: I have meddled in the Relational world since about 1980, and have a pretty good feel for how to optimize RDBMS's. Even to this day, not everyone has that feel. Could a bad design of the schema still "work" but give 10-100-1000 times the numbers that you posted above? Not everyone is destined to be a superstar.
  2. - How does it scale? Hugo has posted (and admitted) that his model seems to be exponential with the number of records in the hierarchy (not surpising to me, since I know a tad about the innards of the relational implementations and query processing). You may have posted that too, but, if you have, I missed it.

No, I'm not busting chops. Just (hopefully) asking questions which will have to be answered along the way to making this an industrial-strength product.

NPL

--
"It is impossible to make anything foolproof
because fools are so ingenious"
  - A. Bloch
Received on Sat May 29 2004 - 03:55:52 CEST

Original text of this message