Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge)

From: Hugo Kornelis <hugo_at_pe_NO_rFact.in_SPAM_fo>
Date: Sun, 30 May 2004 00:24:48 +0200
Message-ID: <pd2ib0t0lvdm3g2bpjksinb823bofainua_at_4ax.com>


On 29 May 2004 11:18:22 -0700, Neo wrote:

>> There seems to be some puzzling differences in our time measurements.
>
>I believe the reason why measurements made on my machine (500 Mhz)
>aren't translating into proportionally shorter measurements on your
>machine (1.3 Ghz) is due to a difference in disk I/O characteristics.
>XDb1 v4.5.3 (and earlier) writes to the hard drive at the end of each
>line on the report.
>
>Are the below sequence of events approximately correct when creating
>the report via SQL Server's Query Analyzer?
>
>1. Run report procedure
>2. Print Start Time.
>3. Create Report in RAM table.
>4. Issue asynchronous command to save RAM table to HD.
>5. Print End Time.
>6. RAM table continues to be written to HD.

Hi Neo,

No, they are not. There is no such thing as a "RAM table" in MS SQL Server. And if there were, you would have been able to see it, since I posted the script that produces the report.

Why are you asking, by the way? Even if it were true, what is the relevance? I didn't ask whether XDb1 uses RAM or disk for storage, and it is of no importance to the challenge. The challenge was to produce a report; my solution does produce a report (as does XDb1).

To save you the trouble of going through Books Online (the full documentation of SQL Server, available on the CD but also browsable on the internet), I'll give you a quick breakdown of how SQL Server stores data.

All modifications to data are written to disk. In fact, they are written *twice*. The data file contains the actual table contents. The layout is optimised for fast access. Data pages are cached to speed up retrieval if the same data is to be accessed again. Changes are made to data pages in cache; writing the changes to disk is an asynchronous process. But to prevent data loss due to power outage, all changes are also written to the log file. These writes are synchronous - SQL Server will never report a transaction completed until the HD controller confirms that the relevant log pages have been written to disk.
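To make that a bit more concrete, here is a small sketch you could paste into Query Analyzer (the table name LogDemo is just something I made up for the example):

-- Example table; the name is arbitrary
CREATE TABLE LogDemo (id int NOT NULL, val varchar(50))

BEGIN TRANSACTION
INSERT INTO LogDemo (id, val) VALUES (1, 'hello')
-- The COMMIT does not return until the log records for this
-- transaction have been written to the log file on disk.
COMMIT TRANSACTION

-- The modified data page may still exist only in cache;
-- CHECKPOINT forces all dirty pages to be written to the data file.
CHECKPOINT

DROP TABLE LogDemo

The COMMIT can't return before the log write, but the data page write happens whenever SQL Server gets around to it (CHECKPOINT just forces the issue).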

Another thing to be aware of is the overhead introduced by the locking mechanism. Not only every write, but even every read requires a check whether a lock can be acquired, the actual setting of that lock, and the release of the lock when the transaction is complete. When running the tests on large sets of data (to answer Nick's messages), I did run one test with SQL Server started in single user mode. This reduced the elapsed time by some 20 to 30%!! I decided to leave those tests out of the results I posted here, since running SQL Server in single user mode is very atypical (in fact, this mode is intended for maintenance work only).
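If you want to see that locking in action, here's a sketch (SomeTable and SomeColumn are made-up names, of course):

BEGIN TRANSACTION
-- Even a single-row update has to acquire locks, which are then
-- held until the transaction ends.
UPDATE SomeTable SET SomeColumn = SomeColumn WHERE id = 1
-- sp_lock lists the locks currently held by this connection.
EXEC sp_lock @@SPID
ROLLBACK TRANSACTION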

Best, Hugo

-- 

(Remove _NO_ and _SPAM_ to get my e-mail address)