Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge)
Date: 23 May 2004 13:49:30 -0700
Message-ID: <4b45d3ad.0405231249.3a2a28f6_at_posting.google.com>
> OK folks, I have found this discussion fascinating and educational.
> Given the possible implementations, how long would this query run
> on a 1 million record database?
It could take hours, days, months, maybe even years, depending on the
number of interconnections between things (I believe the human brain
performs somewhat similar computations but in a massively parallel
manner)!
Sorry, but XDb1 is not suitable for commerical applications yet. It is
currently mostly an experimental db meant to test out an alternate
data model. IMO, an apple-to-apple comparision hasn't been made yet
because of the following differences:
The provided RM db's schema is not as generic as XDb1's in the
following respects:
1. It does not model a thing's name as an attribute.
2. It does not model attributes with multiple values in a normalized
manner.
3. It requires a separate table for values of each each data type!!!
4. It does not model the class hierarchy in the appropriate table thus
making it inaccessible for reporting.
5. It does not model a way to normalize names of things. (ie
person/john and dog/john)
6. It does not model a way to normalize parts of a name (ie 'simth' in
'john smith' and 'mary smith')
7. It does not model a way to normalize symbols in names. (ie 'o' in
'john' and 'god')
Also the provided RM db's data is not as normalized as XDb1's. RM's has duplicate data which needs to be replaced with proper refs (keys) to the one and original data within the db. A proper ref must be independent of data being represented, otherwise problems arise. For example, in RM's, two things can't have the same name and can't properly represent an unnamed thing.
With a very small dataset, on a off-line, 500 Mhz, Dell PowerEdge, 512 MB, Ultra SCSC II, 10,000 RPM HDs, NT 4 sp6a, SQL Server 7: RM's solution takes 65 ms on avg, but as low as 30ms and as high as 1186 ms. XDb1's simplified solution (only printing thing's ID) which very roughly approximates RM's solution (prints char-based keys which happen to encode thing's name also), takes as high as 2.23 ms. XDb1's generic solution (which is normalized down to atomic symbols and accesses the same symbol 'o' in the db when printing 'john' and 'god' on the report and will print an unnamed thing's classes and properties), takes 2 or 3 ms.
Please note: the above are apple-to-orange comparisions. Received on Sun May 23 2004 - 22:49:30 CEST
