NNTP-Posting-Date: Fri, 04 Nov 2005 05:59:53 -0600
From: "VC" <boston103@hotmail.com>
Newsgroups: comp.databases.theory
References: <1130610162.744817.239070@g14g2000cwa.googlegroups.com> <rpqdnYfKoIs1XPneRVnyrQ@pipex.net> <1130765800.098046.93870@f14g2000cwb.googlegroups.com> <9v-dnYE-SanY6PreRVnygg@pipex.net> <y6OdnRe7l8Eeb_reRVn-qA@comcast.com> <YHY9f.4683$Rl1.3852@newsread1.news.pas.earthlink.net> <P6KdnREFdtrKAfXeRVn-jQ@comcast.com> <goqdnYb4Cps9JfXenZ2dnUVZ8qudnZ2d@pipex.net> <1130939293.784399.319040@g43g2000cwa.googlegroups.com> <e-Sdndl8GKQgSPXenZ2dnUVZ8qednZ2d@pipex.net> <1130949100.933681.55190@z14g2000cwz.googlegroups.com> <1130965639.482009.182400@g49g2000cwa.googlegroups.com> <xpednZGpW5rOofTeRVn-pQ@comcast.com> <xGctaeGgBWaDFwgS@shrdlu.com> <MNqdnfUFqvsGGfTenZ2dnUVZ_tGdnZ2d@comcast.com> <2eyaf.8082$Hj2.1929@news-server.bigpond.net.au>
Subject: Re: Modelling objects with variable number of properties in an RDBMS
Date: Fri, 4 Nov 2005 07:00:03 -0500
Message-ID: <naOdnUMTycmk0PbeRVn-tw@comcast.com>


"Frank Hamersley" <terabitemightbe@bigpond.com> wrote in message 
news:2eyaf.8082$Hj2.1929@news-server.bigpond.net.au...
> VC wrote:
>> "Bernard Peek" <bap@shrdlu.com> wrote in message 
>> news:xGctaeGgBWaDFwgS@shrdlu.com...
>>
>>>In message <xpednZGpW5rOofTeRVn-pQ@comcast.com>, VC 
>>><boston103@hotmail.com> writes
>>>
>>>>>Should these tests really be considered attributes?  Wouldn't the act
>>>>>of presenting them as a two-tuple relation <test#, test result> simply
>>>>>be a wise step of normalization - not an example of an EAV
>>>>>decomposition?  Am I totally missing the boat on this?
>>>>
>>>>It's a single test with ~6000 measurements.
>>>
>>>That doesn't sound like something that requires multiple attributes. If 
>>>it's a single test then the results are presumably in one domain so a 
>>>single entity should work.
>>
>> I am not intimately familiar with the testing process.  According to the 
>> person who created the model, it's a drug discovery chemical compound 
>> testing process which runs daily.  The test result for a given compound 
>> is represented by many thousands of numbers,  hence 6000 attributes per 
>> entity. They generated close to half a terabyte of experimental data each 
>> day.
>>
>>>What distinguishes one set of test results from another? Is it the time 
>>>the test was done, the temperature?
>
> It seems to me that this sounds like a situation where practical concerns 
> about coping with the volume of data have induced an attempt to achieve 
> compression (by reducing the repeated occurrence of key information that 
> would arise were the schema to be normalised per RM theory).

"Compression" or disk space saving was not a consideration.  As I said 
several times before, bad performance due to excessive unions/joins was. 
They simply could not process the data on time.
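
To make the join cost concrete, here is a minimal sketch using Python's 
sqlite3 module. It assumes a hypothetical narrow table 
result(compound_id, measurement_no, value); the table name, column names, 
sample values and the tiny measurement count are all made up for 
illustration and are not the actual schema or DBMS. It shows one plausible 
source of such joins: rebuilding a single wide row per compound from the 
narrow form takes one extra self-join per measurement.

```python
import sqlite3

N_MEASUREMENTS = 5  # hypothetical; the real test had thousands per compound


def build_narrow(conn, n_compounds, n_measurements):
    """Create and fill the narrow (compound_id, measurement_no, value) table."""
    conn.execute(
        "CREATE TABLE result ("
        "compound_id INTEGER, measurement_no INTEGER, value REAL, "
        "PRIMARY KEY (compound_id, measurement_no))"
    )
    rows = [
        (c, m, float(c * 100 + m))  # made-up sample values
        for c in range(1, n_compounds + 1)
        for m in range(1, n_measurements + 1)
    ]
    conn.executemany("INSERT INTO result VALUES (?, ?, ?)", rows)


def wide_row_query(n_measurements):
    """Build a query returning one wide row per compound via self-joins."""
    cols = ", ".join(f"r{m}.value AS m{m}" for m in range(1, n_measurements + 1))
    # One additional self-join for every measurement after the first.
    joins = " ".join(
        f"JOIN result r{m} ON r{m}.compound_id = r1.compound_id "
        f"AND r{m}.measurement_no = {m}"
        for m in range(2, n_measurements + 1)
    )
    return (
        f"SELECT r1.compound_id, {cols} FROM result r1 {joins} "
        "WHERE r1.measurement_no = 1 ORDER BY r1.compound_id"
    )


conn = sqlite3.connect(":memory:")
build_narrow(conn, n_compounds=3, n_measurements=N_MEASUREMENTS)
query = wide_row_query(N_MEASUREMENTS)
wide_rows = conn.execute(query).fetchall()
join_count = query.count("JOIN")
print(join_count, wide_rows[0])
```

The number of joins in the generated query is always one less than the 
number of measurements, so with thousands of measurements per compound the 
join count grows in step. That is the kind of cost referred to above.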

> If the first model produces 1/2 Tb then the second will surely occupy much 
> more space.

See above.

>I don't think I could ever sell out the RM and go with option 1.  It is much 
>better IMO to find ways to overcome the problems of the second option by 
>choosing the right DBMS and configuring it appropriately, rather than (to 
>coin a metaphor) to "use an SUV (or even a fleet of SUV's) to move 20 
>tonnes of freight from coast to coast"!

As I said before,  replacing SQL Server with another db was not an option 
due to non-technical reasons.




