Re: Mixing OO and DB

From: David BL <davidbl_at_iinet.net.au>
Date: Tue, 12 Feb 2008 18:06:55 -0800 (PST)
Message-ID: <9a8cea05-05de-43df-9db2-41cb873a9363_at_d4g2000prg.googlegroups.com>


On Feb 12, 9:53 pm, JOG <j..._at_cs.nott.ac.uk> wrote:
> On Feb 12, 5:37 am, David BL <davi..._at_iinet.net.au> wrote:
>
> > On Feb 12, 10:01 am, JOG <j..._at_cs.nott.ac.uk> wrote:
>
> > > On Feb 11, 4:10 pm, David BL <davi..._at_iinet.net.au> wrote:
>
> > > > On Feb 11, 11:08 pm, JOG <j..._at_cs.nott.ac.uk> wrote:
>
> > > > > On Feb 11, 12:44 pm, David BL <davi..._at_iinet.net.au> wrote:
>
> > > > > > On Feb 11, 8:07 pm, JOG <j..._at_cs.nott.ac.uk> wrote:
>
> > > > > > > On Feb 11, 2:05 am, David BL <davi..._at_iinet.net.au> wrote:
>
> > > > > > > > On Feb 11, 3:29 am, JOG <j..._at_cs.nott.ac.uk> wrote:
>
> > > > > > > > > On Feb 10, 5:45 pm, "Dmitry A. Kazakov" <mail..._at_dmitry-kazakov.de>
> > > > > > > > > wrote:
> > > > > > > > > > [What is data, in your opinion?
>
> > > > > > > > > Data. Lots of datum - from latin, meaning statement of fact. Predicate
> > > > > > > > > and value in FOL. A value without description is of course just
> > > > > > > > > noise.
>
> > > > > > > > Latin datum is past participle of dare, "to give". What make you say
> > > > > > > > data is necessarily a set of propositions?
>
> > > > > > > The OED. "Facts, esp. numerical facts, collected together for
> > > > > > > reference or information." The etymology stems from 'dare', because
> > > > > > > facts are always communicated or "given". I understand of course that
> > > > > > > > the term is thrown around wantonly and ambiguously nowadays, but as
> > > > > > > data theorists, we shouldn't be party to that imo ;)
> > > > > > > > Are you suggesting a value
> > > > > > > > is meaningless without a proposition? Why can't a datum just be a
> > > > > > > > value?
>
> > > > > > > > Because a value has to be associated with something. Hofstadter gave
> > > > > > > a good example of this with the groove modulations on a vinyl record.
> > > > > > > To us they are (musical) data, to an alien not knowing their context,
> > > > > > > it is not. You need the context.
>
> > > > > > > > Wouldn't you say a recorded image is data?
>
> > > > > > > Of course, so long as I know it's an image. If its just ones and
> > > > > > > > zeros stored in a computer, without any way of telling they represent
> > > > > > > a picture, then it is simply noise.
>
> > > > > > Let's indeed assume we know how to interpret the 1's and 0's as an
> > > > > > image. So what have we got? Nothing but a *value*.
>
> > > > > No, you now have a value with applied context. That creates a fact.
> > > > > You now therefore have data. It's simple to show - consider "1000001".
> > > > > Thats currently a value, but its not data. Its only data when I store
> > > > > it, and state one of the following:
>
> > > > > > "1000001" is a text string
> > > > > > "1000001" is an integer (i.e. 65)
> > > > > > "1000001" is an ascii character (i.e. A)
> > > > > etc..
>
> > > > These "facts" are all tautologies that are true whether you record
> > > > them or not.
>
> > > I'm not seeing whats so controversial or difficult about the fact that
> > > "10001" is just a meaningless binary value until you give it a
> > > context. It seems somewhat obvious to me.
>
> > When you say "meaningless binary value", are you suggesting a value
> > can exist independently of its type?
>
> I am suggesting that 10001 is a binary number, no more and no less,
> until you apply some meaning to it and hence turn it into data. Same
> with a picture. Until you say something about it, its just a picture
> value.

I'm having a lot of trouble with your terminology. I don't even know what you mean by "binary number".

An integer value like 65 can be encoded in binary. It is the encoding that is binary, not the value being encoded.
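
To make that distinction concrete, here is a minimal Python sketch (the variable names are made up for illustration). It writes the single integer value 65 in three notations and reads each one back; the notation changes, the value does not.

```python
# One value, three encodings. Only the textual notation differs.
value = 65

encodings = {
    "binary": format(value, "b"),
    "decimal": format(value, "d"),
    "hex": format(value, "x"),
}

# Decoding needs to know which notation (base) was used.
bases = {"binary": 2, "decimal": 10, "hex": 16}
decoded = {name: int(text, bases[name]) for name, text in encodings.items()}

for name, text in encodings.items():
    print(f"{name:8} {text:>8} -> {decoded[name]}")
```

Every row decodes to the same integer, which is the sense in which "binary" describes the encoding rather than the value.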

> Consider /unallocated/ RAM in your PC. Look at 5 contiguous bits at
> random. Are you telling me that the binary number you are looking at
> is "data"? I'd accept that it is a value (albeit a meaningless one)
> but "data"? You really think that?

No I don't.

When data is recorded on some medium there is a lot of implicit *knowledge* about how it has been encoded. This knowledge has to account for all sorts of details, such as what designates a 1 versus a 0. How many bits in a word? What order do they appear in? Is there an address bus? How is the address bus organised? The binary encoding is only a tiny part of it. Obviously we both agree that all of that knowledge is needed to decode the data correctly.
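
A small Python sketch of how much hangs on that implicit knowledge (the two-byte sample is an arbitrary choice for illustration): the same recorded bytes yield different values depending on the byte order and type the reader assumes.

```python
# The same two recorded bytes, read under different assumed conventions.
raw = bytes([0x00, 0x41])

as_big_endian = int.from_bytes(raw, byteorder="big")        # most significant byte first
as_little_endian = int.from_bytes(raw, byteorder="little")  # least significant byte first
as_ascii = raw[1:].decode("ascii")                          # second byte read as a character

print(as_big_endian, as_little_endian, as_ascii)
```

Nothing in the bytes themselves says which reading is intended; that has to come from outside the recording.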

Our point of contention is rather this: I suggest that, in the most general sense, data is nothing other than encoded values and doesn't necessarily convey any facts. I'm assuming that the knowledge implicit in the encoding of the data is by definition not part of the data itself, whereas I think you are suggesting that it is.

I would prefer to say that the knowledge of the encoding (the "protocol") exists independently of the recorded data (or even of the particular instance of the medium it is recorded on). For example, I don't want to create a new file format for every file.
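
As a rough illustration of a protocol living apart from the recordings, the Python sketch below defines one hypothetical format (a 4-byte big-endian length followed by UTF-8 text; the format and the function names are assumptions made up for this example) and reuses it unchanged for two different recorded values.

```python
import struct

# Hypothetical "protocol": 4-byte big-endian length prefix, then UTF-8 bytes.
def encode_string(text: str) -> bytes:
    payload = text.encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def decode_string(blob: bytes) -> str:
    (length,) = struct.unpack(">I", blob[:4])
    return blob[4:4 + length].decode("utf-8")

# Two unrelated recordings share the one format definition.
poem = "Is it binary or is it data?"
other = "73"
recordings = [encode_string(poem), encode_string(other)]
decoded = [decode_string(blob) for blob in recordings]

print(decoded)
```

The decoder is written once and knows nothing about any particular recording, which is the sense in which the format is not part of the data.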

> > > > I dispute your premise that the purpose of the data in
> > > > this case is to state a fact that is known a-priori to be true.
>
> > > A Datum is a given fact. That's what the word means formally. I have
> > > said nothing more, and I have no idea what you are on about talking
> > > about "the purpose of data".
>
> > Let me use an example: I give you a disk with some data, tell you a-
> > priori that it records a string, describe the format and you are able
> > to determine that the recorded value is a poem
>
> > "Is it binary or is it data?
> > Is it info or knowledge,
> > or is it wisdom -
> > the whole enchilada?"
>
> Thats a value imo, and its only data if we say "The file myPoem.txt
> contains 'Is it binary or is it...'".

I don't know what that means or what distinction you are making. I'm very suspicious of introducing a proposition that refers to the file by name. Names don't have absolute meaning. What is the context for this proposition?

> I do realise that the
> definitions I am suggesting as formal are at odds to the hand-wavy,
> nebulous way we throw around terms such as 'data', 'data model', etc.
>
> As proof (!) consider your above example if you placed the poem
> written on paper in front of me. Are you telling me that is data?
> Course not, its just a poem written down - a value. So then what is
> the difference between this and your example on a disk? That its
> encoded in binary?

Since this is a computer science discussion group I'm happy to narrow the definition of data to encodings of values that are intended for both reading and writing by a computer.

> > Note that no additional context has been provided. I would say the
> > purpose of the data was to convey a value, but not to convey a fact.
>
> > > > If that is its purpose then it conveys precisely zero information.
>
> > > > > > We can display
> > > > > > it. We can comment on whether we like it - even if we haven't a clue
> > > > > > where it came from. But I don't see any sense in which the image
> > > > > > value gives us any statements of fact beyond the specification of a
> > > > > > value. A value simply "is".
>
> > > > > > I would suggest that a lot of the data in the world is characterised
> > > > > > more closely as "interesting values" than collections of
> > > > > > propositions.
>
> > > > > You cannot store these interesting values without implicitly stating
> > > > > some fact about them.
>
> > > > By definition, when a value is specified, its type is specified as
> > > > well (except possibly if type inheritance is supported).
>
> > > > C. Date states the following in "Introduction to Database Systems",
> > > > section 5.2, and subsection titled "Values and Variables are typed":
>
> > > > "Every value has ... some type...Note that,
> > > > by definition, a given value always has
> > > > exactly one type, which never changes.
> > > > [footnote: except possibly if type
> > > > inheritance is supported]"
>
> > > > When a particular value like the integer 73 is specified, there is no
> > > > implicit fact being specified. The statement that the integer 73
> > > > exists in any absolute sense is entirely metaphysical and meaningless
> > > > within computer science.
>
> > > > So you just write "73" down and are telling me its a datum? I'm
> > > pretty sure that's what we call a "value", not data.
>
> > C.Date distinguishes between a value (that by definition doesn't exist
> > in time and space), versus the *appearance* of a value which appears
> > in time and space and is encoded in a particular way.
>
> Is this what your view of the terms is based upon?

These definitions seem reasonable to me.

I don't agree with everything Date says. For example Date defines "data" in the same way you do!

> > I would suggest that data should by definition be regarded as
> > synonymous with the appearance of a value.
>
> > If you don't agree with that, then let's treat it as a definitional
> > matter. However I'm curious to know what you would say is the
> > distinction.
>
> Yes, sure, this is just definitional. However I am on the side of
> traditional scientific notion of data ("On the third experiment the
> electrical current was x amps"), as well as Codds! Regards, J.

I agree that most scientific data consists of facts. However you seem to have agreed that a poem is a value, and that a poem encoded on a disk can be regarded as data. I realise that skips past the point you want to make, but I thought it had nothing to do with the difference between scientific or poetry application domains.

Received on Wed Feb 13 2008 - 03:06:55 CET
