Re: How to model searchable properties of an entity

From: Laconic2 <laconic2_at_comcast.net>
Date: Thu, 19 Aug 2004 09:19:26 -0400
Message-ID: <yL-dnWnPUNPANbncRVn-hA_at_comcast.com>


"Tony" <andrewst_at_onetel.net.uk> wrote in message news:c0e3f26e.0408190338.288380cc_at_posting.google.com...

> Me too, and I'm lucky enough to be working in an environment where
> that is still practiced reasonably well. But I have a friend who
> works as a DBA for a large oil company that uses Smalltalk and Oracle,
> and she gets her requirements from the Smalltalk team: "we've designed
> this new set of screens for <whatever>, and we need you to create some
> tables to save the data into". Most tables (created before she
> joined) have no constraints of any kind, and the majority of her time
> seems to be spent writing scripts to find and fix corrupt data. But I
> digress...

I don't think that's a digression at all. When the Smalltalk team says, "We've designed this new set of screens", that as close as that culture is going to get to a "conceptual data model". It's not that the Smalltalk people have small minds. They think in concepts, but they are not used to projecting their concepts onto the flat screen of "data". So, instead, they project their concepts onto the flat screen of "set of screens".

BTW, I've worked in environments where the shoe was on the other foot: "We've designed this set of tables to hold the data we need. Now go off and build a set of screens to capture it." It turns out that this is a nightmare for the screen designer, because a set of tables isn't really a conceptual data model, either.

The better way is to start from the requirements and derive a conceptual data model, and a companion conceptual process model. Then, everything else gets derived and/or designed from there. I'm convinced that OOA or UML can express both the conceptual process model and the conceptual data model in a single model, but I can't verify that from my own experience.

> Yup. And I must admit, maybe I have just been lucky (again) never to
> have come up against such a requirement. I do remember once in my
> early days in relational databases seeing a database design for a
> local government inventory system. It had separate tables called
> things like street_lamp, post_box, telegraph_pole, etc. etc. I
> remember saying to a colleague "that's crazy, defining a separate
> table for each item type". Now I seem to have been advocating that
> approach here!

Somewhere in this long discussion, someone asked "why have a load of tables for what is basically the same abstraction?"

That question deserves a more thoughtful answer than I have been able to give. Before you can deal with that question, you have to be able to answer the question "when are two abstractions the same abstraction?" This sounds like an easy question, but it's not.

With regard to "parts" , there is a level of abstraction at which they are all the same. If I want to order a wheel and a tie rod, I probably have to know exactly the same data about each to fill out the order form. I think you've outlined that data below.

But if I'm diagnosing to determine whether a new wheel or a new tie rod (or both) will fix the problem, I probably need to treat wheels and tie rods as different types of entities, because they play different roles in the diagnosis model.

Diagnosis and order taking are both "valid" ways of looking at the data. Depending on the requirements, the system may have to support both.

>Actually, I think my inclination now would probably
> still be more pragmatic: for the most part, all those inventory items
> can be considered similar: their most important attributes would be
> stuff like location (grid ref), cost, date installed, date last
> inspected, make, model, etc. common to all. That data should probably
> reside in a single table. With a "notes" attribute, there really
> might not be a need to hold further item-specific information for 80%
> of the item types. Of course, I am probably over-simplifying
> drastically here.

The problem I have with "notes" is that "notes" are not data. Invariably, someone down the road asks, "why can't you just search the notes field?" If you make the mistake of trying to explain the difference between parsing natural language and scanning structured data, you are apt to be dismissed as a geek that doesn't understand people very well. Received on Thu Aug 19 2004 - 15:19:26 CEST

Original text of this message