Re: Looking for how to model 3D objects in 2D relational databases

From: Christian Paro <christian.paro_at_gmail.com>
Date: 14 Oct 2005 10:50:26 -0700
Message-ID: <1129312226.594871.196400_at_g14g2000cwa.googlegroups.com>


Perhaps it would be better to tackle this question from the more specific context of modelling molecular structures *before* trying to wade through the tarpit of theoretical semantics.

Modelling *a* molecule as a relational database is straightforward enough - the molecule is a graph consisting of some set of atoms and a relation enumerating the bonds between each. I'm assuming, for the moment, that you are only trying to store a simple "ball-and-stick" model rather than some fancy quantum representation that deals with all the physical causes and effects responsible for and resulting from each of the bonds within the molecule. All you need in this case is three relations:

  • A relation mapping atomic number to whatever info you're interested in using for each element present. A "periodic table", you could say.
  • A relation listing each atom in an instance of the molecule, identified by a generated ID, and relating each to the appropriate element by atomic number.
  • A relation which associates each atom with each of the other atoms in the molecule, enumerating the molecule's bonds by way of an adjacency list.

A simple database for the friendly water molecule would look like this (my apologies for the ugly notation, I haven't much patience for trying to format things in plain text):


tbl_periodic (atomic_number, name):
{(1, Hydrogen), (8, Oxygen)}

tbl_atoms (id, atomic_number):
{(1, 1), (2, 1), (3, 8)}

tbl_bonds (id1, id2):
{(1, 3), (2, 3), (3, 1), (3, 2)}
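
For anyone who wants to poke at this, here is a rough sketch of the same three relations in SQLite, driven from Python's standard sqlite3 module. The column types and constraints are my own assumptions (the notation above doesn't specify any), and oxygen is entered with its atomic number, 8. The query at the end just lists the elements bonded to atom 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE tbl_periodic (
    atomic_number INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE tbl_atoms (
    id            INTEGER PRIMARY KEY,
    atomic_number INTEGER NOT NULL REFERENCES tbl_periodic (atomic_number)
);
CREATE TABLE tbl_bonds (
    id1 INTEGER NOT NULL REFERENCES tbl_atoms (id),
    id2 INTEGER NOT NULL REFERENCES tbl_atoms (id),
    PRIMARY KEY (id1, id2)
);
""")

# Water: two hydrogens (atoms 1 and 2) bonded to one oxygen (atom 3).
conn.executemany("INSERT INTO tbl_periodic VALUES (?, ?)",
                 [(1, "Hydrogen"), (8, "Oxygen")])
conn.executemany("INSERT INTO tbl_atoms VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 8)])
# Each undirected bond is stored in both directions (adjacency list).
conn.executemany("INSERT INTO tbl_bonds VALUES (?, ?)",
                 [(1, 3), (2, 3), (3, 1), (3, 2)])

# Which elements is atom 3 (the oxygen) bonded to?
neighbours = conn.execute("""
    SELECT p.name
    FROM tbl_bonds b
    JOIN tbl_atoms a    ON a.id = b.id2
    JOIN tbl_periodic p ON p.atomic_number = a.atomic_number
    WHERE b.id1 = 3
    ORDER BY b.id2
""").fetchall()
print(neighbours)
```
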


At least it would if you want your bonds to be undirected. You could also represent the polarity relationship by making them directed - but good luck with non-polar bonds.
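
As an aside, storing every undirected bond twice is only one way to do it. A possible alternative (my own variation, not what the example above does) is to store each bond once with the smaller atom id first, and let a view hand back the symmetric adjacency list. The view name v_bonds_symmetric and the CHECK constraint are hypothetical choices for this sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_bonds (
    id1 INTEGER NOT NULL,
    id2 INTEGER NOT NULL,
    PRIMARY KEY (id1, id2),
    CHECK (id1 < id2)          -- one canonical row per undirected bond
);
-- Hypothetical view that expands each bond into both directions.
CREATE VIEW v_bonds_symmetric AS
    SELECT id1, id2 FROM tbl_bonds
    UNION
    SELECT id2, id1 FROM tbl_bonds;
""")

# Water again: each O-H bond appears exactly once.
conn.executemany("INSERT INTO tbl_bonds VALUES (?, ?)", [(1, 3), (2, 3)])

symmetric = conn.execute(
    "SELECT id1, id2 FROM v_bonds_symmetric ORDER BY id1, id2"
).fetchall()

# The reversed duplicate is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO tbl_bonds VALUES (3, 1)")
    reversed_rejected = False
except sqlite3.IntegrityError:
    reversed_rejected = True
```

The trade-off is that lookups go through the view rather than the base table, but the two directions of a bond can never disagree with each other.
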

Now, it *does* seem awfully silly to use an entire database for each molecule, so you'd probably make some adjustments to this concept for a practical implementation. You could, for instance, add a "molecule" table which identifies molecules and use the molecule id as a foreign key. To illustrate, let's make this change to our example db's structure and add Hydrochloric Acid to the dataset:


tbl_periodic (atomic_number, name):
{(1, Hydrogen), (8, Oxygen), (17, Chlorine)}

tbl_molecules (mid, name):
{(1, Water), (2, Hydrochloric Acid)}

tbl_atoms (mid, aid, atomic_number):
{(1, 1, 1), (1, 2, 1), (1, 3, 8), (2, 1, 1), (2, 2, 17)}

tbl_bonds (mid, aid1, aid2):
{(1, 1, 3), (1, 2, 3), (1, 3, 1), (1, 3, 2), (2, 1, 2), (2, 2, 1)}
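
Here is roughly how the multi-molecule version might look in SQLite via Python's sqlite3 module. The composite keys and foreign keys are my assumptions about how you'd want to enforce the structure; the notation above only names the columns. Oxygen is entered as atomic number 8, and the hydrogen and chlorine of the second molecule get distinct atom ids (1 and 2) so that (mid, aid) can serve as a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE tbl_periodic  (atomic_number INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tbl_molecules (mid INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tbl_atoms (
    mid           INTEGER NOT NULL REFERENCES tbl_molecules (mid),
    aid           INTEGER NOT NULL,
    atomic_number INTEGER NOT NULL REFERENCES tbl_periodic (atomic_number),
    PRIMARY KEY (mid, aid)
);
CREATE TABLE tbl_bonds (
    mid  INTEGER NOT NULL,
    aid1 INTEGER NOT NULL,
    aid2 INTEGER NOT NULL,
    PRIMARY KEY (mid, aid1, aid2),
    -- Both ends of a bond must be atoms of the same molecule.
    FOREIGN KEY (mid, aid1) REFERENCES tbl_atoms (mid, aid),
    FOREIGN KEY (mid, aid2) REFERENCES tbl_atoms (mid, aid)
);
""")

conn.executemany("INSERT INTO tbl_periodic VALUES (?, ?)",
                 [(1, "Hydrogen"), (8, "Oxygen"), (17, "Chlorine")])
conn.executemany("INSERT INTO tbl_molecules VALUES (?, ?)",
                 [(1, "Water"), (2, "Hydrochloric Acid")])
conn.executemany("INSERT INTO tbl_atoms VALUES (?, ?, ?)",
                 [(1, 1, 1), (1, 2, 1), (1, 3, 8), (2, 1, 1), (2, 2, 17)])
conn.executemany("INSERT INTO tbl_bonds VALUES (?, ?, ?)",
                 [(1, 1, 3), (1, 2, 3), (1, 3, 1), (1, 3, 2), (2, 1, 2), (2, 2, 1)])

# Atom count per molecule.
atom_counts = conn.execute("""
    SELECT m.name, COUNT(*)
    FROM tbl_molecules m
    JOIN tbl_atoms a ON a.mid = m.mid
    GROUP BY m.mid
    ORDER BY m.mid
""").fetchall()

# A bond to an atom the molecule doesn't have (molecule 2 has no atom 3) is refused.
try:
    conn.execute("INSERT INTO tbl_bonds VALUES (2, 1, 3)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
```
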


And thus you have a well-normalized relational database that can store multiple molecules and represent a simple ball-and-stick model of their structure. Fancier models would require a bit more work to tie in the extra information, and you would probably benefit from adding a set of rules that can identify and remove descriptions of impossible molecular structures (bonds that violate orbital rules, 'molecules' that don't form a connected graph...), but this should be a good starting point.
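
To give a flavour of what one of those rules might look like, here is a small sketch of the "connected graph" check in plain Python: a breadth-first walk over the bond list of a single molecule. The function name is_connected and the choice to treat an empty atom set as invalid are my own assumptions; orbital rules would need real chemistry and are not attempted here.

```python
from collections import defaultdict, deque


def is_connected(atom_ids, bonds):
    """Return True if every atom can be reached from every other via bonds.

    atom_ids: iterable of atom ids belonging to one molecule.
    bonds:    iterable of (aid1, aid2) pairs; direction is ignored.
    """
    atom_ids = set(atom_ids)
    if not atom_ids:
        return False  # assumption: an empty "molecule" is not valid

    adjacency = defaultdict(set)
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Breadth-first search from an arbitrary starting atom.
    start = next(iter(atom_ids))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour in atom_ids and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    return seen == atom_ids


water_ok = is_connected([1, 2, 3], [(1, 3), (2, 3), (3, 1), (3, 2)])
broken = is_connected([1, 2, 3], [(1, 3), (3, 1)])  # atom 2 floats free
```
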


Now, to my take on the dimensionality issue: It's really just a perfect example of how diction often seems to fall apart at the boundary of two disparate technical disciplines.

When you speak of a molecule as a 3-dimensional structure, you are probably referring to the fact that most molecules are non-planar structures whose physical manifestations exist in a (for all practical purposes, anyhow) 3-dimensional data model I'll refer to as the "real world".

From a computer science and database theory perspective, however, N-dimensional seems to be interpreted more along the lines of "having N independent measurable quantities". I could be off by a bit on this one, being just a lowly undergrad student with a fuzzy memory and sloppy study habits, but it seems to me that the dimensionality of anything in a software context is really determined less by the nature of the thing being modeled than by the cardinality of the set of attributes which are to be accounted for in the digital realm. A cluster of balloons could be an element in a 3-dimensional data structure which tracks the cluster's latitude, longitude, and altitude as it flies over a radar dish - but it could just as easily be an element in a 2-dimensional structure recording the fact that there were 99 of them, and that they were red.

From a user perspective, databases (or at least all of the tables within) are 2-dimensional. This is just a side effect of using tables to display the contents of a database. Throw in some OpenGL and render the same data as a 3-dimensional network graph and the data suddenly becomes 3-dimensional. This hasn't much to do with either the nature of the thing being modeled or the nature of the model underlying the view. It's just a pragmatic observation that the men in the cave may be 3D, but their shadows are not. The mistake comes in confusing the shadow for that by which it is cast.

In the case of our molecular database, we have two 2-D relations (the periodic table and the molecule table) and two 3-D relations (the atom table and the bond table). This isn't because the molecules are 3-D, since the first version of the database did fine with only 2-D relations, but because an extra dimension was required to disambiguate atoms and bonds belonging to disparate molecular structures. Also, depending on what information you extract from it and how you put it together for viewing, the result might be a 1-D string of characters representing the molecule's formula, a 2-D adjacency matrix or Lewis dot-diagram, or a rotatable 3-D visualization of the molecule's physical structure. Colour code the atoms by element and that 3-D visualization is actually representing four dimensions of data - three for position and one for colour/type.
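
To make the "same data, different views" point concrete, here is a small Python sketch that takes one molecule's atoms and bonds and produces both a 1-D formula string and a 2-D adjacency matrix. The SYMBOLS lookup and the function names are my own additions for illustration, and the formula simply sorts element symbols alphabetically, which is an assumption rather than a proper chemical naming rule.

```python
from collections import Counter

# Hypothetical symbol lookup; the tables above only store element names.
SYMBOLS = {1: "H", 8: "O", 17: "Cl"}

# Water as (atom id -> atomic number) plus a bidirectional bond list.
atoms = {1: 1, 2: 1, 3: 8}
bonds = [(1, 3), (2, 3), (3, 1), (3, 2)]


def formula(atoms):
    """1-D view: element symbols with counts, sorted alphabetically."""
    counts = Counter(SYMBOLS[z] for z in atoms.values())
    return "".join(
        symbol + (str(n) if n > 1 else "")
        for symbol, n in sorted(counts.items())
    )


def adjacency_matrix(atoms, bonds):
    """2-D view: matrix[i][j] is 1 when the i-th and j-th atoms are bonded."""
    ids = sorted(atoms)
    index = {aid: i for i, aid in enumerate(ids)}
    matrix = [[0] * len(ids) for _ in ids]
    for a, b in bonds:
        matrix[index[a]][index[b]] = 1
    return matrix


water_formula = formula(atoms)
water_matrix = adjacency_matrix(atoms, bonds)
```

Neither function changes anything about the stored relations; the "dimension" of the result comes entirely from how the data is projected for viewing.
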

In essence, the problem with "dimensionality" doesn't seem to be that it's a difficult concept - but that its meaning is largely dependent upon the context in which it is being applied.

Received on Fri Oct 14 2005 - 19:50:26 CEST
