Re: Storing data and code in a Db with LISP-like interface

From: Neo <>
Date: 16 Apr 2006 15:30:55 -0700
Message-ID: <>

TopMind, you understandably have some misperceptions about the experimental db. The best, and sometimes the only, way to clarify them is to actually represent some things and compare the process and results. Would you be willing to go through this process with me? It requires you to post SQL script to represent some things and perform some queries. It isn't my goal to embarrass anyone, but simply to compare the two methodologies, each of which has strengths and weaknesses that make it appropriate for different scopes. I am not concerned about trivial matters such as naming conventions, punctuation, style, obvious omissions, simple errors, etc. I am able to email you a small zip file (200Kb, 1/7th of a floppy) for verification, if requested. It contains the script to create the db, the db file and the db executable. The program requires no installation. Just double-click to run. Throw in trash when done.

During the past week, I have posted the following examples:
1) People in multiple matrix hierarchies and getRoot function.
2) Judges, bailiffs, clerks, etc in various parts of court building.
      (see thread titled "Data Model")
3) Foods that john likes.

Would you be willing to proceed with the last example, which is the simplest?

Below I have answered your questions, which can lead to you asking even more questions, and so on forever! So I prefer you engage in a simple example which will begin to answer your questions more clearly and definitively.

> I find that trees don't model real organizations very well. Having multiple bosses is fairly common. (Known as "a matrix organization" in some co's)

First, I am not using the hierarchical methodology to represent things. I have never clearly described my data model (nor do I want to at this time). For now, you will only be able to judge by the abilities of a particular implementation of that model (which is difficult to achieve completely because ideally it needs massive memory and a massively multi-processor system, similar to the brain).

Second, there is a difference between
a) representing things using a methodology and
b) displaying some limited view of those things in a GUI control.

For example, while I can display things in text, grids and tree controls, all of these only provide a limited view of the actual data. I would like to display things in a control that can draw nodes and links with labels; however, I currently do not have such a control that is common and preferably from Microsoft. (If someone knows one, please suggest it).

To be clear, I haven't asserted that the hierarchical methodology is ideal for representing organizations with multiple parents. In fact, I assert that while it is theoretically possible to use the hierarchical methodology to model hierarchies that have things with multiple parents, it is mostly impractical (and probably can't be normalized).

In the example I gave, the hierarchy does have things with multiple parents. For example, abraham has two parents, adam and eve. If you would like to specify a more "matrix-like" hierarchy, please describe and I will represent it, but note that I can only display that hierarchy/matrix in a grid or tree control at this time. Also, among other means, by displaying the ID of nodes in the tree control, one can verify that even though a thing appears multiple times in the tree control, the underlying data is not redundant as they all have the same IDs.

Also in the example I gave, some of the things are in multiple hierarchies. For example, not only are adam and eve in the parent/child hierarchy, but also in the boss/employee hierarchy.

Also in the example I gave, the hierarchy going "up" is independent of the hierarchy going "down". While not appropriate for the parent/child hierarchy, it is possible for thing A to be the "parent" of thing B while thing B is not a "child" of thing A. While not commonly desired, this level of flexibility requires extra complexity in RM. This is partially what allows the getRoot function to work in either direction.

> If you want flexibility, then have a many-to-many Boss (or ReportsTo) table.

This is true IF one is using RM. To realize how easy/systematic it is to represent a complex matrix with the experimental db, one needs to actually model the same matrix with an RMDB. Please try to model that which I already have, using RM (or as a 2nd option, please specify a new matrix of similar complexity).
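For readers following along, here is a rough sketch of what the many-to-many ReportsTo table suggested above could look like on the relational side. It is only an illustration using Python's built-in sqlite3 module; the table names, column names and sample rows (abraham reporting to both adam and eve) are assumptions made for this sketch, not anything posted by either party.

```python
import sqlite3

def build_org(conn):
    """Create person and reports_to tables and load hypothetical sample rows."""
    conn.executescript("""
        CREATE TABLE person (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- One row per (employee, boss) pair, so an employee may have many bosses.
        CREATE TABLE reports_to (
            employee_id INTEGER NOT NULL REFERENCES person(id),
            boss_id     INTEGER NOT NULL REFERENCES person(id),
            PRIMARY KEY (employee_id, boss_id)
        );
    """)
    conn.executemany("INSERT INTO person (name) VALUES (?)",
                     [("adam",), ("eve",), ("abraham",)])
    # Hypothetical data: abraham has two bosses.
    conn.executemany(
        """INSERT INTO reports_to (employee_id, boss_id)
           SELECT e.id, b.id FROM person e, person b
           WHERE e.name = ? AND b.name = ?""",
        [("abraham", "adam"), ("abraham", "eve")])

def bosses_of(conn, name):
    """Return the names of everyone the given person reports to, sorted."""
    rows = conn.execute(
        """SELECT b.name FROM reports_to r
           JOIN person e ON e.id = r.employee_id
           JOIN person b ON b.id = r.boss_id
           WHERE e.name = ? ORDER BY b.name""", (name,))
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
build_org(conn)
print(bosses_of(conn, "abraham"))
```

Note that this covers only one hierarchy (boss/employee); each further kind of relationship, alternate name, or unnamed person would need additional tables or columns in this style.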

> > and even works after adding a person without a name in the hierarchy; and still works when we allow the name employe to have a second spelling 'employee'; and still works when boss is given an alternate name employer; ALL AT RUN-TIME, without the user ever having to specify a schema, IDs, referential intergrity constraints or normalizing data, yet the db is fully normalized and NULL-less!!!
> If the user can change the schema willy-nilly, ...

First I need to know the exact definition of schema. Unfortunately, the one you are probably thinking of is in the context of RM. So it becomes something like a car mechanic asking a jet mechanic where the pistons are. It seems to me that when generalized to correspond with the experimental data model, a schema is actually a set of user-defined constraints that limit what data can be entered (partially for the benefit of the poor programmer who has to implement the data model). Rather than saying whether there is a schema (piston) or not and whether it changes, it is better for me to describe the functionality (of the jet engine). And that is, the db is capable of storing, recalling, querying and manipulating representations of things without the user specifying anything equivalent to a schema in RM.

> ... then there is no way on earth technology can enforce normalizaton (short of AI)

And yet the experimental db does exactly this! I wouldn't exactly call it AI but it lays the ground for AI-ishness later. However, I am trying to no longer use the term normalization because its definition is specific/limited to RM. In the experimental db, each thing is represented only once. There are no redundant representations of things in the experimental db. Among other methods, this can be verified by seeing that the ID of a thing is the same no matter how many times it appears in a view.
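The idea that each thing is represented once, with the same ID wherever it appears, can be illustrated with a toy sketch. This is not the experimental db's actual implementation, which has not been described; `ThingStore` and its methods are hypothetical names invented here, and the sketch makes the simplifying assumption that a name identifies exactly one thing.

```python
import itertools

class ThingStore:
    """Toy store: every thing gets one ID; names are just labels pointing at IDs."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._id_by_name = {}    # name -> thing ID (several names may share an ID)
        self.relations = set()   # (subject ID, verb ID, object ID)

    def thing(self, name):
        """Return the ID for name, creating a new thing on first use."""
        if name not in self._id_by_name:
            self._id_by_name[name] = next(self._next_id)
        return self._id_by_name[name]

    def add_name(self, existing, alias):
        """Give an existing thing an additional name without duplicating it."""
        self._id_by_name[alias] = self.thing(existing)

    def relate(self, subject, verb, obj):
        """Record a relation between things, stored purely as IDs."""
        self.relations.add(
            (self.thing(subject), self.thing(verb), self.thing(obj)))

store = ThingStore()
store.relate("adam", "boss", "abraham")
store.add_name("boss", "employer")            # alternate name, same thing
store.relate("adam", "employer", "abraham")   # same fact under the other name
print(store.thing("boss") == store.thing("employer"), len(store.relations))
```

Because relations are stored as IDs rather than names, stating the same fact under an alternate name does not create a second copy of it.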

> because it cannot know if two things with a different name are really the same thing in practice. It would have to have common sense.

??? I think you mis-stated what you intended. If you have two things, then they are different, regardless of the names of the two things.

If you meant, how can the experimental db know if two things with the same name are really different, it is not because it has common sense, but because enough data was given to it and the query is selective enough.

If you meant, how can the experimental db know if two names really identify the same thing, it is not because it has common sense, but because enough data was given to it and the query is selective enough.

Your puzzlement over how the db does this should cue you to the possibility that more is going on than initially appears. The best way to find out is to engage in an example and actually submit those queries to the experimental db.

> Further, a Lisp DB cannot be shared with other languages. One of the advantages of a RDBMS is that different languages and tools can all use the *same* database.

The experimental db is NOT a LISP DB! The data resides in a file. To interface to that file, one must use the provided DLL which implements the db engine (50KB). The DLL provides a low-level and a high-level interface to the data. The low-level interface consists of functions which provide more power/flexibility than the high-level interface. The high-level interface is LISP-like. To utilize the LISP-like interface, one calls a function with a pre-formed string such as "(create john like mary)" or "(select john like *)". Also, an executable is provided which has the DLL embedded in it. The executable provides tree, grid and text based interfaces: one can submit LISP-like expressions via the edit box, view/add/delete/modify data via the tree control, and view and add certain data via the grid control. The DLL is best interfaced to languages like C/C++ and Delphi but can also be called by Visual Basic.
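To give a feel for what passing such pre-formed strings to a function might look like, here is a toy evaluator. It is only a sketch under loud assumptions: the real DLL's function names, signatures and semantics are not given in this post, so `run` is a hypothetical stand-in, the "db" is a plain set of (subject, verb, object) tuples, and `*` is assumed to act as a wildcard.

```python
def run(db, expr):
    """Evaluate '(create s v o)' or '(select s v o)' against db, a set of triples."""
    tokens = expr.strip().strip("()").split()
    if len(tokens) != 4:
        raise ValueError(f"expected a command and three terms, got: {expr!r}")
    command, subject, verb, obj = tokens
    if command == "create":
        db.add((subject, verb, obj))
        return [(subject, verb, obj)]
    if command == "select":
        pattern = (subject, verb, obj)
        # A '*' in the pattern matches anything in that position.
        return sorted(
            triple for triple in db
            if all(p == "*" or p == value for p, value in zip(pattern, triple)))
    raise ValueError(f"unknown command: {command!r}")

db = set()
run(db, "(create john like mary)")
run(db, "(create john like pizza)")
print(run(db, "(select john like *)"))
```

The toy handles only flat four-token expressions; it says nothing about nesting, alternate names, or how the actual engine resolves which "john" is meant.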

> Unless you want to turn Lisp into a query language (ick!), that won't happen.

Actually, it turns out that LISP-like syntax is appropriate for the experimental db. If you were to actually create the equivalent queries in SQL and compare them with some of the queries for the experimental db, you would begin to realize why that is. You will not realize how complex a simple-looking query like "(select john like *)" really is until you attempt to implement the equivalent things/flexibility in an RMDB.

> Nulls are a vendor-specific idea, not inherent to relational.

I would disagree. NULLs have almost nothing to do with vendors.

In the case one wants to add a tuple without having a value for every attribute defined by the relation's schema, he can:
1) choose not to insert that tuple (not always acceptable), thus avoiding NULLs.
2) choose to insert the tuple anyway and incur NULLs for the missing values, which can be filled with masking values; either of which results in three-valued logic.
3) redesign the db schema (add more relations and links) to enter the data without NULLs. This can impact existing scripts, queries, code and GUI.

The experimental db can never incur NULLs in the above manner as it has no constraints equivalent to a schema (but an application can represent the constraints in the db and enforce them via code, if desired).
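Options 2 and 3 above can be made concrete with a small sketch using Python's built-in sqlite3 module. The tables, columns and rows are hypothetical, chosen only to show how a NULL pulls in three-valued logic and how moving the optional attribute into its own relation avoids it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 2: a single relation, with NULL where a value is missing.
    CREATE TABLE person (name TEXT PRIMARY KEY, age INTEGER);
    INSERT INTO person VALUES ('john', 30), ('mary', NULL);

    -- Option 3: move the optional attribute to its own relation, so a
    -- missing value is simply a missing row.
    CREATE TABLE person2 (name TEXT PRIMARY KEY);
    CREATE TABLE person_age (
        name TEXT PRIMARY KEY REFERENCES person2(name),
        age  INTEGER NOT NULL
    );
    INSERT INTO person2 VALUES ('john'), ('mary');
    INSERT INTO person_age VALUES ('john', 30);
""")

def count(sql):
    """Run a single-value COUNT query and return the number."""
    return conn.execute(sql).fetchone()[0]

total = count("SELECT COUNT(*) FROM person")
is_30 = count("SELECT COUNT(*) FROM person WHERE age = 30")
not_30 = count("SELECT COUNT(*) FROM person WHERE NOT (age = 30)")
# With a NULL age, mary is counted by neither predicate, so
# is_30 + not_30 is less than total: the comparison is neither
# true nor false for her row.
print(total, is_30, not_30)
```

In the option 3 layout no NULL is stored, but every query that wants ages now needs a join, which is the kind of impact on existing scripts and queries mentioned above.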

> This battle was already faught between Charles Bachman and Dr. Codd in the 70's when navigational fought with relational. Bachman wanted ad-hoc "pointers" to build databases, while Codd felt that tables added more discipline to databases.

All I can say is that I want you to evaluate based on actual results. When using the high-level interface, the user doesn't deal with IDs or pointers. You won't see either in the scripts.

> As far as dynamic schemas, you have yet to back your claim that they are inharently inefficient.

I think it may be better for people who think dynamic schemas are practical in RM to implement them than it is for me to explain further why they are impractical.

> It is a trade-off to weigh against patterns of change.

My primary concern isn't about memory requirements, processing time, or aesthetics. It is to implement the most general/flexible method of representing things.

> The idea of a "multiple paradigm database" can basically have one big dynamic table

The experimental db isn't a multiple-paradigm database. It has a single methodology that is so general that it is capable of doing what other paradigms can, albeit not as efficiently within the scope of a particular paradigm. Those paradigms, in turn, become more and more impractical the further they get outside their scope.

> but it is essentially a navigational database in relational clothing: A navigational database is like a shanty town: no guidence and no rules. Yes, it is flexible because if you need something, you simply hammer it on. However, it is a mess to navigate and manage.

You are now making assertions that are false. If you model the data with similar flexibilities and post the SQL queries that are equivalent to the experimental db's, this will disprove your assertion.

> No database paradigm will make changes in quantities-of-relationships

I am not exactly sure what you mean, but maybe you can show this when engaging in an actual example.

> For example, if a relationship changes from one-to-many into many-to-many, this will be a mess regardless of whether you use Bachman's techique or relational.

It isn't in my methodology, where the db is solely responsible for representing things, not for implementing user-defined constraints.

TopMind, it is easier to clear up some of the misconceptions if you engage in an actual example.

Received on Mon Apr 17 2006 - 00:30:55 CEST
