Re: Database Builders, Code Generators, On-Topic?
Date: Mon, 26 May 2003 19:11:07 -0400
Message-ID: <58s0bb.ge4.ln_at_mercury.downsfam.net>
Quoting unnamed sources, Marshall Spight claimed:
Marshall: This may be a duplicate post, I had the date screwed up on my box
and cannot see the first try.
> "Kenneth Downs" <ken_remove_underscores_downs_at_downsfam.net> wrote in
> message news:0gvrab.dhl.ln_at_mercury.downsfam.net...
>> Upgrades are actually the first reason I developed early versions of this
>> system, when I was working independently 7-8 years ago. The flow is:
>>
>> 1. Our catalog equals "what I want"
>> 2. The system catalog reveals "what is"
>> 3. Generate a bunch of DDL to turn "what is" into "What I want"
>> 4. Optionally Execute. Or stop and report the intended plan for
>> review, approval, etc.
>>
>> This has the nifty added advantage that a first time installation is the
>> same as an upgrade. If "what is" is empty, it builds the entire
>> database.
>
> This is extremely cool.
Thanks, we like it :)
>
>
>> Taken with the statement given below, "All operations that can be >> performed against a database can themselves be depicted within a database >> as scalar values", it is not so much a system catalog that we have as a >> business catalog.
>
> Well, this is certainly true, because a string is a scalar, and all
> operations that can be performed against a database can be depicted as a
> string containing SQL. But...
>
> I have been pondering lately this issue of representing database
> operations as relational data, and I'm beginning to believe that it's
> seductive but not all that useful.
>
> Consider: if I want to model simple inserts into a table, then that
> beautifully manifests itself as a set of tuples that match the table
> heading. No problem. Now what if I want to model deletes? Uh, I suppose I
> could model that as a set of tuples to remove, but that's not that
> efficient. All I need is one key attribute to uniquely specify the tuples.
> So which key to use? Uh, I guess the "primary" one, eh? :-) Now what if I
> want to model updates? They come in so many flavors.
>
> update table set a = 7 where primary_key =8
>
> Now I need two columns: more than for delete, less than for insert.
>
> update table set a = 7
>
> How do I represent that?
>
> update table set a = 7 where primary_key in (select b from different_table
> where c = 8)
>
> What about that one?
>
> Eventually it becomes clear that representations of updates need the full
> value of the relational calculus. (Which, stated that way, I guess should
> have been obvious.)
>
>
> Marshall
The idea is that you do not want a schema to capture the syntax of SQL itself. Rather, you first need a very detailed system catalog (as Mr. Badour immediately surmised), and then the operations you perform tend to fall into a few categories, such as imports, exports, warehouse generation, and data procesing.
Before I forget, the actual answer on your code questions is twofold. First, given the paragraph above, you are not trying to replace application SQL with application data, you are trying to recast the application as a series of tasks, rules, maps, etc, so it turns out we don't often need that type of SQL. Second, we find we only need that code in things like user-defined reports, and it so happens my predecessor coded something up a few years ago that is Good Enough, so we are not actively developing that.
Here is a single example of the type of code we generate when we have a strong system catalog. Any mature app in the field is going to have an import or three, or six hundred, so how do you handle that? Do you write a lot of very specific routines in a host language and/or SQL? Much better to create an IMPORT handler that expects parameters that are Scalar Complete. We ask, what is the bare minimum number of parameters I need to give an IMPORT engine so that it will import a table? Turns out that you need only two, the source and target tables, if you are willing to make some assumptions. So you start with the table of import definitions:
IMPORT_NAME VERSION SOURCE TARGET
=========== ======= ======= ========
imp_items 1.0.0 j T1_inbox T1_live
Make these assumptions (or drive them with more options)
-> Like column names are copied straight over -> Where appropriate, similar columns can be padded out -> A map of columns can be provided when you need to specify
things that are not safe for defaults, such as trimming values and so forth.
Armed with these assumptions (or again, more parameters to control them), we are tempted to jump to generating this code:
INSERT INTO T1_live (...generated field list...) SELECT (...generated field list...) FROM t1_inbox
but this is premature. In this case we are relying on the constraints we have created with DDL to control the operation, but then we are deep in implementation-specific territory, and making the operation behave the same way across different servers is a bummer. Turns out we can do a lot better if we query the catalog for a list of foreign keys in the target table, and then generate this statement for each of them:
INSERT INTO Errors_list
SELECT T1_inbox_Key,"Derived_error_Literal"
FROM t1_inbox
where reference_column not in (Select key_column from ref_table)
You can then code in parameters that do the opposite, such as what we call a
"force true" by populating the reference table:
INSERT INTO Reference_table
SELECT Reference_column FROM t1_inbox
where reference_column NOT IN (Select key_column from ref_table)
..and then you want to be able to split the feed into a parent-child pair of tables if necessary, or the opposite. When you are done, you've got a generalized INSERT.
Here is a grenade I'll throw out there. We are coding generators w/o any OO in our code, because it is irrelevant. The variations are expressed in the data, not in the code. The code that generates the SQL is a flat library of subroutines that are called when necessary. All of the customization is in the customer's data, we never produce a child class modified for a customer or for ourself.
-- Kenneth DownsReceived on Tue May 27 2003 - 01:11:07 CEST
