Re: Sourcing Metadata for Database Independence

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Mon, 09 Aug 2004 13:40:55 +0200
Message-ID: <411762ca$0$65124$e4fe514c_at_news.xs4all.nl>


Dawn M. Wolthuis wrote:

> mAsterdam wrote:

>>Dawn M. Wolthuis wrote:
>>
>>> For database independent applications, such as those written
>>> by many application software development companies,
>>> the most basic of metadata, such as the names of
>>> attributes, must be sourced so it is useful with any
>>> target database implementation.
>>
>>What do you mean by "database independent"? ...

>
> Yes, independent of any particular implementation or
> even type of database (e.g. SQL-DBMS)
>
>>> There are many different strategies for where to source such
>>> metadata (as well as how extensive this metadata should be
>>>since all code is, itself, metadata).
>>
>>Code contains and uses metadata, but it *is* not metadata, imho.

>
> I knew others would disagree

This (dis)agreement runs deeper.
In order to discuss how code and metadata relate, we should know about the relation between code and data, no?

A few examples:

In Prolog, data and code are the same thing. So, the problem is non-existent.

In state-transition modeling, data is about state and code is about the valid transitions. The valid transitions could easily reside in any database as data, so yep, they are metadata (but of course not *all* metadata is code here). Any language would be capable of implementing the state-transition network (STN), given this metadata. It has meaning in different contexts.
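
To make that concrete, here is a minimal sketch in Python (any language would do). The order workflow, its state names and its event names are made up for illustration. The point is only that the valid transitions are plain tuples, the same shape they would have as rows in a database table, and that the code interpreting them is generic.

```python
# Hypothetical order workflow; states and events are illustrative only.
# Each row is (current_state, event, next_state), i.e. the shape it
# would have as a tuple in a database table.
TRANSITIONS = [
    ("new", "pay", "paid"),
    ("paid", "ship", "shipped"),
    ("new", "cancel", "cancelled"),
    ("paid", "cancel", "cancelled"),
]


def build_index(rows):
    """Index the transition rows by (state, event) for quick lookup."""
    return {(state, event): nxt for state, event, nxt in rows}


def step(index, state, event):
    """Return the next state, or raise if the transition is not listed."""
    try:
        return index[(state, event)]
    except KeyError:
        raise ValueError(
            f"invalid transition: {event!r} from state {state!r}"
        ) from None


def run(index, state, events):
    """Apply a sequence of events, starting from the given state."""
    for event in events:
        state = step(index, state, event)
    return state
```

With this table, run(build_index(TRANSITIONS), "new", ["pay", "ship"]) ends in "shipped", while a "pay" event in state "shipped" is rejected. Nothing in step or run knows about orders; swap the rows and the same code drives a different STN.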

But not all code from all programming languages is easily tuplified in a human-understandable form (of course it is possible to tuplify, or the interpreter/compiler could not do its job).

> as I think I've said it before, but if metadata were to have
> an IF statement in it, would that make it no longer data?

"IF statement" - hm... could you give an example?

> Are database constraints metadata?
>

>>Metadata: data about data. Code uses that to achieve manipulation
>>and representation of data.

>
> it depends on how you define metadata and data.

What's wrong with "metadata: data about data"?

> Code is a type of data too and it is data that includes
> information about how to handle data -- i.e.,
> it's data about data.

In Prolog: yes. Using STN (in any language): almost. In non-declarative languages: no, or at least not readily.

>>>It could be sourced in code, thereby fixing 
>>>on a particular language while remaining
>>>database-independent.
>>
>>Could you please explain what you mean by that?

>
> Java classes, for example, could include all of the metadata for a model.
> Reflection could then be used to extract that information and use it,
> including pushing it to a database schema so that the metadata is actually
> sourced in the code.

Sure. Any language (language? even DOS .BAT or mainframe JCL) could do that.
Metadata is transportable via code, because it is data. However, and this is important imho, reasoning the other way around does not work here: not ALL code is metadata. Some Java code has meaning (and use) only in a Java context.
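
A rough sketch of sourcing the attribute metadata in code and pushing it to a schema. It is written in Python rather than Java, so dataclasses.fields stands in for Java reflection here. The Customer class, the SQL_TYPES mapping and the generated DDL are assumptions for illustration, not anybody's actual design.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """Hypothetical model class; its declarations are the metadata source."""
    id: int
    name: str
    credit_limit: float


# Assumed mapping from language types to generic SQL type names.
SQL_TYPES = {int: "INTEGER", str: "VARCHAR(255)", float: "DOUBLE PRECISION"}


def attribute_metadata(cls):
    """Extract (attribute name, declared type) pairs by introspection."""
    return [(f.name, f.type) for f in fields(cls)]


def create_table_sql(cls):
    """Derive a CREATE TABLE statement from the class declaration."""
    columns = ", ".join(
        f"{name} {SQL_TYPES[typ]}" for name, typ in attribute_metadata(cls)
    )
    return f"CREATE TABLE {cls.__name__.lower()} ({columns})"
```

The names and types travel fine this way. The methods of such a class would not: that part of the code means something only to the language it is written in.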

Iow: It buys database (vendor/type) independence at the expense of two things: language dependency and (possibly) additional complexity.
Language dependency: that's obvious, no? Additional complexity: you'd have that if the language (such as Java) was not made to hold data in an understandable and shareable form, but to manipulate and represent data. XML, otoh, *is* designed to transport documents containing data in a form that both humans and computers can easily render and understand.
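
For comparison, a sketch of the same kind of attribute metadata held in an XML document and read by code (Python standard library). The element and attribute names (entity, attribute, name, type) are invented for this example and follow no standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata document; the vocabulary is made up for this sketch.
METADATA_XML = """
<entity name="customer">
  <attribute name="id" type="integer"/>
  <attribute name="name" type="string"/>
  <attribute name="credit_limit" type="decimal"/>
</entity>
"""


def read_attributes(xml_text):
    """Return the entity name and its (attribute name, type) pairs."""
    root = ET.fromstring(xml_text)
    attributes = [
        (a.get("name"), a.get("type")) for a in root.findall("attribute")
    ]
    return root.get("name"), attributes
```

Any language with an XML parser can read this document, and so can a person, which is the difference with metadata that lives only in class declarations.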

>>>It could be sourced in the metadata repository of 
>>>a development database from which the product is
>>>built for any environment.  It could be sourced in
>>>XML documents (or any type of parm file) that serve 
>>>as input for the code and for the database processes.

>
>>>It could be sourced as data in a database 
>>>(rather than simply as metadata).
>>
>>>This could be an embedded database in a metadata service.
>>
>>I have seen this reasonably in place several times.
>>Both in tagged textfiles (to be specific: DCF/GML, not
>>unlike SGML/XML - but I don't think it really matters),
>>and in some database product.
>>
>>>If you figure that the specification of a data type for
>>>a given attribute is a business rule, of sorts, you could 
>>>have a business rules repository that is the source of all 
>>>metadata.  You are then tied to a particular rules engine 
>>>(which might then tie you to a language or database too) 

What happened to the Platinum repository after the CA takeover? http://www.ca.com/acq/platinum/

>>>even if it is written in-house.

Yep.

>>>I know there are some not-very-widely-used standards for 
>>>metadata repositories -- are there industry standards for rules
>>>specifications other than SQL?  There are also third party metadata
>>>repositories.

Would OCL (Object Constraint Language), used to formalise the UML, be of interest to you?

>>>I'm thinking more about the future than about what are
>>>the currently most accepted practices.  
>>>If you are not tied to a specific language or database up front 
>>>and are developing a new software application to be deployed at
>>>many customer sites on many different databases, 
>>>where/how would you source the metadata?  
>>>Thanks in advance for your thoughts on this..
>>
>>I don't think the where/how (i.e.: which type of metadata
>>database) is particularly important - except of course for
>>a solution provider. Otoh a clear investment choice from time to
>>time does wake up the decision makers to the cost, and, more
>>importantly the benefits of datamanagement. The deployment
>>of a specialised tool may give the leverage to get the
>>necessary procedures accepted :-)

>
> Because the metadata is important for reporting against the data, among
> other things, it lends itself to standardization, such as has been done with
> SQL-DBMS's. This has prompted every other type of database and file system
> to try to speak SQL in order to be included.

Yep. There is even some metadata standardisation going on in SQL. I don't know about the status of that, though, only that certain dictionary views would be required. Any knowledgeable input on this is welcome.

> While OLTP systems might make use of some non-database-specific

??? How ???

> repository of metadata, reporting tools tend to use the
> database-specific schema. Today all reporting tools that claim
> to be independent of any specific database are dependent
> on SQL (to my knowledge). Tomorrow that might be XQuery or
> something that lends itself to a broader set of database
> implementations.
>
> But I think that having query tools necessarily interface with each
> database-specific implementation of metadata might not be the future.
> I'm not sure that made sense, but my point is that
> I'm thinking about reporting tool alternative approaches.
Received on Mon Aug 09 2004 - 13:40:55 CEST
