Re: Declarative constraints in practical terms

From: dawn <dawnwolthuis_at_gmail.com>
Date: 23 Feb 2006 14:18:01 -0800
Message-ID: <1140733081.318114.74100_at_e56g2000cwe.googlegroups.com>


Brian Selzer wrote:
> I think you're missing the point. A database is a knowledge repository, not
> an application.

It is a portion of one or more software applications, right?

> It is the foundation upon which applications are built.

I don't see it that way. I see it as a software component. One could also tip things another direction and say "The UI is the foundation of any software application" or "The processing of data is foundational to any software application." I see these all as components to the greater whole. The design and architecture of software are foundational. I do think that how solid a data model for any aspect of software is has a significant impact on how good the whole is.

> If
> the foundation is weak, then the building will crumble.

agreed

> The purpose of the
> Relational Model isn't to build applications,

clearly

> but rather to provide a sound
> footing. The Principles of Normalization and Orthogonal Design strengthen
> the foundation by eliminating redundancy.

eliminating it in some places and introducing it in others

> Integrity constraints seal the
> foundation so that inconsistencies cannot leach their way into the
> foundation, causing it to crack.
>
> Comments inline.
>
> "dawn" <dawnwolthuis_at_gmail.com> wrote in message
> news:1140555731.823399.160020_at_g44g2000cwa.googlegroups.com...
> > This is related to a recent exchange related to declarative
> > constraints. I know I've worked on it before, but I am still perplexed
> > on this one. Here are two common options for handling constraints
> > (business rules):
> >
> > a) declarative constraints: SQL declarations enforced by a DBMS engine
> > b) metadata + code: metadata specifications plus custom-build
> > validation/constraint functions or services written in a general
> > purpose language
> >
> > Some differences I can see are
> >
> > 1) Declarative constraints are coded in the SQL sublanguage while with
> > metadata + code, the metadata declarations are simpler, being in the
> > form of name=value pairs (e.g. maxValue=100, valTable=MyTable) but
> > proprietary code is written.
> >
> > 2) The functions combining the data and constraints in the declarative
> > case are in well-tested (over time, at least) engines written by a
> > database vendor, otherwise often less generic, more specific, functions
> > written by whomever whatever team is writing validation routines for
> > software services. .
> >
> > 3) The functions for validation of data can be used anywhere within an
> > application if packaged outside of the dbms, otherwise most of them
> > need to be coded at least twice -- once for the dbms and once for use
> > in a UI or web service.
>
> But now it's up to the application(s) to maintain integrity, which is a
> recipe for disaster. It is the database's responsibility to maintain
> integrity. If the database contains redundant information, then it's up to
> every application to make sure that every instance is updated at the same
> time so that queries can be relied upon. If constraints aren't declared on
> the objects in the database, then it's up to every application to ensure
> that garbage doesn't make it into the database. If you delegate the
> database's responsibility to the applications, then all it takes is one
> poorly written or malicious application to introduce garbage into the
> database--making the results of all applications suspect.
>
> >
> > 4) The declarative constraints use the RM, so they work with the
> > restrictions of the RM, including 1NF. Because they are written in
> > SQL, they use a 3VL whereas using metadata + code, 2VL and non-1NF are
> > the norm.
>
> NULLs and 3VL are optional. I prefer to avoid them. For every relational
> database that uses NULLs, there exists an equivalent relational database
> that does not.
>
> I would argue that 1NF is a good thing. Sets are already covered, because
> relations are sets. Bags and lists can also be easily mapped onto
> relations. Once mapped and indexed, queries are easier and faster.

How do you know that once mapped and indexed queries are easier and faster? There are no benchmarks of which I am aware that are data model independent for database performance. As for easier, I find it easier to write

select name, gpa from students where every major <> "math"

(converted from a non-1NF query language into pseudo-SQL) than the equivalent sql-92 statement

> >
> > 5) In the case of declarative constraints, they are necessarily
> > employed by any application writing to the database. In the case of
> > metadata + code, each organization must determine whether and how to
> > technically enforce business rules for all applications or enforce them
> > through standards and QA approaches.
> >
> > 6) If there are local constraints (constraints for one application and
> > not another) then these are either declared in SQL as local constraints
> > (something I've read about, but never used) or, often, coded only in
> > the individual applications using metadata + code. So, local
> > constraints are often handled the same but in one case local and global
> > constraints are coded in the same language, typically by the same team.
>
> How can there be constraints for one application and not another? Either
> the information in the database is consistent or it is not.

I'll have to think about this more, but I think a local constraint could be seen as a view (an updatable view). For example, for a group called the Women's Coffee Break group, the view might be restricted to women, where the database includes men as well but they are not prospects. I suppose this could be captured in a global constraint where a new data element is added to indicate whether a person is in this group or not and then we could write the constraint indicating that only gender=F can have WCB="Y" or some such but then we still want to indicate who the prospects are for our application and only show those as options for mailings or for adding to the club or whatever. hmmm. If I'm rambling and making no sense, then just let me know.

> If you don't
> enforce all constraints all of the time, then garbage can get into the
> database! Once garbage gets in, every query is suspect.

Of course.

> > I lean toward not duplicating constraints, coding and maintaining them
> > in multiple places and languages, but I understand that someone else
> > might choose the other strategy. Whatever choice, it doesn't look
> > obvious to me that declarative constraints are better as I gather it
> > appears to many others.
> >
> I think that it is more important to keep garbage out of the database.

That is definitely a goal.

> Again, if the foundation is weak, then the building will fall.

If the foundations are weak. I don't buy "the database is the foundation" type of statements. It is certainly an important aspect of software, however, so I am not trying to minimize it, but to indicate that other aspects of the software are also important and even foundational.

> An
> additional layer of protection is a good thing. Rollbacks are expensive, so
> it's better to scrub the data before it's introduced to the database.

Agreed.

> > This relates to the fact that the RM is not sufficient for writing
> > software (as mentioned in my current blog entry that I'll again boldly
> > advertise as being at http://www.tincat-group.com/mewsings ) and coding
> > constraints using the RM doesn't seem like it can get you all the way
> > there. So if you go that route, you end up duplicating your work, both
> > up front and for all maintenance.
>
> I read your blog. I disagree. It's a simple matter to persist object state
> in a relational database.

But what if it is simpler to persist it using a non-relational interface to the database?

> It also makes it much easier to use the
> information for other purposes, such as reporting.

I've worked quite a bit with reporting in both a relational and a non-relational environment. There were pros and cons in both. What I saw was the scale tilting in favor of the non-relational, however.

> > Is there a way to get the best of both worlds on this one? This issue
> > is really bothering me, so thanks in advance for any help you can give
> > me to gain a better understanding and apologies for bringing it up
> > again. --dawn

Cheers! --dawn Received on Thu Feb 23 2006 - 23:18:01 CET

Original text of this message