Re: Declarative constraints in practical terms

From: Brian Selzer <brian_at_selzer-software.com>
Date: Fri, 24 Feb 2006 01:55:09 GMT
Message-ID: <1ItLf.51782$dW3.8817_at_newssvr21.news.prodigy.com>


See comments below.

"dawn" <dawnwolthuis_at_gmail.com> wrote in message news:1140733081.318114.74100_at_e56g2000cwe.googlegroups.com...
> Brian Selzer wrote:
>> I think you're missing the point. A database is a knowledge repository,
>> not
>> an application.
>
> It is a portion of one or more software applications, right?
>

Wrong. It's a separate and distinct entity that is used and manipulated by applications.

>> It is the foundation upon which applications are built.
>
> I don't see it that way. I see it as a software component. One could
> also tip things another direction and say "The UI is the foundation of
> any software application" or "The processing of data is foundational to
> any software application." I see these all as components to the
> greater whole. The design and architecture of software are
> foundational. I do think that how solid a data model for any aspect of
> software is has a significant impact on how good the whole is.
>

Huh??? Isn't software a synonym for Information System? Isn't the whole point of software to manipulate information? The user interface is arbitrary and at best peripheral to the purpose of an application. How data is manipulated is important, but depends on the definition of the data. Thus the definition of the data is the first and in my opinion the most important development task.

Before you can even begin the process of design and architecture, you must first perform a thorough analysis, and the ultimate purpose of that analysis is to determine what information needs to be collected and retained.

>.... I do think that how solid a data model for any aspect of
> software is has a significant impact on how good the whole is.
>

I'm glad you do. The first time I read this passage, I thought you meant "don't."

>
> eliminating it in some places and introducing it in others
>

I disagree. The definition of a database should be all that is required to determine whether a database value is legal. Constraints are an integral part of that definition.

>> Integrity constraints seal the
>> foundation so that inconsistencies cannot leach their way into the
>> foundation, causing it to crack.
>>
>> Comments inline.
>>
>> "dawn" <dawnwolthuis_at_gmail.com> wrote in message
>> news:1140555731.823399.160020_at_g44g2000cwa.googlegroups.com...
>> > This is related to a recent exchange related to declarative
>> > constraints. I know I've worked on it before, but I am still perplexed
>> > on this one. Here are two common options for handling constraints
>> > (business rules):
>> >
>> > a) declarative constraints: SQL declarations enforced by a DBMS engine
>> > b) metadata + code: metadata specifications plus custom-build
>> > validation/constraint functions or services written in a general
>> > purpose language
>> >
>> > Some differences I can see are
>> >
>> > 1) Declarative constraints are coded in the SQL sublanguage while with
>> > metadata + code, the metadata declarations are simpler, being in the
>> > form of name=value pairs (e.g. maxValue=100, valTable=MyTable) but
>> > proprietary code is written.
>> >
>> > 2) The functions combining the data and constraints in the declarative
>> > case are in well-tested (over time, at least) engines written by a
>> > database vendor, otherwise often less generic, more specific, functions
>> > written by whomever whatever team is writing validation routines for
>> > software services. .
>> >
>> > 3) The functions for validation of data can be used anywhere within an
>> > application if packaged outside of the dbms, otherwise most of them
>> > need to be coded at least twice -- once for the dbms and once for use
>> > in a UI or web service.
>>
>> But now it's up to the application(s) to maintain integrity, which is a
>> recipe for disaster. It is the database's responsibility to maintain
>> integrity. If the database contains redundant information, then it's up
>> to
>> every application to make sure that every instance is updated at the same
>> time so that queries can be relied upon. If constraints aren't declared
>> on
>> the objects in the database, then it's up to every application to ensure
>> that garbage doesn't make it into the database. If you delegate the
>> database's responsibility to the applications, then all it takes is one
>> poorly written or malicious application to introduce garbage into the
>> database--making the results of all applications suspect.
>>
>> >
>> > 4) The declarative constraints use the RM, so they work with the
>> > restrictions of the RM, including 1NF. Because they are written in
>> > SQL, they use a 3VL whereas using metadata + code, 2VL and non-1NF are
>> > the norm.
>>
>> NULLs and 3VL are optional. I prefer to avoid them. For every
>> relational
>> database that uses NULLs, there exists an equivalent relational database
>> that does not.
>>
>> I would argue that 1NF is a good thing. Sets are already covered,
>> because
>> relations are sets. Bags and lists can also be easily mapped onto
>> relations. Once mapped and indexed, queries are easier and faster.
>
> How do you know that once mapped and indexed queries are easier and
> faster? There are no benchmarks of which I am aware that are data
> model independent for database performance. As for easier, I find it
> easier to write
>
> select name, gpa from students where every major <> "math"
>
> (converted from a non-1NF query language into pseudo-SQL) than the
> equivalent sql-92 statement
>

How do I know? Experience and just plain common sense. A list must be traversed sequentially. A set or bag must be scanned. Indexes make queries against these much, much faster.

>> >
>> > 5) In the case of declarative constraints, they are necessarily
>> > employed by any application writing to the database. In the case of
>> > metadata + code, each organization must determine whether and how to
>> > technically enforce business rules for all applications or enforce them
>> > through standards and QA approaches.
>> >
>> > 6) If there are local constraints (constraints for one application and
>> > not another) then these are either declared in SQL as local constraints
>> > (something I've read about, but never used) or, often, coded only in
>> > the individual applications using metadata + code. So, local
>> > constraints are often handled the same but in one case local and global
>> > constraints are coded in the same language, typically by the same team.
>>
>> How can there be constraints for one application and not another? Either
>> the information in the database is consistent or it is not.
>
> I'll have to think about this more, but I think a local constraint
> could be seen as a view (an updatable view). For example, for a group
> called the Women's Coffee Break group, the view might be restricted to
> women, where the database includes men as well but they are not
> prospects. I suppose this could be captured in a global constraint
> where a new data element is added to indicate whether a person is in
> this group or not and then we could write the constraint indicating
> that only gender=F can have WCB="Y" or some such but then we still want
> to indicate who the prospects are for our application and only show
> those as options for mailings or for adding to the club or whatever.
> hmmm. If I'm rambling and making no sense, then just let me know.
>
>> If you don't
>> enforce all constraints all of the time, then garbage can get into the
>> database! Once garbage gets in, every query is suspect.
>
> Of course.
>
>> > I lean toward not duplicating constraints, coding and maintaining them
>> > in multiple places and languages, but I understand that someone else
>> > might choose the other strategy. Whatever choice, it doesn't look
>> > obvious to me that declarative constraints are better as I gather it
>> > appears to many others.
>> >
>> I think that it is more important to keep garbage out of the database.
>
> That is definitely a goal.
>
>> Again, if the foundation is weak, then the building will fall.
>
> If the foundations are weak. I don't buy "the database is the
> foundation" type of statements. It is certainly an important aspect of
> software, however, so I am not trying to minimize it, but to indicate
> that other aspects of the software are also important and even
> foundational.
>

The whole point of software is to manipulate information. All other aspects of software development depend on the definition of that information--that is, the database schema: you have to know what you're manipulating before you can manipulate it. I would think that that makes the database the foundation upon which applications are built.

>> An
>> additional layer of protection is a good thing. Rollbacks are expensive,
>> so
>> it's better to scrub the data before it's introduced to the database.
>
> Agreed.
>
>> > This relates to the fact that the RM is not sufficient for writing
>> > software (as mentioned in my current blog entry that I'll again boldly
>> > advertise as being at http://www.tincat-group.com/mewsings ) and coding
>> > constraints using the RM doesn't seem like it can get you all the way
>> > there. So if you go that route, you end up duplicating your work, both
>> > up front and for all maintenance.
>>
>> I read your blog. I disagree. It's a simple matter to persist object
>> state
>> in a relational database.
>
> But what if it is simpler to persist it using a non-relational
> interface to the database?
>
>> It also makes it much easier to use the
>> information for other purposes, such as reporting.
>
> I've worked quite a bit with reporting in both a relational and a
> non-relational environment. There were pros and cons in both. What I
> saw was the scale tilting in favor of the non-relational, however.
>
>> > Is there a way to get the best of both worlds on this one? This issue
>> > is really bothering me, so thanks in advance for any help you can give
>> > me to gain a better understanding and apologies for bringing it up
>> > again. --dawn
>
> Cheers! --dawn
>
Received on Fri Feb 24 2006 - 02:55:09 CET

Original text of this message