Re: Design question
From: Mark S. (UK) <"Mark>
Date: Sat, 24 Jan 2009 19:01:20 +0000
Message-ID: <497B6580.2030808_at_nospamhere.com>
>> Mark S. (UK) wrote:
>>
>>> Hi,
>>> I'm currently working on a system that consists of several sites.
>>> Each site runs a system with its own local database, and then
>>> periodically each site updates the "master" site with its data, allow
>>> the data to be viewed centrally.
>>>
>>> It must be done like this because each site may not always be on the
>>> network, but the data must always be available from the central server.
>>>
>>> To enable the master database to contain all the site database
>>> information, I use a composite key, with the first part being the
>>> site identification, and the second part being the row
>>> identification. This allows user with row id 5 for instance to exist
>>> more than once because paired with the site id it is still unique.
>>>
>>> On top of this, there is another layer. Each site must be contained
>>> within an "Enterprise", which is just another identifier. So, I have
>>> a table called enterprise which contains the list of site ids.
>>>
>>> So far then, I have this (obviously cut down to only show the
>>> relevant columns) : -
>>>
>>> table: enterprise
>>> column: id
>>>
>>> table: site
>>> column: id
>>>
>>> table: user
>>> column: site_id
>>> column: id
>>> column: label
>>> etc..
>>>
>>> With the site_id and id making the composite key.
>>>
>>> Now, I've just had to add the concept of "enterprise" level users.
>>> Initially the only clean way I could think of doing this is having
>>> two separate user tables; enterprise_user and site_user : -
>>>
>>> table: enterprise_user
>>> column[key]: enterprise_id
>>> column[key]: id
>>> column: label
>>> etc..
>>>
>>> table: site_user
>>> column[key]: site_id
>>> column[key]: id
>>> column: label
>>> etc..
>>>
>>> However this does not seem very "normalized"..
>>>
>>> Other ways I considered are : -
>>>
>>> table: user
>>> column[key]: id
>>> column[key]: enterprise_id
>>> column: site_id
>>> column: label
>>> etc..
>>>
>>> But this does not make sense as an enterprise user is not associated
>>> with any site. I could have a dummy site, but I really do not like
>>> that. I could just set site_id to NULL, but I'm not a big fan of
>>> NULL in database and it just seems a bit hacky.
>>>
>>> So then I thought of adding a link table:
>>>
>>> table: user
>>> column[key]: id
>>> column[key]: enterprise_id
>>> etc..
>>>
>>> table: user_site <== link table
>>> column: enterprise_id
>>> column: user_id
>>> column: site_id
>>> etc..
>>>
>>> with a unique constraint on enterprise_id, user_id (the composite key
>>> that identifies a user in an enterprise).
>>>
>>> I think that also works.. basically with no entry in the "user_site"
>>> table, they are enterprise users. An entry in user_site effectively
>>> limits them to being a site user.
>>>
>>> However.. this might create complications in the "master" update
>>> process.
>>>
>>> Part of the batch update process involves updating user information.
>>> Enterprise users (only editable at the master) must be propagated
>>> down to each site, so that the site is "aware" of enterprise users.
>>>
>>> Site users (only editable at the site) must be propagated up to the
>>> master, purely so all the (read only) data at the master is complete.
>>>
>>> So I'm wondering if having the two tables separate will be just much
>>> simpler (although not being terribly efficient in terms of pretty
>>> much repeating a table structure).
>>>
>>> I hope all that makes sense!
>>>
>>> Thanks,
>>>
>>> Mark.
>>
>>
>> After some further thought, I've realised there is another problem:
>> user name (or label as I called it above) uniqueness.
>>
>> A user name at a site may be re-used, as the site is pretty much a
>> self contained system. So a user "fred" can exist at site 1, and at
>> site 2. Enterprise users however must remain unique.
>>
>> So.. site users are unique to sites, and enterprise users are unique
>> to the enterprise (which is the "container" for sites).
>>
>> This brings up a problem with the normalized design:
>>
>> table: user
>> column[key]: enterprise_id
>> column[key]: id
>> column: label
>>
>> add constraint unique on user(enterprise_id, label)
>>
>> table: user_site <== link table
>> column: enterprise_id
>> column: user_id
>> column: site_id
>>
>> This is fine for keeping users unique at the enterprise level, but
>> doesn't work out for site users when they've been propagated back to
>> the "master" database.
>>
>> So, currently I'm thinking two separate tables. I'm not repeating
>> data, I'm just repeating some elements of a table, so perhaps it isn't
>> that bad?
>>
>> Having read these posts back, I'm not sure I've explained the scenario
>> very clearly. I hope it makes some sense at least.
>>
>> Mark.
Date: Sat, 24 Jan 2009 19:01:20 +0000
Message-ID: <497B6580.2030808_at_nospamhere.com>
> Mark S. (UK) wrote:
>> Mark S. (UK) wrote:
>>
>>> Hi,
>>> I'm currently working on a system that consists of several sites.
>>> Each site runs a system with its own local database, and then
>>> periodically each site updates the "master" site with its data, allow
>>> the data to be viewed centrally.
>>>
>>> It must be done like this because each site may not always be on the
>>> network, but the data must always be available from the central server.
>>>
>>> To enable the master database to contain all the site database
>>> information, I use a composite key, with the first part being the
>>> site identification, and the second part being the row
>>> identification. This allows user with row id 5 for instance to exist
>>> more than once because paired with the site id it is still unique.
>>>
>>> On top of this, there is another layer. Each site must be contained
>>> within an "Enterprise", which is just another identifier. So, I have
>>> a table called enterprise which contains the list of site ids.
>>>
>>> So far then, I have this (obviously cut down to only show the
>>> relevant columns) : -
>>>
>>> table: enterprise
>>> column: id
>>>
>>> table: site
>>> column: id
>>>
>>> table: user
>>> column: site_id
>>> column: id
>>> column: label
>>> etc..
>>>
>>> With the site_id and id making the composite key.
>>>
>>> Now, I've just had to add the concept of "enterprise" level users.
>>> Initially the only clean way I could think of doing this is having
>>> two separate user tables; enterprise_user and site_user : -
>>>
>>> table: enterprise_user
>>> column[key]: enterprise_id
>>> column[key]: id
>>> column: label
>>> etc..
>>>
>>> table: site_user
>>> column[key]: site_id
>>> column[key]: id
>>> column: label
>>> etc..
>>>
>>> However this does not seem very "normalized"..
>>>
>>> Other ways I considered are : -
>>>
>>> table: user
>>> column[key]: id
>>> column[key]: enterprise_id
>>> column: site_id
>>> column: label
>>> etc..
>>>
>>> But this does not make sense as an enterprise user is not associated
>>> with any site. I could have a dummy site, but I really do not like
>>> that. I could just set site_id to NULL, but I'm not a big fan of
>>> NULL in database and it just seems a bit hacky.
>>>
>>> So then I thought of adding a link table:
>>>
>>> table: user
>>> column[key]: id
>>> column[key]: enterprise_id
>>> etc..
>>>
>>> table: user_site <== link table
>>> column: enterprise_id
>>> column: user_id
>>> column: site_id
>>> etc..
>>>
>>> with a unique constraint on enterprise_id, user_id (the composite key
>>> that identifies a user in an enterprise).
>>>
>>> I think that also works.. basically with no entry in the "user_site"
>>> table, they are enterprise users. An entry in user_site effectively
>>> limits them to being a site user.
>>>
>>> However.. this might create complications in the "master" update
>>> process.
>>>
>>> Part of the batch update process involves updating user information.
>>> Enterprise users (only editable at the master) must be propagated
>>> down to each site, so that the site is "aware" of enterprise users.
>>>
>>> Site users (only editable at the site) must be propagated up to the
>>> master, purely so all the (read only) data at the master is complete.
>>>
>>> So I'm wondering if having the two tables separate will be just much
>>> simpler (although not being terribly efficient in terms of pretty
>>> much repeating a table structure).
>>>
>>> I hope all that makes sense!
>>>
>>> Thanks,
>>>
>>> Mark.
>>
>>
>> After some further thought, I've realised there is another problem:
>> user name (or label as I called it above) uniqueness.
>>
>> A user name at a site may be re-used, as the site is pretty much a
>> self contained system. So a user "fred" can exist at site 1, and at
>> site 2. Enterprise users however must remain unique.
>>
>> So.. site users are unique to sites, and enterprise users are unique
>> to the enterprise (which is the "container" for sites).
>>
>> This brings up a problem with the normalized design:
>>
>> table: user
>> column[key]: enterprise_id
>> column[key]: id
>> column: label
>>
>> add constraint unique on user(enterprise_id, label)
>>
>> table: user_site <== link table
>> column: enterprise_id
>> column: user_id
>> column: site_id
>>
>> This is fine for keeping users unique at the enterprise level, but
>> doesn't work out for site users when they've been propagated back to
>> the "master" database.
>>
>> So, currently I'm thinking two separate tables. I'm not repeating
>> data, I'm just repeating some elements of a table, so perhaps it isn't
>> that bad?
>>
>> Having read these posts back, I'm not sure I've explained the scenario
>> very clearly. I hope it makes some sense at least.
>>
>> Mark.
> > Mark, > > I don't think theory-based answers to your questions exist, and I also > find design via usenet counterproductive at best. > > You are struggling with a number of issues including distribution and > partial fault tolerance. When needed, these are very important > considerations. > > As a general principle, it is best to handle your distribution and fault > tolerance needs at the physical level of discourse without exposing any > of that in your logical data model. You are trying to handle it entirely > at the logical level, which I think is probably inappropriate. > > Pragmatically, it is best to consider these issues at the earliest > stages and to use your distribution and fault tolerance requirements as > a basis for comparing dbms products and then choosing the dbms product > that best addresses your distribution needs. Once you have chosen a dbms > product, which it sounds like you have already done, then it is best to > consider the specific features that dbms product has for handling > distribution, replication, fault tolerance et cetera physically with the > least impact to your logical design. > > Many products now have features for distribution or replication, for > assigning globally unique numeric identifiers, for reconciling off-line > changes and more. Perhaps even the product you use has such features. > > The issue you have with user id's is a basic requirements and design > issue. Assuming you are using user id in the conventional way, the > purpose of a user id is to uniquely identify users of a system. > Pragmatically, the scope of such identity has to be an entire > independent system. Generally, that means globally throughout a system. > In your case, that may not be true. Individual sites, though, are not > independent systems because they are tied together into enterprises. It > may be your enterprises are entirely independent from one another. Or it > may be they have interdependencies too. > > Sometimes, it makes sense to scope user ids to have partial > independence, which is what email addresses do. The entire globally > unique id is the full email address with a user name, the _at_ symbol, and > a domain, while the locally unique id is simply the user name. Doing so > simplifies some things while complicating others. > > In any case, I think you need to decide the scope of your user ids based > on what users will expect and what meets the needs of the system you are > designing. You may have to trade off conflicts between them. > > In the end, at this point in time, design is more art than science.
--Received on Sat Jan 24 2009 - 20:01:20 CET