Re: 3 value logic. Why is SQL so special?

From: Anne & Lynn Wheeler <lynn_at_garlic.com>
Date: Wed, 13 Sep 2006 18:37:08 -0600
Message-ID: <m3wt87uvuz.fsf_at_lhwlinux.garlic.com>


"Cimode" <cimode_at_hotmail.com> writes:
> And many other information...
> This is a purely pedagogical case (far from being complete) to
> demonstrate that it is perfectly possible to build some logical design
> in minutes (took me 20 of them) WITHOUT using NULLS...while sticking to
> the God Damn Real World (lazyness) excuse...

small matter ... scheduled arrival, projected arrival, actual arrival can have different business uses. the scheduled arrival ... for scheduled flights have all sorts of issues for reservations, planning, etc. the projected arrival may be used in real time for things like decisions about holding other flights for the convinence of other connecting passengers. actual arrival may be used for long term planning purposes ... as well as computed values like difference between scheduled and actual.

one approach about information not yet known is to have different tables that don't yet have entries for the unknown values. another possible mechanism is to have fields mean different things depending on other state information ... for instance have a state flag to switch the meaning of the field containing the projected arrival to actual arrival (or other domain knowledge that changes what a field means based on other circumstances).

it may turn out to not only be useful to have different tables ... but possibly even completely different servers and applications. for instance the projected arrival tends to be of relatively short-term use ... and rarely needs to be carried for very long. the table of projected flight departures and arrivals ... may be relatively few flights with lots of inserts, updates, and deletes. people needing real-time information for scheduling and provisioning adjustments tend to have totally concerns (i.e. will the equipment coming in on a different scheduled flight be available in time to turn around for a brand new flight, will the crew arriving on a different flight be available in time for the their new assigned flight, will connecting passengers be able to make their connection, etc) than some of the longer term considerations. Having totally different tables and databases eliminates the requirement for trying to maintain global encompassing information.

an unscheduled flight ... is they are flying some equipment that isn't a "scheduled" flight ... but for which they have to file a flight plan and actually fly ... or not ... possible to file a flight plan and then have it abort because of weather ... and then file a brand new flight plan ... which possibly might be a totally different unscheduled flight (as opposed to a flight plan for a scheduled flight).

res. system with 400k plus flight segments for scheduled flights, scheduled departures, scheduled arrivals ... tends to be close to 400k plus flight segments every day (i.e. a database with 400k plus records) .... slightly less than that because there are re-occuring flights every day ... but there are also re-occuring flight segments specified just for weekdays ... with different re-occuring flight segments specified for saturday and/or sunday.

actual flight departures and arrivals on the order of 400k some per day accumulate and can be kept for months ... tens of millions of records with 400k some inserts everyday for new flights and 400k some deletes everyday for old flights that are aged out (actually this tends to be partitioned into smaller subsets that are carrier specific).

PNR (passenger name record) can be a couple orders larger ... starts when the reservation is made ... possibly a couple months before the actual flight (hundred or so people per flt) and may be kept from a couple months to possibly a couple years. a few tens of millions reservations are made every day, a few tens of millions of updates for flights taken, and a few tens of millions of deletes as reservations are canceled or aged out ... the number of records should hover around the avg. number of reservations made per day (few tens of millions) times how long the records are kept (several hundred days). That can be several billion records ... the total size isn't too bad since PNR record information is fairly condensed and runs around 4k bytes ... frequently has lots of nulls for unknown information since they try and have everything in a single record read/write/update/insert/delete/etc ... so possibly only several terabytes aggregate. another kind of PNR null might be zip code field that is not applicable (as opposed to unknown) ... to get around possible SQL issues ... fill it in with anything as a substitute for actual null ... like the letters N/A. Then any mailing processing would have to know that N/A actually means null when it goes to generate an actual address (as opposed to the zip code: N/A). Not available information could be filled with N/A also (for not available as opposed to not applicable) or possibly UNKN ... which i've also seen used. Received on Thu Sep 14 2006 - 02:37:08 CEST

Original text of this message