small matter ... scheduled arrival, projected arrival, actual arrival
can have different business uses. the scheduled arrival ... for
scheduled flights have all sorts of issues for reservations, planning,
etc. the projected arrival may be used in real time for things like
decisions about holding other flights for the convinence of other
connecting passengers. actual arrival may be used for long term
planning purposes ... as well as computed values like difference
between scheduled and actual.
one approach about information not yet known is to have different
tables that don't yet have entries for the unknown values. another
possible mechanism is to have fields mean different things depending
on other state information ... for instance have a state flag to
switch the meaning of the field containing the projected arrival to
actual arrival (or other domain knowledge that changes what a field
means based on other circumstances).
it may turn out to not only be useful to have different tables ... but
possibly even completely different servers and applications. for
instance the projected arrival tends to be of relatively short-term
use ... and rarely needs to be carried for very long. the table of
projected flight departures and arrivals ... may be relatively few
flights with lots of inserts, updates, and deletes. people needing
real-time information for scheduling and provisioning adjustments tend
to have totally concerns (i.e. will the equipment coming in on a
different scheduled flight be available in time to turn around for a
brand new flight, will the crew arriving on a different flight be
available in time for the their new assigned flight, will connecting
passengers be able to make their connection, etc) than some of the
longer term considerations. Having totally different tables and
databases eliminates the requirement for trying to maintain global
encompassing information.
an unscheduled flight ... is they are flying some equipment that isn't
a "scheduled" flight ... but for which they have to file a flight plan
and actually fly ... or not ... possible to file a flight plan and
then have it abort because of weather ... and then file a brand new
flight plan ... which possibly might be a totally different
unscheduled flight (as opposed to a flight plan for a scheduled
flight).
res. system with 400k plus flight segments for scheduled flights,
scheduled departures, scheduled arrivals ... tends to be close to 400k
plus flight segments every day (i.e. a database with 400k plus
records) .... slightly less than that because there are re-occuring
flights every day ... but there are also re-occuring flight segments
specified just for weekdays ... with different re-occuring flight
segments specified for saturday and/or sunday.
actual flight departures and arrivals on the order of 400k some per day
accumulate and can be kept for months ... tens of millions of records
with 400k some inserts everyday for new flights and 400k some deletes everyday
for old flights that are aged out (actually this tends to be
partitioned into smaller subsets that are carrier specific).
PNR (passenger name record) can be a couple orders larger ... starts
when the reservation is made ... possibly a couple months before the
actual flight (hundred or so people per flt) and may be kept from a
couple months to possibly a couple years. a few tens of millions
reservations are made every day, a few tens of millions of updates for
flights taken, and a few tens of millions of deletes as reservations
are canceled or aged out ... the number of records should hover around
the avg. number of reservations made per day (few tens of millions)
times how long the records are kept (several hundred days). That can
be several billion records ... the total size isn't too bad since PNR
record information is fairly condensed and runs around 4k bytes
... frequently has lots of nulls for unknown information since they
try and have everything in a single record
read/write/update/insert/delete/etc ... so possibly only several
terabytes aggregate. another kind of PNR null might be zip code field
that is not applicable (as opposed to unknown) ... to get around
possible SQL issues ... fill it in with anything as a substitute for
actual null ... like the letters N/A. Then any mailing processing
would have to know that N/A actually means null when it goes to
generate an actual address (as opposed to the zip code: N/A). Not
available information could be filled with N/A also (for not available
as opposed to not applicable) or possibly UNKN ... which i've also
seen used.