Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Database design and load
I have what I hope is an interesting opportunity. My company is about
to put me on a project to optimize an oracle application. I have some
ideas, but I am not actually on the project, so I can only guess on
certain parts of it. I do know it is currently on 9i, on a Sun machine.
It is a 24/7 operation where data is constantly flowing in (I don't
know how much, but I suspect it is a significant amount/second). There
are two "databases", I don't know if these are actually instances or
schemas (you know management). One "database" contains the most current
data for the last 3 days; this one is "big". The other has all the
data, this one is terabytes and growing very fast. The first "database"
is partitioned by date (day). To maintain the last three, the oldest
partition is dropped and a new one for the current day is added (I
assume in a round robin way). The historical "database" is not
partitioned and just keeps growing.
Currently they have a java program loading the data into both the
current and historical "databases" concurrently. Reporting is done off
of both, and is needed in "real time". As an added benefit, "they" want
to go to oracle 10g and change the whole thing to a grid for speed and
efficiency of loading and retrieving the data.
My thought is:
Partition the historical "database".
Load data only into the current "database", and move partitions from
current to historical, creating a new partition each day in the current
"database".
For queries on the historical data, join the data in current and
historical (is this possible if it is actually two databases/instances?
How?)
For loading the data, would it be more efficient to build a flat file
and use sql*loader, say once a minute or even second depending on the
amount of data and the time necessary to run the queries rather than a
java program (I don't know java, and do not know how efficient it is
for loading data in Oracle). Am I correct in thinking that a query in
oracle only sees the data as of the start of the query, and any records
added (committed or not) after the start of the query are not available
to it? This is the thought behind loading the data in batches using
sql*loader. If the query takes a second or two to process, then batch
loading every second would not change the currency of the data based on
my assumption above, assuming the load takes less than a second itself.
I do not know the response time, therefore I would have to balance it
with the speed of the sql*loader to see if this is even a possible
scenario.
I don't know the load on the system, whether it is constant over the 24
hours/7 days, but I suspect it is, since it is used around the world,
it is relatively constant. Is this scenario a good candidate for a grid
database? I am under the impression that grid is best for load
balancing over several applications or operations (on-line/batch), and
this is a single application with a fairly level load. If the grid
would help, what should I do in optimizing the 9i to prepare for moving
it to 10g? I hope this is not too big of a question. As I do not have
any empirical data, I am looking for impressions, opinions, and maybe
some suggestions, or references to study in preparing to actually get
on this project and get my hands "dirty". Thanks for any help.
Received on Wed May 11 2005 - 07:27:47 CDT