
Re: fastest method for reading tables

From: M Sweger <mikesw_at_dhp.com>
Date: 1997/11/20
Message-ID: <651lfp$ps2$1@stronghold.dhp.com>

Keith Boulton (boulke_at_globalnet.co.uk) wrote:
: On 18 Nov 1997 00:17:51 GMT, hasan_at_DB.Stanford.EDU (Waqar Hasan)
: wrote:
 

: >
: >I have an application program that needs to read data
: >by the fastest method possible.
: >
: >Is OCI the fastest available interface?
: >What parameters need to be tuned to get the application
: >to go faster?
: >Is there any other interface that might go faster than OCI?
: >
: >The application makes queries that retrieve *all* rows of a table
: >(but only a subset of the columns).
: >
: >The application does not care if the data is slightly out of date.
: >
: >thanks,
: >
: >-Waqar Hasan
: > hasan_at_db.stanford.edu
 

: I would be surprised if the OCI was significantly faster than a Pro*
: embedded SQL program, but I don't have a suitable compiler at the
: moment to experiment with.
 

: The work going on in a query can be split into two separately
: optimised stages:

: 1. Read data from the database to a server process.
: 2. Communicate the results from the server process to a user process.
 

: To optimise step 1, ensure that the data blocks are available in the
: SGA, i.e. the db_block_buffers initialisation parameter should be as
: high as possible without inducing paging on the server, and the server
: process should rarely be shut down, so that the cache hit ratio is as
: high as possible.
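
(In init.ora terms this is a one-line setting - a minimal sketch; the
value shown is arbitrary and must be sized against physical memory to
avoid the paging mentioned above:)

    db_block_buffers = 10000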
 

: If your application primarily accesses tables by full table scan, then
: issuing ALTER TABLE <your table> CACHE will favour your table at the
: expense of others, though this may not have the desired effect.
: If you are accessing the table via a full-table scan and only
: accessing a small proportion of each row (e.g. 3 of 20 columns, or
: only 30 bytes out of an average record length of 2000), it may be
: worth splitting the table into two with a one-to-one relationship
: between them, although the same effect can be achieved by using an
: index fast full scan (index_ffs hint) - see the Oracle Server Tuning
: manual.
: Alternatively, it may be worth defining one or more composite indexes
: on the columns required.
 

: Oracle can use this index to order the rows, and if all the columns
: required to be retrieved are in the index, no data block accesses will
: be required. Even without ordering, Oracle (version 8 at least) can do
: a fast full scan of an index in place of a full table scan where all
: the columns in the select list are in the index, so reducing the
: number of data blocks to be read.
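
(For illustration, a minimal sketch of such a covering composite index
and the index_ffs hint, using the bill table defined in the examples
just below; the index name bill_i2 is made up:)

    create index bill_i2 on bill(a, d);
    select /*+ index_ffs(bill bill_i2) */ a, d from bill;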
 

: Examples:
: create table bill ( a varchar2(20) not null, b varchar2(2000) not null,
:                     c varchar2(2000), d number not null )
: storage( initial 2m next 2m pctincrease 0 );
 

: After inserting 32768 records. Note that in all the following tests,
: the ratio of real to elapsed time was similar.
 

: select avg(a) from bill; almost entirely reflects the cost of the full
: table scan on the server, giving real: 7811 ms. Repeating the same
: query gives real: 3500 ms, reflecting the impact of caching in the SGA
: and, in this case, in file buffers. Analyze table has no impact on
: these results, except that the analyze command itself causes some data
: to be cached.
: Issuing alter table bill cache; followed by select avg(a) from bill;
: gives real: approx 3500 ms, but subsequent executions give real:
: approx 550 ms, nearly seven times faster. I would not have expected
: the cache option to have much effect here, because the SGA was already
: plenty big enough to hold all of the table.
 

: Creating another table john, which holds only column a of bill, gives
: an initial time after database restart for select avg(a) from john of
: real: 900 ms; subsequent executions give real: 600 ms. After alter
: table john cache; subsequent executions give real: 460 ms.
 

: Creating an index on bill also improves results:
: create index bill_i1 on bill(a)
:   storage( initial 1m next 1m pctincrease 0 );
: analyze table bill estimate statistics;
: (perform shutdown and restart to clear the SGA)
: select avg(a) from bill; real: 731 ms
: select avg(a) from bill; real: 541 ms
 

: A slight improvement on subsequent executions results from caching of
: index blocks. The access path shows that the new index is being used
: in place of the table:
: SELECT STATEMENT Cost = 365
:   SORT AGGREGATE
:     INDEX FULL SCAN BILL_I1
 
: The same access path, minus the sort aggregate, is used both for
: select a from bill and select avg(a) from bill, confirming the earlier
: hypothesis that select avg(a) is the same as select a in terms of
: database processing.
 

: In general, data is required to be sorted, and we do not wish to
: distort SGA caching too much, so use indexes rather than caching,
: unless you are referencing most of the columns in the table. Bear in
: mind, however, the impact of large numbers of indexes on record
: insertion/update/delete.

: Optimising communication between the server process and the user
: process is done primarily by passing more than one row of data at a
: time using array fetches. This can be seen in the performance of the
: export command with different buffer sizes:
:
:   Buffer   Time(s)
:   0        20
:   10k      15
:   100k     7
:   1000k    7
 

: Notice that at some point there is no further gain.
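
(The same array-fetch effect can be seen from SQL*Plus, where the
client fetch array size is controlled by the arraysize setting - a
minimal sketch; the value 100 is arbitrary:)

    set arraysize 100
    select a from bill;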
 

: The fastest method possible would be to use a direct path read of the
: database blocks. I believe there is enough information available in
: the documentation to allow you to decode the database blocks directly
: (should you wish to be so foolish).

Hi,

	I'm surprised that the performance of searching just 32K data records is 3-7 secs. It sure seems slow to me, particularly if I'm a Web surfer trying to search a company's product list. Questions:

	1). What version of Oracle: V7, V8? I thought a multidimensional
	    DBMS such as V8 is supposed to be fast.

	2). What about parallelizing the query(ies)? I'm not sure about
	    this (see the sketch after this list).

	3). What platform? I.e. Unix, processor(s) speed.

	4). What about loading the whole DBMS into memory, thereby reducing
	    disk access?

	5). How can Oracle claim 50K transactions/sec when it takes 3-7 secs
	    to do one?
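
	(A minimal sketch of the parallel-hint idea from question 2,
	assuming the cost-based optimiser and the bill table from the
	examples above; the degree of 4 is arbitrary:)

	select /*+ full(bill) parallel(bill, 4) */ a
	from bill;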

	The reason I ask these questions is that:

	a). I need to search a table of 186K records four times for four
	    different numbers to see if they are valid, and

	b). I need to search a table of 4K records two times to check
	    for a valid code value, and

	c). I need to search a table of 1M records up to 999 times
	    to validate a 13-digit alphanumeric number, and

	d). I need to search various tables (approx. 10) of no more than
	    1K records each to validate some code values.

	e). What happens when table joins come into play in terms of
	    adding to the performance overhead?

	f). How can Oracle advertise a 50K transactions-per-second
	    performance number when it takes 3-7 secs for one? And does
	    this mean only one table search qualifies as a transaction,
	    or is it based on how many records in that table are searched
	    per second, i.e. relabelling a record as a transaction?


	Hence,

	I need to search items (a-d) as fast as possible - preferably via
a WEB implementation and/or OLTP. I need to do thousands of these
transactions per hour (per sec is better) without any noticeable
degradation, preferably with a 1-sec response time for each user for
their transaction. I probably can do parallelized queries on item (a),
since it's just four numbers searching the same set of data (others may
apply too). However, I need to indicate an error and be specific as to
which of the four was invalid.
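
(For what it's worth, lookups like (a)-(d) are normally served by
unique-index probes rather than scans - a minimal sketch with made-up
table and column names:)

	create unique index valid_numbers_i on valid_numbers(num);

	select count(*)
	from   valid_numbers
	where  num = :candidate;
	-- count = 0 means the candidate is invalid; a unique-index
	-- probe reads a handful of blocks instead of scanning all
	-- 186K rows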

        Therefore,

        This application just reads a bunch of data, validates it for correct input, generates an appropriate error message, and stores the data in the DBMS. Later on, the data must be extracted from the DBMS and revalidated for auditing purposes.

        Additionally,

        Once the data is validated, it is used to search 10K records, where each record is a chunk of data. Each piece of this record is then used to match more specific data criteria.

Conclusion:

        If the DBMS is this slow, I'll never get done. Note: currently I'm not using an Oracle DBMS.

--
	Mike,
	mikesw_at_whiterose.net
Received on Thu Nov 20 1997 - 00:00:00 CST

