Re: Query performance: DivideAndConquer vs. BruteForce

From: Jim Kennedy <kennedy-family_at_attbi.com>
Date: Tue, 08 Jan 2002 06:16:56 GMT
Message-ID: <s9w_7.4412$DG5.49393_at_rwcrnsc53.ops.asp.att.net>


I haven't had this problem under Oracle (joins of more than 3 tables). Jim
"David Williams" <djw_at_smooth1.fsnet.co.uk> wrote in message news:a1b33k$idf$1_at_news8.svr.pol.co.uk...
>
> "Andrew Wilson" <andrew_at_blueoffice.com> wrote in message
> news:a177qa$2b1m$1_at_news.cybercity.dk...
> > As a rule of thumb - "The more you perform on the database server (and
> > the less on the client) the faster will be the overall response time"!
> >
> True.
>
> > Performing a loop on the client, performing multiple queries, will
> > probably be a factor of many times slower.
> >
> True.
>
> > In the full query case I would imagine that the server would perform a
> > table scan on the parent table, and then perform full sort merges for
> > each of the child tables (depending on table sizes, statistics on
> > columns, table structuring, cleverness of optimiser, complexity of
> > query, etc, etc). All data is then returned across the network to the
> > client.
> >
> > In the "divide-and-conquer" case a table scan of the parent table is
> > performed and the data returned across the network (or locally).
> >
> > Now we get into the trouble area, because for each row in the parent
> > table we now perform a select from each of the child tables. As they
> > are primary key selects, and I assume the table is structured on the
> > primary key, then (if btree/ISAM) a tree walk is performed on the
> > child table. The total number of pages read will be greater (index
> > pages need to be re-read each time [maybe buffered]), the amount of
> > server CPU will be greater (each separate query will have to be
> > re-analysed), and network traffic will certainly be greater.
> >
> > At a total guess I would imagine that in the first case the selection
> > is of a logN type, whilst in the second case it is more likely N to
> > some power.
> >
> > Stick to the server ......
> >
> >
> However under Informix I find that >3 table queries are slower.
>
> Use a temp (unlogged) table and select results into it, index it and
> then join to it.
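[A minimal sketch of that temp-table technique, using Python with an in-memory SQLite database. The table and column names (`big`, `ref`, `t1`) are invented for illustration; the original advice is for Informix, where the syntax is `SELECT ... INTO TEMP t1`, but the restrict-then-index-then-join idea is the same.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical schema: one large table and one small reference table.
cur.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, ref_id INTEGER, val TEXT)")
cur.execute("CREATE TABLE ref (id INTEGER PRIMARY KEY, name TEXT)")
cur.executemany("INSERT INTO ref VALUES (?, ?)", [(1, "a"), (2, "b")])
cur.executemany("INSERT INTO big VALUES (?, ?, ?)",
                [(i, 1 + i % 2, "v%d" % i) for i in range(1000)])

# Step 1: restrict the large table into a temp table first.
cur.execute("CREATE TEMP TABLE t1 AS SELECT id, ref_id FROM big WHERE ref_id = 1")
# Step 2: index the temp table on the join key.
cur.execute("CREATE INDEX t1_ref ON t1 (ref_id)")
# Step 3: join the (now small) temp table to the rest.
rows = cur.execute(
    "SELECT t1.id, ref.name FROM t1 JOIN ref ON ref.id = t1.ref_id").fetchall()
print(len(rows))  # -> 500 (only the restricted rows take part in the join)
```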
>
> I usually find many-table queries come in several forms:
>
> 1. Lots of small reference data tables and one large table
>
> Join the large table to a few of the small tables (the main ones
> which restrict the rows from the large table at low cost).
> Put this into a small temp table. Join the temp table to the rest.
>
> 2. Several large tables which produce a small number of rows and
> many small tables.
>
> Join the large tables to get a small temp table then join the
> temp table to the rest.
>
> 3. Lots of complex sub selects and one/few main tables.
>
> Partially do the subselects into one or more temp tables which
> simplify the joins to the main tables. Then use a set of several
> temp tables to gradually join the data together.
>
> Here you can use the knowledge you have of the data to work out
> the best way to do the query. Hints:
>
> - Each new temp table which results should have fewer
> rows than the last. Fewer rows to join is better. Get the most
> restrictive filters in earlier.
> - Perhaps only get primary keys, and let the final query fetch the
> long strings from the tables.
>
> E.g.
>
> select key1, key2, string1 from table1, ...
> where <a> and <b> and ...
>
> It may be better to initially do
>
> select key1 from table1
> where <a>...
> into temp t1;
> ...
> select key1,key2,string1 from t1..
> where <b> and
>
> if <b> does not restrict the number of rows from <t1>
> very much...
>
>
> This is where benchmarking comes in..!
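[To illustrate the benchmarking point, here is a small self-contained sketch in Python with SQLite. The schema and row counts are invented, and an in-memory database hides the network cost that dominates against a real Oracle/Informix server, so absolute numbers mean little; but the per-row round-trip overhead of the divide-and-conquer loop still shows up in the comparison.]

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, c1 INTEGER)")
cur.execute("CREATE TABLE child1 (id INTEGER PRIMARY KEY, data TEXT)")
n = 5000
cur.executemany("INSERT INTO child1 VALUES (?, ?)", [(i, "d%d" % i) for i in range(n)])
cur.executemany("INSERT INTO parent VALUES (?, ?)", [(i, i) for i in range(n)])

# BruteForce: one join, one round trip.
t0 = time.perf_counter()
joined = cur.execute(
    "SELECT p.id, c.data FROM parent p JOIN child1 c ON c.id = p.c1").fetchall()
t_join = time.perf_counter() - t0

# DivideAndConquer: one query per parent row.
t0 = time.perf_counter()
looped = []
for (pid, c1) in cur.execute("SELECT id, c1 FROM parent").fetchall():
    (data,) = cur.execute("SELECT data FROM child1 WHERE id = ?", (c1,)).fetchone()
    looped.append((pid, data))
t_loop = time.perf_counter() - t0

print(sorted(joined) == sorted(looped))  # same result set either way
print("join: %.4fs  loop: %.4fs" % (t_join, t_loop))  # loop is usually slower
```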
>
>
> >
> > Andrew
> >
> >
> >
> > "Tushar Mahapatra" <tushar_mahapatra_at_yahoo.com> wrote in message
> > news:1ad0000.0201040705.651f411_at_posting.google.com...
> > > I have a question about what would be the best way to query an SQL
> > > database (I am using Oracle) where many parent and child tables are
> > > involved. I seek opinions on which of the following two methods would
> > > require the least execution time:
> > >
> > > Method 1 - BruteForce method:
> > > Execute a single query where all the parent and child tables are
> > > joined and all the data is returned at once. For example:
> > >
> > > select *
> > > from ParentTable P, Child1Table C1, Child2Table C2, ...
> > > where P.Child1Key = C1.Key and P.Child2Key = C2.Key and ...
> > >
> > >
> > > Method 2 - DivideAndConquer method:
> > > Execute more than one query. The first query gets data from the parent
> > > table:
> > >
> > > select *
> > > from ParentTable P
> > > where ...
> > >
> > > Next, in the front-end code, iterate through the parent table data
> > > received and execute queries against the child tables:
> > >
> > > select *
> > > from Child1Table C1
> > > where C1.Key = varPChild1Key and ...
> > >
> > >
> > >
> > > I am inclined to believe that the DivideAndConquer method would be
> > > faster, but I invite your comments, especially with respect to Oracle
> > > 9i.
> > >
> > > The scenarios I have described are actually much simplified
> > > representations of my situation.
> >
> >
>
>
Received on Tue Jan 08 2002 - 07:16:56 CET
