Re: Designing database tables for performance?

From: Valentin Minzatu <valentinminzatu_at_yahoo.com>
Date: 14 Feb 2007 10:35:22 -0800
Message-ID: <1171478122.776716.112000@v45g2000cwv.googlegroups.com>

On Feb 13, 3:55 pm, "Jonathan Lewis" <jonat..._at_jlcomp.demon.co.uk> wrote:
> "Timasmith" <timasm..._at_hotmail.com> wrote in message
>
> news:1170997448.300092.235340_at_s48g2000cws.googlegroups.com...
>
>
>
>
>
> > Hi,
>
> > I am trying to get my head around who to best design a couple of
> > tables for performance. The entities are fake so ignore that, I am
> > just focusing on the table sizes and values.
>
> > Suppose I had a table called items with 50,000 rows
>
> > items (item_id number, parent_item_id number, item_name string)
>
> > and I had a table called orders with 10,000,000 rows
>
> > orders (order_id number, order_date timestamp, order_status number,
> > customer_id number, item_id number)
>
> > Now items are a hierarchy (hence the parent_item_id) and your average
> > custom has 1000 rows in the orders table.
> > The 1000 rows represents 100 different items on average.
>
> > There are two primary use cases for querying. One is like this:
>
> > select i.item_name, o.order_date, o.order_status
> > from orders o, items i
> > where o.customer_id = ?
> > and o.order_date > sysdate-1
> > and o.item_id = i.item_id
> > and i.item_id in (?....?);
>
> > Now ?...? represents a potentially very long list of items -
> > essentially a subtree of the item hierarchy.
>
> > The other query is searching for a few items across a longer timeframe
>
> > from orders o, items i
> > where o.customer_id = ?
> > and o.order_date > sysdate-30
> > and o.item_id = i.item_id
> > and i.item_id in (?,?,?);
>
> > It seems to me very difficult to optimize this, obviously we leave it
> > to the database and cross our fingers that the optimizer will pick the
> > right query in the right situation but I cant help thinking there must
> > be *something* one can do to help out. The fact the items are a
> > hierarchy doesnt help - difficult to avoid the 'in' statement without
> > knowing how many levels there are in the hierarchy.
>
> > Any ideas?
>
> I may have missed something, but based on the
> number of orders per customer, and number of
> orders in the table, it seems as if the number of
> order for a customer since
> sysdate - 1
> or sysdate - 30
>
> (a) will be quite small
> and
> (b) will be quite scattered if you use a standard heap table.
>
> Optimum strategy therefore seems to be to identify
> the orders for the customer in the date range using
> an indexed access path: and you could reduce the
> I/O costs of this step by storing the orders as an
> IOT (index organized table) possibly partitioned by
> date.
>
> Once the orders have been identified, you can then
> filter out the orders which are not for the required
> items with the predicate against the item table.
>
> An "IN SUBQUERY" with a hierarchical query
> against the items table (suitably indexed) should
> be sufficient to do the filtering in the case where
> you have a large item selection - followed by a
> join back to the items table to pick up the item
> name for the final order set. (Note: it really would
> be perfectly reasonable to have the items table
> appearing twice in the query).
>
> To minimise the cost of the subquery, you could
> always create a one-to-one trigger on the items
> table to maintain in real-time clone table (item_id, parent_id)
> as an IOT - low-risk, no contention on maintenance,
> minimum time, minimum space, maximum efficient
> for the hierarchical subquery.
>
> --
> Regards
>
> Jonathan Lewishttp://jonathanlewis.wordpress.com
>
> Author: Cost Based Oracle: Fundamentalshttp://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
>
> The Co-operative Oracle Users' FAQhttp://www.jlcomp.demon.co.uk/faq/ind_faq.html- Hide quoted text -
>
> - Show quoted text -

Going along the lines Jonathan described above, you may want to consider using an index organized materialized view (refresh on commit) which could speed up development and reduce you maintenace. Received on Wed Feb 14 2007 - 12:35:22 CST