Need to query 4 tables....3 are 'subordinates'...(need outer join?)

From: Bomb Diggy <techguy_chicago_at_yahoo.com>
Date: 2 Oct 2002 21:07:30 -0700
Message-ID: <94599bb3.0210022007.1fef3cdd_at_posting.google.com>



This is a database/app design question as best I can figure. I have a main table called Main which has about 15M rows. It has a two-column primary key. I want to collect all the information I can find for all the rows on one-half of that primary key, and then hit the three 'subordinate' tables to get extra information. The pk of table Main is 2 parts of the 3-part primary keys of each of the remaining 3 tables. I will know the 3rd part of the remaining pk's at runtime (no worries).

Is there a best approach to gather the information I need? For instance, I think I finally got my outer join to work for two tables
(the master and one subordinate), and it's already over my head
(shoot, don't even know if it's technically correct, nor how to prove
it sufficiently one way or the other!).

OUTER JOIN W/ ONLY TWO TABLES



SELECT *
FROM     t2 LEFT OUTER JOIN t1
ON       t1.PK1 = t2.PK1    AND
         t1.PK2 = t2.PK2    AND
         t2.PK2 = <rt_value>

WHERE t1.PK1 = <rt_value>

So, the questions are:

  1. Is this hub-and-spoke design common? I can see the logic of normalizing the data, but this query seems like it's going to grow to the point where nobody else on my team can understand it - starting with me.
  2. Is it possible to do a multi-table outer join? Is that what I need to do? Do I have other SQL options? Subqueries? Temp tables?
  3. I'm working w/ some mainframe programmers who say they never write crazy SQL. Instead, they just go out and get the data for each row one table at a time. Meaning, for each fully-populated row of data they wished to gather for final presentation/writing/whatever, they would actually make four queries - one to each of the four tables - resulting in a lot of queries. My Main table's 15 M rows would have our Java app making a total of 60 M queries (4 * 15 M) - nice. That mainframe must have some serious gitty-up. Is there something about the load capabilities of mainframe databases I don't know about?
  4. Does it matter whether I do a LEFT OUTER vs. a RIGHT OUTER join, and switch the order of the tables in the query?
  5. I took a shot at SQL code for all four tables below. It seems like it works for up to 3 tables, but I think it's more dumb luck than anything else. I tried the same syntax for the fourth table, and no go. I effectively lose my outer join -> my resulset goes from 30+k rows to 2k.
SELECT             *
FROM               t4, t3, t2 
RIGHT OUTER JOIN   t1
ON       t1.PK1 = t2.PK1 AND t1.PK2 = t2.PK2 AND t2.PK3 = <rt_val>
WHERE    t1.PK1 = <rt_val>   AND
         t3.PK1=t1.PK1 AND t3.PK1=t1.PK2 AND t3.COL=<rt_val> AND 
         t4.PK1=t1.PK1 AND t4.PK1=t1.PK2 AND t4.COL=<rt_val>

Thanks for any suggestions... Received on Thu Oct 03 2002 - 06:07:30 CEST

Original text of this message