Re: Need to query 4 tables....3 are 'subordinates'...(need outer join?)

From: Alvin SIU <alvinsiu_at_hkstar.com>
Date: Fri, 04 Oct 2002 00:10:58 +0800
Message-ID: <3D9C6C12.783587C2_at_hkstar.com>


Bomb Diggy wrote:

> This is a database/app design question as best I can figure. I have a
> main table called Main which has about 15M rows. It has a two-column
> primary key. I want to collect all the information I can find for all
> the rows on one-half of that primary key, and then hit the three
> 'subordinate' tables to get extra information. The pk of table Main
> is 2 parts of the 3-part primary keys of each of the remaining 3
> tables. I will know the 3rd part of the remaining pk's at runtime (no
> worries).
>
> Is there a best approach to gather the information I need? For
> instance, I think I finally got my outer join to work for two tables
> (the master and one subordinate), and it's already over my head
> (shoot, don't even know if it's technically correct, nor how to prove
> it sufficiently one way or the other!).
>
> OUTER JOIN W/ ONLY TWO TABLES
> -----------------------------------
> SELECT *
> FROM t2 LEFT OUTER JOIN t1
> ON t1.PK1 = t2.PK1 AND
> t1.PK2 = t2.PK2 AND
> t2.PK2 = <rt_value>
> WHERE t1.PK1 = <rt_value>
>
> So, the questions are:
>
> 1. Is this hub-and-spoke design common? I can see the logic of
> normalizing the data, but this query seems like it's going to grow to
> the point where nobody else on my team can understand it - starting
> with me.
>
> 2. Is it possible to do a multi-table outer join? Is that what I
> need to do? Do I have other SQL options? Subqueries? Temp tables?
>
> 3. I'm working w/ some mainframe programmers who say they never write
> crazy SQL. Instead, they just go out and get the data for each row
> one table at a time. Meaning, for each fully-populated row of data
> they wished to gather for final presentation/writing/whatever, they
> would actually make four queries - one to each of the four tables -
> resulting in a lot of queries. My Main table's 15 M rows would have
> our Java app making a total of 60 M queries (4 * 15 M) - nice. That
> mainframe must have some serious gitty-up. Is there something about
> the load capabilities of mainframe databases I don't know about?

Yes, the words of your mainframe programmers is correct. Mainframe is a computer shared by many many users. Therefore, each job cannot be allocated too much CPU nor memory. For huge data, instead of join, we will make a SQL1 on table1 to build a cursor.
Then, loop through the whole cusor for each recod, make a SQL2 to table2 to build another cursor.
Then, go on with table3 and table4.

If SQL1 return 20 records and so do SQL2, SQL3 & SQL4, a total of 80 records will sit in the mainframe at any one time. However, using join, a total of 20 x 20 x 20 x 20 = 160,000 records will sit in the mainframe before you process the 1st record.

That's why the mainframe programming will not use crazy join.

I will do table join only on small tables with careful considerable on the number of records produced by the join.

Mainframe does not have those 'client-server' concept as Java. After you submit a job to mainframe, the program will run on its own. The user does not need to interatively 'talk' with the program. The program will visit the database, do the processing, write reports, etc.
Then, when the job finishes, the user simply goes to examine the report. Because the database, disk, I/O, ... are all inside the mainframe, such SQL queries is very fast.

>
>
> 4. Does it matter whether I do a LEFT OUTER vs. a RIGHT OUTER join,
> and switch the order of the tables in the query?
>
> 5. I took a shot at SQL code for all four tables below. It seems
> like it works for up to 3 tables, but I think it's more dumb luck than
> anything else. I tried the same syntax for the fourth table, and no
> go. I effectively lose my outer join -> my resulset goes from 30+k
> rows to 2k.
>
> SELECT *
> FROM t4, t3, t2
> RIGHT OUTER JOIN t1
> ON t1.PK1 = t2.PK1 AND t1.PK2 = t2.PK2 AND t2.PK3 = <rt_val>
> WHERE t1.PK1 = <rt_val> AND
> t3.PK1=t1.PK1 AND t3.PK1=t1.PK2 AND t3.COL=<rt_val> AND
> t4.PK1=t1.PK1 AND t4.PK1=t1.PK2 AND t4.COL=<rt_val>
>
> Thanks for any suggestions...
Received on Thu Oct 03 2002 - 18:10:58 CEST

Original text of this message