Re: more efficient sql

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: 7 Nov 2006 04:48:14 -0800
Message-ID: <1162903694.362994.198520@b28g2000cwb.googlegroups.com>

Charles Hooper wrote:
> Jeff Kish wrote:
> > Hi.
> >
> > I'm trying to create a table that contains for the user rows that reflect all
> > the tables that contain matching column names.
> >
> > I am using this but it has duplicate in it, and also takes over a minute to
> > run.
> >
> > Is there a spiffier way of making this table from the user_tab_columns table?
> >
> > insert into usetabcolumns2TempTable ( select a.table_name TABLE_NAME_START,
> > a.column_name COLUMN_NAME, b.table_name TABLE_NAME_END from user_tab_columns
> > a, user_tab_columns b where b.table_name != a.table_name and b.column_name =
> > a.column_name )
> >
> > I'm targetting Oracle 9 onwards.
> >
> > Much Appreciated,
> > Jeff
> > Jeff Kish
>
> I am not quite sure why you would want to do this, but, you should be
> able to do as you would like using analytical functions. For example:
> SELECT
> TABLE_NAME,
> COLUMN_NAME,
> MIN(TABLE_NAME) OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME)
> TABLE_NAME_START,
> MAX(TABLE_NAME) OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME)
> TABLE_NAME_END,
> COUNT(*) OVER (PARTITION BY COLUMN_NAME) NUM_TABLES
> FROM
> USER_TAB_COLUMNS;
>
> MIN, MAX, and COUNT in this case provide you with the first table name,
> last table name, and the number of tables containing the column, when
> grouped by the column name.
>
> If you only want cases where a column is used in more than one table,
> you can slide the above into an inline view like this:
> SELECT
> *
> FROM
> (SELECT
> TABLE_NAME,
> COLUMN_NAME,
> MIN(TABLE_NAME) OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME)
> TABLE_NAME_START,
> MAX(TABLE_NAME) OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME)
> TABLE_NAME_END,
> COUNT(*) OVER (PARTITION BY COLUMN_NAME) NUM_TABLES
> FROM
> USER_TAB_COLUMNS)
> WHERE
> NUM_TABLES>1;
>
> Slight reformatting should give you what you want. There likely is a
> more efficient method.
>
> Charles Hooper
> PC Support Specialist
> K&M Machine-Fabricating, Inc.

Moving in a slightly different direction, as the previous may not SQL statements may not have provided what you need. Another experiment with analytical functions:
SELECT
  TABLE_NAME,
  COLUMN_NAME,
  ROW_NUMBER() OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME) ROW_POSITION,
  COUNT(*) OVER (PARTITION BY COLUMN_NAME) NUM_TABLES FROM
  USER_TAB_COLUMNS; The above lists all tables and columns in USER_TAB_COLUMNS, along with the total number of tables containing the same column, and the relative position of each table from 1 to the number of tables when sorted by the table name.

Now, eliminate those columns that only appear in one table: SELECT
  TABLE_NAME,
  COLUMN_NAME,
  ROW_POSITION,
  NUM_TABLES
FROM
  (SELECT
    TABLE_NAME,
    COLUMN_NAME,
    ROW_NUMBER() OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME) ROW_POSITION,
    COUNT(*) OVER (PARTITION BY COLUMN_NAME) NUM_TABLES   FROM
    USER_TAB_COLUMNS)
WHERE
  NUM_TABLES>1; If we then slide the above into an inline view, repeated twice and join the tables, it would look like this:
SELECT

  TC1.TABLE_NAME FIRST_TABLE,
  TC2.TABLE_NAME SECOND_TABLE,
  TC1.COLUMN_NAME,
  TC1.ROW_POSITION FIRST_ROW_POSITION,
  TC2.ROW_POSITION SECOND_ROW_POSITION,
  TC1.NUM_TABLES

FROM
(SELECT
  TABLE_NAME,
  COLUMN_NAME,
  ROW_POSITION,
  NUM_TABLES
FROM
  (SELECT
    TABLE_NAME,
    COLUMN_NAME,
    ROW_NUMBER() OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME) ROW_POSITION,
    COUNT(*) OVER (PARTITION BY COLUMN_NAME) NUM_TABLES   FROM
    USER_TAB_COLUMNS)
WHERE
  NUM_TABLES>1) TC1,
(SELECT
  TABLE_NAME,
  COLUMN_NAME,
  ROW_POSITION,
  NUM_TABLES
FROM
  (SELECT
    TABLE_NAME,
    COLUMN_NAME,
    ROW_NUMBER() OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME) ROW_POSITION,
    COUNT(*) OVER (PARTITION BY COLUMN_NAME) NUM_TABLES   FROM
    USER_TAB_COLUMNS)
WHERE
  NUM_TABLES>1) TC2
WHERE
  TC1.COLUMN_NAME=TC2.COLUMN_NAME
  AND TC1.ROW_POSITION<TC2.ROW_POSITION;

Note that I check to see if the ROW_POSITION from the first inline view is less than that of the second - this eliminates the duplicates.

A more efficient method is as follows, where the inline view only needs to be resolved once - this may not work in all query environments:

WITH TC AS
(SELECT
  TABLE_NAME,
  COLUMN_NAME,
  ROW_POSITION,
  NUM_TABLES
FROM
  (SELECT
    TABLE_NAME,
    COLUMN_NAME,
    ROW_NUMBER() OVER (PARTITION BY COLUMN_NAME ORDER BY TABLE_NAME) ROW_POSITION,
    COUNT(*) OVER (PARTITION BY COLUMN_NAME) NUM_TABLES   FROM
    USER_TAB_COLUMNS)
WHERE
  NUM_TABLES>1)
SELECT

  TC1.TABLE_NAME FIRST_TABLE,
  TC2.TABLE_NAME SECOND_TABLE,
  TC1.COLUMN_NAME,
  TC1.ROW_POSITION FIRST_ROW_POSITION,
  TC2.ROW_POSITION SECOND_ROW_POSITION,
  TC1.NUM_TABLES

FROM
  TC TC1,
  TC TC2
WHERE
  TC1.COLUMN_NAME=TC2.COLUMN_NAME
  AND TC1.ROW_POSITION<TC2.ROW_POSITION;

Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc. Received on Tue Nov 07 2006 - 06:48:14 CST