Oracle Row Generator Techniques

From Oracle FAQ
Jump to: navigation, search

Sometimes you need a table with N rows as a seed for a dataset. The content of the records doesn't matter. They just need to be there. For example, suppose you need a table with enough records to create a record for each day of this year, how would you go about it?

This concept is known as a row generator.

Also see the forum page: row generator

Pivot Table[edit]

As it is discussed in Oracle Magazine (Sept. 2002, no more available on line), you can physically create a table containing the number of rows you like.

This is straightforward but...it is limited. How many rows should you create to satisfy your needs? Is 100 enough? 1000 rows? This article discusses several ways of making things more flexible.

ALL_OBJECTS[edit]

One way is searching for a table with enough records in it already, like ALL_OBJECTS. Then we could do the following:

SELECT ROWNUM n
FROM ALL_OBJECTS
WHERE ROWNUM <= 365

But what if ALL_OBJECTS doesn't contain enough rows for your needs? You could use a Cartesian join to increase the number of records returned.

GROUP BY CUBE[edit]

But Oracle has evolved and it would be a pity if we didn't use one of those features, after all the access of ALL_OBJECTS can be a performance killer.

Browsing through the Oracle SQL Reference we come across the following passage:

The CUBE operation in the simple_grouping_clause groups the selected rows based on the values of all possible combinations of expressions in the specification, and returns a single row of summary information for each group. You can use the CUBE operation to produce cross-tabulation values.

For example, given three expressions (n=3) in the CUBE clause of the simple_grouping_clause, the operation results in 2^n = 2^3 = 8 groupings. Rows grouped on the values of 'n' expressions are called regular rows, and the rest are called superaggregate rows.

So, suppose we have a table with 1 row and add enough expressions to get to 365 rows we can get there too.

First, we have to find out how many expressions are needed for this:

SQL> select log(2,365) x
  2  from dual
  3  /
 
         X
----------
8.51175265

SQL> select power(2,8) x
  2  from dual
  3  /

         X
----------
       256

SQL> select power(2,9) x
  2  from dual
  3  /

         X
----------
       512

As we can see in the result of our first query 8 expressions wouldn't be enough. Our second and third queries confirm this. We have also found our table of 1 record: it should always be available and it is called DUAL. With this knowledge we can construct our 365 row table:

SELECT ROWNUM n
FROM ( SELECT 1 just_a_column
       FROM dual
       GROUP BY CUBE(1,2,3,4,5,6,7,8,9) )
WHERE ROWNUM <= 365

It works, but the performance drops dramatically once you increase the number of group expressions. Another problem is that depending on your demand you can get far too many records out of the inner select and that is eating resources. If it's a small number of rows, this is a viable option though.

CONNECT BY LEVEL[edit]

A final pure SQL solution lies in the (ab)use of hierarchical queries. We can make a 'recursive hierarchy' using, yet again, dual:

SELECT ROWNUM n
FROM   ( SELECT 1 just_a_column
         FROM   dual
         CONNECT BY LEVEL <= 365
       )

or from 10g:

SELECT LEVEL just_a_column
FROM dual
CONNECT BY LEVEL <= 365

The theory behind this is that you connect a row to every row of a level higher by omitting the PRIOR keyword. You have no restriction on joining children to parent rows. And if we omit the START WITH clause too, we indicate that ALL records occur at level 1. With a table of one row this means the following: one record (in fact all records) are at root level (level 1) because we have no START WITH clause. And at each iteration your hierarchical tree gets deeper (and the LEVEL pseudo column increases) because you match all records of the table to all records of the level above. We stop the recursive hierarchy creation by our constraint. At LEVEL 366 the condition isn't matched anymore and the query exits its recursive loop.

It should be noted that in later versions of oracle, at least as far back as 10gR1, operations against dual are optimized such that they require no logical or physical I/O operations. This makes them quite fast.

For this mechanism to work we need a table with only one row, like noted before. If we try it with two rows and limit the depth to LEVEL 3 we get:

2*LEVEL1 (all records of the table)

4*LEVEL2 (all records of the table (2 records) get matched to the higher level (2 records))

8*LEVEL3 (all records of the table (2 records) get matched to the higher level (4 records))

This results in a total of 14 records.

SQL> SELECT LEVEL, therow
  2  FROM ( SELECT 1 therow
  3         FROM dual
  4         UNION ALL
  5         SELECT 2 therow
  6         FROM dual )
  7  CONNECT BY LEVEL <= 3
  8  ORDER BY LEVEL, therow
  9  /

     LEVEL     THEROW
---------- ----------
         1          1
         1          2
         2          1
         2          1
         2          2
         2          2
         3          1
         3          1
         3          1
         3          1
         3          2 
         3          2
         3          2
         3          2

14 rows selected.

This functionality seems to work for larger numbers too but note that this function will not work in Oracle versions prior to 9i! In Oracle 8i for example you get "ORA-01436: CONNECT BY loop in user data". To prevent from this add the condition "AND PRIOR dbms_random.value IS NOT NULL" is tne CONNECT BY clause.

To get the numbers between 2 values you can use:

SELECT &initial_value + LEVEL -1 n
FROM dual       
CONNECT BY LEVEL <= &final_value  - &initial_value + 1
/

PL/SQL table or pipelined function[edit]

Oracle has another neat feature that can be of help here. It is called 'TABLE FUNCTIONS'. We could create a function that would return a set of sequential numbers. This option requires two database objects:

  1. A user-defined table type
  2. The function itself
CREATE OR REPLACE TYPE numtab_type IS TABLE OF NUMBER;
CREATE FUNCTION rowgen(pin_nor IN NUMBER)
RETURN numtab_type DETERMINISTIC
AS
  v_numtab numtab_type;
BEGIN
  v_numtab := new numtab_type();
  FOR i IN 1..pin_nor
  LOOP
    v_numtab.extend();
    v_numtab(i) := i;
  END LOOP;
  
  RETURN v_numtab;
END rowgen;
/

Note that the function has been created DETERMINISTIC. This is an indication for Oracle that for the same input the output will also be the same. We could also opt to make the function PIPELINED. That would look like this:

CREATE FUNCTION rowgen(pin_nor IN NUMBER)
RETURN numtab_type DETERMINISTIC PIPELINED
AS
BEGIN
  FOR i IN 1..pin_nor
  LOOP
    PIPE ROW(i);
  END LOOP;
  
  RETURN;
END rowgen;
/

Once the objects are in place, we can query the function as if it was a table:

SELECT ROWNUM n FROM TABLE(rowgen(365));

Both the normal and the pipelined version work the same way. Once in place there's no difference in usage, but depending on the number of rows you want to return you might want to consider a pipelined function.

The drawbacks of this approach are that

  1. you need to create some extra database objects.
  2. you will have a little overhead due to the context switch from SQL to PL/SQL.

On the other hand, the use of a function

  1. is more elegant, it is more obvious what you are after. Especially if you choose an appropriate name for your function.
  2. is more flexible, it is instantly reusable.

MODEL Clause[edit]

A row generator can also be created using the SELECT command's MODEL clause (introduced in Oracle 10g) like this.

SQL> define N=5
SQL> select integer_value
  2  from dual
  3  where 1 = 2
  4  model
  5    dimension by (0 as key)
  6    measures (0 as integer_value)
  7    rules upsert 
  8      (integer_value [for key from 1 to &N increment 1] = cv(key))
  9  /
INTEGER_VALUE
-------------
            1
            2
            3
            4
            5

Another use of MODEL is:

SQL> define N=5
SQL> select val
  2  from dual
  3  where 1 = 2
  4  model
  5    dimension by (0 as rn)
  6    measures (0 as val)
  7  rules iterate (&N)
  8    ( val[iteration_number] = iteration_number+1 )
  9  /
       VAL
----------
         1
         2
         3
         4
         5

Note that the first query internally generates N rules that will generate each one 1 row whereas the second query iterates N times a rule generating a row at each iteration. The first query will then take more and more memory as N increases and the second one should be preferred.

To get the numbers between 2 values you can use:

SELECT num
FROM dual
MODEL RETURN UPDATED ROWS
DIMENSION BY (1 num)
MEASURES ( &initial_value i1, &final_value i2 )
RULES ( i1[for num from i1[1] to i2[1] increment 1] =  null )
/

Recursive query[edit]

11g introduces recursive query (that has been known for years in DB2). This feature allows to recursively reapplies a query from a starting one and so is a good solution to generate numbers and rows.

SQL> define N=5
SQL> WITH data(r) AS (
  2    SELECT 1 r FROM dual
  3    UNION ALL
  4    SELECT r+1 FROM data WHERE r < &N
  5    )
  6  SELECT r FROM data
  7  /
         R
----------
         1
         2
         3
         4
         5

Conclusion[edit]

Oracle has several options to generate rows. A pure SQL solution comes in different flavours, but a pl/sql solution might be equally elegant.

How about this "annual table"? Just take the option of your preference and replace

SELECT ROWNUM n

with

SELECT TRUNC(SYSDATE,'Y')+ROWNUM-1 thedate

Of course, you have to take leap years into consideration, but that's just tweaking. Add a where clause to limit the output to the current year.

SELECT TRUNC(SYSDATE,'Y')+ROWNUM-1 THEDATE
FROM   ( SELECT 1 just_a_column
         FROM dual
         CONNECT BY LEVEL <= 366
       )
WHERE  ROWNUM <= ADD_MONTHS(TRUNC(SYSDATE,'Y'),12)-TRUNC(SYSDATE,'Y')
/

Another common question that is answered using a row generator is the rows to columns transformation. The first link of the "External Links" section gives you a detailed example of this.

External Links[edit]