Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: Estimate the size of a table in advance

Re: Estimate the size of a table in advance

From: DI Karl Heinz Hörmann <kh.hoermann_at_penta.at>
Date: Tue, 25 Jan 2000 05:59:45 +0100
Message-ID: <86jaa8$83v$1@newsmaster01.magnet.at>

extern13_at_my-deja.com schrieb in Nachricht <861gq0$u7u$1_at_nnrp1.deja.com>...

>I want to estimate the size in KB/MB of a table to use in the storage
>section in the "create table" command. The size is based on the
>estimated size of a row, initial number of rows and growth rate. Also I
>would like to be able to get some size figures to estimate the size of
>the indexes I need to create for a new table.


Some hints from Appendix A of the DBA-Manual, which covers :

A Space Estimations for Schema Objects
Estimating Space Required by Non-Clustered Tables Estimating Space for Indexes
Estimating Space Required by Clusters
Estimating Space Required by Hash Clusters

Estimating Space Required by Non-Clustered Tables The procedures in this section describe how to estimate the total number of data blocks necessary to hold data inserted into a non-clustered table. Within this sample calculation, no concurrency is assumed, and users are not performing intervening delete or update operations.




Note:

This is a best-case scenario only when users insert rows without performing deletes or updates.



Typically, the space required to store a set of rows will exceed this calculation when updates and deletes are also being performed on the table. The actual space required for complex workloads is best determined empirically, and then scaled by the number of rows in the table. In general, increasing amounts of concurrent activity on the same data block results in additional overhead (for transaction records), so it is important that you take into account such activity when scaling empirical results.

To Calculate Space Required by Non-Clustered Tables

Calculate the total block header size.
Calculate the available data space per data block. Calculate the space used per row.
Calculate the total number of rows that will fit in a data block. Step 1: Calculate the Total Block Header Size The space required by the data block header is the result of the following formula:

Space after headers (hsize)
=
DB_BLOCK_SIZE - KCBH - UB4 - KTBBH - ((INITRANS - 1) * KTBIT) - KDBH Where:

DB_BLOCK_ SIZE   is the database block size as viewed in the V$PARAMETER view

KCBH, UB4, KTBBH, KTBIT,KDBH   are constants whose sizes you can obtain by selecting from entries in the V$TYPE_SIZE view

INITRANS   is the initial number of transaction entries allocated to the table

Step 2: Calculate the Available Data Space Per Data Block The space reserved in each data block for data, as specified by PCTFREE, is calculated as follows:

available data space (availspace)
=
CEIL(hsize * (1 - PCTFREE/100)) - KDBT

Where:

CEIL   rounds a fractional result to the next highest integer

PCTFREE   is the percentage of space reserved for updates in the table

KDBT   is a constant whose size you can obtain by selecting the entry from the V$TYPE_SIZE view




Note:

If you are unable to locate the value of KDBT, use the value of UB4 instead.



Step 3: Calculate the Space Used per Row Calculating the amount of space used per row is a multi-step task.

First, you must calculate the column size, including byte lengths:

Column size including byte length
=
column size + (1, if column size < 250, else 3)



----Note: You can also determine column size empirically, by selecting
avg(vsize(colname)) for each column in the
able.  ---------------------------------------------------------------------

Then, calculate the row size:

Rowsize
=
row header (3 * UB1) + sum of column sizes including length bytes

Finally, you can calculate the space used per row:

Space used per row (rowspace)
=
MAX(UB1 * 3 + UB4 + SB2, rowsize) + SB2

Where:

UB1, UB4, SB2

  are constants whose size can be obtained by selecting entries from the V$TYPE_SIZE view

When the space per row exceeds the available space per data block, but is less than the available space per data block without any space reserved for updates (for example, available space with PCTFREE=0), each row will be stored in its own block.

When the space per row exceeds the available space per data block without any space reserved for updates, rows inserted into the table will be chained into 2 or more pieces, hence, this storage overhead will be higher.

Figure A-1 depicts elements in a table row.

Figure A-1 Calculating the Size of a Row Step 4: Calculate the Total Number of Rows That Will Fit in a Data Block You can calculate the total number of rows that will fit into a data block using the following equation:

Number of rows in block
=
FLOOR(availspace / rowspace)

Where:

FLOOR   rounds a fractional result to the next lowest integer

In summary, remember that this procedure provides a reasonable estimate of a table's size, not an exact number of blocks or bytes. After you have estimated the size of a table, you can use this information when specifying the INITIAL storage parameter (size of the table's initial extent) in your corresponding CREATE TABLE statement.

See Also: See your operating system-specific Oracle documentation for any substantial deviations from the constants provided in this procedure.

Space Requirements for Tables in Use
After a table is created and in use, the space required by the table is usually higher than the estimate derived from your calculations. More space is required due to the method by which Oracle manages free space in the database.

Estimating Space for Indexes
The calculations in the procedure rely on average column lengths of the columns that constitute an index; therefore, if column lengths in each row of a table are relatively constant with respect to the indexed columns, the estimates calculated by the following procedure are more accurate. Also, the following factors can impact the accuracy of your calculations:

port-specific variables
the 5 per cent multiplication factor in the branch blocks portion, which is an arbitrary variable
internal fragmentation
To Estimate Space for Indexes

Calculate the total block header size.
Calculate the available data space per data block. Calculate the combined column lengths of an average index value. Calculate the total average index value size. Calculate the number of blocks and bytes required for the index.




Note:

Several calculations are required to obtain a final estimate, and several of the constants (indicated by *) provided are operating system-specific. Your estimates should not significantly differ from actual values.



See Also: See your operating system-specific Oracle documentation for any substantial deviations from the constants provided in the following procedure.

Step 1: Calculate the Total Block Header Size Figure A-2 shows the elements of an index block used in the following calculations. The space required by the data block header of a block to contain index data is given by the formula:

block header size = fixed header + variable transaction header

where:

fixed header

  113 bytes

variable transaction header

  24*I is the value of INITRANS for the index.

If INITRANS =2 (the default for indexes), the previous formula can be simplified:

block header = 113 + (24*2) bytes

Figure A-2 Calculating the Space for an Index

Step 2: Calculate Available Data Space Per Data Block The space reserved in each data block for index data, as specified by PCTFREE, is calculated as a percentage of the block size minus the block header:

available data = (block size - block header) - space per block ((block size - block header)*(PCTFREE/100))

The block size of a database is set during database creation and can be determined using the Server Manager command SHOW, if necessary:

SHOW PARAMETERS db_block_size;

If the data block size is 2K and PCTFREE=10 for a given index, the total space for new data in data blocks allocated for the index is:

available data space per block

Step 3: Calculate Combined Column Lengths The space required by the average value of an index must be calculated before you can complete Step 4, calculating the total row size. This step is identical to Step 3 in the procedure for calculating table size, except you only need to calculate the average combined column lengths of the columns in the index.

Step 4: Calculate Total Average Index Value Size Figure A-3 shows elements of an index entry used in the following calculations. Once you have calculated the combined column length of an average index entry, you can calculate the total average entry size according to the following formula:

bytes/entry = entry header + ROWID length + F + V + D

Where:

entry header

  2 bytes

ROWID length

  6 bytes

F

  Total length bytes of all columns that store 127 bytes or less. The number of length bytes required by each column of this type is 1 byte.

V

  Total length bytes of all columns that store more than 127 bytes. The number of length bytes required by each column of this type is 2 bytes.

D

  Combined data space of all index columns (from Step 3).

Figure A-3 Calculating the Average Size of an Index Entry

For example, given that D is calculated to be 22 bytes and that the index is comprised of three VARCHAR(10) columns, the total average entry size of the index is:

avg. entry size = 2 + 6 + (1 * 3) + (2 * 0) + 22 bytes


----Note: For a non-unique index, the ROWID is considered another column, so
it must have one length
yte.  ----------------------------------------------------------------------

Step 5: Calculate Number of Blocks and Bytes Calculate the number of blocks required to store the index using the following formula:

# blocks for index =




Note:

The additional 5% added to this result (by means of the multiplication factor of 1.05) accounts for the extra space required for branch blocks of the index.



For example, continuing with the previous example, and assuming you estimate that indexed table will have 10000 rows that contain non-null values in the columns that constitute the index:

# blocks for index =

This results in 204 blocks. The number of bytes can be calculated by multiplying the number of blocks by the data block size.

Remember that this procedure provides a reasonable estimate of an index's size, not an exact number of blocks or bytes. Once you have estimated the size of a index, you can use this information when specifying the INITIAL storage parameter (size of the index's initial extent) in your corresponding CREATE INDEX statement.

CreatingTemporary Space Required for Index When creating an index for a loaded table, temporary segments are created to sort the index. The amount of space required to sort an index varies, but can be up to 110% of the size of the index.




Note:

Temporary space is not required if the NOSORT option is included in the CREATE INDEX command. However, you cannot specify this option when creating a cluster index.



Received on Mon Jan 24 2000 - 22:59:45 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US