PHYSICAL DATABASE CHECKLIST

CONSIDERATIONS FOR DSS/GIS ENVIRONMENT

Prepared by P. Stankus

Last Update 2/24/2003

REDO LOGGING: 

 

ARCHIVELOG MODE:

 

 

DATABASE CREATION ISSUES THAT CANNOT BE CHANGED AFTER CREATION W/O GREAT DIFFICULTY:

 

 

SIZING DATABASE OBJECTS:

 

 

UNIX DATA LOADING TIPS:

 

 

CONFIGURATION PARAMETERS:

 

 

 

DATA WAREHOUSE:

 

·        Avoid setting up the data warehouse with objects that have to be updated.  If objects do not have to be updated, set PCTFREE=0 – blocks are packed full, so sequential scans read fewer blocks because no space is wasted on update headroom.
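
For example (a sketch only; table and tablespace names are illustrative):

CREATE TABLE sales_detail (
  sale_date  DATE,
  store_id   NUMBER,
  amount     NUMBER
)
PCTFREE 0                -- reserve no free space for updates; pack each block
TABLESPACE dw_data;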

·        Analyze objects after creating them, unless the load window is too short.  With partitioning you can analyze just the new partition using:  ANALYZE TABLE <TABLENAME> PARTITION (<PARTITION_NAME>) ESTIMATE STATISTICS SAMPLE 10 PERCENT;  10 percent is usually sufficient.

·        Conduct architectural feasibility tests with generated data as early in the process as possible.  Using generated data means the load and conversion processes do not all have to be finished first:
-test query response
-test time to do incremental loads
-test time to load
-test time to back up and recover
-test time to analyze

·        Index rebuilds – ensure there is double the amount of space the existing index segments take.  If only a small number of rows are being loaded, or there is not enough space to rebuild, the SINGLEROW option of a direct-path load may be faster since it maintains the indexes during the load.
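
A typical post-load rebuild looks like this (index and tablespace names are illustrative); note the rebuild needs roughly double the space of the existing index segment while it runs:

ALTER INDEX sales_fact_pk REBUILD
  NOLOGGING              -- minimize redo, in the same spirit as UNRECOVERABLE loads
  TABLESPACE dw_index;   -- optionally rebuild into a different tablespace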

·        Check the STATUS column of DBA_INDEXES to make sure indexes are still usable after the load.
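
For example, to list any indexes or index partitions left unusable by a direct-path load:

SELECT owner, index_name, status
  FROM dba_indexes
 WHERE status NOT IN ('VALID', 'N/A');    -- 'N/A' is what partitioned indexes report here

SELECT index_owner, index_name, partition_name, status
  FROM dba_ind_partitions
 WHERE status <> 'USABLE';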

·        Use a temporary working area to do all aggregates/manipulations, then populate the new table only at the end of the process (there are various ways to do this).
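
One common approach (a sketch; names are illustrative) is to build the aggregate as a scratch table with CREATE TABLE ... AS SELECT and move it into the final table at the end:

CREATE TABLE sales_month_stage
  NOLOGGING PCTFREE 0
  TABLESPACE dw_work
AS
SELECT store_id,
       TRUNC(sale_date, 'MM') AS sale_month,
       SUM(amount)            AS total_amount
  FROM sales_fact
 GROUP BY store_id, TRUNC(sale_date, 'MM');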

·        Restrict ad-hoc use of SQL*Plus through the PRODUCT_USER_PROFILE table.
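
For example, to stop a reporting account (user name is illustrative) from issuing UPDATE or DELETE from SQL*Plus:

INSERT INTO system.product_user_profile (product, userid, attribute, char_value)
  VALUES ('SQL*Plus', 'REPORT_USER', 'UPDATE', 'DISABLED');
INSERT INTO system.product_user_profile (product, userid, attribute, char_value)
  VALUES ('SQL*Plus', 'REPORT_USER', 'DELETE', 'DISABLED');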

·        Create one user that owns all objects and runs the conversion processes in production.

·        Rollback segments (RBS) are used only for load processes, since the database is read-only at all other times.  See the guidelines above for their number and size.

·        "Lock down" tablespaces after the load by making them read-only – especially with partitioning, so a backup only needs to cover the new partitions/tablespaces.  Use read-only mode on tablespaces.
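
For example (tablespace name is illustrative):

ALTER TABLESPACE sales_2002_q4 READ ONLY;
-- back it up once; to load into it again later:
ALTER TABLESPACE sales_2002_q4 READ WRITE;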

·        Using the SQL*Loader direct path means that rollback segments are not used at all during this phase.
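
A direct-path load is requested on the sqlldr command line (user, control file and log names are illustrative):

sqlldr userid=dw_owner/password control=sales_fact.ctl log=sales_fact.log direct=true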

·        Large queries can have serious temporary segment requirements, especially if PQO is used, because temp usage scales with the DEGREE of parallelism.  Consider a ratio of 5-10% of the total size of the data segments to start, and monitor.  Also look at SORT_AREA_SIZE.  If the ratio starts to approach 15-20% of total data size, question the validity of the queries and how they are written, and look for opportunities for denormalization and aggregation.
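
A rough check of the temp-to-data ratio (a sketch; assumes the temporary tablespace uses tempfiles – otherwise substitute the temporary tablespace's entries in DBA_DATA_FILES):

SELECT ROUND(100 * t.temp_bytes / d.data_bytes, 1) AS temp_pct_of_data
  FROM (SELECT SUM(bytes) AS temp_bytes FROM dba_temp_files) t,
       (SELECT SUM(bytes) AS data_bytes
          FROM dba_segments
         WHERE segment_type LIKE 'TABLE%'
            OR segment_type LIKE 'INDEX%') d;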

·        Logging may still be needed, so start with 3 groups of redo logfiles of 100MB apiece.
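
For example, if groups 1-3 do not already exist (file paths are illustrative; normally this would be specified in the LOGFILE clause of CREATE DATABASE):

ALTER DATABASE ADD LOGFILE GROUP 1 ('/u01/oradata/dw/redo01.log') SIZE 100M;
ALTER DATABASE ADD LOGFILE GROUP 2 ('/u02/oradata/dw/redo02.log') SIZE 100M;
ALTER DATABASE ADD LOGFILE GROUP 3 ('/u03/oradata/dw/redo03.log') SIZE 100M;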

·        Consider how you will populate the development or QA database.  We really cannot afford to store a HUGE data warehouse repeatedly.  We will start with a development database for testing WebDB, etc.

·        Run in ARCHIVELOG mode and run backups immediately after the database load.  Do most loading using the UNRECOVERABLE option to greatly minimize redo creation (empirically, redo information is still generated, just a much smaller amount).  Using RMAN allows for incremental backups, which further minimizes the amount of recovery needed, especially in a DSS environment.
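
A sketch of the sequence (assumes archive log destinations are already configured):

-- enable archiving; the database must be mounted but not open
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- after the load, take a level 0 backup, then incrementals between loads
RMAN> BACKUP INCREMENTAL LEVEL 0 DATABASE;
RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE;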

·        Place partitioned tables into separate tablespaces (ideally one tablespace per partition) to reduce the backup and recovery needed at any particular point in time.
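
For example (a sketch; table, column and tablespace names are illustrative):

CREATE TABLE sales_fact (
  sale_date  DATE,
  store_id   NUMBER,
  amount     NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION sales_2002_q4 VALUES LESS THAN (TO_DATE('2003-01-01','YYYY-MM-DD'))
    TABLESPACE ts_sales_2002_q4,
  PARTITION sales_2003_q1 VALUES LESS THAN (TO_DATE('2003-04-01','YYYY-MM-DD'))
    TABLESPACE ts_sales_2003_q1
);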

·        Look for opportunities to preaggregate, prejoin and partition!!

 

CONSIDER THE MAIN PHASES OF DATA WAREHOUSING:

 

·        Extraction from sources and moving data to final host

·        Transformation to dimensions, aggregates, etc.

Since the final tables will likely be aggregated and denormalized from the source, there will be a separate database (the ODS) for doing this type of activity, distinct from the final storage and presentation database.  The ODS will be the first database we set up and, depending on the requirements, its degree of normalization may change.

 

·        Bulk load into final database for reporting

·        Quality Assurance checks – pre-user and user QA.

·        Release/Publishing to end-user community

·        Controlled and scheduled updates (possibly)

·        Querying

·        Auditing

·        Security

·        Backup and Recovery

 

BATCH PROCESSING FOR DATA LOADS, REPORTING:

 

 

 

ON-GOING MONITORING:

 

·        tablespace fragmentation
  Can be completely avoided if INITIAL=NEXT and if all objects in a tablespace have uniformly sized extents – best if there is only one extent size per tablespace.  Also, place database objects into profiles of small, medium, large and perhaps huge.  See Craig Shallahamer's "Avoiding a Database Reorganization", Cary Millsap's "Oracle 7 Server Space Management", and "How to Stop Defragmenting and Start Living: The Definitive Word on Fragmentation" by Bhaskar Himatsingka and Juan Loaiza.
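
One way to enforce a single extent size per tablespace (a sketch; locally managed tablespaces are available from Oracle 8i onward; names and sizes are illustrative):

CREATE TABLESPACE dw_data_medium
  DATAFILE '/u04/oradata/dw/dw_data_medium01.dbf' SIZE 2000M
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 10M;   -- every extent in this tablespace is 10MB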

 

REFERENCES:

 

Top Partitioning Table Issues