Could Perl's DBI (DBD::Oracle) be faster than SQL*Plus?

From: Adam Monsen <adamm_at_wazamatta.com>
Date: Tue, 14 May 2002 15:48:37 -0700
Message-ID: <3CE19445.8050801_at_wazamatta.com>


I was playing with dumping data from an Oracle 8i database to a CSV formatted file. The query executed very quickly, but because of either writing to disk or going over the network, fetching all the results took about an hour and wrote out a 275M to disk. I executed it in dbish (DBI::Shell) like this:

dbish> select id,email from users where email like '%_at_hotmail.com' dbish> / > tmpfile.csv

(This executes the query and writes the output to tmpfile.csv) the query did a full table scan, an explain plan from TOra (http://tora.sf.net) showed the cost of the query to be about 200,000. About 1/3 the total rows of the users table was returned and written to the flatfile. The flatfile looked something like this:

...

'39285','joe_at_hotmail.com'
'59192','moe_at_hotmail.com'
'359192','foo_at_hotmail.com'

...

Just for fun, and in hopes of dumping data efficiently in the future, I decided to see if SQL*Plus could do this any faster.

Getting SQL*Plus to output CSV is a bit of a chore. I used this script to generate the flatfile:

  • hints from http://www.jlcomp.demon.co.uk/faq/flatfile.html -- set pause off
  • suppress terminal output set termout off
  • suppress timing information set timing off
  • suppress header set heading off
  • suppress showing sql in result set set echo off
  • eliminate row count message set feedback off
  • make line long enough to hold data set linesize 9000
  • suppress headings and page breaks set pagesize 0
  • eliminate SQL*Plus prompt from output set sqlprompt ''
  • eliminate trailing blanks set trimspool on spool tmpfile.csv

SELECT id||','||email

   FROM users
  WHERE UPPER(email) like UPPER('%_at_hotmail.com') /

I then timed how long it took for Perl (via dbish) vs. the time it took for SQL*Plus to write one megabyte of data to disk. Perl took about twenty seconds to write a meg of data, SQL*Plus took about one minute.

Yikes! Why the discrepancy? Is this test even valid? Is DBI::Shell really faster at dumping data than SQL*Plus?

Thank you,
Adam Monsen Received on Wed May 15 2002 - 00:48:37 CEST

Original text of this message