Re: External table performance

From: William Robertson <william_at_williamrobertson.net>
Date: Sat, 13 Jan 2018 11:40:07 +0000
Message-Id: <1D8DEF8C-7E1B-4804-A902-94F0C676AB54_at_williamrobertson.net>



On the subject of Parallel Query, I haven’t tested this but I would not expect it to be very useful for a single file, as each PQ slave still has to read the whole file from top to bottom, and the overhead of coordinating PQ sessions tends to outweigh any benefit for small amounts of data.

If the file was huge, then in theory it might be worth having a pre-processing step split it into multiple files using an OS utility, so that external PQ slaves could read one file each (similar to partitioning), but as the file is so small in this case I doubt you would gain much from the extra complication.

I would look at those selection criteria, in case the time is somehow going in parsing the file rather than simply reading it. Is a plain “select count(*)” or “select *” with no filtering criteria any faster? Are there any clues in the execution plan or the external table logfile? Anything unusual in the external table definition?

Are you just reading the file once or multiple times? If multiple times then it would be worth staging to a database table.

I don’t see how indexes will ever be possible on external tables, unless the data resided in multiple OS files and the “external index" indicated which files contained a particular value. It would have to build the index the first time the file was read, adding overhead, and then track the files' timestamp/size/checksum on subsequent reads to make sure they hadn’t changed. (Or if the index is in the database and contains all of the data, then it’s in effect a materialised view, which you can create already.) I doubt this would ever be useful enough to anyone to be worth Oracle’s time adding at as a feature.

William

On Fri, Jan 12, 2018 at 2:31 PM Sanjay Mishra <dmarc-noreply_at_freelists.org> wrote:
> Mark
>
> Oracle 12.1.2 and it is Single Node RAC instance and is Enterprise Edition. The issue with Select is where Clause for 2-3 column where Index can definitely be helpful but External table dont allow index. Select was Tried with Parallel but no difference in terms of performance. One option trying is MV and as data is not much but table is extensively used for various purpose.
>
> Sanjay
>

>> ===> It is single table Select 
>> select columns
>> from   table 
>> where  state_cd=:cid
>> and    DateCol between <date1> and <date2>
>> and    county_id in (<list of 3 Countie>)
>> and    status_cd in (<list of 4Status>);

--
http://www.freelists.org/webpage/oracle-l
Received on Sat Jan 13 2018 - 12:40:07 CET

Original text of this message