Oracle FAQ Your Portal to the Oracle Knowledge Grid

Home -> Community -> Mailing Lists -> Oracle-L -> RE: Informatica Bulk Mode behavior

RE: Informatica Bulk Mode behavior

From: Thomas Jeff <>
Date: Tue, 27 Apr 2004 07:52:37 -0500
Message-ID: <>

Tim, Waleed, we got this response from Informatica Support:

We are using the OCI DPL (direct path load) in the following way:

First, We get a block with "x" # of rows.

Do following loop

   Call ColArrayToStream for these "x" rows.    Load whatever we can (call DirPathLoadStream)    Do a DirPathStreamReset.
while ColArrayToStream did not convert all the rows, repeat the above steps for the remainder rows.

Every so often, depending on our "commit" interval, we will call DirPathFinish (and then reprepare).

The 2 differences between our process and Oracle's demo program are: 1) the # of rows "x" may be different. They may be optimizing the "x" that they pick. We derive "x"=20
from the buffer block size.

2) Oracle does 1 DirPathFinish at the end of the process, whereas we do it on every commit. =20

Thus, we could probably simulate exactly what Oracle's test program is doing by making sure that the=20
# of rows in the block matches their test program and the target commit interval is really high.

So there seem to be 2 possible places where we may be causing more extents to be created than necessary. =20

  1. The last load in the loop may be less than the amount of rows that can fit in a transfer block. Not sure if this matters at all, but if the last load contains only 1 row, this may be a wasted allocated extent.
  2. The DirPathFinish DEFINITELY seems to be causing a new extent to be created.

We still have one X factor that we need to account for. The CR mentions that if the commit interval is high, we=20 sometimes get the following error:

ORA-01658: unable to create INITIAL extent for segment in table space=20 Our guess is that somehow we ran out of space to get an extent. (perhaps in the temp table space) Perhaps=20 this can be limited somehow if there is a way to turn off logging (if it was, indeed, on), but I didn't see=20
this option available. Although we haven't tested this yet, we would think that the sample program would have=20 similar issues, although it may take a little longer because they may not have wasted extents as listed in option=20 1) above.

The workaround fix is to play with the buffer block size and the commit size. If the buffer block size fits=20
the correct amount of rows, we may have a somewhat optimal fix. We may want to consider logging certain=20
information that may help the user pick a correct buffer size, but, of course, this can be kinda cryptic.

-----Original Message-----
[] On Behalf Of Khedr, Waleed Sent: Tuesday, April 27, 2004 12:17 AM
Subject: RE: Informatica Bulk Mode behavior

I use Informatica 6.x and so far did not have any problem. But you may be right, that this might have to do with the way =3D Informatica calls Oracle OCI.

The other way I see it, Informatica is giving bulk load as a relational =3D method to load the target instead of external loader, which must be = =3D
thread-safe to run without problems using their multi-thread options. = =3D
This leads to the issues we encountered when we run direct load in =3D parallel, which is different than parallel DML/insert in direct mode =3D where threads try to broker the HWM.



-----Original Message-----
From: Tim Gorman [] Sent: Monday, April 26, 2004 3:16 PM
Subject: Re: Informatica Bulk Mode behavior

Jeff and Waleed,

I think Informatica is blowing smoke on this error message, or at least =3D they are smoking something...

Each time you initiate a direct-path load operation, it will allocate a =3D new extent, but the arraysize or data save frequency has no bearing = on
size =3D or number of extents, whether using "direct-path" (a.k.a. = "bulk")
loads or "conventional" SQL INSERT loads. In other words, based on the behavior described, it is quite likely that Informatica has a bug in their =3D bulk-load functionality resulting from improper use of the Oracle Call Interface =3D
application programming interface (API). I don't think that the cause = =3D
of the ORA-01658 error has to do with a problem of any kind within Oracle =3D at all, no matter whose perspective you use.

Programs like Informatica can only use the OCI functions for direct-path loading (i.e. Functions named "OCIDirPath...()") and none of these OCI functions presents any controls for extent sizing.

The only explanation is that Informatica is doing something silly and pointless like calling "OCIDirPathFinish()" instead of "OCIDirPathDataSave()" after each data save.

The "OCIDirPathFinish()" function closes a direct-path load session, requiring that a new direct-path load session be initiated, causing a = =3D
new extent to be allocated. The "OCIDirPathDataSave()" function merely =3D saves the rows to blocks on disk, but does not initiate a new extent or have =3D any control over extents.

In which case, the Informatica documentation should read something like:

    Informatica code bug: Due to a misreading of Oracle's documentation

        and sample OCI code, Informatica is improperly performing data
        saves during bulk loads using the "direct-path" API.

    Solution:   Set the commit interval very high, to mask the fact that
        Informatica is improperly ending the direct-path load with each
        data save.  This will result in fewer extents being used and
        less space being wasted, depending on the table's extent size.

Just my $0.02 as a long-time OCI programmer. I don't know for sure if = =3D
this is what is happening, but it sure as heck sounds like it. It would be interesting to "truss" the Informatica load process, to see what OCI function calls it is issuing... :-)

Hope this helps...


P.S. It would be interesting to find out if Informatica v7.x

        includes any "performance enhancements" for bulk loading
        over v6.x.  If so, the only "enhancement" I can think of is
        that they detected their own bug and fixed it... :-)

        After all, nothing new has been added to the "OCIDirPath...()"
        functions in the way of a performance enhancement since they
        became available in v8.0.x, I'm pretty sure...

on 4/26/04 6:48 AM, Thomas Jeff at wrote:

> Waleed, =3D3D20


> Indeed you were on the right track. We got the following msg from
> Informatica
> Support. "Oracle design limitation"?????? =3D3D20

> Problem:

> In 6.x version running session targeting to Oracle using bulk mode =20
>creates new extents=3D3D20 at every commit. This sometimes leads to = the=20
>following error: =3D
> unable to create=3D3D20
> initial extent."=3D3D20

> Solution:

> This is a known issue (CR 53791) that occurs when using Oracle bulk =20
>loading. To resolve this=3D3D20  issue increase the commit interval to=20
>very high value. SUPPORTING

> INFORMATION: This issue is=3D3D20
> due to the following design limitation in Oracle: Direct Path Loading
>(using same code as bulk=3D3D20

> loading) is going to use only blocks above the high water mark. Also =20
>Direct Path loading does=3D3D20  not check for Partial blocks are not=20
>used, so no reads are needed to  find them and fewer writes=3D3D20

> are performed. In a conventional data load the bind array buffer is
> filled and passed to Oracle=3D3D20
> with a SQL INSERT command which results in the space management by the
> SQL Command Processing to=3D3D20
> be utilized. Therefore bulk loading will allocate more extents.=3D3D20

> -----Original Message-----
> From:
> [] On Behalf Of Khedr, Waleed
> Sent: Thursday, April 22, 2004 11:34 AM
> To:
> Subject: RE: Informatica Bulk Mode behavior

> I would ask the developers how they configured the commit interval (if
> =3D3D3D
> it's set to a very low value)

> Waleed

> -----Original Message-----
> From: Thomas Jeff []
> Sent: Thursday, April 22, 2004 10:33 AM
> To:
> Subject: Informatica Bulk Mode behavior

> List,
> Is anyone familiar with how Informatica's bulk mode works? =3D3D3D20=20
>=3D20 We have a situation where I created a 4GB tablespace for a new = and
> simple=3D3D3D20 truncate/load operation from Informatica, around 7 =3D
> rows estimated=3D3D3D20 to take up 1.5GB. =3D3D3D20 =3D3D3D20 Using =
bulk =3D
mode, =3D3D
> which
> appears to be a direct load (in the SQL cache, the=3D3D3D20 INSERT =3D
> has a hint that I've never seen before: SYS_DL_CURSOR which=3D3D3D20 =
> =3D

> assume
> stands for Direct Load), they run out of space in the tablespace =20
>after=3D3D3D20 about 200K rows have been inserted. =3D20
> If I then manually rebuild the table, the 200K rows gets compressed =
> down=3D3D3D20 to one extent.

> So there's a lot of either empty or preallocated space. Thinking
> somehow=3D3D3D20
> the high-water mark was the culprit, I manually truncated the table =20
>before=3D3D3D20 they reran their job.  =3D3D3D20 =3D20

> Still the same problem. =3D3D3D20

> If the job runs in 'normal' mode, which is row-by-row processing, it
> runs fine,=3D3D3D20 although of course, performance is quite poor.
> Any ideas/experience as to how this bulk mode operates?

Please see the official ORACLE-L FAQ:

To unsubscribe send email to: put 'unsubscribe' in the subject line.
Archives are at
FAQ is at
Please see the official ORACLE-L FAQ:
To unsubscribe send email to: put
'unsubscribe' in the subject line.
Archives are at
FAQ is at
Please see the official ORACLE-L FAQ:
To unsubscribe send email to:
put 'unsubscribe' in the subject line.
Archives are at
FAQ is at
Received on Tue Apr 27 2004 - 07:50:51 CDT

Original text of this message