Re: *** JDBC Statement faster than PreparedStatement ***

From: Joseph Weinstein <joe_at_weblogic.com>
Date: 2000/02/29
Message-ID: <38BC61FD.C559A05D@weblogic.com>

Hi again.
PreparedStatements offer the possibility of better performance if the DBMS and the client-DBMS protocol
permit the re-use of pre-parsed SQL by sending it just once, and thenceforth just sending the new parameter
values. This saves the time it takes to parse fresh SQL, and can somewhat the amount of data sent from the
client to the DBMS. Depending on the driver and protocol, the overhead may consist of the conversion of some
Java parameter objects to a suitable form for transmittal, and possibly sending each parameter value synchronously
one-by-one to the DBMS before a re-execute. This latter case would be an immediate performance killer. In fact,
the ideal case for PreparedStatement performance is probably where a large, *non-parameterized* SQL statement
is to be reused many times, saving the recompile each time, and only costing the re-execute call. Because the areas
of possible benefit and the areas of new overhead are not symmetrical, there may well be a 'cross-over' where
at some number of parameters the performance winner might swing back to a simple statement. For some drivers this
might be true for *any* number of parameters. An example is for MS SQLServer, which doesn't have lightweight
stored procedures or a pre-parse that can be used by a driver to establish a query plan for reuse in the DBMS.
Therefore every execution of a PreparedStatement will be the moral equivalent of a fresh SQL execution anyway,
with the added cost of parameter translation as well. Sybase does have lightweight stored procedures (not involving
system tables and automatically disappearing at logout), which serve well for this purpose. Making a full-fledged
MS SQLServer procedure would be disastrous, killing concurrency because the procedure creation would be
logged, and would involve hotspotting system tables to install it, and the driver would be responsible for removing
the procedures when no longer needed, even if the client <CTL>C'ed out of the application.

There is still a benefit to PreparedStatements that isn't available at all from basic Statement objects: Some
data types, like BinaryStreams, have values that cannot be conveniently transmitted to the DBMS via unparameterized
SQL. Therefore the functionality of PreparedStatement.setBinaryStream() etc, is enough alone IMO to require
the class, independent of performance issues. For best performance, if your parameter values are amenable to inclusion in straight SQL and your SQL isn't
complicated (ie: it doesn't take long to parse), and you're not really going to be executing it over and over,
nothing will beat just sending it in one client-DBMS packet.

Joe Weinstein at BEA

Jonatan Kelu wrote:

> Thanks for that. That may fix my immediate problem, however it does not explain why
> PreparedStatement is slower and more inefficient than the ordinary Statement. I mean, the whole
> purpose of having a PreparedStatement is for efficiency. The fact that it is not nearly efficient
> as the ordinary Statement defeats the purpose of its existence. Incidentally, in further
> experiments I found that this is not only the problem of the Oracle OCI driver. In fact, it
> happens on every driver I have tried so far. The drivers I have tried are the Ashna driver for SQL
> Server, the Inet driver for SQL Server, and the JDBC-ODBC bridge working against SQL Server. All
> these drivers showed marked performance improvements when using the ordinary Statement as opposed
> to the PreparedStatement to do inserts on a table with more than say 20 columns. After these
> findings, I don't see why the PreparedStatement exists at all. Or, if it exists, why isn't the
> implementation simpler? I think people should know about this because I'd imagine there'd be a lot
> of applications out there that use the PreparedStatement instead of the Statement for its supposed
> performance advantages. Well this is just not the case, and people should know that they could
> actually be taking a significant performance hit if using the PreparedStatement to do inserts into
> tables with more than say 20 columns. I ask again, does anybody have an explanation as to why all
> these drivers perform worse on the PreparedStatement than on the Statement? Does anybody know why
> the PreparedStatement has been held up in the public's eye as THE statement to use when you want
> better performance, only to be found that it actually performs worse, and actually much worse the
> more columns you have? Please let me know. Jonatan Kelu
>
> Joseph Weinstein wrote in message <38BAC167.9A582B55_at_weblogic.com>...Hi Jonatan. Here's
> a way to do batch inserts to Oracle really fast.
> Do it all with a simple Statement, with a batch string you build yourself:
>
> String big_batch = " BEGIN ";
>
> while (I'm not tired of batching)
> big_batch += " insert into myTable values(myValues)" + "; ";
>
> big_batch += " END;";
>
> st.executeUpdate(big_batch);
>
> This will get rid of the major performance hit of the client-server chat for each
> row and/or driver inefficiencies. There is supposed to be a limit (64k?) on the
> size of your SQL, so maybe watch that... Let me know if this helps.
>
> Joe Weinstein at BEA WebLogic
>
> Jonatan Kelu wrote:
>
> > In doing various experiments measuring the performance of JDBC
> > PreparedStatement against that of the equivalent ordinary Statement, I was
> > surprised to discover that in certain instances (which in some case may be
> > the majority of cases) the ordinary JDBC Statement is faster than the
> > PreparedStatement.
> >
> > I'll briefly outline the conditions under which I was testing.
> >
> > I was using the Oracle JDBC 1.22 OCI driver for Oracle 8.0.x. My experiment
> > consisted of doing 10,000 inserts into an empty table and timing how long it
> > takes. The table contained only "VARCHAR2" columns of length 10, declared as
> > "NOT NULL". Autocommit was turned off and the transaction was only committed
> > once after all 10,000 records had been inserted. The experiment was repeated
> > with different numbers of columns.
> >
> > My findings were that if the number of columns is < 20 or so,
> > PreparedStatement worked faster than Statement. However, for more than
> > around 20 columns, Statement was faster - and this even more so the greater
> > the number of columns. I found that the performance degradation using
> > Statement was pretty much linear with the number of columns in the table.
> > However, with Prepared Statement, the performance degradation was worse than
> > linear, thus causing it to diverge more greatly from the performance of
> > Statement the greater the number of columns there was.
> >
> > An even more interesting thing I found was that one thing that Oracle
> > claimed was a performance improvement actually performs even worse. That is,
> > the Oracle documentation says that you can cast the PreparedStatement down
> > to an OraclePreparedStatement and then execute the "setExecuteBatch(int)"
> > method that specifies the number of executions of PreparedStatement that
> > should be batched together for a batch execution so as to save server
> > round-trip delays. I set this to a value of 100. What I found was that for a
> > small number of columns, this did actually improve performance. However,
> > again for more than about 20 columns, the performance of this degraded to be
> > worse than the ordinary Statement, and even worse than the ordinary
> > PreparedStatement that this performance improvement is supposed to make more
> > efficient!!! I couldn't believe it!
> >
> > I think someone's stuffed up pretty bad at Oracle, or they never tested the
> > driver for tables with more than 20 columns!
> >
> > You can make your own program that inserts 10,000 records into a database
> > table with say 50 or 100 VARCHAR2 columns, committing only after all records
> > are inserted, and using both Statement and PreparedStatement to do this. It
> > shouldn't be too complicated, and this will show these results immediately.
> >
> > Please, can somebody explain to me why this is the case. Why does the
> > ordinary Statement perform better than the PreparedStatement for tables with
> > greater than 20 columns? And why does the PreparedStatement using batch
> > execution perform even worse? This seems to go against all the conventional
> > database wisdom out there.
> >
> > If you do have any ideas, or if you have the real answer, please contact me
> > at my below e-mail address as I'm not always able to read the news group.
> > But please also do post it to the newsgroup for anybody else that may be
> > interested.
> >
> > Thanks,
> >
> > Jonatan Kelu
> > ACUS, UNISYS
> > Phone: +61-2-9390-1328
> > E-mail: Jonatan.Kelu_at_au.unisys.com
>

--

Received on Tue Feb 29 2000 - 00:00:00 CST