Re: performance of updates

From: robert <robert_at_barbarian.thesecondnexus.net>
Date: Fri, 19 Dec 2008 23:00:53 -0600
Message-ID: <494C7C05.4050705@barbarian.thesecondnexus.net>


Ed Prochak wrote:
> I am in the midst of some performance improvements. I have never
> thought about this aspect before but has anyone tested the difference
> in performance between
>
> UPDATE mytable SET
> status = new_status
> WHERE mykey = thiskey;
>
> and
>
> UPDATE mytable SET
> status = new_status
> WHERE mykey = thiskey
> AND status <> new_status;
>
> ???
>
> I haven't run my own test yet. The actual UPDATE is in a trigger but I
> don't think that matters. The argument a coworker is making is that we
> should update only if it makes a difference (ie if the value changed).

I did take some measurements some years ago when I must have had too much time on my hands :) I could not demonstrate enough difference to make it worth the effort. On the other hand if you already have the trigger on the view anyway, I don't see how it can hurt either (performance, that is) in any measurable degree, to have the "status" clause tagging along.

The numbers themselves would probably be less than relevant given today's technology, even had I kept them around (which I didn't).

What might be still relevant would be perhaps the lessons learned from the exercise.

Consider these three points:

  1. If this check were worth making, and under the assumption that plain jane vanilla DML is going on, then why wouldn't Oracle be making the same comparison behind the scenes? Maybe they are not -- I don't know -- but I DO know that out of all the ways to improve performance, trying to outsmart the Oracle engine has never been a very productive approach, at least not in my experience.
  2. Supposing the argument is that you are saving a physical write by not updating the row...now if there is enough activity to make any difference at all, then we have to consider the odds that some other row on the same page or buffer got updated _anyway_, forcing the page to be written out no matter what, and so we didn't really save any physical writes after all.
  3. Possibly the assumptions behind 1) and 2) are false, meaning there is a lot more going on than just a simple row update. In this case, I would not be surprised if you had bigger fish to fry.

I can think of two examples:

  1. business requirements force you to provide some kind of status date, that is given say, systime, each time the status is updated.
  2. more generally developers sometimes insist on an indexed "update date" column, that gets updated on every update of the row, presumably so they can do some date-range extracts.

In both a) and b) updating a global index on every row update is a big waste in best case, and hideously expensive in worst cases. I could understand trying to avoid it. But in both cases I would argue that there are better ways to do things and avoid the need for such indexes altogether, which is what I mean by bigger fish to fry...

> Since this is an update to one row, I'm not sure if the overhead of
> checking additional columns is worth the minimal savings of avoiding
> writing the data.
>
> The difference seems to me like it would be minimal. Anyone have
> evidence either way?
>

just my 2c

> Ed

  • robert
Received on Fri Dec 19 2008 - 23:00:53 CST

Original text of this message