Re: Hashes from composite keys?

From: Karsten Wutzke <>
Date: Fri, 23 Jul 2010 16:58:55 -0700 (PDT)
Message-ID: <>

On 24 Jul., 00:58, Bob Badour <> wrote:
> Karsten Wutzke wrote:
> > On Jul 23, 11:14 pm, Eric <> wrote:
> >>On 2010-07-23, Karsten Wutzke <> wrote:
> >>>On 23 Jul., 22:22, Bob Badour <> wrote:
> >>>>Karsten Wutzke wrote:
> >>>>>Hello,
> >>>>>what are the best practices for generating hash codes from composite
> >>>>>keys? I need to mimic something like what a composite index does. In
> >>>>>fact, it's for mapping between relational keys and object IDs.
> >>>>>Can anyone point me into the right direction please?
> >>>>>Karsten
> >>>>Yes, I can point you in the right direction. Simply turn completely
> >>>>around from what you are trying to do, and go in the opposite direction.
> >>>One of the most useless comments on the Usenet I've read in my whole
> >>>life. Congratulations. You must have a lot of time for writing such a
> >>>crap. If you think you are cool, you're not.
> >>>Anyone else with a useful tip? XORing the individual elements?
> >>>Karsten
> >>He genuinely believes that both what you want to do and why you want to
> >>do it are very bad ideas indeed. There are sensible arguments behind
> >>that opinion, but presenting them here usually results only in a spate
> >>of ill-considered counter-arguments from people who know nothing about
> >>it. They won't answer your question either.
> > OK.
> >>If you want hash codes, they are a topic in their own right, which you
> >>can look up, or ask about elsewhere. The fact that the data you want to
> >>hash happens to be a composite key in a database is pretty-much
> >>irrelevant.
> > Yes. I had reason to believe I'd find the people that know what I'm
> > talking about HERE, not elsewhere. That's why I also asked for a
> > direction, not a solution primarily.
> >>Alternatively, you may wish to take a step back from your problem,
> >>and ask again the questions to which your answers were "mapping" and
> >>"hash codes". Here may or may not be the right place for that.
> >>Finally, if you think that his response, or even mine, are the most
> >>useless comments on Usenet (_not_ "the Usenet"), you haven't read very
> >>much of it. Asking you to re-consider the reasoning that led to your
> >>question is potentially very useful indeed.
> >>Eric
> > Well, as I'm thinking about it, there's nothing I can come up with to
> > avoid generating robust hash codes from composite keys.
> You are too focused on how. You have lost touch with what.
> A composite key is a key. Keys provide logical identity. Since you were
> looking for something that provides identity, you were already done
> before you even started. From that observation, the analysis of what you
> are doing starts by noting that anything else you add will simply
> introduce redundancy then goes downhill from there.
> > If you think
> > the direction is "backward", why not be more verbose?
> Because I cannot afford to provide a post-secondary education for free
> to everyone who hasn't yet grasped the fundamentals.

But you can give philosophical answers and afford an off-topic discussion in such a dry thread? That doesn't sound logical.

> The deficit in your
> understanding may not be your fault, but ultimately it is your
> responsibility to eliminate it.
> > Instead of
> > discussing other issues, you might have just shown me a direction, as
> > I believe, it's not something that hasn't been solved before. I just
> > cannot find it. My XORing test already failed with 4 rows, so, as you
> > can see, I probably have to find out the hard way.
> If your hash code is to maintain identity, it must have at least as many
> unique possible values as the original key.

Now that sounds logical and gives me a direction.
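
To make that requirement concrete, here is a minimal sketch of an encoding that keeps as many distinct values as the key itself: each column value is written out with a length prefix, so two different keys can never produce the same string. The class and method names are made up for illustration, and the sketch assumes every column has a fixed type and an unambiguous string form (it does not distinguish the integer 1 from the string "1", or NULL from the string "null").

```java
class KeyEncodingDemo {
    // Hypothetical helper: joins column values into one string, prefixing
    // each value with its length so that column boundaries stay unambiguous.
    static String encode(Object... columns) {
        StringBuilder sb = new StringBuilder();
        for (Object c : columns) {
            String s = String.valueOf(c);
            sb.append(s.length()).append(':').append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Plain concatenation would turn both keys into "ab";
        // the length prefixes keep them apart.
        System.out.println(encode("a", "b"));  // 1:a1:b
        System.out.println(encode("ab", ""));  // 2:ab0:
    }
}
```

The result is as long as the key, which is the point of the quoted remark: anything shorter than the key must map some distinct keys to the same value.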

> If you understand that, you
> will immediately understand why simply XORing the bytes cannot possibly
> work.

Yes, it's the wrong approach. The problem is that I can't predict what the composite key will look like. The column types can be anything SQL allows in a primary key, such as VARCHAR(100) or even longer. Many of my keys are short codes such as CHAR(2), but there are also INTEGERs, BOOLEANs, and DATETIMEs. I suspect the problem lies with the larger character strings, but that is as far as I have got.
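
For the record, a small sketch of why XOR breaks down so quickly, independent of the column types. XOR is commutative and self-cancelling, so swapped columns collide and any two equal columns hash to 0. The names `xorHash` and `orderedHash` are made up for this example; `Arrays.hashCode` is the standard order-sensitive combination from the JDK.

```java
import java.util.Arrays;

class XorHashDemo {
    // Naive approach: XOR the hash codes of the key columns.
    static int xorHash(Object... columns) {
        int h = 0;
        for (Object c : columns) {
            h ^= (c == null ? 0 : c.hashCode());
        }
        return h;
    }

    // Order-sensitive combination (31 * h + next), as in Arrays.hashCode.
    static int orderedHash(Object... columns) {
        return Arrays.hashCode(columns);
    }

    public static void main(String[] args) {
        // Swapped columns always collide under XOR.
        System.out.println(xorHash("DE", "FR") == xorHash("FR", "DE"));
        // Two equal columns cancel each other out, whatever the value.
        System.out.println(xorHash(7, 7) == xorHash(42, 42));
        // The order-sensitive variant tells the swapped keys apart.
        System.out.println(orderedHash("DE", "FR") == orderedHash("FR", "DE"));
    }
}
```

Note that the order-sensitive variant only removes these systematic collisions. It is still a 32-bit value, so it cannot be unique for keys that carry more than 32 bits of information.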

Java has several different hashCode implementations AFAIK, but I don't know whether they are useful for persistent object identity at all. I just found out that "Ca".hashCode() and "DB".hashCode() already collide (both return 2174), which makes the default implementation effectively useless as a unique identifier. *scratch*
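
One possible way around this, sketched under the assumption that the mapping lives in an in-memory java.util.HashMap: use the composite key itself as the map key and let hashCode() only pick the bucket, while equals() decides identity. A collision such as "Ca" vs. "DB" then costs a little lookup time but never mixes up two objects. `CompositeKey` is a hypothetical class written for this example, not an existing API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class CompositeKeyDemo {
    // Hypothetical value object holding the column values of one key.
    static final class CompositeKey {
        private final Object[] columns;

        CompositeKey(Object... columns) {
            this.columns = columns.clone();
        }

        @Override
        public boolean equals(Object other) {
            // Identity is decided by comparing all column values.
            return other instanceof CompositeKey
                && Arrays.equals(columns, ((CompositeKey) other).columns);
        }

        @Override
        public int hashCode() {
            // Only used to choose a bucket; collisions are allowed.
            return Arrays.hashCode(columns);
        }
    }

    public static void main(String[] args) {
        Map<CompositeKey, String> identityMap = new HashMap<>();
        // Both keys have the same hash code, yet they stay distinct entries.
        identityMap.put(new CompositeKey("Ca", 1), "object A");
        identityMap.put(new CompositeKey("DB", 1), "object B");
        System.out.println(identityMap.get(new CompositeKey("Ca", 1))); // object A
        System.out.println(identityMap.get(new CompositeKey("DB", 1))); // object B
    }
}
```

This keeps the key as the only source of identity, which seems to be what the earlier replies were getting at: the hash is a lookup aid, not a substitute for the key.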

Persistent object identity seems to be a much bigger task than I expected, to be honest.

Karsten

Received on Fri Jul 23 2010 - 18:58:55 CDT
