Re: A question for Mr. Celko

From: Marshall Spight <mspight_at_dnai.com>
Date: Tue, 20 Jul 2004 02:25:57 GMT
Message-ID: <Vo%Kc.101471$WX.79009_at_attbi_s51>


"Jan Hidders" <jan.hidders_at_REMOVETHIS.pandora.be> wrote in message news:pan.2004.07.19.23.23.46.800729_at_REMOVETHIS.pandora.be...
> On Mon, 19 Jul 2004 14:05:01 +0000, Marshall Spight wrote:
> > "Jan Hidders" <jan.hidders_at_REMOVETHIS.pandora.be> wrote in message
> > news:pan.2004.07.19.09.35.30.941204_at_REMOVETHIS.pandora.be...
> >>
> >> There are whole conferences
> >> dedicated to optimization in list-manipulating languages.
> >
> > I had no idea. In my experience there is the array list and the linked
> > list, and the linked list isn't all that useful.
>
> Are you kidding? What if I have a lot of inserts in my list?

Nope, I'm not kidding. I've never worked in scientific fields; everything I've done in the last 10-12 years has been either business apps or desktop apps, or webapps. For those kinds of applications, lists never get very big and array list is your best friend. Linked list, OTOH, will insert faster but address a lot slower, and 95% of what you're doing is addressing specific items from the list. Also, it'll fragment your heap because every insert or delete will mean a node allocate/deallocate.

> Any idea
> about the asymptotic time and space complexity of that?

Sure. Doesn't matter what order of complexity those are, though, when n < 10.

A few years back I was in the middle of another pointless flamewar on comp.lang.java.programmer about linked lists vs. array lists, and for fun I decided to instrument lists in the app I was working on. I replaced Vector.class inside rt.jar with my own, instrumented version, then ran the app for a while using my stardard usecase benchmark. Then I collected my stats.

The average list size was 0.93 items, and I forget the exact number, but something like 99% of lists never grew beyond 10 items.

> And how about doubly linked lists?

Those were very interesting back when I was in school and played with different list implementations in C.

> Or lists encoded in relational tables? You can do
> that with a simple index number (hard to update like the array) or
> logically simulate a linked list.

Now *that's* interesting, although I guess I'm most interested in how one might implement relations so as to allow fast list operations in bulk.

> And remember we are not talking necessarily about data structures in
> memory but data structures on disk.

We might be talking either, although I find on-disk structures somewhat less interesting.

> That's a whole new ball game. And did
> you think about indexing structures for lists? What if we have interval
> queries? Ever heard of spatio-temporal indexes? Quad-trees? R-trees? K-d
> trees?

I haven't heard of K-d trees. Are they interesting? I'm not too interested in the spatial stuff. I expect I'll become interested in interval queries at some point. Skip lists, anyone? I heard a talk by William Pugh recently; there's an interesting guy.

> These all come into play. And what if we start nesting things? If
> you cannot resist google for "xml query optimization" (xml is
> essentielly a data model of nested lists) and see a whole new universe of
> problems and solutions opening up.

Well, problems anyway. :-)

> Have I scared you a litte? I hope so. :-) If you just start googling
> around without a specific research question or objective then my guess
> is you will just drown and not really get anywhere.

That is a very real concern. However, there is an opposite problem, which is that if one only ever goes reading when one has to solve a specific problem, one loses many extraordinary chances for serendipitous discovery. That's part of why I'm on this newsgroup.

> May the focus be with you, ... :-)

Thanks! Actually I'm quite focused; I'm just not so public about what my focus is. Also, there's a queue of things I'm focused on, so sometimes I ask about something that's further down the queue, even though I'm mostly thinking about something else at the moment. I probably look particularly unfocused when I do that!

Marshall Received on Tue Jul 20 2004 - 04:25:57 CEST

Original text of this message