Re: Good way to implement string interning?

Peter Seibel wrote:
> Suppose I have a file containing several million strings (say URIs)
> and I need to assign each unique string a numeric ID and store the
> string-to-ID and ID-to-string mapping as (time) efficiently as
> possible. Is there some clever technique databases use for this kind
> of task. What about if we know that many of the strings are going to
> have common prefixes (e.g. "http://www.gigamonkeys.com/book/" and
> "http://www.gigamonkeys.com/lispbox/")

An often used solution for the string -> id direction is a trie. For a short introduction see Wikipedia:

But I'm not really familiar with that myself. If you don't have access to the ACM portal, let me know and I'll see what I can do.

Received on Tue Feb 28 2006 - 19:52:21 CET

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]