Re: How an index actually works

From: Parker Shannon <pshannon_at_ixpres.com>
Date: Tue, 8 Jul 2003 15:30:10 -0400
Message-ID: <3f0b1799_2_at_news.vic.com>


The table you want to build is called Person (singular)

Its fields are:

Id - primary key - unique integer - indexed Name - secondary key - unique - indexed "Jones, Mary Q." - if there are 2 Jones, Mary Q. then add birthdate, "Jones, Mary Q., 07/23/1987" Last Name - not indexed - "Jones"
FirstName - not indexed - "Mary"
MiddleInitial - not indexed - "Q."
BirthDate - not indexed - "07/23/1987"
etc. - not indexed . . .

Now don't worry about how the indices work. That's the job of the DBMS. But if you need to know, the new record gets added at the end of the existing records (it's Appended). However, the DBMS stores the physical location of the record on the mass storage device in the indices, one index for the primary key (Person.Id) and one for the secondary key (Person.Name) using the table example above.

On Insert, the DBMS adds the new record to the end of the table space and stores the physical location of the record (1 - DD-CC-TT-RR-LL ) and the primary key value (2 - 1234) and secondary key value (3 - "Jones, Mary Q.") in the indices.

On Query, when you send the DBMS the search argument "Jones, Mary Q." the DBMS looks up "Jones, Mary Q." in the secondary key index. When it finds it, there is a physical device number, physical cylinder, physical track number, physical record number and logical record number address right next to "Jones, Mary Q." that tells the DBMS exactly where your (logical) record is on the mass storage device and it goes and gets it.

In some databases, there is actually more space devoted to the indices than to the prime data area. This occurs from building indices over lots and lots of fields.

As far as writing a database management system is concerned, I hope you are independently wealthy and have a lot of time on your hands.

Later . . .

"Chris" <ctacke_at_innovativedss.com> wrote in message news:de75ddef.0306191926.2a75185_at_posting.google.com...
> I've got a good understanding of how to use a RDBMS and SQL syntax,
> but what I (and probably most DB users) don't understand is what the
> DB engine is actually doing under the hood. Tables themselves are
> fairly simple to understand (I thin) but what I'm not so sure on are
> indexes.
>
> For example let's assume I have a table of People with FirstName,
> LastName and Age. Lets assume all fixed-length fields and view this
> somewhat like a spreadsheet. On a basic level, if I insert a record,
> it simply gets added as a row at the end of the table.
>
> However, let's say I have an idex based on LastName. After I insert
> into the table, how does the index row get inserted? Does the index
> have "holes" in it to be filled with inserts? That seems bad, as you
> don't know how big to make the holes and what happens when a hole is
> filled? Do I move all data below the insert and the push my new data
> in place? That seems slow, especially if we insert near the front and
> a lot of data is in the table.
>
> Basically I'd just like recommendations on what to read/look up. I've
> spent the last hour trying to find information using Google, but let's
> face it, searching on terms like "database" "fundamentals" "basics"
> and "index" gives way more chaff than wheat.
>
> My end goal is to write a basic database engine (in C# if it matters)
> that I can create, read, update and delete from. I want to support
> basic indexes for speed, but I don't plan (yet anyway) to support
> anything like a SQL parser or many of the great features of an RDBMS.
> This is purely a learning/hobby exercise. Yes, I am a masochist.
Received on Tue Jul 08 2003 - 21:30:10 CEST

Original text of this message