Re: webmail user mailbox schema design?
Date: Mon, 4 May 2009 14:25:45 -0700 (PDT)
On May 1, 9:06 am, meranayag..._at_yahoo.com wrote:
> Hi Folks,
> Does anyone have any insight into what user mailbox stores look like
> for yahoo, gmail, hotmail, etc? I am curious if they have, e.g. a
> mailbox table for each user that contains all their email or whether
> they have a single large table with email for all users or whether
> they have a smaller subset of tables with the users distributed
> between them? Each user can have a large number of messages, so a
> single table would end up having billions of records (for one of these
> webmail systems with millions of users) and likewise you wouldn't want
> separate tables for each user because then you'd have millions of
> tables. Any thoughts? Thanks.
I have no idea, but check out http://www.johnvey.com/features/gmailapi/
I read this as meaning they simply have either a Unix-style mbox with a searchable index written in PHP or something, or a lightly modified open-source DB with the same sort of proprietary interface and indexing. Looking at the Wikipedia entry for Gmail and following it to the guy who wrote it, I found this gem in his blog, which he still writes in, so maybe you can simply ask him:
"I wrote the first version of Gmail in one day. It was not very impressive. All I did was stuff my own email into the Google Groups (Usenet) indexing engine. I sent it out to a few people for feedback, and they said that it was somewhat useful, but it would be better if it searched over their email instead of mine. That was version two. After I released that people started wanting the ability to respond to email as well. That was version three. That process went on for a couple of years inside of Google before we released to the world."
So it sounds like the answer is that it just uses the regular Google searchable storage with an API on top of it.
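To make "searchable storage with an API on top" a bit more concrete, here is a toy sketch of the idea: one shared inverted index over everybody's mail, with the user as part of the key. This is only my guess at the general shape, not how Gmail is actually built; MailIndex and its methods are made-up names.

```python
import re
from collections import defaultdict


def _words(text):
    # Crude tokenizer: lowercase runs of letters and digits.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class MailIndex:
    """Toy inverted index (hypothetical): word -> set of (user, msg_id)."""

    def __init__(self):
        self.messages = {}                # (user, msg_id) -> body text
        self.postings = defaultdict(set)  # word -> {(user, msg_id), ...}

    def add(self, user, msg_id, body):
        key = (user, msg_id)
        self.messages[key] = body
        for word in _words(body):
            self.postings[word].add(key)

    def delete(self, user, msg_id):
        key = (user, msg_id)
        body = self.messages.pop(key, None)
        if body is None:
            return
        # Remove the message from every posting list it appeared in.
        for word in _words(body):
            self.postings[word].discard(key)

    def search(self, user, query):
        words = _words(query)
        if not words:
            return []
        # A message matches only if it contains every query word.
        hits = set.intersection(*(self.postings.get(w, set()) for w in words))
        # Restrict results to the asking user's own mail.
        return sorted(msg_id for (u, msg_id) in hits if u == user)
```

Usage would be something like idx.add("alice", 1, "Lunch on Friday?") followed by idx.search("alice", "friday lunch"). Note that nothing here needs a table per user; the user is just one more field in the key.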
Yahoo Mail started out as RocketMail; again, see Wikipedia.
Hotmail started on Solaris, then migrated to Windows. Again, Wikipedia has the history, and it's not pretty.
I wouldn't expect any of them to use Oracle as a DB engine, but one never knows. It's just mail, after all; what relations would users want? Mostly they'd be reading it LIFO or FIFO and then forgetting it. So yeah, in this case deletion would be a primary determinant in the design, and if they use a database, each user would have one table or a small number of tables.
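For comparison, the middle option from the original question (users distributed over a small, fixed set of tables) might look roughly like the sketch below. It uses SQLite only because it ships with Python; the table names, columns, and NUM_SHARDS are all my own assumptions, not anything the webmail providers have published.

```python
import sqlite3
import zlib

NUM_SHARDS = 4  # assumed value; a real system would size this from capacity planning


def shard_table(user):
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return "mail_%02d" % (zlib.crc32(user.encode("utf-8")) % NUM_SHARDS)


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    for i in range(NUM_SHARDS):
        db.execute(
            "CREATE TABLE IF NOT EXISTS mail_%02d ("
            " user TEXT NOT NULL,"
            " msg_id INTEGER NOT NULL,"
            " received TEXT NOT NULL,"
            " body TEXT NOT NULL,"
            " PRIMARY KEY (user, msg_id))" % i
        )
    return db


def deliver(db, user, msg_id, received, body):
    # The table name comes from shard_table(), never from user input directly.
    db.execute(
        "INSERT INTO %s VALUES (?, ?, ?, ?)" % shard_table(user),
        (user, msg_id, received, body),
    )


def inbox(db, user):
    # Newest first, which is how most people read mail.
    rows = db.execute(
        "SELECT msg_id, body FROM %s WHERE user = ?"
        " ORDER BY received DESC, msg_id DESC" % shard_table(user),
        (user,),
    )
    return rows.fetchall()


def delete(db, user, msg_id):
    # Deletion touches one shard and is a primary-key lookup.
    db.execute(
        "DELETE FROM %s WHERE user = ? AND msg_id = ?" % shard_table(user),
        (user, msg_id),
    )
```

Every operation for a given user lands in exactly one table, so you get neither one table with billions of rows nor millions of tables, and a delete is a single keyed lookup in a single shard.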
So, Google is your friend, evil though they may be.
-- _at_home.com is bogus. http://www.informationweek.com/news/showArticle.jhtml?articleID=217201334
Received on Mon May 04 2009 - 23:25:45 CEST