Re: saving emails in an Oracle table

From: Tim X <timx_at_nospam.dev.null>
Date: Sun, 31 Jan 2010 00:32:19 +1100
Message-ID: <871vh7d8gs.fsf_at_lion.rapttech.com.au>



Frank van Bortel <frank.van.bortel_at_gmail.com> writes:

> Wallyraju wrote:
>> Oracle 11g R1
>> RHEL (not sure of version)
>>
>> We are in the process of designing a table(s) in Oracle to store all
>> the information pertaining to saving everything related to an email
>> being sent out.
>
> Why don't you take a look at the definition of MIME?
> http://en.wikipedia.org/wiki/MIME looks like a valid starting point.
> Basically, email is text, so a single CLOB should do it (yes; Clob, not
> Blob - just take a look at a base64 coded email)
>
> Apart from that - how do you get your mail IN the database? I know
> about emails: how I can SEND FROM, but I sure would like to
> RECEIVE INTO the database.

Hi frank,

a bit depends on platform, but I did this a while back using a simple perl script and a very simple procmail filter. Essentially, the procmail script passed the message with certain headers to the perl script which used perl DBI to then insert the message into a database table.

this was used for a job request/tracking system. I'd send you a copy, but that was a few contracts back and I don't actually have one! However, it was very straight-forward. Probably about 2 days work to get the basic script working, Which included getting additional access to allow me to build the Oracle DBD driver for perl DBI. The reason it was so straight-forward was that perl had all the necessary packages to handle different mail formats and enabled me to extract the mime objects of interest without too much hassle. When it had problems, the message was just passed on to another address which had a human on the end who could decide what action to take.

When I left, the system was handling over 80% of the messages it recieved wiht no problems. Of the other 20%, most were badly formatted spam messages that the spam filter missed.

From memory, I had some problems handing messages from apple Mail because of the 'nested' MIME objects it creates and I had some initial issues with messages from outlook that were in HTML only. However, these were handled using a perl module that jus stripped out the HTML and sent the plain text to the database. Likewise, the perl modules soon caught up and were able to handle the apple mail mime stuff fine - until they did, I hacked together a module that woked pretty well.

The table used a varchar2 fiield for the sender address, a varchar2 field for the subject, a date field for the date sent and a clob for the message body. The perl script then also sent the message to an archiving address, so we had a complete copy of the message if it was ever required.

funny thing was, this simple little task was the one the users loved the most. The overall development work took over 18 months, but it was that little task that end users noticed the most.

Another method I've seen is to simply query a mailbox using imap/pop. There are plenty of java classes to do most of the work, so it should be pretty easy to do. I initially considered this route, but I liked the ability to use perl to 'sanatise' the mail before inserting it and as I hate java, I wanted to avoid that route.

Tim

-- 
tcross (at) rapttech dot com dot au
Received on Sat Jan 30 2010 - 07:32:19 CST

Original text of this message