DBMS recommendation?
Date: Mon, 5 Jan 2009 02:10:13 -0800 (PST)
Message-ID: <8562dfe9-8e89-4c75-837b-efa3cb7df567_at_r15g2000prh.googlegroups.com>
Hi,
We are currently thinking of replacing our existing database system
which is no longer supported.
The data corpus comprises 4 million records, each of which has
about 30 fields. Half a million records have a full text in PDF
format, which we also store in a text field (extracted by a self-made
PDF extraction script).
Our DB usage is moderate, about 1000 searches per day. We load balance
with Pound, which distributes queries across 6 virtual machines
(each holding a copy of the database).
The web interface is decoupled from the DB; we use an MVC framework
that talks to the DB via an API and only reads data.
Currently we use a Windows system, which can easily be replaced with
Linux if need be.
Our existing solution has a built-in thesaurus (the controlled
vocabulary is static). In addition, every term holds the number of
records currently tagged with it.
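
To make the per-term record count concrete, here is a minimal sketch of
what we mean. It assumes SQLite and made-up table names (term,
record_term); our current system does this internally, so this is only
an illustration, not how it actually works.

```python
import sqlite3


def build_demo():
    """Create an in-memory DB with a tiny, hypothetical thesaurus."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE term (
            id    INTEGER PRIMARY KEY,
            label TEXT UNIQUE NOT NULL
        );
        CREATE TABLE record_term (
            record_id INTEGER NOT NULL,
            term_id   INTEGER NOT NULL REFERENCES term(id),
            PRIMARY KEY (record_id, term_id)
        );
        """
    )
    # Static controlled vocabulary (example labels only).
    conn.executemany(
        "INSERT INTO term (id, label) VALUES (?, ?)",
        [(1, "economics"), (2, "labour"), (3, "trade")],
    )
    # Record 1 is tagged with two terms, record 2 with one.
    conn.executemany(
        "INSERT INTO record_term (record_id, term_id) VALUES (?, ?)",
        [(1, 1), (1, 2), (2, 1)],
    )
    conn.commit()
    return conn


def term_counts(conn):
    """Return {term label: number of records tagged with that term}."""
    rows = conn.execute(
        """
        SELECT t.label, COUNT(rt.record_id)
        FROM term AS t
        LEFT JOIN record_term AS rt ON rt.term_id = t.id
        GROUP BY t.id
        ORDER BY t.label
        """
    ).fetchall()
    return dict(rows)
```

The LEFT JOIN keeps terms that currently have no tagged records, so
they show up with a count of 0.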
The new solution should of course perform well. Thesaurus
functionality would be nice, as would relevance ranking and
proximity searching, but these are not a must.
A cost-free solution would be desirable, since we want to open our
database to the internet and might therefore need to add new
instances of the DB (virtual machines). Having to pay for additional
licences would be too expensive for us.
What would you recommend?
In addition:
1. Our data repository is a large XML file from which we update our
database on a weekly basis by means of a self-made update script.
Would an XML database be an alternative, especially with regard to
performance?
2. I was also asked to investigate the possibility of implementing a
federated search across 2 to 3 other sources (with different data
structures). I assume this would be a different beast and not a feature
of the new DB I am looking for. This is not a requirement yet, but what
options would I have in that regard?
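
Regarding point 1, this is roughly the shape of the weekly update,
as a minimal sketch and not our actual script. The element names
(record, id, title), the table layout and the use of SQLite are
assumptions for illustration only. The file is read as a stream so the
whole XML never has to fit in memory.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_records(xml_source, conn):
    """Stream <record> elements from xml_source into the record table.

    xml_source is a file name or a binary file object. Existing rows with
    the same id are replaced, so the load can be repeated every week.
    Returns the number of <record> elements processed.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS record (id TEXT PRIMARY KEY, title TEXT)"
    )
    count = 0
    # 'end' events fire once an element has been read completely.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "record":
            continue
        conn.execute(
            "INSERT OR REPLACE INTO record (id, title) VALUES (?, ?)",
            (elem.findtext("id"), elem.findtext("title")),
        )
        count += 1
        # Drop the element's children and text to limit memory use.
        elem.clear()
    conn.commit()
    return count
```

It would be called as load_records("export.xml", conn), where
export.xml is a placeholder name for the weekly XML file.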
Many thanks for your input,
JR
PS: If this group is not the right place, please point me to a more
suitable one.