From: max
Date: Thu, 26 Jul 2007
I have not found an active data mining forum yet, so I'm asking here.

I'm new to data mining, but since I have the most theoretical comp. sci. background at my work, I've been given the task of setting up a data mining system. The problem is that our setup seems pretty nonstandard, and I'm not sure how to apply data mining to it or what my expectations should be. The situation is:

We have a large and growing set of strings (> 100,000) which we get requests for. The requests have many, mostly nominal (non-numeric), variables associated with them. We can only handle a subset (probably fewer than 10,000) of the strings at any one time. We want to use data mining to analyze historic requests so we can figure out which strings we should handle under a given set of variable values. So, given that our variables currently have values X1, ..., Xn, and given a large database of historic string requests, which subset of strings should we use?
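
To make the question concrete, here is a minimal sketch of one possible baseline, not a recommendation: score every string with a naive-Bayes-style estimate of how likely it is to be requested under the current variable values, then keep the top k. Everything here is an assumption for illustration: the history format (pairs of a string id and a dict of variable values), the names fit_counts and top_k_strings, and the crude add-alpha smoothing.

```python
from collections import Counter, defaultdict
import heapq
import math


def fit_counts(history):
    """Count historic requests.

    history: iterable of (string_id, {variable_name: value}) pairs
    (a hypothetical format, assumed for this sketch).
    """
    string_counts = Counter()
    # value_counts[string_id][(variable_name, value)] = number of requests
    value_counts = defaultdict(Counter)
    for string_id, variables in history:
        string_counts[string_id] += 1
        for item in variables.items():
            value_counts[string_id][item] += 1
    return string_counts, value_counts


def top_k_strings(string_counts, value_counts, current, k, alpha=1.0):
    """Return the k strings with the highest naive-Bayes-style score.

    current: {variable_name: value} holding the present values X1..Xn.
    alpha: add-alpha smoothing so unseen (variable, value) pairs do not
    zero out a string; the 2 * alpha denominator is a crude assumption.
    """
    total = sum(string_counts.values())
    scores = {}
    for string_id, n in string_counts.items():
        # log prior: how often this string is requested overall
        score = math.log(n / total)
        for item in current.items():
            # log likelihood of each current value given this string
            seen = value_counts[string_id][item]
            score += math.log((seen + alpha) / (n + 2 * alpha))
        scores[string_id] = score
    return heapq.nlargest(k, scores, key=scores.get)
```

Usage would look like top_k_strings(*fit_counts(history), current={"region": "eu"}, k=10000), where "region" is a made-up variable name. The independence assumption between variables is strong, so this is only a starting point to compare other methods against.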

Thanks in advance,
