Re: What is data mining ???

From: Umar Khan <shed_at_dgs.dgsys.com>
Date: 28 Nov 1994 07:41:06 -0500
Message-ID: <3bcj52$dte_at_DGS.dgsys.com>


In article <1994Nov26.210027.39403_at_ucl.ac.uk>, Ms H Nguyen <ubac1ef_at_ucl.ac.uk> wrote:
>
>Sorry for my ignorance, and if my following question has been
>posted inappropriately, but can you please educate me
>
> - What is data mining ?
> - What are the most popular topics in data mining research ?
>
>
>If possible, make a copy of your answers to me at ubac1ef_at_ucl.ac.uk

There are really two answers that I know of to your question: one precise and legal, the other helpful but imprecise. The term "Database Mining" is a trademarked term of the Hecht-Nielsen Company and is specific to their reputed neural net hardware-software product. In this context it is not a term of art but a term of commerce for what is, IMHO, an unimpressive and overhyped product. It almost made it as a term of art because the words offer a convenient metaphor for what has otherwise become known as "knowledge discovery in databases" or KDD.

The metaphor is cute and appropriate. When you mine, you must somehow sift through massive amounts of useless or uninteresting rock, dirt, etc. to find the rare nuggets of pure gold (or silver or whatever). Similarly, if you look at the legacy databases of any established company, agency, or office, you are likely to find that there is much more value beneath the surface than the original database designers took into consideration. There are associations between data elements, patterns which can be brought to light by the use of machine learning techniques. For example (and this is a naive example), a company may have a huge inventory database. You know... the typical kind which is designed only to keep track of inventory in a warehouse, ordering of replenishment items, recording of sales which deplete the inventory, etc. KDD could help discover trends in buying habits, or could discover patterns in how distributors fill orders, or something like that.
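
To make the naive inventory example a bit more concrete, here is a minimal sketch in Python of one very simple kind of pattern discovery: counting which items tend to show up on the same order. Everything in it is made up for illustration (the order data, the function name frequent_pairs, the min_support threshold), and it is only a toy, not a description of any particular KDD product or technique from the literature.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales records: each inner list is the items on one order.
orders = [
    ["hammer", "nails", "gloves"],
    ["hammer", "nails"],
    ["paint", "brush"],
    ["hammer", "nails", "paint"],
    ["paint", "brush", "gloves"],
]

def frequent_pairs(orders, min_support):
    """Return item pairs that appear together on at least min_support orders."""
    counts = Counter()
    for order in orders:
        # sorted(set(...)) counts each pair once per order, in a canonical order
        for pair in combinations(sorted(set(order)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(orders, min_support=2)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

With this made-up data, only the pairs that recur across orders survive the threshold; the one-off combinations are the "rock and dirt" that gets sifted out.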

There is a very good high-level discussion of this in Patrick Winston's most recent edition of his classic _Artificial Intelligence_, published by Addison-Wesley. I believe it was copyrighted last year (i.e., 1993). Winston is one of the VERY BEST at explaining complex technological and scientific concepts, and his books are always well worth reading.

Another seminal book on the subject is _Knowledge Discovery in Databases_, edited by Gregory Piatetsky-Shapiro et al. I am not in my office as I write this, or I'd give you better citations on both this and the Winston book. Greg has done an excellent job of pulling together the best papers of the day from the researchers and applications folk who are heavily into KDD. I use it as a kind of Bible for KDD. It describes techniques, approaches, and applications. Greg has also hosted workshops on KDD at the annual National Conference of the American Association for Artificial Intelligence (AAAI). Moreover, Greg and others also operate an internet mailing list to connect KDD people for cross-pollination of ideas.

I hope the above has been helpful.

Cheers!
:-)Umar

*****************************ooooooo***********************************
           Time ripens all things.  No man's born wise.
                                 -- Miguel de Cervantes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
khan_at_spdcc.com                                       shed_at_DGS.dgsys.com
Received on Mon Nov 28 1994 - 13:41:06 CET