Re: Generating fake databases

From: Derek Asirvadem <derek.asirvadem_at_gmail.com>
Date: Fri, 14 Oct 2011 06:36:57 -0700 (PDT)
Message-ID: <b7fd60a0-40cc-4b44-bccf-fe63e7193faa_at_k2g2000yqh.googlegroups.com>


On Oct 14, 9:11 pm, Roy Hann <specia..._at_processed.almost.meat> wrote:

Ok.

So you are looking for the state of the art, and papers, on generating completely synthetic data of specified characteristics. I still have no idea what that means, but I think I have the title right. You think that is worth a paper. I have never heard of one (I did spend a bit of time looking for specific papers last year, although not on that subject). Frankly, the task is so pedestrian that I do not think it is worth researching, and I do not think you will find such papers. The state of that art is shell, awk and SQL, plus your ability to design and code; it has not changed for about 25 years. Even the productised scripts are cheap, $800 for a full set.
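To make "pedestrian" concrete, here is a minimal sketch of the sort of script meant. It is written in Python rather than shell or awk purely for readability. The table name `orders`, its columns, the Pareto skew and the amount range are all hypothetical choices for illustration, not taken from any product or paper. It generates a chosen number of rows with a skewed foreign key and prints them as SQL INSERT statements.

```python
import random


def generate_rows(n_rows, n_customers, seed=0):
    """Yield (order_id, customer_id, amount) tuples with chosen characteristics.

    All names and distributions here are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    for order_id in range(1, n_rows + 1):
        # Skewed foreign key: low customer ids are drawn far more often.
        # paretovariate() returns a float >= 1.0, so the id is in 1..n_customers.
        customer_id = min(int(rng.paretovariate(1.5)), n_customers)
        amount = round(rng.uniform(1.0, 500.0), 2)
        yield order_id, customer_id, amount


def to_insert(row, table="orders"):
    """Render one row as an SQL INSERT statement (numeric columns only)."""
    order_id, customer_id, amount = row
    return (
        f"INSERT INTO {table} (order_id, customer_id, amount) "
        f"VALUES ({order_id}, {customer_id}, {amount:.2f});"
    )


if __name__ == "__main__":
    for row in generate_rows(5, 100):
        print(to_insert(row))
```

Pipe the output into the SQL client of your choice. Changing the "specified characteristics" means changing the row count, the key range or the distribution calls, which is the design-and-code part.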

You might have more luck looking at how the more established ETL products are architected, and at their strategies.

Maybe try one of the writers in the MS stable; they have huge research centres that produce mountains of papers that are not worth reading: no peer review, just publication. Snodgrass et al. Many are available free on the web, unlike the academic, peer-reviewed papers.

Received on Fri Oct 14 2011 - 15:36:57 CEST
