Best (fastest) way to get the number of rows from a table? [message #317120] |
Tue, 29 April 2008 10:47  |
Mopper
Messages: 11 Registered: April 2008 Location: Belgium
|
Junior Member |
|
|
Hi everyone.
I have quite peculiar issue to deal with. We are working on an experimental datamining algorithm. This application processes lots of data, and generates massive amounts of data, since it builds thousands of models and then tests these models on perticular datasets. The number of results from those tests can go up to a couple million. Those results need to be processed afterwards too.
Now, the building of all those models and testing them obviously takes a long time (a few days, at least) and we'd like an application so that we can monitor the progress a bit.
There is no way to hook into the processes that build the models, since these are completely independent from each other, run on separate servers and are completely agnostic of everything except for the slice of data they need to process.
So, we thought the easiest way to monitor the progress was to check how big the output tables have gotten with a plain COUNT(*) FROM anyOutputTable , and then compare that result to the amount of results that will be generated in the end (we can calculate the exact total amount of results).
But I think this will be terribly slow, running COUNT(*) against such big tables. Also, it might slow the actual generating of results down since that would be inserting records into the same table, at the same time.
Now, the thing is that the results from the COUNT-query will be exact, but we don't really need that. It would be ok for the counting to be more of an estimate and be 5-10% off. So I was wondering if there would be any way to estimate the number of rows in a table (only if that would be faster than actually counting all the records, obviously)?
All suggestions are appreciated.
Thanks in advance.
|
|
|
|
|