Re: Random sample

From: Michael Nolan <nolan_at_helios.unl.edu>
Date: 1995/04/28
Message-ID: <3nr725$kjc_at_crcnis3.unl.edu>#1/1


dericks_at_merle.acns.nwu.edu (Daniel Erickson) writes:

>Is there an easy way to select a random sample of rows from a table? I.E.,
>give a random sample of rows that is 5% percent of the table size?

If you are using Oracle 7.1 or above, you can create a SQL function to produce a pseudorandom number (ie, linear congruence or some similar generator) and then use that to select the desired percentage.

This gets tricker if you want to be able to replicate the random selection, because Oracle doesn't guarantee that the order the rows will be scanned or selected in will always be the same. One possible method, although I haven't had reason to try it yet, would be to use the rowid as the 'seed' for the random number generator. (This brings up the old argument as to whether the rowid remains constant. As I recall the consensus answer to this is 'probably', and that's likely to be good enough for the duration of your random selection.)

Prior to 7.1, about your only options are either to add a 'random' column to your table, populate it using a random generator written in PL/SQL and then select from it, or to create another table consisting of the 'random' column and the rowid (or any other unique key) from your target table. (Either method has the advantage of making replicable selections easier.)

---
Michael Nolan, Sysop for the DBMS RoundTable on GEnie
nolan_at_tssi.com, dbms_at_genie.com, nolan_at_inetnebr.com
(posted from nolan_at_helios.unl.edu)
Received on Fri Apr 28 1995 - 00:00:00 CEST

Original text of this message