| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
Home -> Community -> Usenet -> c.d.o.server -> Re: challenging sql question
In article <335A81C2.27FD_at_ventera.com>,
"Lan H. Tran" <lantran_at_ventera.com> wrote:
>
>please help me come up with an elegant solution for the following
>problem:
>
>i have a table with 2 columns:
> state varchar(10)
> weight integer
>
>which may have the following values:
> maryland 10
> delaware 10
> virginia 30
>
>the weight column represents the random likelihood of retrieving that
>column. that is, virginia may be retrieved 60% of the time, maryland
>may be retrieved 20% of the time, and virginia may be retrieved 20% of
>the time. the table may have many entries and the weight column may be
>changed by the user.
>
>the application (written in c w/embedded sql) will repeatedly call a
>function that will randomly retrieve the data according to the weight
>distribution. how do we write this function and the accompanying sql
>call?
>
>i hope that i have clearly explained the problem.
>
>here is my preliminary thought:
>
>i retrieve a randomly generated value with the range of 1 and the sum of
>the weight column (50). i determine under which row that value
>belongs. for example, if the value is 15, the function returns
>delaware, because 15 does not fall in between 1 and 10 (maryland), does
>fall in between 11 and 20 (delaware), and does not fall in between 21
>and 50 (virginia). the coding is not too hard, but would seem
>inefficient. we would have to open up a cursor and continue to fetch
>next until we find the range. creating an internal array of records may
>also be inefficient, because the table may have many entries and not
>just 3 entries.
>
>please reply with e-mail to me as well as the newsgroup. thank you.
>
>Lan Tran
Try this out. It shifts the 'work' from the 'generator' program to the program which allows changes to weights.
Possibility 1.
maryland, 1
maryland, 2
..
delaware, 11
..
delaware, 19
virginia, 21
..
virginia, 50
2. Store the 'total_weight' in your parameter table.
3. Index your table on weight.
4. Retrieve a randomly generated value between 1 and the parameter stored and go for a 'direct hit' using weight as the key.
Drawbacks :
1. Too many rows created (probably livable; consider codifying
'state' into a smallint or its shortform)
2. Concurrency during weight changes - could be a big problem.
Possibility 2 :
Similar to 1, except reduce the number of rows by changing the table structure to
state
start_weight
end_weight
giving rows of
maryland, 1, 10 delaware, 11, 20 virginia, 21, 50
Again, go for a 'direct hit' using BETWEEN.
This could work out better than 1, if the number of rows reduce to a level where the whole table is in Shared Memory.
That will be 0.02
Cheers,
![]() |
![]() |