OraFAQ Forum: General » Puzzle n°04 - Evenly share batches of articles into groups ***

Home » Other » General » Puzzle n°04 - Evenly share batches of articles into groups ***

Show: Today's Messages :: Polls :: Message Navigator
E-mail to friend

Puzzle n°04 - Evenly share batches of articles into groups *** [message #290794]

Mon, 31 December 2007 15:57

Michel Cadot
Messages: 68647
Registered: March 2007
Location: Nanterre, France, http://...

Senior Member
Account Moderator

We have a table of batches containing some articles (see Puzzle n°03).
We now want to aggregate these batches into groups with an as equal as possible number of articles without spliting the batches.
With the following test case:

DROP TABLE batches PURGE;
CREATE TABLE batches (batch_id INTEGER PRIMARY KEY, avail_qty INTEGER NOT NULL);
INSERT INTO batches VALUES (1, 1);
INSERT INTO batches VALUES (2, 1);
INSERT INTO batches VALUES (3, 1);
INSERT INTO batches VALUES (4, 1);
INSERT INTO batches VALUES (5, 1);
INSERT INTO batches VALUES (6, 1);
INSERT INTO batches VALUES (7, 1);
INSERT INTO batches VALUES (8, 2);
INSERT INTO batches VALUES (9, 3);
INSERT INTO batches VALUES (10, 4);
INSERT INTO batches VALUES (11, 4);
INSERT INTO batches VALUES (12, 4);
INSERT INTO batches VALUES (13, 5);
INSERT INTO batches VALUES (14, 5);
INSERT INTO batches VALUES (15, 13);
INSERT INTO batches VALUES (16, 20);
INSERT INTO batches VALUES (17, 20);
COMMIT;

If we want to aggregate into 3, 4 or 5 groups we must have something like (quantities are given and not batch_id):

SQL> DEFINE groups=3
SQL> /
     GROUP QUANTITIES           TOTAL
---------- --------------- ----------
         1 20,4,3,1,1              29
         2 20,4,2,1,1,1            29
         3 13,5,5,4,1              28

SQL> DEFINE groups=4
SQL> /

     GROUP QUANTITIES           TOTAL
---------- --------------- ----------
         1 20,1,1                  22
         2 20,1,1                  22
         3 13,4,3,1                21
         4 5,5,4,4,2,1             21

SQL> DEFINE groups=5
SQL> /

     GROUP QUANTITIES           TOTAL
---------- --------------- ----------
         1 20                      20
         2 20                      20
         3 13,1,1,1                16
         4 5,4,4,1,1               15
         5 5,4,3,2,1               15

The displayed ouput here is just to show the result it is not mandatory, you can display it as you want.

Enjoy!

Regards
Michel

[Updated on: Sat, 20 June 2009 10:12]

Report message to a moderator

Re: Puzzle n°04 - Evenly share batches of articles into groups *** [message #300263 is a reply to message #290794]

Thu, 14 February 2008 15:38

Frank_Zhou
Messages: 5
Registered: February 2008
Location: Braintree , MA

Junior Member

The SQL solution for this puzzles is based on my SQL Query for the "Bin FITING" problem
http://www.jlcomp.demon.co.uk/faq/Bin_Fitting.html
The restriction of this SQL solution is that we must predetermine
the number of groups. So this SQL is not flexiable at all. We can consider it as a prelimary try of this puzzle.

SELECT bucket_name AS GROUPS,
trim(BOTH ','FROM regexp_replace(XMLAgg(
     XMLElement(X,avail_qty) ORDER BY avail_qty desc),'<X>|</X><X>|</X>',',')) AS QUANTITIES, 
SUM(avail_qty) TOTAL        
FROM 
(SELECT batch_id , avail_qty , bucket_name, rn, bucket_1, bucket_2, bucket_3 
 FROM
 (SELECT batch_id , avail_qty , count(*) OVER ( ) counter,
         ROW_NUMBER() OVER (ORDER BY avail_qty  desc) rn FROM batches)
 MODEL 
 DIMENSION BY (rn)
 MEASURES (batch_id , avail_qty , 0 it, counter, 
 CAST(NULL AS NUMBER) bucket_name, CAST(NULL AS NUMBER) min_tmp,
 CAST(NULL AS NUMBER) bucket_1,    CAST(NULL AS NUMBER) bucket_2,  
 CAST(NULL AS NUMBER) bucket_3,    CAST(NULL AS NUMBER) pbucket_1,   
 CAST(NULL AS NUMBER) pbucket_2,   CAST(NULL AS NUMBER) pbucket_3 
 )
 RULES ITERATE (100000)
 UNTIL (ITERATION_NUMBER>= counter[1])
 (
 pbucket_1[2] = CASE WHEN it[ITERATION_NUMBER] = 0 THEN 0 END ,
 pbucket_2[2] = CASE WHEN it[ITERATION_NUMBER] = 0 THEN 0.00001 END ,
 pbucket_3[2] = CASE WHEN it[ITERATION_NUMBER] = 0 THEN 0.00002 END ,
 min_tmp[1] = least(sum(pbucket_1)[ANY], sum(pbucket_2)[ANY], sum(pbucket_3)[ANY]) ,
 bucket_1[ITERATION_NUMBER] = CASE WHEN sum(pbucket_1)[ANY] = min_tmp[1] 
                                   THEN avail_qty [ITERATION_NUMBER]  END,
 bucket_2[ITERATION_NUMBER] = CASE WHEN sum(pbucket_2)[ANY] =  min_tmp[1]
                                   THEN avail_qty [ITERATION_NUMBER]  END, 
 bucket_3[ITERATION_NUMBER] = CASE WHEN sum(pbucket_3)[ANY] =  min_tmp[1] 
                                   THEN avail_qty [ITERATION_NUMBER]  END, 
 bucket_name[ITERATION_NUMBER] =CASE WHEN sum(pbucket_1)[ANY] = min_tmp[1] THEN 1
                                     WHEN sum(pbucket_2)[ANY] = min_tmp[1] THEN 2
                                     WHEN sum(pbucket_3)[ANY] = min_tmp[1] THEN 3  
                                END,                         
pbucket_1[1] = sum(bucket_1)[ANY],
pbucket_2[1] = sum(bucket_2)[ANY],
pbucket_3[1] = sum(bucket_3)[ANY],
it[ITERATION_NUMBER] = ITERATION_NUMBER  
 )                  
) 
WHERE batch_id  IS NOT NULL
GROUP BY bucket_name; 




   GROUPS QUANTITIES      TOTAL                            
--------- --------------- ----------                            
1          20,4,3,1,1      29                            
2          20,4,2,1,1,1    29                            
3          13,5,5,4,1,1    29   
  
SQL> spool off;

[Updated on: Fri, 22 February 2008 13:41] by Moderator

Report message to a moderator

Re: Puzzle n°04 - Evenly share batches of articles into groups *** [message #687416 is a reply to message #300263]

Tue, 07 March 2023 09:10

mathguy
Messages: 107
Registered: January 2023

Senior Member

"as equal as possible" is undefined. There are several ways to interpret that, which are not all equivalent; they may lead to different optimal solutions on the same data.

For example: we may take the difference between the highest and the lowest "TOTAL" and require that this difference be minimized. Alternatively, we may take the sum of the absolute pairwise differences between individual TOTALs and ask to minimize this sum.

A common idea in such problems is to penalize larger differences more than smaller ones. So, we might consider the squares of all pairwise differences between TOTALs, and ask to minimize the sum of squares. This last definition of "as equal as possible" has the pleasing property that it is equivalent to minimizing the standard deviation of the TOTALs.

- - - - - -

Ceux qui ne peuvent lire plus que cinq lignes de texte... should stop reading now! Smile

- - - - - -

So, let's say the task is to minimize the standard deviation of the TOTALs.

The next question is, are we asking for the optimal solution (strictly speaking, the formulation given in the puzzle does ask for that), or are we just asking for a good approximation?

So far there is only one solution offered to this puzzle. The solution uses the LPT algorithm (more about this below), which does not give the optimal solution. For example, if there are only five batches with quantities 4, 5, 6, 7, 8 and we ask to generate 2 groups, the solution (using LPT) will give the answer 8,5,4 and 7,6 (TOTALs of 17 and 13) when in fact 8,7 and 6,5,4 has TOTALs of 15 and 15 - a better solution.

The problem is known as a "scheduling" problem. Specifically, it is an identical-machines scheduling problem where the objective function is the standard deviation (or, better, the variance) of the TOTALs. There are theoretical results about the "goodness" of various approximation approaches. To ask for the exact optimum is not reasonable; even with just two groups, and even to ask simply whether there is a grouping where the two TOTALs are equal, is the NP-complete partition problem.

So, what is really the question? Find the absolute optimum? This can be done by brute force (inspect ALL groupings, select the best one), but that becomes unfeasible very quickly. Is it to implement one of the approximate algorithms? Perhaps simply to implement the LPT (longest-process-time first) algorithm, as Frank Zhou has done?

The puzzle is certainly "very hard" if we talk about the math problem. If we must implement a given approximate algorithm, the programming problem may not be too difficult; another question, at this point, is whether only pure-SQL solutions are accepted, or if PL/SQL solutions are OK too. (Implementing any algorithm in any procedural language should be pretty straight-forward, and therefore easy.) And since the addition of functions in the WITH clause in Oracle 12, we should agree on what we mean by pure-SQL; that should probably mean "no functions in the WITH clause, that's just cheating".

Report message to a moderator

Re: Puzzle n°04 - Evenly share batches of articles into groups *** [message #687417 is a reply to message #687416]

Tue, 07 March 2023 10:11

Michel Cadot
Messages: 68647
Registered: March 2007
Location: Nanterre, France, http://...

Senior Member
Account Moderator

Quote:

that should probably mean "no functions in the WITH clause, that's just cheating".

I can only agree. Smile

But if someone posts a PL/SQL solution then it'll be accepted, this has already be done in other puzzles.
The purpose of the puzzles is to learn, whatever we learn.

(Whew! I have learned something: I am able to read more than 5 lines! Smile

)

Report message to a moderator

Previous Topic:	Puzzle n°09 - Finding a query for sine function distribution **
Next Topic:	Puzzle n°08 - Display the Numerical Pyramid using straight SQL query (Not PL/SQL) **

Goto Forum:

-=] Back to Top [=-

[ Syndicate this forum (XML) ] [

]

Current Time: Fri Apr 26 19:54:30 CDT 2024