Re: Nested sort, trying again

From: Vadim Tropashko <vadimtro_invalid_at_yahoo.com>
Date: 30 Sep 2005 14:34:19 -0700
Message-ID: <1128116059.132453.136020_at_g43g2000cwa.googlegroups.com>


Bob Stearns wrote:
> I have a table (see below for ddl) which implements a set of locations
> for an organization, from the whole organization to shelves in a
> particular room. Users never see the primary key, only the NAMEX,
> usually as a drop down selection list, sometimes a single selection
> other times a multiselection. Users, being users, want the ability to
> change names and to see the names in sorted order at each level. If I
> didn't have to have changeable names, I could define the data in
> alphabetic order. Given that I can't do that, is there (relatively) easy
> way to present the data in a sorted manner? This is the same as seeing
> you directory tree in Windows explorer with the name chosen as the sort
> key. The root directory is sorted in filename order, then every
> directory under in it also in name order, down an arbitrary number of
> levels. In my examples, users attach people, functions, things etc. to
> locations at any level, can add additional locations at any level
> (dividing up a room, adding shelving, buying a new building, etc.).
> Because history is important, locations can only become inactive, never
> deleted. Because of the arbitrary number of levels, I can not join any
> other table to achieve the sort I want.
>
> CREATE TABLE IS3.LOCATIONS (
> LOC_ID INTEGER GENERATED BY DEFAULT AS IDENTITY
> (START WITH 1, INCREMENT BY 1, CACHE 20) NOT NULL,
> NAMEX VARCHAR(30) NOT NULL,
> LEFTX INTEGER NOT NULL,
> RIGHTX INTEGER NOT NULL,
> ENTITY_ID INTEGER NOT NULL,
> INACTIVE CHARACTER(25),
> PRIMARY KEY(LOC_ID)
> )
> GO
> ALTER TABLE IS3.LOCATIONS
> ADD CONSTRAINT INACTIVE
> CHECK (INACTIVE in ('Y','N'))
> GO
> ALTER TABLE IS3.LOCATIONS
> ADD CONSTRAINT SQL040717191150980
> FOREIGN KEY(ENTITY_ID)
> REFERENCES IS3.ENTITIES_PUB(ENTITY_ID)
> ON DELETE NO ACTION
> ON UPDATE NO ACTION
Continuing the "order by path" idea, let's discuss the LIST aggregate function first.

Here is a relevant exerpt from my book (you may scroll down to the solution at the end.

List Aggregate
^^^^^^^^^^^^^^

List aggregate is not exactly a missing feature. It is implemented as a built-in operator in Sybase SQL Anywhere and MySQL. Given the original Emp relation

select deptno, ename from emp;

 DEPTNO 	 ENAME
 10 	 CLARK
 10 	 KING
 10 	 MILLER
 20 	 ADAMS
 20 	 FORD
 20 	 JONES
 20 	 SCOTT
 20 	 SMITH
 30 	 ALLEN
 30 	 BLAKE
 30 	 JAMES
 30 	 MARTIN
 30 	 TURNER
 30 	 WARD

the query

select deptno, list(ename||', ')
from emp
group by deptno

is expected to return

DEPTNO 	 LIST(ename||', ')
 10 	 CLARK, KING, MILLER
 20 	 ADAMS, FORD, JONES, SCOTT, SMITH
 30 	 ALLEN, BLAKE, JAMES, MARTIN, TURNER, WARD

The other vendors don't have built-in list aggregate, but offer excessive functionality that allows implementing it easily. If your platform allows programming user-defined aggregate functions, you just have to search the code on the net, as it is most likely that somebody already written the required code. For Oracle you may easily find stragg function implementation on the Ask Tom forum.

-----------------------------Soap Box----------------------------------
User Defined Functions
^^^^^^^^^^^^^^^^^^^^^^

Originally, SQL intended to be "pure" declarative language. It had some built-in functions, but soon it has been discovered that introducing User Defined Functions (UDF) makes SQL engine extensible. Today, UDF is arguably one of the most abused features. In the industry, I have seen UDF with 200+ parameters wrapping a trivial insert statement, UDF used for query purposes, etc. Compare it to the integer generator UDF from chapter 2, which was written once only, and which intended to be used in numerous applications.

User-defined aggregate is the standard way of implementing list aggregate. Let's get explore alternative solutions. One more time the recursive SQL doesn't miss an opportunity to demonstrate its power. The idea is to start with the empty list for each department, and add records with list incremented whenever department has an employee name greater than last list increment. Among all the list of names select those that have maximal length:

with emp_lists (deptno, list, postfix, length) as ( select distinct deptno, '', '', 0
from emp
  union all
  select e.deptno, list || ', ' || ename, ename, length+1 from emp_lists el, emp e
where el.deptno = e.deptno
and e.ename > el.postfix
)
select deptno, list from emp_lists e
where length = (select max(length)

                from emp_lists ee
                where e.deptno = ee.deptno)

Watch out for pitfalls. It is very tempting to start with not the empty set but with the (lexicographically) minimal employee name per each department. This wouldn't work for departments with no employees.

The idea behind recursive SQL solution carries over to connect by solution

with concat_enames as (
  select deptno, sys_connect_by_path(ename,',') aggr, level depth   from emp e
  start with ename=(select min(ename) from emp ee

                    where e.deptno=ee.deptno)
  connect by ename > prior ename and deptno = prior deptno ) select deptno, aggr from concat_enames e where depth=(select max(depth) from concat_enames ee

             where ee.deptno = e.deptno);

Next, there go various methods leveraging collections CREATE or replace TYPE strings AS TABLE OF VARCHAR2(100); /

CREATE or replace Function CONCAT_LIST ( lst IN strings )   RETURN VARCHAR2 AS
    ret varchar2(1000);
BEGIN
    FOR j IN 1..lst.LAST LOOP

        ret := ret || lst(j);
    END LOOP;
    RETURN ret;
END;
/

SELECT deptno,

        CONCAT_LIST(
          CAST(MULTISET(
             SELECT ename||',' FROM EMP ee WHERE e.deptno=ee.deptno )
        AS strings)) empls

FROM emp e
group by deptno;

including the one with a little bit cleaner syntax

SELECT deptno,

       CONCAT_LIST(CAST( COLLECT(ename) AS strings )) empls FROM emp
group by deptno;

Admittedly, the function concatenating values in a collection adds a little bit procedural taste to. Again, such general purpose function shouldn't necessarily be exposed as a part of the solution at all. It is obvious that the CONCAT_LIST function should already be in the RDBMS library.

<************end of exerpt*****************>

You can generate the paths from nested sets as follows:

create table input (

   left integer,
   right integer,
   name varchar2(10)
);

insert into input
select 1 left, 10 right, 'A' name from dual union
select 2, 3, 'C' from dual
union
select 4, 5, 'B' from dual
;

SELECT ii.left

       ,CONCAT_LIST(CAST( COLLECT('.'||i.name) AS strings )) path FROM input i, input ii
where ii.left between i.left and i.right group by ii.left;

 LEFT PATH

 1   .A  
 2   .A.C  
 4   .A.B
Received on Fri Sep 30 2005 - 23:34:19 CEST

Original text of this message