Re: Q: How do you do this in SQL ?

From: S.Victor <_at_concentric.net>
Date: 1996/08/22
Message-ID: <4vhoes$6oj_at_herald.concentric.net>


Roland Mueller wrote:
>
> Hi Satish,
>
> I have solved the problem with SPL using recursive calls:
>
> CREATE PROCEDURE expand_tree1(start_here INTEGER)
> RETURNING INTEGER,CHAR(1),CHAR(15),INTEGER,INTEGER;
> DEFINE m_id INTEGER;
> DEFINE m_kc CHAR(1);
> DEFINE m_name CHAR(15);
> DEFINE m_parent INTEGER;
> DEFINE m_lev INTEGER;
>
> FOREACH WITH HOLD
> SELECT id,kc,name,parent
> INTO m_id,m_kc,m_name,m_parent
> FROM keywordclass
> WHERE id = start_here
>
> RETURN m_id,m_kc,m_name,m_parent,0
> WITH RESUME;
>
> FOREACH EXECUTE PROCEDURE expand_tree2(m_id,1)
> INTO m_id,m_kc,m_name,m_parent,m_lev
>
> RETURN m_id,m_kc,m_name,m_parent,m_lev
> WITH RESUME;
> END FOREACH;
>
> END FOREACH;
> END PROCEDURE;
>
> CREATE PROCEDURE expand_tree2(start_here INTEGER,
> lev INTEGER) RETURNING
> INTEGER,CHAR(1),CHAR(15),INTEGER,INTEGER;
> DEFINE m_id INTEGER;
> DEFINE m_kc CHAR(1);
> DEFINE m_name CHAR(15);
> DEFINE m_parent INTEGER;
> DEFINE m_lev INTEGER;
>
> FOREACH WITH HOLD
> SELECT id,kc,name,parent
> INTO m_id,m_kc,m_name,m_parent
> FROM keywordclass
> WHERE parent = start_here
>
> RETURN m_id,m_kc,m_name,m_parent,lev
> WITH RESUME;
>
> FOREACH EXECUTE PROCEDURE expand_tree2(m_id,lev+1)
> INTO m_id,m_kc,m_name,m_parent,m_lev
>
> RETURN m_id,m_kc,m_name,m_parent,m_lev
> WITH RESUME;
> END FOREACH;
>
> END FOREACH;
> END PROCEDURE;
>
> The first procedure queries all rows with the starting point, then
> invokes the second stored procedure with the returned id as parent.
> The 2nd procedure now calls itself until the tree is completely
> expanded.
>
> I wrote this proc for another table with a similar structure than
> yours and changed the names to fit for your needs, I hope I made
> not typos.
>
> Sincerely,
> Roland
>
> Satish Katiyar wrote:
> >
> > I should know SQL better but for now can you please help me ?
> >
> > I have a table
> >
> > create table keywordclass (
> > Id integer,
> > KC char(1), // keyword or class ?
> > Name char(15),
> > Parent integer // refers to Id
> > );
> >
> > It stores keyword hierarchies. Sample data is
> >
> > 1, 'C', 'Living', 1
> > 2, 'C', 'Animal', 1
> > 3, 'C', 'Plant', 1
> > 4, 'K', 'Tiger', 2
> > 5, 'C', 'Bird', 2
> > 6, 'K', 'Parrot', 5
> >
> > How do you write a query in SQL that will retrieve the full
> > keyword hierarchy (tree) ? (Assuming that names are unique).
> > For example, retrieve the tree for "Animal", should return
> >
> > 2, 'C', 'Animal', 1
> > 4, 'K', 'Tiger', 2
> > 5, 'C', 'Bird', 2
> > 6, 'K', 'Parrot', 5
> >
> > What will be the query if keywords and class names could be
> > the same and duplicate ?
> >
> > Thanks a lot for your time. I would appreciate if you could
> > send your reply as email.
> >
> > Satish
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> > Satish Katiyar Email : katiyar_at_informix.com
> > Informix Software, Inc. Work : (510) 873 - 8494
> > 1111 Broadway, Suite 2000 Fax : (510) 873 - 6275
> > Oakland, California 94607 Home : (510) 769 - 6953
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

I saw this in MS Sql Server Online Manual.

Hope this helps.

Expanding Hierarchies


Databases often store hierarchical information. The following data, for example, is a hierarchical representation of regions of the world:

Parent	Child
-------------	-------------
World	Europe
World	North America
Europe	France
France	Paris
North America	United States
North America	Canada
United States	New York
United States	Washington
New York	New York City
Washington	Redmond

This representation does not clearly show the structure implied by the data. The following example is easier to interpret: World

	North America
		Canada
		United States
			Washington
				Redmond
			New York
				New York City
	Europe
		France
			Paris

The following Transact-SQL procedure expands an encoded hierarchy to any arbitrary depth. Although Transact-SQL supports recursion, it is more efficient to use a temporary table as a stack to keep track of all of the items for which processing has begun but is not complete. When processing is complete for a particular item, it is removed from the stack. New items are added to the stack as they are identified. (Note that this example is for illustration only and it does not come from the pubs database.)
create proc expand (_at_current char(20)) as set nocount on
declare _at_level int, @line char(20)
create table #stack (item char(20), level int) insert into #stack values (_at_current, 1) select _at_level = 1
while _at_level > 0
begin
	if exists (select * from #stack where level = _at_level)
		begin
			select _at_current = item
			from #stack
			where level = _at_level
			select _at_line = space(@level - 1)  @current
			print _at_line
			delete from #stack
			where level = _at_level
				and item = _at_current
			insert #stack
				select child, _at_level  1
				from hierarchy
				where parent = _at_current
			if _at__at_rowcount > 0
				select _at_level = @level  1
		end
	else
		select _at_level = @level - 1

end
In this example, the input parameter (_at_current) is used to indicate the place in the hierarchy to start. It also keeps track of the current item in the main loop.
The two local variables used are _at_level, which keeps track of the current level in the hierarchy, and _at_line, which is a work area used to construct the indented line.
The SET NOCOUNT ON statement avoids cluttering up the output with ROWCOUNT messages from each SELECT.
The temporary table, #stack, is created and primed with the item identifier of the starting point in the hierarchy, and _at_level is set to match. The level column in #stack allows the same item to appear at multiple levels in the database. Although this situation does not apply to the geographic data in the example, it can apply in other examples. In this example, when _at_level is greater than 0, the procedure follows several steps:
  1. If there are any items in the stack at the current level (_at_level), choose one and call it @current.
  2. Indent the item _at_level spaces, and then print the item.
  3. Delete the item from the stack so it won't be processed again, and then add all its child items to the stack at the next level (_at_level 1).

Note With a conventional programming language, you would have to find each child item and add it to the stack individually. With Transact-SQL, you can find all child items and add them with a single statement, avoiding another nested loop.

This is the only place where the hierarchy table (#stack) is used.

        4. If there are child items (IF _at__at_ROWCOUNT > 0), descend one level to process them (_at_level = @level 1); otherwise, continue processing at the current level.

        5. Finally, if there are no items on the stack awaiting processing at the current level, go back up one level to see if there are any awaiting processing at the previous level (_at_level = @level - 1). When there is no previous level, the expansion is complete.   

Expanding Networks


In the example in the preceding section, each item had only one superior. In a network, an item can have more than one superior. The following data, for example, is a representation of airline flights among a number of cities:

Departure	Destination
---------	-------------
Chicago	New York
Chicago	Milwaukee
Denver	Chicago
Seattle	Chicago
Seattle	Denver
Seattle	San Francisco

A common problem you can have with this data is finding all routes between a given pair of cities:
Itineraries

Seattle, Chicago, New York
Seattle, Denver, Chicago, New York
A few changes to the example in the preceding section can accomplish this:
· Two additional input parameters are required: the goal city and the depth-of-search limit.
· The current itinerary is saved in another temporary table and displayed only when a goal is reached.
· To avoid expanding around a cycle in the network, don't expand cities that appear in the current itinerary.

These changes are illustrated in the following example (not from pubs): create proc route
(_at_current char(20), @dest char(20), @maxlevel int = 5) as set nocount on
declare _at_level int
create table #stack (city char(20), level int) create table #list (city char(20), level int) insert #stack values (_at_current, 1)
select _at_level = 1
while _at_level > 0
begin

	if exists (select * from #stack where level = _at_level)
		begin
			select _at_current = city
			from #stack
			where level = _at_level
			delete from #stack 
			where level = _at_level 
				and city = _at_current
			delete from #list 
			where level >= _at_level
			if exists(select * from #list where city = 
_at_current)
				continue
			insert #list values(_at_current, @level)
			if(_at_current = @dest)
			begin
				select itinerary = city
				from #list
				continue
			end
			insert #stack
			select destination, _at_level  1
			from flights
			where departure = _at_current
				and _at_level < @maxlevel
			if _at__at_rowcount > 0
				select _at_level = @level  1
		end
	else
		select _at_level = @level - 1

end
In this example, when _at_level is greater than 0, the process follows several steps:
  1. The current city is added to #list by clearing anything at the current level or below (DELETE FROM #list WHERE level > = _at_level) and then adding the current city (INSERT #list VALUES(_at_current, @level)).
  2. Whenever the goal city is reached (_at_current = @dest), this example displays the path (SELECT itinerary = city FROM #list) and doesn't expand the path any further (CONTINUE).
  3. The depth of search is limited by adding a condition (_at_level < @maxlevel) to the INSERT statement that adds cities to the stack.

The IF EXISTS statement at the beginning of the WHILE loop skips the current city if it's already in the current itinerary. If the flights table in the preceding example also contains cost information, the lowest cost route can be found by saving the current itinerary if its total cost is less than the best cost so far: select _at_cost = sum(cost)
from #list
if _at_cost < @lowest_cost
begin

	_at_lowest_cost = @cost
	truncate table #best_route
	insert #best_route
		select *
		from #list

end
For greater efficiency, you can stop expanding the current route if the current cost far exceeds the cost of the best route: if (select sum(cost) from #list) > _at_lowest_cost

        continue
If the flights table also includes a departure and arrival time, you can add an IF statement to expand only the routes that have a departure time at least one hour after the arrival time of the current route: if ((select sum(cost) from #list) > _at_lowest_cost)

        and datediff(hh, departuretime, _at_arrivaltime) > 1) continue Received on Thu Aug 22 1996 - 00:00:00 CEST

Original text of this message