Home » SQL & PL/SQL » SQL & PL/SQL » Search for value withing hierarchy paths
Search for value withing hierarchy paths [message #228967] Wed, 04 April 2007 13:52 Go to next message
Ivan
Messages: 180
Registered: June 2000
Senior Member
Greetings!

I have a requirement to find ALL hierarchy paths that reference a given member.

For example, given the following diagram:
        100
      /  /\
    101 / 102
   /  \/ /
  103 104   
      /
     105

I need to find all paths that use value 104, no mater if it's at the root level, somewhere in the middle, or at the end.

The definition of the path is beyond (or so I think) of what the SYS_CONNECT_BY_PATH function does. For the diagram above I'd have the following links and paths:
link_id child_id parent_id
------- -------- ---------
      1      100
      2      101       100
      3      102       100
      4      103       101
      5      104       101
      6      104       100
      7      104       102
      8      105       104

link_id path
------- ------------
      1 
      2 :100
      3 :100
      4 :100:101
      5 :100:101
      6 :100
      7 :100:102
      8 :100:101:104
      8 :100:104
      8 :100:102:104

Notice the different paths for combination 105-104 (last 3 records, for link_id 8 ).

Here's the code to create and load the hierarchy tables:
create table links
(link_id   int not null
,child_id  int not null
,parent_id int
);

insert into links values (1, 100, null);
insert into links values (2, 101, 100);
insert into links values (3, 102, 100);
insert into links values (4, 103, 101);
insert into links values (5, 104, 101);
insert into links values (6, 104, 100);
insert into links values (7, 104, 102);
insert into links values (8, 105, 104);

create table paths
(link_id int not null
,path    varchar2 (4000)
);

insert into paths values (1, '');
insert into paths values (2, ':100');
insert into paths values (3, ':100');
insert into paths values (4, ':100:101');
insert into paths values (5, ':100:101');
insert into paths values (6, ':100');
insert into paths values (7, ':100:102');
insert into paths values (8, ':100:101:104');
insert into paths values (8, ':100:104');
insert into paths values (8, ':100:102:104');

commit;

As you can see, even though the link only references the child_id and its immediate parent_id, the path value has to show all parent members, starting from the "top most" parent.

I guess in order to build the path value I'd have to come up with a recursive (since multiple paths are possible for the same link) procedure.

But the bigger problem here is finding all those paths, that reference a given member. With the current structure and design we're doing a full table scan on the paths table
 where path like :id || ':%'
    or path like '%:' || :id || ':%'

Also, in case you think there's a better approach to this, please... you're more than welcome!

Thank you beforehand.

[Updated on: Wed, 04 April 2007 13:57]

Report message to a moderator

Re: Search for value withing hierarchy paths [message #228973 is a reply to message #228967] Wed, 04 April 2007 14:12 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
sys_connect_by_path function is the answer.

Regards
Michel

Re: Search for value withing hierarchy paths [message #228975 is a reply to message #228973] Wed, 04 April 2007 14:20 Go to previous messageGo to next message
Ivan
Messages: 180
Registered: June 2000
Senior Member
Thanks Michel,
But I've already mentioned that sys_connect_by_path doesn't cover my requirement completely.
Re: Search for value withing hierarchy paths [message #228976 is a reply to message #228975] Wed, 04 April 2007 14:30 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I don't understand why but if you say so...

Regards
Michel
Re: Search for value withing hierarchy paths [message #228980 is a reply to message #228976] Wed, 04 April 2007 14:47 Go to previous messageGo to next message
Ivan
Messages: 180
Registered: June 2000
Senior Member
Michel,
This topic is not about using sys_connect_by_path.
It's about searching through it's (if using it) results.

Also,
Examples usually prove the point. By pointing people to documentation, especially when you can clearly see that they are aware of the subject you're pointing to, is meaningless (not to mention the waste of effort), unless you're simply trying to improve the count of messages you responded to (and "helped").

Again,
Thank you for your time.
Your point was well taken.
Re: Search for value withing hierarchy paths [message #228993 is a reply to message #228967] Wed, 04 April 2007 21:56 Go to previous messageGo to next message
vadimtro
Messages: 8
Registered: December 2006
Junior Member
It might help if you'll use standard terms. A hierarchy is formally a tree. You have a problem on graphs.

Admittedly, the "connect by" syntax has been designed for hierarchies, so it looks a little bit awkward when applied to graphs. Assuming that you have a graph defined as a list of edges:

table Edges (
tail integer,
head integer
);

the query that lists all the edges on a path from the "startNode" to the "endNode" is:

select connect_by_root(tail) startNode,
head endNode,
sys_connect_by_path('['||tail||','||head||'>',',')
from Edges
connect by prior tail = head

Note that the first argument for the sys_connect_by_path is very confusing. I never have been able to remember if I have to plug in the tail, or the head node into it. So I just plug in both! Think of the edges as little arrows that connect tail node to the head. This is why it is intuitive to decorate the first argument for sys_connect_by_path with those funny "[...,...>" brackets.





Re: Search for value withing hierarchy paths [message #228994 is a reply to message #228980] Wed, 04 April 2007 22:28 Go to previous messageGo to next message
rleishman
Messages: 3724
Registered: October 2005
Location: Melbourne, Australia
Senior Member
It looks like you don't have a pure tree structure, so CONNECT BY on v9i or earlier will detect loops and raise an error. 10g would work though.

Using 10g, you could generate the PATHS table with:
SELECT link_id, SUBSTR(SYS_CONNECT_BY_PATH(parent_id, ':'),2)
FROM links
CONNECT BY parent_id = PRIOR child_id
START WITH parent_id IS NULL
order by 1


You shouldn't need the PATHS table though, you might be able to get by with LINKS.

For a given ID, you can find all of the links involved in all of the paths to :id by starting at the required link and searching upwards:
SELECT DISTINCT link_id, child_id, parent_id
FROM   links
CONNECT BY child_id = PRIOR parent_id
START with child_id = :id


You could use SYS_CONNECT_BY_PATH(parent_id, ':') and filter out the partial paths to get the paths from :id to the root using this SQL, but the links in the path will be in reverse order. You could easily write a PL/SQL function to reverse them.

Alternatively, now that you have a minimal subset of the LINKS table, you can start at the top and search back down:

SELECT pth
FROM (
  SELECT substr(SYS_CONNECT_BY_PATH(parent_id, ':'),2) pth, child_id
  FROM (
    SELECT DISTINCT link_id, child_id, parent_id
    FROM   links
    CONNECT BY child_id = PRIOR parent_id
    START with child_id = :id
  )
  START WITH parent_id IS NULL
  CONNECT BY parent_id = PRIOR child_id
)
WHERE child_id = :id


Ross Leishman
Re: Search for value withing hierarchy paths [message #229013 is a reply to message #228980] Thu, 05 April 2007 01:06 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Ivan,

I don't care about post count.
I'm just trying to help.
When I pointed to sys_connect_by_path I thought you'd then follow the associated topics: Hierarchical queries, then CONNECT BY, CONNECT_BY_ROOT, NOCYCLE, CONNECT_BY_ISCYCLE...
And when I pointed to sys_connect_by_path I didn't mean you don't have to use other functions like SUBSTR or INSTR to build the final result.

Regards
Michel
Re: Search for value withing hierarchy paths [message #229157 is a reply to message #228994] Thu, 05 April 2007 10:12 Go to previous messageGo to next message
Ivan
Messages: 180
Registered: June 2000
Senior Member
Ross,
Thank you very much for your explanation, and examples.

I'm using Oracle 9i, and querying the entire tree doesn't detect loops (because there aren't any).

The last query you showed, will only display paths ending with child_id 104
10:29:40 SQL> col pth form a60
10:29:41 SQL> select *
10:29:45   2    from (
10:29:45   3         select child_id
10:29:45   4              , parent_id
10:29:45   5              , substr (sys_connect_by_path (parent_id, ':'), 2) pth
10:29:45   6           from (
10:29:45   7                select distinct link_id
10:29:45   8                     , child_id
10:29:45   9                     , parent_id
10:29:45  10                  from links
10:29:45  11                connect by child_id = prior parent_id
10:29:45  12                  start with child_id = 104
10:29:45  13                )
10:29:45  14         connect by parent_id = prior child_id
10:29:45  15           start with parent_id is null
10:29:45  16         )
10:29:45  17   where child_id = 104
10:29:45  18  /

  CHILD_ID  PARENT_ID PTH
---------- ---------- ------------------------------------------------------------
       104        101 :100:101
       104        102 :100:102
       104        100 :100

Elapsed: 00:00:00.01
10:29:45 SQL>

However, I need the query to also return the record(s) where child_id=105 and parent_id=104, since that link's paths also contain member 104.
I accomplished that by adding an additional query, for the same ID as a parent, and combined the results using UNION ALL, as follows:
10:29:45 SQL> select *
10:43:05   2    from (
10:43:05   3         select child_id
10:43:05   4              , parent_id
10:43:05   5              , substr (sys_connect_by_path (parent_id, ':'), 2) pth
10:43:05   6           from (
10:43:05   7                select distinct link_id
10:43:05   8                     , child_id
10:43:05   9                     , parent_id
10:43:05  10                  from links
10:43:05  11                connect by child_id = prior parent_id
10:43:05  12                  start with child_id = 104
10:43:05  13                )
10:43:05  14         connect by parent_id = prior child_id
10:43:05  15           start with parent_id is null
10:43:05  16         )
10:43:05  17   where child_id = 104
10:43:05  18  union all
10:43:05  19  select *
10:43:05  20    from (
10:43:05  21         select child_id
10:43:05  22              , parent_id
10:43:05  23              , substr (sys_connect_by_path (parent_id, ':'), 2) pth
10:43:05  24           from (
10:43:05  25                select distinct link_id
10:43:05  26                     , child_id
10:43:05  27                     , parent_id
10:43:05  28                  from links
10:43:05  29                connect by child_id = prior parent_id
10:43:05  30                  start with parent_id = 104
10:43:05  31                )
10:43:05  32         connect by parent_id = prior child_id
10:43:05  33           start with parent_id is null
10:43:05  34         )
10:43:05  35   where parent_id = 104
10:43:05  36  /

  CHILD_ID  PARENT_ID PTH
---------- ---------- ------------------------------------------------------------
       104        101 :100:101
       104        102 :100:102
       104        100 :100
       105        104 :100:101:104
       105        104 :100:102:104
       105        104 :100:104

6 rows selected.

Elapsed: 00:00:00.01
10:43:06 SQL>

I now have all the results I needed, and I thank you for that!

The only problem I having (still) is the full table scan. Here's the plan of the inner query:
10:52:25 SQL> set autotrace on explain
10:53:02 SQL>               select distinct link_id
10:55:58   2                     , child_id
10:55:58   3                     , parent_id
10:55:58   4                  from links
10:55:58   5                connect by child_id = prior parent_id
10:55:58   6                  start with child_id = 104
10:55:58   7  /

   LINK_ID   CHILD_ID  PARENT_ID
---------- ---------- ----------
         1        100
         2        101        100
         3        102        100
         5        104        101
         6        104        100
         7        104        102

6 rows selected.

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
          0
SELECT STATEMENT Optimizer=ALL_ROWS (Cost=18 Card=1 Bytes=16)


          1                  0
  SORT (UNIQUE) (Cost=18 Card=1 Bytes=16)


          2                  1
    CONNECT BY (WITH FILTERING)


          3                  2
      TABLE ACCESS (BY INDEX ROWID) OF 'LINKS'


          4                  3
        INDEX (RANGE SCAN) OF 'IDX_LINKS_1' (NON-UNIQUE) (Cost=1 Card=1 Bytes=6)


          5                  2
      NESTED LOOPS


          6                  5
        BUFFER (SORT)


          7                  6
          CONNECT BY PUMP


          8                  5
        TABLE ACCESS (BY INDEX ROWID) OF 'LINKS' (Cost=2 Card=1 Bytes=16)


          9                  8
          INDEX (RANGE SCAN) OF 'IDX_LINKS_1' (NON-UNIQUE) (Cost=1 Card=1)


         10                  2
      TABLE ACCESS (FULL) OF 'LINKS'

Notice the TABLE ACCESS (FULL) OF 'LINKS' at the very end.

Any suggestions on why a full table scan was used?
The table has about 100,000 records, and both, child_id and parent_id columns are indexed. Statistics are current.

Again,
Thank you for taking the time to help on this matter.

Regards,
Ivan.
Re: Search for value withing hierarchy paths [message #229164 is a reply to message #229157] Thu, 05 April 2007 11:01 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Ivan,

I'm glad to see you finally got a solution with sys_connect_by_path.

Regards
Michel
Re: Search for value withing hierarchy paths [message #229175 is a reply to message #229157] Thu, 05 April 2007 12:12 Go to previous messageGo to next message
vadimtro
Messages: 8
Registered: December 2006
Junior Member
This full table scan is bogus -- it doesn't do anything. It is there for some legacy reason (to perform some kind of initialization). You can witness it by getting execution statistics on rowsource level (v$sql_plan_statistics_all) and see the buffer gets and execution time is 0 at this rowsource node.
Re: Search for value withing hierarchy paths [message #229372 is a reply to message #229175] Sat, 07 April 2007 01:11 Go to previous messageGo to next message
rleishman
Messages: 3724
Registered: October 2005
Location: Melbourne, Australia
Senior Member
I agree with @vadimtro, the third child of a CONNECT BY is bogus.

Ross Leishman
Re: Search for value withing hierarchy paths [message #229379 is a reply to message #228980] Sat, 07 April 2007 01:38 Go to previous message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I think still don't understand why the following does not fit your criteria.
Maybe you should post the result you want for 104 or 105 with your data.
SQL> def node=104;
SQL> with
  2    all_paths as (
  3      select sys_connect_by_path(child_id, ':')||':' path
  4      from links
  5      connect by parent_id = prior child_id
  6      start with parent_id is null
  7    )
  8  select path
  9  from all_paths
 10  where path like '%:&node.:%'
 11  /
PATH
--------------------------------------------------
:100:101:104:
:100:101:104:105:
:100:102:104:
:100:102:104:105:
:100:104:
:100:104:105:

6 rows selected.

SQL> def node=105
SQL> /
PATH
--------------------------------------------------
:100:101:104:105:
:100:102:104:105:
:100:104:105:

3 rows selected.

Regards
Michel
Previous Topic: log error feature in 10g
Next Topic: How to Update multiple rows dynamiccally
Goto Forum:
  


Current Time: Thu Dec 08 18:46:16 CST 2016

Total time taken to generate the page: 0.10783 seconds