Home » SQL & PL/SQL » SQL & PL/SQL » XMLSEQUENCE and Group By
XMLSEQUENCE and Group By [message #261270] Wed, 22 August 2007 04:25 Go to next message
poring
Messages: 4
Registered: August 2007
Junior Member
Hello,
I am having trouble extracting data from an XML type, espacially when I need to do a group by.

Here is a simplified version of my XML data:

<File>

<Record>
<A>Something</A>
<B>
<C id="1">
<C id="2">
<C id="3">
</B>
</Record>

<Record>
<A>Something else</A>
<B>
<C id="1">
<C id="2">
</B>
</Record>

</File>



The content of the tag A is different for each record. I would like to count the number of C tags there are in the B tag of each record.

Here is the query I made:

SELECT
extract (value(p), '//A/text()', 'xmlns="myns"').getStringVal() AS A,
count(extract (value(p), '//B/C', 'xmlns="myns"').getStringVal()) AS C_count
FROM my_xml_table, TABLE(XMLSEQUENCE(extract (my_xml_column, '/File/Record', 'xmlns="myns"'))) p
GROUP BY A;


This doesn't work though. I get an error about the group by.
Any ideas? Thanks.
Re: XMLSEQUENCE and Group By [message #261296 is a reply to message #261270] Wed, 22 August 2007 05:08 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Try this:
create table xml_table (col_1 xmltype);

insert into xml_table values (xmltype('<File><Record><A>Something</A><B><C id="1"/><C id="2"/><C id="3"/></B></Record><Record><A>Something else</A><B><C id="1"/><C id="2"/></B></Record></File>'));

select extractvalue(value(p),'//A/text()') A
      ,(select count(*) from table(xmlsequence(extract(col_1,'/File/Record[A='''||extractvalue(value(p),'//A/text()')||''']/B/C')))) cnt      
from xml_table
    , table(xmlsequence(extract (col_1, '/File/Record'))) p;
Re: XMLSEQUENCE and Group By [message #261401 is a reply to message #261296] Wed, 22 August 2007 09:00 Go to previous messageGo to next message
poring
Messages: 4
Registered: August 2007
Junior Member
That works! Thank you very much!

Just one thing though. I'm being picky, but is it possible to make it faster? I have hundreads of "record" elements in my DB, and it takes about 10 minutes to execute this query... I assume it's because of the nested extract..?
Re: XMLSEQUENCE and Group By [message #261419 is a reply to message #261401] Wed, 22 August 2007 09:36 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Well, in an ideal world we'd be able to use the Xpath operator countto get to number of C nodes, but that's not an option.
I'll have a look and see what I can do.
Re: XMLSEQUENCE and Group By [message #261425 is a reply to message #261419] Wed, 22 August 2007 10:01 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Well, I've got a solution, but it stinks.
It relies on the fact that there is an implicit order to the rows returned from the XMLSEQUENCE.
select value(p)
      ,rownum rnum
from xml_table
    , table(xmlsequence(extract (col_1, '/File/Record/A|/File/Record/B/C'))) p;    

"VALUE(P)"	            "RNUM"
"<A>Something</A>"	    "1"
"<C id="1"/>"	        "2"
"<C id="2"/>"	        "3"
"<C id="3"/>"	        "4"
"<A>Something else</A>"	"5"
"<C id="1"/>"	        "6"
"<C id="2"/>"	        "7"

We can abuse this implicit order to get the number of C rows between each A row:
select distinct maxval
      ,max(rnum) over (partition by maxval order by rnum desc)
      -min(rnum) over (partition by maxval order by rnum)      records
from (select rnum
            ,max(val) over (order by rnum) maxval
      from (select extractvalue(value(p),'/A') val
                  ,rownum rnum
            from xml_table
          , table(xmlsequence(extract (col_1, '/File/Record/A|/File/Record/B/C'))) p)
          );

"MAXVAL"	        "RECORDS"
"Something"         "3"
"Something else"	"2"

It works, but it's not good.
I'll see if I can come up with something else
Re: XMLSEQUENCE and Group By [message #261622 is a reply to message #261425] Thu, 23 August 2007 02:40 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Scratch that - it works for te test data, but not in general.
Here's another, working, solution using the same very iffy approach:
select val
      ,nvl(lead(rnum) over (order by rnum),max_rnum) -rnum -1 C_Elements
from (
select val
      ,max(rnum) over(order by rnum desc)+1 max_rnum
      ,rnum
from (select extractvalue(value(p),'/A') val
            ,row_number() over (order by null) rnum
      from xml_table
          ,table(xmlsequence(extract (col_1, '/File/Record/A|/File/Record/B/C'))) p))
where val is not null          
;
icon14.gif  Re: XMLSEQUENCE and Group By [message #261674 is a reply to message #261622] Thu, 23 August 2007 05:32 Go to previous messageGo to next message
poring
Messages: 4
Registered: August 2007
Junior Member
You Sir are a hero!

I just tried it and it's much faster than the original, and seems to give the same results.

It is a fact that in my actual data, the tags equivalent to C are grouped together, so I guess that's why it works..

Thanks so much for your time and expertise! I would never have found this myself..
Re: XMLSEQUENCE and Group By [message #261693 is a reply to message #261674] Thu, 23 August 2007 06:23 Go to previous message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Here is another way of doing it.
This is slightly better, in that it doesn't rely on any implicit ordering, but it does have a maximum limit of Element C that it can deal with (100 in this case).

what we do here is join the core of the previous query to a row generator which returns a single column containing a value between 1 and 100.

For each of these values (i), we get the i'th element C for the current row.

All we have to do then is to see what the largest value of i for which we get a value back is.
select col_a
      ,max(l)
from (      
select extractvalue(value(p),'/A')  col_A
      ,l
      ,extract(col_1,'/File/Record[A='''||extractvalue(value(p),'/A')||''']/B/C['||l||']') col_c
from xml_table
    , table(xmlsequence(extract (col_1, '/File/Record/A'))) p
    ,(select level l from dual connect by level <= 100)
)
where col_c is not null
group by col_a
;    
Previous Topic: how to get the individual execution time of operation in an execution plan
Next Topic: Reset Value
Goto Forum:
  


Current Time: Mon Dec 05 08:38:50 CST 2016

Total time taken to generate the page: 0.08643 seconds