Home » Developer & Programmer » JDeveloper, Java & XML » Procedure to query an XML document (Oracle 11g)
Procedure to query an XML document [message #550954] Fri, 13 April 2012 16:13 Go to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Hi,

Iam using an oracle 11g database. I have an XML document named test_table.xml stored in table TEST_CLOB, Ex: TEST_CLOB(ID NUMBER, XML_DOC CLOB)

Here are the contents of test_table.xml file.

<xml version = "1.0"" ?>
<! DOCTYPE main [
<! Element main (DATA_RECORD*) >
<! COLUMN1?, COLUMN2?, COLUMN3, COLUMN4?, COLUMN5, COLUMN6?, GEOM?) +>
<! ELEEMENT COLUMN1 (#PCDATA)>
<! ELEEMENT COLUMN2 (#PCDATA)>
<! ELEEMENT COLUMN3 (#PCDATA)>
<! ELEEMENT COLUMN4 (#PCDATA)>
<! ELEEMENT COLUMN5 (#PCDATA)>
<! ELEEMENT COLUMN6 (#PCDATA)>
]>
<main>
<DATA_RECORD>
<COLUMN1>ARENAS</COLUMN1>
<COLUMN2>NBA</COLUMN2>
<COLUMN3>STAPLES CENTER</COLUMN3>
<COLUMN4>11.11</COLUMN4>
<COLUMN5>22.22</COLUMN5>
<COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>
<DATA_RECORD>
<COLUMN1>ARENAS</COLUMN1>
<COLUMN2>NBA</COLUMN2>
<COLUMN3>UNITED CENTER</COLUMN3>
<COLUMN4>33.33</COLUMN4>
<COLUMN5>44.44</COLUMN5>
<COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>
<DATA_RECORD>
<COLUMN1>ARENAS</COLUMN1>
<COLUMN2>NBA</COLUMN2>
<COLUMN3>GUND ARENA</COLUMN3>
<COLUMN4>55.55</COLUMN4>
<COLUMN5>66.66</COLUMN5>
<COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>
<DATA_RECORD>
<COLUMN1>ARENAS</COLUMN1>
<COLUMN2>NBA</COLUMN2>
<COLUMN3>CAPITAL ONE</COLUMN3>
<COLUMN4>77.77</COLUMN4>
<COLUMN5>88.88</COLUMN5>
<COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>
<DATA_RECORD>
<COLUMN1>ARENAS</COLUMN1>
<COLUMN2>NBA</COLUMN2>
<COLUMN3>us AIRWAYS</COLUMN3>
<COLUMN4>64.32</COLUMN4>
<COLUMN5>76.83</COLUMN5>
<COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>
</main>

I need a database procedure where I will pass in COLUMN1, COLUMN2 AND COLUMN3 values and it should return all the columns from the xml document. ie. ex: below is what I would like to query

select column1, column2, column3 column4 column5, column 6
from the TEST_TABLE.XML ---- XML DOCUMENT FILE NAME
WHERE column1 = 'ARENAS'
AND column2 = 'NBA'
AND column 3 = 'UNITED CENTER';


Your help is greatly appreciated.

Thanks

Re: Procedure to query an XML document [message #550955 is a reply to message #550954] Fri, 13 April 2012 16:14 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Please read and follow the forum guidelines, to enable us to help you:

http://www.orafaq.com/forum/t/88153/0/
Re: Procedure to query an XML document [message #550956 is a reply to message #550955] Fri, 13 April 2012 16:22 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
XML Parsing Error: not well-formed
Location: file:///home/bcm/testtable.xml
Line Number 1, Column 21:<xml version = "1.0"" ?>
--------------------^

where testtable.xml is a direct COPY & PASTE from your post.
Re: Procedure to query an XML document [message #550963 is a reply to message #550954] Fri, 13 April 2012 18:36 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
unfortunately I could not copy and paste since they are on different environments. and one machine I have posted it can not access the other environment so I had to type it
Re: Procedure to query an XML document [message #550965 is a reply to message #550963] Fri, 13 April 2012 22:56 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
You can either select directly from the file or from a column of a table that the file data has been inserted into. I have demonstrated both below. You should be able to figure out how to use either query within a procedure, passing the parameters and returning a ref cursor or whatever you want.

-- selecting directly from the file:
SCOTT@orcl_11gR2> create or replace directory my_dir as 'c:\my_oracle_files'
  2  /

Directory created.

SCOTT@orcl_11gR2> select *
  2  from   (select t.id,
  3  		    extractvalue (value (x), '//COLUMN1') column1,
  4  		    extractvalue (value (x), '//COLUMN2') column2,
  5  		    extractvalue (value (x), '//COLUMN3') column3,
  6  		    extractvalue (value (x), '//COLUMN4') column4,
  7  		    extractvalue (value (x), '//COLUMN5') column5
  8  	     from   test_clob t,
  9  		    table
 10  		      (xmlsequence
 11  			(extract
 12  			  (xmltype
 13  			    (bfilename ('MY_DIR', 'test_table.xml'),
 14  			     NLS_CHARSET_ID ('WE8MSWIN1252')),
 15  			 '//DATA_RECORD'))) x)
 16  where  column1 = 'ARENAS'
 17  and    column2 = 'NBA'
 18  and    column3 = 'UNITED CENTER'
 19  /

        ID COLUMN1    COLUMN2    COLUMN3         COLUMN4    COLUMN5
---------- ---------- ---------- --------------- ---------- ----------
         1 ARENAS     NBA        UNITED CENTER   33.33      44.44

1 row selected.


-- selecting from a clob column of a table that contains the data from the file:
SCOTT@orcl_11gR2> select *
  2  from   (select t.id,
  3  		    extractvalue (value (x), '//COLUMN1') column1,
  4  		    extractvalue (value (x), '//COLUMN2') column2,
  5  		    extractvalue (value (x), '//COLUMN3') column3,
  6  		    extractvalue (value (x), '//COLUMN4') column4,
  7  		    extractvalue (value (x), '//COLUMN5') column5
  8  	     from   test_clob t,
  9  		    table
 10  		      (xmlsequence
 11  			(extract
 12  			  (xmltype (t.xml_doc),
 13  			   '//DATA_RECORD'))) x)
 14  where  column1 = 'ARENAS'
 15  and    column2 = 'NBA'
 16  and    column3 = 'UNITED CENTER'
 17  /

        ID COLUMN1    COLUMN2    COLUMN3         COLUMN4    COLUMN5
---------- ---------- ---------- --------------- ---------- ----------
         1 ARENAS     NBA        UNITED CENTER   33.33      44.44

1 row selected.

Re: Procedure to query an XML document [message #551020 is a reply to message #550965] Sat, 14 April 2012 20:42 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Hi barbara,

I will try it on monday, just curious as to why you have created the directory 'c:\my_oracle_files' and where is it being referred?

Did you create that directory to store the XML file?

if so do we we have to register the oracle directory?

because there is an other post of mine on this forum where when I tried to insert XML file into a clob column using DBMS_LOB package it says it can not find the directory?

I would appreciate if you could list the steps required to save an XML file into a clob column.

As allways you are my life saver Smile

Thanks in advance.
Re: Procedure to query an XML document [message #551024 is a reply to message #551020] Sat, 14 April 2012 22:39 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
In the following example, the file test_table.xml is in the directory c:\my_oracle_files on my server. It is necessary to create an Oracle directory object that points to that directory path. The Oracle syntax "create or replace directory ..." creates an Oracle directory object, not an operating system directory. It must reference a valid operating system directory path. Depending on your operating system, that path may be case sensitive and the slashes may go one way or other other and there may or may not be a letter drive. It will accept anything for the directory path as it does not verify it until you try to use the directory object. If you create the Oracle directory object without double quotes, then it must be referenced in upper case within your code. Depending on your operating system, your file name may also be case sensitive. You must also ensure that Oracle has read access to the directory and the Oracle user has read privileges on the Oracle directory object.

SCOTT@orcl_11gR2> CREATE TABLE test_clob
  2    (id	 NUMBER,
  3  	xml_doc  CLOB)
  4  /

Table created.

SCOTT@orcl_11gR2> CREATE OR REPLACE DIRECTORY my_dir AS 'c:\my_oracle_files'
  2  /

Directory created.

SCOTT@orcl_11gR2> CREATE OR REPLACE PROCEDURE test_load_xml
  2    (p_directory IN VARCHAR2,
  3  	p_filename  IN VARCHAR2)
  4  AS
  5    v_dest_lob      CLOB;
  6    v_src_bfile     BFILE;
  7    v_dest_offset   NUMBER := 1;
  8    v_src_offset    NUMBER := 1;
  9    v_lang_ctx      NUMBER := DBMS_LOB.DEFAULT_LANG_CTX;
 10    v_warning       NUMBER;
 11  BEGIN
 12    v_src_bfile :=
 13  	 BFILENAME
 14  	   (DIRECTORY  => p_directory,
 15  	    FILENAME   => p_filename);
 16    INSERT INTO test_clob (id, xml_doc)
 17  	 VALUES (1, EMPTY_CLOB())
 18  	 RETURNING xml_doc INTO v_dest_lob;
 19    DBMS_LOB.OPEN
 20  	 (FILE_LOC     => v_src_bfile,
 21  	  OPEN_MODE    => DBMS_LOB.LOB_READONLY);
 22    DBMS_LOB.LOADCLOBFROMFILE
 23  	 (DEST_LOB     => v_dest_lob,
 24  	  SRC_BFILE    => v_src_bfile,
 25  	  AMOUNT       => DBMS_LOB.LOBMAXSIZE,
 26  	  DEST_OFFSET  => v_dest_offset,
 27  	  SRC_OFFSET   => v_src_offset,
 28  	  BFILE_CSID   => DBMS_LOB.DEFAULT_CSID,
 29  	  LANG_CONTEXT => v_lang_ctx,
 30  	  WARNING      => v_warning);
 31    DBMS_LOB.CLOSE
 32  	 (FILE_LOC     => v_src_bfile);
 33    COMMIT;
 34  END test_load_xml;
 35  /

Procedure created.

SCOTT@orcl_11gR2> SHOW ERRORS
No errors.
SCOTT@orcl_11gR2> EXEC test_load_xml ('MY_DIR', 'test_table.xml')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> SELECT id,
  2  	    XMLSERIALIZE (DOCUMENT XMLTYPE (xml_doc) INDENT) xml_doc
  3  FROM   test_clob
  4  /

        ID
----------
XML_DOC
--------------------------------------------------------------------------------
         1
<main>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>STAPLES CENTER</COLUMN3>
    <COLUMN4>11.11</COLUMN4>
    <COLUMN5>22.22</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>UNITED CENTER</COLUMN3>
    <COLUMN4>33.33</COLUMN4>
    <COLUMN5>44.44</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>GUND ARENA</COLUMN3>
    <COLUMN4>55.55</COLUMN4>
    <COLUMN5>66.66</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>CAPITAL ONE</COLUMN3>
    <COLUMN4>77.77</COLUMN4>
    <COLUMN5>88.88</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>us AIRWAYS</COLUMN3>
    <COLUMN4>64.32</COLUMN4>
    <COLUMN5>76.83</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
</main>


1 row selected.

SCOTT@orcl_11gR2> COLUMN column1 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column2 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column3 FORMAT a15
SCOTT@orcl_11gR2> COLUMN column4 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column5 FORMAT a10
SCOTT@orcl_11gR2> select *
  2  from   (select t.id,
  3  		    extractvalue (value (x), '//COLUMN1') column1,
  4  		    extractvalue (value (x), '//COLUMN2') column2,
  5  		    extractvalue (value (x), '//COLUMN3') column3,
  6  		    extractvalue (value (x), '//COLUMN4') column4,
  7  		    extractvalue (value (x), '//COLUMN5') column5
  8  	     from   test_clob t,
  9  		    table
 10  		      (xmlsequence
 11  			(extract
 12  			  (xmltype (t.xml_doc),
 13  			   '//DATA_RECORD'))) x)
 14  where  column1 = 'ARENAS'
 15  and    column2 = 'NBA'
 16  and    column3 = 'UNITED CENTER'
 17  /

        ID COLUMN1    COLUMN2    COLUMN3         COLUMN4    COLUMN5
---------- ---------- ---------- --------------- ---------- ----------
         1 ARENAS     NBA        UNITED CENTER   33.33      44.44

1 row selected.



Re: Procedure to query an XML document [message #551144 is a reply to message #551024] Sun, 15 April 2012 19:08 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Hi Barbara,

I get ORA-22288 error, when I execute the following command

sqlplus > exec test_load_xml('MY_DIR','test.xml');

The test.xml file is in directory c:\orac_files directory

how do we copy this file into c:\my_orcle_files directory since this is not an OS directory.

what are the steps I need to do after I create the directory MY_DIR and the test.xml file and before I execute the following command( i.e. should I copy the file test.xml file to MY_DIR? and how do I set the permissions on the directory to read?)

exec test_load_xml('MY_DIR', test.xml');


Please list the commands.

Again Thank you very much for your prompt reply.

Re: Procedure to query an XML document [message #551148 is a reply to message #551144] Sun, 15 April 2012 20:38 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
The directory path was where the file was on my server operating system. If the file is in your operating system directory c:\orac_files on your server, then you need to use:

CREATE OR REPLACE DIRECTORY my_dir AS 'c:\orac_files';

Then you can use:

exec test_load_xml('MY_DIR', test.xml')

Everything else would be the same.
Re: Procedure to query an XML document [message #551245 is a reply to message #551148] Mon, 16 April 2012 11:13 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Thanks! it worked. I was missing the file extension. ie. I was using only 'test' instead of 'test.xml'

Thanks you again. You have been very helpful.
Re: Procedure to query an XML document [message #551274 is a reply to message #551245] Mon, 16 April 2012 14:52 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Hi Barbara,

Since we know the column names of the xml file ( i.e COLUMN1, COLUMN2, COLUMN3 etc) we could hard code it in the EXTRACTVALUE statement of the query.

Question: What if we do not know the column names of the XML document how do we query the XML document?

Could you please provide some information?

Thanks
Re: Procedure to query an XML document [message #551278 is a reply to message #551274] Mon, 16 April 2012 15:48 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
What sort of query would you expect to be able to write without giving column names? Can you give an example? What data would you be trying to search for and what results would you want returned?

Re: Procedure to query an XML document [message #551279 is a reply to message #551278] Mon, 16 April 2012 16:00 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Barbara,

What I meant was, I would like to retrieve the column names from the XML file and then construct the query based on the retrieved columns names.

I have to read xml files from a table, and I do not know the columns names, I wanted to construct a parsing query which would retrieve the column names from the xml file and use it in the select statement.

Hope Iam clear

Thanks
Re: Procedure to query an XML document [message #551284 is a reply to message #551279] Mon, 16 April 2012 17:33 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
If you have this:

SCOTT@orcl_11gR2> SELECT id,
  2  	    XMLSERIALIZE (DOCUMENT XMLTYPE (xml_doc) INDENT) xml_doc
  3  FROM   test_clob
  4  /

        ID
----------
XML_DOC
--------------------------------------------------------------------------------
         1
<main>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>STAPLES CENTER</COLUMN3>
    <COLUMN4>11.11</COLUMN4>
    <COLUMN5>22.22</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>UNITED CENTER</COLUMN3>
    <COLUMN4>33.33</COLUMN4>
    <COLUMN5>44.44</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>GUND ARENA</COLUMN3>
    <COLUMN4>55.55</COLUMN4>
    <COLUMN5>66.66</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>CAPITAL ONE</COLUMN3>
    <COLUMN4>77.77</COLUMN4>
    <COLUMN5>88.88</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>us AIRWAYS</COLUMN3>
    <COLUMN4>64.32</COLUMN4>
    <COLUMN5>76.83</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
</main>


1 row selected.


Then you can get the column names like this:

SCOTT@orcl_11gR2> select distinct x.column_value.getrootelement() column_names
  2  from   test_clob t,
  3  	    table
  4  	      (xmlsequence
  5  		(extract
  6  		  (xmltype (t.xml_doc),
  7  		   '//DATA_RECORD/node()'))) x
  8  order  by column_names
  9  /

COLUMN_NAMES
--------------------------------------------------------------------------------
COLUMN1
COLUMN2
COLUMN3
COLUMN4
COLUMN5
COLUMN6

6 rows selected.

Re: Procedure to query an XML document [message #551320 is a reply to message #551284] Tue, 17 April 2012 03:49 Go to previous messageGo to next message
_jum
Messages: 577
Registered: February 2008
Senior Member
In ORCALE 11 you can use XQuery function fn:local-name instead of PL/SQL XMLType method getRootElement():
WITH xdata AS
 (SELECT XMLTYPE('
<main>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>STAPLES CENTER</COLUMN3>
    <COLUMN4>11.11</COLUMN4>
    <COLUMN5>22.22</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>UNITED CENTER</COLUMN3>
    <COLUMN4>33.33</COLUMN4>
    <COLUMN5>44.44</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>GUND ARENA</COLUMN3>
    <COLUMN4>55.55</COLUMN4>
    <COLUMN5>66.66</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
</main> ') xd FROM dual)
SELECT col_pos,
       col_nam
  FROM xdata, 
       XMLTABLE('//*' 
       PASSING xdata.xd COLUMNS  
         col_pos FOR ORDINALITY, 
         col_nam VARCHAR(30) PATH 'fn:local-name(.)');

   COL_POS COL_NAM                       
---------- ------------------------------
         1 main                          
         2 DATA_RECORD                   
         3 COLUMN1                       
         4 COLUMN2                       
         5 COLUMN3                       
         6 COLUMN4                       
         7 COLUMN5                       
         8 COLUMN6                       
         9 DATA_RECORD                   
        10 COLUMN1                       
        11 COLUMN2                       
        12 COLUMN3                       
        13 COLUMN4                       
        14 COLUMN5                       
        15 COLUMN6                       
        16 DATA_RECORD                   
        17 COLUMN1                       
        18 COLUMN2                       
        19 COLUMN3                       
        20 COLUMN4                       
        21 COLUMN5                       
        22 COLUMN6                       
         
Re: Procedure to query an XML document [message #551351 is a reply to message #551284] Tue, 17 April 2012 09:42 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Barbara,

Thank you for sending the query to get column names, I guess I should have been more explicit, how can we combine both the queries into one i.e column names and the other query(column1 = ARENAS, COLUMN2 = 'NBA' and column3 = 'UNITED CENTER')

Thanks
Re: Procedure to query an XML document [message #551358 is a reply to message #551351] Tue, 17 April 2012 10:55 Go to previous messageGo to next message
_jum
Messages: 577
Registered: February 2008
Senior Member
You could use the other XQuery functions:

WITH xdata AS
 (SELECT XMLTYPE('
<main>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>STAPLES CENTER</COLUMN3>
    <COLUMN4>11.11</COLUMN4>
    <COLUMN5>22.22</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>UNITED CENTER</COLUMN3>
    <COLUMN4>33.33</COLUMN4>
    <COLUMN5>44.44</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>GUND ARENA</COLUMN3>
    <COLUMN4>55.55</COLUMN4>
    <COLUMN5>66.66</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
</main> ') xd FROM dual)
SELECT col_pos,
       col_nam,
       col_val
  FROM xdata, 
       XMLTABLE('//DATA_RECORD/*' 
       PASSING xdata.xd COLUMNS  
         col_pos FOR ORDINALITY, 
         col_val VARCHAR2(200) PATH 'fn:root()',
         col_nam VARCHAR(30)   PATH 'fn:local-name(.)');

COL_POS COL_NAM COL_VAL                     
----------------------------------------                         
1	COLUMN1	ARENAS
2	COLUMN2	NBA
3	COLUMN3	STAPLES CENTER
4	COLUMN4	11.11
5	COLUMN5	22.22
6	COLUMN6	2012/04/02
7	COLUMN1	ARENAS
8	COLUMN2	NBA
9	COLUMN3	UNITED CENTER
10	COLUMN4	33.33
11	COLUMN5	44.44
12	COLUMN6	2012/04/02
13	COLUMN1	ARENAS
14	COLUMN2	NBA
15	COLUMN3	GUND ARENA
16	COLUMN4	55.55
17	COLUMN5	66.66
18	COLUMN6	2012/04/02

Re: Procedure to query an XML document [message #551366 is a reply to message #551358] Tue, 17 April 2012 11:52 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Barbara,

The query is great, unfortunately my xml documents have thousands of rows so it is not easy to build this query on those huge documents. Does any other approach crosses your mind?

Thanks
- Shekar
Re: Procedure to query an XML document [message #551369 is a reply to message #551366] Tue, 17 April 2012 12:32 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
gentleman777us wrote on Tue, 17 April 2012 09:52
Barbara,

The query is great, unfortunately my xml documents have thousands of rows so it is not easy to build this query on those huge documents. Does any other approach crosses your mind?

Thanks
- Shekar


The last query was provided by jum, not me, and it looks good. I don't see a problem. The WITH clause is just to simulate your data. If your data is in a table, then the whole query would just be the part below. The only thing that would be different for different tables would be the table name (xdata) in the from clause.

SELECT col_pos,
       col_nam,
       col_val
  FROM xdata, 
       XMLTABLE('//DATA_RECORD/*' 
       PASSING xdata.xd COLUMNS  
         col_pos FOR ORDINALITY, 
         col_val VARCHAR2(200) PATH 'fn:root()',
         col_nam VARCHAR(30)   PATH 'fn:local-name(.)');

Re: Procedure to query an XML document [message #551374 is a reply to message #551351] Tue, 17 April 2012 13:28 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
gentleman777us wrote on Tue, 17 April 2012 07:42


Thank you for sending the query to get column names, I guess I should have been more explicit, how can we combine both the queries into one i.e column names and the other query(column1 = ARENAS, COLUMN2 = 'NBA' and column3 = 'UNITED CENTER')


SCOTT@orcl_11gR2> VARIABLE g_ref REFCURSOR
SCOTT@orcl_11gR2> DECLARE
  2    v_sql  CLOB;
  3  BEGIN
  4    v_sql := 'select * from (select t.id';
  5    FOR r IN
  6  	 (select distinct x.column_value.getrootelement() column_name
  7  	  from	 test_clob t,
  8  		 table
  9  		   (xmlsequence
 10  		     (extract
 11  		       (xmltype (t.xml_doc),
 12  			'//DATA_RECORD/node()'))) x
 13  	  order  by column_name)
 14    LOOP
 15  	 v_sql := v_sql || ', extractvalue (value (x), ''//'
 16  	       || r.column_name || ''') "' || r.column_name || '"';
 17    END LOOP;
 18    v_sql := v_sql ||
 19  	     ' from   test_clob t,
 20  		    table
 21  		      (xmlsequence
 22  			(extract
 23  			  (xmltype (t.xml_doc),
 24  			   ''//DATA_RECORD''))) x)
 25  	       where  "COLUMN1" = ''ARENAS''
 26  	       and    "COLUMN2" = ''NBA''
 27  	       and    "COLUMN3" = ''UNITED CENTER''';
 28    OPEN :g_ref FOR v_sql;
 29  END;
 30  /

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> COLUMN column1 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column2 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column3 FORMAT a15
SCOTT@orcl_11gR2> COLUMN column4 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column5 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column6 FORMAT a10
SCOTT@orcl_11gR2> PRINT g_ref

        ID COLUMN1    COLUMN2    COLUMN3         COLUMN4    COLUMN5    COLUMN6
---------- ---------- ---------- --------------- ---------- ---------- ----------
         1 ARENAS     NBA        UNITED CENTER   33.33      44.44      2012/04/02

1 row selected.

Re: Procedure to query an XML document [message #551528 is a reply to message #551374] Wed, 18 April 2012 16:28 Go to previous messageGo to next message
gentleman777us
Messages: 122
Registered: April 2005
Senior Member
Barbara,

Thank you very much, This is exactly what I wanted.

I performed a query on an xml document with 7000 records.

it took about 3 minutes to retrieve one row of information,

Do you know of any way to improve the performance? Also do we have to close the file explicitly or release the memory? or something since second time the same query took over 10 minutes to return.

I appreciate all your efforts.


Regards
Re: Procedure to query an XML document [message #551530 is a reply to message #551528] Wed, 18 April 2012 19:12 Go to previous message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
My understanding of the problem, based on this and some of your other posts, is this:

You have a bunch of files containing xml data of the structure:

<main>
<DATA_RECORD>
...
</DATA_RECORD>
</main>

where there are multiple DATA_RECORD containing various tags of unknown names, except that you seem to know that there is a COLUMN1 and COLUMN2 and COLUMN3.

You have loaded these files into a clob column in a table, one file per row, with another column that is a unique id. When you load these files, you open each file, load it, then close it. The data is then in the table and there is no access to the files after that, so no further opening or closing is necessary.

You want to query the clob data in the table, returning all of the tags as columns, without knowing the names of the columns, except that you want to filter by certain values in COLUMN1 and COLUMN2 and COLUMN3. You have a query that does that, but it is taking 3 minutes to run, so you want to make it faster.

The following represents sample data from one row (one file) after loading into your table:

SCOTT@orcl_11gR2> SELECT id,
  2  	    XMLSERIALIZE (DOCUMENT XMLTYPE (xml_doc) INDENT) xml_doc
  3  FROM   test_clob
  4  /

        ID
----------
XML_DOC
------------------------------------------------------------------------------------------
         1
<main>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>STAPLES CENTER</COLUMN3>
    <COLUMN4>11.11</COLUMN4>
    <COLUMN5>22.22</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>UNITED CENTER</COLUMN3>
    <COLUMN4>33.33</COLUMN4>
    <COLUMN5>44.44</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>GUND ARENA</COLUMN3>
    <COLUMN4>55.55</COLUMN4>
    <COLUMN5>66.66</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>CAPITAL ONE</COLUMN3>
    <COLUMN4>77.77</COLUMN4>
    <COLUMN5>88.88</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
  <DATA_RECORD>
    <COLUMN1>ARENAS</COLUMN1>
    <COLUMN2>NBA</COLUMN2>
    <COLUMN3>us AIRWAYS</COLUMN3>
    <COLUMN4>64.32</COLUMN4>
    <COLUMN5>76.83</COLUMN5>
    <COLUMN6>2012/04/02</COLUMN6>
  </DATA_RECORD>
</main>


1 row selected.


If you create a materialized view, you can separate each DATA_RECORD:

SCOTT@orcl_11gR2> CREATE MATERIALIZED VIEW test_clob_mv
  2  AS
  3  SELECT t.id, x.column_value as data_record
  4  FROM   test_clob t,
  5  	    TABLE
  6  	      (XMLSEQUENCE
  7  		(EXTRACT
  8  		  (XMLTYPE (t.xml_doc),
  9  		   '//DATA_RECORD'))) x
 10  /

Materialized view created.

SCOTT@orcl_11gR2> SELECT * FROM test_clob_mv
  2  /

        ID
----------
DATA_RECORD
------------------------------------------------------------------------------------------
         1
<DATA_RECORD>
  <COLUMN1>ARENAS</COLUMN1>
  <COLUMN2>NBA</COLUMN2>
  <COLUMN3>STAPLES CENTER</COLUMN3>
  <COLUMN4>11.11</COLUMN4>
  <COLUMN5>22.22</COLUMN5>
  <COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>

         1
<DATA_RECORD>
  <COLUMN1>ARENAS</COLUMN1>
  <COLUMN2>NBA</COLUMN2>
  <COLUMN3>UNITED CENTER</COLUMN3>
  <COLUMN4>33.33</COLUMN4>
  <COLUMN5>44.44</COLUMN5>
  <COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>

         1
<DATA_RECORD>
  <COLUMN1>ARENAS</COLUMN1>
  <COLUMN2>NBA</COLUMN2>
  <COLUMN3>GUND ARENA</COLUMN3>
  <COLUMN4>55.55</COLUMN4>
  <COLUMN5>66.66</COLUMN5>
  <COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>

         1
<DATA_RECORD>
  <COLUMN1>ARENAS</COLUMN1>
  <COLUMN2>NBA</COLUMN2>
  <COLUMN3>CAPITAL ONE</COLUMN3>
  <COLUMN4>77.77</COLUMN4>
  <COLUMN5>88.88</COLUMN5>
  <COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>

         1
<DATA_RECORD>
  <COLUMN1>ARENAS</COLUMN1>
  <COLUMN2>NBA</COLUMN2>
  <COLUMN3>us AIRWAYS</COLUMN3>
  <COLUMN4>64.32</COLUMN4>
  <COLUMN5>76.83</COLUMN5>
  <COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>


5 rows selected.


You can then create a text index on that data to allow for fast searches for the values, without knowing any column names:

SCOTT@orcl_11gR2> CREATE INDEX test_idx
  2  ON test_clob_mv (data_record)
  3  INDEXTYPE IS CTXSYS.CONTEXT
  4  /

Index created.

SCOTT@orcl_11gR2> SELECT *
  2  FROM   test_clob_mv
  3  WHERE  CONTAINS
  4  	      (data_record,
  5  	       'ARENAS AND NBA AND UNITED CENTER') > 0
  6  /

        ID
----------
DATA_RECORD
------------------------------------------------------------------------------------------
         1
<DATA_RECORD>
  <COLUMN1>ARENAS</COLUMN1>
  <COLUMN2>NBA</COLUMN2>
  <COLUMN3>UNITED CENTER</COLUMN3>
  <COLUMN4>33.33</COLUMN4>
  <COLUMN5>44.44</COLUMN5>
  <COLUMN6>2012/04/02</COLUMN6>
</DATA_RECORD>


1 row selected.


Depending on the format of the output that you want, you can either use that as a sub-query in jum's suggested query:

SCOTT@orcl_11gR2> COLUMN col_val FORMAT A15
SCOTT@orcl_11gR2> SELECT col_pos,
  2  	    col_nam,
  3  	    col_val
  4    FROM (SELECT *
  5  	     FROM   test_clob_mv
  6  	     WHERE  CONTAINS
  7  		      (data_record,
  8  		       'ARENAS AND NBA AND UNITED CENTER') > 0) x,
  9  	    XMLTABLE('//DATA_RECORD/*'
 10  	    PASSING x.data_record COLUMNS
 11  	      col_pos FOR ORDINALITY,
 12  	      col_val VARCHAR2(200) PATH 'fn:root()',
 13  	      col_nam VARCHAR(30)   PATH 'fn:local-name(.)')
 14  /

   COL_POS COL_NAM                        COL_VAL
---------- ------------------------------ ---------------
         1 COLUMN1                        ARENAS
         2 COLUMN2                        NBA
         3 COLUMN3                        UNITED CENTER
         4 COLUMN4                        33.33
         5 COLUMN5                        44.44
         6 COLUMN6                        2012/04/02

6 rows selected.


or you can use that as a sub-query in my previously suggested method:

SCOTT@orcl_11gR2> VARIABLE g_ref REFCURSOR
SCOTT@orcl_11gR2> DECLARE
  2    v_sql  CLOB;
  3  BEGIN
  4    v_sql := 'select t.id';
  5    FOR r IN
  6  	 (select distinct x.column_value.getrootelement() column_name
  7  	  from	 (SELECT t.*
  8  		  FROM	 test_clob_mv t
  9  		  WHERE  CONTAINS
 10  			   (t.data_record,
 11  			    'ARENAS AND NBA AND UNITED CENTER') > 0) t,
 12  		 table
 13  		   (xmlsequence
 14  		     (extract
 15  		       (t.data_record,
 16  			'//DATA_RECORD/node()'))) x
 17  	  order  by column_name)
 18    LOOP
 19  	 v_sql := v_sql || ', extractvalue (value (x), ''//'
 20  	       || r.column_name || ''') "' || r.column_name || '"';
 21    END LOOP;
 22    v_sql := v_sql ||
 23  	     ' from   (SELECT t.*
 24  		       FROM   test_clob_mv t
 25  		       WHERE  CONTAINS
 26  				(t.data_record,
 27  				 ''ARENAS AND NBA AND UNITED CENTER'') > 0) t,
 28  		      table
 29  			(xmlsequence
 30  			  (extract
 31  			    (t.data_record,
 32  			     ''//DATA_RECORD''))) x';
 33    OPEN :g_ref FOR v_sql;
 34  END;
 35  /

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> COLUMN column1 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column2 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column3 FORMAT a15
SCOTT@orcl_11gR2> COLUMN column4 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column5 FORMAT a10
SCOTT@orcl_11gR2> COLUMN column6 FORMAT a10
SCOTT@orcl_11gR2> PRINT g_ref

        ID COLUMN1    COLUMN2    COLUMN3         COLUMN4    COLUMN5    COLUMN6
---------- ---------- ---------- --------------- ---------- ---------- ----------
         1 ARENAS     NBA        UNITED CENTER   33.33      44.44      2012/04/02

1 row selected.


If you use the method above, the cursor for loop opens and closes implicitly. I used a SQL*Plus refcursor variable, which also closes implicitly after printing. If you are not using SQL*Plus, then the ref cursor needs to be closed from whatever you are calling it from. Please see the following by Tom Kyte on closing ref cursors from calling programs:

http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1606716582692

The above was just a simplified demonstration. You will need to decide when and how to refresh the materialized view and synchronize the text index. It is also possible to use ctxsys.auto_section_group as a parameter to the text index, enabling you to search for the values within certain columns.

There may be other methods. I am expert at Oracle Text, so I tend to think of Oracle Text methods. Others who are better at XML may have some better XML methods.
Previous Topic: removing white spaces from Arraylist
Next Topic: pojo (4 Merged)
Goto Forum:
  


Current Time: Thu Mar 28 05:35:08 CDT 2024