Home » Developer & Programmer » JDeveloper, Java & XML » getting data out of unconventional XML (Oracle 10.2.0.4 (On UNIX - Sun Solaris 9 with current patch set))
getting data out of unconventional XML [message #380259] Fri, 09 January 2009 15:12 Go to next message
karaseg
Messages: 14
Registered: January 2007
Junior Member
I know very very little about XML.

We have an app that we are getting XML output from and need to read from it and place it into Oracle tables. It does not appear to be conventional formatting.

We will be reading each file (there are approx 20, all share a common XSD file) weekly (they will output updated XML files (i.e. the latest copy fo the data) to us into a directory) This will be an automated process, all hands off- the files will get pushed to a directory, our process runs SQL to import it into an XML schema in our Oracle DB, then another process selects from the XML schema to get the actual data out to put into our relational tables in our Data Warehouse. We are only using XML tables in Oracle because we are given the data in XML files, we really would prefer them to be Oracle tables, but they cannot give them to us as those.

The files seem fairly simple. This is an example of the XML file they provide us with:
<?xml version="1.0"?>
<typelist
  desc="Types of mailing addresses"
  final="false"
  name="AddressType">
  <typecode
    code="home"
    desc="Home"
    name="Home"/>
  <typecode
    code="business"
    desc="Business"
    name="Business"/>
  <typecode
    code="other"
    desc="Other"
    name="Other"/>
</typelist>


Our DBA created an XML schema and I inserted the XML file as follows:
insert into edstage."typelist416_TAB" values
  (XMLType(bfilename('TYPELISTS_DIR','AddressType.xml'),
  nls_charset_id('WE8ISO8859P1')));


select * from "typelist416_TAB";
SYS_NC_ROWINFO$
--------------------------------------------------------------------------------
<?xml version="1.0"?>
<typelist desc="Types of mailing addresses" final="false" name="AddressType">
  <typecode code="home" desc="Home" name="Home"/>
  <typecode code="business" desc="Business" name="Business"/>
  <typecode code="other" desc="Other" name="Other"/>
</typelist>

1 row selected.


If you notice, the lowest level of data - code, desc, and name - which is what we want to be able to extract as columns, does not have the normal < > formatting. I have looked at many many orafaq posts and read XML documentation and the best I can get to retrieve data is:

select extract(object_value,'/typelist/typecode/@code') code from "typelist416_TAB";

CODE
--------------------------------------------------------------------------------
homebusinessother

or 

select extract(object_value,'/typelist/typecode/@name') name from "typelist416_TAB";

NAME
--------------------------------------------------------------------------------
HomeBusinessOther


What I was hoping to get would be 3 separate records- looking like a relational table extract:
CODE         DESC          NAME
------------ ------------- ----------
home         Home          Home
business     Business      Business
other        Other         Other


Can someone help tell me if this is possible with this format of XML file? We do not have a choice to format differently, and cannot be editing these manually to put data in, otherwise we'd just put them in as a real Oracle table.

Anyone have any ideas on how we can read the data out of these XML schemas in this sort of way?

Also, is it in the XML schema creation that we would name the table something that we want it to be named rather than the arbitrary number thing it is doing? We need to have a consistent table name for our automated SQL extracts, regardless of how many times this XML table gets dropped or created. For the above example we would want to name it ADDRESS_TYPE.

Please do not just tell me to read the documentation, unless you can point to a specific example written for a "beginner", as I cannot seem to make sense of the variety of tools and functions or procs that people have created to use. If you have to type in the entire XML to create a function/proc to insert the data, why not just type the data into a normal Oracle INSERT statement and use a real table? I have read and read and do not find any situations that apply to mine.

If no one can help, please just let me know so that I can find a more simple way to extract the data from these XML files into an Oracle table (we are considering having a script that opens each XML file in Excel (which reads the hierarchy and columns just fine without any extra coding on our part) and then saves it as a *.csv file so that our ETL tool can read from it and load it into Oracle as real data that way)


Re: getting data out of unconventional XML [message #380274 is a reply to message #380259] Sat, 10 January 2009 00:10 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
You can select it directly from the file, as demonstrated below. I showed some intermediary steps, just to demonstrate that I am using the same thing you are and show how it works. Also, I noticed you are using desc for a column name. Desc is an Oracle reserved word, so it will cause an error unless you enclose it in quotes, which makes it case-sensitive and requires it to be enclosed in quotes anywhere it is used. So, it is best to choose something else. I used dscr in the demo below.

SCOTT@orcl_11g> HOST TYPE c:\typelists_dir\AddressType.xml
<?xml version="1.0"?>
<typelist
desc="Types of mailing addresses"
final="false"
name="AddressType">
<typecode
code="home"
desc="Home"
name="Home"/>
<typecode
code="business"
desc="Business"
name="Business"/>
<typecode
code="other"
desc="Other"
name="Other"/>
</typelist>


SCOTT@orcl_11g> CREATE OR REPLACE DIRECTORY typelists_dir AS 'c:\typelists_dir'
  2  /

Directory created.

SCOTT@orcl_11g> SELECT *
  2  FROM   TABLE
  3  	      (XMLSEQUENCE
  4  		(XMLTYPE
  5  		  (BFILENAME ('TYPELISTS_DIR', 'AddressType.xml'),
  6  		   NLS_CHARSET_ID ('WE8ISO8859P1'))))
  7  /

COLUMN_VALUE
--------------------------------------------------------------------------------
<?xml version="1.0"?>
<typelist
desc="Types of mailing addresses"
final="false"
name="AddressType">
<typecode
code="home"
desc="Home"
name="Home"/>
<typecode
code="business"
desc="Business"
name="Business"/>
<typecode
code="other"
desc="Other"
name="Other"/>
</typelist>


1 row selected.


SCOTT@orcl_11g> SELECT	*
  2  FROM   TABLE
  3  	      (XMLSEQUENCE
  4  		(EXTRACT
  5  		  (XMLTYPE
  6  		    (BFILENAME ('TYPELISTS_DIR', 'AddressType.xml'),
  7  		      NLS_CHARSET_ID ('WE8ISO8859P1')),
  8  		  '/typelist/typecode')))
  9  /

COLUMN_VALUE
--------------------------------------------------------------------------------
<typecode code="home" desc="Home" name="Home"/>
<typecode code="business" desc="Business" name="Business"/>
<typecode code="other" desc="Other" name="Other"/>

3 rows selected.


SCOTT@orcl_11g> COLUMN code FORMAT A10
SCOTT@orcl_11g> COLUMN dscr FORMAT A10
SCOTT@orcl_11g> COLUMN name FORMAT A10
SCOTT@orcl_11g> SELECT	EXTRACTVALUE (COLUMN_VALUE, '/typecode/@code') AS code,
  2  	     EXTRACTVALUE (COLUMN_VALUE, '/typecode/@desc') AS dscr,
  3  	     EXTRACTVALUE (COLUMN_VALUE, '/typecode/@name') AS name
  4  FROM   TABLE
  5  	      (XMLSEQUENCE
  6  		(EXTRACT
  7  		  (XMLTYPE
  8  		    (BFILENAME ('TYPELISTS_DIR', 'AddressType.xml'),
  9  		      NLS_CHARSET_ID ('WE8ISO8859P1')),
 10  		  '/typelist/typecode')))
 11  /

CODE       DSCR       NAME
---------- ---------- ----------
home       Home       Home
business   Business   Business
other      Other      Other

3 rows selected.

[Updated on: Sat, 10 January 2009 01:49]

Report message to a moderator

Re: getting data out of unconventional XML [message #380300 is a reply to message #380274] Sat, 10 January 2009 07:28 Go to previous messageGo to next message
karaseg
Messages: 14
Registered: January 2007
Junior Member
Wow Barbara! Thank you! I wil try this code when I get back into the office on Monday.

I am sure this is very simple for you, but it is huge for me as I have never had to deal with XML before.

So, basically, we do not need to register an XML schema and insert data into it, we instead are just reading directly from the XML files out on the server using the directory location we register in our Oracle DB?

If this is so, then this saves us a ton of time/work in having to reinsert data weekly from the XML files into the XML schema to read from, and we would just be able to replace the files on the server with the new ones and read directly from them.
Re: getting data out of unconventional XML [message #380965 is a reply to message #380300] Wed, 14 January 2009 07:35 Go to previous message
karaseg
Messages: 14
Registered: January 2007
Junior Member
Just wanted to note that it DID work as I expected and it is a much easier/simple solution than what we had envisioned. Thanks again!
Previous Topic: EXTRACTVALUE MULTIPLE NODES
Next Topic: Debugging PL/SQL with JDeveloper (merged)
Goto Forum:
  


Current Time: Thu Mar 28 04:21:30 CDT 2024