Friday, December 28, 2007

XQJ: JDBC for XQuery

In recent blog entries, I have posted about using Oracle's XQLPlus command-line XQuery tool. In one of those entries, I pointed out that if XQuery is to XML what SQL is to relational data, then XQLPlus is to XQuery what SQL*Plus is to SQL.

In this blog entry, I intend to demonstrate that we can take these relationships even further and describe the XQuery API for Java (XQJ) as XQuery's JDBC. In other words, XQJ is to XQuery as JDBC is to SQL. Both XQJ and JDBC are Java APIs for accessing data in a standard way. While XQJ is focused on XQuery access of XML data, JDBC is focused on SQL access of relational data. XQJ is currently a work-in-progress as part of Java Specification Request (JSR) 225 ("XQuery API for Java").

As work on JSR 225 proceeds, several vendors have made XQJ implementations available. These include Oracle's OJXQI implementation and DataDirect's DataDirect XQuery implementation. DataDirect provides a highly useful XQJ Tutorial (PDF) and DevX provides a brief but useful example of using Oracle's XQJ implementation.

In this blog entry, I use the Oracle XQJ implementation to read an XQuery script file (with an .xql extension in my example) and process it. That processed script file references the source XML document it queries via the fn:doc function. This example does not obtain a connection, which is sometimes a necessary step (such as when querying over XML stored in the database).

The source XML that will be queried for this example is stored in a file called C:\xquery\xmlSource\planets.xml. This sample XML file is shown next.


<?xml version = '1.0'?>
<!-- Note that all data here is not meant to be factual, but is instead intended
to illustrate XQuery principles. Also, some of the planets have far too
many moons to list all of them here, so only select moons are listed in
those cases. For example, while Jupiter has over 60 moons, only its four
so-called Galilean moons (the four largest that Galileo could see with the
available equipment at the time) are listed here. -->
<Planets>
<Planet name="Mercury"
minDistanceFromSunMK="46"
maxDistanceFromSunMK="70"
class="planet"/>
<Planet name="Venus"
minDistanceFromSunMK="108"
maxDistanceFromSunMK="109"
class="planet" />
<Planet name="Earth"
minDistanceFromSunMK="146"
maxDistanceFromSunMK="152"
class="planet">
<Moons>
<Moon>The Moon</Moon>
</Moons>
</Planet>
<Planet name="Mars"
minDistanceFromSunMK="205"
maxDistanceFromSunMK="249"
class="planet">
<Moons>
<Moon>Phobos</Moon>
<Moon>Deimos</Moon>
</Moons>
</Planet>
<Planet name="Jupiter"
minDistanceFromSunMK="741"
maxDistanceFromSunMK="817"
class="planet">
<Moons>
<Moon>Callisto</Moon>
<Moon>Europa</Moon>
<Moon>Ganymede</Moon>
<Moon>Io</Moon>
</Moons>
</Planet>
<Planet name="Saturn"
minDistanceFromSunMK="1350"
maxDistanceFromSunMK="1500"
class="planet">
<Moons>
<Moon>Atlas</Moon>
<Moon>Calypso</Moon>
<Moon>Dione</Moon>
<Moon>Prometheus</Moon>
<Moon>Pan</Moon>
<Moon>Pandora</Moon>
<Moon>Titan</Moon>
</Moons>
</Planet>
<Planet name="Uranus"
minDistanceFromSunMK="2700"
maxDistanceFromSunMK="3000"
class="planet">
<Moons>
<Moon>Ariel</Moon>
<Moon>Cordelia</Moon>
<Moon>Desdemona</Moon>
<Moon>Miranda</Moon>
<Moon>Oberon</Moon>
<Moon>Ophelia</Moon>
<Moon>Puck</Moon>
<Moon>Titania</Moon>
<Moon>Umbriel</Moon>
</Moons>
</Planet>
<Planet name="Neptune"
minDistanceFromSunMK="4460"
maxDistanceFromSunMK="4540"
class="planet">
<Moons>
<Moon>Galatea</Moon>
<Moon>Larissa</Moon>
<Moon>Nereid</Moon>
<Moon>Proteus</Moon>
<Moon>Triton</Moon>
</Moons>
</Planet>
<Planet name="Pluto"
minDistanceFromSunMK="4437"
maxDistanceFromSunMK="7376"
class="dwarf">
<Moons>
<Moon>Charon</Moon>
<Moon>Hydra</Moon>
<Moon>Nix</Moon>
</Moons>
</Planet>
</Planets>


The XQuery script file used to query the XML source shown above is stored in a different directory; its full path and name are C:\xquery\xqlScripts\extractIAUPlanets.xql. It is important to note that the XQL script and the XML source file are located in different directories because this difference will be reflected in the Java code that uses the XQL script to query the XML source. Here are the short contents of the extractIAUPlanets.xql file:


<IAUnionPlanets>
{for $i in doc("planets.xml")/Planets/Planet
where $i/@class = "planet"
return <Planet>{data($i/@name)}</Planet>}
</IAUnionPlanets>


This short XQuery script file will query over the XML source of planets (which includes Pluto as a planet) and will return only the planets that are still officially considered planets by the International Astronomical Union (IAU). In other words, poor Pluto gets filtered out because it is now a "dwarf planet" instead of a "planet."

Note that of the five lines of XQuery code in the above script, two are simply the opening and closing tags of the new IAUnionPlanets XML element. The other three lines are the dynamic portion (hence the curly braces), and they return only the names of planets whose class is "planet."
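To make the for/where/return logic concrete without an XQuery engine, here is a minimal sketch that applies the same path and predicate using the XPath support bundled with the JDK (javax.xml.xpath). This swaps XPath in for XQuery purely for illustration. The class name PlanetClassFilterSketch, its method, and the trimmed-down inline XML are hypothetical stand-ins of mine and are not part of the example in this post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical illustration class; not part of the original example. */
class PlanetClassFilterSketch
{
   /** Trimmed-down stand-in for planets.xml (assumed sample data). */
   static final String SAMPLE_XML =
        "<Planets>"
      + "<Planet name=\"Mercury\" class=\"planet\"/>"
      + "<Planet name=\"Neptune\" class=\"planet\"/>"
      + "<Planet name=\"Pluto\" class=\"dwarf\"/>"
      + "</Planets>";

   /** Return the name attribute of every Planet whose class is "planet". */
   static List<String> fullPlanetNames(final String xml)
   {
      // Same path and predicate as the script's for and where clauses.
      final String expression = "/Planets/Planet[@class='planet']/@name";
      final XPath xpath = XPathFactory.newInstance().newXPath();
      final List<String> names = new ArrayList<String>();
      try
      {
         final NodeList matches = (NodeList) xpath.evaluate(
            expression,
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
         for (int i = 0; i < matches.getLength(); ++i)
         {
            // Each match is an attribute node; its value is the planet name.
            names.add(matches.item(i).getNodeValue());
         }
      }
      catch (XPathExpressionException xpathEx)
      {
         throw new IllegalArgumentException("Could not evaluate XPath", xpathEx);
      }
      return names;
   }

   public static void main(final String[] aArgs)
   {
      System.out.println(fullPlanetNames(SAMPLE_XML));
   }
}
```

Run against the three-planet sample, main prints [Mercury, Neptune]; Pluto is dropped because its class attribute is "dwarf" rather than "planet."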

It is significant to note that the fn:doc function references the source XML file (planets.xml) without any path information. The XQuery script will only know where to look for the planets.xml XML source file if we tell the XQuery implementation what the base URI is. In Oracle's XQJ implementation, this is done with an oracle.xquery.Configuration.setBaseURI(String) call. That Configuration can then be passed to the oracle.xquery.XQueryContext.prepareXQuery() call shown in the Java code below. Note that this configuration base URI does not affect where the XQL script file itself is found; it affects only where the XQL script looks for the source XML file via the fn:doc function.

Here is the code for the Java class (OracleXqjAccess.java) that uses Oracle's XQJ to invoke the XQuery script (C:\xquery\xqlScripts\extractIAUPlanets.xql) shown above on the XML source (C:\xquery\xmlSource\planets.xml) shown above:


package xqueryexamples;

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

import java.util.ArrayList;
import java.util.List;

import oracle.xml.parser.v2.XMLNode;
import oracle.xml.xqxp.datamodel.XMLItem;
import oracle.xml.xqxp.datamodel.XMLSequence;

import oracle.xquery.Configuration;
import oracle.xquery.PreparedXQuery;
import oracle.xquery.XQueryContext;

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/**
 * The main purpose of this class is to exercise Oracle's implementation of
 * XQuery API for Java (XQJ) [Java Specification Request 225].
 */
public class OracleXqjAccess
{
   /** XQueryContext. */
   private static XQueryContext context = new XQueryContext();

   /**
    * Default constructor accepting no arguments.
    */
   public OracleXqjAccess ()
   {
   }

   /**
    * Obtain XQueryContext for use in performing XQuery.
    *
    * @return XQueryContext
    */
   private XQueryContext getXQueryContext()
   {
      if ( context != null )
      {
         return context;
      }
      else
      {
         context = new XQueryContext();
         return context;
      }
   }

   /**
    * Extract text content of non-root first-level XML elements returned from
    * provided XQuery script.
    *
    * @param aXQueryScript XQuery script to be executed.
    * @param aBaseUri Base URI to be used for any documents looked up within
    * the provided aXQueryScript using the fn:doc function.
    * @return List of String associated with child nodes.
    */
   public List<String> extractSingleElementStringsFromFirstLevelElements
      ( final String aXQueryScript,
        final String aBaseUri )
   {
      final String mName = "runExampleXQueryFromFile(String,String)";
      Reader reader = null;
      List <String> childNodesList = new ArrayList<String>();
      try
      {
         final XQueryContext xqueryContext = getXQueryContext();
         final Configuration config = new Configuration();
         reader = new FileReader(aXQueryScript);

         System.out.println( mName + " - XQuery Script File: " + aXQueryScript );
         config.setXQueryOption(Configuration.XQUERY_NORMAL);
         config.setBaseURI(aBaseUri);

         System.out.println( mName + " - Provided Base URI: " + aBaseUri );
         PreparedXQuery preparedXQuery =
            xqueryContext.prepareXQuery(reader, config);
         XMLSequence xrs = preparedXQuery.executeQuery();
         while ( xrs.next() )
         {
            XMLItem xmlItem = xrs.getCurrentItem();
            final XMLNode xmlNode = xmlItem.getNode();
            NodeList childNodes = xmlNode.getChildNodes();
            final int numberChildNodes = childNodes.getLength();
            for ( int i=0; i < numberChildNodes; ++i )
            {
               final Node childNode = childNodes.item(i);
               childNodesList.add( childNode.getTextContent() );
            }
         }

      }
      catch (FileNotFoundException fnfEx) // use of FileReader
      {
         System.err.println( "Could not find file " + aXQueryScript + ": "
                           + fnfEx.getMessage() );
      }
      finally
      {
         if ( reader != null )
         {
            try
            {
               reader.close();
            }
            catch (IOException ioEx)
            {
               System.err.println(
                    mName
                  + " - Exception thrown while trying to close Reader."
                  + ioEx.getMessage() );
            }
         }
      }

      return childNodesList;
   }

   /**
    * Provide the full-fledged planets (not dwarf) endorsed by the IAU.
    *
    * @return List of names of planets endorsed as planets by IAU.
    */
   public List<String> getIAUPlanets()
   {
      final String iauPlanetsScript =
         "C:\\xquery\\xqlScripts\\extractIAUPlanets.xql";
      final String baseURI = "file:///C:/xquery/xmlSource/";
      return extractSingleElementStringsFromFirstLevelElements(
         iauPlanetsScript, baseURI );
   }

   /**
    * Display provided list of strings.
    *
    * @param aListTitle Title of the list to be displayed.
    * @param aListOfStrings List of Strings to be displayed.
    */
   public static void displayContentsOfList( final String aListTitle,
                                             final List<String> aListOfStrings )
   {
      System.out.println("----- " + aListTitle + " -----");
      for ( final String string : aListOfStrings )
      {
         System.out.println(string);
      }
   }

   /**
    * Main function.
    *
    * @param aArgs Command-line arguments; none anticipated currently.
    */
   public static void main( final String[] aArgs )
   {
      OracleXqjAccess xqjAccess = new OracleXqjAccess();
      displayContentsOfList( "IAU Endorsed Planets",
                             xqjAccess.getIAUPlanets() );
   }
}


The XQJ-specific and closely related code in the listing above consists of the statements that use Configuration, XQueryContext, PreparedXQuery, XMLSequence, and XMLItem. There are also two System.out statements: one displays the path and file name of the XQL script file, and the other displays the base URI that the XQL script uses implicitly when it references a source XML file via the fn:doc function.
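The result-handling loop in the listing relies on plain DOM calls (getChildNodes, item, getTextContent) once the XMLNode has been obtained. Those calls can be tried in isolation with the JDK's own parser, as in the minimal sketch below. It assumes the query result looks like the hard-coded string shown, uses the hypothetical class name ChildTextExtractionSketch, and does not touch Oracle's XMLSequence or XMLItem classes. Unlike the listing, it also skips non-element children, which is my own addition.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical illustration class; not part of the original example. */
class ChildTextExtractionSketch
{
   /** Assumed shape of the XML returned by the XQuery script. */
   static final String SAMPLE_RESULT =
      "<IAUnionPlanets><Planet>Mercury</Planet><Planet>Venus</Planet></IAUnionPlanets>";

   /** Collect the text content of each child element of the root element. */
   static List<String> childTexts(final String xml)
   {
      final List<String> texts = new ArrayList<String>();
      try
      {
         final Element root = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
         final NodeList childNodes = root.getChildNodes();
         for (int i = 0; i < childNodes.getLength(); ++i)
         {
            final Node childNode = childNodes.item(i);
            // Whitespace between elements shows up as text nodes; skip them.
            if (childNode.getNodeType() == Node.ELEMENT_NODE)
            {
               texts.add(childNode.getTextContent());
            }
         }
      }
      catch (ParserConfigurationException | SAXException | IOException ex)
      {
         throw new IllegalArgumentException("Could not parse XML", ex);
      }
      return texts;
   }

   public static void main(final String[] aArgs)
   {
      System.out.println(childTexts(SAMPLE_RESULT));
   }
}
```

With the sample string, main prints [Mercury, Venus].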

There are some subtle nuances associated with the code above. As described above, the base URI only describes where the XQuery script can expect to find the file whose name is passed to the fn:doc function. The code (final String baseURI = "file:///C:/xquery/xmlSource/";) expresses the base URI with forward slashes rather than the backslashes normally associated with Windows. The Unix-style forward slashes (even on Windows, as in this example) and the "file:///" prefix are necessary in the base URI for the Oracle XQuery engine to properly locate the source XML file. Besides specifying "file:///" and using Unix/Linux-style path separators throughout, the other necessary characteristic of the base URI is that it must end with a forward slash when it is used in a relative sense, as in this example. In other words, a slash is NOT automatically placed after the base URI before the location passed to fn:doc is appended.
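The effect of the trailing slash can be seen with the JDK's own java.net.URI class, which follows the standard rules for resolving a relative reference against a base. This is only a sketch of generic URI resolution; I am not claiming that Oracle's XQuery engine resolves fn:doc arguments with this exact code, and the class name BaseUriResolutionSketch is hypothetical.

```java
import java.net.URI;

/** Hypothetical illustration class; not part of the original example. */
class BaseUriResolutionSketch
{
   /** Resolve a relative document name against a base URI; return the path. */
   static String resolvedPath(final String baseUri, final String relative)
   {
      return URI.create(baseUri).resolve(relative).getPath();
   }

   public static void main(final String[] aArgs)
   {
      // Base ends with a slash: the document name is appended to the directory.
      System.out.println(
         resolvedPath("file:///C:/xquery/xmlSource/", "planets.xml"));
      // No trailing slash: the last segment (xmlSource) is replaced instead.
      System.out.println(
         resolvedPath("file:///C:/xquery/xmlSource", "planets.xml"));
   }
}
```

The first call yields /C:/xquery/xmlSource/planets.xml, while the second yields /C:/xquery/planets.xml, which is not where the source XML file lives. Either way, omitting the final slash sends the lookup to the wrong location.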

While these three rules (specify the protocol, always use forward slashes, and always end with a forward slash) are important in specifying the base URI, the file location and name supplied for the XQJ code to find the XQL script file use more traditional file location syntax: C:\\xquery\\xqlScripts\\extractIAUPlanets.xql (the double backslashes handle escaping in Java Strings). In the simpler case of specifying the location of the XQuery script file, one could use Java's System.getProperty("file.separator") to place file separators appropriately for the applicable operating system. For the base URI to the XML source files, on the other hand, the file separators should always be forward slashes regardless of the underlying operating system.
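The two conventions can be kept apart with a pair of small helpers, sketched below under my own assumptions. The class BaseUriFormatSketch and its methods toBaseUri and scriptPath are hypothetical names, and the conversion is simple string handling that does not percent-encode spaces or other special characters.

```java
import java.io.File;

/** Hypothetical illustration class; not part of the original example. */
class BaseUriFormatSketch
{
   /**
    * Turn a directory path into a base URI string that follows the three
    * rules: "file://" protocol, forward slashes only, trailing slash.
    */
   static String toBaseUri(final String directoryPath)
   {
      String normalized = directoryPath.replace('\\', '/');
      if (!normalized.startsWith("/"))
      {
         // A Windows drive path such as C:/xquery needs a leading slash.
         normalized = "/" + normalized;
      }
      if (!normalized.endsWith("/"))
      {
         normalized = normalized + "/";
      }
      return "file://" + normalized;
   }

   /** Join path parts with the separator of the current operating system. */
   static String scriptPath(final String... aParts)
   {
      return String.join(File.separator, aParts);
   }

   public static void main(final String[] aArgs)
   {
      System.out.println(toBaseUri("C:\\xquery\\xmlSource"));
      System.out.println(scriptPath("xquery", "xqlScripts", "extractIAUPlanets.xql"));
   }
}
```

Here toBaseUri("C:\\xquery\\xmlSource") returns file:///C:/xquery/xmlSource/, matching the base URI used in the listing, while the result of scriptPath depends on the operating system on which it runs.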

Here is the output from running this Oracle XQJ-powered code to run the prescribed XQuery script over the designated XML file:


runExampleXQueryFromFile(String,String) - XQuery Script File: C:\xquery\xqlScripts\extractIAUPlanets.xql
runExampleXQueryFromFile(String,String) - Provided Base URI: file:///C:/xquery/xmlSource/
----- IAU Endorsed Planets -----
Mercury
Venus
Earth
Mars
Jupiter
Saturn
Uranus
Neptune


The NASA Solar System Exploration site is a useful source of details and facts on the solar system and planets. Another useful site is the "Journey Through the Galaxy" site. With my limited experience, I found its Regular version plenty informative, but people with more galactic knowledge may find the Advanced version more interesting.
