Thursday, February 12, 2015

A JAXB Nuance: String Versus Enum from Enumerated Restricted XSD String

Although Java Architecture for XML Binding (JAXB) is fairly easy to use in nominal cases (especially since Java SE 6), it also presents numerous nuances. Many of the common ones arise because XML Schema Definition (XSD) types cannot always be matched (bound) exactly to Java types. This post looks at one specific example, which also demonstrates how different XSD constructs that enforce the same XML structure can lead to different Java types when the JAXB compiler generates the Java classes.

The next code listing, for Food.xsd, defines a schema for food types. The XSD mandates that valid XML will have a root element called "Food" with three nested elements "Vegetable", "Fruit", and "Dessert". Although the approach used to specify the "Vegetable" and "Dessert" elements is different from the approach used to specify the "Fruit" element, both approaches accept the same valid XML. The "Vegetable" and "Dessert" elements are declared directly as elements of the named simpleTypes defined later in the XSD. The "Fruit" element is defined via reference (ref=) to a separately declared element that wraps an anonymous simpleType.

Food.xsd
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dustin="http://marxsoftware.blogspot.com/foodxml"
           targetNamespace="http://marxsoftware.blogspot.com/foodxml"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified">

   <xs:element name="Food">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="Vegetable" type="dustin:Vegetable" />
            <xs:element ref="dustin:Fruit" />
            <xs:element name="Dessert" type="dustin:Dessert" />
         </xs:sequence>
      </xs:complexType>
   </xs:element>

   <!--
        Direct simple type that restricts xs:string will become enum in
        JAXB-generated Java class.
   -->
   <xs:simpleType name="Vegetable">
      <xs:restriction base="xs:string">
         <xs:enumeration value="Carrot"/>
         <xs:enumeration value="Squash"/>
         <xs:enumeration value="Spinach"/>
         <xs:enumeration value="Celery"/>
      </xs:restriction>
   </xs:simpleType>

   <!--
        Simple type that restricts xs:string but is wrapped in xs:element
        (making it an Element rather than a SimpleType) will become Java
        String in JAXB-generated Java class for Elements that reference it.
   -->
   <xs:element name="Fruit">
      <xs:simpleType>
         <xs:restriction base="xs:string">
            <xs:enumeration value="Watermelon"/>
            <xs:enumeration value="Apple"/>
            <xs:enumeration value="Orange"/>
            <xs:enumeration value="Grape"/>
         </xs:restriction>
      </xs:simpleType>
   </xs:element>

   <!--
        Direct simple type that restricts xs:string will become enum in
        JAXB-generated Java class.        
   -->
   <xs:simpleType name="Dessert">
      <xs:restriction base="xs:string">
         <xs:enumeration value="Pie"/>
         <xs:enumeration value="Cake"/>
         <xs:enumeration value="Ice Cream"/>
      </xs:restriction>
   </xs:simpleType>

</xs:schema>

Although Vegetable and Dessert elements are defined in the schema differently than Fruit, the resulting valid XML is the same. A valid XML file is shown next in the code listing for food1.xml.

food1.xml
<?xml version="1.0" encoding="utf-8"?>
<Food xmlns="http://marxsoftware.blogspot.com/foodxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <Vegetable>Spinach</Vegetable>
   <Fruit>Watermelon</Fruit>
   <Dessert>Pie</Dessert>
</Food>

At this point, I'll use a simple Groovy script to validate the above XML against the above XSD. The code for this Groovy XML validation script (validateXmlAgainstXsd.groovy) is shown next.

validateXmlAgainstXsd.groovy
#!/usr/bin/env groovy

// validateXmlAgainstXsd.groovy
//
// Accepts paths/names of two files. The first is the XML file to be validated
// and the second is the XSD against which to validate that XML.

import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.Schema
import javax.xml.validation.SchemaFactory
import javax.xml.validation.Validator

if (args.length < 2)
{
   println "USAGE: groovy validateXmlAgainstXsd.groovy <xmlFile> <xsdFile>"
   System.exit(-1)
}

String xml = args[0]
String xsd = args[1]

try
{
   SchemaFactory schemaFactory =
      SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
   Schema schema = schemaFactory.newSchema(new File(xsd))
   Validator validator = schema.newValidator()
   // Wrap the path in a File so it is read as a file rather than as a system ID (URI)
   validator.validate(new StreamSource(new File(xml)))
}
catch (Exception exception)
{
   println "\nERROR: Unable to validate ${xml} against ${xsd} due to '${exception}'\n"
   System.exit(-1)
}
println "\nXML file ${xml} validated successfully against ${xsd}.\n"

The next screen snapshot demonstrates running the above Groovy XML validation script against food1.xml and Food.xsd.
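
The same kind of check can also be sketched in plain Java using the JDK's javax.xml.validation package. The listing below is only an illustration, not part of the original example: the class name SchemaStyleCheck, the urn:example:food namespace, and the two trimmed inline schemas (each with only a "Fruit" child and two enumerated values) are assumptions made to keep it self-contained. It validates the same small document against a schema using the ref= style and against one using the named simpleType style.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class SchemaStyleCheck {

    // Hypothetical namespace used only for this sketch.
    static final String NS = "urn:example:food";

    // Shared enumeration facets (a trimmed subset of the Fruit values).
    static final String ENUM_VALUES =
        "<xs:restriction base='xs:string'>"
      + "<xs:enumeration value='Apple'/><xs:enumeration value='Grape'/>"
      + "</xs:restriction>";

    // Fruit as a global element wrapping an anonymous simpleType, used via ref=.
    static final String REF_STYLE = schema(
        "<xs:element ref='f:Fruit'/>",
        "<xs:element name='Fruit'><xs:simpleType>" + ENUM_VALUES
      + "</xs:simpleType></xs:element>");

    // Fruit as a named simpleType, used via type=.
    static final String TYPE_STYLE = schema(
        "<xs:element name='Fruit' type='f:Fruit'/>",
        "<xs:simpleType name='Fruit'>" + ENUM_VALUES + "</xs:simpleType>");

    // Builds a small schema whose root "Food" element holds the given child.
    static String schema(String child, String globalDeclaration) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
             + " xmlns:f='" + NS + "' targetNamespace='" + NS + "'"
             + " elementFormDefault='qualified'>"
             + "<xs:element name='Food'><xs:complexType><xs:sequence>"
             + child
             + "</xs:sequence></xs:complexType></xs:element>"
             + globalDeclaration
             + "</xs:schema>";
    }

    // Returns true if a Food document with the given Fruit text is schema-valid.
    static boolean isValid(String xsd, String fruit) {
        String xml = "<Food xmlns='" + NS + "'><Fruit>" + fruit + "</Fruit></Food>";
        Schema compiled;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            compiled = factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            // A broken schema is a bug in this sketch, not an invalid document.
            throw new IllegalStateException("Schema did not compile", e);
        }
        try {
            compiled.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String fruit : new String[] {"Apple", "Banana"}) {
            System.out.println(fruit + ": ref style=" + isValid(REF_STYLE, fruit)
                + ", type style=" + isValid(TYPE_STYLE, fruit));
        }
    }
}
```

Both schema styles constrain the element text in the same way, so any difference between them only shows up later, in the Java classes generated from the XSD.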

The objective of this post so far has been to show how different approaches in an XSD can lead to the same XML being valid. Although these different XSD approaches prescribe the same valid XML, they lead to different Java class behavior when JAXB is used to generate classes based on the XSD. The next screen snapshot demonstrates running the JDK-provided JAXB xjc compiler against the Food.xsd to generate the Java classes.

The output from the JAXB generation shown above indicates that Java classes were created for the "Vegetable" and "Dessert" elements but not for the "Fruit" element. This is because "Vegetable" and "Dessert" were defined differently than "Fruit" in the XSD: by default, xjc generates a Java enum for a named simpleType with enumerated values, but not for an anonymous simpleType such as the one nested inside the "Fruit" element. The next code listing is for the Food.java class generated by the xjc compiler. From this we can see that the generated Food.java class references specific generated Java types for Vegetable and Dessert, but references simply a generic Java String for Fruit.

Food.java (generated by JAXB xjc compiler)
//
// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.8-b130911.1802 
// See <a href="http://java.sun.com/xml/jaxb">http://java.sun.com/xml/jaxb</a> 
// Any modifications to this file will be lost upon recompilation of the source schema. 
// Generated on: 2015.02.11 at 10:17:32 PM MST 
//


package com.blogspot.marxsoftware.foodxml;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlSchemaType;
import javax.xml.bind.annotation.XmlType;


/**
 * <p>Java class for anonymous complex type.
 * 
 * <p>The following schema fragment specifies the expected content contained within this class.
 * 
 * <pre>
 * <complexType>
 *   <complexContent>
 *     <restriction base="{http://www.w3.org/2001/XMLSchema}anyType">
 *       <sequence>
 *         <element name="Vegetable" type="{http://marxsoftware.blogspot.com/foodxml}Vegetable"/>
 *         <element ref="{http://marxsoftware.blogspot.com/foodxml}Fruit"/>
 *         <element name="Dessert" type="{http://marxsoftware.blogspot.com/foodxml}Dessert"/>
 *       </sequence>
 *     </restriction>
 *   </complexContent>
 * </complexType>
 * </pre>
 * 
 * 
 */
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "", propOrder = {
    "vegetable",
    "fruit",
    "dessert"
})
@XmlRootElement(name = "Food")
public class Food {

    @XmlElement(name = "Vegetable", required = true)
    @XmlSchemaType(name = "string")
    protected Vegetable vegetable;
    @XmlElement(name = "Fruit", required = true)
    protected String fruit;
    @XmlElement(name = "Dessert", required = true)
    @XmlSchemaType(name = "string")
    protected Dessert dessert;

    /**
     * Gets the value of the vegetable property.
     * 
     * @return
     *     possible object is
     *     {@link Vegetable }
     *     
     */
    public Vegetable getVegetable() {
        return vegetable;
    }

    /**
     * Sets the value of the vegetable property.
     * 
     * @param value
     *     allowed object is
     *     {@link Vegetable }
     *     
     */
    public void setVegetable(Vegetable value) {
        this.vegetable = value;
    }

    /**
     * Gets the value of the fruit property.
     * 
     * @return
     *     possible object is
     *     {@link String }
     *     
     */
    public String getFruit() {
        return fruit;
    }

    /**
     * Sets the value of the fruit property.
     * 
     * @param value
     *     allowed object is
     *     {@link String }
     *     
     */
    public void setFruit(String value) {
        this.fruit = value;
    }

    /**
     * Gets the value of the dessert property.
     * 
     * @return
     *     possible object is
     *     {@link Dessert }
     *     
     */
    public Dessert getDessert() {
        return dessert;
    }

    /**
     * Sets the value of the dessert property.
     * 
     * @param value
     *     allowed object is
     *     {@link Dessert }
     *     
     */
    public void setDessert(Dessert value) {
        this.dessert = value;
    }

}

The advantage of having specific Vegetable and Dessert classes is the additional type safety they bring as compared to a general Java String. Both Vegetable.java and Dessert.java are actually enums because they come from enumerated values in the XSD. The two generated enums are shown in the next two code listings.

Vegetable.java (generated with JAXB xjc compiler)
//
// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.8-b130911.1802 
// See <a href="http://java.sun.com/xml/jaxb">http://java.sun.com/xml/jaxb</a> 
// Any modifications to this file will be lost upon recompilation of the source schema. 
// Generated on: 2015.02.11 at 10:17:32 PM MST 
//


package com.blogspot.marxsoftware.foodxml;

import javax.xml.bind.annotation.XmlEnum;
import javax.xml.bind.annotation.XmlEnumValue;
import javax.xml.bind.annotation.XmlType;


/**
 * <p>Java class for Vegetable.
 * 
 * <p>The following schema fragment specifies the expected content contained within this class.
 * <p>
 * <pre>
 * <simpleType name="Vegetable">
 *   <restriction base="{http://www.w3.org/2001/XMLSchema}string">
 *     <enumeration value="Carrot"/>
 *     <enumeration value="Squash"/>
 *     <enumeration value="Spinach"/>
 *     <enumeration value="Celery"/>
 *   </restriction>
 * </simpleType>
 * </pre>
 * 
 */
@XmlType(name = "Vegetable")
@XmlEnum
public enum Vegetable {

    @XmlEnumValue("Carrot")
    CARROT("Carrot"),
    @XmlEnumValue("Squash")
    SQUASH("Squash"),
    @XmlEnumValue("Spinach")
    SPINACH("Spinach"),
    @XmlEnumValue("Celery")
    CELERY("Celery");
    private final String value;

    Vegetable(String v) {
        value = v;
    }

    public String value() {
        return value;
    }

    public static Vegetable fromValue(String v) {
        for (Vegetable c: Vegetable.values()) {
            if (c.value.equals(v)) {
                return c;
            }
        }
        throw new IllegalArgumentException(v);
    }

}

Dessert.java (generated with JAXB xjc compiler)
//
// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.8-b130911.1802 
// See <a href="http://java.sun.com/xml/jaxb">http://java.sun.com/xml/jaxb</a> 
// Any modifications to this file will be lost upon recompilation of the source schema. 
// Generated on: 2015.02.11 at 10:17:32 PM MST 
//


package com.blogspot.marxsoftware.foodxml;

import javax.xml.bind.annotation.XmlEnum;
import javax.xml.bind.annotation.XmlEnumValue;
import javax.xml.bind.annotation.XmlType;


/**
 * <p>Java class for Dessert.
 * 
 * <p>The following schema fragment specifies the expected content contained within this class.
 * <p>
 * <pre>
 * <simpleType name="Dessert">
 *   <restriction base="{http://www.w3.org/2001/XMLSchema}string">
 *     <enumeration value="Pie"/>
 *     <enumeration value="Cake"/>
 *     <enumeration value="Ice Cream"/>
 *   </restriction>
 * </simpleType>
 * </pre>
 * 
 */
@XmlType(name = "Dessert")
@XmlEnum
public enum Dessert {

    @XmlEnumValue("Pie")
    PIE("Pie"),
    @XmlEnumValue("Cake")
    CAKE("Cake"),
    @XmlEnumValue("Ice Cream")
    ICE_CREAM("Ice Cream");
    private final String value;

    Dessert(String v) {
        value = v;
    }

    public String value() {
        return value;
    }

    public static Dessert fromValue(String v) {
        for (Dessert c: Dessert.values()) {
            if (c.value.equals(v)) {
                return c;
            }
        }
        throw new IllegalArgumentException(v);
    }

}

Having enums generated for the XML elements ensures that only valid values for those elements can be represented in Java.
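
A small sketch makes the difference concrete. The EnumVersusString class below, its nested Vegetable enum and Food class are hand-written, trimmed stand-ins for the generated code (without the JAXB annotations, so the sketch runs without JAXB on the classpath), and isVegetable is a hypothetical helper added only for illustration.

```java
public class EnumVersusString {

    // Trimmed, hand-copied stand-in for the generated Vegetable enum.
    enum Vegetable {
        CARROT("Carrot"), SQUASH("Squash"), SPINACH("Spinach"), CELERY("Celery");

        private final String value;

        Vegetable(String v) {
            value = v;
        }

        String value() {
            return value;
        }

        static Vegetable fromValue(String v) {
            for (Vegetable c : values()) {
                if (c.value.equals(v)) {
                    return c;
                }
            }
            throw new IllegalArgumentException(v);
        }
    }

    // Trimmed stand-in for the generated Food class (Dessert omitted).
    static class Food {
        private Vegetable vegetable;
        private String fruit;

        void setVegetable(Vegetable value) { vegetable = value; }
        Vegetable getVegetable() { return vegetable; }
        void setFruit(String value) { fruit = value; }
        String getFruit() { return fruit; }
    }

    // Hypothetical helper: true when the text is one of the Vegetable values.
    static boolean isVegetable(String text) {
        try {
            Vegetable.fromValue(text);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Food food = new Food();
        food.setVegetable(Vegetable.fromValue("Spinach"));
        // food.setVegetable("Potato");  // would not compile: a String is not a Vegetable
        food.setFruit("Potato");         // compiles: any String is accepted
        System.out.println(food.getVegetable() + " / " + food.getFruit());
        System.out.println("Potato is a Vegetable value? " + isVegetable("Potato"));
    }
}
```

With the enum, a bad value is caught either by the compiler or by fromValue at the point of conversion. With the String, nothing on the Java side objects until the object is marshalled and the resulting XML is validated against the schema.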

Conclusion

JAXB makes it relatively easy to map Java to XML, but because there is not a one-to-one mapping between Java and XML types, there can be some cases where the generated Java type for a particular XSD-prescribed element is not obvious. This post has shown how two different approaches to building an XSD to enforce the same basic XML structure can lead to very different results in the Java classes generated with the JAXB xjc compiler. In the example shown in this post, the preferable approach is to declare elements directly on named simpleTypes that restrict XSD's string to a specific set of enumerated values. The alternative, declaring elements as references to other elements that wrap a simpleType of restricted enumerated string values, is less desirable because it yields general Java Strings rather than the type safety of generated enums.

2 comments:

eegee said...

Could you please try your example with different Java versions (e.g. 6, 7, & 8)? I am sure you will appreciate the further nuances.

Sivakumar said...

There is a way to avoid adding the @XmlSchemaType(name="string") for the enum data type.