
What is XML Schema Inference?

Schema inference is a technique for deriving an XSD (XML Schema Definition) from the structure of an XML document. Using a platform such as the .NET Framework, you can generate the XSD from an XML document and then use it to validate that same document.

For example, in the .NET Framework the XmlSchemaInference class in the System.Xml.Schema namespace generates an XSD from an XML document, and that XSD can then be used to validate the original document. The class starts by inferring the most restrictive type that fits each attribute or element it encounters. As it gathers more information from the document, it loosens those constraints by moving to a less restrictive type. The least restrictive type that can be inferred is xs:string.

XML Schema Inference


Let's look at how this works on the following XML fragment:

<person value="2">
    <address>L5B 2E8</address>
</person>
<person value="B"/>

In the above example:

  • When the XmlSchemaInference class encounters the value attribute of the first person element, whose value is 2, it infers the type xs:unsignedByte.
  • When it then encounters the value attribute of the second person element, whose value is B, it loosens the inferred type to xs:string, because B is not a number.
  • Similarly, the first person element has child address elements but the second person element has none. The minOccurs attribute for the child elements of person is therefore loosened to minOccurs="0" in the schema (XSD) inferred for validating the original XML.
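The minOccurs loosening described in the last bullet can be sketched in a few lines of Java. This is only a simplified model for illustration, not the algorithm XmlSchemaInference uses: the class name OccurrenceSketch and the method minOccurs are hypothetical, and each parent element is represented simply as the list of its child element names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: models how a lower bound loosens as more parents are seen. */
final class OccurrenceSketch {

    /** For every child name, returns the smallest count observed in any parent. */
    static Map<String, Integer> minOccurs(List<List<String>> parents) {
        // Collect every child name that appears under at least one parent.
        Set<String> names = new TreeSet<>();
        for (List<String> children : parents) {
            names.addAll(children);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (String name : names) {
            int min = Integer.MAX_VALUE;
            for (List<String> children : parents) {
                // A parent without this child contributes a count of 0.
                min = Math.min(min, Collections.frequency(children, name));
            }
            result.put(name, min);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> parents = Arrays.asList(
                Arrays.asList("address", "address"),   // a person with two addresses
                Collections.<String>emptyList());      // a person with no children
        System.out.println(minOccurs(parents));
    }
}
```

Because the second parent has no address child, the lower bound for address drops to 0, which corresponds to minOccurs="0" in the inferred schema.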

The XmlSchemaInference class throws an XmlSchemaInferenceException if it encounters an inline XML Schema definition language (XSD) schema in the document.

Rules to infer schema node type and structure:

During schema inference, a set of rules governs how nodes in the XML document are mapped to the XSD schema. The rules apply to the following eight possible element structures:

  • Element of simple type.
  • Empty element.
  • Empty element with attributes.
  • Element with attributes and simple content.
  • Element with a sequence of child elements.
  • Element with a sequence of child elements and attributes.
  • Element with a sequence of choices of child elements.
  • Element with a sequence of choices of child elements and attributes.
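To make these categories more concrete, here is a minimal Java sketch that classifies a single element using the standard DOM parser. The class name ElementStructureSketch and the labels it returns are assumptions made for this illustration. It only distinguishes six shapes, because telling a plain sequence from a sequence of choices requires comparing several occurrences of the same element, which this sketch does not attempt.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: labels the shape of the root element of an XML string. */
final class ElementStructureSketch {

    static String classify(String xml) {
        Element element;
        try {
            element = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }

        boolean hasAttributes = element.getAttributes().getLength() > 0;
        boolean hasChildElements = false;
        boolean hasText = false;

        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElements = true;
            } else if (child.getNodeType() == Node.TEXT_NODE
                    && !child.getTextContent().trim().isEmpty()) {
                hasText = true;   // whitespace-only text is ignored
            }
        }

        if (hasChildElements) {
            return hasAttributes ? "child elements and attributes" : "child elements";
        }
        if (hasText) {
            return hasAttributes ? "simple content with attributes" : "simple type";
        }
        return hasAttributes ? "empty with attributes" : "empty";
    }

    public static void main(String[] args) {
        System.out.println(classify("<address>L5B 2E8</address>"));
        System.out.println(classify("<person value='B'/>"));
        System.out.println(classify("<person value='2'><address>L5B 2E8</address></person>"));
    }
}
```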

Rules to Infer Elements with Simple Type:

As discussed, the XmlSchemaInference class infers a data type for each attribute value as it encounters it while parsing the XML document. For example, a small non-negative whole number is inferred as unsignedByte, and alphabetic text is inferred as string. The following table lists the data types that can be inferred and the attribute values that lead to each.

Inferred data type (simple type)   Attribute value in the XML element
boolean                            true or false.
byte                               An integer between -128 and 127.
unsignedByte                       An integer between 0 and 255.
short                              An integer between -32768 and 32767.
unsignedShort                      An integer between 0 and 65535.
int                                An integer between -2147483648 and 2147483647.
unsignedInt                        An integer between 0 and 4294967295.
long                               An integer between -9223372036854775808 and 9223372036854775807.
unsignedLong                       An integer between 0 and 18446744073709551615.
integer                            A finite number of digits, possibly prefixed with "-".
float                              A decimal value between -16777216 and 16777216.
double                             A decimal value between -9007199254740992 and 9007199254740992.
duration                           W3C duration format.
dateTime                           W3C dateTime format.
time                               W3C time format.
date                               W3C date format.
string                             One or more Unicode characters.
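The boolean, integer, and string rows of this table can be expressed as a small Java sketch. It is an illustration under stated assumptions, not the .NET implementation: the class name SimpleTypeSketch and the method infer are hypothetical; the decimal, duration, and date/time rows are left out; and unsigned types are tried before signed types of the same width, an ordering assumed here because the earlier example reports the value 2 as xs:unsignedByte.

```java
import java.math.BigInteger;

/** Hypothetical helper: picks the narrowest type from the table for one value. */
final class SimpleTypeSketch {

    static String infer(String value) {
        String v = value.trim();
        if (v.equals("true") || v.equals("false")) {
            return "boolean";
        }
        if (v.matches("-?\\d+")) {
            BigInteger n = new BigInteger(v);
            // Assumed ordering: narrowest width first, unsigned before signed.
            if (inRange(n, "0", "255")) return "unsignedByte";
            if (inRange(n, "-128", "127")) return "byte";
            if (inRange(n, "0", "65535")) return "unsignedShort";
            if (inRange(n, "-32768", "32767")) return "short";
            if (inRange(n, "0", "4294967295")) return "unsignedInt";
            if (inRange(n, "-2147483648", "2147483647")) return "int";
            if (inRange(n, "0", "18446744073709551615")) return "unsignedLong";
            if (inRange(n, "-9223372036854775808", "9223372036854775807")) return "long";
            return "integer";   // digits that fit none of the bounded types
        }
        // Everything else falls back to the least restrictive type.
        return "string";
    }

    private static boolean inRange(BigInteger n, String low, String high) {
        return n.compareTo(new BigInteger(low)) >= 0
                && n.compareTo(new BigInteger(high)) <= 0;
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"2", "B", "-5", "300", "true"}) {
            System.out.println(sample + " -> " + infer(sample));
        }
    }
}
```

BigInteger is used so that values beyond the range of long, such as the upper bound of unsignedLong, can be compared without overflow.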


Java mini project:

So far we have discussed what schema inference is and how it is used in .NET to generate an XSD from an XML document and validate the original document. As in the .NET Framework, the same can be achieved on the Java platform using xsd-gen-0.2.0-jar-with-dependencies.jar or xbean-2.2.0.jar.

Java project using xsd-gen-0.2.0-jar-with-dependencies.jar to infer an XSD from an XML document.

Step 1: Download the jar (xsd-gen-0.2.0-jar-with-dependencies.jar) from the link below.

Step 2: Download the project from the link below and set it up in Eclipse.

Step 3: Set up the build path by adding the external jar (xsd-gen-0.2.0-jar-with-dependencies.jar) to the build path of the project.

Step 4: Wait for compilation to complete.

Step 5: Place the sample file at C:\schemainference\sample.xml. The file can be downloaded from the link below.

Step 6: Run the project's main class SchemaInference (shown below) as a Java application in Eclipse.

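package com.softwaretestingclass.kanif.xml.schema;

import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.wiztools.xsdgen.XsdGen;

public class SchemaInference {

    public static void main(String[] args) throws Exception {
        // Input XML document whose schema is to be inferred
        File file = new File("C:/schemainference/sample.xml");

        XsdGen gen = new XsdGen();
        // Parse the document so the generator can build the schema
        gen.parse(file);

        // Write the inferred schema to out.xsd
        File out = new File("C:/schemainference/out.xsd");
        try (OutputStream os = new FileOutputStream(out)) {
            gen.write(os);
        }

        // Echo the schema to the console as well
        gen.write(System.out);
    }
}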

Step 7: The XSD generated from the XML document is shown on the console:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="" elementFormDefault="qualified">
    <xsd:element name="person">
        <xsd:complexType mixed="true">
            <xsd:sequence>
                <xsd:element name="name" minOccurs="0" type="xsd:normalizedString" />
                <xsd:element name="address" minOccurs="0" type="xsd:string" />
            </xsd:sequence>
            <xsd:attribute name="value" type="xsd:int" use="required" />
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

Step 8: The inferred schema is written to the file C:\schemainference\out.xsd.

Input XML: This is a screenshot of the contents of the sample.xml document.

Output: This is the equivalent XSD inferred from the XML document using the above project. The best part is that this XSD can be used to validate the XML document.
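Validating a document against an inferred XSD can be done with the javax.xml.validation API that ships with the JDK. The sketch below is an illustration only: the class name XsdValidationSketch is hypothetical, and the small schema embedded in it is a hand-written stand-in (without a target namespace), not the output of the project above. To check the real files, the two StringReader sources could be replaced with StreamSource objects built from sample.xml and out.xsd.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/** Hypothetical helper: validates an XML string against an XSD string. */
final class XsdValidationSketch {

    // Hand-written stand-in schema for a person element (assumption, not tool output).
    static final String SCHEMA =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
          + "<xsd:element name='person'>"
          + "<xsd:complexType>"
          + "<xsd:sequence>"
          + "<xsd:element name='address' type='xsd:string' minOccurs='0' maxOccurs='unbounded'/>"
          + "</xsd:sequence>"
          + "<xsd:attribute name='value' type='xsd:string' use='required'/>"
          + "</xsd:complexType>"
          + "</xsd:element>"
          + "</xsd:schema>";

    static boolean isValid(String xsd, String xml) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no custom ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(SCHEMA, "<person value='2'><address>L5B 2E8</address></person>"));
        System.out.println(isValid(SCHEMA, "<person value='B'/>"));
        System.out.println(isValid(SCHEMA, "<person/>"));
    }
}
```

The last call omits the required value attribute, so it is the one input that should fail validation.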


Application of Schema Inference in SOAP XML Response:

Schema inference is also a simple way to generate the XSD for a SOAP XML response message from a web service.

The SOAP XML response message is given below. If we would like to know its XSD, we can obtain it easily using this Java mini project.

Attached below is a screenshot of the XSD schema inferred from the SOAP XML response message that we tested with SoapUI in this tutorial.

This completes the tutorial on schema inference. We hope you found the information you were looking for.

If you have worked with XML schema inference, please let us know your feedback and comments. Your questions are most welcome!
