CData Python Connector for XML

Build 23.0.8839

Free-Form Queries

As discussed in Automatic Schema Discovery, intuited table schemas enable SQL access to unstructured XML data. Customizing Schemas enables you to define static tables and gives you more granular control over the relational view of your data; for example, you can change the data types reported. However, you are not limited to the schema's view of your data.

You can query any nested structure without flattening the data. Any relations that you can access through Automatic Schema Discovery can also be accessed with an ad hoc SQL query.

Extended Projection Syntax

In the SELECT clause, use dot notation to specify an XPath to the data, as in the following query.

SELECT [personal.name.last], [personal.name.first], [vehicles.1.type], [vehicles.1.model] FROM people WHERE [personal.name.last] = 'Roberts' AND [personal.name.first] = 'Jane'

To address a specific array element, include the element's ordinal position in the path. Arrays have a zero-based index, so [vehicles.1.type] and [vehicles.1.model] in the preceding query refer to the second vehicle.
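The zero-based ordinal rule can be illustrated outside the connector. The following is a minimal sketch that uses only Python's standard xml.etree.ElementTree module; it is not the connector's implementation. The resolve helper and the trimmed SAMPLE document are hypothetical names introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of one person from the example document (illustration only).
SAMPLE = """<root>
  <people>
    <personal><name><first>John</first><last>Doe</last></name></personal>
    <vehicles><type>car</type><model>Honda Civic</model></vehicles>
    <vehicles><type>truck</type><model>Dodge Ram</model></vehicles>
  </people>
</root>"""


def resolve(element, dotted_path):
    """Follow a dotted path below `element` and return the text of the first match.

    Hypothetical helper: a name segment selects child elements with that tag,
    and a numeric segment keeps only the match at that zero-based position.
    Returns None when nothing matches.
    """
    matches = [element]
    for segment in dotted_path.split("."):
        if segment.isdigit():
            index = int(segment)
            # Slicing yields an empty list when the index is out of range.
            matches = matches[index:index + 1]
        else:
            matches = [child for node in matches for child in node.findall(segment)]
    return matches[0].text if matches else None


person = ET.fromstring(SAMPLE).find("people")
print(resolve(person, "personal.name.last"))  # Doe
print(resolve(person, "vehicles.1.type"))     # truck (index 1 is the second vehicle)
```

For the person in this trimmed sample, index 0 would select the car and index 1 selects the truck.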

Example

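The preceding query draws the column names from the example people document in Raw Data. Below is one person object from the array of people. The person matched by the query's WHERE clause (Jane Roberts) is a different entry in that document and is not shown in this excerpt.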

<?xml version="1.0" encoding="utf-8"?>
<root>
  <rootAttr1>rootValue1</rootAttr1>
  <people>
    <personal>
      <age>20</age>
      <gender>M</gender>
      <name>
        <first>John</first>
        <last>Doe</last>
      </name>
    </personal>
    <jobs>support</jobs>
    <jobs>coding</jobs>
    <vehicles>
      <type>car</type>
      <model>Honda Civic</model>
      <insurance>
        <company>ABC Insurance</company>
        <policy_num>12345</policy_num>
      </insurance>
      <features>sunroof</features>
      <features>rims</features>
      <maintenance>
        <date>07-17-2017</date>
        <desc>oil change</desc>
      </maintenance>
      <maintenance>
        <date>01-03-2018</date>
        <desc>new tires</desc>
      </maintenance>
    </vehicles>
    <vehicles>
      <type>truck</type>
      <model>Dodge Ram</model>
      <insurance>
        <company>ABC Insurance</company>
        <policy_num>12345</policy_num>
      </insurance>
      <features>lift kit</features>
      <features>tow package</features>
      <maintenance>
        <date>08-27-2017</date>
        <desc>new tires</desc>
      </maintenance>
      <maintenance>
        <date>01-08-2018</date>
        <desc>oil change</desc>
      </maintenance>
    </vehicles>
    <addresses>
      <type>work</type>
      <zip>12345</zip>
    </addresses>
    <addresses>
      <type>home</type>
      <zip>12357</zip>
    </addresses>
    <source>internet</source>
  </people>
</root> 

Connection String

With the following connection string, the connector does not parse nested data up front; nested data is processed only when you execute a query. The properties of the top-level object are still flattened through the default FlattenObjects functionality, and nested data is returned as an XML aggregate.

URI=C:\people.txt;DataModel=Document;XPath='/root/people;'

Query

You can access any nested structure in the Raw Data document as a column:

SELECT [personal.name.last], [personal.name.first], [vehicles.1.type], [vehicles.1.model] FROM people WHERE [personal.name.last] = 'Roberts' AND [personal.name.first] = 'Jane'

Note that arrays have a zero-based index, so the preceding query retrieves the second vehicle of the person matched by the WHERE clause.
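The sketch below shows one way such a query might be issued from Python through a DB-API style connection. It makes several assumptions: fetch_rows is a hypothetical helper, and an in-memory SQLite table stands in for the connector so that the example runs on its own. With the connector, the connection object would instead come from the connector's own connect call and the connection string shown above, which is not exercised here.

```python
import sqlite3

# The free-form query from this section, unchanged.
QUERY = (
    "SELECT [personal.name.last], [personal.name.first], "
    "[vehicles.1.type], [vehicles.1.model] "
    "FROM people "
    "WHERE [personal.name.last] = 'Roberts' AND [personal.name.first] = 'Jane'"
)


def fetch_rows(connection, sql):
    """Run `sql` on a DB-API style connection and return all rows (hypothetical helper)."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        cursor.close()


# Stand-in data source (assumption): SQLite accepts bracket-quoted column names,
# so a table with the same dotted column names can mimic the result shape.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE people ("
    "[personal.name.last] TEXT, [personal.name.first] TEXT, "
    "[vehicles.1.type] TEXT, [vehicles.1.model] TEXT)"
)
demo.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("Doe", "John", "truck", "Dodge Ram"),
        ("Roberts", "Jane", "car", "Honda Accord"),
    ],
)

rows = fetch_rows(demo, QUERY)
for row in rows:
    print(row)
```

The square brackets matter in both settings: without them, the dots in a name such as personal.name.last would be read as qualifier separators rather than as part of a single column name.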

Results

The preceding query returns the following results:

Column Name          Data Type  Example Value
personal.name.first  String     Jane
personal.name.last   String     Roberts
vehicles.1.type      String     car
vehicles.1.model     String     Honda Accord

Copyright (c) 2024 CData Software, Inc. - All rights reserved.