Free-Form Queries
As discussed in Automatic Schema Discovery, inferred table schemas enable SQL access to unstructured XML data. Customizing Schemas lets you define static tables and gives you more granular control over the relational view of your data; for example, you can change the reported data types. However, you are not limited to the schema's view of your data.
You can query any nested structure without flattening the data. Any relations that you can access through Automatic Schema Discovery can also be accessed with an ad hoc SQL query.
Extended Projection Syntax
In the SELECT clause, use dot notation to specify an XPath to the data, as in the following query.
SELECT [personal.name.last], [personal.name.first], [vehicles.1.type], [vehicles.1.model] FROM people WHERE [personal.name.last] = 'Roberts' AND [personal.name.first] = 'Jane'
To address a specific array element, include the element's index in the path. Arrays have a zero-based index, so the preceding query retrieves the second vehicle.
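To make the indexing rule concrete, the following Python sketch resolves a dotted path against a trimmed copy of the example person record. It only illustrates how such a path can be read; it is not the driver's implementation, and the `resolve` helper and `PERSON_XML` name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example person record; only the fields used below are kept.
PERSON_XML = """
<people>
  <personal>
    <name><first>John</first><last>Doe</last></name>
  </personal>
  <vehicles><type>car</type><model>Honda Civic</model></vehicles>
  <vehicles><type>truck</type><model>Dodge Ram</model></vehicles>
</people>
"""

def resolve(row, path):
    """Follow a dotted path from `row`; numeric segments are zero-based indexes."""
    matches = [row]
    for segment in path.split("."):
        if segment.isdigit():
            # Numeric segment: pick one element out of the current matches
            index = int(segment)
            if index >= len(matches):
                return None
            matches = [matches[index]]
        else:
            # Name segment: select same-named children of the current element(s)
            matches = [child for node in matches for child in node.findall(segment)]
            if not matches:
                return None
    text = matches[0].text
    return text.strip() if text else None

person = ET.fromstring(PERSON_XML)
print(resolve(person, "personal.name.last"))  # Doe
print(resolve(person, "vehicles.1.type"))     # truck (index 1 is the second vehicle)
print(resolve(person, "vehicles.1.model"))    # Dodge Ram
```

For this record, `vehicles.0.type` would return `car`, and an index past the end of the array returns `None` in this sketch.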
Example
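The preceding query draws the column names from the example people document in Raw Data. Below is one person object from the array of people. The record shown is for John Doe; the query filters for a different person, Jane Roberts, so the example values in Results do not come from this record.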
<?xml version="1.0" encoding="utf-8"?>
<root>
  <rootAttr1>rootValue1</rootAttr1>
  <people>
    <personal>
      <age>20</age>
      <gender>M</gender>
      <name>
        <first>John</first>
        <last>Doe</last>
      </name>
    </personal>
    <jobs>support</jobs>
    <jobs>coding</jobs>
    <vehicles>
      <type>car</type>
      <model>Honda Civic</model>
      <insurance>
        <company>ABC Insurance</company>
        <policy_num>12345</policy_num>
      </insurance>
      <features>sunroof</features>
      <features>rims</features>
      <maintenance>
        <date>07-17-2017</date>
        <desc>oil change</desc>
      </maintenance>
      <maintenance>
        <date>01-03-2018</date>
        <desc>new tires</desc>
      </maintenance>
    </vehicles>
    <vehicles>
      <type>truck</type>
      <model>Dodge Ram</model>
      <insurance>
        <company>ABC Insurance</company>
        <policy_num>12345</policy_num>
      </insurance>
      <features>lift kit</features>
      <features>tow package</features>
      <maintenance>
        <date>08-27-2017</date>
        <desc>new tires</desc>
      </maintenance>
      <maintenance>
        <date>01-08-2018</date>
        <desc>oil change</desc>
      </maintenance>
    </vehicles>
    <addresses>
      <type>work</type>
      <zip>12345</zip>
    </addresses>
    <addresses>
      <type>home</type>
      <zip>12357</zip>
    </addresses>
    <source>internet</source>
  </people>
</root>
Connection String
With the following connection string, the driver does not parse nested data up front; nested data is processed when you execute the query. The properties of the top-level object are still flattened through the default FlattenObjects functionality. Nested data is returned as an XML aggregate.
URI=C:\people.txt;DataModel=Document;XPath='/root/people;'
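As a rough picture of what this flattening means for a row, the Python sketch below turns the properties of a single nested object into dotted column names and leaves repeated elements as one unparsed XML string. This is an assumption-laden illustration, not the driver's code: the `flatten_row` helper is hypothetical, and treating repeated elements as the "nested data" that stays aggregated, as well as the exact aggregate format, are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the example person record.
ROW_XML = """
<people>
  <personal>
    <age>20</age>
    <name><first>John</first><last>Doe</last></name>
  </personal>
  <jobs>support</jobs>
  <jobs>coding</jobs>
  <source>internet</source>
</people>
"""

def flatten_row(element, prefix=""):
    """Flatten single child objects into dotted columns; keep repeated children as XML text."""
    columns = {}
    counts = Counter(child.tag for child in element)
    for tag, count in counts.items():
        name = prefix + tag
        children = element.findall(tag)
        if count > 1:
            # Repeated element (an array): left unparsed as one XML aggregate string (assumed format)
            columns[name] = "".join(
                ET.tostring(c, encoding="unicode").strip() for c in children
            )
        elif len(children[0]) > 0:
            # Single nested object: recurse and extend the dotted column name
            columns.update(flatten_row(children[0], name + "."))
        else:
            # Leaf value: becomes a plain column
            columns[name] = (children[0].text or "").strip()
    return columns

row = flatten_row(ET.fromstring(ROW_XML))
for column, value in row.items():
    print(column, "=", value)
```

In this sketch, `personal.name.last` becomes an ordinary column, while `jobs` stays a single XML string whose elements are only reached when a query names them by path.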
Query
You can access any nested structure in the Raw Data document as a column:
SELECT [personal.name.last], [personal.name.first], [vehicles.1.type], [vehicles.1.model] FROM people WHERE [personal.name.last] = 'Roberts' AND [personal.name.first] = 'Jane'
Note that arrays have a zero-based index. The preceding query retrieves the example person's second vehicle.
Results
The preceding query returns the following results:
Column Name | Data Type | Example Value |
personal.name.first | String | Jane |
personal.name.last | String | Roberts |
vehicles.1.type | String | car |
vehicles.1.model | String | Honda Accord |