CData Python Connector for XML

Build 23.0.8839

Modeling XML Data

This section shows how to control the schemes that the connector offers to bridge the gap between relational SQL and XML services. The CData Python Connector for XML provides a managed way to use the two prevailing techniques for dealing with nested XML data:

  • Parsing the data structure and building a relational model based on the existing hierarchy.
  • Drilling down into the nested elements using horizontal and vertical flattening.

Parsing Hierarchical Data

By default, the connector automatically detects the rows in a document, so that you do not need to know the structure of the XML to query it with SQL. Set the DataModel property to choose a basic configuration of how the connector models the rows into tables.
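
As a minimal sketch of how such a property might be supplied, the following builds a semicolon-delimited connection string. The helper build_connection_string, the file name people.xml, and the value Relational are assumptions for illustration; check the DataModel property reference for the values your build accepts.

```python
def build_connection_string(properties):
    """Join name/value pairs into a 'Name=Value;' connection string."""
    return "".join(f"{name}={value};" for name, value in properties.items())

conn_str = build_connection_string({
    "URI": "people.xml",        # hypothetical path to the XML document
    "DataModel": "Relational",  # assumed value; verify against the property reference
})

# With the connector installed, this string would be handed to its connect()
# function. The module name below is an assumption, so it is left commented out:
#   import cdata.xml as mod
#   conn = mod.connect(conn_str)
print(conn_str)
```

Keeping the properties in a dictionary makes it easy to switch data models without editing the string by hand.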

Flattening Objects and Arrays into Rows

To flatten data, you only need to be familiar with two data structures:

  • Object: Any parent element that does not repeat at the same height.
  • Array: Any element that repeats at the same height.

In the following example from the people collection, maintenance is an object array, since each maintenance node has child elements.

<maintenance>
  <date>07-17-2017</date>
  <desc>oil change</desc>
</maintenance>
<maintenance>
  <date>01-03-2018</date>
  <desc>new tires</desc>
</maintenance> 
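
The two definitions above can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration of the terminology only, not the connector's detection logic; the vehicle wrapper element and the classify_children helper are hypothetical and were added so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical <vehicle> wrapper so the fragment is a well-formed document.
SAMPLE = """
<vehicle>
  <maintenance>
    <date>07-17-2017</date>
    <desc>oil change</desc>
  </maintenance>
  <maintenance>
    <date>01-03-2018</date>
    <desc>new tires</desc>
  </maintenance>
</vehicle>
"""

def classify_children(parent):
    """Label each distinct child tag of `parent` as 'array', 'object', or 'value'."""
    counts = Counter(child.tag for child in parent)
    kinds = {}
    for child in parent:
        if counts[child.tag] > 1:
            kinds[child.tag] = "array"   # repeats at the same height
        elif len(child) > 0:
            kinds[child.tag] = "object"  # non-repeating parent element
        else:
            kinds[child.tag] = "value"   # leaf element that only holds text
    return kinds

root = ET.fromstring(SAMPLE)
top_level = classify_children(root)
inside_first = classify_children(root.find("maintenance"))
```

Here maintenance is classified as an array because it repeats under the same parent, while date and desc are plain values inside each maintenance node.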

Configuring Automatic Schema Discovery

The connector discovers columns and data types by scanning XML objects in XML arrays; the RowScanDepth property sets how many objects are scanned. Set the FlattenObjects and FlattenArrays properties to configure how nested data is flattened into columns; see Automatic Schema Discovery for examples.
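
To make the idea of flattening nested data into columns concrete, here is a rough sketch in plain Python. It is not the connector's implementation: the flatten function, its flatten_arrays argument, the dotted column names, and the vehicle wrapper element are all assumptions chosen for illustration, and the connector's actual column naming may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical <vehicle> wrapper so the fragment is a well-formed document.
SAMPLE = """
<vehicle>
  <maintenance>
    <date>07-17-2017</date>
    <desc>oil change</desc>
  </maintenance>
  <maintenance>
    <date>01-03-2018</date>
    <desc>new tires</desc>
  </maintenance>
</vehicle>
"""

def flatten(element, flatten_arrays=1, prefix=""):
    """Flatten `element` into {column: text}, keeping at most
    `flatten_arrays` items from each repeated (array) child."""
    columns = {}
    counts = Counter(child.tag for child in element)
    seen = Counter()
    for child in element:
        if counts[child.tag] > 1:          # array element
            index = seen[child.tag]
            seen[child.tag] += 1
            if index >= flatten_arrays:    # beyond the requested depth
                continue
            name = f"{prefix}{child.tag}.{index}"
        else:                              # object or leaf value
            name = f"{prefix}{child.tag}"
        if len(child) > 0:
            columns.update(flatten(child, flatten_arrays, name + "."))
        else:
            columns[name] = (child.text or "").strip()
    return columns

root = ET.fromstring(SAMPLE)
one_item = flatten(root, flatten_arrays=1)
two_items = flatten(root, flatten_arrays=2)
```

With flatten_arrays=1 only the first maintenance entry becomes columns; raising it to 2 adds a second set of columns for the next entry.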

Executing SQL to XML

Any relation you can access through flattening you can also access with an ad-hoc SQL query. The connector lets you query nested data directly in SQL.
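
The idea that a nested array can be treated as a relation and queried with SQL can be sketched without the connector. The example below uses Python's built-in sqlite3 module purely as a stand-in: it turns each repeated maintenance element into one row and then runs an ad-hoc query. The table name, the column names service_date and description, and the vehicle wrapper element are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical <vehicle> wrapper so the fragment is a well-formed document.
SAMPLE = """
<vehicle>
  <maintenance>
    <date>07-17-2017</date>
    <desc>oil change</desc>
  </maintenance>
  <maintenance>
    <date>01-03-2018</date>
    <desc>new tires</desc>
  </maintenance>
</vehicle>
"""

root = ET.fromstring(SAMPLE)

# One row per repeated <maintenance> element.
rows = [(m.findtext("date"), m.findtext("desc")) for m in root.findall("maintenance")]

# sqlite3 stands in for the connector here; it is not part of the product.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE maintenance (service_date TEXT, description TEXT)")
conn.executemany("INSERT INTO maintenance VALUES (?, ?)", rows)

result = conn.execute(
    "SELECT description FROM maintenance WHERE service_date = ?",
    ("01-03-2018",),
).fetchall()
conn.close()
```

The parameterized query returns only the row whose date matches, which is the kind of filtering an ad-hoc SQL query over nested data makes possible.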

Customizing Schemas

Customizing Schemas enables you to project your chosen relational structure on top of an XML document. This allows you to choose the names of columns, their data types, and the locations of their values in the document.
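
The three choices named above (column name, data type, and value location) can be illustrated with a small mapping applied through xml.etree.ElementTree. This is a conceptual sketch, not the connector's schema file format: the SCHEMA dictionary, the project helper, the column names, and the vehicle wrapper element are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime

# Hypothetical <vehicle> wrapper so the fragment is a well-formed document.
SAMPLE = """
<vehicle>
  <maintenance>
    <date>07-17-2017</date>
    <desc>oil change</desc>
  </maintenance>
  <maintenance>
    <date>01-03-2018</date>
    <desc>new tires</desc>
  </maintenance>
</vehicle>
"""

# column name -> (path to the value relative to the row, type converter)
SCHEMA = {
    "ServiceDate": ("date", lambda s: datetime.strptime(s, "%m-%d-%Y").date()),
    "Description": ("desc", str),
}

def project(root, row_path, schema):
    """Build one dict per element matched by `row_path`, using `schema`."""
    rows = []
    for node in root.findall(row_path):
        row = {}
        for column, (path, convert) in schema.items():
            text = node.findtext(path)
            row[column] = convert(text.strip()) if text is not None else None
        rows.append(row)
    return rows

rows = project(ET.fromstring(SAMPLE), "maintenance", SCHEMA)
```

Each row now exposes the renamed columns, and ServiceDate is a real date value rather than a string.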

System Catalog

The System Tables reflect the schemas you configured, whether custom or dynamically discovered. The Stored Procedures surface additional functionality in the connector's data processing operations that cannot be modeled as SELECT, INSERT, UPDATE, or DELETE. The reported stored procedures are defined in .rsb files in the folder specified by Location. If Location is not specified, they are in the db subfolder of the installation directory.
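
The folder lookup rule described above can be sketched as a small helper. The function names resolve_schema_dir and list_rsb_files and the example paths are hypothetical; the code only mirrors the stated fallback and does not call the connector.

```python
from pathlib import Path

def resolve_schema_dir(location, install_dir):
    """Return the folder to search: Location if set, else <install_dir>/db."""
    if location:
        return Path(location)
    return Path(install_dir) / "db"

def list_rsb_files(location, install_dir):
    """List the names of .rsb files in the resolved folder, if it exists."""
    folder = resolve_schema_dir(location, install_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.rsb"))

# Hypothetical installation directory, used only to show the fallback.
default_dir = resolve_schema_dir(None, "/opt/connector")
custom_dir = resolve_schema_dir("/srv/schemas", "/opt/connector")
```

Calling list_rsb_files with your own Location value (or None) then shows which stored procedure definitions would be found.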

Copyright (c) 2024 CData Software, Inc. - All rights reserved.