Automatic Schema Discovery

By default, the cmdlet automatically infers a relational schema by inspecting a series of XML objects in an array. This section describes the connection properties available to configure these dynamic schemas.

Discovering Tables

This section shows how to fine-tune the discovered schemas by fine-tuning the row scan. The rows detected depend on the RowScanDepth, DataModel, and XPath properties.

RowScanDepth: This setting determines the number of object arrays to scan to find the rows and discover the columns. The process follows nested objects, counting 1 object array as 1 row.
DataModel: Select how you want to represent the object arrays as tables. This setting also affects the objects that are scanned: the Relational and FlattenedDocuments settings will find all object arrays (including nested ones) in RowScanDepth items. The Document setting will only find the first object array.
XPath: Set this to explicitly specify the object arrays you want to expose. Note that if DataModel is set to Document, you can only set one, top-level XPath. See Parsing Hierarchical Data for examples of accessing a sample document with different XPath values.

Discovering Columns

The columns identified during the discovery process depend on the FlattenArrays and FlattenObjects properties. If FlattenObjects is set (this is the default), nested objects will be flattened into a series of columns. For example, consider the following document:

<company>
  <id>12</id>
  <name>Lohia Manufacturers Inc.</name>
  <address>
    <street>Main Street</street>
    <city>Chapel Hill</city>
    <state>NC</state>
  </address>
  <office>Chapel Hill</office>
  <office>London</office>
  <office>New York</office>
  <annual_revenue>35,600,000</annual_revenue>
</company>

With the default object flattening, this document will be represented by the following columns:


Column Name	Data Type	Example Value
id	Integer	12
name	String	Lohia Manufacturers Inc.
address.street	String	Main Street
address.city	String	Chapel Hill
address.state	String	NC
office	String	<office>Chapel Hill</office><office>London</office><office>New York</office>
annual_revenue	Double	35,600,000

If FlattenObjects is not set, then the address.street, address.city, and address.state columns will not be broken apart. The address column of type string will instead represent the entire object. Its value would be the following:

<address><street>Main Street</street><city>Chapel Hill</city><state>NC</state></address>

The FlattenArrays property can be used to flatten array values into columns of their own. This is only recommended for arrays that are expected to be short, for example the offices array below:

<office>London</office>
<office>Los Angeles</office>

The FlattenArrays property can be set to 2 to represent the array above as follows:


Column Name	Data Type	Example Value
office.0	String	London
office.1	String	Los Angeles

It is best to leave other unbounded arrays as they are and piece out the data for them as needed using XML Functions.

Cmdlets for XML

Automatic Schema Discovery

Discovering Tables

Discovering Columns