Automatic Schema Discovery
By default, the driver automatically infers a relational schema by inspecting a series of JSON objects in an array. This section describes the connection properties available to configure these dynamic schemas.
Discovering Tables
This section shows how to fine-tune the discovered schemas by fine-tuning the row scan. The rows detected depend on the RowScanDepth, DataModel, and JSONPath properties.
- RowScanDepth: This setting determines the number of object arrays to scan to find the rows and discover the columns. The process follows nested objects, counting 1 object array as 1 row.
- DataModel: Select how you want to represent the object arrays as tables. This setting also affects the objects that are scanned: the Relational and FlattenedDocuments settings will find all object arrays (including nested ones) in RowScanDepth items. The Document setting will only find the first object array.
- JSONPath: Set this to explicitly specify the object arrays you want to expose. Note that if DataModel is set to Document, you can only set one, top-level XPath. See Parsing Hierarchical Data for examples of accessing a sample document with different JSONPath values.
Discovering Columns
The columns identified during the discovery process depend on the FlattenArrays and FlattenObjects properties. If FlattenObjects is set (this is the default), nested objects will be flattened into a series of columns.
For example, consider the following document:
{ id: 12, name: "Lohia Manufacturers Inc.", address: {street: "Main Street", city: "Chapel Hill", state: "NC"}, offices: ["Chapel Hill", "London", "New York"], annual_revenue: 35,600,000 }With the default object flattening, this document will be represented by the following columns:
Column Name | Data Type | Example Value |
id | Integer | 12 |
name | String | Lohia Manufacturers Inc. |
address.street | String | Main Street |
address.city | String | Chapel Hill |
address.state | String | NC |
offices | String | ["Chapel Hill", "London", "New York"] |
annual_revenue | Double | 35,600,000 |
If FlattenObjects is not set, then the address.street, address.city, and address.state columns will not be broken apart. The address column of type string will instead represent the entire object. Its value would be the following:
{street: "Main Street", city: "Chapel Hill", state: "NC"}
The FlattenArrays property can be used to flatten array values into columns of their own. This is only recommended for arrays that are expected to be short, for example the coordinates below:
"coord": [ -73.856077, 40.848447 ]The FlattenArrays property can be set to 2 to represent the array above as follows:
Column Name | Data Type | Example Value |
coord.0 | Float | -73.856077 |
coord.1 | Float | 40.848447 |
It is best to leave other unbounded arrays as they are and piece out the data for them as needed using JSON Functions.