CData Python Connector for Azure Cosmos DB

Build 24.0.9060

Automatic Schema Discovery

The connector automatically infers a relational schema by inspecting a series of Azure Cosmos DB documents in a collection. You can use the RowScanDepth property to define the number of documents the connector will scan to do so. The columns identified during the discovery process depend on the FlattenArrays and FlattenObjects properties.

Flattening Objects

If FlattenObjects is set, all nested objects will be flattened into a series of columns. For example, consider the following document:

{
  id: 12,
  name: "Lohia Manufacturers Inc.",
  address: {street: "Main Street", city: "Chapel Hill", state: "NC"},
  offices: ["Chapel Hill", "London", "New York"],
  annual_revenue: 35,600,000
}
This document will be represented by the following columns:

Column NameData TypeExample Value
idInteger12
nameStringLohia Manufacturers Inc.
address.streetStringMain Street
address.cityStringChapel Hill
address.stateStringNC
officesString["Chapel Hill", "London", "New York"]
annual_revenueDouble35,600,000

If FlattenObjects is not set, then the address.street, address.city, and address.state columns will not be broken apart. The address column of type string will instead represent the entire object. Its value would be {street: "Main Street", city: "Chapel Hill", state: "NC"}. See JSON Functions for more details on working with JSON aggregates.

You can change the separator character in the column name from a dot by setting SeparatorCharacter.

Flattening Arrays

The FlattenArrays property can be used to flatten array values into columns of their own. This is only recommended for arrays that are expected to be short, for example the coordinates below:

"coord": [ -73.856077, 40.848447 ]
The FlattenArrays property can be set to 2 to represent the array above as follows:

Column NameData TypeExample Value
coord.0Float-73.856077
coord.1Float40.848447

It is best to leave other unbounded arrays as they are and piece out the data for them as needed using JSON Functions.

Copyright (c) 2024 CData Software, Inc. - All rights reserved.
Build 24.0.9060