Power BI Connector for HarperDB

Build 24.0.9060

Automatic Schema Discovery

The connector automatically infers a relational schema by retrieving the mapping of the HarperDB type. The columns and data types are generated from the retrieved data based on the value of the RowScanDepth property.

Detecting Data Types

HarperDB is schemaless and allows unstructured data. To map this data into a relational table, the connector will detect the data types by scanning the data for each column. The number of rows scanned is configurable via RowScanDepth.

Detecting Columns

Since HarperDB is schemaless, columns may contain JSON objects or arrays. The connector enables special handling of these JSON structures and enables you to flatten them. Therefore, columns identified during the discovery process depend on the FlattenArrays and FlattenObjects properties.

Example Data Set

To provide an example of how these options work, consider the following sample data:

[
  {
    "name": "John Smith",
    "address": {
      "street": "Main Street",
      "city": "Chapel Hill",
      "state": "NC"
    },
    "insured_ages": [ 17, 43, 45 ], 
    "vehicles": [
      {
        "year": 2015,
        "make": "Dodge",
        "model": "RAM 1500",
        "body_style": "TK"
      },
      {
        "year": 2015,
        "make": "Suzuki",
        "model": "V-Strom 650 XT",
        "body_style": "MC"
      },
      {
        "year": 2012,
        "make": "Honda",
        "model": "Accord",
        "body_style": "4D"
      }
    ]
  }
]

Using FlattenObjects

If FlattenObjects is set, all nested objects will be flattened into a series of columns. The above example will be represented by the following columns:

Column Name Data Type Example Value
name String John Smith
address.street String Main Street
address.city String Chapel Hill
address.state String NC
insured_ages String [ 17, 43, 45 ]
vehicles String [ { "year": "2015", "make": "Dodge", ... }, { "year": "2015", "make": "Suzuki", ... }, { "year": "2012", "make": "Honda", ... } ]

If FlattenObjects is not set, then the address.street, address.city, and address.state columns will not be broken apart. The address column of type string will instead represent the entire object. Its value would be the following:

{street: "Main Street", city: "Chapel Hill", state: "NC"}
See JSON Functions for more details on working with JSON aggregates.

Using FlattenArrays

The FlattenArrays property can be used to flatten array values into columns of their own. This is only recommended for arrays that are expected to be short. It is best to leave unbounded arrays as they are and piece out the data for them as needed using JSON Functions.

Note: Only the top-most array will be flattened. Any subarrays will be represented as the entire array.

The FlattenArrays property can be set to 3 to represent the arrays in the example above as follows (this example is with FlattenObjects not set):

Column Name Data Type Example Value
insured_ages String [ 17, 43, 45 ]
insured_ages.0 Integer 17
insured_ages.1 Integer 43
insured_ages.2 Integer 45
vehicles String [ { "year": "2015", "make": "Dodge", ... }, { "year": "2015", "make": "Suzuki", ... }, { "year": "2012", "make": "Honda", ... } ]
vehicles.0 String { "year": "2015", "make": "Dodge", "model": "RAM 1500", "body_style": "TK" }
vehicles.1 String { "year": "2015", "make": "Suzuki", "model": "V-Strom 650 XT", "body_style": "MC" }
vehicles.2 String { "year": "2012", "make": "Honda", "model": "Accord", "body_style": "4D" }

Using Both FlattenObjects and FlattenArrays

If FlattenObjects is set along with FlattenArrays (set to 1 for brevity), the vehicles field will be represented as follows:

Column Name Data Type Example Value
vehicles String [ { "year": "2015", "make": "Dodge", ... }, { "year": "2015", "make": "Suzuki", ... }, { "year": "2012", "make": "Honda", ... } ]
vehicles.0.year String 2015
vehicles.0.make String Dodge
vehicles.0.model String RAM 1500
vehicles.0.body_style String TK

Copyright (c) 2024 CData Software, Inc. - All rights reserved.
Build 24.0.9060