Data Type Mapping
Data Type Mappings
The cmdlet maps types from the data source to the corresponding data type available in the schema. The table below documents these mappings.
Google BigQuery | CData Schema | |
STRING | string | |
BYTES | binary | |
INTEGER | long | |
FLOAT | double | |
NUMERIC | decimal | |
BIGNUMERIC | decimal | |
BOOLEAN | bool | |
DATE | date | |
TIME | time | |
DATETIME | datetime | |
TIMESTAMP | datetime | |
STRUCT | See below | |
ARRAY | See below | |
GEOGRAPHY | string | |
JSON | string | |
INTERVAL | string |
Note that the NUMERIC type supports 38 digits of precision and the BIGDECIMAL type supports 76 digits of precision. Most platforms do not have a decimal type that supports the full precision of these values (.NET decimal supports 28 digits, and Java BigDecimal supports 38 by default). If this is the case, then you can cast these columns to a string when queried, or the connection can be configured to ignore them by setting IgnoreTypes=decimal.
STRUCT and ARRAY Types
Google BigQuery supports two kinds of types for storing compound values in a single row, STRUCT and ARRAY. In some places within Google BigQuery these are also known as RECORD and REPEATED types.
A STRUCT is a fixed-size group of values that are accessed by name and can have different types.
The cmdlet flattens structs so their individual fields can be accessed using dotted names.
Note that these dotted names must be quoted.
-- trade_value STRUCT<currency STRING, value FLOAT> SELECT CONCAT([trade_value.value], ' ', NULLIF([trade_value.currency], 'USD')) FROM trades
An ARRAY is a group of values with the same type that can have any size. The cmdlet treats the array as a single compound value and reports it as a JSON aggregate.
These types may be combined such that a STRUCT type contains an ARRAY field, or an ARRAY field is a list of STRUCT values.
The outer type takes precedence in how the field is processed:
/* Table contains fields: stocks STRUCT<symbol STRING, prices ARRAY<FLOAT>> offers: ARRAY<STRUCT<currency STRING, value FLOAT>> */ SELECT [stocks.symbol], /* ARRAY field can be read from STRUCT, but is converted to JSON */ [stocks.prices], [offers] /* STRUCT fields in an ARRAY cannot be accessed */ FROM market
INTERVAL Types
The cmdlet represents INTERVAL types as strings. Whenever a query requires an INTERVAL type, it must specify the INTERVAL using the BigQuery SQL INTERVAL format:
YEAR-MONTH DAY HOUR:MINUTE:SECOND.FRACTION. All queries that return INTERVAL values use this format unless they appear in an ARRAY aggregate, where the format depends upon how the cmdlet reads the data.
For example, the value "5 years and 11 months, minus 10 days and 3 hours and 2.5 seconds" in the correct format is:
5-11 -10 -3:0:0.2.5
Type Parameters
The cmdlet exposes parameters on the following types. In each case the type parameters are optional, Google BigQuery has default values for types that are not parameterized.
- STRING(length)
- BYTES(length)
- NUMERIC(precision) or NUMERIC(precision, scale)
- BIGNUMERIC(precision) or BIGNUMERIC(precision, scale)
These parameters are primarily for restricting the data written to the table. They are included in the table metadata as the column size for STRING and BYTES, and the numeric precision and scale for NUMERIC and BIGNUMERIC.
Type parameters have no effect on queries and are not reported within query metadata.
For example, in the example below the output of CONCAT is a plain STRING even though its inputs are a STRING(100) and b STRING(100).
SELECT CONCAT(a, b) FROM table_with_length_params