Cloud

Build 22.0.8462

CData Cloud

Overview

CData Cloud offers access to Apache Impala across several standard services and protocols, in a cloud-hosted solution. Any application that can connect to a MySQL or SQL Server database can connect to Apache Impala through CData Cloud.

CData Cloud allows you to standardize and configure connections to Apache Impala as though it were any other OData endpoint, or standard SQL Server/MySQL database.

Key Features

  • Full SQL Support: Apache Impala appears as a standard relational database, allowing you to perform operations (Filter, Group, Join, etc.) using standard SQL, regardless of whether these operations are supported by the underlying API.
  • CRUD Support: Both read and write operations are supported, restricted only by security settings that you can configure in Cloud or downstream in the source itself.
  • Secure Access: The administrator can create users and define their access to specific databases and read-only operations or grant full read & write privileges.
  • Comprehensive Data Model & Dynamic Discovery: CData Cloud provides comprehensive access to all of the data exposed in the underlying data source, including full access to dynamic data and easily searchable metadata.

Getting Started

This page provides a guide to Establishing a Connection to Apache Impala in CData Cloud, as well as information on the available resources, and a reference to the available connection properties.

Connecting to Apache Impala

Establishing a Connection shows how to authenticate to Apache Impala and configure any necessary connection properties to create a database in CData Cloud.

Accessing Data from CData Cloud Services

Accessing data from Apache Impala through the available standard services, along with CData Cloud administration, is documented in further detail in the CData Cloud Documentation.

Establishing a Connection

Connect to Apache Impala by selecting the corresponding icon in the Database tab. Required properties are listed under Settings. The Advanced tab lists connection properties that are not typically required.

Connecting to Apache Impala

In order to connect to Apache Impala, set the following:

  • Server: The name or network address of the Impala Server instance.
  • Port: The port for the connection to the Impala Server instance.
  • ProtocolVersion: The Thrift protocol version to use when connecting to the Impala server.
  • Database (optional): A default database to use when one is not supplied in the SQL query. This enables using table names without having to specify database.tablename in the query.
  • Pagesize (optional): The number of results to pull per page from Apache Impala when selecting data.
  • QueryPassthrough (optional): Indicates if the query should be passed to Impala as-is.
  • UseSSL (optional): Set this to enable TLS/SSL.

When QueryPassthrough is set to false (the default), the Cloud attempts to modify the query to conform to the format that Impala requires.
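The properties above can be collected as name/value pairs. The following is a minimal sketch, assuming a semicolon-separated Name=Value form like the LogModules example in the Logging section. The helper name build_connection_string, the host name, and the port value (21050 is a commonly used Impala port, chosen here only for illustration) are assumptions, not values taken from this documentation.

```python
def build_connection_string(props):
    """Join connection properties into a 'Name=Value;' string, skipping unset ones."""
    parts = []
    for name, value in props.items():
        if value is None:
            continue  # optional property left unset
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(f"{name}={value}")
    return ";".join(parts) + ";"


# Hypothetical values for illustration only.
settings = {
    "Server": "impala.example.com",
    "Port": 21050,
    "Database": "default",
    "Pagesize": None,           # optional, left at its default
    "QueryPassthrough": False,  # let the Cloud rewrite queries for Impala
    "UseSSL": True,
}
connection_string = build_connection_string(settings)
print(connection_string)
```

Leaving an optional property as None keeps it out of the string, so the default for that property applies.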

Authenticating to Apache Impala

There are several ways to authenticate to Apache Impala including:

  • NoSasl
  • LDAP
  • Kerberos

NoSasl

When using NoSasl, no authentication is performed. It is used when you are connecting to a server from a trusted location such as a test machine on your local network. NoSasl is the default AuthScheme, so no additional connection properties need to be set.

LDAP

To authenticate with LDAP, set the following connection properties:

  • AuthScheme: Set this to LDAP.
  • User: Set this to the user to log in as.
  • Password: Set this to the password of the user.

If the LDAP server enables the Unauthenticated Authentication Mechanism of Simple Bind, the Password is optional.

Kerberos

Set the AuthScheme property to Kerberos. Please see Using Kerberos for details about how to authenticate with Kerberos.

Using Kerberos

This section shows how to use the Cloud to authenticate using Kerberos.

Kerberos

To authenticate to Apache Impala using Kerberos, set the following properties:

  • KerberosKDC: Set this to the host name or IP Address of your Kerberos KDC machine.
  • KerberosSPN: Set this to the service and host of the Apache Impala Kerberos Principal. This is the value prior to the '@' symbol (for instance, ServiceName/MyHost) of the principal value (for instance, ServiceName/[email protected]).

Retrieve the Kerberos Ticket

You can use one of the following options to retrieve the required Kerberos ticket.

MIT Kerberos Credential Cache File

This option enables you to use the MIT Kerberos Ticket Manager or kinit command to get tickets. Note that you do not need to set the User or Password connection properties with this option.

  1. Ensure that you have an environment variable created called KRB5CCNAME.
  2. Set the KRB5CCNAME environment variable to a path pointing to your credential cache file (for instance, C:\krb_cache\krb5cc_0 or /tmp/krb5cc_0). This file is created when generating your ticket with MIT Kerberos Ticket Manager.
  3. To obtain a ticket, open the MIT Kerberos Ticket Manager application, click Get Ticket, enter your principal name and password, then click OK. If successful, ticket information appears in Kerberos Ticket Manager and is stored in the credential cache file.
  4. Now that you have created the credential cache file, the Cloud uses the cache file to obtain the Kerberos ticket to connect to Apache Impala.

As an alternative to setting the KRB5CCNAME environment variable, you can directly set the file path using the KerberosTicketCache property. When set, the Cloud uses the specified cache file to obtain the Kerberos ticket to connect to Apache Impala.

Keytab File

If the KRB5CCNAME environment variable has not been set, you can retrieve a Kerberos ticket using a Keytab File. To do so, set the User property to the desired username and set the KerberosKeytabFile property to a file path pointing to the keytab file associated with the user.

User and Password

If neither the KRB5CCNAME environment variable nor the KerberosKeytabFile property has been set, you can retrieve a ticket using a user and password combination. To do this, set the User and Password properties to the user/password combination that you use to authenticate with Apache Impala.
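The three options above imply an order of precedence: credential cache first, then keytab, then user and password. The following is a minimal sketch of that decision, not the Cloud's actual implementation. The function name pick_ticket_source is hypothetical, and checking KerberosTicketCache before KRB5CCNAME is an assumption, since the text does not say which of the two wins when both are set.

```python
import os


def pick_ticket_source(props, env=None):
    """Return (method, value) describing where the Kerberos ticket would come from."""
    env = os.environ if env is None else env
    # Assumption: an explicit KerberosTicketCache overrides KRB5CCNAME.
    if props.get("KerberosTicketCache"):
        return ("cache", props["KerberosTicketCache"])
    if env.get("KRB5CCNAME"):
        return ("cache", env["KRB5CCNAME"])
    # Keytab option: needs both User and KerberosKeytabFile.
    if props.get("KerberosKeytabFile") and props.get("User"):
        return ("keytab", props["KerberosKeytabFile"])
    # Last resort: user and password combination.
    if props.get("User") and props.get("Password"):
        return ("password", props["User"])
    raise ValueError("no Kerberos credentials configured")


print(pick_ticket_source({"User": "alice", "Password": "secret"}, env={}))
```

Passing env explicitly keeps the function easy to test without touching the real process environment.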

Cross-Realm

More complex Kerberos environments may require cross-realm authentication where multiple realms and KDC servers are used (e.g., where one realm/KDC is used for user authentication and another realm/KDC is used for obtaining the service ticket).

In such an environment, set the KerberosRealm and KerberosKDC properties to the values required for user authentication. Also set the KerberosServiceRealm and KerberosServiceKDC properties to the values required to obtain the service ticket.

Changelog

General Changes

Date | Build Number | Change Type | Description

12/14/2022 | 8383 | General
Changed
  • Added the Default column to the sys_procedureparameters table.

09/30/2022 | 8308 | General
Changed
  • Added the IsPath column to the sys_procedureparameters table.

08/17/2022 | 8264 | General
Changed
  • The keyword "COLLATE" can now also be handled as a standard function name.

09/16/2021 | 7929 | Apache Impala
Added
  • Added four connection properties: SSLClientCert, SSLClientCertPassword, SSLClientCertSubject, and SSLClientCertType.

09/02/2021 | 7915 | General
Added
  • Added support for the STRING_SPLIT table-valued function in the CROSS APPLY clause.

08/07/2021 | 7889 | General
Changed
  • Added the KeySeq column to the sys_foreignkeys table.

08/06/2021 | 7888 | General
Changed
  • Added the new sys_primarykeys system table.

07/23/2021 | 7874 | General
Changed
  • Updated the Literal Function Names for relative date/datetime functions. Previously, relative date/datetime functions resolved to a different value when used in the projection vs. the predicate. For example: SELECT LAST_MONTH() AS lm, Col FROM Table WHERE Col > LAST_MONTH(). Formerly the two LAST_MONTH() calls would resolve to different datetimes. Now they match.
  • As a replacement for the previous behavior, the relative date/datetime functions in the criteria may have an 'L_' prefix added to them. For example: WHERE col > L_LAST_MONTH(). This continues to resolve to the same values that were previously calculated in the criteria. Note that the "L_" prefix only works in the predicate; it is not available for the projection.

07/08/2021 | 7859 | General
Added
  • Added the TCP Logging Module for logging information about the TCP wire protocol. The incoming and outgoing transport bytes are logged at verbosity=5.

04/23/2021 | 7785 | General
Added
  • Added support for handling client-side formulas during insert / update. For example: UPDATE Table SET Col1 = Concat(Col1, " - ", Col2) WHERE Col2 LIKE 'A%'

04/23/2021 | 7783 | General
Changed
  • Updated how display sizes are determined for varchar primary key and foreign key columns so they match the reported length of the column.

04/16/2021 | 7776 | General
Added
  • Non-conditional updates between two columns are now available to all drivers. For example: UPDATE Table SET Col1=Col2
Changed
  • Reduced the length to 255 for varchar primary key and foreign key columns.
  • Updated implicit and metadata caching to improve performance and support for multiple connections. Old metadata caches are not compatible; you need to generate new metadata caches if you are currently using CacheMetadata.
  • Updated the index naming convention to avoid duplicates.
  • Updated and standardized Getting Started connection help.
  • Added the Advanced Features section to the help of all drivers.
  • Categorized connection property listings in the help for all editions.

04/15/2021 | 7775 | General
Changed
  • Kerberos authentication is updated to use TCP by default, but falls back to UDP if a TCP connection cannot be established.

Advanced Features

This section details a selection of advanced features of the Apache Impala Cloud.

User Defined Views

The Cloud allows you to define virtual tables, called user defined views, whose contents are decided by a pre-configured query. These views are useful when you cannot directly control queries being issued to the drivers. See User Defined Views for an overview of creating and configuring custom views.

SSL Configuration

Use SSL Configuration to adjust how the Cloud handles TLS/SSL certificate negotiations. You can choose from various certificate formats; see the SSLServerCert property under "Connection String Options" for more information.

Firewall and Proxy

Configure the Cloud for compliance with Firewall and Proxy, including Windows proxies and HTTP proxies. You can also set up tunnel connections.

Query Processing

The Cloud offloads as much of the SELECT statement processing as possible to Apache Impala and then processes the rest of the query in memory (client-side).

See Query Processing for more information.

Logging

See Logging for an overview of configuration settings that can be used to refine CData logging. For basic logging, you only need to set two connection properties, but there are numerous features that support more refined logging, where you can select subsets of information to be logged using the LogModules connection property.

User Defined Views

The CData Cloud allows you to define a virtual table whose contents are decided by a pre-configured query. These are called User Defined Views, which are useful in situations where you cannot directly control the query being issued to the driver, e.g. when using the driver from a tool. The User Defined Views can be used to define predicates that are always applied. If you specify additional predicates in the query to the view, they are combined with the query already defined as part of the view.

To create user defined views, create a JSON-formatted configuration file that defines the views you want.

Defining Views Using a Configuration File

User Defined Views are defined in a JSON-formatted configuration file called UserDefinedViews.json. The Cloud automatically detects the views specified in this file.

You can also have multiple view definitions and control them using the UserDefinedViews connection property. When you use this property, only the specified views are seen by the Cloud.

This User Defined View configuration file is formatted as follows:

  • Each root element defines the name of a view.
  • Each root element contains a child element, called query, which contains the custom SQL query for the view.

For example:

{
	"MyView": {
		"query": "SELECT * FROM [CData].[Default].Customers WHERE MyColumn = 'value'"
	},
	"MyView2": {
		"query": "SELECT * FROM MyTable WHERE Id IN (1,2,3)"
	}
}
Use the UserDefinedViews connection property to specify the location of your JSON configuration file. For example:
"UserDefinedViews", "C:\\Users\\yourusername\\Desktop\\tmp\\UserDefinedViews.json"

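As a minimal sketch of the file format described above, the following Python snippet builds the same two example views and checks that every root element carries a query string. The helper name validate_views is hypothetical, and the check covers only the structure described in this section.

```python
import json


def validate_views(config_text):
    """Parse a UserDefinedViews-style JSON document and return the view names."""
    views = json.loads(config_text)
    for name, definition in views.items():
        # Each root element must contain a child element called "query".
        if not isinstance(definition, dict) or not isinstance(definition.get("query"), str):
            raise ValueError(f"view {name!r} needs a string 'query' element")
    return sorted(views)


views = {
    "MyView": {"query": "SELECT * FROM [CData].[Default].Customers WHERE MyColumn = 'value'"},
    "MyView2": {"query": "SELECT * FROM MyTable WHERE Id IN (1,2,3)"},
}
config_text = json.dumps(views, indent=4)  # text you could save as UserDefinedViews.json
view_names = validate_views(config_text)
print(view_names)
```

Generating the file with json.dumps avoids hand-escaping quotes and backslashes inside the SQL text.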
Schema for User Defined Views

User Defined Views are exposed in the UserViews schema by default. This is done to avoid the view's name clashing with an actual entity in the data model. You can change the name of the schema used for UserViews by setting the UserViewsSchemaName property.

Working with User Defined Views

For example, suppose a User Defined View called UserViews.RCustomers lists only customers in Raleigh:

SELECT * FROM Customers WHERE City = 'Raleigh';

An example of a query to the driver:

SELECT * FROM UserViews.RCustomers WHERE Status = 'Active';

This results in the following effective query to the source:

SELECT * FROM Customers WHERE City = 'Raleigh' AND Status = 'Active';

This is a very simple example of a query to a User Defined View; the effective query is a combination of the view query and the view definition. It is possible to compose these queries in much more complex patterns. All SQL operations are allowed in both queries and are combined when appropriate.
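The way the two predicates merge can be illustrated with a deliberately naive sketch. The function combine_with_view is hypothetical and relies on plain string handling; the Cloud's SQL engine parses the statements properly, so this only mirrors the simple single-table case shown above.

```python
def combine_with_view(view_query, extra_predicate):
    """Append a predicate to a simple single-table view query."""
    base = view_query.strip().rstrip(";")
    if " where " in base.lower():
        # The view already filters rows, so both conditions must hold.
        return f"{base} AND {extra_predicate};"
    return f"{base} WHERE {extra_predicate};"


effective = combine_with_view(
    "SELECT * FROM Customers WHERE City = 'Raleigh';",
    "Status = 'Active'",
)
print(effective)
```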

SSL Configuration

Customizing the SSL Configuration

By default, the Cloud attempts to negotiate SSL/TLS by checking the server's certificate against the system's trusted certificate store.

To specify another certificate, see the SSLServerCert property for the available formats to do so.

Firewall and Proxy

Connecting Through a Firewall or Proxy

HTTP Proxies

To connect through the Windows system proxy, you do not need to set any additional connection properties. To connect to other proxies, set ProxyAutoDetect to false.

To authenticate to an HTTP proxy, set ProxyAuthScheme, ProxyUser, and ProxyPassword, in addition to ProxyServer and ProxyPort.

Other Proxies

Set the following properties:

  • To use a proxy-based firewall, set FirewallType, FirewallServer, and FirewallPort.
  • To tunnel the connection, set FirewallType to TUNNEL.
  • To authenticate, specify FirewallUser and FirewallPassword.
  • To authenticate to a SOCKS proxy, additionally set FirewallType to SOCKS5.

Query Processing

CData has a client-side SQL engine built into the Cloud library. This enables support for the full capabilities that SQL-92 offers, including filters, aggregations, functions, etc.

For sources that do not support SQL-92, the Cloud offloads as much of SQL statement processing as possible to Apache Impala and then processes the rest of the query in memory (client-side). This results in optimal performance.

For data sources with limited query capabilities, the Cloud handles transformations of the SQL query to make it simpler for the Cloud. The goal is to make smart decisions based on the query capabilities of the data source to push down as much of the computation as possible. The Apache Impala Query Evaluation component examines SQL queries and returns information indicating what parts of the query the Cloud is not capable of executing natively.

The Apache Impala Query Slicer component is used in more specific cases to separate a single query into multiple independent queries. The client-side Query Engine makes decisions about simplifying queries, breaking queries into multiple queries, and pushing down or computing aggregations on the client-side while minimizing the size of the result set.

There's a significant trade-off in evaluating queries, even partially, client-side. There are always queries that are impossible to execute efficiently in this model, and some can be particularly expensive to compute in this manner. CData always pushes down as much of the query as is feasible for the data source to generate the most efficient query possible and provide the most flexible query capabilities.
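The split between pushed-down and client-side work can be sketched in a few lines. This is an illustration under stated assumptions, not CData's engine: the in-memory list stands in for the data source, the set source_ops stands in for its query capabilities, and all names are hypothetical.

```python
# Operators the sketch knows how to evaluate.
OPS = {
    "=": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "CONTAINS": lambda a, b: b in a,
}


def matches(row, predicate):
    column, op, value = predicate
    return OPS[op](row[column], value)


def execute(table, predicates, source_ops):
    """Split predicates into pushed-down and client-side parts, then apply both."""
    pushed = [p for p in predicates if p[1] in source_ops]
    remaining = [p for p in predicates if p[1] not in source_ops]
    # Stand-in for the data source: it evaluates only the pushed-down predicates.
    fetched = [row for row in table if all(matches(row, p) for p in pushed)]
    # Client-side step: evaluate whatever the source could not.
    result = [row for row in fetched if all(matches(row, p) for p in remaining)]
    return result, len(fetched)


customers = [
    {"Name": "Acme Corp", "Country": "US", "Balance": 120},
    {"Name": "Globex", "Country": "US", "Balance": 80},
    {"Name": "Acme Ltd", "Country": "UK", "Balance": 200},
]
predicates = [("Country", "=", "US"), ("Name", "CONTAINS", "Acme")]
rows, fetched_count = execute(customers, predicates, source_ops={"=", ">"})
print(rows, fetched_count)
```

The second return value shows the trade-off: the fewer predicates the source can evaluate, the more rows must be transferred and filtered in memory.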

More Information

For a full discussion of how CData handles query processing, see CData Architecture: Query Execution.

Logging

Capturing Cloud logging can be very helpful when diagnosing error messages or other unexpected behavior.

Basic Logging

You need to set only two connection properties, Logfile and Verbosity, to begin capturing Cloud logging. MaxLogFileSize and MaxLogFileCount are optional and control log file rotation.

  • Logfile: A filepath which designates the name and location of the log file.
  • Verbosity: This is a numerical value (1-5) that determines the amount of detail in the log. See the page in the Connection Properties section for an explanation of the five levels.
  • MaxLogFileSize: When the limit is hit, a new log is created in the same folder with the date and time appended to the end. The default limit is 100 MB. Values lower than 100 kB will use 100 kB as the value instead.
  • MaxLogFileCount: A string specifying the maximum file count of log files. When the limit is hit, a new log is created in the same folder with the date and time appended to the end and the oldest log file will be deleted. Minimum supported value is 2. A value of 0 or a negative value indicates no limit on the count.

Once these properties are set, the Cloud populates the log file as it carries out various tasks, such as when authentication is performed or queries are executed. If the specified file doesn't already exist, it is created.

Log Verbosity

The verbosity level determines the amount of detail that the Cloud reports to the Logfile. Verbosity levels from 1 to 5 are supported. These are described in the following list:

  • 1: Setting Verbosity to 1 will log the query, the number of rows returned by it, the start of execution and the time taken, and any errors.
  • 2: Setting Verbosity to 2 will log everything included in Verbosity 1 and additional information about the request.
  • 3: Setting Verbosity to 3 will additionally log HTTP headers, as well as the body of the request and the response.
  • 4: Setting Verbosity to 4 will additionally log transport-level communication with the data source. This includes SSL negotiation.
  • 5: Setting Verbosity to 5 will additionally log communication with the data source and additional details that may be helpful in troubleshooting problems. This includes interface commands.

The Verbosity should not be set to greater than 1 for normal operation. Substantial amounts of data can be logged at higher verbosities, which can delay execution times.

To refine the logged content further by showing/hiding specific categories of information, see LogModules.

Sensitive Data

Verbosity levels 3 and higher may capture information that you do not want shared outside of your organization. The following lists information of concern for each level:

  • Verbosity 3: The full body of the request and the response, which includes all the data returned by the Cloud
  • Verbosity 4: SSL certificates
  • Verbosity 5: Any extra transfer data not included at Verbosity 3, such as non human-readable binary transfer data

Best Practices for Data Security

Although we mask sensitive values, such as passwords, in the connection string and any request in the log, it is always best practice to review the logs for any sensitive information before sharing outside your organization.

Advanced Logging

You may want to refine the exact information that is recorded to the log file. This can be accomplished using the LogModules property.

This property allows you to filter the logging using a semicolon-separated list of logging modules.

All modules are four characters long. Please note that modules containing three letters have a required trailing blank space. The available modules are:

  • EXEC: Query Execution. Includes execution messages for original SQL queries, parsed SQL queries, and normalized SQL queries. Query and page success/failure messages appear here as well.
  • INFO: General Information. Includes the connection string, driver version (build number), and initial connection messages.
  • HTTP: HTTP Protocol messages. Includes HTTP requests/responses (including POST messages), as well as Kerberos related messages.
  • SSL : SSL certificate messages.
  • OAUT: OAuth related failure/success messages.
  • SQL : Includes SQL transactions, SQL bulk transfer messages, and SQL result set messages.
  • META: Metadata cache and schema messages.
  • TCP : Incoming and outgoing raw bytes on TCP transport layer messages.

An example value for this property would be:
LogModules=INFO;EXEC;SSL ;SQL ;META;

Note that these modules refine the information as it is pulled after taking the Verbosity into account.
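Because three-letter module names carry a required trailing space, a LogModules value is easy to get wrong by hand. The following is a small sketch of a checker for such values. The function parse_log_modules is hypothetical; padding a bare three-letter name is a convenience of this sketch, not documented Cloud behavior.

```python
KNOWN_MODULES = {"EXEC", "INFO", "HTTP", "SSL ", "OAUT", "SQL ", "META", "TCP "}


def parse_log_modules(value):
    """Split a semicolon-separated LogModules value into four-character names."""
    modules = []
    for item in value.split(";"):
        if item == "":
            continue  # tolerate the trailing semicolon
        padded = item.ljust(4)  # three-letter names need a trailing space
        if padded not in KNOWN_MODULES:
            raise ValueError(f"unknown log module: {item!r}")
        modules.append(padded)
    return modules


modules = parse_log_modules("INFO;EXEC;SSL ;SQL ;META;")
print(modules)
```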

SQL Compliance

The CData Cloud supports several operations on data, including querying, deleting, modifying, and inserting.

SELECT Statements

See SELECT Statements for a syntax reference and examples.

See Data Model for information on the capabilities of the Apache Impala API.

INSERT Statements

See INSERT Statements for a syntax reference and examples.

EXECUTE Statements

Use EXECUTE or EXEC statements to execute stored procedures. See EXECUTE Statements for a syntax reference and examples.

Names and Quoting

  • Table and column names are considered identifier names; as such, they are restricted to the following characters: [A-Z, a-z, 0-9, _:@].
  • To use a table or column name with characters not listed above, the name must be quoted using square brackets ([name]) in any SQL statement.
  • Parameter names can optionally start with the @ symbol (e.g., @p1 or @CustomerName) and cannot be quoted.
  • Strings must be quoted using single quotes (e.g., 'John Doe').
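The identifier rule above can be applied mechanically when queries are generated by code. This is a minimal sketch with a hypothetical helper, quote_identifier. It does not handle names that themselves contain a closing square bracket, because this section does not describe how such names are escaped.

```python
import re

# Characters allowed in an unquoted identifier, per the list above.
_PLAIN_IDENTIFIER = re.compile(r"[A-Za-z0-9_:@]+")


def quote_identifier(name):
    """Wrap a table or column name in square brackets only when it needs them."""
    if _PLAIN_IDENTIFIER.fullmatch(name):
        return name
    return f"[{name}]"


columns = [quote_identifier(c) for c in ("CompanyName", "Order Date", "Unit-Price")]
query = f"SELECT {', '.join(columns)} FROM {quote_identifier('Customers')}"
print(query)
```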

SELECT Statements

A SELECT statement can consist of the following basic clauses.

  • SELECT
  • INTO
  • FROM
  • JOIN
  • WHERE
  • GROUP BY
  • HAVING
  • UNION
  • ORDER BY
  • LIMIT

SELECT Syntax

The following syntax diagram outlines the syntax supported by the SQL engine of the Cloud:

SELECT {
  [ TOP <numeric_literal> | DISTINCT ]
  { 
    * 
    | { 
        <expression> [ [ AS ] <column_reference> ] 
        | { <table_name> | <correlation_name> } .* 
      } [ , ... ] 
  }
  [ INTO csv:// [ filename= ] <file_path> [ ;delimiter=tab ] ]
  { 
    FROM <table_reference> [ [ AS ] <identifier> ] 
  } [ , ... ]
  [ [  
      INNER | { { LEFT | RIGHT | FULL } [ OUTER ] } 
    ] JOIN <table_reference> [ ON <search_condition> ] [ [ AS ] <identifier> ] 
  ] [ ... ] 
  [ WHERE <search_condition> ]
  [ GROUP BY <column_reference> [ , ... ] ]
  [ HAVING <search_condition> ]
  [ UNION [ ALL ] <select_statement> ]
  [ 
    ORDER BY 
    <column_reference> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
  ]
  [ 
    LIMIT <expression>
    [ 
      { OFFSET | , }
      <expression> 
    ]
  ] 
} | SCOPE_IDENTITY() 

<expression> ::=
  | <column_reference>
  | @ <parameter> 
  | ?
  | COUNT( * | { [ DISTINCT ] <expression> } )
  | { AVG | MAX | MIN | SUM | COUNT } ( <expression> ) 
  | NULLIF ( <expression> , <expression> ) 
  | COALESCE ( <expression> , ... ) 
  | CASE <expression>
      WHEN { <expression> | <search_condition> } THEN { <expression> | NULL } [ ... ]
    [ ELSE { <expression> | NULL } ]
    END 
  | <literal>
  | <sql_function> 

<search_condition> ::= 
  {
    <expression> { = | > | < | >= | <= | <> | != | LIKE | NOT LIKE | IN | NOT IN | IS NULL | IS NOT NULL | AND | OR | CONTAINS | BETWEEN } [ <expression> ]
  } [ { AND | OR } ... ] 

Examples

  1. Return all columns:
    SELECT * FROM [CData].[Default].Customers
  2. Rename a column:
    SELECT [CompanyName] AS MY_CompanyName FROM [CData].[Default].Customers
  3. Cast a column's data as a different data type:
    SELECT CAST(Balance AS VARCHAR) AS Str_Balance FROM [CData].[Default].Customers
  4. Search data:
    SELECT * FROM [CData].[Default].Customers WHERE Country = 'US'
  5. Return the number of items matching the query criteria:
    SELECT COUNT(*) AS MyCount FROM [CData].[Default].Customers 
  6. Return the number of unique items matching the query criteria:
    SELECT COUNT(DISTINCT CompanyName) FROM [CData].[Default].Customers 
  7. Return the unique items matching the query criteria:
    SELECT DISTINCT CompanyName FROM [CData].[Default].Customers 
  8. Summarize data:
    SELECT CompanyName, MAX(Balance) FROM [CData].[Default].Customers GROUP BY CompanyName
    See Aggregate Functions for details.
  9. Retrieve data from multiple tables:
    SELECT Customers.ContactName, Orders.OrderDate FROM Customers, Orders WHERE Customers.CustomerId=Orders.CustomerId
    See JOIN Queries for details.
  10. Sort a result set in ascending order:
    SELECT City, CompanyName FROM [CData].[Default].Customers  ORDER BY CompanyName ASC
  11. Restrict a result set to the specified number of rows:
    SELECT City, CompanyName FROM [CData].[Default].Customers LIMIT 10 
  12. Parameterize a query to pass in inputs at execution time. This enables you to create prepared statements and mitigate SQL injection attacks.
    SELECT * FROM [CData].[Default].Customers WHERE Country = @param

Pseudo Columns

Some input-only fields are available in SELECT statements. These fields, called pseudo columns, do not appear as regular columns in the results, yet may be specified as part of the WHERE clause. You can use pseudo columns to access additional features from Apache Impala.

    SELECT * FROM [CData].[Default].Customers WHERE MyPseudocolumn = 'MyValue'
    

Aggregate Functions

COUNT

Returns the number of rows matching the query criteria.

SELECT COUNT(*) FROM [CData].[Default].Customers WHERE Country = 'US'

COUNT(DISTINCT)

Returns the number of distinct, non-null field values matching the query criteria.

SELECT COUNT(DISTINCT City) AS DistinctValues FROM [CData].[Default].Customers WHERE Country = 'US'

AVG

Returns the average of the column values.

SELECT CompanyName, AVG(Balance) FROM [CData].[Default].Customers WHERE Country = 'US'  GROUP BY CompanyName

MIN

Returns the minimum column value.

SELECT MIN(Balance), CompanyName FROM [CData].[Default].Customers WHERE Country = 'US' GROUP BY CompanyName

MAX

Returns the maximum column value.

SELECT CompanyName, MAX(Balance) FROM [CData].[Default].Customers WHERE Country = 'US' GROUP BY CompanyName

SUM

Returns the total sum of the column values.

SELECT SUM(Balance) FROM [CData].[Default].Customers WHERE Country = 'US'

JOIN Queries

The CData Cloud supports standard SQL joins like the following examples.

Inner Join

An inner join selects only rows from both tables that match the join condition:

SELECT Customers.ContactName, Orders.OrderDate FROM Customers, Orders WHERE Customers.CustomerId=Orders.CustomerId

Left Join

A left join selects all rows in the FROM table and only matching rows in the JOIN table:

SELECT Customers.ContactName, Orders.OrderDate FROM Customers LEFT OUTER JOIN Orders ON Customers.CustomerId=Orders.CustomerId
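The difference between the two join types is easiest to see on a small data set. The following sketch reproduces the semantics in plain Python. The sample rows and function names are made up for illustration, and the column names follow the queries above.

```python
customers = [
    {"CustomerId": 1, "ContactName": "Ana"},
    {"CustomerId": 2, "ContactName": "Ben"},
]
orders = [
    {"CustomerId": 1, "OrderDate": "2021-04-16"},
    {"CustomerId": 1, "OrderDate": "2021-04-23"},
]


def inner_join(left, right, key):
    """Keep only the pairs of rows whose key values match."""
    return [(l["ContactName"], r["OrderDate"]) for l in left for r in right if l[key] == r[key]]


def left_join(left, right, key):
    """Keep every left row; unmatched rows get None for the right-hand column."""
    result = []
    for l in left:
        matched = [r for r in right if l[key] == r[key]]
        if matched:
            result.extend((l["ContactName"], r["OrderDate"]) for r in matched)
        else:
            result.append((l["ContactName"], None))
    return result


inner_rows = inner_join(customers, orders, "CustomerId")
left_rows = left_join(customers, orders, "CustomerId")
print(inner_rows)
print(left_rows)
```

Ben has no orders, so he appears only in the left join result, with a missing OrderDate.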

Date Literal Functions

The following date literal functions can be used to filter date fields using relative intervals. Note that while the <, >, and = operators are supported for these functions, <= and >= are not.

L_TODAY()

The current day.

  SELECT * FROM MyTable WHERE MyDateField = L_TODAY()

L_YESTERDAY()

The previous day.

  SELECT * FROM MyTable WHERE MyDateField = L_YESTERDAY()

L_TOMORROW()

The following day.

  SELECT * FROM MyTable WHERE MyDateField = L_TOMORROW()

L_LAST_WEEK()

Every day in the preceding week.

  SELECT * FROM MyTable WHERE MyDateField = L_LAST_WEEK()

L_THIS_WEEK()

Every day in the current week.

  SELECT * FROM MyTable WHERE MyDateField = L_THIS_WEEK()

L_NEXT_WEEK()

Every day in the following week.

  SELECT * FROM MyTable WHERE MyDateField = L_NEXT_WEEK()
Also available:
  • L_LAST_MONTH(), L_THIS_MONTH(), L_NEXT_MONTH()
  • L_LAST_QUARTER(), L_THIS_QUARTER(), L_NEXT_QUARTER()
  • L_LAST_YEAR(), L_THIS_YEAR(), L_NEXT_YEAR()

L_LAST_N_DAYS(n)

The previous n days, excluding the current day.

  SELECT * FROM MyTable WHERE MyDateField = L_LAST_N_DAYS(3)

L_NEXT_N_DAYS(n)

The following n days, including the current day.

  SELECT * FROM MyTable WHERE MyDateField = L_NEXT_N_DAYS(3)
Also available:
  • L_LAST_90_DAYS(), L_NEXT_90_DAYS()

L_LAST_N_WEEKS(n)

Every day in every week, starting n weeks before current week, and ending in the previous week.

  SELECT * FROM MyTable WHERE MyDateField = L_LAST_N_WEEKS(3)

L_NEXT_N_WEEKS(n)

Every day in every week, starting the following week, and ending n weeks in the future.

  SELECT * FROM MyTable WHERE MyDateField = L_NEXT_N_WEEKS(3)

Also available:
  • L_LAST/L_NEXT_N_MONTHS(n)
  • L_LAST/L_NEXT_N_QUARTERS(n)
  • L_LAST/L_NEXT_N_YEARS(n)

Projection Functions

ROUND(expr [, d])

Returns expr rounded to d decimal places using HALF_UP rounding mode.

  • expr: Any numeric expression.
  • d: The number of decimal places.

BROUND(expr [, d])

Returns expr rounded to d decimal places using HALF_EVEN rounding mode.

  • expr: Any numeric expression.
  • d: The number of decimal places.
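
The two rounding modes differ only in how they treat exact ties. The Python sketch below models them with the standard decimal module; the function names are hypothetical, and defaulting d to 0 when it is omitted is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

def _quantum(d):
    # Decimal("1") for d=0, Decimal("0.01") for d=2, and so on.
    return Decimal(1).scaleb(-d)

def round_half_up(expr, d=0):
    """Model of ROUND: ties move away from zero."""
    return Decimal(str(expr)).quantize(_quantum(d), rounding=ROUND_HALF_UP)

def round_half_even(expr, d=0):
    """Model of BROUND: ties move to the neighbor whose last digit is even."""
    return Decimal(str(expr)).quantize(_quantum(d), rounding=ROUND_HALF_EVEN)
```

With these definitions, 2.5 rounds to 3 under HALF_UP but to 2 under HALF_EVEN, while 3.5 rounds to 4 under both.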

FLOOR(expr)

Returns the largest integer not greater than expr.

  • expr: Any numeric expression.

CEIL(expr)

Returns the smallest integer not smaller than expr.

  • expr: Any numeric expression.

RAND([seed])

Returns a random value drawn independently from a uniform distribution over [0, 1).

  • seed: The seed to use to generate the random value.

EXP(expr)

Returns e to the power of expr.

  • expr: Any numeric expression.

LN(expr)

Returns the natural logarithm (base e) of expr.

  • expr: Any numeric expression.

LOG10(expr)

Returns the logarithm of expr with base 10.

  • expr: Any numeric expression.

LOG2(expr)

Returns the logarithm of expr with base 2.

  • expr: Any numeric expression.

LOG(base, expr)

Returns the logarithm of expr in the given base.

  • base: A numeric expression to use as the base.
  • expr: Any numeric expression.

POW(expr1, expr2)

Raises expr1 to the power of expr2.

  • expr1: Any numeric expression.
  • expr2: Any numeric expression.

SQRT(expr)

Returns the square root of expr.

  • expr: Any numeric expression.

BIN(expr)

Returns the string representation of the long value expr represented in binary.

  • expr: A long expression.

HEX(expr)

Converts expr to hexadecimal.

  • expr: The expression to convert to hex.

UNHEX(expr)

Converts hexadecimal expr to binary.

  • expr: The hexadecimal value to convert to binary.

CONV(num, from_base, to_base)

Converts num from base from_base to base to_base.

  • num: The number to convert.
  • from_base: The original base of num.
  • to_base: The base to convert num to.
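
A base conversion of this kind can be sketched in Python as follows. This is only an illustration for non-negative integers in bases 2 through 36; the uppercase output digits and the handling of edge cases are assumptions, not documented behavior.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def conv(num, from_base, to_base):
    """Re-express the non-negative integer written in `num` in another base."""
    value = int(str(num), from_base)  # parse using the original base
    if value < 0:
        raise ValueError("this sketch only handles non-negative numbers")
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, remainder = divmod(value, to_base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))
```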

ABS(expr)

Returns the absolute value of expr.

  • expr: Any valid numeric expression.

PMOD(expr1, expr2)

Returns the positive value of expr1 mod expr2.

  • expr1: Any valid numeric expression.
  • expr2: Any valid numeric expression.
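
The point of a positive modulus is that the result stays non-negative even when expr1 is negative. The Python sketch below assumes the common ((a % b) + b) % b formulation with a positive divisor; the source does not spell out the formula, so treat this as an illustration.

```python
import math

def pmod(expr1, expr2):
    """Non-negative remainder for a positive divisor.

    math.fmod keeps the sign of the dividend (like the % operator in Java or C),
    so a second pass lifts negative remainders into the range [0, expr2).
    """
    return math.fmod(math.fmod(expr1, expr2) + expr2, expr2)
```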

SIN(expr)

Returns the sine of expr, as if computed by java.lang.Math.sin.

  • expr: Any valid numeric expression.

ASIN(expr)

Returns the inverse sine (a.k.a. arc sine) of expr, as if computed by java.lang.Math.asin.

  • expr: Any valid numeric expression.

COS(expr)

Returns the cosine of expr, as if computed by java.lang.Math.cos.

  • expr: Any valid numeric expression.

ACOS(expr)

Returns the inverse cosine (a.k.a. arc cosine) of expr, as if computed by java.lang.Math.acos.

  • expr: Any valid numeric expression.

TAN(expr)

Returns the tangent of expr, as if computed by java.lang.Math.tan.

  • expr: Any valid numeric expression.

ATAN(expr)

Returns the inverse tangent (a.k.a. arc tangent) of expr, as if computed by java.lang.Math.atan.

  • expr: Any valid numeric expression.

DEGREES(expr)

Converts radians to degrees.

  • expr: Any valid numeric expression.

RADIANS(expr)

Converts degrees to radians.

  • expr: Any valid numeric expression.

POSITIVE(expr)

Returns the positive value of expr.

  • expr: Any valid numeric expression.

NEGATIVE(expr)

Returns the negated value of expr.

  • expr: Any valid numeric expression.

SIGN(expr)

Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.

  • expr: Any valid numeric expression.

E()

Returns Euler's number, e.

PI()

Returns pi.

FACTORIAL(expr)

Returns the factorial of expr. expr must be in the range [0, 20]; otherwise, the function returns null.

  • expr: A numeric expression.
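
One plausible reason for the upper limit of 20 is that 20! is the largest factorial that fits in a signed 64-bit integer; that link is an inference, not something the source states. A minimal Python sketch of the documented range check, with None standing in for null:

```python
import math

INT64_MAX = 2**63 - 1

def factorial_or_none(expr):
    """Return expr! for 0 <= expr <= 20, otherwise None (standing in for null)."""
    if 0 <= expr <= 20:
        return math.factorial(expr)
    return None
```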

CBRT(expr)

Returns the cube root of expr.

  • expr: Any valid numeric expression.

SHIFTLEFT(base, shift)

Bitwise left shift.

  • base: The base number to shift.
  • shift: The number of bits to shift.

SHIFTRIGHT(base, shift)

Bitwise right shift.

  • base: The base number to shift.
  • shift: The number of bits to shift.

SHIFTRIGHTUNSIGNED(base, shift)

Bitwise unsigned right shift.

  • base: The base number to shift.
  • shift: The number of bits to shift.
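
The signed and unsigned right shifts only differ for negative inputs. The Python sketch below models the three shifts on 32-bit two's-complement integers; the 32-bit width is an assumption made for the example, and the lowercase function names are hypothetical.

```python
BITS = 32                      # assumed operand width
MASK = (1 << BITS) - 1

def _to_signed(value):
    """Reinterpret the low BITS bits as a two's-complement signed integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value

def shiftleft(base, shift):
    # Bits pushed past the top of the word are discarded.
    return _to_signed(base << shift)

def shiftright(base, shift):
    # Arithmetic shift: the sign bit is copied into the vacated positions.
    return _to_signed(base) >> shift

def shiftrightunsigned(base, shift):
    # Logical shift: vacated positions are filled with zeros.
    return (base & MASK) >> shift
```

For non-negative inputs the two right shifts agree; for -8 shifted by one bit, the signed shift gives -4 while the unsigned shift gives a large positive number.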

GREATEST(expr1, expr2 [, expr3] [, ...])

Returns the greatest value of all parameters, skipping null values.

  • expr1: Any valid expression.
  • expr2: Any valid expression.
  • expr3: Any valid expression.

LEAST(expr1, expr2 [, expr3] [, ...])

Returns the least value of all parameters, skipping null values.

  • expr1: Any valid expression.
  • expr2: Any valid expression.
  • expr3: Any valid expression.

WIDTH_BUCKET(expr, min_value, max_value, num_buckets)

Returns an integer between 0 and num_buckets+1 by mapping expr into the ith equally sized bucket. Buckets are made by dividing [min_value, max_value] into equally sized regions. If expr < min_value, the function returns 1; if expr > max_value, it returns num_buckets+1.

  • expr: A valid numeric expression.
  • min_value: The minimum value.
  • max_value: The maximum value.
  • num_buckets: The number of buckets.

SIZE(expr)

Returns the size of an array or a map. Returns -1 if null.

  • expr: Any valid expression.

MAP_KEYS(map)

Returns an unordered array containing the keys of the map.

  • map: A valid map expression.

MAP_VALUES(map)

Returns an unordered array containing the values of the map.

  • map: A valid map expression.

ARRAY_CONTAINS(array, expr)

Returns true if the array contains the value.

  • array: The array to search.
  • expr: The expression to search for.

SORT_ARRAY(array [, ascendingOrder])

Sorts the input array in ascending or descending order according to the natural ordering of the array elements.

  • array: The array to sort.
  • ascendingOrder: Identifies whether to sort in ascending order.

BINARY(expr)

Casts the value expr to the target data type binary.

  • expr: The expression to cast.

CAST(expr AS type)

Casts the value expr to the target data type type.

  • expr: Any valid expression.
  • type: The type to cast expr to.

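FROM_UNIXTIME(unixtime [, format])

Converts unixtime, a number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC), to a string in the specified format.

  • unixtime: Unix time.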
  • format: The format to convert unixtime to.

UNIX_TIMESTAMP([expr [, pattern]])

Returns the UNIX timestamp of the given time.

  • expr: The time string to convert.
  • pattern: The format of expr.

TO_DATE(date_str [, fmt])

Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted.

  • date_str: The date string expression.
  • fmt: The format of date_str.

YEAR(date)

Returns the year component of the date/timestamp.

  • date: The date to extract the year from.

QUARTER(date)

Returns the quarter of the year for date, in the range 1 to 4.

  • date: The date to extract the quarter from.

MONTH(date)

Returns the month component of the date/timestamp.

  • date: The date to extract the month from.

DAY(date)

Returns the day of month of the date/timestamp.

  • date: The date to extract the day from.

HOUR(timestamp)

Returns the hour component of the string/timestamp.

  • timestamp: The timestamp to extract the hours from.

MINUTE(timestamp)

Returns the minute component of the string/timestamp.

  • timestamp: The timestamp to extract the minutes from.

SECOND(timestamp)

Returns the second component of the string/timestamp.

  • timestamp: The timestamp to extract the seconds from.

WEEKOFYEAR(date)

Returns the week of the year of the given date. A week is considered to start on a Monday, and week 1 is the first week with more than 3 days.

  • date: The date to extract the week of the year from.

DATEDIFF(endDate, startDate)

Returns the number of days from startDate to endDate.

  • endDate: The end date.
  • startDate: The start date.

DATE_ADD(start_date, num_days)

Returns the date that is num_days after start_date.

  • start_date: The start date.
  • num_days: The number of days to add to start_date.

DATE_SUB(start_date, num_days)

Returns the date that is num_days before start_date.

  • start_date: The start date.
  • num_days: The number of days to subtract from start_date.
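
The day arithmetic described above maps directly onto Python's datetime module. The sketch below is only a model of the documented semantics (function names are hypothetical); note that DATEDIFF takes the end date first.

```python
from datetime import date, timedelta

def datediff(end_date, start_date):
    """Number of days from start_date to end_date (negative if end is earlier)."""
    return (end_date - start_date).days

def date_add(start_date, num_days):
    """The date num_days after start_date."""
    return start_date + timedelta(days=num_days)

def date_sub(start_date, num_days):
    """The date num_days before start_date."""
    return start_date - timedelta(days=num_days)
```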

FROM_UTC_TIMESTAMP(timestamp, timezone)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

  • timestamp: The UTC timestamp.
  • timezone: The timezone to convert to.

TO_UTC_TIMESTAMP(timestamp, timezone)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.

  • timestamp: The timestamp to convert to UTC.
  • timezone: The timezone of timestamp.
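
The two conversions are inverses of each other. The Python sketch below reproduces the 'GMT+1' examples given above using a fixed one-hour offset; the function names are hypothetical, the format string omits the fractional seconds shown in the examples, and real time zone names with daylight saving rules are outside the scope of this illustration.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"
GMT_PLUS_1 = timezone(timedelta(hours=1))   # stands in for 'GMT+1'

def from_utc_timestamp(ts, tz):
    """Read ts as UTC and return the wall-clock time in tz."""
    utc_time = datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc)
    return utc_time.astimezone(tz).strftime(FMT)

def to_utc_timestamp(ts, tz):
    """Read ts as wall-clock time in tz and return the UTC time."""
    local_time = datetime.strptime(ts, FMT).replace(tzinfo=tz)
    return local_time.astimezone(timezone.utc).strftime(FMT)
```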

CURRENT_DATE()

Returns the current date at the start of query evaluation.

CURRENT_TIMESTAMP()

Returns the current timestamp at the start of query evaluation.

ADD_MONTHS(start_date, num_months [, fmt])

Returns the date that is num_months after start_date.

  • start_date: The starting date.
  • num_months: The number of months to add.
  • fmt: The output format.

LAST_DAY(date)

Returns the last day of the month which the date belongs to.

  • date: A valid date expression.

NEXT_DAY(start_date, day_of_week)

Returns the first date which is later than start_date and named as indicated.

  • start_date: The start date.
  • day_of_week: The day of week.
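
The Python sketch below models LAST_DAY and NEXT_DAY as described above. The two-letter weekday abbreviations are an assumption chosen for the example; the source does not list the accepted day_of_week values.

```python
import calendar
from datetime import date, timedelta

WEEKDAYS = ["MO", "TU", "WE", "TH", "FR", "SA", "SU"]  # assumed abbreviations

def last_day(d):
    """Last day of the month that d belongs to."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)

def next_day(start_date, day_of_week):
    """First date strictly later than start_date that falls on day_of_week."""
    target = WEEKDAYS.index(day_of_week.upper()[:2])
    # Days to move forward, in the range 1..7 so the result is never start_date.
    delta = (target - start_date.weekday() - 1) % 7 + 1
    return start_date + timedelta(days=delta)
```

Because the result must be later than start_date, asking for the weekday that start_date already falls on yields the date one week ahead.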

TRUNC(date, time_unit)

Returns date with the time portion of the day truncated to the unit specified by time_unit. time_unit should be one of ["year", "yyyy", "yy", "mon", "month", "mm"].

  • date: A valid date expression.
  • time_unit: The time unit.

MONTHS_BETWEEN(timestamp1, timestamp2)

Returns the number of months between timestamp1 and timestamp2.

  • timestamp1: A valid timestamp expression.
  • timestamp2: A valid timestamp expression.

DATE_FORMAT(timestamp, fmt)

Converts timestamp to a string in the format specified by the date format fmt.

  • timestamp: A valid timestamp expression.
  • fmt: A valid date format.

IF(expr1, expr2, expr3)

If expr1 evaluates to true, then returns expr2; otherwise returns expr3.

  • expr1: An expression that should evaluate to a boolean value.
  • expr2: A valid expression.
  • expr3: A valid expression.

ISNULL(expr)

Returns true if expr is null, or false otherwise.

  • expr: A valid expression.

ISNOTNULL(expr)

Returns true if expr is not null, or false otherwise.

  • expr: A valid expression.

NVL(expr1, expr2)

Returns expr1 if it is not null, or expr2 otherwise.

  • expr1: A valid expression.
  • expr2: A valid expression.

COALESCE(expr1, expr2 [, expr3] [, ...])

Returns the first non-null argument, if one exists; otherwise, returns null.

  • expr1: A valid expression.
  • expr2: A valid expression.
  • expr3: A valid expression.

NULLIF(expr1, expr2)

Returns null if expr1 equals expr2, or expr1 otherwise.

  • expr1: A valid expression.
  • expr2: A valid expression.
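
COALESCE and NULLIF are often used together, for example to treat an empty string as missing and then substitute a default. The Python sketch below models both, with None standing in for null; the function names are hypothetical.

```python
def coalesce(*exprs):
    """First argument that is not None, or None if every argument is None."""
    for expr in exprs:
        if expr is not None:
            return expr
    return None

def nullif(expr1, expr2):
    """None when the two arguments are equal, otherwise expr1."""
    return None if expr1 == expr2 else expr1

# Treat an empty string as missing, then fall back to a default value.
display_name = coalesce(nullif("", ""), "unknown")
```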

ASSERT_TRUE(expr)

Throws an exception if expr is not true.

  • expr: A valid expression that evaluates to a boolean.

ASCII(str)

Returns the numeric value of the first character of str.

  • str: A string expression.

BASE64(bin)

Converts the argument from a binary bin to a base 64 string.

  • bin: A binary expression.

CHAR_LENGTH(str)

Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

  • str: A string expression.

CHR(expr)

Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to CHR(expr % 256).

  • expr: An integer expression.

CONCAT(str1, str2 [, str3] [, ...])

Returns the concatenation of the given strings.

  • str1: A valid string expression.
  • str2: A valid string expression.
  • str3: A valid string expression.

CONCAT_WS(sep [, exp1] [, ...])

Returns the concatenation of the strings separated by sep.

  • sep: A string separator.
  • exp1: A valid expression.

DECODE(bin, charset)

Decodes the first argument using the second argument character set.

  • bin: The binary expression to decode.
  • charset: The charset to use to decode bin.

ELT(n, input1 [, input2] [, ...])

Returns the n-th input, e.g., returns input2 when n is 2.

  • n: A valid integer index.
  • input1: A valid string expression.
  • input2: A valid string expression.

ENCODE(str, charset)

Encodes the first argument using the second argument character set.

  • str: A string expression to encode.
  • charset: The charset to use to encode str.

FIELD(val1, val2 [, val3] [, ...])

Returns the index of the first argument within the list of remaining arguments, or 0 if it is not found. For example, field('world','say','hello','world') returns 3. All primitive types are supported; arguments are compared using str.equals(x). If the first argument is NULL, the return value is 0.

  • val1: A valid expression.
  • val2: A valid expression.
  • val3: A valid expression.

FIND_IN_SET(str, str_array)

Returns the index (1-based) of the given string (str) in the comma-delimited list (str_array). Returns 0, if the string was not found or if the given string (str) contains a comma.

  • str: The string expression to search for.
  • str_array: A comma-delimited list of values.
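
The lookup rules above (1-based index, 0 when missing, 0 when the search string itself contains a comma) can be modeled with this short Python sketch; the function name is hypothetical.

```python
def find_in_set(s, str_array):
    """1-based position of s in the comma-delimited str_array, or 0."""
    if "," in s:
        return 0  # a search string containing a comma can never match one item
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0
```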

FORMAT_NUMBER(expr1, expr2)

Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL's FORMAT.

  • expr1: A numeric expression to format.
  • expr2: The number of decimal places.

GET_JSON_OBJECT(json_txt, path)

Extracts a JSON object from json_txt at the specified path.

  • json_txt: JSON data.
  • path: The path to extract.

IN_FILE(str, filename)

Returns true if the string str appears as an entire line in filename.

  • str: The string to search for.
  • filename: The name of the file to search.

INSTR(str, substr)

Returns the (1-based) index of the first occurrence of substr in str.

  • str: A string expression.
  • substr: The string expression to search for.

LENGTH(expr)

Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

  • expr: A string expression.

LOCATE(substr, str [, pos])

Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.

  • substr: The string expression to search for.
  • str: The string expression to search in.
  • pos: The starting index.

LOWER(expr)

Returns expr with all characters changed to lowercase.

  • expr: A string expression.

LPAD(str, len, pad_str)

Returns str, left-padded with pad_str to a length of len. If str is longer than len, the return value is shortened to len characters.

  • str: A string expression.
  • len: The length to pad.
  • pad_str: The pad string.

LTRIM(str)

Removes the leading space characters from str.

  • str: A string expression.

OCTET_LENGTH(expr)

Returns the byte length of expr or number of bytes in binary data.

  • expr: Any string expression.

PARSE_URL(url, partToExtract [, key])

Returns the specified part from the URL. For example, parse_url('http://facebook.com/path1/p.php?k1=v1#Ref1', 'HOST') returns 'facebook.com'. Also a value of a particular key in QUERY can be extracted by providing the key as the third argument, for example, parse_url('http://facebook.com/path1/p.php?k1=v1#Ref1', 'QUERY', 'k1') returns 'v1'.

  • url: A valid URL expression.
  • partToExtract: The URL part to extract. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO.
  • key: The key.
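
The examples above can be reproduced with Python's urllib.parse, which makes the meaning of each part name easier to see. This sketch covers only a subset of the part names (FILE and USERINFO are left out), and the function name parse_url is a stand-in for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def parse_url(url, part_to_extract, key=None):
    """Return one component of url, or one query parameter when key is given."""
    parts = urlsplit(url)
    if part_to_extract == "QUERY" and key is not None:
        values = parse_qs(parts.query).get(key)
        return values[0] if values else None
    lookup = {
        "HOST": parts.hostname,
        "PATH": parts.path,
        "QUERY": parts.query,
        "REF": parts.fragment,
        "PROTOCOL": parts.scheme,
        "AUTHORITY": parts.netloc,
    }
    return lookup.get(part_to_extract)

URL = "http://facebook.com/path1/p.php?k1=v1#Ref1"
```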

PRINTF(strfmt [, obj1] [, ...])

Returns a formatted string from printf-style format strings.

  • strfmt: The string format.
  • obj1: The object to include in the formatted string.

REGEXP_EXTRACT(str, regexp [, idx])

Extracts a group that matches regexp.

  • str: A string expression.
  • regexp: A regular expression to search for.
  • idx: The index of the capture group to extract.

REGEXP_REPLACE(str, regexp, rep)

Replaces all substrings of str that match regexp with rep.

  • str: A string expression.
  • regexp: A regular expression to search for.
  • rep: The replacement string.

REPEAT(str, n)

Returns the string which repeats the given string value n times.

  • str: The string expression to repeat.
  • n: The number of times to repeat str.

REPLACE(str, search [, replace])

Replaces all occurrences of search with replace. If search is not found in str, str is returned unchanged. If replace is not specified or is an empty string, nothing replaces the string that is removed from str.

  • str: A string expression.
  • search: The search string.
  • replace: A string expression to replace search values.

REVERSE(str)

Returns the reversed given string.

  • str: A string expression.

RPAD(str, len, pad_str)

Returns str, right-padded with pad_str to a length of len. If str is longer than len, the return value is shortened to len characters.

  • str: A string expression.
  • len: The length to pad.
  • pad_str: The pad string.
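
The padding and truncation rules for LPAD and RPAD can be modeled as in the Python sketch below. Repeating a multi-character pad string cyclically until the target length is reached is an assumption based on how such functions commonly behave; the source only states the target length and the truncation rule.

```python
def lpad(s, length, pad_str):
    """Pad s on the left to `length` characters, or truncate if s is longer."""
    if len(s) >= length:
        return s[:length]
    fill = length - len(s)
    return (pad_str * fill)[:fill] + s

def rpad(s, length, pad_str):
    """Pad s on the right to `length` characters, or truncate if s is longer."""
    if len(s) >= length:
        return s[:length]
    fill = length - len(s)
    return s + (pad_str * fill)[:fill]
```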

RTRIM(str)

Removes the trailing space characters from str.

  • str: A string expression.

SENTENCES(str [, lang, country])

Splits str into an array of arrays of words.

  • str: A string expression.
  • lang: The language of str.
  • country: The country of the specified language.

SPACE(n)

Returns a string consisting of n spaces.

  • n: The number of spaces.

SPLIT(str, regex)

Splits str around occurrences that match regex.

  • str: A string expression.
  • regex: The regular expression to match.

STR_TO_MAP(text [, pairDelim [, keyValueDelim]])

Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim.

  • text: A string expression.
  • pairDelim: The pair delimiter.
  • keyValueDelim: The value delimiter.
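
A minimal Python sketch of this splitting logic, using the default delimiters named above. It treats the delimiters as literal strings, which is a simplifying assumption, and the function name is hypothetical.

```python
def str_to_map(text, pair_delim=",", key_value_delim=":"):
    """Split text into pairs, then split each pair into a key and a value."""
    result = {}
    for pair in text.split(pair_delim):
        key, _, value = pair.partition(key_value_delim)
        result[key] = value
    return result
```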

SUBSTR(str, pos [, len])

Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.

  • str: A string expression.
  • pos: The starting position.
  • len: The length of the substring.

SUBSTRING_INDEX(str, delim, count)

Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for delim.

  • str: A string expression.
  • delim: The delimiter.
  • count: Total number of occurrences.
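
The effect of a positive versus a negative count is easiest to see on a concrete string. The Python sketch below models the behavior described above; the function name is hypothetical, and returning an empty string for a count of 0 is an assumption.

```python
def substring_index(s, delim, count):
    """Everything before (count > 0) or after (count < 0) the count-th delimiter."""
    if count == 0 or delim == "":
        return ""
    parts = s.split(delim)  # str.split is case-sensitive
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])
```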

TRANSLATE(input, from, to)

Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string.

  • input: A string expression.
  • from: A string expression.
  • to: A string expression.

TRIM(str)

Removes the leading and trailing space characters from str.

  • str: A string expression.

UNBASE64(str)

Converts the argument from a base 64 string str to a binary.

  • str: A string expression.

UPPER(str)

Returns str with all characters changed to uppercase.

  • str: A string expression.

INITCAP(str)

Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.

  • str: A string expression.

LEVENSHTEIN(str1, str2)

Returns the Levenshtein distance between the two given strings.

  • str1: A string expression.
  • str2: A string expression.

SOUNDEX(str)

Returns the Soundex code of str.

  • str: A string expression.

MASK(str [, upper [, lower [, number]]])

Returns a masked version of str. By default, upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example mask("abcd-EFGH-8765-4321") results in xxxx-XXXX-nnnn-nnnn. You can override the characters used in the mask by supplying additional arguments: the second argument controls the mask character for upper case letters, the third argument for lower case letters and the fourth argument for numbers. For example, mask("abcd-EFGH-8765-4321", "U", "l", "#") results in llll-UUUU-####-####.

  • str: The string to mask.
  • upper: The character to mask for uppercase letters.
  • lower: The character to mask for lowercase letters.
  • number: The character to mask for numbers.
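
The masking rules and both examples above can be checked with a small Python model. The function below is a sketch of the described behavior, not the actual implementation; characters that are not letters or digits are passed through unchanged, as in the examples.

```python
def mask(s, upper="X", lower="x", number="n"):
    """Replace uppercase letters, lowercase letters, and digits with mask characters."""
    out = []
    for ch in s:
        if ch.isupper():
            out.append(upper)
        elif ch.islower():
            out.append(lower)
        elif ch.isdigit():
            out.append(number)
        else:
            out.append(ch)  # separators such as "-" stay as they are
    return "".join(out)
```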

MASK_FIRST_N(str [, n])

Returns a masked version of str with the first n values masked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_first_n("1234-5678-8765-4321", 4) results in nnnn-5678-8765-4321.

  • str: The string to mask.
  • n: The number of values to mask.

MASK_LAST_N(str [, n])

Returns a masked version of str with the last n values masked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_last_n("1234-5678-8765-4321", 4) results in 1234-5678-8765-nnnn.

  • str: The string to mask.
  • n: The number of values to mask.

MASK_SHOW_FIRST_N(str [, n])

Returns a masked version of str, showing the first n characters unmasked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_show_first_n("1234-5678-8765-4321", 4) results in 1234-nnnn-nnnn-nnnn.

  • str: The string to mask.
  • n: The number of values to leave unmasked.

MASK_SHOW_LAST_N(str [, n])

Returns a masked version of str, showing the last n characters unmasked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_show_last_n("1234-5678-8765-4321", 4) results in nnnn-nnnn-nnnn-4321.

  • str: The string to mask.
  • n: The number of values to leave unmasked.
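
The four positional variants differ only in which slice of the string is masked. The Python sketch below models them with the default mask characters and reproduces the examples given above; the helper names are hypothetical.

```python
def _mask_text(text):
    """Mask with the defaults: X for uppercase, x for lowercase, n for digits."""
    def mask_char(ch):
        if ch.isupper():
            return "X"
        if ch.islower():
            return "x"
        if ch.isdigit():
            return "n"
        return ch
    return "".join(mask_char(ch) for ch in text)

def mask_first_n(s, n):
    return _mask_text(s[:n]) + s[n:]

def mask_last_n(s, n):
    cut = max(len(s) - n, 0)
    return s[:cut] + _mask_text(s[cut:])

def mask_show_first_n(s, n):
    return s[:n] + _mask_text(s[n:])

def mask_show_last_n(s, n):
    cut = max(len(s) - n, 0)
    return _mask_text(s[:cut]) + s[cut:]

CARD = "1234-5678-8765-4321"
```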

MASK_HASH(str)

Returns a hashed value based on str. The hash is consistent and can be used to join masked values together across tables. This function returns null for non-string types.

  • str: The string to mask.

JAVA_METHOD(class, method [, arg1] [, ...])

Calls a method with reflection.

  • class: The class to call.
  • method: The method to call.
  • arg1: The argument to pass in.

REFLECT(class, method [, arg1] [, ...])

Calls a method with reflection.

  • class: The class to call.
  • method: The method to call.
  • arg1: The argument to pass in.

HASH(expr1 [, expr2] [, ...])

Returns a hash value of the arguments.

  • expr1: A valid expression.
  • expr2: A valid expression.

CURRENT_USER()

Returns current user name from the configured authenticator manager. Could be the same as the user provided when connecting, but with some authentication managers (for example HadoopDefaultAuthenticator) it could be different.

LOGGED_IN_USER()

Returns current user name from the session state. This is the username provided when connecting to Impala.

CURRENT_DATABASE()

Returns current database name.

MD5(expr)

Returns an MD5 128-bit checksum as a hex string of expr.

  • expr: A valid expression.

SHA1(expr)

Returns a sha1 hash value as a hex string of the expr.

  • expr: A valid expression.

CRC32(expr)

Returns a cyclic redundancy check value of the expr as a bigint.

  • expr: A valid expression.

SHA2(expr, bitlength)

Returns a checksum of SHA-2 family as a hex string of expr. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256.

  • expr: A valid expression.
  • bitlength: The bit length.
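
For comparison, the same digests can be computed with Python's hashlib and zlib modules. The sketch below takes bytes as input and is only a model of the descriptions above; returning None for an unsupported bit length is an assumption, since the source does not say what happens in that case.

```python
import hashlib
import zlib

def md5(expr):
    return hashlib.md5(expr).hexdigest()      # 128 bits, 32 hex characters

def sha1(expr):
    return hashlib.sha1(expr).hexdigest()     # 160 bits, 40 hex characters

def crc32(expr):
    return zlib.crc32(expr)                   # unsigned 32-bit integer

def sha2(expr, bitlength):
    algorithms = {
        0: hashlib.sha256,                    # 0 is treated as 256
        224: hashlib.sha224,
        256: hashlib.sha256,
        384: hashlib.sha384,
        512: hashlib.sha512,
    }
    algorithm = algorithms.get(bitlength)
    return algorithm(expr).hexdigest() if algorithm else None
```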

AES_ENCRYPT(input, key)

Encrypt input using AES. Key lengths of 128, 192 or 256 bits can be used. 192 and 256 bits keys can be used if Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files are installed. If either argument is NULL or the key length is not one of the permitted values, the return value is NULL. Example: base64(aes_encrypt('ABC', '1234567890123456')) = 'y6Ss+zCYObpCbgfWfyNWTw=='.

  • input: The input value to encrypt.
  • key: The key to use when encrypting.

VERSION()

Returns the Impala version. The string contains 2 fields, the first being a build number and the second being a build hash. Example: "select version();" might return "2.1.0.2.5.0.0-1245 r027527b9c5ce1a3d7d0b6d2e6de2378fb0c39232". Actual results will depend on your build.

COUNT(DISTINCT expr1 [, expr2] [, ...])

Returns the number of rows for which the supplied expression(s) are unique and non-null.

  • expr1: A valid expression.
  • expr2: A valid expression.

SUM(expr)

Returns the sum calculated from values of a group.

  • expr: A valid expression.

SUM(DISTINCT expr)

Returns the sum calculated from distinct values of a group.

  • expr: A valid expression.

AVG(expr)

Returns the mean calculated from values of a group.

  • expr: A valid expression.

AVG(DISTINCT expr)

Returns the mean calculated from distinct values of a group.

  • expr: A valid expression.

MIN(expr)

Returns the minimum value of expr.

  • expr: A valid expression.

MAX(expr)

Returns the maximum value of expr.

  • expr: A valid expression.

VARIANCE(expr)

Returns the sample variance calculated from values of a group.

  • expr: A valid expression.

STDDEV_POP(expr)

Returns the population standard deviation calculated from values of a group.

  • expr: A valid expression.

STDDEV_SAMP(expr)

Returns the sample standard deviation calculated from values of a group.

  • expr: A valid expression.

COVAR_POP(expr1, expr2)

Returns the population covariance of a set of number pairs.

  • expr1: A valid expression.
  • expr2: A valid expression.

COVAR_SAMP(expr1, expr2)

Returns the sample covariance of a set of number pairs.

  • expr1: A valid expression.
  • expr2: A valid expression.

CORR(expr1, expr2)

Returns Pearson coefficient of correlation between a set of number pairs.

  • expr1: A valid expression.
  • expr2: A valid expression.
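
The population and sample variants differ only in the divisor (n versus n - 1). The Python sketch below uses the textbook definitions of covariance and the Pearson coefficient; it is meant to illustrate the formulas, and the function names simply mirror the SQL names.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _sum_of_products(xs, ys):
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def covar_pop(xs, ys):
    """Population covariance: divide by n."""
    return _sum_of_products(xs, ys) / len(xs)

def covar_samp(xs, ys):
    """Sample covariance: divide by n - 1."""
    return _sum_of_products(xs, ys) / (len(xs) - 1)

def corr(xs, ys):
    """Pearson coefficient: covariance scaled by both standard deviations."""
    return covar_pop(xs, ys) / math.sqrt(covar_pop(xs, xs) * covar_pop(ys, ys))
```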

PERCENTILE(col, percentage [, accuracy])

Returns the exact percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The value of frequency should be a positive integer.

  • col: A numeric expression.
  • percentage: The percentage.
  • accuracy: The accuracy to control approximation.

PERCENTILE_APPROX(col, percentage [, accuracy])

Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array.

  • col: A numeric expression.
  • percentage: The percentage.
  • accuracy: The accuracy to control approximation.

COLLECT_SET(expr)

Collects and returns a set of unique elements.

  • expr: A valid expression.

COLLECT_LIST(expr)

Collects and returns a list of elements, including duplicates.

  • expr: A valid expression.

NTILE(n)

Divides the rows for each window partition into n buckets ranging from 1 to at most n.

  • n: The number of buckets.
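
The phrase "at most n" matters when a partition has fewer rows than buckets. The Python sketch below assigns bucket numbers to an ordered partition; giving the earlier buckets the extra rows when the row count is not divisible by n follows the usual SQL convention and is an assumption here.

```python
def ntile(num_rows, n):
    """Bucket number (1-based) for each row of an ordered partition."""
    base, extra = divmod(num_rows, n)
    buckets = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)  # first `extra` buckets get one more row
        buckets.extend([bucket] * size)
    return buckets
```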

EXPLODE(expr)

Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.

  • expr: A valid expression.

POSEXPLODE(expr)

Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.

  • expr: A valid expression.
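
The row-generating behavior of EXPLODE and POSEXPLODE can be modeled in Python with lists of tuples standing in for rows. Zero-based positions are an assumption made for this sketch; the source does not state where positions start.

```python
def explode(expr):
    """One row per array element, or one (key, value) row per map entry."""
    if isinstance(expr, dict):
        return [(key, value) for key, value in expr.items()]
    return [(element,) for element in expr]

def posexplode(expr):
    """Like explode, with the position added as the first column."""
    if isinstance(expr, dict):
        return [(pos, key, value) for pos, (key, value) in enumerate(expr.items())]
    return [(pos, element) for pos, element in enumerate(expr)]
```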

INLINE(expr)

Explodes an array of structs into a table.

  • expr: A valid expression.

STACK(n, expr1 [, expr2 ] [, ...])

Separates expr1, ..., exprk into n rows.

  • n: The number of rows.
  • expr1: A valid expression.
  • expr2: A valid expression.

PARSE_URL_TUPLE(urlStr, p1 [, p2 ] [, ...])

Takes URL string and a set of n URL parts, and returns a tuple of n values. This is similar to the parse_url() UDF but can extract multiple parts at once out of a URL. Valid part names are: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO, QUERY:[KEY].

  • urlStr: A valid URL string.
  • p1: A valid part name.
  • p2: A valid part name.

Predicate Functions

ROUND(expr [, d])

Returns expr rounded to d decimal places using HALF_UP rounding mode.

  • expr: Any numeric expression.
  • d: The number of decimal places.

BROUND(expr [, d])

Returns expr rounded to d decimal places using HALF_EVEN rounding mode.

  • expr: Any numeric expression.
  • d: The number of decimal places.

FLOOR(expr)

Returns the largest integer not greater than expr.

  • expr: Any numeric expression.

CEIL(expr)

Returns the smallest integer not smaller than expr.

  • expr: Any numeric expression.

RAND([seed])

Returns a random value drawn independently from a uniform distribution over [0, 1).

  • seed: The seed to use to generate the random value.

EXP(expr)

Returns e to the power of expr.

  • expr: Any numeric expression.

LN(expr)

Returns the natural logarithm (base e) of expr.

  • expr: Any numeric expression.

LOG10(expr)

Returns the logarithm of expr with base 10.

  • expr: Any numeric expression.

LOG2(expr)

Returns the logarithm of expr with base 2.

  • expr: Any numeric expression.

LOG(base, expr)

Returns the logarithm of expr in the given base.

  • base: A numeric expression to use as the base.
  • expr: Any numeric expression.

POW(expr1, expr2)

Raises expr1 to the power of expr2.

  • expr1: Any numeric expression.
  • expr2: Any numeric expression.

SQRT(expr)

Returns the square root of expr.

  • expr: Any numeric expression.

BIN(expr)

Returns the string representation of the long value expr represented in binary.

  • expr: A long expression.

HEX(expr)

Converts expr to hexadecimal.

  • expr: The expression to convert to hex.

UNHEX(expr)

Converts hexadecimal expr to binary.

  • expr: The hexadecimal value to convert to binary.

CONV(num, from_base, to_base)

Converts num from base from_base to base to_base.

  • num: The number to convert.
  • from_base: The original base of num.
  • to_base: The base to convert num to.

ABS(expr)

Returns the absolute value of expr.

  • expr: Any valid numeric expression.

PMOD(expr1, expr2)

Returns the positive value of expr1 mod expr2.

  • expr1: Any valid numeric expression.
  • expr2: Any valid numeric expression.

SIN(expr)

Returns the sine of expr, as if computed by java.lang.Math.sin.

  • expr: Any valid numeric expression.

ASIN(expr)

Returns the inverse sine (a.k.a. arc sine) of expr, as if computed by java.lang.Math.asin.

  • expr: Any valid numeric expression.

COS(expr)

Returns the cosine of expr, as if computed by java.lang.Math.cos.

  • expr: Any valid numeric expression.

ACOS(expr)

Returns the inverse cosine (a.k.a. arc cosine) of expr, as if computed by java.lang.Math.acos.

  • expr: Any valid numeric expression.

TAN(expr)

Returns the tangent of expr, as if computed by java.lang.Math.tan.

  • expr: Any valid numeric expression.

ATAN(expr)

Returns the inverse tangent (a.k.a. arc tangent) of expr, as if computed by java.lang.Math.atan.

  • expr: Any valid numeric expression.

DEGREES(expr)

Converts radians to degrees.

  • expr: Any valid numeric expression.

RADIANS(expr)

Converts degrees to radians.

  • expr: Any valid numeric expression.

POSITIVE(expr)

Returns the positive value of expr.

  • expr: Any valid numeric expression.

NEGATIVE(expr)

Returns the negated value of expr.

  • expr: Any valid numeric expression.

SIGN(expr)

Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.

  • expr: Any valid numeric expression.

E()

Returns Euler's number, e.

PI()

Returns pi.

FACTORIAL(expr)

Returns the factorial of expr. expr must be in the range [0..20]; otherwise, the result is null.

  • expr: A numeric expression.
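
The [0..20] limit is consistent with 20! being the largest factorial that fits in a signed 64-bit integer. Whether that is the engine's actual reason is an assumption. The Python sketch below mirrors the documented behavior, with None standing in for null; the function name is illustrative.

```python
import math
from typing import Optional

INT64_MAX = 2**63 - 1  # largest signed 64-bit value


def factorial(n: int) -> Optional[int]:
    """Return n! for 0 <= n <= 20, otherwise None (standing in for SQL null)."""
    if n < 0 or n > 20:
        return None
    return math.factorial(n)


print(factorial(5))                      # 120
print(factorial(21))                     # None
print(math.factorial(21) > INT64_MAX)    # True
```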

CBRT(expr)

Returns the cube root of expr.

  • expr: Any valid numeric expression.

SHIFTLEFT(base, shift)

Bitwise left shift.

  • base: The base number to shift.
  • shift: The number of bits to shift.

SHIFTRIGHT(base, shift)

Bitwise right shift.

  • base: The base number to shift.
  • shift: The number of bits to shift.

SHIFTRIGHTUNSIGNED(base, shift)

Bitwise unsigned right shift.

  • base: The base number to shift.
  • shift: The number of bits to shift.

GREATEST(expr1, expr2 [, expr3] [, ...])

Returns the greatest value of all parameters, skipping null values.

  • expr1: Any valid expression.
  • expr2: Any valid expression.
  • expr3: Any valid expression.

LEAST(expr1, expr2 [, expr3] [, ...])

Returns the least value of all parameters, skipping null values.

  • expr1: Any valid expression.
  • expr2: Any valid expression.
  • expr3: Any valid expression.

WIDTH_BUCKET(expr, min_value, max_value, num_buckets)

Returns an integer between 0 and num_buckets+1 by mapping expr into the ith equally sized bucket. Buckets are made by dividing [min_value, max_value] into equally sized regions. If expr < min_value, return 1, if expr > max_value return num_buckets+1.

  • expr: A valid numeric expression.
  • min_value: The minimum value.
  • max_value: The maximum value.
  • num_buckets: The number of buckets.

SIZE(expr)

Returns the size of an array or a map. Returns -1 if null.

  • expr: Any valid expression.

MAP_KEYS(map)

Returns an unordered array containing the keys of the map.

  • map: A valid map expression.

MAP_VALUES(map)

Returns an unordered array containing the values of the map.

  • map: A valid map expression.

ARRAY_CONTAINS(array, expr)

Returns true if the array contains the value.

  • array: The array to search.
  • expr: The expression to search for.

SORT_ARRAY(array [, ascendingOrder])

Sorts the input array in ascending or descending order according to the natural ordering of the array elements.

  • array: The array to sort.
  • ascendingOrder: Identifies whether to sort in ascending order.

BINARY(expr)

Casts the value expr to the target data type binary.

  • expr: The expression to cast.

CAST(expr AS type)

Casts the value expr to the target data type type.

  • expr: Any valid expression.
  • type: The type to cast expr to.

FROM_UNIXTIME(unixtime [, format])

Converts unixtime to a string in the specified format.

  • unixtime: Unix time.
  • format: The format to convert unixtime to.

UNIX_TIMESTAMP([expr [, pattern]])

Returns the UNIX timestamp of the given time.

  • expr: The time string to convert.
  • pattern: The format of expr.

TO_DATE(date_str [, fmt])

Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted.

  • date_str: The date string expression.
  • fmt: The format of date_str.

YEAR(date)

Returns the year component of the date/timestamp.

  • date: The date to extract the year from.

QUARTER(date)

Returns the quarter of the year for date, in the range 1 to 4.

  • date: The date to extract the quarter from.

MONTH(date)

Returns the month component of the date/timestamp.

  • date: The date to extract the month from.

DAY(date)

Returns the day of month of the date/timestamp.

  • date: The date to extract the day from.

HOUR(timestamp)

Returns the hour component of the string/timestamp.

  • timestamp: The timestamp to extract the hours from.

MINUTE(timestamp)

Returns the minute component of the string/timestamp.

  • timestamp: The timestamp to extract the minutes from.

SECOND(timestamp)

Returns the second component of the string/timestamp.

  • timestamp: The timestamp to extract the seconds from.

WEEKOFYEAR(date)

Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.

  • date: The date to extract the week of the year from.

DATEDIFF(endDate, startDate)

Returns the number of days from startDate to endDate.

  • endDate: The end date.
  • startDate: The start date.

DATE_ADD(start_date, num_days)

Returns the date that is num_days after start_date.

  • start_date: The start date.
  • num_days: The number of days to add to start_date.

DATE_SUB(start_date, num_days)

Returns the date that is num_days before start_date.

  • start_date: The start date.
  • num_days: The number of days to subtract from start_date.

FROM_UTC_TIMESTAMP(timestamp, timezone)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

  • timestamp: The UTC timestamp.
  • timezone: The timezone to convert to.

TO_UTC_TIMESTAMP(timestamp, timezone)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.

  • timestamp: The timestamp to convert to UTC.
  • timezone: The timezone of timestamp.
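
The two conversions above are inverses of each other. The Python sketch below reproduces the documented examples under simplifying assumptions: the time zone is passed as a fixed hour offset instead of a name such as 'GMT+1', and fractional seconds are omitted. The function names mirror the SQL functions but are illustrative only.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"


def from_utc_timestamp(ts: str, offset_hours: int) -> str:
    """Interpret ts as UTC and render it in a zone offset_hours ahead of UTC."""
    utc_time = datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc)
    zone = timezone(timedelta(hours=offset_hours))
    return utc_time.astimezone(zone).strftime(FMT)


def to_utc_timestamp(ts: str, offset_hours: int) -> str:
    """Interpret ts as local time in the given zone and render it in UTC."""
    zone = timezone(timedelta(hours=offset_hours))
    local_time = datetime.strptime(ts, FMT).replace(tzinfo=zone)
    return local_time.astimezone(timezone.utc).strftime(FMT)


print(from_utc_timestamp("2017-07-14 02:40:00", 1))  # 2017-07-14 03:40:00
print(to_utc_timestamp("2017-07-14 02:40:00", 1))    # 2017-07-14 01:40:00
```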

CURRENT_DATE()

Returns the current date at the start of query evaluation.

CURRENT_TIMESTAMP()

Returns the current timestamp at the start of query evaluation.

ADD_MONTHS(start_date, num_months [, fmt])

Returns the date that is num_months after start_date.

  • start_date: The starting date.
  • num_months: The number of months to add.
  • fmt: The output format.

LAST_DAY(date)

Returns the last day of the month which the date belongs to.

  • date: A valid date expression.

NEXT_DAY(start_date, day_of_week)

Returns the first date which is later than start_date and named as indicated.

  • start_date: The start date.
  • day_of_week: The day of week.
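
A rough Python sketch of the LAST_DAY and NEXT_DAY date arithmetic follows. It assumes that day_of_week is given as an English day name or abbreviation whose first two letters identify the day; the accepted spellings in the product may differ. The function names are illustrative.

```python
import calendar
from datetime import date, timedelta

# Two-letter prefixes in the order used by date.weekday() (Monday = 0).
DAYS = ["MO", "TU", "WE", "TH", "FR", "SA", "SU"]


def last_day(d: date) -> date:
    """Return the last day of the month that d belongs to."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


def next_day(start: date, day_of_week: str) -> date:
    """Return the first date strictly later than start that falls on day_of_week."""
    target = DAYS.index(day_of_week[:2].upper())
    delta = (target - start.weekday() - 1) % 7 + 1  # always between 1 and 7
    return start + timedelta(days=delta)


print(last_day(date(2016, 2, 10)))        # 2016-02-29
print(next_day(date(2015, 1, 14), "TU"))  # 2015-01-20
```

Because the result must be later than start_date, asking for the same weekday as start_date yields the date one week later.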

TRUNC(date, time_unit)

Returns date with the time portion of the day truncated to the unit specified by time_unit. time_unit should be one of ["year", "yyyy", "yy", "mon", "month", "mm"].

  • date: A valid date expression.
  • time_unit: The time unit.

MONTHS_BETWEEN(timestamp1, timestamp2)

Returns the number of months between timestamp1 and timestamp2.

  • timestamp1: A valid timestamp expression.
  • timestamp2: A valid timestamp expression.

DATE_FORMAT(timestamp, fmt)

Converts timestamp to a string in the format specified by the date format fmt.

  • timestamp: A valid timestamp expression.
  • fmt: A valid date format.

IF(expr1, expr2, expr3)

If expr1 evaluates to true, then returns expr2; otherwise returns expr3.

  • expr1: An expression that should evaluate to a boolean value.
  • expr2: A valid expression.
  • expr3: A valid expression.

ISNULL(expr)

Returns true if expr is null, or false otherwise.

  • expr: A valid expression.

ISNOTNULL(expr)

Returns true if expr is not null, or false otherwise.

  • expr: A valid expression.

NVL(expr1, expr2)

Returns expr1 if it is not null, or expr2 otherwise.

  • expr1: A valid expression.
  • expr2: A valid expression.

COALESCE(expr1, expr2 [, expr3] [, ...])

Returns the first non-null argument, if one exists; otherwise, null.

  • expr1: A valid expression.
  • expr2: A valid expression.
  • expr3: A valid expression.

NULLIF(expr1, expr2)

Returns null if expr1 equals expr2, or expr1 otherwise.

  • expr1: A valid expression.
  • expr2: A valid expression.

ASSERT_TRUE(expr)

Throws an exception if expr is not true.

  • expr: A valid expression that evaluates to a boolean.

ASCII(str)

Returns the numeric value of the first character of str.

  • str: A string expression.

BASE64(bin)

Converts the argument from a binary bin to a base 64 string.

  • bin: A binary expression.

CHAR_LENGTH(str)

Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

  • str: A string expression.

CHR(expr)

Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to chr(expr % 256).

  • expr: An integer expression.

CONCAT(str1, str2 [, str3] [, ...])

Returns the concatenation of str1, str2, and any additional string arguments.

  • str1: A valid string expression.
  • str2: A valid string expression.
  • str3: A valid string expression.

CONCAT_WS(sep [, exp1] [, ...])

Returns the concatenation of the strings separated by sep.

  • sep: A string separator.
  • exp1: A valid expression.

DECODE(bin, charset)

Decodes the first argument using the second argument character set.

  • bin: The binary expression to decode.
  • charset: The charset to use to decode bin.

ELT(n, input1 [, input2] [, ...])

Returns the n-th input, e.g., returns input2 when n is 2.

  • n: A valid integer index.
  • input1: A valid string expression.
  • input2: A valid string expression.

ENCODE(str, charset)

Encodes the first argument using the second argument character set.

  • str: A string expression to encode.
  • charset: The charset to use to encode str.

FIELD(val1, val2 [, val3] [, ...])

Returns the index of val1 in the val2,val3,... list, or 0 if not found. For example, field('world','say','hello','world') returns 3. All primitive types are supported; arguments are compared using str.equals(x). If val1 is NULL, the return value is 0.

  • val1: A valid expression.
  • val2: A valid expression.
  • val3: A valid expression.

FIND_IN_SET(str, str_array)

Returns the index (1-based) of the given string (str) in the comma-delimited list (str_array). Returns 0, if the string was not found or if the given string (str) contains a comma.

  • str: The string expression to search for.
  • str_array: A comma-delimited list of values.
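
The lookup rules above (1-based index, 0 when missing, 0 when the search string itself contains a comma) can be sketched in Python as follows. The function name mirrors the SQL function but is only an illustration.

```python
def find_in_set(s: str, str_array: str) -> int:
    """Return the 1-based position of s in a comma-delimited list, or 0."""
    if "," in s:
        return 0  # a search string containing a comma can never match one item
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0


print(find_in_set("ab", "abc,b,ab,c,def"))  # 3
print(find_in_set("x", "a,b"))              # 0
```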

FORMAT_NUMBER(expr1, expr2)

Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL's FORMAT.

  • expr1: A numeric expression to format.
  • expr2: The number of decimal places.

GET_JSON_OBJECT(json_txt, path)

Extracts a JSON object from json_txt at the specified path.

  • json_txt: JSON data.
  • path: The path to extract.

IN_FILE(str, filename)

Returns true if the string str appears as an entire line in filename.

  • str: The string to search for.
  • filename: The name of the file to search.

INSTR(str, substr)

Returns the (1-based) index of the first occurrence of substr in str.

  • str: A string expression.
  • substr: The string expression to search for.

LENGTH(expr)

Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

  • expr: A string expression.

LOCATE(substr, str [, pos])

Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.

  • substr: The string expression to search for.
  • str: The string expression to search in.
  • pos: The starting index.

LOWER(expr)

Returns expr with all characters changed to lowercase.

  • expr: A string expression.

LPAD(str, len, pad_str)

Returns str, left-padded with pad_str to a length of len. If str is longer than len, the return value is shortened to len characters.

  • str: A string expression.
  • len: The length to pad.
  • pad_str: The pad string.

LTRIM(str)

Removes the leading space characters from str.

  • str: A string expression.

OCTET_LENGTH(expr)

Returns the byte length of expr or number of bytes in binary data.

  • expr: Any string expression.

PARSE_URL(url, partToExtract [, key])

Returns the specified part from the URL. For example, parse_url('http://facebook.com/path1/p.php?k1=v1#Ref1', 'HOST') returns 'facebook.com'. Also a value of a particular key in QUERY can be extracted by providing the key as the third argument, for example, parse_url('http://facebook.com/path1/p.php?k1=v1#Ref1', 'QUERY', 'k1') returns 'v1'.

  • url: A valid URL expression.
  • partToExtract: The URL part to extract. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO.
  • key: The key.
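
For readers who want to see which piece of a URL each part name refers to, the following Python sketch maps the part names onto the standard library's URL splitter. It is an approximation for illustration only: the function name is hypothetical, and edge cases (for example, whether a missing part yields null or an empty string) may differ from the product.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def parse_url(url: str, part: str, key: Optional[str] = None) -> Optional[str]:
    """Return the requested part of url, or one QUERY value when key is given."""
    u = urlsplit(url)
    if part == "QUERY" and key is not None:
        values = parse_qs(u.query).get(key)
        return values[0] if values else None
    parts = {
        "HOST": u.hostname,
        "PATH": u.path,
        "QUERY": u.query,
        "REF": u.fragment,
        "PROTOCOL": u.scheme,
        "AUTHORITY": u.netloc,
        "FILE": u.path + ("?" + u.query if u.query else ""),
        "USERINFO": u.netloc.rpartition("@")[0] or None,
    }
    return parts.get(part)


URL = "http://facebook.com/path1/p.php?k1=v1#Ref1"
print(parse_url(URL, "HOST"))         # facebook.com
print(parse_url(URL, "QUERY", "k1"))  # v1
```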

PRINTF(strfmt [, obj1] [, ...])

Returns a formatted string from printf-style format strings.

  • strfmt: The string format.
  • obj1: The object to include in the formatted string.

REGEXP_EXTRACT(str, regexp [, idx])

Extracts a group that matches regexp.

  • str: A string expression.
  • regexp: A regular expression to search for.
  • idx: The index of the regular expression group to extract.

REGEXP_REPLACE(str, regexp, rep)

Replaces all substrings of str that match regexp with rep.

  • str: A string expression.
  • regexp: A regular expression to search for.
  • rep: The string to replace.

REPEAT(str, n)

Returns the string which repeats the given string value n times.

  • str: The string expression to repeat.
  • n: The number of times to repeat str.

REPLACE(str, search [, replace])

Replaces all occurrences of search with replace. If search is not found in str, str is returned unchanged. If replace is not specified or is an empty string, nothing replaces the string that is removed from str.

  • str: A string expression.
  • search: The search string.
  • replace: A string expression to replace search values.

REVERSE(str)

Returns the reversed given string.

  • str: A string expression.

RPAD(str, len, pad_str)

Returns str, right-padded with pad_str to a length of len. If str is longer than len, the return value is shortened to len characters.

  • str: A string expression.
  • len: The length to pad.
  • pad_str: The pad string.

RTRIM(str)

Removes the trailing space characters from str.

  • str: A string expression.

SENTENCES(str [, lang, country])

Splits str into an array of arrays of words.

  • str: A string expression.
  • lang: The language of str.
  • country: The country of the specified language.

SPACE(n)

Returns a string consisting of n spaces.

  • n: The number of spaces.

SPLIT(str, regex)

Splits str around occurrences that match regex.

  • str: A string expression.
  • regex: The regular expression to match.

STR_TO_MAP(text [, pairDelim [, keyValueDelim]])

Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim.

  • text: A string expression.
  • pairDelim: The pair delimiter.
  • keyValueDelim: The value delimiter.
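
The splitting behavior can be pictured with the Python sketch below. It treats both delimiters as literal strings, which is an assumption (an engine may interpret them as regular expressions), and it maps a pair with no key/value delimiter to None. The function name is illustrative.

```python
from typing import Dict, Optional


def str_to_map(text: str, pair_delim: str = ",",
               key_value_delim: str = ":") -> Dict[str, Optional[str]]:
    """Split text into pairs, then split each pair into a key and a value."""
    result: Dict[str, Optional[str]] = {}
    for pair in text.split(pair_delim):
        key, sep, value = pair.partition(key_value_delim)
        result[key] = value if sep else None  # no delimiter found: no value
    return result


print(str_to_map("a:1,b:2,c:3"))        # {'a': '1', 'b': '2', 'c': '3'}
print(str_to_map("a=1;b=2", ";", "="))  # {'a': '1', 'b': '2'}
```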

SUBSTR(str, pos [, len])

Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.

  • str: A string expression.
  • pos: The starting position.
  • len: The length of the string.

SUBSTRING_INDEX(str, delim, count)

Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for delim.

  • str: A string expression.
  • delim: The delimiter.
  • count: Total number of occurrences.
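
The positive and negative count cases are easier to follow with a concrete sketch. The Python below is an illustration only; returning an empty string when count is 0 or delim is empty is an assumption, since the description above does not cover those cases.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Return the part of s before (count > 0) or after (count < 0) the
    count-th occurrence of delim, using a case-sensitive match."""
    if count == 0 or delim == "":
        return ""  # assumption: not specified in the description
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])  # keep everything left of the cut
    return delim.join(parts[count:])      # keep everything right of the cut


print(substring_index("www.apache.org", ".", 2))   # www.apache
print(substring_index("www.apache.org", ".", -1))  # org
```

When delim occurs fewer than count times, the slices cover the whole list and the original string is returned unchanged.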

TRANSLATE(input, from, to)

Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string.

  • input: A string expression.
  • from: A string expression.
  • to: A string expression.

TRIM(str)

Removes the leading and trailing space characters from str.

  • str: A string expression.

UNBASE64(str)

Converts the argument from a base 64 string str to a binary.

  • str: A string expression.

UPPER(str)

Returns str with all characters changed to uppercase.

  • str: A string expression.

INITCAP(str)

Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.

  • str: A string expression.

LEVENSHTEIN(str1, str2)

Returns the Levenshtein distance between the two given strings.

  • str1: A string expression.
  • str2: A string expression.

SOUNDEX(str)

Returns the Soundex code of the string.

  • str: A string expression.

MASK(str [, upper [, lower [, number]]])

Returns a masked version of str. By default, upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example mask("abcd-EFGH-8765-4321") results in xxxx-XXXX-nnnn-nnnn. You can override the characters used in the mask by supplying additional arguments: the second argument controls the mask character for upper case letters, the third argument for lower case letters and the fourth argument for numbers. For example, mask("abcd-EFGH-8765-4321", "U", "l", "#") results in llll-UUUU-####-####.

  • str: The string to mask.
  • upper: The character to mask for uppercase letters.
  • lower: The character to mask for lowercase letters.
  • number: The character to mask for numbers.
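
The masking rules above can be expressed as a short Python sketch that reproduces both documented examples. It only considers ASCII letters and digits, which is an assumption; the function name is illustrative.

```python
def mask(s: str, upper: str = "X", lower: str = "x", number: str = "n") -> str:
    """Replace uppercase letters, lowercase letters, and digits with mask characters."""
    out = []
    for ch in s:
        if "A" <= ch <= "Z":
            out.append(upper)
        elif "a" <= ch <= "z":
            out.append(lower)
        elif "0" <= ch <= "9":
            out.append(number)
        else:
            out.append(ch)  # punctuation and other characters pass through
    return "".join(out)


print(mask("abcd-EFGH-8765-4321"))                 # xxxx-XXXX-nnnn-nnnn
print(mask("abcd-EFGH-8765-4321", "U", "l", "#"))  # llll-UUUU-####-####
```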

MASK_FIRST_N(str [, n])

Returns a masked version of str with the first n values masked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_first_n("1234-5678-8765-4321", 4) results in nnnn-5678-8765-4321.

  • str: The string to mask.
  • n: The number of values to mask.

MASK_LAST_N(str [, n])

Returns a masked version of str with the last n values masked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_last_n("1234-5678-8765-4321", 4) results in 1234-5678-8765-nnnn.

  • str: The string to mask.
  • n: The number of values to mask.

MASK_SHOW_FIRST_N(str [, n])

Returns a masked version of str, showing the first n characters unmasked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_show_first_n("1234-5678-8765-4321", 4) results in 1234-nnnn-nnnn-nnnn.

  • str: The string to mask.
  • n: The number of characters to leave unmasked.

MASK_SHOW_LAST_N(str [, n])

Returns a masked version of str, showing the last n characters unmasked. Upper case letters are converted to "X", lower case letters are converted to "x" and numbers are converted to "n". For example, mask_show_last_n("1234-5678-8765-4321", 4) results in nnnn-nnnn-nnnn-4321.

  • str: The string to mask.
  • n: The number of characters to leave unmasked.

MASK_HASH(str)

Returns a hashed value based on str. The hash is consistent and can be used to join masked values together across tables. This function returns null for non-string types.

  • str: The string to mask.

JAVA_METHOD(class, method [, arg1] [, ...])

Calls a method with reflection.

  • class: The class to call.
  • method: The method to call.
  • arg1: The argument to pass in.

REFLECT(class, method [, arg1] [, ...])

Calls a method with reflection.

  • class: The class to call.
  • method: The method to call.
  • arg1: The argument to pass in.

HASH(expr1 [, expr2] [, ...])

Returns a hash value of the arguments.

  • expr1: A valid expression.
  • expr2: A valid expression.

CURRENT_USER()

Returns the current user name from the configured authenticator manager. This can be the same as the user provided when connecting, but with some authentication managers (for example, HadoopDefaultAuthenticator) it can be different.

LOGGED_IN_USER()

Returns the current user name from the session state. This is the username provided when connecting to Impala.

CURRENT_DATABASE()

Returns the current database name.

MD5(expr)

Returns an MD5 128-bit checksum as a hex string of expr.

  • expr: A valid expression.

SHA1(expr)

Returns a sha1 hash value as a hex string of the expr.

  • expr: A valid expression.

CRC32(expr)

Returns a cyclic redundancy check value of the expr as a bigint.

  • expr: A valid expression.

SHA2(expr, bitlength)

Returns a checksum of SHA-2 family as a hex string of expr. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256.

  • expr: A valid expression.
  • bitlength: The bit length.
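
The mapping from bitlength to a SHA-2 variant can be sketched with Python's hashlib. Returning None for an unsupported bit length is an assumption, and the input is taken as bytes for simplicity; the function name is illustrative.

```python
import hashlib
from typing import Optional

# Bit length 0 is treated as an alias for 256, as described above.
_ALGORITHMS = {
    0: hashlib.sha256,
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}


def sha2(data: bytes, bitlength: int) -> Optional[str]:
    """Return the SHA-2 checksum of data as a hex string."""
    algorithm = _ALGORITHMS.get(bitlength)
    return algorithm(data).hexdigest() if algorithm else None


print(sha2(b"abc", 0) == sha2(b"abc", 256))  # True
print(len(sha2(b"abc", 512)))                # 128
```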

AES_ENCRYPT(input, key)

Encrypts input using AES. Key lengths of 128, 192, or 256 bits can be used. 192-bit and 256-bit keys can be used if the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files are installed. If either argument is NULL or the key length is not one of the permitted values, the return value is NULL. Example: base64(aes_encrypt('ABC', '1234567890123456')) = 'y6Ss+zCYObpCbgfWfyNWTw=='.

  • input: The input value to encrypt.
  • key: The key to use when encrypting.

VERSION()

Returns the Impala version. The string contains 2 fields, the first being a build number and the second being a build hash. Example: "select version();" might return "2.1.0.2.5.0.0-1245 r027527b9c5ce1a3d7d0b6d2e6de2378fb0c39232". Actual results will depend on your build.

COUNT(DISTINCT expr1 [, expr2] [, ...])

Returns the number of rows for which the supplied expression(s) are unique and non-null.

  • expr1: A valid expression.
  • expr2: A valid expression.

SUM(expr)

Returns the sum calculated from values of a group.

  • expr: A valid expression.

SUM(DISTINCT expr)

Returns the sum calculated from distinct values of a group.

  • expr: A valid expression.

AVG(expr)

Returns the mean calculated from values of a group.

  • expr: A valid expression.

AVG(DISTINCT expr)

Returns the mean calculated from distinct values of a group.

  • expr: A valid expression.

MIN(expr)

Returns the minimum value of expr.

  • expr: A valid expression.

MAX(expr)

Returns the maximum value of expr.

  • expr: A valid expression.

VARIANCE(expr)

Returns the sample variance calculated from values of a group.

  • expr: A valid expression.

STDDEV_POP(expr)

Returns the population standard deviation calculated from values of a group.

  • expr: A valid expression.

STDDEV_SAMP(expr)

Returns the sample standard deviation calculated from values of a group.

  • expr: A valid expression.

COVAR_POP(expr1, expr2)

Returns the population covariance of a set of number pairs.

  • expr1: A valid expression.
  • expr2: A valid expression.

COVAR_SAMP(expr1, expr2)

Returns the sample covariance of a set of number pairs.

  • expr1: A valid expression.
  • expr2: A valid expression.

CORR(expr1, expr2)

Returns the Pearson coefficient of correlation between a set of number pairs.

  • expr1: A valid expression.
  • expr2: A valid expression.

PERCENTILE(col, percentage [, accuracy])

Returns the exact percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The value of frequency should be a positive integer.

  • col: A numeric expression.
  • percentage: The percentage.
  • accuracy: The accuracy to control approximation.

PERCENTILE_APPROX(col, percentage [, accuracy])

Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array.

  • col: A numeric expression.
  • percentage: The percentage.
  • accuracy: The accuracy to control approximation.

COLLECT_SET(expr)

Collects and returns a set of unique elements.

  • expr: A valid expression.

COLLECT_LIST(expr)

Collects and returns a list of non-unique elements.

  • expr: A valid expression.

NTILE(n)

Divides the rows for each window partition into n buckets ranging from 1 to at most n.

  • n: The number of buckets.

EXPLODE(expr)

Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.

  • expr: A valid expression.

POSEXPLODE(expr)

Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.

  • expr: A valid expression.

INLINE(expr)

Explodes an array of structs into a table.

  • expr: A valid expression.

STACK(n, expr1 [, expr2 ] [, ...])

Separates expr1, ..., exprk into n rows.

  • n: The number of rows.
  • expr1: A valid expression.
  • expr2: A valid expression.

PARSE_URL_TUPLE(urlStr, p1 [, p2 ] [, ...])

Takes a URL string and a set of n URL parts, and returns a tuple of n values. This is similar to the parse_url() UDF but can extract multiple parts at once out of a URL. Valid part names are: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO, QUERY:[KEY].

  • urlStr: A valid URL string.
  • p1: A valid part name.
  • p2: A valid part name.

CData Cloud

SELECT INTO Statements

You can use the SELECT INTO statement to export formatted data to a file.

Data Export with an SQL Query

The following query exports data into a file formatted in comma-separated values (CSV):

SELECT City, CompanyName INTO [csv://[CData].[Default].Customers.txt] FROM [[CData].[Default].Customers] WHERE Country = 'US'

You can specify other formats in the file URI. The possible delimiters are tab, semicolon, and comma, with comma as the default. The following example exports tab-separated values:

SELECT City, CompanyName INTO [csv://[CData].[Default].Customers.txt;delimiter=tab] FROM [[CData].[Default].Customers] WHERE Country = 'US'

SQL Functions

The Cloud provides functions that are similar to those that are available with most standard databases. These functions are implemented in the CData provider engine and thus are available across all data sources with the same consistent API. Three categories of functions are available: string, date, and math.

The Cloud interprets all SQL function inputs as either strings or column identifiers, so you need to escape all literals as strings, with single quotes. For example, contrast the SQL Server syntax and Cloud syntax for the DATENAME function:

  • SQL Server:
    SELECT DATENAME(yy,GETDATE())
  • Cloud:
    SELECT DATENAME('yy',GETDATE())

String Functions

These functions perform string manipulations and return a string value. See STRING Functions for more details.

SELECT CONCAT(firstname, space(4), lastname) FROM [CData].[Default].Customers WHERE Country = 'US'

Date Functions

These functions perform date and date time manipulations. See DATE Functions for more details.

SELECT CURRENT_TIMESTAMP() FROM [CData].[Default].Customers

Math Functions

These functions provide mathematical operations. See MATH Functions for more details.

SELECT RAND() FROM [CData].[Default].Customers

Function Parameters and Nesting SQL Functions

The Cloud supports column names, constants, and results of other functions as parameters to functions. The following query is a valid use of nested SQL functions:

SELECT CONCAT('Mr.', SPACE(2), firstname, SPACE(4), lastname) FROM [CData].[Default].Customers

Predicate Functions

These functions can be used to specify criteria in the WHERE clause of your SQL query. See Predicate Functions for more details.

SELECT * FROM [CData].[Default].Customers WHERE CreatedDate = NOW()

STRING Functions

ASCII(character_expression)

Returns the ASCII code value of the left-most character of the character expression.

  • character_expression: The character expression.

                      SELECT ASCII('0');
                      --  Result: 48
                    

CHAR(integer_expression)

Converts the integer ASCII code to the corresponding character.

  • integer_expression: The integer from 0 through 255.

                      SELECT CHAR(48);
                      -- Result: '0'
                    

CHARINDEX(expressionToFind, expressionToSearch [, start_location])

Returns the starting position of the specified expression in the character string.

  • expressionToFind: The character expression to find.
  • expressionToSearch: The character expression, typically a column, to search.
  • start_location: The optional character position to start searching for expressionToFind in expressionToSearch.

                      SELECT CHARINDEX('456', '0123456');
                      -- Result: 4

                      SELECT CHARINDEX('456', '0123456', 5);
                      -- Result: -1
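
Judging from the two examples above, the returned position appears to be zero-based, with -1 meaning "not found". That reading is inferred from the examples rather than stated in the text. Under that assumption, the behavior matches Python's str.find, as this sketch shows (the function name is illustrative):

```python
def charindex(expression_to_find: str, expression_to_search: str,
              start_location: int = 0) -> int:
    """Return the zero-based position of the first match at or after
    start_location, or -1 if there is no match."""
    return expression_to_search.find(expression_to_find, start_location)


print(charindex("456", "0123456"))     # 4
print(charindex("456", "0123456", 5))  # -1
```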
                    

CHAR_LENGTH(character_expression)

Returns the number of UTF-8 characters present in the expression.

  • character_expression: The set of characters to be evaluated for length.

                      SELECT CHAR_LENGTH('sample text') FROM Account LIMIT 1;
                      -- Result: 11
				

CONCAT(string_value1, string_value2 [, string_valueN])

Returns the string that is the concatenation of two or more string values.

  • string_value1: The first string to be concatenated.
  • string_value2: The second string to be concatenated.
  • string_valueN: The optional additional strings to be concatenated.

                      SELECT CONCAT('Hello, ', 'world!');
                      -- Result: 'Hello, world!'
                    

CONTAINS(expressionToSearch, expressionToFind)

Returns 1 if expressionToFind is found within expressionToSearch; otherwise, 0.

  • expressionToSearch: The character expression, typically a column, to search.
  • expressionToFind: The character expression to find.

                      SELECT CONTAINS('0123456', '456');
                      -- Result: 1

                      SELECT CONTAINS('0123456', 'Not a number');
                      -- Result: 0
                    

ENDSWITH(character_expression, character_suffix)

Returns 1 if character_expression ends with character_suffix; otherwise, 0.

  • character_expression: The character expression.
  • character_suffix: The character suffix to search for.

                      SELECT ENDSWITH('0123456', '456');
                      -- Result: 1

                      SELECT ENDSWITH('0123456', '012');
                      -- Result: 0
                    

FILESIZE(uri)

Returns the number of bytes present in the file at the specified file path.

  • uri: The path of the file to read the size from.

                      SELECT FILESIZE('C:/Users/User1/Desktop/myfile.txt');
                      -- Result: 23684
				

FORMAT(value [, parseFormat], format)

Returns the value formatted with the specified format.

  • value: The string to format.
  • format: The string specifying the output syntax of the date or numeric format.
  • parseFormat: The string specifying the input syntax of the date value. Not applicable to numeric types.

                      SELECT FORMAT(12.34, '#');
                      -- Result: 12

                      SELECT FORMAT(12.34, '#.###');
                      -- Result: 12.34

                      SELECT FORMAT(1234, '0.000E0');
                      -- Result: 1.234E3
                      
                      SELECT FORMAT('2019/01/01', 'yyyy-MM-dd');
                      -- Result: 2019-01-01
                      
                      SELECT FORMAT('20190101', 'yyyyMMdd', 'yyyy-MM-dd');
                      -- Result: '2019-01-01'
                    

FROM_UNIXTIME(time, issecond)

Returns a representation of the time argument as a value in YYYY-MM-DD HH:MM:SS format, expressed in the current time zone.

  • time: The time stamp value, measured from the epoch. Milliseconds are accepted.
  • issecond: Indicates whether the time stamp value is in seconds (1) or milliseconds (0) since the epoch.

                      SELECT FROM_UNIXTIME(1540495231, 1);
                      -- Result: 2018-10-25 19:20:31

                      SELECT FROM_UNIXTIME(1540495357385, 0);
                      -- Result: 2018-10-25 19:22:37
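
The role of the second argument can be checked with a small Python sketch. It renders the result in UTC, which reproduces both results shown above; the description mentions the current time zone, so output on a real connection may differ. The function name is illustrative.

```python
from datetime import datetime, timezone


def from_unixtime(time_value: int, issecond: int) -> str:
    """Format an epoch value given in seconds (issecond=1) or milliseconds (issecond=0)."""
    seconds = time_value if issecond else time_value // 1000
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)  # UTC is an assumption
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(from_unixtime(1540495231, 1))     # 2018-10-25 19:20:31
print(from_unixtime(1540495357385, 0))  # 2018-10-25 19:22:37
```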
                    

HASHBYTES(algorithm, value)

Returns the hash of the input value as a byte array using the given algorithm. The supported algorithms are MD5, SHA1, SHA2_256, SHA2_512, SHA3_224, SHA3_256, SHA3_384, and SHA3_512.

  • algorithm: The algorithm to use for hashing. Must be one of MD5, SHA1, SHA2_256, SHA2_512, SHA3_224, SHA3_256, SHA3_384, or SHA3_512.
  • value: The value to hash. Must be either a string or byte array.

                      SELECT HASHBYTES('MD5', 'Test');
                      -- Result (byte array): 0x0CBC6611F5540BD0809A388DC95A615B
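
To cross-check a hash outside the driver, the same digests can be computed with Python's hashlib. This is a sketch under assumptions: the mapping from the documented algorithm names to hashlib names and the UTF-8 encoding of string input are guesses, and the helper name hashbytes is hypothetical.

```python
import hashlib

# Assumed mapping from the documented algorithm names to hashlib names.
ALGORITHMS = {
    "MD5": "md5",
    "SHA1": "sha1",
    "SHA2_256": "sha256",
    "SHA2_512": "sha512",
    "SHA3_224": "sha3_224",
    "SHA3_256": "sha3_256",
    "SHA3_384": "sha3_384",
    "SHA3_512": "sha3_512",
}

def hashbytes(algorithm: str, value) -> bytes:
    """Hash a string (encoded as UTF-8, an assumption) or a byte array."""
    data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
    return hashlib.new(ALGORITHMS[algorithm.upper()], data).digest()

# Print the digest in the same 0x-prefixed uppercase form used above.
print("0x" + hashbytes("MD5", "Test").hex().upper())
```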
                    

INDEXOF(expressionToSearch, expressionToFind [,start_location ])

Returns the starting position of the specified expression in the character string.

  • expressionToSearch: The character expression, typically a column, to search.
  • expressionToFind: The character expression to find.
  • start_location: The optional character position to start searching for expressionToFind in expressionToSearch.

                      SELECT INDEXOF('0123456', '456');
                      -- Result: 4

                      SELECT INDEXOF('0123456', '456', 5);
                      -- Result: -1
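
Judging by the examples, INDEXOF is zero-based and returns -1 when nothing is found, which matches Python's str.find. The sketch below only mirrors that observed behavior; the helper name indexof is hypothetical.

```python
def indexof(expression_to_search: str, expression_to_find: str,
            start_location: int = 0) -> int:
    """Zero-based position of the first match at or after start_location, else -1."""
    # str.find has the same zero-based / -1 convention as the examples above.
    return expression_to_search.find(expression_to_find, start_location)

print(indexof("0123456", "456"))     # 4
print(indexof("0123456", "456", 5))  # -1
```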
                    

ISNULL ( check_expression , replacement_value )

Replaces null with the specified replacement value.

  • check_expression: The expression to be checked for null.
  • replacement_value: The expression to be returned if check_expression is null.

                      SELECT ISNULL(42, 'Was NULL');
                      -- Result: 42

                      SELECT ISNULL(NULL, 'Was NULL');
                      -- Result: 'Was NULL'
                    

JSON_AVG(json, jsonpath)

Computes the average value of a JSON array within a JSON object. The path to the array is specified in the jsonpath argument. Return value is numeric or null.

  • json: The JSON document to compute.
  • jsonpath: The JSONPath used to select the nodes. [x], [2..], [..8], or [1..12] are accepted. [x] selects all nodes.

                      SELECT JSON_AVG('[1,2,3,4,5]', '$[x]');
                      -- Result: 3

                      SELECT JSON_AVG('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[x]');
                      -- Result: 3

                      SELECT JSON_AVG('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[3..]');
                      -- Result: 4.5
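
The range selectors can be read as zero-based array slices. The Python below is a sketch of that reading, not the driver's parser: the helper names are hypothetical, it works on a bare JSON array only, and the inclusive upper bound for [..b] and [a..b] is an assumption.

```python
import json
import re

def select_nodes(values: list, selector: str) -> list:
    """Apply an x, a.., ..b, or a..b selector (zero-based) to a list."""
    if selector == "x":
        return values
    match = re.fullmatch(r"(\d*)\.\.(\d*)", selector)
    if not match:
        raise ValueError(f"unsupported selector: {selector}")
    start = int(match.group(1)) if match.group(1) else 0
    # Assumption: the upper bound is inclusive, hence the + 1.
    end = int(match.group(2)) + 1 if match.group(2) else len(values)
    return values[start:end]

def json_avg(document: str, selector: str):
    """Average of the selected elements of a JSON array, or None if none match."""
    nodes = select_nodes(json.loads(document), selector)
    return sum(nodes) / len(nodes) if nodes else None

print(json_avg("[1,2,3,4,5]", "x"))    # 3.0
print(json_avg("[1,2,3,4,5]", "3.."))  # 4.5
```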
                    

JSON_COUNT(json, jsonpath)

Returns the number of elements in a JSON array within a JSON object. The path to the array is specified in the jsonpath argument. Return value is numeric or null.

  • json: The JSON document to compute.
  • jsonpath: The JSONPath used to select the nodes. [x], [2..], [..8], or [1..12] are accepted. [x] selects all nodes.

                      SELECT JSON_COUNT('[1,2,3,4,5]', '$[x]');
                      -- Result: 5

                      SELECT JSON_COUNT('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[x]');
                      -- Result: 5

                      SELECT JSON_COUNT('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[3..]');
                      -- Result: 2
                    

JSON_EXTRACT(json, jsonpath)

Selects any value in a JSON array or object. The path to the value is specified in the jsonpath argument. The return value is the selected value, or null.

  • json: The JSON document to extract.
  • jsonpath: The JSONPath used to select the nodes. The JSONPath must be a string constant. The values of the nodes selected will be returned in a token-separated list.

                      SELECT JSON_EXTRACT('{"test": {"data": 1}}', '$.test');
                      -- Result: '{"data":1}'

                      SELECT JSON_EXTRACT('{"test": {"data": 1}}', '$.test.data');
                      -- Result: 1

                      SELECT JSON_EXTRACT('{"test": {"data": [1, 2, 3]}}', '$.test.data[1]');
                      -- Result: 2
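
A rough way to reason about which node a path selects is to walk the document one path step at a time. The following Python sketch handles only .name and [index] steps; the helper name json_extract and the compact serialization of objects and arrays are assumptions made to match the examples.

```python
import json
import re

def json_extract(document: str, jsonpath: str):
    """Follow .name and [index] steps from the root; return None on a miss."""
    node = json.loads(document)
    # Each token is either a member name or an array index.
    for name, index in re.findall(r"\.(\w+)|\[(\d+)\]", jsonpath):
        try:
            node = node[name] if name else node[int(index)]
        except (KeyError, IndexError, TypeError):
            return None
    if isinstance(node, (dict, list)):
        # Objects and arrays are returned as compact JSON text.
        return json.dumps(node, separators=(",", ":"))
    return node

print(json_extract('{"test": {"data": 1}}', "$.test"))                  # {"data":1}
print(json_extract('{"test": {"data": [1, 2, 3]}}', "$.test.data[1]"))  # 2
```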
                    

JSON_MAX(json, jsonpath)

Gets the maximum value in a JSON array within a JSON object. The path to the array is specified in the jsonpath argument. Return value is numeric or null.

  • json: The JSON document to compute.
  • jsonpath: The JSONPath used to select the nodes. [x], [2..], [..8], or [1..12] are accepted. [x] selects all nodes.

                      SELECT JSON_MAX('[1,2,3,4,5]', '$[x]');
                      -- Result: 5

                      SELECT JSON_MAX('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[x]');
                      -- Result: 5

                      SELECT JSON_MAX('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[..3]');
                      -- Result: 4
                    

JSON_MIN(json, jsonpath)

Gets the minimum value in a JSON array within a JSON object. The path to the array is specified in the jsonpath argument. Return value is numeric or null.

  • json: The JSON document to compute.
  • jsonpath: The JSONPath used to select the nodes. [x], [2..], [..8], or [1..12] are accepted. [x] selects all nodes.

                      SELECT JSON_MIN('[1,2,3,4,5]', '$[x]');
                      -- Result: 1

                      SELECT JSON_MIN('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[x]');
                      -- Result: 1

                      SELECT JSON_MIN('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[3..]');
                      -- Result: 4
                    

JSON_SUM(json, jsonpath)

Computes the sum of the values in a JSON array within a JSON object. The path to the array is specified in the jsonpath argument. Return value is numeric or null.

  • json: The JSON document to compute.
  • jsonpath: The JSONPath used to select the nodes. [x], [2..], [..8], or [1..12] are accepted. [x] selects all nodes.

                      SELECT JSON_SUM('[1,2,3,4,5]', '$[x]');
                      -- Result: 15

                      SELECT JSON_SUM('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[x]');
                      -- Result: 15

                      SELECT JSON_SUM('{"test": {"data": [1,2,3,4,5]}}', '$.test.data[3..]');
                      -- Result: 9
                    

LEFT ( character_expression , integer_expression )

Returns the specified number of characters counting from the left of the specified string.

  • character_expression: The character expression.
  • integer_expression: The positive integer that specifies how many characters will be returned counting from the left of character_expression.

                      SELECT LEFT('1234567890', 3);
                      -- Result: '123'
                    

LEN(string_expression)

Returns the number of characters of the specified string expression.

  • string_expression: The string expression.

                      SELECT LEN('12345');
                      -- Result: 5
                    

LOCATE(substring,string)

Returns an integer representing how many characters into the string the substring appears.

  • substring: The substring to find inside larger string.
  • string: The larger string that will be searched for the substring.

				SELECT LOCATE('sample','XXXXXsampleXXXXX');
				-- Result: 6
				

LOWER ( character_expression )

Returns the character expression with the uppercase character data converted to lowercase.

  • character_expression: The character expression.

                      SELECT LOWER('MIXED case');
                      -- Result: 'mixed case'
                    

LTRIM(character_expression)

Returns the character expression with leading blanks removed.

  • character_expression: The character expression.

                      SELECT LTRIM('     trimmed');
                      -- Result: 'trimmed'
                    

MASK(string_expression, mask_character [, start_index [, end_index ]])

Replaces the characters between start_index and end_index with the mask_character within the string.

  • string_expression: The string expression to be searched.
  • mask_character: The character to mask with.
  • start_index: The optional number of characters to leave unmasked at beginning of string. Defaults to 0.
  • end_index: The optional number of characters to leave unmasked at end of string. Defaults to 0.

                        SELECT MASK('1234567890','*');
                        -- Result: '**********'
                        SELECT MASK('1234567890','*', 4);
                        -- Result: '1234******'
                        SELECT MASK('1234567890','*', 4, 2);
                        -- Result: '1234****90'  
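
Because start_index and end_index count the characters left unmasked at each end, the masking reduces to simple slicing. A minimal Python sketch, assuming out-of-range counts are clamped (that clamping is a guess, and the helper name mask is hypothetical):

```python
def mask(string_expression: str, mask_character: str,
         start_index: int = 0, end_index: int = 0) -> str:
    """Keep start_index leading and end_index trailing characters; mask the rest."""
    length = len(string_expression)
    # Assumption: counts outside the string are clamped rather than rejected.
    keep_start = min(max(start_index, 0), length)
    keep_end = min(max(end_index, 0), length - keep_start)
    masked = length - keep_start - keep_end
    return (string_expression[:keep_start]
            + mask_character * masked
            + string_expression[length - keep_end:])

print(mask("1234567890", "*"))        # **********
print(mask("1234567890", "*", 4))     # 1234******
print(mask("1234567890", "*", 4, 2))  # 1234****90
```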
                    

NCHAR(integer_expression)

Returns the Unicode character with the specified integer code as defined by the Unicode standard.

  • integer_expression: The integer from 0 through 255.

OCTET_LENGTH(character_expression)

Returns the number of bytes present in the expression.

  • character_expression: The set of characters to be evaluated.

				 SELECT OCTET_LENGTH('text') FROM Account LIMIT 1;
				 -- Result: 4
				

PATINDEX(pattern, expression)

Returns the starting position of the first occurrence of the pattern in the expression. Returns 0 if the pattern is not found.

  • pattern: The character expression that contains the sequence to be found. The wild-card character % can be used only at the start or end of the expression.
  • expression: The expression, typically a column, to search for the pattern.

                      SELECT PATINDEX('123%', '1234567890');
                      -- Result: 1

                      SELECT PATINDEX('%890', '1234567890');
                      -- Result: 8

                      SELECT PATINDEX('%456%', '1234567890');
                      -- Result: 4
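
Since % is allowed only at the start or end of the pattern, the match can be modeled as an anchored or unanchored literal search. The Python below is a sketch of that model; the helper name patindex is hypothetical, and case-sensitive matching is an assumption.

```python
import re

def patindex(pattern: str, expression: str) -> int:
    """1-based position of the first match, or 0 when the pattern is not found."""
    # A missing leading % anchors the literal at the start; a missing
    # trailing % anchors it at the end.
    prefix = "" if pattern.startswith("%") else r"\A"
    suffix = "" if pattern.endswith("%") else r"\Z"
    regex = prefix + re.escape(pattern.strip("%")) + suffix
    match = re.search(regex, expression, flags=re.DOTALL)
    return match.start() + 1 if match else 0

print(patindex("123%", "1234567890"))   # 1
print(patindex("%890", "1234567890"))   # 8
print(patindex("%456%", "1234567890"))  # 4
```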
                    

POSITION(expressionToFind IN expressionToSearch)

Returns the starting position of the specified expression in the character string.

  • expressionToFind: The character expression to find.
  • expressionToSearch: The character expression, typically a column, to search.

                      SELECT POSITION('456' IN '123456');
                      -- Result: 4

                      SELECT POSITION('x' IN '123456');
                      -- Result: 0
                    

QUOTENAME(character_string [, quote_character])

Returns a valid SQL Server-delimited identifier by adding the necessary delimiters to the specified Unicode string.

  • character_string: The string of Unicode character data. The string is limited to 128 characters. Inputs greater than 128 characters return null.
  • quote_character: The optional single character to be used as the delimiter. Can be a single quotation mark, a left or right bracket, or a double quotation mark. If quote_character is not specified brackets are used.

                      SELECT QUOTENAME('table_name');
                      -- Result: '[table_name]'

                      SELECT QUOTENAME('table_name', '"');
                      -- Result: '"table_name"'

                      SELECT QUOTENAME('table_name', '[');
                      -- Result: '[table_name]'
                    

REPLACE(string_expression, string_pattern, string_replacement)

Replaces all occurrences of a string with another string.

  • string_expression: The string expression to be searched. Can be a character or binary data type.
  • string_pattern: The substring to be found. Cannot be an empty string.
  • string_replacement: The replacement string.

                      SELECT REPLACE('1234567890', '456', '|');
                      -- Result: '123|7890'

                      SELECT REPLACE('123123123', '123', '.');
                      -- Result: '...'

                      SELECT REPLACE('1234567890', 'a', 'b');
                      -- Result: '1234567890'
                    

REPLICATE ( string_expression ,integer_expression )

Repeats the string value the specified number of times.

  • string_expression: The string to replicate.
  • integer_expression: The repeat count.

                      SELECT REPLICATE('x', 5);
                      -- Result: 'xxxxx'
                    

REVERSE ( string_expression )

Returns the reverse order of the string expression.

  • string_expression: The string.

                      SELECT REVERSE('1234567890');
                      -- Result: '0987654321'
                    

RIGHT ( character_expression , integer_expression )

Returns the right part of the string with the specified number of characters.

  • character_expression: The character expression.
  • integer_expression: The positive integer that specifies how many characters of the character expression will be returned.

                      SELECT RIGHT('1234567890', 3);
                      -- Result: '890'
                    

RTRIM(character_expression)

Returns the character expression after it removes trailing blanks.

  • character_expression: The character expression.

                      SELECT RTRIM('trimmed     ');
                      -- Result: 'trimmed'
                    

SOUNDEX(character_expression)

Returns the four-character Soundex code, based on how the string sounds when spoken.

  • character_expression: The alphanumeric expression of character data.

                      SELECT SOUNDEX('smith');
                      -- Result: 'S530'
                    

SPACE(repeatcount)

Returns the string that consists of repeated spaces.

  • repeatcount: The number of spaces.

                      SELECT SPACE(5);
                      -- Result: '     '
                    

SPLIT(string, delimiter, offset)

Returns a section of the string between two delimiters.

  • string: The string to split.
  • delimiter: The character to split the string with.
  • offset: The number of the split to return. Positive numbers are treated as offsets from the left, and negative numbers are treated as offsets from the right.

                      SELECT SPLIT('a/b/c/d', '/', 1);
                      -- Result: 'a'
                      SELECT SPLIT('a/b/c/d', '/', -2);
                      -- Result: 'c'
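
The offset rule (1-based from the left, negative from the right) can be sketched as follows. Returning None for an offset of 0 or an offset beyond the number of sections is an assumption of this sketch, and the helper name split is hypothetical.

```python
def split(string: str, delimiter: str, offset: int):
    """Return the offset-th section; negative offsets count from the right."""
    parts = string.split(delimiter)
    if offset == 0 or abs(offset) > len(parts):
        return None  # assumption: out-of-range offsets yield no value
    # Positive offsets are 1-based; negative offsets map directly
    # onto Python's negative indexing.
    return parts[offset - 1] if offset > 0 else parts[offset]

print(split("a/b/c/d", "/", 1))   # a
print(split("a/b/c/d", "/", -2))  # c
```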
                    

STARTSWITH(character_expression, character_prefix)

Returns 1 if character_expression starts with character_prefix; otherwise, 0.

  • character_expression: The character expression.
  • character_prefix: The character prefix to search for.

                      SELECT STARTSWITH('0123456', '012');
                      -- Result: 1

                      SELECT STARTSWITH('0123456', '456');
                      -- Result: 0
                    

STR ( float_expression [ , integer_length [ , integer_decimal ] ] )

Returns the character data converted from the numeric data. For example, STR(123.45, 6, 1) returns 123.5.

  • float_expression: The float expression.
  • length: The optional total length to return. This includes decimal point, sign, digits, and spaces. The default is 10.
  • decimal: The optional number of places to the right of the decimal point. The decimal must be less than or equal to 16.

                      SELECT STR('123.456');
                      -- Result: '123'

                      SELECT STR('123.456', 2);
                      -- Result: '**'

                      SELECT STR('123.456', 10, 2);
                      -- Result: '123.46'
                    

STUFF(character_expression , integer_start , integer_length , replaceWith_expression)

Inserts a string into another string. It deletes the specified length of characters in the first string at the start position and then inserts the second string into the first string at the start position.

  • character_expression: The string expression.
  • start: The integer value that specifies the location to start deletion and insertion. If start or length is negative, null is returned. If start is longer than the string to be modified, character_expression, null is returned.
  • length: The integer that specifies the number of characters to delete. If length is longer than character_expression, deletion occurs up to the last character in character_expression.
  • replaceWith_expression: The expression of character data that will replace length characters of character_expression beginning at the start value.

                      SELECT STUFF('1234567890', 3, 2, 'xx');
                      -- Result: '12xx567890'
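
The delete-then-insert behavior amounts to joining three slices. In this Python sketch the start position is treated as 1-based (consistent with the example), and a start of 0 is also treated as null, which is an assumption; the helper name stuff is hypothetical.

```python
def stuff(character_expression: str, start: int, length: int,
          replace_with: str):
    """Delete length characters at the 1-based start, then insert replace_with."""
    # Assumption: a start of 0 is rejected like a negative start.
    if start < 1 or length < 0 or start > len(character_expression):
        return None
    begin = start - 1  # convert to a zero-based index
    # Slicing past the end is harmless, so an oversized length simply
    # deletes through the last character.
    return (character_expression[:begin]
            + replace_with
            + character_expression[begin + length:])

print(stuff("1234567890", 3, 2, "xx"))  # 12xx567890
```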
                    

SUBSTRING(string_value FROM start FOR length)

Returns the part of the string with the specified length; starts at the specified index.

  • string_value: The character string.
  • start: The positive integer that specifies the start index of characters to return.
  • length: Optional. The positive integer that specifies how many characters will be returned.

                      SELECT SUBSTRING('1234567890' FROM 3 FOR 2);
                      -- Result: '34'

                      SELECT SUBSTRING('1234567890' FROM 3);
                      -- Result: '34567890'
                    

TOSTRING(string_value1)

Converts the value of this instance to its equivalent string representation.

  • string_value1: The value to be converted.

                      SELECT TOSTRING(123);
                      -- Result: '123'

                      SELECT TOSTRING(123.456);
                      -- Result: '123.456'

                      SELECT TOSTRING(null);
                      -- Result: ''
                    

TRIM(trimspec trimchar FROM string_value)

Returns the character expression with leading and/or trailing blanks removed.

  • trimspec: Optional. If included must be one of the keywords BOTH, LEADING or TRAILING.
  • trimchar: Optional. If included should be a one-character string value.
  • string_value: The string value to trim.

                      SELECT TRIM('     trimmed     ');
                      -- Result: 'trimmed'

                      SELECT TRIM(LEADING FROM '     trimmed     ');
                      -- Result: 'trimmed     '

                      SELECT TRIM('-' FROM '-----trimmed-----');
                      -- Result: 'trimmed'

                      SELECT TRIM(BOTH '-' FROM '-----trimmed-----');
                      -- Result: 'trimmed'

                      SELECT TRIM(TRAILING '-' FROM '-----trimmed-----');
                      -- Result: '-----trimmed'
                    

UNICODE(ncharacter_expression)

Returns the integer value defined by the Unicode standard of the first character of the input expression.

  • ncharacter_expression: The Unicode character expression.

UPPER ( character_expression )

Returns the character expression with lowercase character data converted to uppercase.

  • character_expression: The character expression.

                      SELECT UPPER('MIXED case');
                      -- Result: 'MIXED CASE'
                    

XML_EXTRACT(xml, xpath [, separator])

Extracts an XML document using the specified XPath to flatten the XML. A comma is used to separate the outputs by default, but this can be changed by specifying the third parameter.

  • xml: The XML document to extract.
  • xpath: The XPath used to select the nodes. The nodes selected will be returned in a token-separated list.
  • separator: The optional token used to separate the items in the flattened response. If this is not specified, the separator will be a comma.

                      SELECT XML_EXTRACT('<vowels><ch>a</ch><ch>e</ch><ch>i</ch><ch>o</ch><ch>u</ch></vowels>', '/vowels/ch');
                      -- Result: 'a,e,i,o,u'

                      SELECT XML_EXTRACT('<vowels><ch>a</ch><ch>e</ch><ch>i</ch><ch>o</ch><ch>u</ch></vowels>', '/vowels/ch', ';');
                      -- Result: 'a;e;i;o;u'
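
The flattening step can be approximated with Python's ElementTree. This is a rough sketch only: ElementTree supports a limited XPath subset, so the hypothetical helper xml_extract takes a child tag relative to the root element instead of an absolute XPath.

```python
import xml.etree.ElementTree as ET

def xml_extract(xml: str, child_tag: str, separator: str = ",") -> str:
    """Join the text of the root's matching child elements with separator."""
    root = ET.fromstring(xml)
    # findall with a bare tag name selects direct children of the root.
    return separator.join(node.text or "" for node in root.findall(child_tag))

doc = "<vowels><ch>a</ch><ch>e</ch><ch>i</ch><ch>o</ch><ch>u</ch></vowels>"
print(xml_extract(doc, "ch"))       # a,e,i,o,u
print(xml_extract(doc, "ch", ";"))  # a;e;i;o;u
```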
                    

DATE Functions

CURRENT_DATE()

Returns the current date value.

                  SELECT CURRENT_DATE();
                  -- Result: 2018-02-01
                

CURRENT_TIMESTAMP()

Returns the current time stamp of the database system as a datetime value. This value is equal to GETDATE and SYSDATETIME, and is always in the local timezone.

                  SELECT CURRENT_TIMESTAMP();
                  -- Result: 2018-02-01 03:04:05
                

DATEADD (datepart , integer_number , date [, dateformat])

Returns the datetime value that results from adding the specified number (a signed integer) to the specified date part of the date.

  • datepart: The part of the date to add the specified number to. The valid values and abbreviations are year (yy, yyyy), quarter (qq, q), month (mm, m), dayofyear (dy, y), day (dd, d), week (wk, ww), weekday (dw), hour (hh), minute (mi, n), second (ss, s), and millisecond (ms).
  • number: The number to be added.
  • date: The expression of the datetime data type.
  • dateformat: The optional output date format.

                  SELECT DATEADD('d', 5, '2018-02-01');
                  -- Result: 2018-02-06

                  SELECT DATEADD('hh', 5, '2018-02-01 00:00:00');
                  -- Result: 2018-02-01 05:00:00
                

DATEDIFF ( datepart , startdate , enddate )

Returns the difference (a signed integer) of the specified time interval between the specified start date and end date.

  • datepart: The part of the date that is the time interval of the difference between the start date and end date. The valid values and abbreviations are day (dd, d), hour (hh), minute (mi, n), second (ss, s), and millisecond (ms).
  • startdate: The datetime expression of the start date.
  • enddate: The datetime expression of the end date.

                  SELECT DATEDIFF('d', '2018-02-01', '2018-02-10');
                  -- Result: 9

                  SELECT DATEDIFF('hh', '2018-02-01 00:00:00', '2018-02-01 12:00:00');
                  -- Result: 12
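
For fixed-length date parts, the difference is the elapsed time divided by the unit. The Python below is a sketch of that arithmetic using a subset of the abbreviations; the helper name datediff is hypothetical, and flooring of partial units is an assumption that may differ from the driver.

```python
from datetime import datetime, timedelta

# Subset of the documented abbreviations, mapped to fixed-length units.
UNITS = {
    "d": timedelta(days=1),
    "hh": timedelta(hours=1),
    "mi": timedelta(minutes=1),
    "ss": timedelta(seconds=1),
    "ms": timedelta(milliseconds=1),
}

def datediff(datepart: str, startdate: str, enddate: str) -> int:
    """Whole units elapsed between two ISO-formatted date strings."""
    start = datetime.fromisoformat(startdate)
    end = datetime.fromisoformat(enddate)
    # timedelta // timedelta yields an int (floored).
    return (end - start) // UNITS[datepart]

print(datediff("d", "2018-02-01", "2018-02-10"))                     # 9
print(datediff("hh", "2018-02-01 00:00:00", "2018-02-01 12:00:00"))  # 12
```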
                

DATEFROMPARTS(integer_year, integer_month, integer_day)

Returns the datetime value for the specified year, month, and day.

  • year: The integer expression specifying the year.
  • month: The integer expression specifying the month.
  • day: The integer expression specifying the day.

                    SELECT DATEFROMPARTS(2018, 2, 1);
                    -- Result: 2018-02-01
                  

DATENAME(datepart , date)

Returns the character string that represents the specified date part of the specified date.

  • datepart: The part of the date to return. The valid values and abbreviations are year (yy, yyyy), quarter (qq, q), month (mm, m), dayofyear (dy, y), day (dd, d), week (wk, ww), weekday (dw), hour (hh), minute (mi, n), second (ss, s), millisecond (ms), microsecond (mcs), nanosecond (ns), and TZoffset (tz).
  • date: The datetime expression.

                     SELECT DATENAME('yy', '2018-02-01');
                     -- Result: '2018'

                     SELECT DATENAME('dw', '2018-02-01');
                     -- Result: 'Thursday'
                   

DATEPART(datepart, date [,integer_datefirst])

Returns an integer that represents the specified date part of the specified date.

  • datepart: The part of the date to return. The valid values and abbreviations are year (yy, yyyy), quarter (qq, q), month (mm, m), dayofyear (dy, y), day (dd, d), week (wk, ww), weekday (dw), hour (hh), minute (mi, n), second (ss, s), millisecond (ms), microsecond (mcs), nanosecond (ns), TZoffset (tz), ISODOW, ISO_WEEK (isoweek, isowk,isoww), and ISOYEAR.
  • date: The datetime string.
  • datefirst: The optional integer representing the first day of the week. The default is 7, Sunday.

                    SELECT DATEPART('yy', '2018-02-01');
                    -- Result: 2018

                    SELECT DATEPART('dw', '2018-02-01');
                    -- Result: 5
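
The effect of datefirst on the weekday number can be sketched with modular arithmetic. This assumes datefirst is numbered 1 (Monday) through 7 (Sunday), which is consistent with the stated default of 7 for Sunday but is otherwise an assumption; the helper name weekday_number is hypothetical.

```python
from datetime import date

def weekday_number(day: date, datefirst: int = 7) -> int:
    """1-based weekday number, counting from the datefirst day of the week."""
    # isoweekday() numbers Monday=1 .. Sunday=7, the same scale as datefirst.
    return (day.isoweekday() - datefirst) % 7 + 1

print(weekday_number(date(2018, 2, 1)))     # 5 (Thursday, week starting Sunday)
print(weekday_number(date(2018, 2, 1), 1))  # 4 (Thursday, week starting Monday)
```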
                  

DATETIME2FROMPARTS(integer_year, integer_month, integer_day, integer_hour, integer_minute, integer_seconds, integer_fractions, integer_precision)

Returns the datetime value for the specified date parts.

  • year: The integer expression specifying the year.
  • month: The integer expression specifying the month.
  • day: The integer expression specifying the day.
  • hour: The integer expression specifying the hour.
  • minute: The integer expression specifying the minute.
  • seconds: The integer expression specifying the seconds.
  • fractions: The integer expression specifying the fractions of the second.
  • precision: The integer expression specifying the precision of the fraction.

                    SELECT DATETIME2FROMPARTS(2018, 2, 1, 1, 2, 3, 456, 3);
                    -- Result: 2018-02-01 01:02:03.456
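
The fractions and precision arguments work together: precision gives the number of decimal digits that fractions represents. A Python sketch of that scaling, assuming precision is limited to 0 through 6 because Python's datetime stores microseconds (the helper name datetime2_from_parts is hypothetical):

```python
from datetime import datetime

def datetime2_from_parts(year, month, day, hour, minute, seconds,
                         fractions, precision) -> datetime:
    """Build a datetime where fractions holds precision decimal digits of a second."""
    if not 0 <= precision <= 6:
        raise ValueError("this sketch supports precision 0 through 6 only")
    if not 0 <= fractions < 10 ** precision:
        raise ValueError("fractions does not fit in the given precision")
    # Scale the fraction to microseconds: 456 at precision 3 -> 456000.
    microseconds = fractions * 10 ** (6 - precision)
    return datetime(year, month, day, hour, minute, seconds, microseconds)

value = datetime2_from_parts(2018, 2, 1, 1, 2, 3, 456, 3)
print(value.isoformat(sep=" ", timespec="milliseconds"))  # 2018-02-01 01:02:03.456
```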
                  

DATETIMEFROMPARTS(integer_year, integer_month, integer_day, integer_hour, integer_minute, integer_seconds, integer_milliseconds)

Returns the datetime value for the specified date parts.

  • year: The integer expression specifying the year.
  • month: The integer expression specifying the month.
  • day: The integer expression specifying the day.
  • hour: The integer expression specifying the hour.
  • minute: The integer expression specifying the minute.
  • seconds: The integer expression specifying the seconds.
  • milliseconds: The integer expression specifying the milliseconds.

                    SELECT DATETIMEFROMPARTS(2018, 2, 1, 1, 2, 3, 456);
                    -- Result: 2018-02-01 01:02:03.456
                  

DATE_TRUNC(date, datepart)

Truncates the date to the precision of the given date part. Modeled after the Oracle TRUNC function.

  • date: The datetime string that specifies the date.
  • datepart: Refer to the Oracle documentation for valid datepart syntax.

				    SELECT DATE_TRUNC('05-04-2005', 'YY');
                    -- Result: '1/1/2005'
					
                    SELECT DATE_TRUNC('05-04-2005', 'MM');
                    -- Result: '5/1/2005'                    
                  

DATE_TRUNC2(datepart, date, [weekday])

Truncates the date to the precision of the given date part. Modeled after the PostgreSQL date_trunc function.

  • datepart: One of 'millennium', 'century', 'decade', 'year', 'quarter', 'month', 'week', 'day', 'hour', 'minute' or 'second'.
  • date: The datetime string that specifies the date.
  • weekday: The optional day of the week to use as the first day for 'week'. One of 'sunday', 'monday', etc.

                    SELECT DATE_TRUNC2('year', '2020-02-04');
                    -- Result: '2020-01-01'

                    SELECT DATE_TRUNC2('week', '2020-02-04', 'monday');
                    -- Result: '2020-02-03', which is the previous Monday
                  

DAY(date)

Returns the integer that specifies the day component of the specified date.

  • date: The datetime string that specifies the date.

                    SELECT DAY('2018-02-01');
                    -- Result: 1
                  

DAYOFMONTH(date)

Returns the day of the month of the given date.

  • date: The datetime string that specifies the date.

				  SELECT DAYOFMONTH('04/15/2000');
				  -- Result: 15
				  

DAYOFWEEK(date)

Returns the day of the week of the given date.

  • date: The datetime string that specifies the date.

				  SELECT DAYOFWEEK('04/15/2000');
				  -- Result: 7
				  

DAYOFYEAR(date)

Returns the day of the year of the given date.

  • date: The datetime string that specifies the date.

				  SELECT DAYOFYEAR('04/15/2000');
				  -- Result: 106
				  

EOMONTH(date [, integer_month_to_add ]) or LAST_DAY(date)

Returns the last day of the month that contains the specified date with an optional offset.

  • date: The datetime expression specifying the date for which to return the last day of the month.
  • integer_month_to_add: The optional integer expression specifying the number of months to add to the date before calculating the end of the month.

                  SELECT EOMONTH('2018-02-01');
                  -- Result: 2018-02-28
                  
                  SELECT LAST_DAY('2018-02-01');
                  -- Result: 2018-02-28

                  SELECT EOMONTH('2018-02-01', 2);
                  -- Result: 2018-04-30
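
The month offset is applied before the end of the month is calculated. A small Python sketch of that order of operations (the helper name eomonth is hypothetical):

```python
import calendar
from datetime import date

def eomonth(day: date, months_to_add: int = 0) -> date:
    """Last day of the month that lies months_to_add months after day's month."""
    # Count months from year 0 so the offset can roll over year boundaries.
    month_index = day.year * 12 + (day.month - 1) + months_to_add
    year, month0 = divmod(month_index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, last_day)

print(eomonth(date(2018, 2, 1)))     # 2018-02-28
print(eomonth(date(2018, 2, 1), 2))  # 2018-04-30
```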
                

FDWEEK(date)

Returns the first day of the week of the given date.

  • date: The datetime string that specifies the date.

				  SELECT FDWEEK('02-08-2018');
				  -- Result: 2/4/2018
				  

FDMONTH(date)

Returns the first day of the month of the given date.

  • date: The datetime string that specifies the date.

				  SELECT FDMONTH('02-08-2018');
				  -- Result: 2/1/2018
				  

FDQUARTER(date)

Returns the first day of the quarter of the given date.

  • date: The datetime string that specifies the date.

				  SELECT FDQUARTER('05-08-2018');
				  -- Result: 4/1/2018
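
The three first-day functions can be sketched in Python as follows. The helper names are hypothetical, and treating Sunday as the first day of the week is an assumption inferred from the FDWEEK example.

```python
from datetime import date, timedelta

def fdweek(day: date) -> date:
    """First day of the week, assuming weeks start on Sunday."""
    # isoweekday() % 7 gives Sunday=0, Monday=1, ..., Saturday=6.
    return day - timedelta(days=day.isoweekday() % 7)

def fdmonth(day: date) -> date:
    """First day of the month."""
    return day.replace(day=1)

def fdquarter(day: date) -> date:
    """First day of the calendar quarter (January, April, July, October)."""
    first_month = 3 * ((day.month - 1) // 3) + 1
    return date(day.year, first_month, 1)

print(fdweek(date(2018, 2, 8)))     # 2018-02-04
print(fdmonth(date(2018, 2, 8)))    # 2018-02-01
print(fdquarter(date(2018, 5, 8)))  # 2018-04-01
```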
				  

FILEMODIFIEDTIME(uri)

Returns the time stamp associated with the Date Modified of the relevant file.

  • uri: An absolute path pointing to a file on the local file system.

				 SELECT FILEMODIFIEDTIME('C:/Documents/myfile.txt');
				 -- Result: 6/25/2019 10:06:58 AM
				 

FROM_DAYS(datevalue)

Returns a date derived from the number of days after 1582-10-15 (based upon the Gregorian calendar). This is equivalent to the MySQL FROM_DAYS function.

  • datevalue: An integer value representing the number of days since 1582-10-15.

				SELECT FROM_DAYS(736000);
				-- Result: 2/6/2015
				

GETDATE()

Returns the current time stamp of the database system as a datetime value. This value is equal to CURRENT_TIMESTAMP and SYSDATETIME, and is always in the local timezone.

                  SELECT GETDATE();
                  -- Result: 2018-02-01 03:04:05
                

GETUTCDATE()

Returns the current time stamp of the database system formatted as a UTC datetime value. This value is equal to SYSUTCDATETIME.

                  SELECT GETUTCDATE();
                  -- For example, if the local time is 2018-02-01 03:04:05 in Eastern European Time (GMT+2)
                  -- Result: 2018-02-01 01:04:05
                

HOUR(date)

Returns the hour component from the provided datetime.

  • date: The datetime string that specifies the date.

				SELECT HOUR('02-02-2020 11:30:00');
				-- Result: 11
				

ISDATE(date, [date_format])

Returns 1 if the value is a valid date, time, or datetime value; otherwise, 0.

  • date: The datetime string.
  • date_format: The optional datetime format.

                      SELECT ISDATE('2018-02-01', 'yyyy-MM-dd');
                      -- Result: 1

                      SELECT ISDATE('Not a date');
                      -- Result: 0
                    

LAST_WEEK()

Returns a time stamp equivalent to exactly one week before the current date.

				SELECT LAST_WEEK();	-- Assume the date is 3/17/2020
				-- Result: 3/10/2020
				

LAST_MONTH()

Returns a time stamp equivalent to exactly one month before the current date.

				SELECT LAST_MONTH();	-- Assume the date is 3/17/2020
				-- Result: 2/17/2020
				

LAST_YEAR()

Returns a time stamp equivalent to exactly one year before the current date.

				SELECT LAST_YEAR();	-- Assume the date is 3/17/2020
				-- Result: 3/17/2019
				

LDWEEK(date)

Returns the last day of the provided week.

  • date: The datetime string.

				SELECT LDWEEK('02-02-2020');
				-- Result: 2/8/2020
				

LDMONTH(date)

Returns the last day of the provided month.

  • date: The datetime string.

				SELECT LDMONTH('02-02-2020');
				-- Result: 2/29/2020
				

LDQUARTER(date)

Returns the last day of the provided quarter.

  • date: The datetime string.

				SELECT LDQUARTER('02-02-2020');
				-- Result: 3/31/2020
				

MAKEDATE(year, days)

Returns a date value from a year and a number of days.

  • year: The year.
  • days: The number of days into the year. Value must be greater than 0.

          SELECT MAKEDATE(2020, 1);
          -- Result: 2020-01-01
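
The day count is 1-based, so day 1 is January 1 of the given year. A Python sketch, assuming a non-positive day count yields null (the helper name makedate is hypothetical):

```python
from datetime import date, timedelta

def makedate(year: int, days: int):
    """Date that is the days-th day of year; day 1 is January 1."""
    if days <= 0:
        return None  # assumption: invalid day counts yield no value
    return date(year, 1, 1) + timedelta(days=days - 1)

print(makedate(2020, 1))   # 2020-01-01
print(makedate(2020, 60))  # 2020-02-29
```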
        

MINUTE(date)

Returns the minute component from the provided datetime.

  • date: The datetime string that specifies the date.

				SELECT MINUTE('02-02-2020 11:15:00');
				-- Result: 15
				

MONTH(date)

Returns the month component from the provided datetime.

  • date: The datetime string that specifies the date.

				SELECT MONTH('02-02-2020');
				-- Result: 2
				

QUARTER(date)

Returns the quarter associated with the provided datetime.

  • date: The datetime string that specifies the date.

				SELECT QUARTER('02-02-2020');
				-- Result: 1
				

SECOND(date)

Returns the second component from the provided datetime.

  • date: The datetime string that specifies the date.

				SELECT SECOND('02-02-2020 11:15:23');
				-- Result: 23
				

SMALLDATETIMEFROMPARTS(integer_year, integer_month, integer_day, integer_hour, integer_minute)

Returns the datetime value for the specified date and time.

  • year: The integer expression specifying the year.
  • month: The integer expression specifying the month.
  • day: The integer expression specifying the day.
  • hour: The integer expression specifying the hour.
  • minute: The integer expression specifying the minute.

                      SELECT SMALLDATETIMEFROMPARTS(2018, 2, 1, 1, 2);
                      -- Result: 2018-02-01 01:02:00
                    

STRTODATE(string,format)

Parses the provided string value and returns the corresponding datetime.

  • string: The string value to be converted to datetime format.
  • format: A format string which describes how to interpret the first string input. A few special formats are available as well, including UNIX, UNIXMILIS, TICKS, and FILETICKS.

				SELECT STRTODATE('03*04*2020','dd*MM*yyyy');
				-- Result: 4/3/2020
				

SYSDATETIME()

Returns the current timestamp of the database system as a datetime value. It is equivalent to GETDATE and CURRENT_TIMESTAMP, and is always in the local timezone.

                  SELECT SYSDATETIME();
                  -- Result: 2018-02-01 03:04:05
                

SYSUTCDATETIME()

Returns the current system date and time as a UTC datetime value. It is equal to GETUTCDATE.

                  SELECT SYSUTCDATETIME();
                  -- For example, if the local time is 2018-02-01 03:04:05 in Eastern European Time (GMT+2)
                  -- Result: 2018-02-01 01:04:05
                

TIMEFROMPARTS(integer_hour, integer_minute, integer_seconds, integer_fractions, integer_precision)

Returns the time value for the specified time and with the specified precision.

  • hour: The integer expression specifying the hour.
  • minute: The integer expression specifying the minute.
  • seconds: The integer expression specifying the seconds.
  • fractions: The integer expression specifying the fractions of the second.
  • precision: The integer expression specifying the precision of the fraction.

                      SELECT TIMEFROMPARTS(1, 2, 3, 456, 3);
                      -- Result: 01:02:03.456
                    

TO_DAYS(date)

Returns the number of days since 0000-00-01. A value is returned only for dates on or after 1582-10-15 (based upon the Gregorian calendar). This is equivalent to the MySQL TO_DAYS function.

  • date: The datetime string that specifies the date.

				SELECT TO_DAYS('02-06-2015');
				-- Result: 736000
				

WEEK(date)

Returns the week (of the year) associated with the provided datetime.

  • date: The datetime string that specifies the date.

				SELECT WEEK('02-17-2020 11:15:23');
				-- Result: 8
				

YEAR(date)

Returns the integer that specifies the year of the specified date.

  • date: The datetime string.

                      SELECT YEAR('2018-02-01');
                      -- Result: 2018
                    

MATH Functions

ABS ( numeric_expression )

Returns the absolute (positive) value of the specified numeric expression.

  • numeric_expression: The expression of an indeterminate numeric data type except for the bit data type.

                      SELECT ABS(15);
                      -- Result: 15

                      SELECT ABS(-15);
                      -- Result: 15
                    

ACOS ( float_expression )

Returns the arc cosine, the angle in radians whose cosine is the specified float expression.

  • float_expression: The float expression that specifies the cosine of the angle to be returned. Values outside the range from -1 to 1 return null.

                      SELECT ACOS(0.5);
                      -- Result: 1.0471975511966
                    

ASIN ( float_expression )

Returns the arc sine, the angle in radians whose sine is the specified float expression.

  • float_expression: The float expression that specifies the sine of the angle to be returned. Values outside the range from -1 to 1 return null.

                      SELECT ASIN(0.5);
                      -- Result: 0.523598775598299
                    

ATAN ( float_expression )

Returns the arc tangent, the angle in radians whose tangent is the specified float expression.

  • float_expression: The float expression that specifies the tangent of the angle to be returned.

                      SELECT ATAN(10);
                      -- Result: 1.47112767430373
                    

ATN2 ( float_expression1 , float_expression2 )

Returns the angle in radians between the positive x-axis and the ray from the origin to the point (y, x) where x and y are the values of the two specified float expressions.

  • float_expression1: The float expression that is the y-coordinate.
  • float_expression2: The float expression that is the x-coordinate.

                      SELECT ATN2(1, 1);
                      -- Result: 0.785398163397448
                    

CEILING ( numeric_expression ) or CEIL( numeric_expression )

Returns the smallest integer greater than or equal to the specified numeric expression.

  • numeric_expression: The expression of an indeterminate numeric data type except for the bit data type.

                      SELECT CEILING(1.3);
                      -- Result: 2

                      SELECT CEILING(1.5);
                      -- Result: 2

                      SELECT CEILING(1.7);
                      -- Result: 2
                    

COS ( float_expression )

Returns the trigonometric cosine of the specified angle in radians in the specified expression.

  • float_expression: The float expression of the specified angle in radians.

                      SELECT COS(1);
                      -- Result: 0.54030230586814
                    

COT ( float_expression )

Returns the trigonometric cotangent of the angle in radians specified by float_expression.

  • float_expression: The float expression of the angle in radians.

                      SELECT COT(1);
                      -- Result: 0.642092615934331
                    

DEGREES ( numeric_expression )

Returns the angle in degrees for the angle specified in radians.

  • numeric_expression: The angle in radians, an expression of an indeterminate numeric data type except for the bit data type.

                      SELECT DEGREES(3.1415926);
                      -- Result: 179.999996929531
                    

EXP ( float_expression )

Returns the exponential value of the specified float expression. For example, EXP(LOG(20)) is 20.

  • float_expression: The float expression.

                      SELECT EXP(2);
                      -- Result: 7.38905609893065
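
The inverse relationship mentioned in the description can be checked directly. This sketch assumes function calls can be nested in this way; no result is shown because the displayed value may differ slightly from 20 due to floating-point rounding:

```sql
-- EXP and LOG (natural logarithm) are inverses, so this is mathematically 20.
SELECT EXP(LOG(20));
```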
                    

EXPR ( expression )

Evaluates the expression.

  • expression: The expression. Operators allowed are +, -, *, /, ==, !=, >, <, >=, and <=.

                      SELECT EXPR('1 + 2 * 3');
                      -- Result: 7

                      SELECT EXPR('1 + 2 * 3 == 7');
                      -- Result: true
                    

FLOOR ( numeric_expression )

Returns the largest integer less than or equal to the numeric expression.

  • numeric_expression: The expression of an indeterminate numeric data type except for the bit data type.

                      SELECT FLOOR(1.3);
                      -- Result: 1

                      SELECT FLOOR(1.5);
                      -- Result: 1

                      SELECT FLOOR(1.7);
                      -- Result: 1
                    

GREATEST(int1,int2,....)

Returns the greatest of the supplied integers.

				SELECT GREATEST(3,5,8,10,1);
				-- Result: 10
				

HEX(value)

Returns the hexadecimal equivalent of the input value.

  • value: A string or numerical value to be converted into hex.

				SELECT HEX(866849198);
				-- Result: 33AB11AE
				
				SELECT HEX('Sample Text');
				-- Result: 53616D706C652054657874
				

LEAST(int1,int2,....)

Returns the least of the supplied integers.

				SELECT LEAST(3,5,8,10,1);
				-- Result: 1
				

LOG ( float_expression [, base ] )

Returns the natural logarithm of the specified float expression, or its logarithm in the specified base when the optional base argument is supplied.

  • float_expression: The float expression.
  • base: The optional integer argument that sets the base for the logarithm.

                      SELECT LOG(7.3890560);
                      -- Result: 1.99999998661119
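
The optional base argument can be sketched as follows. This assumes the second argument is the base, as the signature LOG ( float_expression [, base ] ) suggests. Mathematically, the base-2 logarithm of 8 is 3; the exact formatting of the returned value is not confirmed here:

```sql
-- Assumption: the second argument sets the base of the logarithm.
SELECT LOG(8, 2);
```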
                    

LOG10 ( float_expression )

Returns the base-10 logarithm of the specified float expression.

  • float_expression: The expression of type float.

                      SELECT LOG10(10000);
                      -- Result: 4
                    

MOD(dividend,divisor)

Returns the integer value associated with the remainder when dividing the dividend by the divisor.

  • dividend: The number to take the modulus of.
  • divisor: The number to divide the dividend by when determining the modulus.

				SELECT MOD(10,3);
				-- Result: 1
				

NEGATE(real_number)

Returns the negation of the real number input.

  • real_number: The real number to negate.

				SELECT NEGATE(10);
				-- Result: -10
				
				SELECT NEGATE(-12.4);
				-- Result: 12.4
				

PI ( )

Returns the constant value of pi.

                  SELECT PI();
                  -- Result: 3.14159265358979
                

POWER ( float_expression , y )

Returns the value of the specified expression raised to the specified power.

  • float_expression: The float expression.
  • y: The power to raise float_expression to.

                      SELECT POWER(2, 10);
                      -- Result: 1024

                      SELECT POWER(2, -2);
                      -- Result: 0.25
                    

RADIANS ( float_expression )

Returns the angle in radians of the angle in degrees.

  • float_expression: The degrees of the angle as a float expression.

                      SELECT RADIANS(180);
                      -- Result: 3.14159265358979
                    

RAND ( [ integer_seed ] )

Returns a pseudorandom float value from 0 through 1, exclusive.

  • seed: The optional integer expression that specifies the seed value. If no seed is specified, a random seed value is assigned.

                      SELECT RAND();
                      -- This result may be different, since the seed is randomized
                      -- Result: 0.873159630165044

                      SELECT RAND(1);
                      -- This result will always be the same, since the seed is constant
                      -- Result: 0.248668584157093
                    

ROUND ( numeric_expression [ ,integer_length] [ ,function ] )

Returns the numeric value rounded to the specified length or precision.

  • numeric_expression: The expression of a numeric data type.
  • length: The optional precision to round the numeric expression to. When this is omitted, the default behavior is to round to the nearest whole number.
  • function: The optional type of operation to perform. When the function parameter is omitted or has a value of 0 (default), numeric_expression is rounded. When a value other than 0 is specified, numeric_expression is truncated.

                      SELECT ROUND(1.3, 0);
                      -- Result: 1

                      SELECT ROUND(1.55, 1);
                      -- Result: 1.6

                      SELECT ROUND(1.7, 0, 0);
                      -- Result: 2

                      SELECT ROUND(1.7, 0, 1);
                      -- Result: 1
                      
                      SELECT ROUND (1.24);
                      -- Result: 1.0
                    

SIGN ( numeric_expression )

Returns the positive sign (1), 0, or negative sign (-1) of the specified expression.

  • numeric_expression: The expression of an indeterminate data type except for the bit data type.

                      SELECT SIGN(0);
                      -- Result: 0

                      SELECT SIGN(10);
                      -- Result: 1

                      SELECT SIGN(-10);
                      -- Result: -1
                    

SIN ( float_expression )

Returns the trigonometric sine of the angle in radians.

  • float_expression: The float expression specifying the angle in radians.

                     SELECT SIN(1);
                     -- Result: 0.841470984807897
                    

SQRT ( float_expression )

Returns the square root of the specified float value.

  • float_expression: The expression of type float.

                      SELECT SQRT(100);
                      -- Result: 10
                    

SQUARE ( float_expression )

Returns the square of the specified float value.

  • float_expression: The expression of type float.

                      SELECT SQUARE(10);
                      -- Result: 100

                      SELECT SQUARE(-10);
                      -- Result: 100
                    

TAN ( float_expression )

Returns the tangent of the input expression.

  • float_expression: The expression of type float.

                      SELECT TAN(1);
                      -- Result: 1.5574077246549
                    

TRUNC(decimal_number,precision)

Returns the supplied decimal number truncated to have the supplied decimal precision.

  • decimal_number: The decimal value to truncate.
  • precision: The number of decimal places to truncate the decimal number to.

				SELECT TRUNC(10.3423,2);
				-- Result: 10.34
				

INSERT Statements

To create new records, use INSERT statements.

INSERT Syntax

The INSERT statement specifies the columns to be inserted and the new column values. You can specify the column values in a comma-separated list in the VALUES clause, as shown in the following example:

INSERT INTO <table_name> 
( <column_reference> [ , ... ] )
VALUES 
( { <expression> | NULL } [ , ... ] ) 
  

<expression> ::=
  | @ <parameter> 
  | ?
  | <literal>

The following is an example query:

INSERT INTO [CData].[Default].Customers (CompanyName) VALUES ('RSSBus Inc.')
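
The syntax above also allows a named parameter (@ <parameter>) or a positional placeholder (?) in place of a literal. A minimal sketch of both forms follows; the parameter name @name is hypothetical, and how values are bound to parameters depends on the client tool or driver in use:

```sql
-- Named parameter: @name is a placeholder whose value the client supplies at execution time.
INSERT INTO [CData].[Default].Customers (CompanyName) VALUES (@name)

-- Positional placeholder: the client binds the value by position.
INSERT INTO [CData].[Default].Customers (CompanyName) VALUES (?)
```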

CACHE Statements

EXECUTE Statements

To execute stored procedures, you can use EXECUTE or EXEC statements.

EXEC and EXECUTE assign stored procedure inputs, referenced by name, to values or parameter names.

Stored Procedure Syntax

To execute a stored procedure as an SQL statement, use the following syntax:

 
{ EXECUTE | EXEC } <stored_proc_name> 
{
  [ @ ] <input_name> = <expression>
} [ , ... ]

<expression> ::=
  | @ <parameter> 
  | ?
  | <literal>

Example Statements

Reference stored procedure inputs by name:

EXECUTE my_proc @second = 2, @first = 1, @third = 3;

Execute a parameterized stored procedure statement:

EXECUTE my_proc second = @p1, first = @p2, third = @p3; 

PIVOT and UNPIVOT

PIVOT and UNPIVOT can be used to change a table-valued expression into another table.

PIVOT

PIVOT rotates a table-valued expression by turning unique values from one column into multiple columns in the output. PIVOT can run aggregations where required on any column value.

PIVOT Syntax

SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days, [0], [1], [2], [3], [4]
FROM
(
SELECT DaysToManufacture, StandardCost
FROM Production.Product
) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;
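
To illustrate what the PIVOT query computes, the same result shape can be written with conditional aggregation. This is a sketch for explanation only: it assumes CASE expressions are accepted inside AVG, and it is not presented as how the Cloud evaluates PIVOT internally.

```sql
-- Each output column averages StandardCost over the rows with one DaysToManufacture value.
-- A CASE with no ELSE yields NULL for non-matching rows, and AVG ignores NULLs.
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days,
       AVG(CASE WHEN DaysToManufacture = 0 THEN StandardCost END) AS [0],
       AVG(CASE WHEN DaysToManufacture = 1 THEN StandardCost END) AS [1],
       AVG(CASE WHEN DaysToManufacture = 2 THEN StandardCost END) AS [2],
       AVG(CASE WHEN DaysToManufacture = 3 THEN StandardCost END) AS [3],
       AVG(CASE WHEN DaysToManufacture = 4 THEN StandardCost END) AS [4]
FROM Production.Product;
```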

UNPIVOT

UNPIVOT performs nearly the opposite operation to PIVOT by rotating the columns of a table-valued expression into column values.

UNPIVOT Syntax

SELECT VendorID, Employee, Orders
FROM
(SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
FROM pvt) p
UNPIVOT
(Orders FOR Employee IN
(Emp1, Emp2, Emp3, Emp4, Emp5)
)AS unpvt;
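
For comparison, the effect of the UNPIVOT query can be sketched with UNION ALL, one branch per source column (only the first three are written out here). In Transact-SQL, UNPIVOT omits rows whose value is NULL, which the IS NOT NULL filters mirror; whether the Cloud behaves identically is an assumption, and row order may differ.

```sql
-- Each branch turns one Emp column into (VendorID, Employee, Orders) rows.
SELECT VendorID, 'Emp1' AS Employee, Emp1 AS Orders FROM pvt WHERE Emp1 IS NOT NULL
UNION ALL
SELECT VendorID, 'Emp2' AS Employee, Emp2 AS Orders FROM pvt WHERE Emp2 IS NOT NULL
UNION ALL
SELECT VendorID, 'Emp3' AS Employee, Emp3 AS Orders FROM pvt WHERE Emp3 IS NOT NULL;
-- Emp4 and Emp5 follow the same pattern.
```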

For further information on PIVOT and UNPIVOT, see the Transact-SQL reference topic "FROM clause plus JOIN, APPLY, PIVOT (Transact-SQL)".

Data Model

The Cloud models Apache Impala instances as relational databases. The Cloud leverages the Impala Server Thrift API to enable bidirectional access to Apache Impala data through SQL. Impala Server versions 2.2.0 and above are supported.

Discovering Schemas

The CData Cloud dynamically obtains the Apache Impala schemas. Reconnect to load any changes in the metadata, such as added or removed columns or changes in data type.

System Tables

You can query the system tables described in this section to access schema information, information on data source functionality, and batch operation statistics.

Schema Tables

The following tables return database metadata for Apache Impala:

  • sys_catalogs: Lists the available databases.
  • sys_schemas: Lists the available schemas.
  • sys_tables: Lists the available tables and views.
  • sys_tablecolumns: Describes the columns of the available tables and views.
  • sys_procedures: Describes the available stored procedures.
  • sys_procedureparameters: Describes stored procedure parameters.
  • sys_keycolumns: Describes the primary and foreign keys.
  • sys_indexes: Describes the available indexes.

Data Source Tables

The following tables return information about how to connect to and query the data source:

  • sys_connection_props: Returns information on the available connection properties.
  • sys_sqlinfo: Describes the SELECT queries that the Cloud can offload to the data source.

Query Information Tables

The following table returns query statistics for data modification queries, including batch operations:

  • sys_identity: Returns information about batch operations or single updates.

sys_catalogs

Lists the available databases.

The following query retrieves all databases determined by the connection string:

SELECT * FROM sys_catalogs

Columns

Name Type Description
CatalogName String The database name.

sys_schemas

Lists the available schemas.

The following query retrieves all available schemas:

          SELECT * FROM sys_schemas
          

Columns

Name Type Description
CatalogName String The database name.
SchemaName String The schema name.

sys_tables

Lists the available tables and views.

The following query retrieves the available tables and views:

          SELECT * FROM sys_tables
          

Columns

Name Type Description
CatalogName String The database containing the table or view.
SchemaName String The schema containing the table or view.
TableName String The name of the table or view.
TableType String The table type (table or view).
Description String A description of the table or view.
IsUpdateable Boolean Whether the table can be updated.
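
As a narrower variant of the query above, the following sketch restricts the listing to one database and schema and returns only the columns that identify each object and whether it can be updated. The CData catalog and Default schema are the sample names used elsewhere in this guide:

```sql
-- List tables and views in the sample [CData].[Default] schema.
SELECT TableName, TableType, IsUpdateable
FROM sys_tables
WHERE CatalogName='CData' AND SchemaName='Default'
```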

sys_tablecolumns

Describes the columns of the available tables and views.

The following query returns the columns and data types for the [CData].[Default].Customers table:

SELECT ColumnName, DataTypeName FROM sys_tablecolumns WHERE TableName='Customers' AND CatalogName='CData' AND SchemaName='Default'

Columns

Name Type Description
CatalogName String The name of the database containing the table or view.
SchemaName String The schema containing the table or view.
TableName String The name of the table or view containing the column.
ColumnName String The column name.
DataTypeName String The data type name.
DataType Int32 An integer indicating the data type. This value is determined at run time based on the environment.
Length Int32 The storage size of the column.
DisplaySize Int32 The designated column's normal maximum width in characters.
NumericPrecision Int32 The maximum number of digits in numeric data. The column length in characters for character and date-time data.
NumericScale Int32 The column scale or number of digits to the right of the decimal point.
IsNullable Boolean Whether the column can contain null.
Description String A brief description of the column.
Ordinal Int32 The sequence number of the column.
IsAutoIncrement String Whether the column value is assigned in fixed increments.
IsGeneratedColumn String Whether the column is generated.
IsHidden Boolean Whether the column is hidden.
IsArray Boolean Whether the column is an array.
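
The Ordinal column can be used to return columns in their defined order. This sketch assumes ORDER BY is accepted on system tables:

```sql
-- Columns of the sample Customers table, in definition order, with nullability.
SELECT ColumnName, DataTypeName, IsNullable
FROM sys_tablecolumns
WHERE TableName='Customers' AND CatalogName='CData' AND SchemaName='Default'
ORDER BY Ordinal
```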

sys_procedures

Lists the available stored procedures.

The following query retrieves the available stored procedures:

          SELECT * FROM sys_procedures
          

Columns

Name Type Description
CatalogName String The database containing the stored procedure.
SchemaName String The schema containing the stored procedure.
ProcedureName String The name of the stored procedure.
Description String A description of the stored procedure.
ProcedureType String The type of the procedure, such as PROCEDURE or FUNCTION.

sys_procedureparameters

Describes stored procedure parameters.

The following query returns information about all of the input parameters for the SearchSuppliers stored procedure:

SELECT * FROM sys_procedureparameters WHERE ProcedureName='SearchSuppliers' AND (Direction=1 OR Direction=2)

Columns

Name Type Description
CatalogName String The name of the database containing the stored procedure.
SchemaName String The name of the schema containing the stored procedure.
ProcedureName String The name of the stored procedure containing the parameter.
ColumnName String The name of the stored procedure parameter.
Direction Int32 An integer corresponding to the type of the parameter: input (1), input/output (2), or output (4). Input/output parameters can be used as both input and output parameters.
DataTypeName String The name of the data type.
DataType Int32 An integer indicating the data type. This value is determined at run time based on the environment.
Length Int32 The number of characters allowed for character data. The number of digits allowed for numeric data.
NumericPrecision Int32 The maximum precision for numeric data. The column length in characters for character and date-time data.
NumericScale Int32 The number of digits to the right of the decimal point in numeric data.
IsNullable Boolean Whether the parameter can contain null.
IsRequired Boolean Whether the parameter is required for execution of the procedure.
IsArray Boolean Whether the parameter is an array.
Description String The description of the parameter.
Ordinal Int32 The index of the parameter.

sys_keycolumns

Describes the primary and foreign keys.

The following query retrieves the primary key for the [CData].[Default].Customers table:

         SELECT * FROM sys_keycolumns WHERE IsKey='True' AND TableName='Customers' AND CatalogName='CData' AND SchemaName='Default'
          

Columns

Name Type Description
CatalogName String The name of the database containing the key.
SchemaName String The name of the schema containing the key.
TableName String The name of the table containing the key.
ColumnName String The name of the key column.
IsKey Boolean Whether the column is a primary key in the table referenced in the TableName field.
IsForeignKey Boolean Whether the column is a foreign key referenced in the TableName field.
PrimaryKeyName String The name of the primary key.
ForeignKeyName String The name of the foreign key.
ReferencedCatalogName String The database containing the primary key.
ReferencedSchemaName String The schema containing the primary key.
ReferencedTableName String The table containing the primary key.
ReferencedColumnName String The column name of the primary key.

sys_foreignkeys

Describes the foreign keys.

The following query retrieves all foreign keys which refer to other tables:

         SELECT * FROM sys_foreignkeys WHERE ForeignKeyType = 'FOREIGNKEY_TYPE_IMPORT'
          

Columns

Name Type Description
CatalogName String The name of the database containing the key.
SchemaName String The name of the schema containing the key.
TableName String The name of the table containing the key.
ColumnName String The name of the key column.
PrimaryKeyName String The name of the primary key.
ForeignKeyName String The name of the foreign key.
ReferencedCatalogName String The database containing the primary key.
ReferencedSchemaName String The schema containing the primary key.
ReferencedTableName String The table containing the primary key.
ReferencedColumnName String The column name of the primary key.
ForeignKeyType String Designates whether the foreign key is an import (points to other tables) or export (referenced from other tables) key.

sys_primarykeys

Describes the primary keys.

The following query retrieves the primary keys from all tables and views:

         SELECT * FROM sys_primarykeys
          

Columns

Name Type Description
CatalogName String The name of the database containing the key.
SchemaName String The name of the schema containing the key.
TableName String The name of the table containing the key.
ColumnName String The name of the key column.
KeySeq String The sequence number of the primary key.
KeyName String The name of the primary key.

sys_indexes

Describes the available indexes. By filtering on indexes, you can write more selective queries with faster query response times.

The following query retrieves all indexes that are not primary keys:

          SELECT * FROM sys_indexes WHERE IsPrimary='false'
          

Columns

Name Type Description
CatalogName String The name of the database containing the index.
SchemaName String The name of the schema containing the index.
TableName String The name of the table containing the index.
IndexName String The index name.
ColumnName String The name of the column associated with the index.
IsUnique Boolean True if the index is unique. False otherwise.
IsPrimary Boolean True if the index is a primary key. False otherwise.
Type Int16 An integer value corresponding to the index type: statistic (0), clustered (1), hashed (2), or other (3).
SortOrder String The sort order: A for ascending or D for descending.
OrdinalPosition Int16 The sequence number of the column in the index.

sys_connection_props

Returns information on the available connection properties and those set in the connection string.

When querying this table, use the config connection string:

jdbc:cdata:apacheimpala:config:

This connection string enables you to query this table without a valid connection.

The following query retrieves all connection properties that have been set in the connection string or set through a default value:

SELECT * FROM sys_connection_props WHERE Value <> ''

Columns

Name Type Description
Name String The name of the connection property.
ShortDescription String A brief description.
Type String The data type of the connection property.
Default String The default value if one is not explicitly set.
Values String A comma-separated list of possible values. A validation error is thrown if another value is specified.
Value String The value you set or a preconfigured default.
Required Boolean Whether the property is required to connect.
Category String The category of the connection property.
IsSessionProperty String Whether the property is a session property, used to save information about the current connection.
Sensitivity String The sensitivity level of the property. This informs whether the property is obfuscated in logging and authentication forms.
PropertyName String A camel-cased truncated form of the connection property name.
Ordinal Int32 The index of the parameter.
CatOrdinal Int32 The index of the parameter category.
Hierarchy String Shows associated dependent properties that need to be set alongside this one.
Visible Boolean Informs whether the property is visible in the connection UI.
ETC String Various miscellaneous information about the property.
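
A sketch of using the columns above to list only the properties that must be set before connecting. It assumes Boolean columns can be compared with 'true' (following the IsPrimary='false' style used for sys_indexes) and that square brackets can quote the Default column name in case it is treated as a reserved word:

```sql
-- Required connection properties with their descriptions and default values.
SELECT Name, ShortDescription, [Default]
FROM sys_connection_props
WHERE Required='true'
```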

sys_sqlinfo

Describes the SELECT query processing that the Cloud can offload to the data source.

See SQL Compliance for SQL syntax details.

Discovering the Data Source's SELECT Capabilities

Below is an example data set of SQL capabilities. The following result set indicates the SELECT functionality that the Cloud can offload to the data source or process client side. Your data source may support additional SQL syntax. Some aspects of SELECT functionality are returned in a comma-separated list if supported; otherwise, the column contains NO.

Name | Description | Possible Values
AGGREGATE_FUNCTIONS | Supported aggregation functions. | AVG, COUNT, MAX, MIN, SUM, DISTINCT
COUNT | Whether the COUNT function is supported. | YES, NO
IDENTIFIER_QUOTE_OPEN_CHAR | The opening character used to escape an identifier. | [
IDENTIFIER_QUOTE_CLOSE_CHAR | The closing character used to escape an identifier. | ]
SUPPORTED_OPERATORS | A list of supported SQL operators. | =, >, <, >=, <=, <>, !=, LIKE, NOT LIKE, IN, NOT IN, IS NULL, IS NOT NULL, AND, OR
GROUP_BY | Whether GROUP BY is supported, and, if so, the degree of support. | NO, NO_RELATION, EQUALS_SELECT, SQL_GB_COLLATE
OJ_CAPABILITIES | The supported varieties of outer joins. | NO, LEFT, RIGHT, FULL, INNER, NOT_ORDERED, ALL_COMPARISON_OPS
OUTER_JOINS | Whether outer joins are supported. | YES, NO
SUBQUERIES | Whether subqueries are supported, and, if so, the degree of support. | NO, COMPARISON, EXISTS, IN, CORRELATED_SUBQUERIES, QUANTIFIED
STRING_FUNCTIONS | Supported string functions. | LENGTH, CHAR, LOCATE, REPLACE, SUBSTRING, RTRIM, LTRIM, RIGHT, LEFT, UCASE, SPACE, SOUNDEX, LCASE, CONCAT, ASCII, REPEAT, OCTET, BIT, POSITION, INSERT, TRIM, UPPER, REGEXP, LOWER, DIFFERENCE, CHARACTER, SUBSTR, STR, REVERSE, PLAN, UUIDTOSTR, TRANSLATE, TRAILING, TO, STUFF, STRTOUUID, STRING, SPLIT, SORTKEY, SIMILAR, REPLICATE, PATINDEX, LPAD, LEN, LEADING, KEY, INSTR, INSERTSTR, HTML, GRAPHICAL, CONVERT, COLLATION, CHARINDEX, BYTE
NUMERIC_FUNCTIONS | Supported numeric functions. | ABS, ACOS, ASIN, ATAN, ATAN2, CEILING, COS, COT, EXP, FLOOR, LOG, MOD, SIGN, SIN, SQRT, TAN, PI, RAND, DEGREES, LOG10, POWER, RADIANS, ROUND, TRUNCATE
TIMEDATE_FUNCTIONS | Supported date/time functions. | NOW, CURDATE, DAYOFMONTH, DAYOFWEEK, DAYOFYEAR, MONTH, QUARTER, WEEK, YEAR, CURTIME, HOUR, MINUTE, SECOND, TIMESTAMPADD, TIMESTAMPDIFF, DAYNAME, MONTHNAME, CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, EXTRACT
REPLICATION_SKIP_TABLES | Indicates tables skipped during replication. |
REPLICATION_TIMECHECK_COLUMNS | A string array containing a list of columns to check (in the given order) for use as the modified column during replication. |
IDENTIFIER_PATTERN | String value indicating what string is valid for an identifier. |
SUPPORT_TRANSACTION | Indicates if the provider supports transactions such as commit and rollback. | YES, NO
DIALECT | Indicates the SQL dialect to use. |
KEY_PROPERTIES | Indicates the properties which identify the uniform database. |
SUPPORTS_MULTIPLE_SCHEMAS | Indicates if multiple schemas may exist for the provider. | YES, NO
SUPPORTS_MULTIPLE_CATALOGS | Indicates if multiple catalogs may exist for the provider. | YES, NO
DATASYNCVERSION | The CData Data Sync version needed to access this driver. | Standard, Starter, Professional, Enterprise
DATASYNCCATEGORY | The CData Data Sync category of this driver. | Source, Destination, Cloud Destination
SUPPORTSENHANCEDSQL | Whether enhanced SQL functionality beyond what is offered by the API is supported. | TRUE, FALSE
SUPPORTS_BATCH_OPERATIONS | Whether batch operations are supported. | YES, NO
SQL_CAP | All supported SQL capabilities for this driver. | SELECT, INSERT, DELETE, UPDATE, TRANSACTIONS, ORDERBY, OAUTH, ASSIGNEDID, LIMIT, LIKE, BULKINSERT, COUNT, BULKDELETE, BULKUPDATE, GROUPBY, HAVING, AGGS, OFFSET, REPLICATE, COUNTDISTINCT, JOINS, DROP, CREATE, DISTINCT, INNERJOINS, SUBQUERIES, ALTER, MULTIPLESCHEMAS, GROUPBYNORELATION, OUTERJOINS, UNIONALL, UNION, UPSERT, GETDELETED, CROSSJOINS, GROUPBYCOLLATE, MULTIPLECATS, FULLOUTERJOIN, MERGE, JSONEXTRACT, BULKUPSERT, SUM, SUBQUERIESFULL, MIN, MAX, JOINSFULL, XMLEXTRACT, AVG, MULTISTATEMENTS, FOREIGNKEYS, CASE, LEFTJOINS, COMMAJOINS, WITH, LITERALS, RENAME, NESTEDTABLES, EXECUTE, BATCH, BASIC, INDEX
PREFERRED_CACHE_OPTIONS | A string value that specifies the preferred cacheOptions. |
ENABLE_EF_ADVANCED_QUERY | Indicates if the driver directly supports advanced queries coming from Entity Framework. If not, queries will be handled client side. | YES, NO
PSEUDO_COLUMNS | A string array indicating the available pseudo columns. |
MERGE_ALWAYS | If the value is true, the Merge Mode is forcibly executed in Data Sync. | TRUE, FALSE
REPLICATION_MIN_DATE_QUERY | A select query to return the replicate start datetime. |
REPLICATION_MIN_FUNCTION | Allows a provider to specify the formula name to use for executing a server side min. |
REPLICATION_START_DATE | Allows a provider to specify a replicate startdate. |
REPLICATION_MAX_DATE_QUERY | A select query to return the replicate end datetime. |
REPLICATION_MAX_FUNCTION | Allows a provider to specify the formula name to use for executing a server side max. |
IGNORE_INTERVALS_ON_INITIAL_REPLICATE | A list of tables which will skip dividing the replicate into chunks on the initial replicate. |
CHECKCACHE_USE_PARENTID | Indicates whether the CheckCache statement should be done against the parent key column. | TRUE, FALSE
CREATE_SCHEMA_PROCEDURES | Indicates stored procedures that can be used for generating schema files. |

The following query retrieves the operators that can be used in the WHERE clause:

SELECT * FROM sys_sqlinfo WHERE Name='SUPPORTED_OPERATORS'
Note that individual tables may have different limitations or requirements on the WHERE clause; refer to the Data Model section for more information.

Columns

Name Type Description
NAME String A component of SQL syntax, or a capability that can be processed on the server.
VALUE String Detail on the supported SQL or SQL syntax.
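
Several sys_sqlinfo values, such as SQL_CAP, are comma-separated lists. The following Python sketch shows one way a client might test for a capability once it has read the VALUE column. It assumes the value arrives as a single comma-separated string; the helper names and the sample string are illustrative and not part of the product.

```python
def parse_capabilities(value):
    """Split a comma-separated sys_sqlinfo VALUE into a set of upper-case tokens."""
    return {token.strip().upper() for token in value.split(",") if token.strip()}


def supports(value, capability):
    """Return True if the capability appears in the comma-separated VALUE."""
    return capability.strip().upper() in parse_capabilities(value)


# Hypothetical VALUE for the SQL_CAP row (shortened for the example).
sample = "SELECT, INSERT, DELETE, UPDATE, ORDERBY, LIMIT, UPSERT"
print(supports(sample, "upsert"))
print(supports(sample, "MERGE"))
```

Matching is case-insensitive here only as a convenience; the section does not say whether the tokens are case-sensitive.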

CData Cloud

sys_identity

Returns information about attempted modifications.

The following query retrieves the Ids of the modified rows in a batch operation:

SELECT * FROM sys_identity

Columns

Name Type Description
Id String The database-generated Id returned from a data modification operation.
Batch String An identifier for the batch. 1 for a single operation.
Operation String The result of the operation in the batch: INSERTED, UPDATED, or DELETED.
Message String SUCCESS or an error message if the update in the batch failed.
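
As a rough illustration of how the columns above can be used, the Python sketch below filters sys_identity rows for failed operations. It assumes the rows have already been fetched into dictionaries keyed by column name; the sample rows and the error text are made up for the example.

```python
def failed_operations(rows):
    """Return (Id, Batch, Message) for every row whose Message is not SUCCESS."""
    return [
        (row["Id"], row["Batch"], row["Message"])
        for row in rows
        if row["Message"].strip().upper() != "SUCCESS"
    ]


# Hypothetical rows shaped like the sys_identity columns.
rows = [
    {"Id": "101", "Batch": "1", "Operation": "INSERTED", "Message": "SUCCESS"},
    {"Id": "102", "Batch": "1", "Operation": "UPDATED", "Message": "Row not found"},
]
print(failed_operations(rows))
```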

Connection String Options

The connection string properties are the options you can use to establish a connection. This section lists every option you can configure in the connection string for this provider.

Authentication


Property Description
AuthScheme The authentication scheme used. Accepted values are NoSasl, LDAP and Kerberos.
Server The name of the server running Impala.
Port The port for the connection to the Impala Server instance.
User The username used to authenticate with Impala.
Password The password used to authenticate with Impala.
ProtocolVersion The Thrift protocol version to use when connecting to the Impala server.
Database The name of the Impala database to use by default.
TransportMode The transport mode to use to communicate with the Impala server. Accepted entries are BINARY and HTTP.

Kerberos


Property Description
KerberosKDC The Kerberos Key Distribution Center (KDC) service used to authenticate the user.
KerberosRealm The Kerberos Realm used to authenticate the user.
KerberosSPN The service principal name (SPN) for the Kerberos Domain Controller.
KerberosKeytabFile The Keytab file containing your pairs of Kerberos principals and encrypted keys.
KerberosServiceRealm The Kerberos realm of the service.
KerberosServiceKDC The Kerberos KDC of the service.
KerberosTicketCache The full file path to an MIT Kerberos credential cache file.

SSL


Property Description
SSLClientCert The TLS/SSL client certificate store for SSL Client Authentication (2-way SSL).
SSLClientCertType The type of key store containing the TLS/SSL client certificate.
SSLClientCertPassword The password for the TLS/SSL client certificate.
SSLClientCertSubject The subject of the TLS/SSL client certificate.
SSLServerCert The certificate to be accepted from the server when connecting using TLS/SSL.

Firewall


Property Description
FirewallType The protocol used by a proxy-based firewall.
FirewallServer The name or IP address of a proxy-based firewall.
FirewallPort The TCP port for a proxy-based firewall.
FirewallUser The user name to use to authenticate with a proxy-based firewall.
FirewallPassword A password used to authenticate to a proxy-based firewall.

Proxy


Property Description
ProxyAutoDetect Indicates whether to use the system proxy settings. This takes precedence over other proxy settings, so you must set ProxyAutoDetect to FALSE in order to use custom proxy settings.
ProxyServer The hostname or IP address of a proxy to route HTTP traffic through.
ProxyPort The TCP port the ProxyServer proxy is running on.
ProxyAuthScheme The authentication type to use to authenticate to the ProxyServer proxy.
ProxyUser A user name to be used to authenticate to the ProxyServer proxy.
ProxyPassword A password to be used to authenticate to the ProxyServer proxy.
ProxySSLType The SSL type to use when connecting to the ProxyServer proxy.
ProxyExceptions A semicolon-separated list of destination hostnames or IPs that are exempt from connecting through the ProxyServer.

Logging


Property Description
Logfile A filepath which designates the name and location of the log file.
Verbosity The verbosity level that determines the amount of detail included in the log file.
LogModules Core modules to be included in the log file.
MaxLogFileSize A string specifying the maximum size in bytes for a log file (for example, 10 MB).
MaxLogFileCount A string specifying the maximum file count of log files.

Schema


Property Description
Location A path to the directory that contains the schema files defining tables, views, and stored procedures.
BrowsableSchemas This property restricts the schemas reported to a subset of the available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.
Tables This property restricts the tables reported to a subset of the available tables. For example, Tables=TableA,TableB,TableC.
Views This property restricts the views reported to a subset of the available views. For example, Views=ViewA,ViewB,ViewC.

Caching


Property Description
AutoCache Automatically caches the results of SELECT queries into a cache database specified by either CacheLocation or both of CacheConnection and CacheProvider.
CacheLocation Specifies the path to the cache when caching to a file.
CacheTolerance The tolerance for stale data in the cache specified in seconds when using AutoCache.
Offline Use offline mode to get the data from the cache instead of the live source.
CacheMetadata This property determines whether or not to cache the table metadata to a file store.

Miscellaneous


Property Description
HTTPPath The path component of the URL endpoint when using HTTP TransportMode.
MaxRows Limits the number of rows returned when no aggregation or group by is used in the query. This helps avoid performance issues at design time.
Other These hidden properties are used only in specific use cases.
Pagesize The maximum number of results to return per page from Apache Impala.
PseudoColumns This property indicates whether or not to include pseudo columns as columns to the table.
QueryPassthrough This option passes the query to the Apache Impala server as is.
Readonly You can use this property to enforce read-only access to Apache Impala from the provider.
RTK The runtime key used for licensing.
Timeout The value in seconds until the timeout error is thrown, canceling the operation.
UserDefinedViews A filepath pointing to the JSON configuration file containing your custom views.
UseSSL Specifies whether to use SSL Encryption when connecting to Impala.

Authentication

This section provides a complete list of the Authentication properties you can configure in the connection string for this provider.


Property Description
AuthScheme The authentication scheme used. Accepted values are NoSasl, LDAP and Kerberos.
Server The name of the server running Impala.
Port The port for the connection to the Impala Server instance.
User The username used to authenticate with Impala.
Password The password used to authenticate with Impala.
ProtocolVersion The Thrift protocol version to use when connecting to the Impala server.
Database The name of the Impala database to use by default.
TransportMode The transport mode to use to communicate with the Impala server. Accepted entries are BINARY and HTTP.

AuthScheme

The authentication scheme used. Accepted values are NoSasl, LDAP and Kerberos.

Possible Values

NoSasl, LDAP, Kerberos

Data Type

string

Default Value

"NoSasl"

Remarks

The AuthScheme used to authenticate with Impala.

NoSasl The default scheme.
LDAP Used when the enable_ldap_auth property is set to true and the ldap_uri property is not empty.
Kerberos Used when the keytab_file property is not empty.
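
The Python sketch below shows one way to assemble the Authentication properties into a connection string while rejecting an unknown AuthScheme. It assumes the common Name=Value form with semicolon separators, which this section does not spell out. The host name and credentials are placeholders, and values that themselves contain semicolons are not handled.

```python
AUTH_SCHEMES = ("NoSasl", "LDAP", "Kerberos")


def build_connection_string(props):
    """Join properties as Name=Value pairs separated by semicolons (assumed format)."""
    scheme = props.get("AuthScheme", "NoSasl")  # NoSasl is the documented default
    if scheme not in AUTH_SCHEMES:
        raise ValueError("AuthScheme must be one of: " + ", ".join(AUTH_SCHEMES))
    merged = {"AuthScheme": scheme, **props}
    return ";".join(f"{name}={value}" for name, value in merged.items()) + ";"


# Placeholder values for an LDAP connection over the default BINARY port.
print(build_connection_string({
    "AuthScheme": "LDAP",
    "Server": "impala.example.com",
    "Port": "21050",
    "User": "alice",
    "Password": "secret",
}))
```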

Server

The name of the server running Impala.

Data Type

string

Default Value

""

Remarks

Set this property to the name or network address of the Impala server instance.

Port

The port for the connection to the Impala Server instance.

Data Type

string

Default Value

"21050"

Remarks

When using BINARY TransportMode, set this property to the value of the 'hs2_port' property (default 21050) in the Impala configuration (http://host:25000/varz).

When using HTTP TransportMode, set this property to the value of the 'hs2_http_port' property (default 28000) in the Impala configuration (http://host:25000/varz).
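
The relationship between TransportMode and the default port can be captured in a small lookup. The Python sketch below uses the defaults quoted above; the function name is illustrative, and a server whose configuration overrides hs2_port or hs2_http_port needs the configured value instead.

```python
DEFAULT_PORTS = {"BINARY": 21050, "HTTP": 28000}


def default_port(transport_mode="BINARY"):
    """Return the default Impala port for the given TransportMode."""
    mode = transport_mode.upper()
    if mode not in DEFAULT_PORTS:
        raise ValueError("TransportMode must be BINARY or HTTP")
    return DEFAULT_PORTS[mode]


print(default_port("BINARY"))
print(default_port("http"))
```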

User

The username used to authenticate with Impala.

Data Type

string

Default Value

""

Remarks

The username used to authenticate with Impala.

Password

The password used to authenticate with Impala.

Data Type

string

Default Value

""

Remarks

The password used to authenticate with Impala.

ProtocolVersion

The Thrift protocol version to use when connecting to the Impala server.

Possible Values

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11

Data Type

string

Default Value

"7"

Remarks

The most efficient protocol version will be determined automatically by the CData Cloud upon connecting to Impala. This property allows you to explicitly specify the version to use and overrides the version determined by the CData Cloud.

Database

The name of the Impala database to use by default.

Data Type

string

Default Value

""

Remarks

When specified, the CData Cloud will issue a 'USE [Database]' command upon connecting to Impala. This will be the database schema used when executing queries that do not have a schema explicitly specified.

To execute queries on other schemas, the schema can be explicitly specified in the statement.

When Database is not set, the 'default' database schema will be used (no 'USE' statement is issued to Impala in this case).

TransportMode

The transport mode to use to communicate with the Impala server. Accepted entries are BINARY and HTTP.

Possible Values

BINARY, HTTP

Data Type

string

Default Value

"BINARY"

Remarks

The transport mode used to communicate with the Impala server.

Kerberos

This section provides a complete list of the Kerberos properties you can configure in the connection string for this provider.


Property Description
KerberosKDC The Kerberos Key Distribution Center (KDC) service used to authenticate the user.
KerberosRealm The Kerberos Realm used to authenticate the user.
KerberosSPN The service principal name (SPN) for the Kerberos Domain Controller.
KerberosKeytabFile The Keytab file containing your pairs of Kerberos principals and encrypted keys.
KerberosServiceRealm The Kerberos realm of the service.
KerberosServiceKDC The Kerberos KDC of the service.
KerberosTicketCache The full file path to an MIT Kerberos credential cache file.

KerberosKDC

The Kerberos Key Distribution Center (KDC) service used to authenticate the user.

Data Type

string

Default Value

""

Remarks

The Kerberos properties are used when using SPNEGO or Windows Authentication. The Cloud will request session tickets and temporary session keys from the Kerberos KDC service. The Kerberos KDC service is conventionally colocated with the domain controller.

If Kerberos KDC is not specified, the Cloud will attempt to detect these properties automatically from the following locations:

  • KRB5 Config File (krb5.ini/krb5.conf): If the KRB5_CONFIG environment variable is set and the file exists, the Cloud will obtain the KDC from the specified file. Otherwise, it will attempt to read from the default MIT location based on the OS: C:\ProgramData\MIT\Kerberos5\krb5.ini (Windows) or /etc/krb5.conf (Linux).
  • Domain Name and Host: If the Kerberos Realm and Kerberos KDC could not be inferred from another location, the Cloud will infer them from the configured domain name and host.
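
The file lookup order described in the first bullet can be sketched in Python as follows. This only illustrates the documented order and is not the Cloud's actual implementation; the function name and its injectable parameters (used to keep the example testable) are assumptions.

```python
import os
import sys


def krb5_config_path(environ=None, platform=None, exists=os.path.isfile):
    """Pick the krb5 config file: KRB5_CONFIG if set and present, else the MIT default."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    configured = environ.get("KRB5_CONFIG")
    if configured and exists(configured):
        return configured
    if platform.startswith("win"):
        return r"C:\ProgramData\MIT\Kerberos5\krb5.ini"
    return "/etc/krb5.conf"


print(krb5_config_path())
```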

KerberosRealm

The Kerberos Realm used to authenticate the user.

Data Type

string

Default Value

""

Remarks

The Kerberos properties are used when using SPNEGO or Windows Authentication. The Kerberos Realm is used to authenticate the user with the Kerberos Key Distribution Center (KDC). The Kerberos Realm can be configured by an administrator to be any string, but conventionally it is based on the domain name.

If Kerberos Realm is not specified, the Cloud will attempt to detect these properties automatically from the following locations:

  • KRB5 Config File (krb5.ini/krb5.conf): If the KRB5_CONFIG environment variable is set and the file exists, the Cloud will obtain the default realm from the specified file. Otherwise, it will attempt to read from the default MIT location based on the OS: C:\ProgramData\MIT\Kerberos5\krb5.ini (Windows) or /etc/krb5.conf (Linux).
  • Domain Name and Host: If the Kerberos Realm and Kerberos KDC could not be inferred from another location, the Cloud will infer them from the user-configured domain name and host. This might work in some Windows environments.
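
To see what the file-based detection would find, the Python sketch below reads the default realm and a KDC from krb5.conf text. It assumes the usual MIT layout (default_realm under [libdefaults], kdc entries inside a braced block under [realms]) and does not handle nested braces or include directives. The sample file content is made up.

```python
import re


def default_realm(conf_text):
    """Return the default_realm value from krb5.conf text, or None if absent."""
    match = re.search(r"^\s*default_realm\s*=\s*(\S+)", conf_text, re.MULTILINE)
    return match.group(1) if match else None


def kdc_for_realm(conf_text, realm):
    """Return the first kdc entry in the realm's braced block, or None if absent."""
    block = re.search(
        r"^\s*" + re.escape(realm) + r"\s*=\s*\{(.*?)\}",
        conf_text,
        re.MULTILINE | re.DOTALL,
    )
    if not block:
        return None
    match = re.search(r"^\s*kdc\s*=\s*(\S+)", block.group(1), re.MULTILINE)
    return match.group(1) if match else None


# Hypothetical krb5.conf content.
sample_conf = """
[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
    }
"""
realm = default_realm(sample_conf)
print(realm, kdc_for_realm(sample_conf, realm))
```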

KerberosSPN

The service principal name (SPN) for the Kerberos Domain Controller.

Data Type

string

Default Value

""

Remarks

If the SPN on the Kerberos Domain Controller is not the same as the URL that you are authenticating to, use this property to set the SPN.

KerberosKeytabFile

The Keytab file containing your pairs of Kerberos principals and encrypted keys.

Data Type

string

Default Value

""

Remarks

The Keytab file containing your pairs of Kerberos principals and encrypted keys.

KerberosServiceRealm

The Kerberos realm of the service.

Data Type

string

Default Value

""

Remarks

The KerberosServiceRealm is used to specify the service Kerberos realm when using cross-realm Kerberos authentication.

In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.

This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).

KerberosServiceKDC

The Kerberos KDC of the service.

Data Type

string

Default Value

""

Remarks

The KerberosServiceKDC is used to specify the service Kerberos KDC when using cross-realm Kerberos authentication.

In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.

This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).

KerberosTicketCache

The full file path to an MIT Kerberos credential cache file.

Data Type

string

Default Value

""

Remarks

This property can be set if you wish to use a credential cache file that was created using the MIT Kerberos Ticket Manager or kinit command.

SSL

This section provides a complete list of the SSL properties you can configure in the connection string for this provider.


Property Description
SSLClientCert The TLS/SSL client certificate store for SSL Client Authentication (2-way SSL).
SSLClientCertType The type of key store containing the TLS/SSL client certificate.
SSLClientCertPassword The password for the TLS/SSL client certificate.
SSLClientCertSubject The subject of the TLS/SSL client certificate.
SSLServerCert The certificate to be accepted from the server when connecting using TLS/SSL.

SSLClientCert

The TLS/SSL client certificate store for SSL Client Authentication (2-way SSL).

Data Type

string

Default Value

""

Remarks

The name of the certificate store for the client certificate.

The SSLClientCertType field specifies the type of the certificate store specified by SSLClientCert. If the store is password protected, specify the password in SSLClientCertPassword.

SSLClientCert is used in conjunction with the SSLClientCertSubject field in order to specify client certificates. If SSLClientCert has a value, and SSLClientCertSubject is set, a search for a certificate is initiated. See SSLClientCertSubject for more information.

Designations of certificate stores are platform-dependent.

The following are designations of the most common User and Machine certificate stores in Windows:

MY A certificate store holding personal certificates with their associated private keys.
CA Certifying authority certificates.
ROOT Root certificates.
SPC Software publisher certificates.

In Java, the certificate store normally is a file containing certificates and optional private keys.

When the certificate store type is PFXFile, this property must be set to the name of the file. When the type is PFXBlob, the property must be set to the binary contents of a PFX file (for example, PKCS12 certificate store).

SSLClientCertType

The type of key store containing the TLS/SSL client certificate.

Possible Values

USER, MACHINE, PFXFILE, PFXBLOB, JKSFILE, JKSBLOB, PEMKEY_FILE, PEMKEY_BLOB, PUBLIC_KEY_FILE, PUBLIC_KEY_BLOB, SSHPUBLIC_KEY_FILE, SSHPUBLIC_KEY_BLOB, P7BFILE, PPKFILE, XMLFILE, XMLBLOB

Data Type

string

Default Value

"USER"

Remarks

This property can take one of the following values:

USER (default) For Windows, this specifies that the certificate store is a certificate store owned by the current user. Note that this store type is not available in Java.
MACHINE For Windows, this specifies that the certificate store is a machine store. Note that this store type is not available in Java.
PFXFILE The certificate store is the name of a PFX (PKCS12) file containing certificates.
PFXBLOB The certificate store is a string (base64-encoded) representing a certificate store in PFX (PKCS12) format.
JKSFILE The certificate store is the name of a Java key store (JKS) file containing certificates. Note that this store type is only available in Java.
JKSBLOB The certificate store is a string (base64-encoded) representing a certificate store in JKS format. Note that this store type is only available in Java.
PEMKEY_FILE The certificate store is the name of a PEM-encoded file that contains a private key and an optional certificate.
PEMKEY_BLOB The certificate store is a string (base64-encoded) that contains a private key and an optional certificate.
PUBLIC_KEY_FILE The certificate store is the name of a file that contains a PEM- or DER-encoded public key certificate.
PUBLIC_KEY_BLOB The certificate store is a string (base64-encoded) that contains a PEM- or DER-encoded public key certificate.
SSHPUBLIC_KEY_FILE The certificate store is the name of a file that contains an SSH-style public key.
SSHPUBLIC_KEY_BLOB The certificate store is a string (base64-encoded) that contains an SSH-style public key.
P7BFILE The certificate store is the name of a PKCS7 file containing certificates.
PPKFILE The certificate store is the name of a file that contains a PuTTY Private Key (PPK).
XMLFILE The certificate store is the name of a file that contains a certificate in XML format.
XMLBLOB The certificate store is a string that contains a certificate in XML format.
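
For the blob store types, the certificate store is passed inline as a base64-encoded string. The Python sketch below prepares the SSL client properties for a PFXBLOB store. The function name is illustrative, and the bytes are a stand-in for a real PFX file read in binary mode.

```python
import base64


def pfx_blob_properties(pfx_bytes, password):
    """Return SSL client properties for a PFX store supplied as a base64 string."""
    return {
        "SSLClientCertType": "PFXBLOB",
        "SSLClientCert": base64.b64encode(pfx_bytes).decode("ascii"),
        "SSLClientCertPassword": password,
        "SSLClientCertSubject": "*",  # documented default: first certificate in the store
    }


# Stand-in bytes; a real caller would read the contents of a .pfx file instead.
props = pfx_blob_properties(b"\x30\x82\x01\x00", "store-password")
print(props["SSLClientCertType"], len(props["SSLClientCert"]))
```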

SSLClientCertPassword

The password for the TLS/SSL client certificate.

Data Type

string

Default Value

""

Remarks

If the certificate store is of a type that requires a password, this property is used to specify that password to open the certificate store.

SSLClientCertSubject

The subject of the TLS/SSL client certificate.

Data Type

string

Default Value

"*"

Remarks

When loading a certificate the subject is used to locate the certificate in the store.

If an exact match is not found, the store is searched for subjects containing the value of the property. If a match is still not found, the property is set to an empty string, and no certificate is selected.

The special value "*" picks the first certificate in the certificate store.
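
The lookup order described above (exact match, then a substring match, then no selection, with "*" taking the first certificate) can be sketched in Python like this. The store is modeled as a plain list of subject strings, and comparisons are case-sensitive; both are simplifying assumptions.

```python
def select_certificate(subjects, wanted):
    """Return the subject chosen from the store, or None if nothing matches."""
    if not subjects:
        return None
    if wanted == "*":
        return subjects[0]
    for subject in subjects:  # first pass: exact match
        if subject == wanted:
            return subject
    for subject in subjects:  # second pass: subject contains the value
        if wanted in subject:
            return subject
    return None


store = ["CN=www.server.com, OU=test, C=US", "CN=backup.server.com, OU=ops, C=US"]
print(select_certificate(store, "CN=backup"))
```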

The certificate subject is a comma separated list of distinguished name fields and values. For example, "CN=www.server.com, OU=test, C=US, [email protected]". The common fields and their meanings are shown below.

Field Meaning
CN Common Name. This is commonly a host name like www.server.com.
O Organization
OU Organizational Unit
L Locality
S State
C Country
E Email Address

If a field value contains a comma, it must be quoted.
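
A subject string can be built from field and value pairs as in the Python sketch below. The text above says only that a value containing a comma must be quoted, so the use of double quotes here is an assumption. The organization name is made up.

```python
def format_subject(fields):
    """Join (field, value) pairs into a subject, quoting values that contain commas."""
    parts = []
    for name, value in fields:
        if "," in value:
            value = '"' + value + '"'  # assumed quoting style
        parts.append(f"{name}={value}")
    return ", ".join(parts)


print(format_subject([("CN", "www.server.com"), ("O", "Example, Inc."), ("C", "US")]))
```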

SSLServerCert

The certificate to be accepted from the server when connecting using TLS/SSL.

Data Type

string

Default Value

""

Remarks

If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine is rejected.

This property can take the following forms:

Description Example
A full PEM Certificate (example shortened for brevity) -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv......Qw== -----END CERTIFICATE-----
A path to a local file containing the certificate C:\cert.cer
The public key (example shortened for brevity) -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq......AQAB -----END RSA PUBLIC KEY-----
The MD5 Thumbprint (hex values can also be either space or colon separated) ecadbdda5a1529c58a1e9e09828d70e4
The SHA1 Thumbprint (hex values can also be either space or colon separated) 34a929226ae0819f2ec14b4a3d904f801cbb150d

If not specified, any certificate trusted by the machine is accepted.

Use '*' to accept all certificates. Note that this is not recommended due to security concerns.
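
Because thumbprints may be written with spaces or colons between the hex values, it can help to normalize them before comparing. The Python sketch below normalizes a thumbprint and computes a SHA1 thumbprint as the hash of the DER-encoded certificate bytes. The function names are illustrative, and the DER bytes must come from your own certificate.

```python
import hashlib


def normalize_thumbprint(text):
    """Strip space and colon separators and lower-case an MD5 or SHA1 thumbprint."""
    cleaned = text.replace(":", "").replace(" ", "").lower()
    if len(cleaned) not in (32, 40) or any(c not in "0123456789abcdef" for c in cleaned):
        raise ValueError("expected 32 (MD5) or 40 (SHA1) hex digits")
    return cleaned


def sha1_thumbprint(der_bytes):
    """Return the SHA1 thumbprint (hex) of a DER-encoded certificate."""
    return hashlib.sha1(der_bytes).hexdigest()


print(normalize_thumbprint("EC:AD:BD:DA:5A:15:29:C5:8A:1E:9E:09:82:8D:70:E4"))
```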

Firewall

This section provides a complete list of the Firewall properties you can configure in the connection string for this provider.


Property Description
FirewallType The protocol used by a proxy-based firewall.
FirewallServer The name or IP address of a proxy-based firewall.
FirewallPort The TCP port for a proxy-based firewall.
FirewallUser The user name to use to authenticate with a proxy-based firewall.
FirewallPassword A password used to authenticate to a proxy-based firewall.

FirewallType

The protocol used by a proxy-based firewall.

Possible Values

NONE, TUNNEL, SOCKS4, SOCKS5

Data Type

string

Default Value

"NONE"

Remarks

This property specifies the protocol that the Cloud will use to tunnel traffic through the FirewallServer proxy. Note that by default, the Cloud connects to the system proxy; to disable this behavior and connect to one of the following proxy types, set ProxyAutoDetect to false.

Type Default Port Description
TUNNEL 80 When this is set, the Cloud opens a connection to Apache Impala and traffic flows back and forth through the proxy.
SOCKS4 1080 When this is set, the Cloud sends data through the SOCKS 4 proxy specified by FirewallServer and FirewallPort and passes the FirewallUser value to the proxy, which determines if the connection request should be granted.
SOCKS5 1080 When this is set, the Cloud sends data through the SOCKS 5 proxy specified by FirewallServer and FirewallPort. If your proxy requires authentication, set FirewallUser and FirewallPassword to credentials the proxy recognizes.

To connect to HTTP proxies, use ProxyServer and ProxyPort. To authenticate to HTTP proxies, use ProxyAuthScheme, ProxyUser, and ProxyPassword.
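
The Python sketch below collects the firewall properties for one of the proxy types in the table, filling in the default port when none is given and setting ProxyAutoDetect to false as the remarks require. The function name and host are illustrative.

```python
FIREWALL_DEFAULT_PORTS = {"TUNNEL": 80, "SOCKS4": 1080, "SOCKS5": 1080}


def firewall_properties(firewall_type, server, port=None, user=None, password=None):
    """Return the connection properties needed to route through a proxy-based firewall."""
    kind = firewall_type.upper()
    if kind not in FIREWALL_DEFAULT_PORTS:
        raise ValueError("FirewallType must be TUNNEL, SOCKS4, or SOCKS5")
    props = {
        "ProxyAutoDetect": "false",  # otherwise the system proxy is used
        "FirewallType": kind,
        "FirewallServer": server,
        "FirewallPort": str(port or FIREWALL_DEFAULT_PORTS[kind]),
    }
    if user:
        props["FirewallUser"] = user
    if password:
        props["FirewallPassword"] = password
    return props


print(firewall_properties("SOCKS5", "fw.example.com", user="alice", password="secret"))
```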

FirewallServer

The name or IP address of a proxy-based firewall.

Data Type

string

Default Value

""

Remarks

This property specifies the IP address, DNS name, or host name of a proxy allowing traversal of a firewall. The protocol is specified by FirewallType: Use FirewallServer with this property to connect through SOCKS or do tunneling. Use ProxyServer to connect to an HTTP proxy.

Note that the Cloud uses the system proxy by default. To use a different proxy, set ProxyAutoDetect to false.

FirewallPort

The TCP port for a proxy-based firewall.

Data Type

int

Default Value

0

Remarks

This specifies the TCP port for a proxy allowing traversal of a firewall. Use FirewallServer to specify the name or IP address. Specify the protocol with FirewallType.

FirewallUser

The user name to use to authenticate with a proxy-based firewall.

Data Type

string

Default Value

""

Remarks

The FirewallUser and FirewallPassword properties are used to authenticate against the proxy specified in FirewallServer and FirewallPort, following the authentication method specified in FirewallType.

FirewallPassword

A password used to authenticate to a proxy-based firewall.

Data Type

string

Default Value

""

Remarks

This property is passed to the proxy specified by FirewallServer and FirewallPort, following the authentication method specified by FirewallType.

Proxy

This section provides a complete list of the Proxy properties you can configure in the connection string for this provider.


Property Description
ProxyAutoDetect Indicates whether to use the system proxy settings. This takes precedence over other proxy settings, so you must set ProxyAutoDetect to FALSE in order to use custom proxy settings.
ProxyServer The hostname or IP address of a proxy to route HTTP traffic through.
ProxyPort The TCP port the ProxyServer proxy is running on.
ProxyAuthScheme The authentication type to use to authenticate to the ProxyServer proxy.
ProxyUser A user name to be used to authenticate to the ProxyServer proxy.
ProxyPassword A password to be used to authenticate to the ProxyServer proxy.
ProxySSLType The SSL type to use when connecting to the ProxyServer proxy.
ProxyExceptions A semicolon-separated list of destination hostnames or IPs that are exempt from connecting through the ProxyServer.

ProxyAutoDetect

Indicates whether to use the system proxy settings. This takes precedence over other proxy settings, so you must set ProxyAutoDetect to FALSE in order to use custom proxy settings.

Data Type

bool

Default Value

true

Remarks

This takes precedence over other proxy settings, so you must set ProxyAutoDetect to FALSE in order to use custom proxy settings.

To connect to an HTTP proxy, see ProxyServer. For other proxies, such as SOCKS or tunneling, see FirewallType.

ProxyServer

The hostname or IP address of a proxy to route HTTP traffic through.

Data Type

string

Default Value

""

Remarks

The hostname or IP address of a proxy to route HTTP traffic through. The Cloud can use the HTTP, Windows (NTLM), or Kerberos authentication types to authenticate to an HTTP proxy.

If you need to connect through a SOCKS proxy or tunnel the connection, see FirewallType.

By default, the Cloud uses the system proxy. If you need to use another proxy, set ProxyAutoDetect to false.

ProxyPort

The TCP port the ProxyServer proxy is running on.

Data Type

int

Default Value

80

Remarks

The port the HTTP proxy is running on that you want to redirect HTTP traffic through. Specify the HTTP proxy in ProxyServer. For other proxy types, see FirewallType.

ProxyAuthScheme

The authentication type to use to authenticate to the ProxyServer proxy.

Possible Values

BASIC, DIGEST, NONE, NEGOTIATE, NTLM, PROPRIETARY

Data Type

string

Default Value

"BASIC"

Remarks

This value specifies the authentication type to use to authenticate to the HTTP proxy specified by ProxyServer and ProxyPort.

Note that the Cloud will use the system proxy settings by default, without further configuration needed; if you want to connect to another proxy, you will need to set ProxyAutoDetect to false, in addition to ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.

The authentication type can be one of the following:

  • BASIC: The Cloud performs HTTP BASIC authentication.
  • DIGEST: The Cloud performs HTTP DIGEST authentication.
  • NEGOTIATE: The Cloud retrieves an NTLM or Kerberos token based on the applicable protocol for authentication.
  • PROPRIETARY: The Cloud does not generate an NTLM or Kerberos token. You must supply this token in the Authorization header of the HTTP request.

If you need to use another authentication type, such as SOCKS 5 authentication, see FirewallType.
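
Putting the proxy remarks together, the Python sketch below gathers the properties for an explicitly configured HTTP proxy. It validates ProxyAuthScheme against the possible values listed above and turns off ProxyAutoDetect. The function name, host, and credentials are placeholders.

```python
PROXY_AUTH_SCHEMES = ("BASIC", "DIGEST", "NONE", "NEGOTIATE", "NTLM", "PROPRIETARY")


def proxy_properties(server, port=80, auth_scheme="BASIC", user=None, password=None):
    """Return the connection properties for a custom (non-system) HTTP proxy."""
    scheme = auth_scheme.upper()
    if scheme not in PROXY_AUTH_SCHEMES:
        raise ValueError("unsupported ProxyAuthScheme: " + auth_scheme)
    props = {
        "ProxyAutoDetect": "false",  # required before custom proxy settings apply
        "ProxyServer": server,
        "ProxyPort": str(port),
        "ProxyAuthScheme": scheme,
    }
    if user is not None:
        props["ProxyUser"] = user
    if password is not None:
        props["ProxyPassword"] = password
    return props


print(proxy_properties("proxy.example.com", 8080, "NTLM", "domain\\alice", "secret"))
```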

ProxyUser

A user name to be used to authenticate to the ProxyServer proxy.

Data Type

string

Default Value

""

Remarks

The ProxyUser and ProxyPassword options are used to connect and authenticate against the HTTP proxy specified in ProxyServer.

You can select one of the available authentication types in ProxyAuthScheme. If you are using HTTP authentication, set this to the user name of a user recognized by the HTTP proxy. If you are using Windows or Kerberos authentication, set this property to a user name in one of the following formats:

user@domain
domain\user

ProxyPassword

A password to be used to authenticate to the ProxyServer proxy.

Data Type

string

Default Value

""

Remarks

This property is used to authenticate to an HTTP proxy server that supports NTLM (Windows), Kerberos, or HTTP authentication. To specify the HTTP proxy, you can set ProxyServer and ProxyPort. To specify the authentication type, set ProxyAuthScheme.

If you are using HTTP authentication, set ProxyUser and ProxyPassword to the credentials of a user recognized by the HTTP proxy.

If you are using NTLM authentication, set ProxyUser and ProxyPassword to your Windows user name and password. You may also need these to complete Kerberos authentication.

For SOCKS 5 authentication or tunneling, see FirewallType.

By default, the Cloud uses the system proxy. If you want to connect to another proxy, set ProxyAutoDetect to false.

ProxySSLType

The SSL type to use when connecting to the ProxyServer proxy.

Possible Values

AUTO, ALWAYS, NEVER, TUNNEL

Data Type

string

Default Value

"AUTO"

Remarks

This property determines when to use SSL for the connection to an HTTP proxy specified by ProxyServer. This value can be AUTO, ALWAYS, NEVER, or TUNNEL. The applicable values are the following:

AUTO Default setting. If the URL is an HTTPS URL, the Cloud will use the TUNNEL option. If the URL is an HTTP URL, the Cloud will use the NEVER option.
ALWAYS The connection is always SSL enabled.
NEVER The connection is not SSL enabled.
TUNNEL The connection is through a tunneling proxy. The proxy server opens a connection to the remote host and traffic flows back and forth through the proxy.
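
The AUTO rule can be expressed as a small function. The Python sketch below resolves a ProxySSLType setting to the option that takes effect for a given URL. The function name is illustrative, and rejecting URL schemes other than http and https is an assumption, since the table covers only those two.

```python
from urllib.parse import urlsplit

PROXY_SSL_TYPES = ("AUTO", "ALWAYS", "NEVER", "TUNNEL")


def effective_proxy_ssl_type(setting, url):
    """Resolve AUTO to TUNNEL (HTTPS URL) or NEVER (HTTP URL); pass other values through."""
    mode = setting.upper()
    if mode not in PROXY_SSL_TYPES:
        raise ValueError("unsupported ProxySSLType: " + setting)
    if mode != "AUTO":
        return mode
    scheme = urlsplit(url).scheme.lower()
    if scheme == "https":
        return "TUNNEL"
    if scheme == "http":
        return "NEVER"
    raise ValueError("AUTO is only defined for http and https URLs")


print(effective_proxy_ssl_type("AUTO", "https://impala.example.com:28000/cliservice"))
```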

ProxyExceptions

A semicolon-separated list of destination hostnames or IPs that are exempt from connecting through the ProxyServer.

Data Type

string

Default Value

""

Remarks

The ProxyServer is used for all addresses, except for addresses defined in this property. Use semicolons to separate entries.

Note that the Cloud uses the system proxy settings by default, without further configuration needed; if you want to explicitly configure proxy exceptions for this connection, you need to set ProxyAutoDetect = false, and configure ProxyServer and ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.
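
To show how a semicolon-separated exception list might be interpreted, the Python sketch below splits the value and checks whether a destination is exempt. Exact, case-insensitive matching is an assumption; this section does not say whether wildcards or subnets are supported.

```python
def parse_proxy_exceptions(value):
    """Split a ProxyExceptions value on semicolons, dropping empty entries."""
    return [entry.strip().lower() for entry in value.split(";") if entry.strip()]


def bypasses_proxy(host, value):
    """Return True if the host is listed in the ProxyExceptions value."""
    return host.strip().lower() in parse_proxy_exceptions(value)


exceptions = "impala-internal.example.com; 10.0.0.15;"
print(bypasses_proxy("10.0.0.15", exceptions))
print(bypasses_proxy("impala.example.com", exceptions))
```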

Logging

This section provides a complete list of the Logging properties you can configure in the connection string for this provider.


Property Description
Logfile A filepath which designates the name and location of the log file.
Verbosity The verbosity level that determines the amount of detail included in the log file.
LogModules Core modules to be included in the log file.
MaxLogFileSize A string specifying the maximum size in bytes for a log file (for example, 10 MB).
MaxLogFileCount A string specifying the maximum file count of log files.

Logfile

A filepath which designates the name and location of the log file.

Data Type

string

Default Value

""

Remarks

Once this property is set, the Cloud will populate the log file as it carries out various tasks, such as when authentication is performed or queries are executed. If the specified file doesn't already exist, it will be created.

Connection strings and version information are also logged, though connection properties containing sensitive information are masked automatically.

If a relative filepath is supplied, the location of the log file will be resolved based on the path found in the Location connection property.

For more control over what is written to the log file, you can adjust the Verbosity property.

Log contents are categorized into several modules. You can show/hide individual modules using the LogModules property.

To edit the maximum size of a single logfile before a new one is created, see MaxLogFileSize.

If you would like to place a cap on the number of logfiles generated, use MaxLogFileCount.
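
The logging properties are typically set together. The Python sketch below assembles them into a single connection string. It assumes the conventional name=value form with semicolons between properties; the helper name, the log path, and the chosen values are hypothetical.

```python
def build_connection_string(props):
    """Join properties into name=value pairs separated by semicolons.

    Hypothetical helper. Values containing ';' are rejected because this
    sketch does not implement any quoting rules.
    """
    parts = []
    for name, value in props.items():
        text = str(value)
        if ";" in text:
            raise ValueError("value for " + name + " contains ';'")
        parts.append(name + "=" + text)
    return ";".join(parts)


# Illustrative values only: the path and limits are assumptions.
logging_props = {
    "Logfile": "/tmp/impala.log",
    "Verbosity": 3,
    "MaxLogFileSize": "10MB",
    "MaxLogFileCount": 5,
}
connection_string = build_connection_string(logging_props)
```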


Verbosity

The verbosity level that determines the amount of detail included in the log file.

Data Type

string

Default Value

"1"

Remarks

The verbosity level determines the amount of detail that the Cloud reports to the Logfile. Verbosity levels from 1 to 5 are supported. These are detailed on the Logging page.


LogModules

Core modules to be included in the log file.

Data Type

string

Default Value

""

Remarks

Only the modules specified (separated by semicolons) are included in the log file. By default, all modules are included.

See the Logging page for an overview.


MaxLogFileSize

A string specifying the maximum size in bytes for a log file (for example, 10 MB).

Data Type

string

Default Value

"100MB"

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end. The default limit is 100 MB. Values lower than 100 kB will use 100 kB as the value instead.

Adjust the maximum number of logfiles generated with MaxLogFileCount.


MaxLogFileCount

An integer specifying the maximum number of log files.

Data Type

int

Default Value

-1

Remarks

When the limit is hit, a new log is created in the same folder with the date and time appended to the end and the oldest log file will be deleted.

The minimum supported value is 2. A value of 0 or a negative value indicates no limit on the count.

Adjust the maximum size of the logfiles generated with MaxLogFileSize.
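
The count rule can be sketched in Python as follows. This is not the Cloud's rotation code; the function name and the (name, modified_time) input shape are hypothetical and only illustrate "delete the oldest files once the limit is exceeded; 0 or negative means no limit".

```python
def files_to_delete(log_files, max_count):
    """Return the names of the oldest log files that exceed max_count.

    log_files is a list of (name, modified_time) tuples; this input shape
    is an assumption made for the sketch.
    """
    if max_count <= 0:
        return []  # 0 or a negative value: no limit on the count
    ordered = sorted(log_files, key=lambda item: item[1])  # oldest first
    excess = max(len(ordered) - max_count, 0)
    return [name for name, _ in ordered[:excess]]
```

With three files and a limit of 2, only the oldest file is returned; with the default of -1, nothing is ever deleted.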


Schema

This section provides a complete list of the Schema properties you can configure in the connection string for this provider.


Property            Description
Location            A path to the directory that contains the schema files defining tables, views, and stored procedures.
BrowsableSchemas    This property restricts the schemas reported to a subset of the available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.
Tables              This property restricts the tables reported to a subset of the available tables. For example, Tables=TableA,TableB,TableC.
Views               Restricts the views reported to a subset of the available views. For example, Views=ViewA,ViewB,ViewC.

Location

A path to the directory that contains the schema files defining tables, views, and stored procedures.

Data Type

string

Default Value

"%APPDATA%\\CData\\ApacheImpala Data Provider\\Schema"

Remarks

The path to a directory which contains the schema files for the Cloud (.rsd files for tables and views, .rsb files for stored procedures). The folder location can be a relative path from the location of the executable. The Location property is only needed if you want to customize definitions (for example, change a column name, ignore a column, and so on) or extend the data model with new tables, views, or stored procedures.

If left unspecified, the default location is "%APPDATA%\\CData\\ApacheImpala Data Provider\\Schema", with %APPDATA% being set to the user's configuration directory.


BrowsableSchemas

This property restricts the schemas reported to a subset of the available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.

Data Type

string

Default Value

""

Remarks

Listing the schemas from databases can be expensive. Providing a list of schemas in the connection string improves the performance.


Tables

This property restricts the tables reported to a subset of the available tables. For example, Tables=TableA,TableB,TableC.

Data Type

string

Default Value

""

Remarks

Listing the tables from some databases can be expensive. Providing a list of tables in the connection string improves the performance of the Cloud.

This property can also be used as an alternative to automatically listing tables if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the tables you want in a comma-separated list. Each table should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Tables=TableA,[TableB/WithSlash],WithCatalog.WithSchema.`TableC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.
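
The escaping and qualification rules above can be sketched in Python. The helper names are hypothetical. Of the three escape styles the text allows, this sketch uses square brackets only, and it rejects names containing "]" because the rule for that case is not described here.

```python
import re

# Names made only of letters, digits, and underscores need no escaping.
_PLAIN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def escape_identifier(name):
    """Wrap a name in square brackets if it has special characters."""
    if _PLAIN_NAME.fullmatch(name):
        return name
    if "]" in name:
        raise ValueError("names containing ']' are not handled by this sketch")
    return "[" + name + "]"


def build_list_value(names):
    """Build a Tables (or Views) value from tuples of name parts.

    Each tuple is (table,) or (catalog, schema, table).
    """
    return ",".join(
        ".".join(escape_identifier(part) for part in parts) for parts in names
    )
```

For example, build_list_value([("TableA",), ("TableB/WithSlash",), ("WithCatalog", "WithSchema", "TableC With Space")]) produces a value equivalent to the example above, with brackets in place of backticks.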


Views

Restricts the views reported to a subset of the available views. For example, Views=ViewA,ViewB,ViewC.

Data Type

string

Default Value

""

Remarks

Listing the views from some databases can be expensive. Providing a list of views in the connection string improves the performance of the Cloud.

This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.

Specify the views you want in a comma-separated list. Each view should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Views=ViewA,[ViewB/WithSlash],WithCatalog.WithSchema.`ViewC With Space`.

Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the view in this property, as in the last example here, to avoid ambiguity between views that exist in multiple catalogs or schemas.


Caching

This section provides a complete list of the Caching properties you can configure in the connection string for this provider.


Property          Description
AutoCache         Automatically caches the results of SELECT queries into a cache database specified by either CacheLocation or both of CacheConnection and CacheProvider.
CacheLocation     Specifies the path to the cache when caching to a file.
CacheTolerance    The tolerance for stale data in the cache specified in seconds when using AutoCache.
Offline           Use offline mode to get the data from the cache instead of the live source.
CacheMetadata     This property determines whether or not to cache the table metadata to a file store.

AutoCache

Automatically caches the results of SELECT queries into a cache database specified by either CacheLocation or both of CacheConnection and CacheProvider.

Data Type

bool

Default Value

false

Remarks

When AutoCache = true, the Cloud automatically maintains a cache of your table's data in the database of your choice.

Setting the Caching Database

When AutoCache = true, the Cloud caches to a simple, file-based cache. You can configure its location or cache to a different database with the following properties:

  • CacheLocation: Specifies the path to the file store.
  • CacheProvider and CacheConnection: Specifies a driver to a database and the connection string.

See Also

  • CacheMetadata: This property reduces the amount of metadata that crosses the network by persisting table schemas retrieved from the Apache Impala metadata. Metadata then needs to be retrieved only once instead of every connection.
  • Explicitly Caching Data: This section provides more examples of using AutoCache in Offline mode.
  • CACHE Statements: You can use the CACHE statement to persist any SELECT query, as well as manage the cache; for example, refreshing schemas.


CacheLocation

Specifies the path to the cache when caching to a file.

Data Type

string

Default Value

"%APPDATA%\\CData\\ApacheImpala Data Provider"

Remarks

The CacheLocation is a simple, file-based cache.

If left unspecified, the default location is "%APPDATA%\\CData\\ApacheImpala Data Provider", with %APPDATA% being set to the user's configuration directory.

See Also

  • AutoCache: Set to implicitly create and maintain a cache for later offline use.
  • CacheMetadata: Set to persist the Apache Impala catalog in CacheLocation.


CacheTolerance

The tolerance for stale data in the cache specified in seconds when using AutoCache.

Data Type

int

Default Value

600

Remarks

The tolerance for stale data in the cache specified in seconds. This only applies when AutoCache is used. The Cloud checks with the data source for newer records after the tolerance interval has expired. Otherwise, it returns the data directly from the cache.
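
The staleness check can be pictured with the following minimal Python sketch. The function name and the use of epoch seconds are assumptions, as is the treatment of the exact boundary (data is considered stale only after the interval has fully passed).

```python
import time


def should_check_source(last_refreshed, tolerance_seconds=600, now=None):
    """Return True once cached data is older than the tolerance.

    last_refreshed and now are epoch seconds. Hypothetical helper that
    mirrors the documented rule; 600 is the documented default.
    """
    if now is None:
        now = time.time()
    return (now - last_refreshed) > tolerance_seconds
```

While this returns False, a query would be answered from the cache; once it returns True, the data source would be checked for newer records.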


Offline

Use offline mode to get the data from the cache instead of the live source.

Data Type

bool

Default Value

false

Remarks

When Offline = true, all queries execute against the cache as opposed to the live data source. In this mode, certain queries like INSERT, UPDATE, DELETE, and CACHE are not allowed.
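
A simple way to picture the offline restriction is a check on the leading keyword of each statement. The Python sketch below is an illustration only: the function name is hypothetical, the blocked list contains just the four statement types named above (the text says "like", so the real list may be longer), and inspecting only the first keyword is a simplification.

```python
# Statement types named in the text as not allowed when Offline = true.
BLOCKED_OFFLINE = {"INSERT", "UPDATE", "DELETE", "CACHE"}


def is_allowed_offline(sql):
    """Return True if the statement's first keyword is not blocked."""
    tokens = sql.split()
    if not tokens:
        return False  # an empty statement is not a valid query
    return tokens[0].upper() not in BLOCKED_OFFLINE
```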


CacheMetadata

This property determines whether or not to cache the table metadata to a file store.

Data Type

bool

Default Value

false

Remarks

As you execute queries with this property set, table metadata in the Apache Impala catalog is cached to the file store specified by CacheLocation if set, or to the user's home directory otherwise. A table's metadata will be retrieved only once, when the table is queried for the first time.

When to Use CacheMetadata

The Cloud automatically persists metadata in memory for up to two hours when you first discover the metadata for a table or view; therefore, CacheMetadata is generally not required. CacheMetadata becomes useful when metadata operations are expensive, such as when you are working with large amounts of metadata or when you have many short-lived connections.

When Not to Use CacheMetadata

  • When you are working with volatile metadata: Metadata for a table is only retrieved the first time the connection to the table is made. To pick up new, changed, or deleted columns, you would need to delete and rebuild the metadata cache. Therefore, it is best to rely on the in-memory caching for cases where metadata changes often.
  • When you are caching to a database: CacheMetadata can only be used with CacheLocation. If you are caching to another database with the CacheProvider and CacheConnection properties, use AutoCache to cache implicitly. Or, use CACHE Statements to cache explicitly.


Miscellaneous

This section provides a complete list of the Miscellaneous properties you can configure in the connection string for this provider.


Property            Description
HTTPPath            The path component of the URL endpoint when using HTTP TransportMode.
MaxRows             Limits the number of rows returned when no aggregation or GROUP BY is used in the query. This helps avoid performance issues at design time.
Other               These hidden properties are used only in specific use cases.
Pagesize            The maximum number of results to return per page from Apache Impala.
PseudoColumns       This property indicates whether or not to include pseudo columns as columns to the table.
QueryPassthrough    This option passes the query to the Apache Impala server as is.
Readonly            You can use this property to enforce read-only access to Apache Impala from the provider.
RTK                 The runtime key used for licensing.
Timeout             The value in seconds until the timeout error is thrown, canceling the operation.
UserDefinedViews    A filepath pointing to the JSON configuration file containing your custom views.
UseSSL              Specifies whether to use SSL Encryption when connecting to Impala.

HTTPPath

The path component of the URL endpoint when using HTTP TransportMode.

Data Type

string

Default Value

"cliservice"

Remarks

This property is used to specify the path component of the URL endpoint when using HTTP TransportMode.


MaxRows

Limits the number of rows returned when no aggregation or GROUP BY is used in the query. This helps avoid performance issues at design time.

Data Type

int

Default Value

-1

Remarks

Limits the number of rows returned when no aggregation or GROUP BY is used in the query. This helps avoid performance issues at design time.


Other

These hidden properties are used only in specific use cases.

Data Type

string

Default Value

""

Remarks

The properties listed below are available for specific use cases. Normal driver use cases and functionality should not require these properties.

Specify multiple properties in a semicolon-separated list.

Integration and Formatting

DefaultColumnSize        Sets the default length of string fields when the data source does not provide column length in the metadata. The default value is 2000.
ConvertDateTimeToGMT     Determines whether to convert date-time values to GMT, instead of the local time of the machine.
RecordToFile=filename    Records the underlying socket data transfer to the specified file.


Pagesize

The maximum number of results to return per page from Apache Impala.

Data Type

int

Default Value

10000

Remarks

The Pagesize property affects the maximum number of results to return per page from Apache Impala. Setting a higher value may result in better performance at the cost of additional memory allocated per page consumed.


PseudoColumns

This property indicates whether or not to include pseudo columns as columns to the table.

Data Type

string

Default Value

""

Remarks

This setting is particularly helpful in Entity Framework, which does not allow you to set a value for a pseudo column unless it is a table column. The value of this connection setting is of the format "Table1=Column1, Table1=Column2, Table2=Column3". You can use the "*" character to include all tables and all columns; for example, "*=*".
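
To make the value format concrete, the following Python sketch parses a PseudoColumns value into a mapping from table name to column names. The function name is hypothetical, and the sketch does not interpret the "*" wildcard; it simply keeps it as a literal key or value.

```python
def parse_pseudo_columns(value):
    """Parse 'Table1=Column1, Table1=Column2' into {table: [columns]}."""
    mapping = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty entries such as a trailing comma
        table, separator, column = pair.partition("=")
        if not separator:
            raise ValueError("expected Table=Column, got: " + pair)
        mapping.setdefault(table.strip(), []).append(column.strip())
    return mapping
```

Parsing "Table1=Column1, Table1=Column2, Table2=Column3" groups both Table1 columns under one key, and "*=*" yields a single wildcard entry.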


QueryPassthrough

This option passes the query to the Apache Impala server as is.

Data Type

bool

Default Value

false

Remarks

When this is set, queries are passed through directly to Apache Impala.


Readonly

You can use this property to enforce read-only access to Apache Impala from the provider.

Data Type

bool

Default Value

false

Remarks

If this property is set to true, the Cloud will allow only SELECT queries. INSERT, UPDATE, DELETE, and stored procedure queries will cause an error to be thrown.


RTK

The runtime key used for licensing.

Data Type

string

Default Value

""

Remarks

The RTK property may be used to license a build.


Timeout

The value in seconds until the timeout error is thrown, canceling the operation.

Data Type

int

Default Value

60

Remarks

If Timeout = 0, operations do not time out. The operations run until they complete successfully or until they encounter an error condition.

If Timeout expires and the operation is not yet complete, the Cloud throws an exception.


UserDefinedViews

A filepath pointing to the JSON configuration file containing your custom views.

Data Type

string

Default Value

""

Remarks

User Defined Views are defined in a JSON-formatted configuration file called UserDefinedViews.json. The Cloud automatically detects the views specified in this file.

You can also have multiple view definitions and control them using the UserDefinedViews connection property. When you use this property, only the specified views are seen by the Cloud.

This User Defined View configuration file is formatted as follows:

  • Each root element defines the name of a view.
  • Each root element contains a child element, called query, which contains the custom SQL query for the view.

For example:

{
	"MyView": {
		"query": "SELECT * FROM [CData].[Default].Customers WHERE MyColumn = 'value'"
	},
	"MyView2": {
		"query": "SELECT * FROM MyTable WHERE Id IN (1,2,3)"
	}
}
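
Before pointing the Cloud at a configuration file, it can help to check that the file follows the two format rules above. The Python sketch below is one possible check; the function name is hypothetical, and it validates only the structure (root elements with a query string), not the SQL itself.

```python
import json


def load_user_defined_views(text):
    """Parse a UserDefinedViews JSON document into {view_name: query}.

    Raises ValueError if a root element lacks a string 'query' child.
    """
    document = json.loads(text)
    if not isinstance(document, dict):
        raise ValueError("the root of the file must be a JSON object")
    views = {}
    for name, body in document.items():
        if not isinstance(body, dict) or not isinstance(body.get("query"), str):
            raise ValueError("view " + repr(name) + " needs a 'query' string")
        views[name] = body["query"]
    return views
```

Applied to the example above, this returns a dictionary with the keys MyView and MyView2.
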

Use the UserDefinedViews connection property to specify the location of your JSON configuration file. For example:
"UserDefinedViews", "C:\\Users\\yourusername\\Desktop\\tmp\\UserDefinedViews.json"


UseSSL

Specifies whether to use SSL Encryption when connecting to Impala.

Data Type

bool

Default Value

false

Remarks

Specifies whether to use SSL Encryption when connecting to Impala.


Third Party Copyrights

Apache Thrift Client

The Apache License Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Copyright (c) 2023 CData Software, Inc. - All rights reserved.
Build 22.0.8462