The CData Sync App provides a straightforward way to continuously pipeline your Cassandra data to any database, data lake, or data warehouse, making it easily available for Analytics, Reporting, AI, and Machine Learning.
The Cassandra connector can be used from the CData Sync application to pull data from Cassandra and move it to any of the supported destinations.
Create a connection to Cassandra by navigating to the Connections page in the Sync App application and selecting the corresponding icon in the Add Connections panel. If the Cassandra icon is not available, click the Add More icon to download and install the Cassandra connector from the CData site.
Required properties are listed under the Settings tab. The Advanced tab lists connection properties that are not typically required.
Log in to the Azure Portal, select Azure Cosmos DB, and select your account. In the Settings section, click Connection String and set the following values:
The Sync App supports Basic authentication with login credentials and the additional authentication features of DataStax Enterprise (DSE). The following sections detail connection properties your authentication method may require.
You need to set AuthScheme to the value corresponding to the authenticator configured for your system. You specify the authenticator in the authenticator property in the cassandra.yaml file. This file is typically found in /etc/dse/cassandra or through the DSE Unified Authenticator on DSE Cassandra.
Set AuthScheme to Basic to authenticate with login credentials alone.
In the cassandra.yaml file, set the authenticator property to "PasswordAuthenticator".
Set the AuthScheme property to DSE to authenticate with login credentials and the DSE Unified Authenticator.
In the file, set the authenticator property to "com.datastax.bdp.cassandra.auth.DseAuthenticator".
Set the following to authenticating using Kerberos:
Please see Using Kerberos for more details on how to set connection properties in order to connect to Kerberos.
Set the following to authenticating using Kerberos:
You can set UseSSL to negotiate SSL/TLS encryption when you connect. By default, the Sync App attempts to negotiate SSL/TLS by checking the server's certificate against the system's trusted certificate store. To specify another certificate, see the SSLServerCert property for the available formats.
This section shows how to use the Sync App to authenticate using Kerberos.
To authenticate to Cassandra using Kerberos, set the following properties:
You can use one of the following options to retrieve the required Kerberos ticket.
This option enables you to use the MIT Kerberos Ticket Manager or kinit command to get tickets. Note that you do not need to set the User or Password connection properties with this option.
As an alternative to setting the KRB5CCNAME environment variable, you can directly set the file path using the KerberosTicketCache property. When set, the Sync App uses the specified cache file to obtain the Kerberos ticket to connect to Cassandra.
If the KRB5CCNAME environment variable has not been set, you can retrieve a Kerberos ticket using a Keytab File. To do so, set the User property to the desired username and set the KerberosKeytabFile property to a file path pointing to the keytab file associated with the user.
If both the KRB5CCNAME environment variable and the KerberosKeytabFile property have not been set, you can retrieve a ticket using a user and password combination. To do this, set the User and Password properties to the user/password combination that you use to authenticate with Cassandra.
More complex Kerberos environments may require cross-realm authentication where multiple realms and KDC servers are used (e.g., where one realm/KDC is used for user authentication and another realm/KDC is used for obtaining the service ticket).
In such an environment, set the KerberosRealm and KerberosKDC properties to the values required for user authentication. Also set the KerberosServiceRealm and KerberosServiceKDC properties to the values required to obtain the service ticket.
You can use the following properties to gain greater control over Cassandra API features and the strategies the Sync App uses to surface them:
This property applies if you are working with the dynamic schemas generated from Automatic Schema Discovery or if you are using QueryPassthrough.
Cassandra is a NoSQL database that provides high performance, availability, and scalability. However, these capabilities are not necessarily incompatible with a standards-compliant query language like SQL-92. The Sync App models Cassandra tables into relational tables and translates SQL queries into calls to the Cassandra API, the CQL (Cassandra Query Language) binary protocol.
The equivalent of a table in Cassandra is a column family. Column families contain columns of related data. Like other NoSQL databases, Cassandra allows complex types of fields such as set, list, and map. A column family is a nested map data structure. This can be represented as a JSON object.
The Sync App offers two ways to model Cassandra objects. The Automatic Schema Discovery scheme automatically finds the data types in a Cassandra object by scanning a configured number of rows of the object. You can use RowScanDepth, FlattenArrays, and FlattenObjects to control the relational representation of the tables in Cassandra.
The Sync App automatically infers a relational schema by inspecting a series of Cassandra documents in a collection. You can use the RowScanDepth property to define the number of documents the Sync App will scan to do so. The columns identified during the discovery process depend on the FlattenArrays and FlattenObjects properties.
If FlattenObjects is set, all nested objects will be flattened into a series of columns. For example, consider the following document:
{ id: 12, name: "Lohia Manufacturers Inc.", address: {street: "Main Street", city: "Chapel Hill", state: "NC"}, offices: ["Chapel Hill", "London", "New York"], annual_revenue: 35,600,000 }This document will be represented by the following columns:
Column Name | Data Type | Example Value |
id | Integer | 12 |
name | String | Lohia Manufacturers Inc. |
address.street | String | Main Street |
address.city | String | Chapel Hill |
address.state | String | NC |
offices | String | ["Chapel Hill", "London", "New York"] |
annual_revenue | Double | 35,600,000 |
If FlattenObjects is not set, then the address.street, address.city, and address.state columns will not be broken apart. The address column of type string will instead represent the entire object. Its value would be {street: "Main Street", city: "Chapel Hill", state: "NC"}. See JSON Functions for more details on working with JSON aggregates.
The FlattenArrays property can be used to flatten array values into columns of their own. This is only recommended for arrays that are expected to be short, for example the coordinates below:
"coord": [ -73.856077, 40.848447 ]The FlattenArrays property can be set to 2 to represent the array above as follows:
Column Name | Data Type | Example Value |
coord.0 | Float | -73.856077 |
coord.1 | Float | 40.848447 |
It is best to leave other unbounded arrays as they are and piece out the data for them as needed using JSON Functions.
The Sync App can return JSON structures as column values. The Sync App enables you to use standard SQL functions to work with these JSON structures. The examples in this section use the following array:
[ { "grade": "A", "score": 2 }, { "grade": "A", "score": 6 }, { "grade": "A", "score": 10 }, { "grade": "A", "score": 9 }, { "grade": "B", "score": 14 } ]
SELECT Name, JSON_EXTRACT(grades,'[0].grade') AS Grade, JSON_EXTRACT(grades,'[0].score') AS Score FROM Students;
Column Name | Example Value |
Grade | A |
Score | 2 |
SELECT Name, JSON_COUNT(grades,'[x]') AS NumberOfGrades FROM Students;
Column Name | Example Value |
NumberOfGrades | 5 |
SELECT Name, JSON_SUM(score,'[x].score') AS TotalScore FROM Students;
Column Name | Example Value |
TotalScore | 41 |
SELECT Name, JSON_MIN(score,'[x].score') AS LowestScore FROM Students;
Column Name | Example Value |
LowestScore | 2 |
SELECT Name, JSON_MAX(score,'[x].score') AS HighestScore FROM Students;
Column Name | Example Value |
HighestScore | 14 |
The JSON function can be used to retrieve the entire table as a JSON string. See the following query and its result as an example:
SELECT JSON(*) FROM Customers;The query above will return the entire table as shown.
{ "id": 12, "name": "Lohia Manufacturers Inc.", "address": { "street": "Main Street", "city": "Chapel Hill", "state": "NC"}, "offices": [ "Chapel Hill", "London", "New York" ], "annual_revenue": 35,600,000 }
The Sync App maps types from the data source to the corresponding data type available in the schema. The table below documents these mappings.
Note that string columns can map to different data types depending on their length.
Cassandra | CData Schema |
ascii | string |
bigint | long |
blob | binary |
boolean | bool |
counter | long |
date | date |
decimal | decimal |
double | float |
float | float |
inet | string |
int | int |
list | string |
map | string |
set | string |
smallint | int |
text | string |
time | time |
timestamp | datetime |
timeuuid | string |
tinyint | int |
tuple | string |
udt | string |
uuid | string |
varchar | string |
varint | string |
This section details a selection of advanced features of the Cassandra Sync App.
The Sync App allows you to define virtual tables, called user defined views, whose contents are decided by a pre-configured query. These views are useful when you cannot directly control queries being issued to the drivers. See User Defined Views for an overview of creating and configuring custom views.
Use SSL Configuration to adjust how Sync App handles TLS/SSL certificate negotiations. You can choose from various certificate formats; see the SSLServerCert property under "Connection String Options" for more information.
Configure the Sync App for compliance with Firewall and Proxy, including Windows proxies. You can also set up tunnel connections.
The Sync App offloads as much of the SELECT statement processing as possible to Cassandra and then processes the rest of the query in memory (client-side).
See Query Processing for more information.
See Logging for an overview of configuration settings that can be used to refine CData logging. For basic logging, you only need to set two connection properties, but there are numerous features that support more refined logging, where you can select subsets of information to be logged using the LogModules connection property.
By default, the Sync App attempts to negotiate SSL/TLS by checking the server's certificate against the system's trusted certificate store.
To specify another certificate, see the SSLServerCert property for the available formats to do so.
The Cassandra Sync App also supports setting client certificates. Set the following to connect using a client certificate.
Set the following properties:
The connection string properties are the various options that can be used to establish a connection. This section provides a complete list of the options you can configure in the connection string for this provider. Click the links for further details.
For more information on establishing a connection, see Establishing a Connection.
Property | Description |
AuthScheme | The scheme used for authentication. Accepted entries are Basic, DSE, Kerberos, and LDAP. |
Server | The host name or IP address of the server hosting the Cassandra database. |
Port | The port for the Cassandra database. |
LDAPServer | The host name or IP address of the LDAP server. |
User | The Cassandra user account used to authenticate. |
Password | The password used to authenticate the user. |
LDAPPort | The port for the LDAP server. |
Database | The name of the Cassandra keyspace. |
DefaultLDAPUser | The default LDAP user used to connect to and communicate with the server, it must be set if the LDAP server do not allow anonymous bind. |
LDAPPassword | The password of the default LDAP user. It must be set if the LDAP server do not allow anonymous bind. |
SearchBase | The search base for your LDAPServer, used to look up users. |
SearchFilter | The search filter for looking up usernames in LDAP. The default setting is (uid=), When using Active Directory set the filter to (sAMAccountName=). |
UseSSL | This field sets whether SSL is enabled. |
Property | Description |
KerberosKDC | The Kerberos Key Distribution Center (KDC) service used to authenticate the user. |
KerberosRealm | The Kerberos Realm used to authenticate the user. |
KerberosSPN | The service principal name (SPN) for the Kerberos Domain Controller. |
KerberosKeytabFile | The Keytab file containing your pairs of Kerberos principals and encrypted keys. |
KerberosServiceRealm | The Kerberos realm of the service. |
KerberosServiceKDC | The Kerberos KDC of the service. |
KerberosTicketCache | The full file path to an MIT Kerberos credential cache file. |
Property | Description |
SSLClientCert | The TLS/SSL client certificate store for SSL Client Authentication (2-way SSL). |
SSLClientCertType | The type of key store containing the TLS/SSL client certificate. |
SSLClientCertPassword | The password for the TLS/SSL client certificate. |
SSLClientCertSubject | The subject of the TLS/SSL client certificate. |
SSLServerCert | The certificate to be accepted from the server when connecting using TLS/SSL. |
Property | Description |
SSHAuthMode | The authentication method to be used to log on to an SFTP server. |
SSHClientCert | A private key to be used for authenticating the user. |
SSHClientCertPassword | The password of the SSHClientCert key if it has one. |
SSHClientCertSubject | The subject of the SSH client certificate. |
SSHClientCertType | The type of SSHClientCert private key. |
SSHServer | The SSH server. |
SSHPort | The SSH port. |
SSHUser | The SSH user. |
SSHPassword | The SSH password. |
SSHServerFingerprint | The SSH server fingerprint. |
UseSSH | Whether to tunnel the Cassandra connection over SSH. Use SSH. |
Property | Description |
FirewallType | The protocol used by a proxy-based firewall. |
FirewallServer | The name or IP address of a proxy-based firewall. |
FirewallPort | The TCP port for a proxy-based firewall. |
FirewallUser | The user name to use to authenticate with a proxy-based firewall. |
FirewallPassword | A password used to authenticate to a proxy-based firewall. |
Property | Description |
LogModules | Core modules to be included in the log file. |
Property | Description |
Location | A path to the directory that contains the schema files defining tables, views, and stored procedures. |
BrowsableSchemas | This property restricts the schemas reported to a subset of the available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC. |
Tables | This property restricts the tables reported to a subset of the available tables. For example, Tables=TableA,TableB,TableC. |
Views | Restricts the views reported to a subset of the available tables. For example, Views=ViewA,ViewB,ViewC. |
Property | Description |
AggregationsSupported | Whether or not to support aggregations in the Cassandra server. Note that in queries to the provider, you must use single quotes to define strings. |
AllowFiltering | When true, slow-performing queries are processed on the server. |
CaseSensitivity | Enable case sensitivity to the CQL sending to the server, if set to True, the identifiers in the CQL will be enclosed in double quotation marks. |
ConsistencyLevel | The consistency level determines how many of the replicas of the data you are interacting with need to respond for the query to be considered a success. |
FlattenArrays | By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. Set FlattenArrays to the number of elements you want to return from nested arrays. |
FlattenObjects | Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON. |
MaxRows | Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time. |
NullToUnset | Use unset instead of NULL in CQL query when performing INSERT operations. |
Other | These hidden properties are used only in specific use cases. |
Pagesize | The maximum number of results to return per page from Cassandra. |
PseudoColumns | This property indicates whether or not to include pseudo columns as columns to the table. |
QueryPassthrough | This option passes the query to the Cassandra server as is. |
RowScanDepth | The maximum number of rows to scan to look for the columns available in a table. |
Timeout | The value in seconds until the timeout error is thrown, canceling the operation. |
UseJsonFormat | Whether to submit and return the JSON encoding for CQL data types. |
UserDefinedViews | A filepath pointing to the JSON configuration file containing your custom views. |
VarintToString | Map Cassandra VARINT to String value. |
This section provides a complete list of the Authentication properties you can configure in the connection string for this provider.
Property | Description |
AuthScheme | The scheme used for authentication. Accepted entries are Basic, DSE, Kerberos, and LDAP. |
Server | The host name or IP address of the server hosting the Cassandra database. |
Port | The port for the Cassandra database. |
LDAPServer | The host name or IP address of the LDAP server. |
User | The Cassandra user account used to authenticate. |
Password | The password used to authenticate the user. |
LDAPPort | The port for the LDAP server. |
Database | The name of the Cassandra keyspace. |
DefaultLDAPUser | The default LDAP user used to connect to and communicate with the server, it must be set if the LDAP server do not allow anonymous bind. |
LDAPPassword | The password of the default LDAP user. It must be set if the LDAP server do not allow anonymous bind. |
SearchBase | The search base for your LDAPServer, used to look up users. |
SearchFilter | The search filter for looking up usernames in LDAP. The default setting is (uid=), When using Active Directory set the filter to (sAMAccountName=). |
UseSSL | This field sets whether SSL is enabled. |
The scheme used for authentication. Accepted entries are Basic, DSE, Kerberos, and LDAP.
Set this property to authenticate to open-source or DataStax Enterprise (DSE) Cassandra instances.
Together with Password and User, this field is used to authenticate against the server. Basic is the default option. Use the following options to select your authentication scheme:
The host name or IP address of the server hosting the Cassandra database.
The host name or IP address of the server hosting the Cassandra database. To connect to a distributed system, you can set Server to a comma-separated list of servers and ports, separated by colons. You will also need to set ConsistencyLevel.
Note that you must specify all of the servers required by your selected consistency level.
The port for the Cassandra database.
The port for the Cassandra database.
The host name or IP address of the LDAP server.
The host name or IP address of the LDAP server.
The Cassandra user account used to authenticate.
Together with Password, this field is used to authenticate against the Cassandra server.
The password used to authenticate the user.
The User and Password are together used to authenticate with the server.
The port for the LDAP server.
The port for the LDAP server.
The name of the Cassandra keyspace.
The name of the Cassandra keyspace containing the tables.
The default LDAP user used to connect to and communicate with the server, it must be set if the LDAP server do not allow anonymous bind.
Specify the default LDAP user in case the LDAP server do not allow anonymous login.
The password of the default LDAP user. It must be set if the LDAP server do not allow anonymous bind.
Specify the password of the default LDAP user.
The search base for your LDAPServer, used to look up users.
The search base for your LDAPServer, used to look up users.
The search filter for looking up usernames in LDAP. The default setting is (uid=), When using Active Directory set the filter to (sAMAccountName=).
The search filter for looking up usernames in LDAP. The default setting is (uid=).
This field sets whether SSL is enabled.
This field sets whether the Sync App will attempt to negotiate TLS/SSL connections to the server. By default, the Sync App checks the server's certificate against the system's trusted certificate store. To specify another certificate, set SSLServerCert.
This section provides a complete list of the Kerberos properties you can configure in the connection string for this provider.
Property | Description |
KerberosKDC | The Kerberos Key Distribution Center (KDC) service used to authenticate the user. |
KerberosRealm | The Kerberos Realm used to authenticate the user. |
KerberosSPN | The service principal name (SPN) for the Kerberos Domain Controller. |
KerberosKeytabFile | The Keytab file containing your pairs of Kerberos principals and encrypted keys. |
KerberosServiceRealm | The Kerberos realm of the service. |
KerberosServiceKDC | The Kerberos KDC of the service. |
KerberosTicketCache | The full file path to an MIT Kerberos credential cache file. |
The Kerberos Key Distribution Center (KDC) service used to authenticate the user.
The Kerberos properties are used when using SPNEGO or Windows Authentication. The Sync App will request session tickets and temporary session keys from the Kerberos KDC service. The Kerberos KDC service is conventionally colocated with the domain controller.
If Kerberos KDC is not specified, the Sync App will attempt to detect these properties automatically from the following locations:
The Kerberos Realm used to authenticate the user.
The Kerberos properties are used when using SPNEGO or Windows Authentication. The Kerberos Realm is used to authenticate the user with the Kerberos Key Distribution Service (KDC). The Kerberos Realm can be configured by an administrator to be any string, but conventionally it is based on the domain name.
If Kerberos Realm is not specified, the Sync App will attempt to detect these properties automatically from the following locations:
The service principal name (SPN) for the Kerberos Domain Controller.
If the SPN on the Kerberos Domain Controller is not the same as the URL that you are authenticating to, use this property to set the SPN.
The Keytab file containing your pairs of Kerberos principals and encrypted keys.
The Keytab file containing your pairs of Kerberos principals and encrypted keys.
The Kerberos realm of the service.
The KerberosServiceRealm is the specify the service Kerberos realm when using cross-realm Kerberos authentication.
In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.
This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).
The Kerberos KDC of the service.
The KerberosServiceKDC is used to specify the service Kerberos KDC when using cross-realm Kerberos authentication.
In most cases, a single realm and KDC machine are used to perform the Kerberos authentication and this property is not required.
This property is available for complex setups where a different realm and KDC machine are used to obtain an authentication ticket (AS request) and a service ticket (TGS request).
The full file path to an MIT Kerberos credential cache file.
This property can be set if you wish to use a credential cache file that was created using the MIT Kerberos Ticket Manager or kinit command.
This section provides a complete list of the SSL properties you can configure in the connection string for this provider.
Property | Description |
SSLClientCert | The TLS/SSL client certificate store for SSL Client Authentication (2-way SSL). |
SSLClientCertType | The type of key store containing the TLS/SSL client certificate. |
SSLClientCertPassword | The password for the TLS/SSL client certificate. |
SSLClientCertSubject | The subject of the TLS/SSL client certificate. |
SSLServerCert | The certificate to be accepted from the server when connecting using TLS/SSL. |
The TLS/SSL client certificate store for SSL Client Authentication (2-way SSL).
The name of the certificate store for the client certificate.
The SSLClientCertType field specifies the type of the certificate store specified by SSLClientCert. If the store is password protected, specify the password in SSLClientCertPassword.
SSLClientCert is used in conjunction with the SSLClientCertSubject field in order to specify client certificates. If SSLClientCert has a value, and SSLClientCertSubject is set, a search for a certificate is initiated. See SSLClientCertSubject for more information.
Designations of certificate stores are platform-dependent.
The following are designations of the most common User and Machine certificate stores in Windows:
MY | A certificate store holding personal certificates with their associated private keys. |
CA | Certifying authority certificates. |
ROOT | Root certificates. |
SPC | Software publisher certificates. |
In Java, the certificate store normally is a file containing certificates and optional private keys.
When the certificate store type is PFXFile, this property must be set to the name of the file. When the type is PFXBlob, the property must be set to the binary contents of a PFX file (for example, PKCS12 certificate store).
The type of key store containing the TLS/SSL client certificate.
This property can take one of the following values:
USER - default | For Windows, this specifies that the certificate store is a certificate store owned by the current user. Note that this store type is not available in Java. |
MACHINE | For Windows, this specifies that the certificate store is a machine store. Note that this store type is not available in Java. |
PFXFILE | The certificate store is the name of a PFX (PKCS12) file containing certificates. |
PFXBLOB | The certificate store is a string (base-64-encoded) representing a certificate store in PFX (PKCS12) format. |
JKSFILE | The certificate store is the name of a Java key store (JKS) file containing certificates. Note that this store type is only available in Java. |
JKSBLOB | The certificate store is a string (base-64-encoded) representing a certificate store in JKS format. Note that this store type is only available in Java. |
PEMKEY_FILE | The certificate store is the name of a PEM-encoded file that contains a private key and an optional certificate. |
PEMKEY_BLOB | The certificate store is a string (base64-encoded) that contains a private key and an optional certificate. |
PUBLIC_KEY_FILE | The certificate store is the name of a file that contains a PEM- or DER-encoded public key certificate. |
PUBLIC_KEY_BLOB | The certificate store is a string (base-64-encoded) that contains a PEM- or DER-encoded public key certificate. |
SSHPUBLIC_KEY_FILE | The certificate store is the name of a file that contains an SSH-style public key. |
SSHPUBLIC_KEY_BLOB | The certificate store is a string (base-64-encoded) that contains an SSH-style public key. |
P7BFILE | The certificate store is the name of a PKCS7 file containing certificates. |
PPKFILE | The certificate store is the name of a file that contains a PuTTY Private Key (PPK). |
XMLFILE | The certificate store is the name of a file that contains a certificate in XML format. |
XMLBLOB | The certificate store is a string that contains a certificate in XML format. |
The password for the TLS/SSL client certificate.
If the certificate store is of a type that requires a password, this property is used to specify that password to open the certificate store.
The subject of the TLS/SSL client certificate.
When loading a certificate the subject is used to locate the certificate in the store.
If an exact match is not found, the store is searched for subjects containing the value of the property. If a match is still not found, the property is set to an empty string, and no certificate is selected.
The special value "*" picks the first certificate in the certificate store.
The certificate subject is a comma separated list of distinguished name fields and values. For example, "CN=www.server.com, OU=test, C=US, [email protected]". The common fields and their meanings are shown below.
Field | Meaning |
CN | Common Name. This is commonly a host name like www.server.com. |
O | Organization |
OU | Organizational Unit |
L | Locality |
S | State |
C | Country |
E | Email Address |
If a field value contains a comma, it must be quoted.
The certificate to be accepted from the server when connecting using TLS/SSL.
If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine is rejected.
This property can take the following forms:
Description | Example |
A full PEM Certificate (example shortened for brevity) | -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv......Qw== -----END CERTIFICATE----- |
A path to a local file containing the certificate | C:\cert.cer |
The public key (example shortened for brevity) | -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq......AQAB -----END RSA PUBLIC KEY----- |
The MD5 Thumbprint (hex values can also be either space or colon separated) | ecadbdda5a1529c58a1e9e09828d70e4 |
The SHA1 Thumbprint (hex values can also be either space or colon separated) | 34a929226ae0819f2ec14b4a3d904f801cbb150d |
If not specified, any certificate trusted by the machine is accepted.
Use '*' to signify to accept all certificates. Note that this is not recommended due to security concerns.
This section provides a complete list of the SSH properties you can configure in the connection string for this provider.
Property | Description |
SSHAuthMode | The authentication method to be used to log on to an SFTP server. |
SSHClientCert | A private key to be used for authenticating the user. |
SSHClientCertPassword | The password of the SSHClientCert key if it has one. |
SSHClientCertSubject | The subject of the SSH client certificate. |
SSHClientCertType | The type of SSHClientCert private key. |
SSHServer | The SSH server. |
SSHPort | The SSH port. |
SSHUser | The SSH user. |
SSHPassword | The SSH password. |
SSHServerFingerprint | The SSH server fingerprint. |
UseSSH | Whether to tunnel the Cassandra connection over SSH. Use SSH. |
The authentication method to be used to log on to an SFTP server.
A private key to be used for authenticating the user.
SSHClientCert must contain a valid private key in order to use public key authentication. A public key is optional, if one is not included then the Sync App generates it from the private key. The Sync App sends the public key to the server and the connection is allowed if the user has authorized the public key.
The SSHClientCertType field specifies the type of the key store specified by SSHClientCert. If the store is password protected, specify the password in SSHClientCertPassword.
Some types of key stores are containers which may include multiple keys. By default the Sync App will select the first key in the store, but you can specify a specific key using SSHClientCertSubject.
The password of the SSHClientCert key if it has one.
This property is only used when authenticating to SFTP servers with SSHAuthMode set to PublicKey and SSHClientCert set to a private key.
The subject of the SSH client certificate.
When loading a certificate the subject is used to locate the certificate in the store.
If an exact match is not found, the store is searched for subjects containing the value of the property.
If a match is still not found, the property is set to an empty string, and no certificate is selected.
The special value "*" picks the first certificate in the certificate store.
The certificate subject is a comma separated list of distinguished name fields and values. For instance "CN=www.server.com, OU=test, C=US, [email protected]". Common fields and their meanings are displayed below.
Field | Meaning |
CN | Common Name. This is commonly a host name like www.server.com. |
O | Organization |
OU | Organizational Unit |
L | Locality |
S | State |
C | Country |
E | Email Address |
If a field value contains a comma it must be quoted.
The type of SSHClientCert private key.
This property can take one of the following values:
Types | Description | Allowed Blob Values |
MACHINE/USER | Blob values are not supported. | |
JKSFILE/JKSBLOB | base64-only | |
PFXFILE/PFXBLOB | A PKCS12-format (.pfx) file. Must contain both a certificate and a private key. | base64-only |
PEMKEY_FILE/PEMKEY_BLOB | A PEM-format file. Must contain an RSA, DSA, or OPENSSH private key. Can optionally contain a certificate matching the private key. | base64 or plain text. Newlines may be replaced with spaces when providing the blob as text. |
PPKFILE/PPKBLOB | A PuTTY-format private key created using the puttygen tool. | base64-only |
XMLFILE/XMLBLOB | An XML key in the format generated by the .NET RSA class: RSA.ToXmlString(true). | base64 or plain text. |
The SSH server.
The SSH server.
The SSH port.
The SSH port.
The SSH user.
The SSH user.
The SSH password.
The SSH password.
The SSH server fingerprint.
The SSH server fingerprint.
Whether to tunnel the Cassandra connection over SSH. Use SSH.
By default the Sync App will attempt to connect directly to Cassandra. When this option is enabled, the Sync App will instead establish an SSH connection with the SSHServer and tunnel the connection to Cassandra through it.
This section provides a complete list of the Firewall properties you can configure in the connection string for this provider.
Property | Description |
FirewallType | The protocol used by a proxy-based firewall. |
FirewallServer | The name or IP address of a proxy-based firewall. |
FirewallPort | The TCP port for a proxy-based firewall. |
FirewallUser | The user name to use to authenticate with a proxy-based firewall. |
FirewallPassword | A password used to authenticate to a proxy-based firewall. |
The protocol used by a proxy-based firewall.
This property specifies the protocol that the Sync App will use to tunnel traffic through the FirewallServer proxy.
Type | Default Port | Description |
TUNNEL | 80 | When this is set, the Sync App opens a connection to Cassandra and traffic flows back and forth through the proxy. |
SOCKS4 | 1080 | When this is set, the Sync App sends data through the SOCKS 4 proxy specified by FirewallServer and FirewallPort and passes the FirewallUser value to the proxy, which determines if the connection request should be granted. |
SOCKS5 | 1080 | When this is set, the Sync App sends data through the SOCKS 5 proxy specified by FirewallServer and FirewallPort. If your proxy requires authentication, set FirewallUser and FirewallPassword to credentials the proxy recognizes. |
The name or IP address of a proxy-based firewall.
This property specifies the IP address, DNS name, or host name of a proxy allowing traversal of a firewall. The protocol is specified by FirewallType: Use FirewallServer with this property to connect through SOCKS or do tunneling.
The TCP port for a proxy-based firewall.
This specifies the TCP port for a proxy allowing traversal of a firewall. Use FirewallServer to specify the name or IP address. Specify the protocol with FirewallType.
The user name to use to authenticate with a proxy-based firewall.
The FirewallUser and FirewallPassword properties are used to authenticate against the proxy specified in FirewallServer and FirewallPort, following the authentication method specified in FirewallType.
A password used to authenticate to a proxy-based firewall.
This property is passed to the proxy specified by FirewallServer and FirewallPort, following the authentication method specified by FirewallType.
This section provides a complete list of the Logging properties you can configure in the connection string for this provider.
Property | Description |
LogModules | Core modules to be included in the log file. |
Core modules to be included in the log file.
Only the modules specified (separated by ';') will be included in the log file. By default all modules are included.
See the Logging page for an overview.
This section provides a complete list of the Schema properties you can configure in the connection string for this provider.
Property | Description |
Location | A path to the directory that contains the schema files defining tables, views, and stored procedures. |
BrowsableSchemas | This property restricts the schemas reported to a subset of the available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC. |
Tables | This property restricts the tables reported to a subset of the available tables. For example, Tables=TableA,TableB,TableC. |
Views | Restricts the views reported to a subset of the available tables. For example, Views=ViewA,ViewB,ViewC. |
A path to the directory that contains the schema files defining tables, views, and stored procedures.
The path to a directory which contains the schema files for the Sync App (.rsd files for tables and views, .rsb files for stored procedures). The folder location can be a relative path from the location of the executable. The Location property is only needed if you want to customize definitions (for example, change a column name, ignore a column, and so on) or extend the data model with new tables, views, or stored procedures.
If left unspecified, the default location is "%APPDATA%\\CData\\Cassandra Data Provider\\Schema" with %APPDATA% being set to the user's configuration directory:
This property restricts the schemas reported to a subset of the available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.
Listing the schemas from databases can be expensive. Providing a list of schemas in the connection string improves the performance.
This property restricts the tables reported to a subset of the available tables. For example, Tables=TableA,TableB,TableC.
Listing the tables from some databases can be expensive. Providing a list of tables in the connection string improves the performance of the Sync App.
This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.
Specify the tables you want in a comma-separated list. Each table should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Tables=TableA,[TableB/WithSlash],WithCatalog.WithSchema.`TableC With Space`.
Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.
Restricts the views reported to a subset of the available tables. For example, Views=ViewA,ViewB,ViewC.
Listing the views from some databases can be expensive. Providing a list of views in the connection string improves the performance of the Sync App.
This property can also be used as an alternative to automatically listing views if you already know which ones you want to work with and there would otherwise be too many to work with.
Specify the views you want in a comma-separated list. Each view should be a valid SQL identifier with any special characters escaped using square brackets, double-quotes or backticks. For example, Views=ViewA,[ViewB/WithSlash],WithCatalog.WithSchema.`ViewC With Space`.
Note that when connecting to a data source with multiple schemas or catalogs, you will need to provide the fully qualified name of the table in this property, as in the last example here, to avoid ambiguity between tables that exist in multiple catalogs or schemas.
This section provides a complete list of the Miscellaneous properties you can configure in the connection string for this provider.
Property | Description |
AggregationsSupported | Whether or not to support aggregations in the Cassandra server. Note that in queries to the provider, you must use single quotes to define strings. |
AllowFiltering | When true, slow-performing queries are processed on the server. |
CaseSensitivity | Enable case sensitivity to the CQL sending to the server, if set to True, the identifiers in the CQL will be enclosed in double quotation marks. |
ConsistencyLevel | The consistency level determines how many of the replicas of the data you are interacting with need to respond for the query to be considered a success. |
FlattenArrays | By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. Set FlattenArrays to the number of elements you want to return from nested arrays. |
FlattenObjects | Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON. |
MaxRows | Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time. |
NullToUnset | Use unset instead of NULL in CQL query when performing INSERT operations. |
Other | These hidden properties are used only in specific use cases. |
Pagesize | The maximum number of results to return per page from Cassandra. |
PseudoColumns | This property indicates whether or not to include pseudo columns as columns to the table. |
QueryPassthrough | This option passes the query to the Cassandra server as is. |
RowScanDepth | The maximum number of rows to scan to look for the columns available in a table. |
Timeout | The value in seconds until the timeout error is thrown, canceling the operation. |
UseJsonFormat | Whether to submit and return the JSON encoding for CQL data types. |
UserDefinedViews | A filepath pointing to the JSON configuration file containing your custom views. |
VarintToString | Map Cassandra VARINT to String value. |
Whether or not to support aggregations in the Cassandra server. Note that in queries to the provider, you must use single quotes to define strings.
When true, slow-performing queries are processed on the server.
Cassandra by default does not allow filtering for queries that it predicts will have performance problems. These queries include filtering on a column that is not the primary key. When SupportEnhancedSQL is enabled, the Sync App parses queries in memory with the SQL engine, instead of offloading processing of the query to the database.
You can override the default behavior and rely on the server to process these queries by setting AllowFiltering to true.
Enable case sensitivity to the CQL sending to the server, if set to True, the identifiers in the CQL will be enclosed in double quotation marks.
By default, SQL is case-insensitive. However, Cassandra supports case-sensitive table and column names. Setting this property to True will enable you to retrieve tables and columns based on their case-sensitive names.
The consistency level determines how many of the replicas of the data you are interacting with need to respond for the query to be considered a success.
The consistency level determines how many of the replicas of the data you are interacting with need to respond for the query to be considered a success. You need to specify the appropriate replicas in the Server property.
Below are the possible values:
By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. Set FlattenArrays to the number of elements you want to return from nested arrays.
By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. This is only recommended for arrays that are expected to be short.
Set FlattenArrays to the number of elements you want to return from nested arrays. The specified elements are returned as columns. The zero-based index is concatenated to the column name. Other elements are ignored.
For example, you can return an arbitrary number of elements from an array of strings:
["FLOW-MATIC","LISP","COBOL"]When FlattenArrays is set to 1, the preceding array is flattened into the following table:
Column Name | Column Value |
languages_0 | FLOW-MATIC |
Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON.
Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON. The property name is concatenated onto the object name with an underscore to generate the column name.
For example, you can flatten the nested objects below at connection time:
[ { "grade": "A", "score": 2 }, { "grade": "A", "score": 6 }, { "grade": "A", "score": 10 }, { "grade": "A", "score": 9 }, { "grade": "B", "score": 14 } ]When FlattenObjects is set to true and FlattenArrays is set to 1, the preceding array is flattened into the following table:
Column Name | Column Value |
grades_0_grade | A |
grades_0_score | 2 |
Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time.
Limits the number of rows returned rows when no aggregation or group by is used in the query. This helps avoid performance issues at design time.
Use unset instead of NULL in CQL query when performing INSERT operations.
In Cassandra 2.2 and above, when executing an INSERT query, a parameter value can be set to unset. Cassandra does not consider unset field values which helps to avoid tombstones.
When NULL values are inserted, it is possible to reach the tombstone threshold limits which causes an exception to be thrown when querying the data. Setting this property to true and submitting unset values avoids these tombstones from being created.
Note: This option is only available on INSERT operations as Cassandra does not support changing existing values to unset.
These hidden properties are used only in specific use cases.
The properties listed below are available for specific use cases. Normal driver use cases and functionality should not require these properties.
Specify multiple properties in a semicolon-separated list.
DefaultColumnSize | Sets the default length of string fields when the data source does not provide column length in the metadata. The default value is 2000. |
ConvertDateTimeToGMT | Determines whether to convert date-time values to GMT, instead of the local time of the machine. |
RecordToFile=filename | Records the underlying socket data transfer to the specified file. |
The maximum number of results to return per page from Cassandra.
The Pagesize property affects the maximum number of results to return per page from Cassandra. Setting a higher value may result in better performance at the cost of additional memory allocated per page consumed.
This property indicates whether or not to include pseudo columns as columns to the table.
This setting is particularly helpful in Entity Framework, which does not allow you to set a value for a pseudo column unless it is a table column. The value of this connection setting is of the format "Table1=Column1, Table1=Column2, Table2=Column3". You can use the "*" character to include all tables and all columns; for example, "*=*".
This option passes the query to the Cassandra server as is.
When this is set, queries are passed through directly to Cassandra.
The maximum number of rows to scan to look for the columns available in a table.
The columns in a table must be determined by scanning table rows. This value determines the maximum number of rows that will be scanned.
Setting a high value may decrease performance. Setting a low value may prevent the data type from being determined properly, especially when there is null data.
The value in seconds until the timeout error is thrown, canceling the operation.
If Timeout = 0, operations do not time out. The operations run until they complete successfully or until they encounter an error condition.
If Timeout expires and the operation is not yet complete, the Sync App throws an exception.
Whether to submit and return the JSON encoding for CQL data types.
Cassandra 2.2 introduced a CQL extension that allows you to JSON-encode CQL data types. By default, you use the JSON syntax to manipulate data and SELECT statements return JSON through the Sync App. Set this property to false to use CQL literals to interact with Cassandra data.
The syntax for CQL literals has several differences from JSON. For example:
Format | Syntax | ||
CQL | INSERT INTO users (user_id, emails) VALUES(@user_id, @emails) | ||
Parameters | |||
user_id | frodo | ||
emails | {'[email protected]', '[email protected]'} | ||
JSON | INSERT INTO users (user_id, emails) VALUES (@user_id, @emails) | ||
Parameters | |||
user_id | frodo | ||
emails | ["[email protected]", "[email protected]"]) |
Note that in queries to the Sync App, you must use single quotes to define strings.
A filepath pointing to the JSON configuration file containing your custom views.
User Defined Views are defined in a JSON-formatted configuration file called UserDefinedViews.json. The Sync App automatically detects the views specified in this file.
You can also have multiple view definitions and control them using the UserDefinedViews connection property. When you use this property, only the specified views are seen by the Sync App.
This User Defined View configuration file is formatted as follows:
For example:
{ "MyView": { "query": "SELECT * FROM \"CData\".\"Sample\".Products WHERE MyColumn = 'value'" }, "MyView2": { "query": "SELECT * FROM MyTable WHERE Id IN (1,2,3)" } }Use the UserDefinedViews connection property to specify the location of your JSON configuration file. For example:
"UserDefinedViews", "C:\\Users\\yourusername\\Desktop\\tmp\\UserDefinedViews.json"
Map Cassandra VARINT to String value.
Map Cassandra VARINT to String value.