Apache Kafka

Version 24.2.9064



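You can use the Apache Kafka connector from the CData Sync application to capture data from Apache Kafka and move it to any supported destination. To do so, you need to add the connector, authenticate to the connector, and complete your connection.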

Note: With this connector, you can authenticate either to Kafka or to Azure Event Hubs.

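Add the Apache Kafka Connector

To enable Sync to use data from Apache Kafka, you first must add the connector, as follows:

  1. Open the Connections page of the Sync dashboard.

  2. Click Add Connection to open the Select Connectors page.

  3. Click the Sources tab and locate the Apache Kafka row.

  4. Click the Configure Connection icon at the end of that row to open the New Connection page. If the Configure Connection icon is not available, click the Download Connector icon to install the Apache Kafka connector. For more information about installing new connectors, see Connections.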

Authenticate to Apache Kafka

After you add the connector, you need to set the required properties.

  • Connection Name - Enter a connection name of your choice.

  • Bootstrap Servers - Enter the address of the Apache Kafka bootstrap servers to which you want to connect.
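Bootstrap servers are given in Server:Port form. The following minimal Python sketch checks such a value before you enter it. The function name parse_bootstrap_servers is hypothetical, and the comma-separated list of several servers is an assumption based on common Kafka client conventions rather than something this page confirms for Sync.

```python
def parse_bootstrap_servers(value: str) -> list[tuple[str, int]]:
    """Split a value such as 'broker1:9092,broker2:9092' into (host, port) pairs.

    Assumption: several servers are separated by commas, as in most Kafka clients.
    """
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last colon so that bracketed IPv6 hosts keep their colons.
        host, separator, port = entry.rpartition(":")
        if not separator or not host or not port.isdigit():
            raise ValueError(f"expected Server:Port, got {entry!r}")
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        servers.append((host, port_number))
    if not servers:
        raise ValueError("no bootstrap servers given")
    return servers
```

For example, parse_bootstrap_servers("broker1:9092, broker2:9093") yields two (host, port) pairs, while a value without a port raises ValueError.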

CData Sync supports authenticating to Apache Kafka in several ways. Select your authentication method below to proceed to the relevant section that contains the authentication details.

Anonymous

To connect without authentication, select None for Auth Scheme. No additional properties are required.

SASL Plain

To connect with SASL Plain authentication, specify these properties:

  • Auth Scheme - Select Plain.

  • User - Enter the username that you use to authenticate to your Kafka account.

  • Password - Enter the password that you use to authenticate to your Kafka account.

SCRAM

To connect with SCRAM (SHA-256) credentials, specify these properties:

  • Auth Scheme - Select SCRAM.

  • User - Enter the username that you use to authenticate to your Kafka account.

  • Password - Enter the password that you use to authenticate to your Kafka account.

SCRAM-SHA-512

To connect with SCRAM (SHA-512) credentials, specify these properties:

  • Auth Scheme - Select SCRAM-SHA-512.

  • User - Enter the username that you use to authenticate to your Kafka account.

  • Password - Enter the password that you use to authenticate to your Kafka account.

Kerberos

To connect with Kerberos, specify these properties:

  • Auth Scheme - Select Kerberos.

  • Kerberos SPN - Enter the service principal name (SPN) for the Kerberos domain controller.

  • Kerberos Service Name - Enter the name of the Kerberos service with which you want to authenticate.

  • Kerberos Keytab File (optional) - Enter the keytab file that contains your pairs of Kerberos principals and encrypted keys.

  • Use Kerberos Ticket Cache (optional) - Select True to use a ticket cache with the logged-in user instead of a keytab file. The default value is False.
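To verify credentials outside Sync, it can help to see how the schemes above correspond to the settings of a plain Kafka client. The sketch below builds a configuration dictionary with keys in the style used by librdkafka-based clients. The mapping from Auth Scheme values to SASL mechanism names and the function name build_client_config are assumptions for illustration, not a description of how Sync configures its own connection.

```python
# Assumed mapping from the Auth Scheme values in this section to the
# standard Kafka SASL mechanism names.
SASL_MECHANISMS = {
    "Plain": "PLAIN",
    "SCRAM": "SCRAM-SHA-256",
    "SCRAM-SHA-512": "SCRAM-SHA-512",
    "Kerberos": "GSSAPI",
}


def build_client_config(bootstrap_servers, auth_scheme, user=None, password=None,
                        kerberos_service_name=None, use_ssl=False):
    """Return a client configuration dictionary for the given Auth Scheme."""
    config = {"bootstrap.servers": bootstrap_servers}
    if auth_scheme == "None":
        # Anonymous access: no SASL properties at all.
        config["security.protocol"] = "SSL" if use_ssl else "PLAINTEXT"
        return config
    if auth_scheme not in SASL_MECHANISMS:
        raise ValueError(f"unsupported Auth Scheme: {auth_scheme!r}")
    mechanism = SASL_MECHANISMS[auth_scheme]
    config["security.protocol"] = "SASL_SSL" if use_ssl else "SASL_PLAINTEXT"
    config["sasl.mechanism"] = mechanism
    if mechanism == "GSSAPI":
        if not kerberos_service_name:
            raise ValueError("Kerberos requires a service name")
        config["sasl.kerberos.service.name"] = kerberos_service_name
    else:
        if user is None or password is None:
            raise ValueError(f"{auth_scheme} requires User and Password")
        config["sasl.username"] = user
        config["sasl.password"] = password
    return config
```

A dictionary built this way could be passed to a Kafka client library of your choice to confirm that the broker accepts the same user and password that you enter in Sync.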

Authenticate to Azure Event Hubs

After you add the connector, you need to set the required properties.

  • Connection Name - Enter a connection name of your choice.

  • Bootstrap Servers - Enter the address of the Apache Kafka bootstrap servers to which you want to connect.

CData Sync supports authenticating to Apache Kafka in several ways. Select your authentication method below to proceed to the relevant section that contains the authentication details.

SSL Certificate

To connect with a Secure Sockets Layer (SSL) client certificate, specify these properties:

  • Auth Scheme - Select SSLCertificate.

  • SSL Client Cert - Enter the SSL client certificate that is used to authenticate to the Apache Kafka broker.

  • SSL Client Cert Type - Select the format of the SSL client certificate that is used to connect to the Apache Kafka broker:

    • JKSFILE
    • PFXFILE
    • PEMKEY_FILE (default)
    • PEMKEY_BLOB
  • SSL Client Cert Password (optional) - Enter the password that is used to decrypt the SSL client certificate.

Azure Active Directory

To connect with Azure Active Directory (AD) credentials, specify the following properties:

  • Azure Tenant - Enter the Microsoft Online tenant that is used to access data. If you do not specify a tenant, Sync uses the default tenant.

  • OAuth Client Id - Enter the client Id that you were assigned when you registered your application with an OAuth authorization server.

  • OAuth Client Secret - Enter the client secret that you were assigned when you registered your application with an OAuth authorization server.

Azure Managed Service Identity

To use Azure Managed Service Identity (MSI) when CData Sync is running on an Azure virtual machine, select Azure MSI for Auth Scheme. No additional properties are required.

Azure Service Principal

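To connect with an Azure service principal and client secret, specify these properties:

  • Auth Scheme - Select Azure Service Principal.

  • Azure Tenant - Enter the Microsoft Online tenant to which you want to connect.

  • OAuth Client Id - Enter the client Id that you were assigned when you registered your application with an OAuth authorization server.

  • OAuth Client Secret - Enter the client secret that you were assigned when you registered your application with an OAuth authorization server.

To obtain the OAuth client Id and client secret for your application:

  1. Log in to the Azure portal.

  2. In the left navigation pane, select All services. Then, search for and select App registrations.

  3. Click New registration.

  4. Enter an application name and select Any Azure AD Directory - Multitenant. Set the redirect URI to the value that is specified for CallbackURL.

  5. After you create the application, copy the Application (client) ID value that is displayed in the Overview section. Use this value as the OAuth client Id.

  6. Navigate to the Certificates & secrets section and select New client secret for the application.

  7. Specify the duration and save the client secret. After you save it, the key value is displayed.

  8. Copy the value because it is displayed only once. Use this value as the OAuth client secret.

  9. On the Authentication tab, make sure that you select Access tokens (used for implicit flows).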
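To check that a client Id and client secret work before you enter them in Sync, you can request a token with the OAuth client-credentials grant of the Microsoft identity platform. The sketch below only builds the request and does not send it. The function name and all example values are hypothetical, the scope must be chosen for your own Event Hubs resource, and this page does not state which token flow Sync uses internally.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant, client_id, client_secret, scope):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical values for illustration only.
request = build_token_request(
    tenant="contoso.onmicrosoft.com",
    client_id="my-client-id",
    client_secret="my-client-secret",
    scope="https://eventhubs.azure.net/.default",  # assumed scope
)
```

Passing the request to urllib.request.urlopen would send it. A JSON response that contains an access_token field indicates that the tenant, client Id, and client secret are valid together.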

Azure Service Principal Certificate

To connect with an Azure service principal and client certificate, specify these properties:

  • Auth Scheme - Select AzureServicePrincipalCert.

  • Azure Tenant - Enter the Microsoft Online tenant that is used to access data. If you do not specify a tenant, Sync uses the default tenant.

  • OAuth Client Id - Enter the client Id that you were assigned when you registered your application with an OAuth authorization server.

Extracting Metadata from Topics

Reads in Apache Kafka do not have a natural stopping point. To avoid perpetual read operations, items are read until either the ReadDuration or the Timeout property expires. By default, ReadDuration is set to 30 seconds.
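The idea of a time-bounded read can be sketched in Python as follows. This is an illustration of the concept only, not the connector's implementation. The poll callable, the poll_timeout parameter, and the injectable clock are hypothetical, and the interaction with the Timeout property is not modeled.

```python
import time


def read_until_deadline(poll, read_duration=30.0, poll_timeout=1.0, clock=time.monotonic):
    """Collect messages from poll() until read_duration seconds have elapsed.

    poll(timeout) is a hypothetical callable that returns the next message,
    or None if no message arrived within the timeout.
    """
    deadline = clock() + read_duration
    rows = []
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            break  # the read window has expired, so stop even if more data exists
        # Never wait longer than the time left in the read window.
        message = poll(min(poll_timeout, remaining))
        if message is not None:
            rows.append(message)
    return rows
```

Because the loop stops on elapsed time rather than on an end-of-data signal, a longer read duration returns more rows from a busy topic, and a shorter one returns sooner.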

The Kafka driver models topics as tables and messages as rows, and it facilitates this behavior in two ways:

  • For services that contain a schema registry (for example, Confluent and instances hosted by Amazon Web Services), the schema is read directly from the schema registry.

  • For services that do not contain a schema registry, the driver infers the schema.

Schema Registry

The schema registry contains a list of topics that have registered schemas. The list of tables and columns is read directly from the schema registry.
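As a rough illustration of how a registered schema can translate into columns, the following sketch lists the fields of an Avro record schema as (column name, type) pairs. The function name and the simplified type handling are assumptions, and the column types that Sync actually reports might differ.

```python
import json


def columns_from_avro_schema(schema_json: str) -> list[tuple[str, str]]:
    """Return (column name, type name) pairs for an Avro record schema."""
    schema = json.loads(schema_json)
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    columns = []
    for field in schema["fields"]:
        field_type = field["type"]
        if isinstance(field_type, list):
            # A union such as ["null", "string"] describes a nullable column.
            non_null = [t for t in field_type if t != "null"]
            field_type = non_null[0] if len(non_null) == 1 else "union"
        if isinstance(field_type, dict):
            # Complex or logical types: prefer the logical type name when present.
            field_type = field_type.get("logicalType") or field_type["type"]
        columns.append((field["name"], str(field_type)))
    return columns


# Hypothetical schema for a topic named "orders".
ORDER_SCHEMA = json.dumps({
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "note", "type": ["null", "string"]},
        {"name": "created", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    ],
})
```

With this schema, the topic would appear as a table with the columns id, note, and created.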

To connect to a service with a schema registry, specify the following properties:

  • Bootstrap Servers - Enter the server (host name or IP address) and port (in the format Server:Port) of the Apache Kafka bootstrap servers.

  • Type Detection Scheme - Select SchemaRegistry.

  • Registry Auth Scheme - Select the appropriate authentication method.

  • Registry Service - Select the Schema Registry service that you want to use to read topic schemas. The options are Confluent and AWSGlue.

  • Registry Url - Enter the URL to the server for the schema registry.

Confluent Schema Registry

When you connect to Confluent Cloud, the registry URL corresponds to the Schema Registry endpoint value that is located in Schemas > Schema Registry > Instructions.

The Confluent schema registry supports several authentication options. Typically, Confluent Cloud deployments require that you set the Registry Auth Scheme property to Basic, along with a registry user and registry password. To find your user and password, navigate to Schemas > Schema Registry > API Access and locate the access-key and secret-key values.

On-premises deployments might not require authentication. In these configurations, set Registry Auth Scheme to None. Other deployments might require SSL client certificates. In that case, set Registry Auth Scheme to SSLCertificate and specify the Registry Client Cert and Registry Client Cert Type options.
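To confirm a registry URL and Basic credentials independently of Sync, you can query the Confluent Schema Registry REST API directly. The sketch below builds a request for the latest value schema of a topic without sending it. The function name and example values are hypothetical, and the subject name assumes the default naming convention in which the value schema of a topic is registered as topic-value.

```python
import base64
from urllib.parse import quote
from urllib.request import Request


def build_latest_schema_request(registry_url, topic, user, password):
    """Build (but do not send) a request for the latest value schema of a topic."""
    subject = f"{topic}-value"  # assumed default subject naming
    url = f"{registry_url.rstrip('/')}/subjects/{quote(subject, safe='')}/versions/latest"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={
        "Authorization": f"Basic {token}",
        "Accept": "application/vnd.schemaregistry.v1+json",
    })


# Hypothetical endpoint and keys for illustration only.
request = build_latest_schema_request(
    "https://registry.example.com", "orders", "access-key", "secret-key"
)
```

Sending the request with urllib.request.urlopen returns a JSON document whose schema field holds the registered schema. An HTTP 401 response suggests that the access key or secret key is wrong.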

Amazon Web Services (AWS) Glue Schema Registry

When you connect to AWS Glue, the registry URL corresponds to the Amazon Resource Name (ARN) value of the registry.

The AWS Glue schema registry only supports the Basic registry auth scheme. You should set Registry User and Registry Password, respectively, to the access key and secret key of a user that has access to the registry.

No-Schema Registry

To connect to a service without a schema registry, specify the following properties:

  • Bootstrap Servers - Enter the server (host name or IP address) and port (in the format Server:Port) of the Apache Kafka bootstrap servers.

  • Type Detection Scheme - Select RowScan.

For schema discovery, the Sync application attempts to detect the format (AVRO/JSON/XML/CSV) automatically. You can also set the format explicitly with the Serialization Format property.

After Sync reads the format, it analyzes the rows from the topic. If you want increased accuracy, set a higher value for the Row Scan Depth property. Be aware, though, that setting a higher row-scan depth might decrease performance.

Sync then begins reading at the current offset, which you can configure with the Offset Reset Strategy property. From this point, future SELECT statements will start from the beginning.

Sync completes schema discovery by performing deserialization (based on the determined serialization format).
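The trade-off behind Row Scan Depth can be illustrated with a small type-inference sketch over JSON messages. This is not Sync's algorithm. The function names, the type names, the widening rules, and the default depth of 100 are all assumptions chosen for illustration.

```python
import json


def _type_of(value):
    """Map a decoded JSON value to a simple column type name."""
    if value is None:
        return None  # a null carries no type information
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    return "string"  # strings and nested values fall back to string


def _widen(current, observed):
    """Combine the type seen so far with a newly observed type."""
    if observed is None or observed == current:
        return current
    if current is None:
        return observed
    if {current, observed} == {"integer", "double"}:
        return "double"
    return "string"  # conflicting types fall back to string


def infer_columns(raw_messages, row_scan_depth=100):
    """Infer column types from the first row_scan_depth JSON messages."""
    columns = {}
    for raw in raw_messages[:row_scan_depth]:
        for name, value in json.loads(raw).items():
            columns[name] = _widen(columns.get(name), _type_of(value))
    return {name: kind or "string" for name, kind in columns.items()}


messages = [
    '{"id": 1, "price": 2}',
    '{"id": 2, "price": 2.5}',
    '{"id": "x3", "price": 3}',
]
shallow = infer_columns(messages, row_scan_depth=2)
deep = infer_columns(messages, row_scan_depth=3)
```

Scanning only two rows reports id as an integer, while scanning all three reveals that id must be a string. A deeper scan is more accurate in this sense, at the cost of reading more messages before the schema is available.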

Complete Your Connection

To complete your connection:

  1. Specify the following properties:

    • Type Detection Scheme - Select a detection-scheme type to specify how Sync scans data to determine the fields and data types for the topic.

      • None - Returns all columns as string type.

      • RowScan - Scans rows in order to determine the data type heuristically.

      • SchemaRegistry - Uses the Schema Registry API and a list of predefined AVRO schemas to determine the data types. For this scheme type, specify the following properties:

        • Registry Url - Enter the URL to the server for the schema registry. When you specify this property, Sync reads the Apache Kafka schema from the server.

        • Registry Service - Select the Schema Registry service that you want to use for working with topic schemas.

          • Confluent (default)

          • AWSGlue

        • Registry Auth Scheme - Select the scheme that you want to use to authenticate to the schema registry.

          • Auto (default) - This scheme enables Sync to choose the scheme automatically based on the other connection properties that you set.

          • None - This scheme specifies that no authentication is used.

          • Basic - When you select this scheme, you must specify values for Registry User and Registry Password.

            • Registry User - Enter the username that you use to authenticate to the server that is specified in the registry URL.

            • Registry Password - Enter the password that you use to authenticate to the server that is specified in the registry URL.

          • SSLCertificate - This scheme specifies that SSL client-certificate authentication should be used. For this setting, you must specify the following:

            • Registry Client Cert - Enter the TLS/SSL client certificate store for SSL client authentication (two-way SSL) with the schema registry.

            • Registry Client Cert Password (optional) - Enter the password for your TLS/SSL client certificate.

            • Registry Client Cert Subject (optional) - Enter the subject of your TLS/SSL client certificate.

        • Registry Type (optional) - Select a registry type for the specific topic:

          • Auto (default) - When you select Auto, Sync attempts to detect the valid schema registry type for the selected topic.

          • JSON

          • AVRO

      • MessageOnly - Pushes all information as a single aggregate value on a column named Message.

    • Use SSL (optional) - Specify whether you want to use the Secure Sockets Layer (SSL) protocol. The default value is False.

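  2. Define advanced connection settings on the Advanced tab. (In most cases, though, you should not need these settings.)

  3. If you authenticate with AzureAD, click Connect to Apache Kafka to connect to your Apache Kafka account.

  4. Click Create & Test to create your connection.

More Information

For more information about interactions between CData Sync and Apache Kafka, see Apache Kafka Connector for CData Sync.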