Apache Kafka Connector Setup

Version 22.0.8473


The Apache Kafka connector integrates Apache Kafka into your data flow by pushing data to, or pulling data from, Apache Kafka. Follow the steps below to connect CData Arc to Apache Kafka.

Establish a Connection

To allow Arc to use data from Apache Kafka, you must first establish a connection to Apache Kafka. There are two ways to establish this connection:

  • Add an Apache Kafka connector to your flow. Then, in the settings pane, click Create next to the Connection drop-down list.
  • Open the Arc Settings page, then open the Connections tab. Click Add, select Apache Kafka, and click Next.

Note:

  • The login process is required only the first time the connection is created.
  • Connections to Apache Kafka can be reused across multiple Apache Kafka connectors.

Enter Connection Settings

After opening a new connection dialog, follow these steps:

  1. Provide the requested information:

    • Name — The static name of the connection. Set this as desired.
    • Type — This is always set to Apache Kafka.
    • Auth Scheme — The authorization scheme to use for the connection. Options are Auto, None, Plain, Scram, and Kerberos.
    • User — (All schemes except None) The Apache Kafka username to use for logging in.
    • Password — (All schemes except None) The password for the user entered above.
    • Bootstrap Servers — The host/port pair to use for establishing the initial connection to Apache Kafka. If you are connecting to Confluent Cloud, you can find this on the Cluster settings.
  2. If needed, click Advanced to open the drop-down menu of advanced connection settings. These should not be needed in most cases.

  3. Click Test Connection to ensure that Arc can connect to Apache Kafka with the provided information. If an error occurs, check all fields and try again.

  4. Click Add Connection to finalize the connection.

  5. In the Connection drop-down list of the connector configuration pane, select the newly created connection.

  6. In the Topic field, enter the Apache Kafka topic that you want to target.

  7. Click Save Changes.
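
The field rules above can be sketched as a small pre-flight check to run before entering values in the dialog. This is an illustrative sketch only, not part of Arc: the KafkaConnectionSettings class and the parse_bootstrap_servers and validate functions are hypothetical names, and the comma-separated host:port format for Bootstrap Servers is an assumption taken from the usual Kafka convention rather than from the dialog itself.

```python
from dataclasses import dataclass

# Auth schemes listed in the connection dialog.
AUTH_SCHEMES = {"Auto", "None", "Plain", "Scram", "Kerberos"}


@dataclass
class KafkaConnectionSettings:
    """Hypothetical container mirroring the dialog fields; not an Arc API."""
    name: str
    bootstrap_servers: str
    auth_scheme: str = "Auto"
    user: str = ""
    password: str = ""


def parse_bootstrap_servers(value):
    """Split 'host:port[,host:port...]' into (host, port) tuples.

    Assumption: entries are comma-separated, as in Kafka's own
    bootstrap.servers setting. IPv6 literals are not handled.
    """
    pairs = []
    for entry in value.split(","):
        host, sep, port = entry.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


def validate(settings):
    """Return a list of problems; an empty list means the fields look complete."""
    problems = []
    if settings.auth_scheme not in AUTH_SCHEMES:
        problems.append(f"unknown auth scheme: {settings.auth_scheme}")
    elif settings.auth_scheme != "None":
        # User and Password apply to every scheme except None.
        if not settings.user:
            problems.append("User is required for this auth scheme")
        if not settings.password:
            problems.append("Password is required for this auth scheme")
    try:
        parse_bootstrap_servers(settings.bootstrap_servers)
    except ValueError as exc:
        problems.append(f"Bootstrap Servers: {exc}")
    return problems
```

A check like this only confirms that the fields are filled in consistently. Test Connection remains the way to confirm that Arc can actually reach the broker.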

Select an Action

After establishing a connection to Apache Kafka, you must choose the action that the Apache Kafka connector will perform. The table below outlines each action and where it belongs in a CData Arc flow:

Action  | Description | Position in Flow
Produce | Accepts input data from a file or another connector and sends it to Apache Kafka. | End
Consume | Checks the queue for messages and sends any data that it gets down the flow through the Output path. | Middle

Produce

The Produce action sends input data to Apache Kafka. This data can come from other connectors or from files that you manually upload to the Input tab of the Apache Kafka connector. The Apache Kafka connector sends the input data to the topic you entered in the Topic field of the Configuration section.

Consume

The Consume action checks for messages in the Apache Kafka queue for the topic you entered in the Topic field of the Configuration section. You must set the following fields for this action:

  • Consumer Group ID: Specifies which group the consumers created by the connector should belong to.
  • Read Duration: The length of time (in seconds) that the connector will wait for messages to arrive. The connector will wait the full duration, regardless of the number of messages received.

Data retrieved by the Consume action appears on the Output tab and then passes to the next step of the Arc flow.
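
The Read Duration behavior described above (wait the full duration even if messages have already arrived) can be illustrated with a minimal sketch. It uses an in-memory queue from the Python standard library as a stand-in for a Kafka topic; the consume_for function is a hypothetical name and does not reflect how the connector is implemented.

```python
import queue
import time


def consume_for(source, read_duration):
    """Collect messages from `source` until `read_duration` seconds have elapsed.

    The loop never exits early: it keeps polling until the deadline,
    no matter how many messages have already arrived.
    """
    deadline = time.monotonic() + read_duration
    messages = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            # Block for at most the time left before the deadline.
            messages.append(source.get(timeout=remaining))
        except queue.Empty:
            continue  # nothing arrived; the next pass re-checks the deadline
    return messages
```

Under this model, a longer Read Duration collects more late-arriving messages per run, at the cost of each run always taking the full duration.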