JDBC Driver for Apache Kafka

Build 23.0.8839

Data Model

Tables

The CData JDBC Driver for Apache Kafka dynamically models Apache Kafka topics as tables. A complete list of discovered topics can be obtained from the sys_tables system table.

SELECTing from a topic returns the existing messages on the topic, as well as live messages posted before the number of seconds specified by ReadDuration has elapsed.
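
For example, a minimal JDBC sketch of such a query. The jdbc:apachekafka: URL prefix and the BootstrapServers property are assumptions about the connection string; ReadDuration is the documented property, set here to keep the query listening for live messages for 20 seconds:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TopicSelect {
    public static void main(String[] args) throws Exception {
        // URL prefix and BootstrapServers are assumptions; adjust for your broker.
        String url = "jdbc:apachekafka:BootstrapServers=localhost:9092;ReadDuration=20;";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM SampleTopic")) {
            // Existing messages return first; the query then stays open for
            // live messages until ReadDuration elapses.
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}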

Stored Procedures

Stored procedures are function-like interfaces to Apache Kafka. They can be used to create schema files, commit messages, and more.

Consumer Groups

Connections that the driver makes to Apache Kafka are always part of a consumer group. You can control the consumer group by setting a value for the ConsumerGroupId connection property. Using the same consumer group ID across multiple connections puts those connections into the same consumer group. The driver generates a random consumer group ID if one is not provided.
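
For example, a brief sketch of two connections sharing one group. ConsumerGroupId is the documented property; the URL prefix and BootstrapServers are assumptions:

// Both connections join the consumer group "reporting" and therefore share
// its committed offsets.
String url = "jdbc:apachekafka:BootstrapServers=localhost:9092;ConsumerGroupId=reporting;";
Connection c1 = DriverManager.getConnection(url);
Connection c2 = DriverManager.getConnection(url); // same consumer group as c1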

All members of a consumer group share an offset that determines which messages are read next within each topic and partition. The driver supports two ways of updating the offset:

  • If AutoCommit is enabled, the driver periodically commits the offset for any topics and partitions that have been read by SELECT queries. The exact interval is determined by the auto-commit properties in the native library. See ConsumerProperties for details on how to configure these properties.
  • The CommitOffset stored procedure commits the offset of the last item read by the current query. Note that this must be called while the query resultset is still open, as shown in the sketch after this list. The driver resets the offset when the resultset is closed.
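
A hedged sketch of the manual approach. CommitOffset, AutoCommit, and ConsumerGroupId are documented names; the URL prefix, BootstrapServers, and the JDBC call-escape syntax are assumptions about how the driver exposes stored procedures:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ManualCommit {
    public static void main(String[] args) throws Exception {
        // AutoCommit=false disables periodic commits so that CommitOffset
        // controls the offset explicitly (property syntax is an assumption).
        String url = "jdbc:apachekafka:BootstrapServers=localhost:9092;"
                + "ConsumerGroupId=reporting;AutoCommit=false;";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM SampleTopic LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
            // Commit while rs is still open; the driver resets the offset
            // once the resultset is closed.
            try (CallableStatement commit = conn.prepareCall("{call CommitOffset()}")) {
                commit.execute();
            }
        }
    }
}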

If there is no existing offset, the driver uses the OffsetResetStrategy to determine what the offset should be. This can happen if the broker does not recognize the consumer group or if the consumer group has never committed an offset.
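
For example, a sketch of setting the strategy in the connection string. OffsetResetStrategy is the documented property; the value "Earliest" is an assumption mirroring Kafka's auto.offset.reset values, and the other connection-string details are assumptions as well:

// A new consumer group with no committed offset would start at the beginning
// of each topic rather than at the newest message.
String url = "jdbc:apachekafka:BootstrapServers=localhost:9092;"
        + "ConsumerGroupId=new-group;OffsetResetStrategy=Earliest;";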

Bulk Messages

The driver supports reading bulk messages from topics using the CSV, JSON, or XML SerializationFormat. For example, if a single message contained this CSV data, the driver would expand it into three rows:

"1","alpha"
"2","beta"
"3","gamma"

Apache Kafka does not natively support bulk messages, which can lead to rows being skipped in some circumstances. For example:

  1. A driver connection is created with ConsumerGroupId=x.
  2. The connection executes the query SELECT * FROM topic LIMIT 3.
  3. The connection commits its offset and closes.
  4. Another connection is created with the same ConsumerGroupId.
  5. The connection executes the query SELECT * FROM topic.

Consider what happens if this procedure is performed on the following topic. The first connection consumes all rows from the first message and one row from the second. However, the driver has no way to report to Apache Kafka that only part of the second message was read. This means that step 3 commits offset 3, and the second connection starts on row 5, skipping row 4 (see the sketch after the listing).

"row 1"
"row 2"
/* End of message 1 */

"row 3"
"row 4"
/* End of message 2 */

"row 5"
"row 6"
/* End of message 3 */
