The Kafka-MarkLogic-Connector, written in Java, is a supported tool that uses the standard Kafka APIs and libraries to subscribe to Kafka topics and consume messages. The connector then uses the MarkLogic Data Movement SDK (DMSDK) to efficiently store those messages in a MarkLogic database. As messages stream onto a Kafka topic, the DMSDK's worker threads aggregate them and push them into the database whenever a configured batch size or timeout threshold is reached.
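As a sketch of how that batching behavior is configured, a connector properties file might look like the following. The property names and values here are illustrative assumptions based on typical Kafka Connect sink configuration; check the repository's documentation for the exact names and defaults in your connector version.

```properties
# Standard Kafka Connect sink settings (class name assumed; verify against the repo)
name=marklogic-sink
connector.class=com.marklogic.kafka.connect.sink.MarkLogicSinkConnector
topics=marklogic-topic

# MarkLogic connection details (illustrative values)
ml.connection.host=localhost
ml.connection.port=8000

# DMSDK batching: flush a batch of up to 100 messages,
# writing with 8 threads in parallel
ml.dmsdk.batchSize=100
ml.dmsdk.threadCount=8
```

Tuning `batchSize` and `threadCount` trades latency against throughput: larger batches amortize transaction overhead, while the timeout ensures a partially filled batch is still written promptly.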

All three components of the system (Kafka, MarkLogic, and the Kafka-MarkLogic-Connector) are designed to easily permit new servers to be added. New Kafka nodes can provide redundancy to prevent data loss; combined with MarkLogic’s ACID transactions, this makes the system highly reliable. New server nodes can also quickly and dynamically increase available bandwidth, so as resources are maxed out, each of the three components can be scaled independently to meet data flow requirements.

VISIT THE REPOSITORY

Learn More

Streaming Data into MarkLogic with the Kafka-MarkLogic Connector

Read about how Philip Barber’s tool can help you stream data from Kafka into MarkLogic easily and reliably.

Quickstart with the Kafka-MarkLogic-Connector in AWS

Walk through this tutorial to build a basic working version of this system in AWS using the kafka-marklogic-connector.

ZooKeeper and Kafka

Need more information about Apache Kafka and ZooKeeper? The Apache Kafka website has instructions for starting up ZooKeeper and Kafka. The tutorial assumes you are starting fresh and have no existing Kafka or ZooKeeper data.
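For orientation, the fresh-start sequence from the Apache Kafka quickstart looks roughly like this (the version number in the archive name is illustrative; substitute the release you downloaded):

```shell
# Extract the Kafka distribution (version shown is an example)
tar -xzf kafka_2.13-3.7.0.tgz
cd kafka_2.13-3.7.0

# Start ZooKeeper first; Kafka uses it for cluster coordination
bin/zookeeper-server-start.sh config/zookeeper.properties

# In a second terminal, start the Kafka broker
bin/kafka-server-start.sh config/server.properties

# Create a topic for the connector to consume from (topic name is an example)
bin/kafka-topics.sh --create --topic marklogic-topic \
  --bootstrap-server localhost:9092
```

With the broker running and the topic created, the connector can subscribe to it and begin writing messages into MarkLogic.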
