Kafka Connect Hands-on
What is Kafka Connect
Kafka Connect simplifies getting data from external sources into Kafka and from Kafka out to external sinks
Kafka Connect provides a set of connectors, written by other people, for very common sources and sinks
Kafka Connect Architecture

Kafka Connect Concepts
- A Kafka Connect cluster has multiple loaded connectors
  - Each connector is a reusable piece of code
  - Many connectors exist in the open source world
- Connectors + User Configuration = Tasks
  - A task is linked to a connector configuration
  - A connector configuration may spawn multiple tasks
- Tasks are executed by Kafka Connect workers
  - A worker is a single Java process
  - A worker can be standalone or in a cluster
- Standalone:
  - A single process runs your connectors and tasks
  - Configuration is bundled with your process
  - Very easy to get started
  - Not fault tolerant, not scalable, and hard to monitor
- Distributed:
  - Multiple workers run your connectors and tasks
  - Configuration is submitted using a REST API
  - Easy to scale and fault tolerant
  - Useful for production deployment of connectors
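
Submitting configuration over REST in distributed mode can be sketched roughly as follows. This is a minimal sketch, assuming a Connect worker listening on http://localhost:8083 (the default REST port) and assuming the FileStream source connector from Apache Kafka is available on that worker. The connector name demo-file-source, the file path, and the topic name are hypothetical placeholders, not values from these notes.

```python
import json
import urllib.request

# Assumption: a distributed Connect worker is reachable here
# (8083 is the default REST port).
CONNECT_URL = "http://localhost:8083"


def build_create_request(name, config, base_url=CONNECT_URL):
    """Build the POST /connectors request that registers a connector."""
    payload = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request; needs a live Connect worker, so it is not called here."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Hypothetical connector: the name, file path, and topic are placeholders.
demo_config = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/demo-input.txt",
    "topic": "demo-topic",
}

request = build_create_request("demo-file-source", demo_config)
print(request.get_method(), request.full_url)
```

Calling submit(request) against a running worker would register the connector. In standalone mode no such call is needed, because the same key/value settings are bundled with the process at startup.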

A Connect cluster automatically rebalances connector tasks across the remaining workers if a worker (typically a server) goes down
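
The effect of a rebalance can be illustrated with a toy simulation. This is only a sketch of the outcome, assuming a simple round-robin placement; it is not the rebalance protocol Kafka Connect actually uses, and the task and worker names are hypothetical.

```python
from collections import defaultdict


def assign_tasks(tasks, workers):
    """Spread tasks over the live workers round-robin (toy strategy)."""
    if not workers:
        raise ValueError("no live workers to run tasks")
    assignment = defaultdict(list)
    for index, task in enumerate(sorted(tasks)):
        assignment[workers[index % len(workers)]].append(task)
    return dict(assignment)


# Hypothetical cluster: two connectors expanded into five tasks, three workers.
tasks = ["jdbc-source-0", "jdbc-source-1", "s3-sink-0", "s3-sink-1", "s3-sink-2"]
workers = ["worker-a", "worker-b", "worker-c"]

before = assign_tasks(tasks, workers)

# worker-b goes down: the same tasks are redistributed over the survivors.
survivors = [w for w in workers if w != "worker-b"]
after = assign_tasks(tasks, survivors)

print("before:", before)
print("after: ", after)
```

Every task is still assigned after the failure; only the set of workers hosting them has changed.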