Advanced Topics Configuration

Changing a Topic Configuration

You can use the kafka-configs CLI to add or alter a topic config value.

Example:

kafka-configs --bootstrap-server localhost:9092 --entity-type topics --entity-name configured-topic --alter --add-config min.insync.replicas=2
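When the same change has to be applied to several topics, it can help to assemble the command in a script and review it before running it. The sketch below only builds the argument list for the example above. The helper name build_alter_command is hypothetical, and it assumes the tool is on the PATH as kafka-configs (some distributions ship it as kafka-configs.sh).

```python
import shlex


def build_alter_command(topic, configs, bootstrap_server="localhost:9092"):
    """Build the kafka-configs argv that adds or overrides topic configs.

    build_alter_command is a hypothetical helper, not part of Kafka.
    """
    if not configs:
        raise ValueError("at least one config entry is required")
    # kafka-configs accepts several overrides as comma-separated key=value pairs.
    pairs = ",".join(f"{key}={value}" for key, value in configs.items())
    return [
        "kafka-configs",
        "--bootstrap-server", bootstrap_server,
        "--entity-type", "topics",
        "--entity-name", topic,
        "--alter",
        "--add-config", pairs,
    ]


command = build_alter_command("configured-topic", {"min.insync.replicas": 2})
# Print the command for review; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Returning a list rather than a single string keeps each argument separate, so nothing needs shell quoting if the list is later handed to subprocess.run.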

Segment and Indexes

Topics are made of partitions, and partitions are made of segments. Each partition has only one active segment at a time: the one currently being written to.

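The active segment is closed and a new one is opened based on either size or time. The topic-level setting segment.bytes is the maximum size of a single segment (default 1 GB), and segment.ms is how long Kafka waits before rolling a segment that is not yet full (default 1 week).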
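As a rough illustration of the two triggers, the following sketch models the roll decision for an active segment. It is a simplified model, not Kafka's implementation: the names ActiveSegment and should_roll are hypothetical, and the defaults mirror the topic-level settings segment.bytes (1 GB) and segment.ms (1 week).

```python
from dataclasses import dataclass

ONE_GIB = 1024 ** 3
ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000


@dataclass
class ActiveSegment:
    size_bytes: int      # bytes already written to the segment
    created_at_ms: int   # when the segment was opened


def should_roll(segment, incoming_bytes, now_ms,
                segment_bytes=ONE_GIB, segment_ms=ONE_WEEK_MS):
    """Return True if a new active segment should be started (simplified model)."""
    # Size trigger: the next write would push the segment past its limit.
    too_big = segment.size_bytes + incoming_bytes > segment_bytes
    # Time trigger: the segment has been open for at least segment_ms.
    too_old = now_ms - segment.created_at_ms >= segment_ms
    return too_big or too_old


segment = ActiveSegment(size_bytes=900, created_at_ms=0)
# A 200-byte write against a 1000-byte limit trips the size trigger.
print(should_roll(segment, incoming_bytes=200, now_ms=1_000, segment_bytes=1_000))
# A 50-byte write still fits and the segment is young, so no roll.
print(should_roll(segment, incoming_bytes=50, now_ms=1_000, segment_bytes=1_000))
```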

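Each segment comes with two indexes: an offset-to-position index, which tells Kafka where in the segment file to read for a given offset, and a timestamp-to-offset index, which lets Kafka find messages by timestamp.

These help Kafka find messages in the correct segment.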

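Tune the segment settings to your throughput: a high-throughput topic fills segments quickly, while a low-throughput topic may rarely roll a segment on size alone.

The segment settings also bound how often log compaction can run, because only closed segments are compacted. Ask yourself how often you need log compaction to happen, and tune the segment settings accordingly.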

Log Cleanup Policies

Log cleanup means making data expire according to a policy.

This lets you control the size of the data on disk and limits maintenance work on the cluster.

Log Cleanup Delete

[Image: slide from Apache Kafka Series - Learn Apache Kafka for Beginners v3]
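To make the delete policy more concrete, here is a minimal sketch of how closed segments could be selected for deletion by age and by total size. It is a simplified model under stated assumptions: Segment and segments_to_delete are hypothetical names, the two limits stand in for the topic-level settings retention.ms and retention.bytes (-1 meaning no limit), and the active segment is assumed to be left out by the caller.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    size_bytes: int
    last_timestamp_ms: int  # timestamp of the newest record in the segment


def segments_to_delete(closed_segments, now_ms, retention_ms=-1, retention_bytes=-1):
    """Pick closed segments (oldest first) that a delete policy would drop.

    Simplified model: -1 disables a limit, and the active segment is assumed
    to be excluded by the caller.
    """
    remaining = sorted(closed_segments, key=lambda s: s.base_offset)
    deleted = []
    # Time-based retention: drop leading segments whose newest record is too old.
    while (retention_ms >= 0 and remaining
           and now_ms - remaining[0].last_timestamp_ms > retention_ms):
        deleted.append(remaining.pop(0))
    # Size-based retention: drop the oldest segments while what is left
    # would still hold at least retention_bytes.
    total = sum(s.size_bytes for s in remaining)
    while (retention_bytes >= 0 and remaining
           and total - remaining[0].size_bytes >= retention_bytes):
        total -= remaining[0].size_bytes
        deleted.append(remaining.pop(0))
    return deleted


segments = [Segment(0, 100, 1_000), Segment(10, 100, 5_000), Segment(20, 100, 9_000)]
expired = segments_to_delete(segments, now_ms=10_000, retention_ms=6_000)
print([s.base_offset for s in expired])
```

Deletion works on whole segments, never on individual records, which is why segment size and roll time affect how promptly old data actually disappears.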

Log Compaction Theory

[Image: slide from Apache Kafka Series - Learn Apache Kafka for Beginners v3]
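The core idea of log compaction, keeping at least the latest value for each key, can be sketched in a few lines. This is an illustrative model only: the function name compact and the (offset, key, value) tuple layout are assumptions, and tombstones (records with a null value) are dropped immediately here, whereas Kafka retains them for a configurable period before removing them.

```python
def compact(records):
    """Keep only the latest record per key, in offset order (simplified model).

    records: iterable of (offset, key, value) tuples sorted by offset.
    A value of None is treated as a tombstone and removes the key.
    """
    latest = {}
    for offset, key, value in records:
        # A later record for the same key replaces the earlier one.
        latest[key] = (offset, key, value)
    survivors = [rec for rec in latest.values() if rec[2] is not None]
    # Surviving records keep their original offsets and relative order.
    return sorted(survivors, key=lambda rec: rec[0])


log = [
    (0, "user-1", "a"),
    (1, "user-2", "b"),
    (2, "user-1", "c"),   # supersedes offset 0
    (3, "user-2", None),  # tombstone for user-2
]
print(compact(log))
```

Note that offsets are not renumbered: compaction removes superseded records but leaves the surviving ones at their original offsets.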

Mythbusting:

Note

Currently, you can't trigger log compaction with an API call.

[Image: slide from Apache Kafka Series - Learn Apache Kafka for Beginners v3]

Unclean Leader Election

unclean.leader.election.enable is false by default. Only turn it on if data loss is acceptable.

If all of your in-sync replicas (ISR) go offline but out-of-sync replicas are still up, you have two options: wait for an in-sync replica to come back online, or enable unclean leader election and let an out-of-sync replica become the leader. In the second case, any data that replica had not yet copied is lost.

Note

Generally, don't enable this.
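The trade-off described above can be modelled as a small decision function. This is a sketch of the rule, not Kafka's controller logic: elect_leader and its parameters are hypothetical names, and brokers are represented by plain integer ids.

```python
def elect_leader(replicas, in_sync, alive, unclean_election_enabled=False):
    """Pick a partition leader from broker ids (simplified model).

    replicas: assigned replica ids in preference order.
    in_sync: set of replica ids currently in the ISR.
    alive: set of broker ids that are online.
    Returns the chosen broker id, or None if the partition stays offline.
    """
    for broker in replicas:
        if broker in alive and broker in in_sync:
            return broker  # clean election: the new leader is fully caught up
    if unclean_election_enabled:
        for broker in replicas:
            if broker in alive:
                return broker  # unclean election: this replica may be missing records
    return None


# Brokers 1 and 2 are in sync but offline; only out-of-sync broker 3 is up.
print(elect_leader([1, 2, 3], in_sync={1, 2}, alive={3}))
print(elect_leader([1, 2, 3], in_sync={1, 2}, alive={3}, unclean_election_enabled=True))
```

With the default setting the partition stays unavailable until an in-sync replica returns. With unclean election enabled it becomes available again, at the cost of whatever the new leader had not yet replicated.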

Large Messages in Kafka

Note

Generally, don't send them. Large messages are considered inefficient and an anti-pattern.

[Image: slide from Apache Kafka Series - Learn Apache Kafka for Beginners v3]
