Kafka Internals

Cluster Membership

Controller

Physical Storage

Log Compaction

Request Handling

References

Flashcards

What are the two types of indexes maintained by Kafka for each partition?:: Offset index (mapping offsets to physical positions) and Timestamp index (mapping timestamps to offsets)
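
The offset index lookup can be sketched as a binary search over a sparse, sorted list. This is only an illustration: `OFFSET_INDEX` and `lookup_position` are hypothetical names, and the entries are made-up values rather than Kafka's on-disk index format.

```python
from bisect import bisect_right

# Hypothetical sparse offset index for one segment: (offset, byte position)
# pairs sorted by offset. The values are invented for illustration.
OFFSET_INDEX = [(0, 0), (40, 4096), (85, 8192), (130, 12288)]


def lookup_position(index, target_offset):
    """Return the byte position of the greatest indexed offset <= target_offset."""
    offsets = [entry[0] for entry in index]
    i = bisect_right(offsets, target_offset) - 1
    if i < 0:
        raise KeyError(f"offset {target_offset} precedes this segment")
    return index[i][1]
```

Because the index is sparse, the returned position is only a starting point: a reader would scan forward in the segment from there until it reaches the requested offset.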

How does Kafka prevent split-brain scenarios with controllers?:: It uses controller epoch numbers: each newly elected controller gets a higher epoch, and brokers ignore messages from any controller with an older epoch
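
A minimal sketch of epoch-based fencing, assuming a simplified broker that only remembers the highest controller epoch it has seen. The `Broker` class and `handle_controller_request` method are hypothetical and do not mirror Kafka's actual request handling code.

```python
class Broker:
    """Hypothetical broker that tracks the highest controller epoch seen so far."""

    def __init__(self):
        self.highest_epoch = 0

    def handle_controller_request(self, epoch, command):
        # A request carrying an older epoch comes from a stale controller
        # and is ignored.
        if epoch < self.highest_epoch:
            return False
        self.highest_epoch = epoch
        # A real broker would apply `command` here; this sketch only records the epoch.
        return True
```

Once a request with epoch 2 has been accepted, a late request from the previous controller (epoch 1) returns `False` and changes nothing.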

What happens when a broker loses its ZooKeeper connection?:: Its ephemeral node is automatically removed once the session expires, but its broker ID still exists in other data structures, such as the replica list of each topic

What are the two portions of a compacted partition?:: Clean portion (previously compacted messages) and Dirty portion (messages written after last compaction)

What triggers log compaction in Kafka?:: The ratio of dirty records reaching the configured threshold (min.cleanable.dirty.ratio, default 0.5); only closed segments are eligible, because the active segment is never compacted
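
The trigger and the effect of compaction can be sketched as follows. This is a simplified model under stated assumptions: the ratio is measured in bytes, the boundary case (ratio exactly equal to the threshold) is treated as eligible, and tombstones are not modeled. The function names are hypothetical.

```python
def dirty_ratio(clean_bytes, dirty_bytes):
    """Fraction of the log (by size) that has not been compacted yet."""
    total = clean_bytes + dirty_bytes
    return dirty_bytes / total if total else 0.0


def should_compact(clean_bytes, dirty_bytes, min_cleanable_dirty_ratio=0.5):
    # Assumption of this sketch: reaching the threshold exactly counts as eligible.
    return dirty_ratio(clean_bytes, dirty_bytes) >= min_cleanable_dirty_ratio


def compact(records):
    """Keep only the latest (offset, key, value) record per key, in offset order."""
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, key, value)  # later offsets overwrite earlier ones
    return sorted(latest.values())
```

For example, compacting `[(0, "a", "1"), (1, "b", "2"), (2, "a", "3")]` drops the record at offset 0, because key `"a"` has a newer value at offset 2.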

What is the role of the Kafka Controller?:: Responsible for partition leader election, monitoring broker failures, and maintaining cluster metadata

How does Kafka handle broker registration in a cluster?:: Each broker registers itself by creating an ephemeral node in ZooKeeper (under /brokers/ids) named after its unique broker ID

What happens if a Kafka index becomes corrupted?:: It can be safely regenerated from the matching log segment by rereading the messages
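
Regenerating an index by rereading a segment can be sketched like this. It is a simplified model, not Kafka's recovery code: records are represented as hypothetical `(offset, size_in_bytes)` pairs, and this sketch always indexes the first record, which may differ from the broker's real behavior. The default interval of 4096 bytes follows the `index.interval.bytes` setting.

```python
def rebuild_offset_index(records, index_interval_bytes=4096):
    """Rebuild a sparse (offset, byte position) index from records in log order."""
    index = []
    position = 0
    # Start at the interval so the first record always gets an entry (sketch choice).
    bytes_since_last_entry = index_interval_bytes
    for offset, size in records:
        if bytes_since_last_entry >= index_interval_bytes:
            index.append((offset, position))
            bytes_since_last_entry = 0
        position += size
        bytes_since_last_entry += size
    return index
```

Since every index entry is derived purely from the records in the segment, deleting a corrupted index loses no information; it only costs the time needed to reread the segment.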