In-Sync Replicas Management
In-sync replicas (ISR) are crucial for Kafka's reliability guarantees. A replica is considered in-sync if it:
- Maintains an active ZooKeeper session (in ZooKeeper-based clusters)
- Fetches messages from the leader regularly
- Has caught up to the leader's most recent messages within replica.lag.time.max.ms
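The conditions above can be sketched as a small check. This is a simplified model for illustration, not Kafka's actual implementation: `FollowerState`, `is_in_sync`, and their fields are hypothetical names, and the 30-second default is the value listed for replica.lag.time.max.ms in this note.

```python
from dataclasses import dataclass


@dataclass
class FollowerState:
    """Hypothetical, simplified view of what a leader knows about a follower."""
    session_active: bool     # broker is still registered (e.g. ZooKeeper session alive)
    last_caught_up_ms: int   # last time the follower had fetched up to the leader's log end


def is_in_sync(follower: FollowerState, now_ms: int,
               replica_lag_time_max_ms: int = 30_000) -> bool:
    """Return True if the follower would count as in-sync in this simplified model."""
    if not follower.session_active:
        return False
    # The follower must have been fully caught up within the allowed window.
    return now_ms - follower.last_caught_up_ms <= replica_lag_time_max_ms


now = 100_000
healthy = FollowerState(session_active=True, last_caught_up_ms=95_000)
slow = FollowerState(session_active=True, last_caught_up_ms=60_000)
dead = FollowerState(session_active=False, last_caught_up_ms=99_000)

print([is_in_sync(f, now) for f in (healthy, slow, dead)])  # [True, False, False]
```

The slow follower is still connected and still fetching, but it was last caught up 40 seconds ago, so it falls out of the ISR in this model.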
Key Configurations
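- zookeeper.session.timeout.ms
  - Time without a heartbeat before ZooKeeper expires the broker's session and the broker is considered dead
  - Default: 18 seconds (18000 ms, since 2.5.0)
  - Higher values tolerate transient network or GC pauses, which helps stability in cloud environments, at the cost of slower failure detection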
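- replica.lag.time.max.ms
  - Maximum time a follower can lag behind the leader before it is removed from the ISR
  - Default: 30 seconds (30000 ms, since 2.5.0)
  - Affects consumer latency: a message becomes visible to consumers only after every ISR member has replicated it, so a slow follower can delay delivery until it is dropped from the ISR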
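- min.insync.replicas
  - Minimum number of in-sync replicas that must acknowledge a write
  - Enforced only for producers using acks=all; together they give the strongest durability
  - If the ISR shrinks below this value, acks=all writes are rejected and the partition is effectively read-only (consumers can still read)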
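The interaction between min.insync.replicas and acks can be sketched as follows. This is a minimal model of the leader-side check, not Kafka's code: `try_append` and the `NotEnoughReplicas` class are hypothetical names that stand in for the broker's behavior of rejecting acks=all writes when the ISR is too small.

```python
class NotEnoughReplicas(Exception):
    """Hypothetical stand-in for the error a producer sees when the ISR is too small."""


def try_append(isr_size: int, min_insync_replicas: int, acks: str) -> str:
    """Model the leader's decision to accept or reject a produce request."""
    # The minimum is only enforced when the producer asks for acks=all.
    if acks == "all" and isr_size < min_insync_replicas:
        raise NotEnoughReplicas(
            f"ISR size {isr_size} is below min.insync.replicas={min_insync_replicas}"
        )
    return "appended"


# Replication factor 3 with min.insync.replicas=2 tolerates one replica dropping out.
print(try_append(isr_size=2, min_insync_replicas=2, acks="all"))  # appended

# With only the leader left in the ISR, acks=all writes are rejected...
try:
    try_append(isr_size=1, min_insync_replicas=2, acks="all")
except NotEnoughReplicas as err:
    print("rejected:", err)

# ...while acks=1 writes are still accepted, with weaker durability.
print(try_append(isr_size=1, min_insync_replicas=2, acks="1"))  # appended
```

A common pairing is a replication factor of 3 with min.insync.replicas=2, so that one replica can fail without blocking acks=all producers.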
Flashcards
What makes a replica considered "in-sync"?:: Active ZooKeeper session, regular fetches from leader, and caught up to recent messages
What happens when available replicas fall below min.insync.replicas?:: Writes from producers using acks=all are rejected, so the partition is effectively read-only (consumers can still read)
What is the purpose of replica.lag.time.max.ms?:: Defines maximum time a follower can lag behind leader before being removed from ISR