High Watermark and Message Visibility
High Watermark Concept
The high watermark (HW) is the offset up to which a partition's messages have been replicated to every in-sync replica (ISR). In Kafka it:
- Controls message visibility to consumers
- Ensures consistency across replicas
- Prevents reading uncommitted messages
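As a rough illustration of the idea (a toy model, not Kafka's broker code), the HW can be thought of as the smallest log end offset among the in-sync replicas. The function names and the replica names below are hypothetical and exist only for this sketch.

```python
def high_watermark(log_end_offsets, isr):
    """Return the smallest log end offset among the in-sync replicas."""
    return min(log_end_offsets[replica] for replica in isr)


def readable_offsets(hw):
    """Offsets a consumer may fetch: everything strictly below the HW."""
    return range(0, hw)


# Log end offset (next offset to be written) per replica. Hypothetical values.
leo = {"leader": 10, "follower-1": 8, "follower-2": 5}

# With all three replicas in the ISR, the slowest one bounds the HW.
hw_all = high_watermark(leo, isr={"leader", "follower-1", "follower-2"})

# If follower-2 falls out of the ISR, it no longer holds the HW back.
hw_shrunk = high_watermark(leo, isr={"leader", "follower-1"})

print(hw_all, hw_shrunk, list(readable_offsets(hw_all)))
```

In this model, shrinking the ISR lets the HW move forward, which is why a lagging follower does not block consumers indefinitely.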
Message Visibility Rules
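- Producer Side
  - Messages are written to the partition leader
  - Followers replicate from the leader; the acks setting (0, 1, or all) only controls when the producer is acknowledged
  - HW advances once all ISR members have replicated the message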
- Consumer Side
  - Can only read up to HW
  - With acks=1, a message already acknowledged to the producer may stay invisible until HW passes it
  - All consumers see the same committed prefix of the log
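The visibility rules above can be sketched with a small in-memory model. This is a simplification under stated assumptions: `ToyPartition` and its methods are hypothetical, all followers are assumed to be in the ISR, and `acks="all"` is modeled by replicating synchronously, whereas in real Kafka followers pull from the leader and the producer simply waits.

```python
class ToyPartition:
    """In-memory stand-in for one partition; every follower is in the ISR."""

    def __init__(self, followers):
        self.log = []  # the leader's log
        self.follower_leo = {name: 0 for name in followers}

    @property
    def high_watermark(self):
        # Smallest log end offset across the leader and all followers.
        return min([len(self.log), *self.follower_leo.values()])

    def replicate(self, follower):
        # The follower catches up to the leader's current log end.
        self.follower_leo[follower] = len(self.log)

    def produce(self, record, acks):
        self.log.append(record)
        if acks == "all":
            # Modeled as: the ack is returned only after the ISR has caught up.
            for follower in self.follower_leo:
                self.replicate(follower)
        # With acks="1" the ack is returned right after the leader write.
        return len(self.log) - 1

    def consume(self):
        # Consumers only ever see records below the high watermark.
        return self.log[: self.high_watermark]


partition = ToyPartition(["f1", "f2"])

partition.produce("a", acks="1")      # acknowledged, not yet replicated
visible_before = partition.consume()  # HW has not moved yet

partition.replicate("f1")
partition.replicate("f2")
visible_after = partition.consume()   # HW has passed "a"

partition.produce("b", acks="all")    # ack implies the ISR has the record
visible_all = partition.consume()

print(visible_before, visible_after, visible_all)
```

The point of the sketch is that the acks setting changes when the producer hears back, not what consumers are allowed to read.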
Impact on Reliability
- Data Consistency
  - Prevents reading uncommitted data
  - Ensures all consumers see the same data
  - Handles replica failures gracefully
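- Failure Scenarios
  - A leader failure does not lose committed data, provided the new leader is elected from the ISR
  - A follower that rejoins truncates its log to a safe point (the HW in older versions, the leader epoch in newer ones) and then catches up from the leader
  - Replicas converge on the leader's log, so committed data stays consistent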
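A toy failover shows why data below the HW survives a leader crash. The broker names, the `fail_over` helper, and the rule of picking the first surviving ISR member alphabetically are assumptions made for this sketch; real leader election is done by the controller.

```python
def fail_over(logs, isr, old_leader):
    """Pick a new leader from the surviving ISR and return (name, its log)."""
    survivors = sorted(replica for replica in isr if replica != old_leader)
    if not survivors:
        raise RuntimeError("no in-sync replica left to take over")
    new_leader = survivors[0]
    return new_leader, list(logs[new_leader])


# Hypothetical replica logs just before broker-1 (the leader) crashes.
logs = {
    "broker-1": ["a", "b", "c", "d"],
    "broker-2": ["a", "b", "c"],
    "broker-3": ["a", "b"],
}
isr = {"broker-1", "broker-2", "broker-3"}

# HW is the shortest log in the ISR; everything below it is committed.
hw = min(len(logs[replica]) for replica in isr)
committed = logs["broker-1"][:hw]

new_leader, new_log = fail_over(logs, isr, "broker-1")

# Every ISR member holds the committed prefix, so it survives the failover.
committed_survives = new_log[:hw] == committed

# Records above the HW that the new leader never received are gone.
lost = logs["broker-1"][len(new_log):]

print(new_leader, committed_survives, lost)
```

Only the uncommitted tail can be lost here, and consumers never saw it because it sat above the HW.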
Flashcards
When does the high watermark advance?:: When all in-sync replicas have replicated the message; HW tracks the lowest log end offset across the ISR
Under what conditions will consumers see messages with acks=1?:: Only after the high watermark has passed them, i.e. after all in-sync replicas have replicated them, even though the producer was acknowledged as soon as the leader wrote them
What's the relationship between high watermark and committed messages?:: The high watermark marks the end of the committed messages: everything below it has been replicated to all in-sync replicas, and consumers can only read up to it