CCDAK Practice Exam 1
References
Flashcards
Which of the following is the correct command for creating a topic foo with 3 partitions and with a replication factor of 3?
- bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 3 --topic foo
- bin/kafka-topics-create.sh --zookeeper localhost:9092 --replication-factor 3 --partitions 3 --topic foo
- bin/kafka-topics.sh --create --broker localhost:9092 --replication-factor 3 --partitions 3 --topic foo
- bin/kafka-topics-create.sh --zookeeper localhost:2181 --num-replicas 3 --partitions 3 --topic foo
?
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 3 --topic foo
Which class is used to create a producer in Kafka?
- Producer Class
- KafkaProducer Class
- Kafka Class
- ConfluentProducer Class
- Confluent Class
?
KafkaProducer Class
What happens when a broker is brought down for an upgrade and a topic has no replication, i.e. only one copy of each partition?
- Kafka will automatically copy the non-replicated topic partition to some other machine
- Topic partition resident to that broker will be unavailable
- Kafka will not let you bring down a broker where a topic partition is hosted with no additional replication
- Upgrade can only be possible when whole Kafka cluster is brought down.
?
Topic partition resident to that broker will be unavailable
Does KSQL support indexing?:: No
Which property is used to specify the brokers for the initial connection in Kafka?
- brokers.list
- bootstrap.brokers
- bootstrap.servers
- servers.list
- broker.servers
?
bootstrap.servers
On our development cluster we have set auto.create.topics.enable to true. We have just created a new topic called “platinumcustomers”. What will the replication factor be for this topic?
- The replication factor will be 1
- The replication will be configured according to the defaults in server.properties file of the brokers
- The replication factor will be 3
- You cannot create a topic without specifying replication factor
?
The replication will be configured according to the defaults in server.properties file of the brokers
The Kafka platform is not well suited for:
- Simplifying data pipelines
- Data visualization and reporting
- Handling streaming data
- Support real time analytics
?
Data visualization and reporting
Which property is used to specify compression type:
- compression.codec
- compression.type
- compression.format
- compression.technology
- compression.strategy
?
compression.type
On our development cluster we have set auto.create.topics.enable to true. We have just created a new topic called “goldcustomers”. How many partitions will the topic have?
- The number of partitions will be 1
- The number of partitions will be configured according to the defaults in server.properties file of the brokers
- The number of partitions will be 3
- You cannot create a topic without specifying number of partitions.
?
The number of partitions will be configured according to the defaults in server.properties file of the brokers
Which of the following are not part of the Kafka ecosystem? (Choose two)
- KSQL
- Schema Registry
- Oozie
- Sqoop
- REST Proxy
?
Oozie
Sqoop
Which of the following are invalid compression types in Kafka? (Choose two)
- snappy
- gzip
- Avro
- lz4
- ORC
?
Avro
ORC
Which setting in the Kafka cluster configuration decides whether automatic topic creation is allowed?
- default.replication.factor
- auto.create.topics.enable
- topic.creation.automatic
- bootstrap.servers
?
auto.create.topics.enable
The non-Java clients are based on which library, which provides consistent APIs and semantics and high-performance, high-quality clients in various programming languages?
- Kafka-console-consumer
- Kafka-console-producer
- Zookeeper
- librdkafka
?
librdkafka
Which property specifies the class used to serialize the key of a Kafka message?
- key.type
- key.serializer
- datatype.key
- datatype.key.serializer
- data.key.type
?
key.serializer
When running the following two commands, the second command fails with an error. What could be the root cause of the error?
$> kafka-topics --zookeeper zk:port --create --topic sometopic --partitions 6 --replication-factor 3
$> kafka-topics --zookeeper zk:port --alter --topic sometopic --partitions 2
- The error occurs because the second command does not specify the replication factor
- The error occurs because you can increase the number of partitions but cannot reduce it
- There is no error
- The error occurs because you cannot change the number of partitions once it is set at creation
?
The error occurs because you can increase the number of partitions but cannot reduce it
Are the worker processes of Kafka Connect managed by Kafka?:: No
Which property specifies the class used to serialize the value of a Kafka message?
- value.type
- value.serializer
- datatype.value
- datatype.value.serializer
- data.value.type
?
value.serializer
While running the following two commands on our development Kafka Cluster with all default settings we see an extremely strange behavior:
$> kafka-topics --zookeeper zk:port --delete --topic sometopic
$> kafka-topics --zookeeper zk:port --list
After running the second command we still see that sometopic exists and has not been deleted. What could be the reason?
- Topics cannot be deleted in Kafka
- To delete a topic all the brokers must be brought down
- To delete a topic we should stop all producers and consumers; after the first command ran, a running producer may be accessing the topic again, which re-creates it
- After deleting a topic a ZooKeeper restart is needed; only then is the topic deleted
?
To delete a topic we should stop all producers and consumers; after the first command ran, a running producer may be accessing the topic again, which re-creates it
Which of the following is the main storage and messaging component of a Kafka cluster?
- Brokers
- Producers
- Consumers
- Topics
?
Brokers
When writing a Kafka producer, compression.type is set to snappy. What does this imply?
- Data is compressed on the producer and decompressed as soon as it is written to Kafka
- Data is compressed on the producer, stored compressed on Kafka, and stays compressed on the consumer. The message can only be decompressed after it is written to some file system such as local disk, HDFS, or Amazon S3
- Data is compressed on the producer, stored compressed on the Kafka broker, and decompressed on the consumer
?
Data is compressed on the producer, stored compressed on the Kafka broker, and decompressed on the consumer
What is the default size of a log segment (log.segment.bytes)?
- 4 GB
- 10 GB
- 24 GB
- 1 GB
?
1 GB
True or False: Broker, producer, and consumer software works the same on physical machines, VMs, and Docker containers, i.e. it is agnostic to the environment it runs on.:: True
True or False: Key serializer must be used even if you do not intend to use keys:: True
What is the default period of time after which log segment files are rolled (log.roll.hours)?
- 1 day
- 3 days
- 7 days
- 1 month
?
7 days
True or False: Keys and values in a Kafka message have to be of the same type:: False
True or False: Compression in Kafka is enabled on a per-topic basis:: False
Client authentication can be done using which two protocols?
- TLS
- SASL
- UDP
- TCP
- HTTP
?
TLS
SASL
__________________ is a concept that allows breaking up data, so that there is no need for Consumers to parse the data that they do not want or need to see.
- Producers
- Topics
- Zookeeper
- Kafka Cluster
?
Topics
True or False: We can have one producer compressing the messages and another one not compressing messages both writing to the same topic:: True
Which of the following best describes Kafka Connect?
- Kafka Connect is a framework for streaming data between Apache Kafka and other data systems
- Kafka Connect provides a serving layer for your metadata. It provides a RESTful interface for storing and retrieving Avro schemas. It stores a versioned history of all schemas, provides multiple compatibility settings and allows evolution of schemas
- Kafka Connect is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
- Kafka Connect provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients
?
Kafka Connect is a framework for streaming data between Apache Kafka and other data systems
Which partitioning strategy does the default partitioner use when no key is specified (Kafka versions before 2.4)?
- Hashpartitioner
- Round Robin Partitioner
- Rangepartitioner
- Valuepartitioner
?
Round Robin Partitioner
In a Kafka producer program, how do you ensure that the producer does not wait for any acknowledgement from the server?
- Set acks=0
- Set acks=false
- Set acks=all
- Set acks=1
- Set acks=true
?
Set acks=0
Which of the following is not a characteristic or benefit of Kafka Connect?
- Off-the-shelf, tested Connectors for common data sources are available
- Features fault tolerance and automatic load balancing when running in distributed mode
- Just write Plain Old Java code for Kafka Connect
- Pluggable/Extensible by developers
?
Just write Plain Old Java code for Kafka Connect
Which partitioner is used by default when a key is specified?
- Hashpartitioner
- Round Robin Partitioner
- Rangepartitioner
- Valuepartitioner
?
Hashpartitioner
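The idea behind key-based partitioning can be sketched as below. This is a minimal illustration, not Kafka's code: `choose_partition` is a hypothetical helper, and `zlib.crc32` is an assumed stand-in for the murmur2 hash that the Java client applies to the serialized key bytes.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a serialized key to a partition index by hashing it.

    Assumption: zlib.crc32 stands in for the murmur2 hash used by
    Kafka's Java client; only the "hash modulo partition count"
    idea is being illustrated here.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions


# The same key always maps to the same partition for a fixed partition count.
first = choose_partition(b"customer-42", 6)
second = choose_partition(b"customer-42", 6)
```

Because the result depends on the partition count, changing the number of partitions can change which partition a given key maps to.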
In a Kafka producer program, how do you ensure that the producer waits until the leader has written the record to its local log?
- Set acks=0
- Set acks=false
- Set acks=all
- Set acks=1
- Set acks=true
?
Set acks=1
Which component of Kafka Connect is capable of reading data from external data stores such as databases?
- Kafka Connector Sources
- Kafka Connector Sinks
- Kafka Topic
- Kafka Connector Cluster
- Zookeeper
?
Kafka Connector Sources
True or False: Hashpartitioner will place all messages with the same value on a single partition.:: False
In a Kafka producer program, how do you ensure that the producer waits until all in-sync replicas have acknowledged receipt of the record?
- Set acks=0
- Set acks=false
- Set acks=all
- Set acks=1
- Set acks=true
?
Set acks=all
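The three acks cards can be summarized with a toy model. The function name `replicas_written_before_ack` is hypothetical and the model ignores timeouts, retries, and `min.insync.replicas`; it only captures how many replicas hold the record before the producer's send is acknowledged.

```python
def replicas_written_before_ack(acks: str, in_sync_replicas: int) -> int:
    """Toy model of the producer `acks` setting.

    Returns how many replicas have the record before the producer
    treats the send as acknowledged.
    """
    if acks == "0":
        return 0  # fire and forget: no broker acknowledgement awaited
    if acks == "1":
        return 1  # the leader has written the record to its local log
    if acks in ("all", "-1"):
        return in_sync_replicas  # every in-sync replica has the record
    raise ValueError(f"unsupported acks value: {acks!r}")


no_wait = replicas_written_before_ack("0", 3)
leader_only = replicas_written_before_ack("1", 3)
full_isr = replicas_written_before_ack("all", 3)
```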
Which component of Kafka Connect is capable of pulling data from a Kafka topic and writing it to an external system such as HDFS?
- Kafka Connector Sources
- Kafka Connector Sinks
- Kafka Topic
- Kafka Connector Cluster
- Zookeeper
?
Kafka Connector Sinks
The REST Proxy allows users to send producer and consumer requests to the cluster using which protocol?
- UDP
- HTTP
- TCP
- FTP
?
HTTP
What is the best practice for handling transient failures if acks is set to a value other than 0?
- Set retries to non-zero value
- Set retries to zero
- Don’t set retries value
?
Set retries to non-zero value
Which component related to Kafka Connect is a repository for connectors, transformations, and converters?
- Kafka Connector Sources
- Confluent Hub
- Kafka Topic
- Kafka Connector Cluster
- Zookeeper
?
Confluent Hub
Which of the following is intended to allow customers to add Kafka to their IoT architecture to enable stream processing?
- UDP
- MQTT
- TCP
- FTP
?
MQTT
What is the difference between KafkaProducer class and ProducerRecord class?
- They both are the same, ProducerRecord is old class and KafkaProducer is the new one
- They both are the same, ProducerRecord is the new class and KafkaProducer is the old one
- KafkaProducer is the class that handles Producer and ProducerRecord is the class that handles messages. KafkaProducer’s send method takes object of ProducerRecord as an input
- KafkaProducer is the class that handles messages and ProducerRecord is the class that handles Producer. ProducerRecord’s send method takes object of KafkaProducers class as an input
?
KafkaProducer is the class that handles Producer and ProducerRecord is the class that handles messages. KafkaProducer’s send method takes object of ProducerRecord as an input
Where does Kafka Connect run?
- Kafka Brokers
- Own dedicated cluster
- Kafka Producers
- Kafka Consumers
- Zookeeper
?
Own dedicated cluster
True or False: Retention policies for messages can be configured on a topic level:: True
Which batching configuration should be used for high throughput?
- large batch.size and large linger.ms
- large batch.size and small linger.ms
- small batch.size and large linger.ms
- small batch.size and small linger.ms
?
large batch.size and large linger.ms
What does Offset mean or correspond to for the Kafka Connector Source for a file input?
- timestamp
- Position in the file
- Sequence id
- File name
?
Position in the file
Consumer offsets are stored in:
- __consumer_offsets topic
- Zookeeper
- /tmp directory on brokers
- consumers
?
__consumer_offsets topic
Which batching configuration should be used for low latency?
- large batch.size and large linger.ms
- large batch.size and small linger.ms
- small batch.size and large linger.ms
- small batch.size and small linger.ms
?
small batch.size and small linger.ms
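As a rough sketch, the high-throughput and low-latency batching cards translate into two producer profiles. `batch.size` and `linger.ms` are real producer settings; the numeric values below are illustrative assumptions only, and `to_properties` is a hypothetical helper for rendering them in `.properties` form.

```python
# Illustrative values only (assumptions); tune against real workload measurements.
HIGH_THROUGHPUT = {
    "batch.size": 131072,  # larger batches (bytes) amortize request overhead
    "linger.ms": 50,       # wait longer so batches have time to fill
}

LOW_LATENCY = {
    "batch.size": 16384,   # smaller batches are sent sooner
    "linger.ms": 0,        # do not wait for more records before sending
}


def to_properties(config: dict) -> str:
    """Render a config mapping as key=value lines, one per setting."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


low_latency_text = to_properties(LOW_LATENCY)
```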
Which of the components below support exactly-once semantics? (More than one option is correct)
- Schema Registry
- KSQL
- Auto Data Balancer
- Kafka Streams
?
KSQL
Kafka Streams
How do we configure rack awareness in Kafka?
- Set rack.awareness to true
- Set broker.rack to rack id
- Set broker.rack to true
- Set rack.available to true
?
Set broker.rack to rack id
How many modes exist in Kafka Connect? (Choose 2)
- Distributed Mode
- Pseudo Mode
- Local Mode
- Standalone Mode
?
Distributed Mode
Standalone Mode
How does Replicator solve the problem of an infinite replication loop?
- It uses the message body to record the origin cluster
- It uses another topic to keep track of the origin cluster
- It uses a partition to store the origin cluster
- It uses message headers to record the origin cluster
?
It uses message headers to record the origin cluster
How can we ensure that the load balancer in front of REST Proxy binds a user’s session to a specific instance?
- Use one REST Proxy instance at a time
- Use a sticky load balancer
- Use an F5 load balancer
- None of the above
?
Use a sticky load balancer
Which is considered the best data format to work with in Kafka?
- CSV
- Avro
- Parquet
- JSON
?
Avro
Which factors are considered when deciding tasks.max for a Kafka connector? (Choose three)
- Number of topics
- Number of partitions
- Desired throughput / throughput per task
- Number of machines × number of cores per machine
?
Number of partitions
Desired throughput / throughput per task
Number of machines × number of cores per machine
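One way to combine the three factors is to treat each as an upper bound and take the smallest. That combination rule is an assumption made for illustration; the card only names the factors. `suggest_tasks_max` and its parameter names are hypothetical.

```python
import math


def suggest_tasks_max(
    partitions: int,
    desired_throughput: float,
    throughput_per_task: float,
    machines: int,
    cores_per_machine: int,
) -> int:
    """Heuristic sketch: the smallest of the three bounds, at least 1.

    Assumption: taking the minimum is one plausible way to combine
    the factors; it is not stated by the source card.
    """
    by_throughput = math.ceil(desired_throughput / throughput_per_task)
    by_cores = machines * cores_per_machine
    return max(1, min(partitions, by_throughput, by_cores))


# 12 partitions, 100 MB/s wanted at 10 MB/s per task, 2 machines with 4 cores each.
suggestion = suggest_tasks_max(12, 100.0, 10.0, 2, 4)
```

In this example the core count (2 × 4 = 8) is the tightest bound, so the sketch suggests 8 tasks.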
What is the default maximum size of a message in Kafka?
- 100 KB
- 10 KB
- 1 MB
- 10 MB
?
1 MB
Which of the below options are recommended for Zookeeper?
- Allocate dedicated high core machines for processing data
- Allocate memory-intensive machines for caching data
- Run more than 9 Zookeepers
- Dedicate a separate SATA or SSD for its transaction logs
?
Dedicate a separate SATA or SSD for its transaction logs
What is the recommended maximum number of partitions per broker?
- 500
- 1000
- 2000
- 4000
?
4000
How much JVM heap memory is sufficient for a broker in a typical deployment?
- 1 GB
- 4 GB
- 20 GB
- 100 GB
?
4 GB
Which filesystem is recommended for deploying Kafka clusters?
- XFS
- NTFS
- FAT32
- EXT3
?
XFS
Which component is responsible for checking the liveness of Consumers?
- Group Leader
- Consumer Group
- Group Coordinator
- Zookeeper
?
Group Coordinator
Where does Kafka store Consumer offsets?
- Zookeeper
- Controller
- User Topic
- System Topic
?
System Topic