Question 96
In Kafka, data is synchronized between partition replicas. A dedicated thread (ReplicaFetcherThread) replicates data from the partition leader to each follower. The follower, which behaves like a consumer, actively pulls messages from the leader in batches, which greatly improves throughput.
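The pull-based, batched replication described in this statement can be pictured with a toy model. This is only a sketch: the `Leader` and `Follower` classes, the `fetch` method, and the `batch_size` parameter are hypothetical names invented for illustration and are not Kafka APIs.

```python
class Leader:
    """Toy partition leader holding an append-only log (hypothetical, not Kafka code)."""

    def __init__(self, log):
        self.log = list(log)

    def fetch(self, offset, max_records):
        # Return up to max_records messages starting at the requested offset.
        return self.log[offset:offset + max_records]


class Follower:
    """Toy follower that pulls from the leader, much as a consumer would."""

    def __init__(self):
        self.log = []

    def replicate_from(self, leader, batch_size=3):
        rounds = 0
        while True:
            # The follower's next offset equals the number of messages it already holds.
            batch = leader.fetch(len(self.log), batch_size)
            if not batch:
                break  # caught up with the leader
            self.log.extend(batch)
            rounds += 1
        return rounds


leader = Leader(range(7))
follower = Follower()
rounds = follower.replicate_from(leader, batch_size=3)
print(rounds, follower.log)
```

With seven messages and a batch size of three, the follower catches up in three fetch rounds instead of seven single-message requests, which is the throughput benefit the statement refers to.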
Question 97
Assume that HDFS keeps only 2 replicas when writing data. During the write process, the HDFS client first writes the data to DataNode1 and then writes the data to DataNode2.
Question 98
In Hive, which of the following descriptions of partitions is wrong?
Question 99
Assuming that the data volume is about 200 GB and the maximum capacity of a single shard is limited to 30 GB, what is an appropriate design for the maximum number of shards?
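The sizing arithmetic behind this question can be sketched as a ceiling division. The helper name `min_shards` is hypothetical, and the sketch assumes the data is spread evenly across fragments (shards) with no allowance for growth or replicas.

```python
import math


def min_shards(total_gb, max_shard_gb):
    """Smallest shard count that keeps every shard at or below the size cap."""
    return math.ceil(total_gb / max_shard_gb)


print(min_shards(200, 30))  # 200 / 30 is about 6.67, which rounds up to 7
```

Six shards would hold at most 180 GB under the 30 GB cap, so at least seven are needed for 200 GB.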
Question 100
Elasticsearch's index sharding can split the index data into shards and distribute them across different nodes.
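How documents might be spread over shards, and shards over nodes, can be illustrated with a simplified sketch. It assumes hash-modulo routing on the document ID; `zlib.crc32` stands in for the hash Elasticsearch actually uses, and the round-robin node placement, the function names, and the node names are all hypothetical. Real shard allocation is decided by the cluster, not by this formula.

```python
import zlib


def shard_for(doc_id, num_primary_shards):
    # Stand-in routing: hash the document ID and take it modulo the shard count.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def node_for(shard, nodes):
    # Illustrative round-robin placement of shards on nodes (an assumption).
    return nodes[shard % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
for doc_id in ["doc-1", "doc-2", "doc-3", "doc-4"]:
    shard = shard_for(doc_id, num_primary_shards=5)
    print(doc_id, "-> shard", shard, "on", node_for(shard, nodes))
```

Because the routing is deterministic, the same document ID always maps to the same shard, which is what lets a lookup go directly to the shard that holds the document.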