Question 96

In Kafka, data is synchronized between the replicas of a partition. A thread (replicationFetcherThread) replicates data from the partition leader to the follower. The follower (which behaves like a consumer) actively pulls messages from the leader in batches, which greatly improves throughput.
  • Question 97

    Assume that HDFS keeps only 2 replicas when writing data. During the write process, the HDFS Client first writes the data to DataNode1 and then writes the data to DataNode2.
  • Question 98

    In Hive, which of the following descriptions of partitions is wrong?
  • Question 99

    Assuming that the data volume is about 200 GB and the maximum capacity of a single fragment is limited to 30 GB, what is an appropriate value for the maximum number of fragments?
  • Question 100

    Elasticsearch's index sharding (shards) can split the index data and distribute it across different nodes.
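
The sizing arithmetic in Question 99 and the sharding idea in Question 100 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is assumed to spread evenly across fragments, the helper names `min_fragment_count` and `assign_fragment` are hypothetical, and the CRC32-modulo routing is a stand-in for a stable hash, not Elasticsearch's actual routing function.

```python
import math
import zlib


def min_fragment_count(total_gb: float, max_fragment_gb: float) -> int:
    """Smallest fragment count that keeps each fragment within the size cap.

    Assumes the data is spread evenly across fragments.
    """
    if total_gb <= 0 or max_fragment_gb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(total_gb / max_fragment_gb)


def assign_fragment(doc_id: str, num_fragments: int) -> int:
    """Map a document ID to a fragment using a stable hash (illustrative only)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_fragments


if __name__ == "__main__":
    count = min_fragment_count(200, 30)
    print(count)  # 7, since 200 / 30 is about 6.67 and is rounded up

    # Every document lands on exactly one fragment in the range [0, count).
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        print(doc_id, "->", assign_fragment(doc_id, count))
```

With 7 fragments, each one holds roughly 200 / 7 ≈ 28.6 GB, which stays under the 30 GB cap. With 6 fragments, each would hold about 33.3 GB and exceed it.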