
Troubleshooting a Kafka consumer that consumes no data

2023-02-08 22:00:30


Three new Kafka brokers had just been installed and started on the cluster. After the code was packaged and deployed to the cluster, the running program consumed no data at all.

Debugging locally in IDEA showed the program permanently blocked in the following method, stuck in an infinite loop.

    /**
     * Block until the coordinator for this group is known and is ready to receive requests.
     * (Wait until we have established a connection to the server-side GroupCoordinator.)
     */
    public void ensureCoordinatorReady() {
        while (coordinatorUnknown()) { // the GroupCoordinator is still unknown
            RequestFuture<Void> future = sendGroupCoordinatorRequest(); // send the lookup request
            client.poll(future); // block on the result of the asynchronous call
            if (future.failed()) {
                if (future.isRetriable())
                    client.awaitMetadataUpdate();
                else
                    throw future.exception();
            } else if (coordinator != null && client.connectionFailed(coordinator)) {
                // we found the coordinator, but the connection has failed, so mark
                // it dead and backoff before retrying discovery
                coordinatorDead();
                time.sleep(retryBackoffMs); // back off, then retry
            }
        }
    }

Roughly, the flow is:

  • the consumer picks one broker in the cluster as the group coordinator
  • the consumers in the group then send requests to the coordinator, each applying to become the leader of the consumer group
  • one consumer ends up as the consumer leader; the others become followers
  • the consumer leader computes the partition assignment and syncs it to the coordinator
  • the consumer followers fetch the partition assignment from the coordinator

The problem occurred at the very first step: the consumer could not establish a connection with the server-side GroupCoordinator, so the program waited forever.
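Why would coordinator discovery fail? The coordinator for a consumer group is the broker that leads one specific partition of the internal __consumer_offsets topic, selected by hashing the group id. A minimal sketch of that mapping, assuming a placeholder group id and the default offsets.topic.num.partitions of 50:

    // Sketch: how a consumer group maps to its coordinator partition.
    // groupId is a placeholder; 50 is the default offsets.topic.num.partitions.
    public class CoordinatorPartition {
        public static void main(String[] args) {
            String groupId = "my-consumer-group";  // hypothetical group id
            int offsetsTopicNumPartitions = 50;    // default partition count of __consumer_offsets

            // The broker leading this __consumer_offsets partition is the
            // group's coordinator (modulo the broker's exact abs() handling).
            int partition = Math.abs(groupId.hashCode()) % offsetsTopicNumPartitions;
            System.out.println("__consumer_offsets partition for " + groupId + ": " + partition);
        }
    }

So if the leader of that partition is a broker that no longer exists, every lookup comes back unusable and ensureCoordinatorReady() loops forever. That pointed straight at __consumer_offsets.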

Checking the state of the __consumer_offsets topic showed all 50 partitions sitting on the broker with id 152:

bin/kafka-topics.sh --describe --zookeeper localhost:2182 --topic __consumer_offsets
Topic:__consumer_offsets    PartitionCount:50    ReplicationFactor:1    Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
    Topic: __consumer_offsets    Partition: 0    Leader: 152    Replicas: 152    Isr: 152
    Topic: __consumer_offsets    Partition: 1    Leader: 152    Replicas: 152    Isr: 152
    Topic: __consumer_offsets    Partition: 2    Leader: 152    Replicas: 152    Isr: 152
    Topic: __consumer_offsets    Partition: 3    Leader: 152   
......

But the cluster had no broker with id 152. Since brokers had been added to and removed from this cluster before, the initial conclusion was that 152 was one of the old Kafka brokers: after it was removed and new brokers joined, the stale metadata in ZooKeeper was never updated.

So the fix was to shut down the brokers, go into the ZooKeeper client, delete the __consumer_offsets node under /brokers/topics, and then restart the brokers. Note that at this point __consumer_offsets has not yet been recreated in ZooKeeper; it is only generated again once a consumer starts.
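The ZooKeeper client session looks roughly like this (the connect string mirrors the --zookeeper address used above; rmr is the pre-3.5 ZooKeeper CLI command, replaced by deleteall in newer versions):

bin/zkCli.sh -server localhost:2182
ls /brokers/topics
rmr /brokers/topics/__consumer_offsets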

Observing __consumer_offsets again, the partitions were now evenly distributed across the three brokers:

bin/kafka-topics.sh --zookeeper localhost:2182 --describe --topic __consumer_offsets
Topic:__consumer_offsets    PartitionCount:50    ReplicationFactor:3    Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
    Topic: __consumer_offsets    Partition: 0    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 1    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 2    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 3    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 4    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 5    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 6    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 7    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 8    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 9    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 10    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 11    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 12    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 13    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 14    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 15    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 16    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 17    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 18    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 19    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 20    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 21    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 22    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 23    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 24    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 25    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 26    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 27    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 28    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 29    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 30    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 31    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 32    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 33    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 34    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 35    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 36    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 37    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 38    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 39    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 40    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 41    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 42    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 43    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 44    Leader: 422    Replicas: 422,420,421    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 45    Leader: 420    Replicas: 420,422,421    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 46    Leader: 421    Replicas: 421,420,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 47    Leader: 422    Replicas: 422,421,420    Isr: 422,420,421
    Topic: __consumer_offsets    Partition: 48    Leader: 420    Replicas: 420,421,422    Isr: 420,422,421
    Topic: __consumer_offsets    Partition: 49    Leader: 421    Replicas: 421,422,420    Isr: 422,420,421

Restarting the application at this point, the consumer received data normally, and the problem was solved.
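For completeness, a minimal consumer sketch in the style of the 0.9/0.10-era client whose coordinator code is shown above; the bootstrap address, group id, and topic name are all placeholders:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ConsumeCheck {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder broker address
            props.put("group.id", "my-consumer-group");     // placeholder group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic

            try {
                while (true) {
                    // poll() drives group coordination internally; before the fix,
                    // this call would block forever inside ensureCoordinatorReady().
                    ConsumerRecords<String, String> records = consumer.poll(1000);
                    for (ConsumerRecord<String, String> record : records)
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                }
            } finally {
                consumer.close();
            }
        }
    }

If the loop prints records, coordinator discovery and partition assignment are both working again.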
