A problem with the cdc changefeed replicating to Kafka affected the MySQL changefeed

【TiDB environment】Production
【TiDB version】v4.0.13

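Background:
The upstream is TiDB, replicated with cdc to both MySQL and Kafka. The Kafka service went down, which broke the changefeed replicating to Kafka. We then used cdc cli to run pause twice and resume once; the second pause did not change the state shown by list. No other operations were performed during this time. After that, the changefeed replicating to MySQL started to lag.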
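For anyone reading the log further down: one way to put a number on the lag is to decode a changefeed's `checkpoint ts`. A TiDB TSO stores the physical time in milliseconds in its high bits and an 18-bit logical counter in its low bits. The sketch below is only an illustration, not part of the original report; the helper names are made up, and the two inputs are copied from the `Find new changefeed` log entry for `cdc-kafka` (its `checkpoint ts` and the timestamp of that log line).

```python
from datetime import datetime, timedelta, timezone

LOGICAL_BITS = 18  # the low 18 bits of a TSO hold the logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_tso(tso):
    """Split a TSO into (physical milliseconds since epoch, logical counter)."""
    return tso >> LOGICAL_BITS, tso & ((1 << LOGICAL_BITS) - 1)


def tso_to_datetime(tso):
    """Convert the physical part of a TSO to an aware UTC datetime."""
    physical_ms, _ = split_tso(tso)
    return EPOCH + timedelta(milliseconds=physical_ms)


def lag_seconds(checkpoint_tso, observed_at):
    """Seconds between a checkpoint TSO and the moment it was observed."""
    return (observed_at - tso_to_datetime(checkpoint_tso)).total_seconds()


# Values copied from the cdc log in this post (changefeed=cdc-kafka).
KAFKA_CHECKPOINT_TS = 440783380765474971
LOG_LINE_TIME = datetime(2023, 4, 14, 7, 17, 15, 962000, tzinfo=timezone.utc)

if __name__ == "__main__":
    print("checkpoint:", tso_to_datetime(KAFKA_CHECKPOINT_TS).isoformat())
    print("lag (s):", round(lag_seconds(KAFKA_CHECKPOINT_TS, LOG_LINE_TIME), 3))
```

With these inputs the Kafka changefeed's checkpoint decodes to 2023-04-14 06:52:37 UTC, roughly 24.6 minutes behind the log line. The same helper can be pointed at the checkpoint of the MySQL changefeed to see how far it fell behind.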

Solution:
After deleting the changefeed that replicated to Kafka, replication returned to normal.

Monitoring:


You can see that the number of get operations dropped sharply during the period when replication was lagging.

cdc logs

[2023/04/14 07:17:15.795 +00:00] [INFO] [owner.go:622] ["stale task status is not deleted, wait metadata cleaned to create new changefeed"] ["task status"="{\"tables\":{\"1306\":{\"start-ts\":440271347866402910,\"mark-table-id\":0},\"1462\":{\"start-ts\":440271347866402910,\"mark-table-id\":0},\"1995\":{\"start-ts\":440271347866402910,\"mark-table-id\":0},\"477\":{\"start-ts\":440271347866402910,\"mark-table-id\":0},\"517\":{\"start-ts\":440271347866402910,\"mark-table-id\":0},\"677\":{\"start-ts\":440271347866402910,\"mark-table-id\":0}},\"operation\":null,\"admin-job-type\":1}"] [changefeed=cdc-kafka]
[2023/04/14 07:17:15.879 +00:00] [ERROR] [kafka.go:227] ["close async client with error"] [error="kafka: Failed to deliver 9 messages."]
[2023/04/14 07:17:15.879 +00:00] [ERROR] [processor.go:1428] ["processor receives redundant error"] [error="[CDC:ErrAdminStopProcessor]stop processor by admin command"] [errorVerbose="[CDC:ErrAdminStopProcessor]stop processor by admin command\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/normalize.go:156\ngithub.com/pingcap/ticdc/cdc.(*oldProcessor).stop\n\tgithub.com/pingcap/ticdc@/cdc/processor.go:1312\ngithub.com/pingcap/ticdc/cdc.(*Capture).handleTaskEvent\n\tgithub.com/pingcap/ticdc@/cdc/capture.go:283\ngithub.com/pingcap/ticdc/cdc.(*Capture).Run\n\tgithub.com/pingcap/ticdc@/cdc/capture.go:211\ngithub.com/pingcap/ticdc/cdc.(*Server).run.func4\n\tgithub.com/pingcap/ticdc@/cdc/server.go:272\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357"]
[2023/04/14 07:17:15.962 +00:00] [INFO] [owner.go:290] ["Find new changefeed"] [info="{\"sink-uri\":\"***\",\"opts\":{\"max-message-bytes\":\"67108864\"},\"create-time\":\"2022-08-10T08:38:01.259768036Z\",\"start-ts\":435190675862716506,\"target-ts\":0,\"admin-job-type\":2,\"sort-engine\":\"unified\",\"config\":{\"case-sensitive\":true,\"enable-old-value\":true,\"force-replicate\":false,\"check-gc-safe-point\":true,\"filter\":{\"rules\":[],\"ignore-txn-start-ts\":null,\"ddl-allow-list\":null},\"mounter\":{\"worker-num\":16},\"sink\":{\"dispatchers\":null,\"protocol\":\"canal-json\"},\"cyclic-replication\":{\"enable\":false,\"replica-id\":0,\"filter-replica-ids\":null,\"id-buckets\":0,\"sync-ddl\":false},\"scheduler\":{\"type\":\"table-number\",\"polling-time\":-1}},\"state\":\"normal\",\"history\":[1681455243649],\"error\":null,\"sync-point-enabled\":false,\"sync-point-interval\":600000000000,\"creator-version\":\"v4.0.13\"}"] [changefeed=cdc-kafka] ["checkpoint ts"=440783380765474971]
[2023/04/14 07:17:16.074 +00:00] [INFO] [kafka.go:320] ["Starting kafka sarama producer ..."] [config="{\"PartitionNum\":1,\"ReplicationFactor\":1,\"Version\":\"2.6.2\",\"MaxMessageBytes\":67108864,\"Compression\":\"none\",\"ClientID\":\"\",\"Credential\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"cert-allowed-cn\":null},\"TopicPreProcess\":true}"]
[2023/04/14 07:17:16.076 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=362] [regionID=128924] [startKey=6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006c0000000000fa] [endKey=6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006d0000000000fa] [checkpointTs=440783380765474971]
[2023/04/14 07:17:16.076 +00:00] [INFO] [client.go:814] ["creating new stream to store to send request"] [regionID=128924] [requestID=484592] [storeID=1] [addr=ip.ip:20160]
[2023/04/14 07:17:16.076 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=363] [regionID=128924] [startKey=6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006c00000000fb] [endKey=6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006d00000000fb] [checkpointTs=440783380765474971]
[2023/04/14 07:17:16.076 +00:00] [INFO] [client.go:814] ["creating new stream to store to send request"] [regionID=128924] [requestID=484593] [storeID=1] [addr=ip.ip:20160]
[2023/04/14 07:17:16.081 +00:00] [INFO] [puller.go:217] ["puller is initialized"] [duration=7.485094ms] [changefeed=] [tableID=-1] [spans="[\"[6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006c0000000000fa, 6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006d0000000000fa)\",\"[6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006c00000000fb, 6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006d00000000fb)\"]"] [resolvedTs=440783380765474971]
[2023/04/14 07:17:16.539 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
# many "drop resolved ts due to region feed stopped" entries
[2023/04/14 07:18:16.298 +00:00] [INFO] [client.go:1150] ["stream to store closed"] [addr=ip.ip:20160] [storeID=1]
[2023/04/14 07:18:16.307 +00:00] [INFO] [client.go:1150] ["stream to store closed"] [addr=ip.ip:20160] [storeID=1]
[2023/04/14 07:18:16.855 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
[2023/04/14 07:18:16.865 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[2023/04/14 07:18:17.885 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[2023/04/14 07:18:17.892 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
[2023/04/14 07:18:18.301 +00:00] [WARN] [owner.go:660] ["create changefeed failed, retry later"] [changefeed=cdc-kafka] [error="[CDC:ErrKafkaNewSaramaProducer]kafka: client has run out of available brokers to talk to (Is your cluster reachable?)"] [errorVerbose="[CDC:ErrKafkaNewSaramaProducer]kafka: client has run out of available brokers to talk to (Is your cluster reachable?)\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByCause\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/normalize.go:279\ngithub.com/pingcap/ticdc/pkg/errors.WrapError\n\tgithub.com/pingcap/ticdc@/pkg/errors/helper.go:28\ngithub.com/pingcap/ticdc/cdc/sink/producer/kafka.NewKafkaSaramaProducer\n\tgithub.com/pingcap/ticdc@/cdc/sink/producer/kafka/kafka.go:330\ngithub.com/pingcap/ticdc/cdc/sink.newKafkaSaramaSink\n\tgithub.com/pingcap/ticdc@/cdc/sink/mq.go:477\ngithub.com/pingcap/ticdc/cdc/sink.init.1.func3\n\tgithub.com/pingcap/ticdc@/cdc/sink/sink.go:81\ngithub.com/pingcap/ticdc/cdc/sink.NewSink\n\tgithub.com/pingcap/ticdc@/cdc/sink/sink.go:113\ngithub.com/pingcap/ticdc/cdc.(*Owner).newChangeFeed\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:433\ngithub.com/pingcap/ticdc/cdc.(*Owner).loadChangeFeeds\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:634\ngithub.com/pingcap/ticdc/cdc.(*Owner).run\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:1357\ngithub.com/pingcap/ticdc/cdc.(*Owner).Run\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:1252\ngithub.com/pingcap/ticdc/cdc.(*Server).campaignOwnerLoop\n\tgithub.com/pingcap/ticdc@/cdc/server.go:177\ngithub.com/pingcap/ticdc/cdc.(*Server).run.func1\n\tgithub.com/pingcap/ticdc@/cdc/server.go:260\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357"]
[2023/04/14 07:18:18.343 +00:00] [INFO] [owner.go:290] ["Find new changefeed"] [info="{\"sink-uri\":\"***\",\"opts\":{\"max-message-bytes\":\"67108864\"},\"create-time\":\"2022-08-10T08:38:01.259768036Z\",\"start-ts\":435190675862716506,\"target-ts\":0,\"admin-job-type\":2,\"sort-engine\":\"unified\",\"config\":{\"case-sensitive\":true,\"enable-old-value\":true,\"force-replicate\":false,\"check-gc-safe-point\":true,\"filter\":{\"rules\":[],\"ignore-txn-start-ts\":null,\"ddl-allow-list\":null},\"mounter\":{\"worker-num\":16},\"sink\":{\"dispatchers\":null,\"protocol\":\"canal-json\"},\"cyclic-replication\":{\"enable\":false,\"replica-id\":0,\"filter-replica-ids\":null,\"id-buckets\":0,\"sync-ddl\":false},\"scheduler\":{\"type\":\"table-number\",\"polling-time\":-1}},\"state\":\"normal\",\"history\":[1681456698295],\"error\":{\"addr\":\"ip.ip:8300\",\"code\":\"CDC-owner-1001\",\"message\":\"[CDC:ErrKafkaNewSaramaProducer]kafka: client has run out of available brokers to talk to (Is your cluster reachable?)\"},\"sync-point-enabled\":false,\"sync-point-interval\":600000000000,\"creator-version\":\"v4.0.13\"}"] [changefeed=cdc-kafka] ["checkpoint ts"=440783380765474971]
[2023/04/14 07:18:18.447 +00:00] [INFO] [kafka.go:320] ["Starting kafka sarama producer ..."] [config="{\"PartitionNum\":1,\"ReplicationFactor\":1,\"Version\":\"2.6.2\",\"MaxMessageBytes\":67108864,\"Compression\":\"none\",\"ClientID\":\"\",\"Credential\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"cert-allowed-cn\":null},\"TopicPreProcess\":true}"]
[2023/04/14 07:18:18.449 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=364] [regionID=128924] [startKey=6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006c0000000000fa] [endKey=6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006d0000000000fa] [checkpointTs=440783380765474971]
[2023/04/14 07:18:18.449 +00:00] [INFO] [client.go:814] ["creating new stream to store to send request"] [regionID=128924] [requestID=484596] [storeID=1] [addr=ip.ip:20160]
[2023/04/14 07:18:18.449 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=365] [regionID=128924] [startKey=6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006c00000000fb] [endKey=6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006d00000000fb] [checkpointTs=440783380765474971]
[2023/04/14 07:18:18.449 +00:00] [INFO] [client.go:814] ["creating new stream to store to send request"] [regionID=128924] [requestID=484597] [storeID=1] [addr=ip.ip:20160]
[2023/04/14 07:18:18.453 +00:00] [INFO] [puller.go:217] ["puller is initialized"] [duration=7.314801ms] [changefeed=] [tableID=-1] [spans="[\"[6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006c0000000000fa, 6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006d0000000000fa)\",\"[6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006c00000000fb, 6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006d00000000fb)\"]"] [resolvedTs=440783380765474971]
[2023/04/14 07:18:18.869 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
# many "drop resolved ts due to region feed stopped" entries
[2023/04/14 07:18:55.905 +00:00] [INFO] [statistics.go:126] ["sink replication status"] [name=mysql] [changefeed=cdc-abc-max-tidb] [capture=ip.ip:8300] [count=0] [qps=0]
[2023/04/14 07:18:56.125 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
# many "drop resolved ts due to region feed stopped" entries
[2023/04/14 07:19:17.013 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484581] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783799881302033] [error="[CDC:ErrEventFeedEventError]not_leader:<region_id:290141 > "]
[2023/04/14 07:19:17.013 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.013 +00:00] [INFO] [region_cache.go:829] ["switch region peer to next due to NotLeader with NULL leader"] [currIdx=1] [regionID=290141]
[2023/04/14 07:19:17.013 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.013 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783799881302033,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484598,\"extra_op\":1}"] [addr=ip.28.128:20160]
[2023/04/14 07:19:17.015 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484576] [requestID=484598] [addr=ip.28.128:20160]
[2023/04/14 07:19:17.015 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484598] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783799881302033] [error="[CDC:ErrEventFeedEventError]not_leader:<region_id:290141 > "]
[2023/04/14 07:19:17.015 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.015 +00:00] [INFO] [region_cache.go:829] ["switch region peer to next due to NotLeader with NULL leader"] [currIdx=2] [regionID=290141]
[2023/04/14 07:19:17.015 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.015 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783799881302033,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484599,\"extra_op\":1}"] [addr=ip.ip:20160]
[2023/04/14 07:19:17.017 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484577] [requestID=484599] [addr=ip.ip:20160]
[2023/04/14 07:19:17.017 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484599] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783799881302033] [error="[CDC:ErrEventFeedEventError]not_leader:<region_id:290141 > "]
[2023/04/14 07:19:17.017 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.017 +00:00] [INFO] [region_cache.go:829] ["switch region peer to next due to NotLeader with NULL leader"] [currIdx=0] [regionID=290141]
[2023/04/14 07:19:17.017 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.017 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783799881302033,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484600,\"extra_op\":1}"] [addr=ip.ip:20160]
[2023/04/14 07:19:17.021 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484581] [requestID=484600] [addr=ip.ip:20160]
[2023/04/14 07:19:17.021 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484600] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783799881302033] [error="[CDC:ErrEventFeedEventError]not_leader:<region_id:290141 leader:<id:290144 store_id:61784 > > "]
[2023/04/14 07:19:17.021 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.021 +00:00] [INFO] [region_cache.go:842] ["switch region leader to specific leader due to kv return NotLeader"] [regionID=290141] [currIdx=1] [leaderStoreID=61784]
[2023/04/14 07:19:17.021 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783799881302033]
[2023/04/14 07:19:17.021 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783799881302033,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484601,\"extra_op\":1}"] [addr=ip.28.128:20160]
[2023/04/14 07:19:17.105 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484598] [requestID=484601] [addr=ip.28.128:20160]
[2023/04/14 07:19:17.211 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[2023/04/14 07:19:17.213 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
[2023/04/14 07:19:18.225 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
[2023/04/14 07:19:18.232 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[2023/04/14 07:19:18.653 +00:00] [INFO] [client.go:1150] ["stream to store closed"] [addr=ip.ip:20160] [storeID=1]
[2023/04/14 07:19:18.707 +00:00] [INFO] [client.go:1150] ["stream to store closed"] [addr=ip.ip:20160] [storeID=1]
[2023/04/14 07:19:19.212 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[2023/04/14 07:19:19.224 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
[2023/04/14 07:19:20.230 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
[2023/04/14 07:19:20.231 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[2023/04/14 07:19:20.645 +00:00] [WARN] [owner.go:660] ["create changefeed failed, retry later"] [changefeed=cdc-kafka] [error="[CDC:ErrKafkaNewSaramaProducer]kafka: client has run out of available brokers to talk to (Is your cluster reachable?)"] [errorVerbose="[CDC:ErrKafkaNewSaramaProducer]kafka: client has run out of available brokers to talk to (Is your cluster reachable?)\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByCause\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/normalize.go:279\ngithub.com/pingcap/ticdc/pkg/errors.WrapError\n\tgithub.com/pingcap/ticdc@/pkg/errors/helper.go:28\ngithub.com/pingcap/ticdc/cdc/sink/producer/kafka.NewKafkaSaramaProducer\n\tgithub.com/pingcap/ticdc@/cdc/sink/producer/kafka/kafka.go:330\ngithub.com/pingcap/ticdc/cdc/sink.newKafkaSaramaSink\n\tgithub.com/pingcap/ticdc@/cdc/sink/mq.go:477\ngithub.com/pingcap/ticdc/cdc/sink.init.1.func3\n\tgithub.com/pingcap/ticdc@/cdc/sink/sink.go:81\ngithub.com/pingcap/ticdc/cdc/sink.NewSink\n\tgithub.com/pingcap/ticdc@/cdc/sink/sink.go:113\ngithub.com/pingcap/ticdc/cdc.(*Owner).newChangeFeed\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:433\ngithub.com/pingcap/ticdc/cdc.(*Owner).loadChangeFeeds\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:634\ngithub.com/pingcap/ticdc/cdc.(*Owner).run\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:1357\ngithub.com/pingcap/ticdc/cdc.(*Owner).Run\n\tgithub.com/pingcap/ticdc@/cdc/owner.go:1252\ngithub.com/pingcap/ticdc/cdc.(*Server).campaignOwnerLoop\n\tgithub.com/pingcap/ticdc@/cdc/server.go:177\ngithub.com/pingcap/ticdc/cdc.(*Server).run.func1\n\tgithub.com/pingcap/ticdc@/cdc/server.go:260\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357"]
[2023/04/14 07:19:20.703 +00:00] [INFO] [owner.go:290] ["Find new changefeed"] [info="{\"sink-uri\":\"***\",\"opts\":{\"max-message-bytes\":\"67108864\"},\"create-time\":\"2022-08-10T08:38:01.259768036Z\",\"start-ts\":435190675862716506,\"target-ts\":0,\"admin-job-type\":2,\"sort-engine\":\"unified\",\"config\":{\"case-sensitive\":true,\"enable-old-value\":true,\"force-replicate\":false,\"check-gc-safe-point\":true,\"filter\":{\"rules\":[],\"ignore-txn-start-ts\":null,\"ddl-allow-list\":null},\"mounter\":{\"worker-num\":16},\"sink\":{\"dispatchers\":null,\"protocol\":\"canal-json\"},\"cyclic-replication\":{\"enable\":false,\"replica-id\":0,\"filter-replica-ids\":null,\"id-buckets\":0,\"sync-ddl\":false},\"scheduler\":{\"type\":\"table-number\",\"polling-time\":-1}},\"state\":\"normal\",\"history\":[1681456698295,1681456760629],\"error\":{\"addr\":\"ip.ip:8300\",\"code\":\"CDC-owner-1001\",\"message\":\"[CDC:ErrKafkaNewSaramaProducer]kafka: client has run out of available brokers to talk to (Is your cluster reachable?)\"},\"sync-point-enabled\":false,\"sync-point-interval\":600000000000,\"creator-version\":\"v4.0.13\"}"] [changefeed=cdc-kafka] ["checkpoint ts"=440783380765474971]
[2023/04/14 07:19:20.801 +00:00] [INFO] [kafka.go:320] ["Starting kafka sarama producer ..."] [config="{\"PartitionNum\":1,\"ReplicationFactor\":1,\"Version\":\"2.6.2\",\"MaxMessageBytes\":67108864,\"Compression\":\"none\",\"ClientID\":\"\",\"Credential\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"cert-allowed-cn\":null},\"TopicPreProcess\":true}"]
[2023/04/14 07:19:20.802 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=366] [regionID=128924] [startKey=6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006c0000000000fa] [endKey=6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006d0000000000fa] [checkpointTs=440783380765474971]
[2023/04/14 07:19:20.802 +00:00] [INFO] [client.go:814] ["creating new stream to store to send request"] [regionID=128924] [requestID=484604] [storeID=1] [addr=ip.ip:20160]
[2023/04/14 07:19:20.802 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=367] [regionID=128924] [startKey=6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006c00000000fb] [endKey=6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006d00000000fb] [checkpointTs=440783380765474971]
[2023/04/14 07:19:20.802 +00:00] [INFO] [client.go:814] ["creating new stream to store to send request"] [regionID=128924] [requestID=484605] [storeID=1] [addr=ip.ip:20160]
[2023/04/14 07:19:20.805 +00:00] [INFO] [puller.go:217] ["puller is initialized"] [duration=4.702012ms] [changefeed=] [tableID=-1] [spans="[\"[6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006c0000000000fa, 6d44444c4a6f624cff69ff737400000000ff0000f90000000000ff00006d0000000000fa)\",\"[6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006c00000000fb, 6d44444c4a6f6241ff64ff644964784c69ff7374ff0000000000ff000000f700000000ff0000006d00000000fb)\"]"] [resolvedTs=440783380765474971]
[2023/04/14 07:19:21.226 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
# many "drop resolved ts due to region feed stopped" entries
[2023/04/14 07:19:51.916 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484601] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783809017020493] [error="[CDC:ErrEventFeedEventError]not_leader:<region_id:290141 > "]
[2023/04/14 07:19:51.916 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.916 +00:00] [INFO] [region_cache.go:829] ["switch region peer to next due to NotLeader with NULL leader"] [currIdx=2] [regionID=290141]
[2023/04/14 07:19:51.916 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.916 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783809017020493,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484606,\"extra_op\":1}"] [addr=ip.ip:20160]
[2023/04/14 07:19:51.920 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484599] [requestID=484606] [addr=ip.ip:20160]
[2023/04/14 07:19:51.920 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484606] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783809017020493] [error="[CDC:ErrEventFeedEventError]not_leader:<region_id:290141 > "]
[2023/04/14 07:19:51.920 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.920 +00:00] [INFO] [region_cache.go:829] ["switch region peer to next due to NotLeader with NULL leader"] [currIdx=0] [regionID=290141]
[2023/04/14 07:19:51.920 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.920 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783809017020493,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484607,\"extra_op\":1}"] [addr=ip.ip:20160]
[2023/04/14 07:19:51.928 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484600] [requestID=484607] [addr=ip.ip:20160]
[2023/04/14 07:19:51.928 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484607] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783809017020493] [error="[CDC:ErrEventFeedEventError]region_not_found:<region_id:290141 > "]
[2023/04/14 07:19:51.928 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.928 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.929 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783809017020493,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484608,\"extra_op\":1}"] [addr=ip.ip:20160]
[2023/04/14 07:19:51.932 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484607] [requestID=484608] [addr=ip.ip:20160]
[2023/04/14 07:19:51.932 +00:00] [INFO] [client.go:968] ["EventFeed disconnected"] [regionID=290141] [requestID=484608] [span="[7480000000000005ff205f720000000000fa, 7480000000000005ff205f72891d3e675fffc010000000000000fa)"] [checkpoint=440783809017020493] [error="[CDC:ErrEventFeedEventError]region_not_found:<region_id:290141 > "]
[2023/04/14 07:19:51.932 +00:00] [INFO] [region_range_lock.go:370] ["unlocked range"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.932 +00:00] [INFO] [region_range_lock.go:218] ["range locked"] [lockID=199] [regionID=290141] [startKey=7480000000000005ff205f720000000000fa] [endKey=7480000000000005ff205f72891d3e675fffc010000000000000fa] [checkpointTs=440783809017020493]
[2023/04/14 07:19:51.932 +00:00] [INFO] [client.go:859] ["start new request"] [request="{\"header\":{\"cluster_id\":7006125491825389564,\"ticdc_version\":\"4.0.13\"},\"region_id\":290141,\"region_epoch\":{\"conf_ver\":8,\"version\":1367},\"checkpoint_ts\":440783809017020493,\"start_key\":\"dIAAAAAAAAX/IF9yAAAAAAD6\",\"end_key\":\"dIAAAAAAAAX/IF9yiR0+Z1//wBAAAAAAAAD6\",\"request_id\":484609,\"extra_op\":1}"] [addr=ip.ip:20160]
[2023/04/14 07:19:52.048 +00:00] [INFO] [client.go:1269] ["region state entry will be replaced because received message of newer requestID"] [regionID=290141] [oldRequestID=484608] [requestID=484609] [addr=ip.ip:20160]
[2023/04/14 07:19:52.441 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[

No obvious errors were seen in the PD and TiKV logs.
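To compare a log like the one above against the linked issues, it can help to tally entries by level and message instead of scrolling through the repeated warnings. The following is a rough sketch, not a TiCDC tool: it assumes the `[time] [LEVEL] [file:line] ["message"] [fields...]` layout seen in the excerpt, the function name `tally` is made up, and the sample lines are copied from this post.

```python
import re
from collections import Counter

# Header of one log entry: [time] [LEVEL] [file:line] ["message"] ...
# The message is quoted when it contains spaces and bare otherwise.
LINE_RE = re.compile(
    r'^\[(?P<time>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]+)\] '
    r'\[(?:"(?P<quoted>(?:[^"\\]|\\.)*)"|(?P<bare>[^\]]*))\]'
)


def tally(lines):
    """Count log entries per (level, message); unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # comments, blank lines, truncated entries
        message = m.group("quoted")
        if message is None:
            message = m.group("bare")
        counts[(m.group("level"), message)] += 1
    return counts


# A few entries copied verbatim from the cdc log in this post.
SAMPLE = r'''
[2023/04/14 07:17:15.879 +00:00] [ERROR] [kafka.go:227] ["close async client with error"] [error="kafka: Failed to deliver 9 messages."]
[2023/04/14 07:17:16.539 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=236727] [addr=ip.ip:20160]
[2023/04/14 07:18:16.298 +00:00] [INFO] [client.go:1150] ["stream to store closed"] [addr=ip.ip:20160] [storeID=1]
[2023/04/14 07:18:16.855 +00:00] [WARN] [client.go:1335] ["drop resolved ts due to region feed stopped"] [regionID=157636] [requestID=209288] [addr=ip.ip:20160]
'''.splitlines()

if __name__ == "__main__":
    for (level, message), n in tally(SAMPLE).most_common():
        print(f"{n:5d}  {level:5s}  {message}")
```

To run it on a full log file, pass the open file object to `tally` in place of `SAMPLE`.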

Similar issues:
https://github.com/pingcap/tiflow/issues/2552
https://github.com/pingcap/tiflow/issues/3352
https://github.com/pingcap/tiflow/issues/2978

https://github.com/pingcap/tiflow/issues/4241

Could someone take a look and say whether you have run into this before, or know which issue it corresponds to?

It may be this one: https://github.com/pingcap/tiflow/issues/4241

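In short, older TiCDC versions have a problem where an abnormal downstream Kafka can cause the TiCDC node to get stuck, which then affects other changefeeds.
We recommend upgrading TiCDC to the latest 6.1 or 6.5 release.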

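Thanks a lot. I looked at that issue, but the logs do not quite match, so it does not feel like the same one. Is there any plan to make cdc an independent component that is not tied to the cluster version? Otherwise, coordinating an upgrade is quite difficult.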

Due to some technical limitations, TiCDC 6.1 currently requires a TiDB cluster of at least 5.1.0-alpha.

OK, then for now I will work around it by deleting the task. Thank you.
