TiCDC in abnormal state: table sink stuck

【TiDB Environment】Production
【TiDB Version】v6.5.9
【Reproduction Path】What operations were performed before the problem appeared
The Kafka disks filled up, so data could no longer be written to Kafka.
【Problem Encountered: Symptoms and Impact】
Because the disk was full, the changefeed could not write data downstream; TiCDC stopped the task on its own and reported a warning:
{
  "upstream_id": 7358011227008144308,
  "namespace": "default",
  "id": "sms-record-sync",
  "state": "warning",
  "checkpoint_tso": 449549867346296837,
  "checkpoint_time": "2024-05-05 16:10:49.728",
  "error": {
    "time": "2024-05-08T11:51:07.591737751+08:00",
    "addr": "172.22.147.102:8300",
    "code": "CDC:ErrProcessorUnknown",
    "message": "table sink stuck"
  }
}
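
For context, the same changefeed status can be pulled from the TiCDC CLI. A minimal sketch, assuming the cdc binary is available and using the server address reported in the error above:

    # Show the summary status of all changefeeds (similar to the JSON above)
    cdc cli changefeed list --server=http://172.22.147.102:8300

    # Query one changefeed in detail, including its error and checkpoint
    cdc cli changefeed query --server=http://172.22.147.102:8300 --changefeed-id=sms-record-sync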

【Resource Configuration】Go to TiDB Dashboard → Cluster Info → Hosts and screenshot that page

【Attachments: Screenshots / Logs / Monitoring】
The relevant CDC error logs are as follows:
[2024/05/08 11:55:38.178 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7269] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.161.103:20160] [storeID=2] [streamID=19852] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).eventHandler\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:480\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func4\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:654\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.178 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7269] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.145.102:20160] [storeID=3] [streamID=19886] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).eventHandler\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:480\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func4\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:654\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.178 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7269] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.131.104:20160] [storeID=11] [streamID=19897] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).resolveLock\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:300\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func3\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:651\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.178 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7269] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.131.101:20160] [storeID=10] [streamID=19869] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).eventHandler\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:480\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func4\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:654\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.178 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7269] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.145.105:20160] [storeID=1] [streamID=19833] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).eventHandler\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:480\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func4\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:654\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.236 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7140] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.131.104:20160] [storeID=11] [streamID=19707] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).resolveLock\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:300\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func3\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:651\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.236 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7140] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.131.101:20160] [storeID=10] [streamID=19690] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).resolveLock\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:300\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func3\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:651\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.237 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7140] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.145.105:20160] [storeID=1] [streamID=19670] [error=“context canceled”] [errorVerbose=“context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/juju_adaptor.go:15\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).resolveLock\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:300\ngithub.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func3\n\tgithub.com/pingcap/tiflow/cdc/kv/region_worker.go:651\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 11:55:38.237 +08:00] [ERROR] [client.go:1068] [“region worker exited with error”] [namespace=default] [changefeed=sms-record-sync] [tableID=7140] [tableName=itnio_sms_record.tbsendrcd] [store=172.22.161.103:20160] [storeID=2] [streamID=19675] [error=“context canceled”]

Has that problem been resolved? If not, deal with it first, then run resume again to restore the replication task.
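
Once the downstream Kafka is writable again, resuming can be done with a single CLI call. A sketch, reusing the server address and changefeed ID from the status above:

    # Resume the changefeed from its saved checkpoint (449549867346296837 here)
    cdc cli changefeed resume --server=http://172.22.147.102:8300 --changefeed-id=sms-record-sync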

The Kafka problem has already been resolved; we expanded the disks. This table sink stuck error appeared when we ran resume.

Note down the current checkpoint and create a new changefeed starting from that position.
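
If recreating the task is the chosen route, the saved checkpoint_tso can be reused as the new changefeed's start-ts so no data is skipped. A rough sketch; the broker list, topic, and new changefeed ID are placeholders, and the original sink-uri plus --config (filter and dispatcher rules) should be carried over:

    # Remove the stuck changefeed; note its checkpoint_tso first (449549867346296837 in the status above)
    cdc cli changefeed remove --server=http://172.22.147.102:8300 --changefeed-id=sms-record-sync

    # Recreate it starting from that checkpoint
    cdc cli changefeed create \
      --server=http://172.22.147.102:8300 \
      --changefeed-id=sms-record-sync-v2 \
      --start-ts=449549867346296837 \
      --sink-uri="kafka://<brokers>/<topic>?protocol=canal-json" \
      --config changefeed.toml

This only works while the old checkpoint is still within the GC safe point, so create the new changefeed promptly after removing the old one; a stopped changefeed blocks GC only up to the gc-ttl (24 hours by default).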

Go to the TiCDC log directory and find where the table sink stuck error is reported, to get more detailed log information.
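
For example, assuming a typical TiUP layout where the CDC log sits under the deploy directory (adjust the path to the actual deployment):

    # On each TiCDC node, locate the error and pull the surrounding context
    grep -n "table sink stuck" /tidb-deploy/cdc-8300/log/cdc.log | tail
    grep -B 5 -A 20 "table sink stuck" /tidb-deploy/cdc-8300/log/cdc.log | less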

The logs are as follows:
[2024/05/08 17:20:39.388 +08:00] [WARN] [feed_state_manager.go:575] [“changefeed meets an warning”] [warning=“{"time":"2024-05-08T17:20:39.352566396+08:00","addr":"172.22.163.103:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}”]
[2024/05/08 17:20:46.139 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"warning","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:20:39.352566396+08:00","addr":"172.22.163.103:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]
[2024/05/08 17:20:53.538 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"warning","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:20:39.352566396+08:00","addr":"172.22.163.103:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]
[2024/05/08 17:28:46.033 +08:00] [WARN] [manager.go:289] [“Sink manager backend sink fails”] [namespace=default] [changefeed=sms-record-sync] [factoryVersion=1] [error=“table sink stuck”] [errorVerbose=“table sink stuck\ngithub.com/pingcap/tiflow/cdc/processor/sinkmanager.(*SinkManager).GetTableStats\n\tgithub.com/pingcap/tiflow/cdc/processor/sinkmanager/manager.go:1038\ngithub.com/pingcap/tiflow/cdc/processor.(*processor).GetTableStatus\n\tgithub.com/pingcap/tiflow/cdc/processor/processor.go:429\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*table).getTableStatus\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/table.go:72\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*agent).handleMessageHeartbeat\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/agent.go:269\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*agent).handleMessage\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/agent.go:249\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*agent).Tick\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/agent.go:206\ngithub.com/pingcap/tiflow/cdc/processor.(*processor).tick\n\tgithub.com/pingcap/tiflow/cdc/processor/processor.go:703\ngithub.com/pingcap/tiflow/cdc/processor.(*processor).Tick\n\tgithub.com/pingcap/tiflow/cdc/processor/processor.go:597\ngithub.com/pingcap/tiflow/cdc/processor.(*managerImpl).Tick\n\tgithub.com/pingcap/tiflow/cdc/processor/manager.go:133\ngithub.com/pingcap/tiflow/pkg/orchestrator.(*EtcdWorker).Run\n\tgithub.com/pingcap/tiflow/pkg/orchestrator/etcd_worker.go:290\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).runEtcdWorker\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:550\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run.func4\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:382\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 17:28:46.437 +08:00] [WARN] [feed_state_manager.go:481] [“processor reports a warning”] [namespace=default] [changefeed=sms-record-sync] [captureID=f724604d-e235-44bb-902d-c19bc8961504] [warning=“{"time":"2024-05-08T17:28:46.267991696+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}”]
[2024/05/08 17:28:46.438 +08:00] [WARN] [feed_state_manager.go:575] [“changefeed meets an warning”] [warning=“{"time":"2024-05-08T17:28:46.267991696+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}”]
[2024/05/08 17:31:02.988 +08:00] [WARN] [feed_state_manager.go:481] [“processor reports a warning”] [namespace=default] [changefeed=sms-record-sync] [captureID=29f337b7-84a2-4352-aa0e-7336bd5e6a9c] [warning=“{"time":"2024-05-08T17:31:02.942752472+08:00","addr":"172.22.147.102:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}”]
[2024/05/08 17:31:02.988 +08:00] [WARN] [feed_state_manager.go:575] [“changefeed meets an warning”] [warning=“{"time":"2024-05-08T17:31:02.942752472+08:00","addr":"172.22.147.102:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}”]
[2024/05/08 17:35:08.705 +08:00] [WARN] [manager.go:289] [“Sink manager backend sink fails”] [namespace=default] [changefeed=sms-record-sync] [factoryVersion=2] [error=“table sink stuck”] [errorVerbose=“table sink stuck\ngithub.com/pingcap/tiflow/cdc/processor/sinkmanager.(*SinkManager).GetTableStats\n\tgithub.com/pingcap/tiflow/cdc/processor/sinkmanager/manager.go:1038\ngithub.com/pingcap/tiflow/cdc/processor.(*processor).GetTableStatus\n\tgithub.com/pingcap/tiflow/cdc/processor/processor.go:429\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*table).getTableStatus\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/table.go:72\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*agent).handleMessageHeartbeat\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/agent.go:269\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*agent).handleMessage\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/agent.go:249\ngithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent.(*agent).Tick\n\tgithub.com/pingcap/tiflow/cdc/scheduler/internal/v3/agent/agent.go:206\ngithub.com/pingcap/tiflow/cdc/processor.(*processor).tick\n\tgithub.com/pingcap/tiflow/cdc/processor/processor.go:703\ngithub.com/pingcap/tiflow/cdc/processor.(*processor).Tick\n\tgithub.com/pingcap/tiflow/cdc/processor/processor.go:597\ngithub.com/pingcap/tiflow/cdc/processor.(*managerImpl).Tick\n\tgithub.com/pingcap/tiflow/cdc/processor/manager.go:133\ngithub.com/pingcap/tiflow/pkg/orchestrator.(*EtcdWorker).Run\n\tgithub.com/pingcap/tiflow/pkg/orchestrator/etcd_worker.go:290\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).runEtcdWorker\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:550\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run.func4\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:382\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.5.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594”]
[2024/05/08 17:35:08.993 +08:00] [WARN] [feed_state_manager.go:481] [“processor reports a warning”] [namespace=default] [changefeed=sms-record-sync] [captureID=f724604d-e235-44bb-902d-c19bc8961504] [warning=“{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}”]
[2024/05/08 17:35:08.993 +08:00] [WARN] [feed_state_manager.go:575] [“changefeed meets an warning”] [warning=“{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}”]
[2024/05/08 17:40:23.747 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"warning","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]
[2024/05/08 17:40:27.560 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"warning","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]
[2024/05/08 17:40:31.050 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"warning","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]
[2024/05/08 17:40:34.448 +08:00] [INFO] [changefeed.go:723] [“changefeed closed”] [namespace=default] [changefeed=sms-record-sync] [status=“{"checkpoint-ts":449549867346296837,"min-table-barrier-ts":449558191387967491,"admin-job-type":0}”] [info=“{"upstream-id":7358011227008144308,"namespace":"default","changefeed-id":"sms-record-sync","sink-uri":"kafka://b-1.itniokafka3.1uw2xm.c10.kafka.us-west-2.amazonaws.com:9092,b-2.itniokafka3.1uw2xm.c10.kafka.us-west-2.amazonaws.com:9092,b-3.itniokafka3.1uw2xm.c10.kafka.us-west-2.amazonaws.com:9092/sms-record-sync?kafka-version=2.8.1\u0026max-message-bytes=67108864\u0026partition-num=60\u0026protocol=canal-json\u0026replication-factor=1","create-time":"2024-04-28T15:51:25.988067401+08:00","start-ts":449391017571647493,"target-ts":0,"admin-job-type":0,"sort-engine":"unified","sort-dir":"","config":{"memory-quota":1073741824,"case-sensitive":false,"enable-old-value":true,"force-replicate":false,"check-gc-safe-point":true,"enable-sync-point":false,"enable-table-monitor":false,"bdr-mode":false,"sync-point-interval":600000000000,"sync-point-retention":86400000000000,"filter":{"rules":["itnio_sms_record.tbsendrcd"],"ignore-txn-start-ts":null,"event-filters":null},"mounter":{"worker-num":16},"sink":{"transaction-atomicity":"","protocol":"canal-json","dispatchers":[{"matcher":["itnio_sms_record.tbsendrcd"],"dispatcher":"","partition":"rowid","topic":""}],"csv":null,"column-selectors":null,"schema-registry":"","encoder-concurrency":0,"terminator":"\r\n","date-separator":"","enable-partition-separator":false,"advance-timeout-in-sec":150},"consistent":{"level":"none","max-log-size":64,"flush-interval":2000,"meta-flush-interval":200,"encoding-worker-num":16,"flush-worker-num":8,"storage":"","use-file-backend":false,"compression":"","memory-usage":{"memory-quota-percentage":50,"event-cache-percentage":0}},"changefeed-error-stuck-duration":1800000000000,"sql-mode":"ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION","synced-status":{"synced-check-interval":300,"checkpoint-interval":15}},"state":"warning","error":null,"warning":{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"},"creator-version":"v6.5.9","epoch":449618776523276334}”] [isRemoved=false]
[2024/05/08 17:40:36.738 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"stopped","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]
[2024/05/08 17:40:41.588 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"stopped","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]
[2024/05/08 17:40:42.688 +08:00] [INFO] [api.go:188] [“List changefeed successfully!”] [info=“{"upstream_id":7358011227008144308,"namespace":"default","id":"sms-record-sync","state":"stopped","checkpoint_tso":449549867346296837,"checkpoint_time":"2024-05-05 16:10:49.728","error":{"time":"2024-05-08T17:35:08.896949106+08:00","addr":"172.22.133.101:8300","code":"CDC:ErrProcessorUnknown","message":"table sink stuck"}}”]

In the current state, data is still arriving in Kafka, but the checkpoint_tso never advances.
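
One quick sanity check is converting the stalled checkpoint_tso back to wall-clock time; the physical part of a TiDB TSO is the value shifted right by 18 bits, in milliseconds since the epoch. A shell sketch:

    tso=449549867346296837
    # Should print a time matching checkpoint_time above (2024-05-05 16:10:49 +08:00, shown in the local time zone)
    date -d @$(( (tso >> 18) / 1000 ))

If that time stays fixed while data keeps flowing to Kafka, rows are being written downstream but no complete transaction boundary at a newer timestamp has been fully flushed yet.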

I checked the logs and there is an ALTER statement; the table has 900 million rows. Could that be what is causing the problem?

I suggest checking the logs.

Very likely.

We tested it; the ALTER statement is not the problem. A developer ran a full-table UPDATE on it, which filled up Kafka and caused everything that followed.


Well, that is a big one.

Then it looks like a very large transaction; it has not finished replicating, so the checkpoint does not advance.


Now that the root cause is clear, it seems we should routinely monitor for this kind of large transaction and heavy DDL operation.
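
As a rough example of that kind of routine check (the host is a placeholder and the 5-minute threshold is arbitrary):

    # Flag transactions that have been open for more than 5 minutes on any TiDB node
    mysql -h <tidb-host> -P 4000 -u root -p -e "
      SELECT INSTANCE, SESSION_ID, USER, DB, STATE, START_TIME, MEM_BUFFER_BYTES
      FROM information_schema.cluster_tidb_trx
      WHERE START_TIME < NOW() - INTERVAL 5 MINUTE;"

    # Review recent DDL jobs
    mysql -h <tidb-host> -P 4000 -u root -p -e "ADMIN SHOW DDL JOBS 10;"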