TiCDC cannot replicate data; the log reports: etcd client outCh blocking too long

TiCDC cannot replicate. No error is shown, and the log prints the following warnings:

[2023/08/03 17:29:00.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=10.599113445s] [role=processor]
[2023/08/03 17:29:01.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=11.599196548s] [role=processor]
[2023/08/03 17:29:02.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=12.600323247s] [role=processor]
[2023/08/03 17:29:03.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=13.59911505s] [role=processor]
[2023/08/03 17:29:04.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=14.599893366s] [role=processor]
[2023/08/03 17:29:05.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=15.599842259s] [role=processor]
[2023/08/03 17:29:06.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=16.600315897s] [role=processor]
[2023/08/03 17:29:07.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=17.599149941s] [role=processor]
[2023/08/03 17:29:08.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=18.600282456s] [role=processor]
[2023/08/03 17:29:09.182 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=19.600627829s] [role=processor]
[2023/08/03 17:29:10.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=20.599791854s] [role=processor]
[2023/08/03 17:29:11.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=21.599458008s] [role=processor]
[2023/08/03 17:29:12.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=22.599127427s] [role=processor]
[2023/08/03 17:29:13.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=23.599763841s] [role=processor]
[2023/08/03 17:29:14.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=24.599174768s] [role=processor]
[2023/08/03 17:29:15.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=25.600263766s] [role=processor]
[2023/08/03 17:29:16.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=26.599329814s] [role=processor]
[2023/08/03 17:29:17.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=27.599595961s] [role=processor]
[2023/08/03 17:29:18.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=28.599149714s] [role=processor]
[2023/08/03 17:29:19.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=29.59949637s] [role=processor]
[2023/08/03 17:29:20.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=30.599115826s] [role=processor]
[2023/08/03 17:29:21.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=31.59928108s] [role=processor]
[2023/08/03 17:29:22.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=32.599275573s] [role=processor]
[2023/08/03 17:29:23.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=33.599748676s] [role=processor]
[2023/08/03 17:29:24.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=34.599139836s] [role=processor]
[2023/08/03 17:29:25.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=35.599114923s] [role=processor]
[2023/08/03 17:29:26.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=36.599141394s] [role=processor]
[2023/08/03 17:29:27.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=37.599736012s] [role=processor]
[2023/08/03 17:29:28.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=38.599553315s] [role=processor]
[2023/08/03 17:29:29.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=39.59912175s] [role=processor]
[2023/08/03 17:29:30.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=40.599826815s] [role=processor]
[2023/08/03 17:29:31.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=41.599205771s] [role=processor]
[2023/08/03 17:29:32.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=42.600046078s] [role=processor]
[2023/08/03 17:29:33.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=43.59911869s] [role=processor]
[2023/08/03 17:29:34.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=44.599363853s] [role=processor]
[2023/08/03 17:29:35.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=45.599124334s] [role=processor]
[2023/08/03 17:29:36.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=46.599703028s] [role=processor]
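
The duration in these warnings grows by about one second per line, which suggests one continuous stall rather than many short ones. Below is a minimal Python sketch (not part of TiCDC; `parse_line` and `estimate_stall_start` are names made up for illustration) that subtracts the duration from each timestamp to estimate when the stall began. It assumes the duration is given in plain seconds, as in the lines above.

```python
import re
from datetime import datetime, timedelta

# Captures the local timestamp and a plain-seconds duration field.
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) [+-]\d{2}:\d{2}\]"
    r".*\[duration=(?P<secs>\d+(?:\.\d+)?)s\]"
)

def parse_line(line):
    """Return (timestamp, duration_seconds), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S.%f")
    return ts, float(m.group("secs"))

def estimate_stall_start(lines):
    """Earliest (timestamp - duration) over all matching lines, or None."""
    starts = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            ts, secs = parsed
            starts.append(ts - timedelta(seconds=secs))
    return min(starts) if starts else None

# Two of the warnings from the log above.
SAMPLE = [
    '[2023/08/03 17:29:00.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=10.599113445s] [role=processor]',
    '[2023/08/03 17:29:01.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=11.599196548s] [role=processor]',
]

if __name__ == "__main__":
    print(estimate_stall_start(SAMPLE))
```

For the two sample lines, both estimates fall within the same second (around 17:28:49.58), which is consistent with a single stall that started then and had not recovered by 17:29:36.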

This looks similar to this topic: TiCDC cannot replicate and keeps printing the warning log [etcd client outCh blocking too long]

Please post the parameters and the configuration file you used to create the changefeed.

https://github.com/pingcap/tiflow/issues/4987
Is it this bug?

Judging by the description, it looks like it.

That's strange. This bug should already have been fixed, so I'm not sure why it is showing up again.

The configuration should be fine. We have always used the same configuration, and this problem appeared for the first time today.

There is also this error:

[2023/08/03 17:41:29.762 +08:00] [WARN] [server.go:447] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:249\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2023/08/03 17:41:29.763 +08:00] [WARN] [manager.go:162] ["processor close took too long"] [namespace=default] [changefeed=sync-tidb-test] [capture=879ea6bf-e014-4c93-aa0c-6edef50d1c46] [duration=4m0.489560052s]
[2023/08/03 17:41:29.763 +08:00] [WARN] [etcd_worker.go:293] ["EtcdWorker reactor tick took too long"] [duration=4m0.489780512s] [role=processor]
[2023/08/03 17:41:29.763 +08:00] [WARN] [server.go:271] ["handler not found"] [topic=changefeed/default/sync-tidb-test/agent]
[2023/08/03 17:41:29.856 +08:00] [WARN] [server.go:447] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:249\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2023/08/03 17:41:29.857 +08:00] [WARN] [server.go:271] ["handler not found"] [topic=changefeed/default/sync-tidb-test/scheduler]
[2023/08/03 17:41:29.966 +08:00] [WARN] [server.go:447] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:249\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2023/08/03 17:41:40.528 +08:00] [WARN] [replication_manager.go:503] ["schedulerv3: cannot advance checkpoint since missing table"] [namespace=default] [changefeed=sync-tidb-test] [tableID=13120]
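
The `duration` fields in these lines are printed in Go's duration format (for example `4m0.489560052s`), which is awkward to compare or sort as text. Below is a minimal Python sketch for converting such a string to seconds. The function name is hypothetical, and the parser is a simplification: it handles only unsigned values made of number-plus-unit parts.

```python
import re

# Seconds per unit for the units Go uses when printing durations.
_UNIT_SECONDS = {
    "h": 3600.0,
    "m": 60.0,
    "s": 1.0,
    "ms": 1e-3,
    "us": 1e-6,
    "µs": 1e-6,
    "ns": 1e-9,
}

# Two-letter units are listed before "m" and "s" so that "ms" is not read as "m".
_PART_RE = re.compile(r"(\d+(?:\.\d+)?)(h|ms|us|µs|ns|m|s)")

def go_duration_to_seconds(text):
    """Convert a string such as '4m0.489560052s' to a float number of seconds."""
    pos = 0
    total = 0.0
    for m in _PART_RE.finditer(text):
        if m.start() != pos:
            # Unparsed characters between two parts.
            raise ValueError(f"unsupported duration: {text!r}")
        total += float(m.group(1)) * _UNIT_SECONDS[m.group(2)]
        pos = m.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"unsupported duration: {text!r}")
    return total

if __name__ == "__main__":
    # Value taken from the "processor close took too long" warning above.
    print(go_duration_to_seconds("4m0.489560052s"))
```

Read this way, the processor close and the EtcdWorker tick each took a little over 240 seconds.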

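Based on the information you provided, this problem may be caused by TiCDC being unable to connect to etcd, or by etcd responding too slowly. You can try the following: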

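  1. Check that etcd is running normally. TiCDC uses the etcd embedded in PD, so you can check the PD members with the following command:

    tiup ctl:v<cluster-version> pd -u http://<pd-address>:<pd-port> member

    If etcd is not healthy, you can restart the PD nodes with the following command:

    tiup cluster restart <cluster-name> -R pd

    Here, <cluster-version> is the version of the TiDB cluster, <pd-address> and <pd-port> are the address and port of PD, and <cluster-name> is the name of the TiDB cluster.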

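  2. Check that the TiCDC configuration is correct. You can view the configuration of a changefeed with the following command:

    tiup ctl:v<cluster-version> cdc changefeed query --pd=http://<pd-address>:<pd-port> -c <changefeed-id>

    Here, <cluster-version> is the version of the TiDB cluster, <pd-address> and <pd-port> are the address and port of PD, and <changefeed-id> is the ID of the TiCDC replication task.

    Check the TiCDC version, the configuration of the replication task, and so on.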

  3. Check the TiCDC log. TiCDC writes its log to cdc.log in the log directory of each TiCDC node. You can filter it for warnings and errors, for example:

    grep -E "WARN|ERROR" <cdc-deploy-dir>/log/cdc.log

    Here, <cdc-deploy-dir> is the deployment directory of the TiCDC node.

    Look through the log for other errors or anomalies.

Found the problem: one table does not exist in the target environment, but `tiup cdc cli changefeed list` did not show any error information.

Was the missing table created upstream after the changefeed was created, or did it already exist before the changefeed's start-ts?

Neither. The table had been renamed in the downstream.
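
Because the changefeed reported no error here, a mismatch like this is easy to miss. Below is a minimal Python sketch of comparing the upstream and downstream table lists, assuming the (schema, table) pairs have already been collected from both sides (for example by querying `information_schema.tables`). The function name and the sample table names are hypothetical.

```python
def diff_tables(upstream, downstream):
    """Compare two iterables of (schema, table) pairs.

    Returns (missing_downstream, extra_downstream), both as sorted lists.
    """
    up = set(upstream)
    down = set(downstream)
    return sorted(up - down), sorted(down - up)

if __name__ == "__main__":
    # Hypothetical sample data: "users" was renamed in the downstream.
    upstream = [("test", "orders"), ("test", "users")]
    downstream = [("test", "orders"), ("test", "users_old")]
    missing, extra = diff_tables(upstream, downstream)
    print("missing in downstream:", missing)
    print("only in downstream:", extra)
```

A table that appears in the first list and a similarly named one in the second is a hint that the downstream table was renamed.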
