TiKV logs CDC errors and the CDC downstream receives no events from TiKV

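【TiDB environment】Production / Test / PoC
【TiKV version】7.6.0
【Operating system】
【Deployment】Cloud (which cloud) / physical machines (machine specs, disk type)
【Cluster data volume】
【Number of cluster nodes】3
【Steps to reproduce】
I am using TiKV's CDC interface. After several rounds of debugging, I found that some regions never receive resolvedTS events after being subscribed.
In addition, some TiKV instances (not all of them) keep printing the log below.
【Problem: symptoms and impact】The CDC downstream receives no events at all.
【Copy and paste the ERROR log】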

[2025/05/14 02:16:33.291 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:34.293 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:35.295 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:36.297 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:37.299 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:38.301 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:39.303 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:40.305 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:41.307 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:42.309 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:43.311 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:44.313 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:45.315 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:46.317 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:47.319 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:48.321 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:49.323 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:50.325 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:51.327 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:52.329 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:53.331 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
[2025/05/14 02:16:54.333 +00:00] [INFO] [endpoint.rs:1038] ["cdc send event failed, full"] [downstream="\"ipv4:10.222.55.32:50610\""] [conn_id=ConnId(2)] [thread_id=0x5]
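When only some TiKV instances print this message, it can help to summarize the log per connection before digging further. The following is a minimal sketch, not an official tool: it assumes the log line layout shown above, and `summarize_full_events` is a hypothetical helper name chosen for illustration. It counts the "cdc send event failed, full" lines per `conn_id` and records the first and last timestamp seen.

```python
import re

# Layout assumed from the lines quoted in this thread:
# [timestamp] [LEVEL] [file:line] ["message"] [key=value] ...
LINE_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] \[(?P<src>[^\]]+)\] \["(?P<msg>[^"]*)"\]'
)
CONN_RE = re.compile(r'\[conn_id=ConnId\((\d+)\)\]')


def summarize_full_events(lines):
    """Return {conn_id: (count, first_ts, last_ts)} for 'send event failed, full' lines."""
    stats = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("msg") != "cdc send event failed, full":
            continue  # not the message we are interested in
        c = CONN_RE.search(line)
        if not c:
            continue  # line has no conn_id field
        conn_id = int(c.group(1))
        count, first_ts, _ = stats.get(conn_id, (0, m.group("ts"), None))
        stats[conn_id] = (count + 1, first_ts, m.group("ts"))
    return stats


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize.py < tikv.log
    for conn_id, (count, first_ts, last_ts) in sorted(summarize_full_events(sys.stdin).items()):
        print(f"conn_id={conn_id} count={count} first={first_ts} last={last_ts}")
```

A connection that logs this message once per second for a long stretch, as in the excerpt above, points at a downstream that is not draining its event stream rather than at a transient burst.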

Logs from the subscription

[2025/05/13 10:33:42.934 +00:00] [INFO] [service.rs:431] ["cdc connection created"] [features="[\"stream-multiplexing\"]"] [downstream=ipv4:10.222.36.248:52024] [thread_id=0x5]
[2025/05/13 10:33:42.934 +00:00] [INFO] [endpoint.rs:790] ["cdc register region"] [downstream_id=DownstreamId(110)] [observe_id=ObserveId(606)] [req_id=2] [conn_id=ConnId(58)] [region_id=45] [thread_id=0x5]

The "cdc downstream starts to initialize" log never appeared.
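The comparison described here (a region is registered, but the initialization message never follows) can be automated across a whole log file. The sketch below is only an illustration under assumptions: it assumes the "cdc downstream starts to initialize" line carries a `[region_id=...]` field like the "cdc register region" line does, which this thread does not show, and `regions_never_initialized` is a hypothetical helper name.

```python
import re

# The first ["..."] group on a line is the log message in the layout quoted above.
MSG_RE = re.compile(r'\["(?P<msg>[^"]*)"\]')
REGION_RE = re.compile(r'\[region_id=(\d+)\]')

REGISTER_MSG = "cdc register region"
# Assumption: this line also includes a [region_id=N] field.
INIT_MSG = "cdc downstream starts to initialize"


def regions_never_initialized(lines):
    """Return sorted region ids that were registered but never logged initialization."""
    registered, initialized = set(), set()
    for line in lines:
        m = MSG_RE.search(line)
        r = REGION_RE.search(line)
        if not m or not r:
            continue  # not a region-scoped CDC line
        region_id = int(r.group(1))
        if m.group("msg") == REGISTER_MSG:
            registered.add(region_id)
        elif m.group("msg") == INIT_MSG:
            initialized.add(region_id)
    return sorted(registered - initialized)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python stuck_regions.py < tikv.log
    print(regions_never_initialized(sys.stdin))
```

The resulting region ids are the ones worth checking first for a scan task that has not finished.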

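Update: the problem has now been largely narrowed down to TiKV's CDC scan task failing to exit for a long time.

The specific reason it cannot exit is still under investigation.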

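Why did you choose version 7.6.0?

8.x and 9.x still don't feel very mature to me… I have the impression this can't really be fixed, and the only option is to tune parameters to try to make TiKV scan faster?

At least pick an LTS version. You picked a DMR release, and when a DMR has a bug no fix version is ever released for it, so how would you upgrade?

Then the only option is a major version upgrade, which might be worth considering…

If this is in production, that is rather bad.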

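You need to check whether these regions show anything abnormal, for example whether they are being split or merged.

You should choose an LTS version. The problem now is that even if you eventually confirm this is a bug, nobody will fix it on the 7.6 branch.

If you don't want 8.x or 9.x, you can use 7.1 or 7.5. Both branches are LTS versions, so bugs get fixed in patch releases, for example 7.5.1 to 7.5.2. With 7.6.0, even if there is a bug, there will be no 7.6.1.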