TiCDC同步数据到MySQL,有时数据无法同步,请协助确认原因,谢谢。

为提高效率,请提供以下信息,问题描述清晰能够更快得到解决:

【TiDB 版本】v.4.0.8

【问题描述】在TiDB端执行update命令,有时无法同步到MySQL,具体信息如下:
TiDB侧:
MySQL [sync_test]> update cl_user_old_1 set aid=‘aaa’ limit 2000;
Query OK, 2000 rows affected (0.12 sec)
Rows matched: 2000 Changed: 2000 Warnings: 0

MySQL [sync_test]> select count() from cl_user_old_1 where aid=‘aaa’;
±---------+
| count(
) |
±---------+
| 2000 |
±---------+
1 row in set (0.30 sec)

MySQL [sync_test]> update cl_user_old_1 set aid=‘bbb’ limit 1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0

MySQL侧:

MySQL [sync_test]> select count() from cl_user_old_1 where aid=‘aaa’;
±---------+
| count(
) |
±---------+
| 0 |
±---------+
1 row in set (0.40 sec)

MySQL [sync_test]> select count() from cl_user_old_1 where aid=‘bbb’;
±---------+
| count(
) |
±---------+
| 0 |
±---------+
1 row in set (0.39 sec)

TiCDC日志:
[2021/02/20 10:47:53.430 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 10:48:23.460 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 10:48:53.984 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 10:49:24.015 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=2000] [qps=66]
[2021/02/20 10:49:54.015 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 10:50:24.015 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 10:50:54.065 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 10:51:24.115 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 10:51:54.115 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=1] [qps=0]

TiCDC任务显示正常:
tiup ctl cdc changefeed list --pd=http://10.60.79.168:2379
Starting component ctl: /root/.tiup/components/ctl/v5.0.0-rc/ctl cdc changefeed list --pd=http://10.60.79.168:2379
[
{
“id”: “sync-test-repl-task”,
“summary”: {
“state”: “normal”,
“tso”: 423045195223269378,
“checkpoint”: “2021-02-20 10:49:09.454”,
“error”: null
}
}
]


若提问为性能优化、故障排查类问题,请下载脚本运行。终端输出的打印结果,请务必全选并复制粘贴上传。

这部分尝试使用 limit 看能查出数据吗,

tidb 侧同样的,也辛苦坐下。

您好,为了说明问题,我带上了时间信息,重新复现了下,请参考如下:
TIDB侧:

root@127.0.0.1:4000 [sync_test] : Sat Feb 20 13:22:43 2021 > select count() from cl_user_old_1 where aid=‘xxxxx’ ;
±---------+
| count(
) |
±---------+
| 0 |
±---------+
1 row in set (0.31 sec)

root@127.0.0.1:4000 [sync_test] : Sat Feb 20 13:22:51 2021 > update cl_user_old_1 set aid=‘xxxxx’ limit 2000;
Query OK, 2000 rows affected (0.13 sec)
Rows matched: 2000 Changed: 2000 Warnings: 0

root@127.0.0.1:4000 [sync_test] : Sat Feb 20 13:23:24 2021 > select count() from cl_user_old_1 where aid=‘xxxxx’ ;
±---------+
| count(
) |
±---------+
| 2000 |
±---------+
1 row in set (0.26 sec)

root@127.0.0.1:4000 [sync_test] : Sat Feb 20 13:25:18 2021 > select uname from cl_user_old_1 where aid=‘xxxxx’ limit 5;
±------------------+
| uname |
±------------------+
| Amur |
| galaxy.jax |
| JasmineRobinson |
| FernandoD.Ventura |
| EverardoTorres |
±------------------+
5 rows in set (0.00 sec)
MySQL侧:
opsadmin@rds-test.ct7cj9f0ldac.us-east-1.rds.amazonaws.com:3306 [sync_test] : Sat Feb 20 13:24:37 2021 > select count() from cl_user_old_1 where aid=‘xxxxx’ ;
±---------+
| count(
) |
±---------+
| 0 |
±---------+
1 row in set (0.38 sec)

dbadmin@rds-test.ct7cj9f0ldac.us-east-1.rds.amazonaws.com:3306 [sync_test] : Sat Feb 20 13:24:39 2021 >select uname from cl_user_old_1 where aid=‘xxxxx’ limit 5;
Empty set (0.39 sec)

dbadmin@rds-test.ct7cj9f0ldac.us-east-1.rds.amazonaws.com:3306 [sync_test] : Sat Feb 20 13:26:04 2021 > select count() from cl_user_old_1 where aid=‘xxxxx’ ;
±---------+
| count(
) |
±---------+
| 0 |
±---------+
1 row in set (0.37 sec)

TiCDC日志:
[2021/02/20 13:23:44.590 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:45.591 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:46.592 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:47.593 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:48.594 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:49.595 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:50.596 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:51.055 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=2000] [qps=66]
[2021/02/20 13:23:51.598 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:52.598 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:53.599 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:54.600 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:55.601 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:56.602 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:57.603 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:58.604 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:23:59.605 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:00.606 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:01.607 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:02.608 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:03.609 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:04.610 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:05.611 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:06.612 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:07.613 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:08.614 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:09.615 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:10.616 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:11.617 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:12.618 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:13.619 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:14.619 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:15.620 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:16.622 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:17.622 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:18.624 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:19.625 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:20.625 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:21.105 +08:00] [INFO] [statistics.go:118] [“sink replication status”] [name=mysql] [changefeed=sync-test-repl-task] [captureaddr=10.60.79.168:8300] [count=0] [qps=0]
[2021/02/20 13:24:21.627 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:22.627 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:23.628 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:24.629 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]
[2021/02/20 13:24:25.630 +08:00] [WARN] [client.go:1200] [“drop resolved ts due to region feed stopped”] [regionID=2] [requestID=97] [addr=10.60.79.168:20160]

疑问:

  1. 上午提的问题,我刚刚check结果,发现数据又同步过去了,为了研究这个问题,我重现了这个现象,那我的疑问是:ticdc的状态是normal的,是不是数据在同步,只是没有同步完?如果查看同步进度?谢谢。

更新下信息:
现在这个时间点去check,发现数据同步到MySQL了
dbadmin@rds-test.ct7cj9f0ldac.us-east-1.rds.amazonaws.com:3306 [sync_test] : Sat Feb 20 14:24:18 2021 > select count() from cl_user_old_1 where aid=‘xxxxx’ ;
±---------+
| count(
) |
±---------+
| 2000 |
±---------+

还是之前的疑问,不知道ticdc发生了什么,现象看上去就是同步不及时,而且没有任何的日志信息可供参考。

ticdc 的监控可以给下~


还有个现象,CDC的负载比较高,请问这正常吗?

从表述上来看同步是有一定延迟的,changefeed status normal 只能说明该 changefeed 状态是正常的。

可否使用以下方式,提供下 ticdc的监控面板截图,判断下延迟情况,或者自行判断下。感谢配合:


打开 grafana 监控,先按 d 再按 shift+e 可以打开所有监控项。

1. chrome 安装这个插件https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl

2. 鼠标焦点置于 Dashboard 上,按 ? 可显示所有快捷键,先按 d 再按 E 可将所有 Rows 的 Panels 打开,需等待一段时间待页面加载完成。

3. 使用这个 full-page-screen-capture 插件进行截屏保存

ticdc 负载较高目前也是已知的现象,我们也在尝试优化。

延迟时间点的监控?(监控包括操作该问题的步骤。

没理解您的意思。那同步延时的原因是?
还有个问题,关于tidb同步数据到MySQL,除了用TiCDC,请问是否还有其他更好的办法?
谢谢。

这个是checkpoint的时间,图里线性增长是正常的,说明没有什么大延迟,上面也讲了,看咱们在 mysql 和 tidb 中操作的时长,如果马上查询可能出现这中情况。

这个监控是update 200条的监控,如果update 2000条,那延时会是2.38个小时,这样的延时是正常的?

root@127.0.0.1:4000 [sync_test] : Sat Feb 20 13:22:51 2021 > update cl_user_old_1 set aid=‘xxxxx’ limit 2000;
2000 行是个大事务,对应监控 changefeed checkpoint lag 可以对应上。

2.38min~~~

此话题已在最后回复的 1 分钟后被自动关闭。不再允许新回复。