dm同步 worker 中异常报错 i/o timeout

ponent=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=usernovicetask] [from="position: (mysql-bin.000164, 786411355), gtid-set: "] [to="position: (mysql-bin.000164, 786411355), gtid-set: "]
[2021/12/16 14:30:19.745 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=userbuybookhistory] [from="position: (mysql-bin.000164, 791638283), gtid-set: "] [to="position: (mysql-bin.000164, 791638283), gtid-set: "]
[2021/12/16 14:30:19.876 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=userenergytimestask] [from="position: (mysql-bin.000164, 790584060), gtid-set: "] [to="position: (mysql-bin.000164, 790584060), gtid-set: "]
[2021/12/16 14:30:19.980 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=giftitem] [from="position: (mysql-bin.000164, 791592910), gtid-set: "] [to="position: (mysql-bin.000164, 791592910), gtid-set: "]
[2021/12/16 14:30:20.088 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=usertask] [from="position: (mysql-bin.000164, 791697359), gtid-set: "] [to="position: (mysql-bin.000164, 791697359), gtid-set: "]
[2021/12/16 14:30:20.194 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=ruleenginelog] [from="position: (mysql-bin.000164, 791768505), gtid-set: "] [to="position: (mysql-bin.000164, 791768505), gtid-set: "]
[2021/12/16 14:30:20.297 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=deeplinkmisslog] [from="position: (mysql-bin.000164, 783635788), gtid-set: "] [to="position: (mysql-bin.000164, 783635788), gtid-set: "]
[2021/12/16 14:30:20.447 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=bookcommentreplyitem] [from="position: (mysql-bin.000164, 791776883), gtid-set: "] [to="position: (mysql-bin.000164, 791776883), gtid-set: "]
[2021/12/16 14:30:20.583 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=daily_platform_static] [from="position: (mysql-bin.000164, 791772789), gtid-set: "] [to="position: (mysql-bin.000164, 791772789), gtid-set: "]
[2021/12/16 14:30:20.697 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=sale_total_yestoday] [from="position: (mysql-bin.000164, 708986436), gtid-set: "] [to="position: (mysql-bin.000164, 708986436), gtid-set: "]
[2021/12/16 14:30:20.804 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=chl_daily_pay] [from="position: (mysql-bin.000164, 784335907), gtid-set: "] [to="position: (mysql-bin.000164, 784335907), gtid-set: "]
[2021/12/16 14:30:20.912 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=sale_composition_daily_core] [from="position: (mysql-bin.000164, 707890545), gtid-set: "] [to="position: (mysql-bin.000164, 707890545), gtid-set: "]
[2021/12/16 14:30:21.026 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=testnovel_tidb_sp] [table=ads_adjust_notify] [from="position: (mysql-bin.000164, 788132944), gtid-set: "] [to="position: (mysql-bin.000164, 788132944), gtid-set: "]
[2021/12/16 14:30:21.123 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=activityawardlog] [from="position: (mysql-bin.000164, 786790011), gtid-set: "] [to="position: (mysql-bin.000164, 786790011), gtid-set: "]
[2021/12/16 14:30:21.214 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=activitydiceorder] [from="position: (mysql-bin.000164, 598815744), gtid-set: "] [to="position: (mysql-bin.000164, 598815744), gtid-set: "]
[2021/12/16 14:30:21.311 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=activitygameuserguesslog] [from="position: (mysql-bin.000164, 723128762), gtid-set: "] [to="position: (mysql-bin.000164, 723128762), gtid-set: "]
[2021/12/16 14:30:21.410 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=activityuserguide] [from="position: (mysql-bin.000164, 791772191), gtid-set: "] [to="position: (mysql-bin.000164, 791772191), gtid-set: "]
[2021/12/16 14:30:21.518 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=activitygamerobot] [from="position: (mysql-bin.000164, 514220910), gtid-set: "] [to="position: (mysql-bin.000164, 514220910), gtid-set: "]
[2021/12/16 14:30:21.617 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=invitecode] [from="position: (mysql-bin.000164, 649757278), gtid-set: "] [to="position: (mysql-bin.000164, 649757278), gtid-set: "]
[2021/12/16 14:30:21.751 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=activitydiceawardlog] [from="position: (mysql-bin.000164, 598816168), gtid-set: "] [to="position: (mysql-bin.000164, 598816168), gtid-set: "]
[2021/12/16 14:30:21.861 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=flopcardlotterylog] [from="position: (mysql-bin.000164, 712760530), gtid-set: "] [to=“position: (mysql-bin.000164, 712760530), gtid-set: “]
[2021/12/16 14:30:21.977 +08:00] [INFO] [checkpoint.go:694] [“rollback checkpoint”] [task=sp-mysql-8] [unit=“binlog replication”] [component=“remote checkpoint”] [schema=test_activity_tidb_sp] [table=activitygameuserguess] [from=“position: (mysql-bin.000164, 750914562), gtid-set: “] [to=“position: (mysql-bin.000164, 750914562), gtid-set: “]
[2021/12/16 14:30:22.086 +08:00] [INFO] [subtask.go:292] [“unit process returned”] [subtask=sp-mysql-8] [unit=Sync] [stage=Paused] [status=”{“syncerBinlog”:”(mysql-bin.000164, 791776914)”,“binlogType”:“remote”}”]
[2021/12/16 14:30:22.086 +08:00] [ERROR] [subtask.go:311] [“unit process error”] [subtask=sp-mysql-8] [unit=Sync] [“error information”=”{“ErrCode”:36069,“ErrClass”:“sync-unit”,“ErrScope”:“upstream”,“ErrLevel”:“high”,“Message”:“get binlog event error: io.CopyN failed. err read tcp 10.10.10.74:54308-\u003e143.120.123.189:33006: i/o timeout, copied 4931584, expected 16777215: connection was bad”,“Workaround”:“Please check if the binlog file could be parsed by mysqlbinlog.”}”]
[2021/12/16 14:30:26.983 +08:00] [WARN] [task_checker.go:393] [“backoff skip auto resume task”] [component=“task checker”] [task=sp-mysql-8] [latestResumeTime=2021/12/16 14:29:01.983 +08:00] [duration=5m0s]
[2021/12/16 14:30:28.364 +08:00] [INFO] [server.go:753] [request=QueryStatus] [payload=]
[2021/12/16 14:30:31.983 +08:00] [WARN] [task_checker.go:393] [“backoff skip auto resume task”] [component=“task checker”] [task=sp-mysql-8] [latestResumeTime=2021/12/16 14:29:01.983 +08:00] [duration=5m0s]
[2021/12/16 14:30:36.984 +08:00] [WARN] [task_checker.go:393] [“backoff skip auto resume task”] [component=“task checker”] [task=sp-mysql-8] [latestResumeTime=2021/12/16 14:29:01.983 +08:00] [duration=5m0s]
[2021/12/16 14:30:41.983 +08:00] [WARN] [task_checker.go:393] [“backoff skip auto resume task”] [component=“task checker”] [task=sp-mysql-8] [latestResumeTime=2021/12/16 14:29:01.983 +08:00] [duration=5m0s]
[2021/12/16 14:30:46.983 +08:00] [WARN] [task_checker.go:393] [“backoff skip auto resume task”] [component=“task checker”] [task=sp-mysql-8] [latestResumeTime=2021/12/16 14:29:01.983 +08:00] [duration=5m0s]

3 个赞

1 个赞

1)请问dm、上游mysql、下游tidb是什么版本?
2)确认dm可以拥有上游mysqld实例相应的权限
3)日志报错:
[2021/12/16 14:30:22.086 +08:00] [ERROR] [subtask.go:311] [“unit process error”] [subtask=sp-mysql-8] [unit=Sync] [“error information”=”{“ErrCode”:36069,“ErrClass”:“sync-unit”,“ErrScope”:“upstream”,“ErrLevel”:“high”,“Message”:“get binlog event error: io.CopyN failed. err read tcp 10.10.10.74:54308-\u003e143.120.123.189:33006: i/o timeout, copied 4931584, expected 16777215: connection was bad”,“Workaround”:“Please check if the binlog file could be parsed by mysqlbinlog .”}”]
看日志是解析binlog报错

2 个赞

dm-worker进程(10.10.10.74:54308) 到上游mysql(143.120.123.189:33006)之间的链路质量如何?看起来网络有问题。可能原因:
1)网络有问题
2)dm-worker或者mysqld所在的主机负载比较高,导致网络io超时,收到的数据包有问题。

1 个赞

1) dm版本:2.0.3、mysql: 8.0.22、tidb: 5.0.3
2)上有mysql实例同步账号权限:GRANT SELECT, RELOAD, SUPER, REPLICATION SLAVE, REPLICATION CLIENT ON . TO test@192.168.1.172;

[root@10_10_10_74 ~]# iostat
Linux 3.10.0-693.21.1.el7.x86_64 (10_10_10_74) 2021年12月16日 x86_64 (24 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
29.09 0.00 2.70 0.04 0.00 68.16

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 92.08 147.06 922.73 2172335250 13629972218

[root@10_10_10_74 ~]# ping xxx.xx.xx.xx
PING 43.130.113.89 () 56(84) bytes of data.
64 bytes from : icmp_seq=1 ttl=48 time=266 ms
64 bytes from : icmp_seq=2 ttl=48 time=296 ms
64 bytes from : icmp_seq=3 ttl=48 time=276 ms
64 bytes from : icmp_seq=4 ttl=48 time=276 ms
64 bytes from : icmp_seq=6 ttl=48 time=290 ms
^C
— xxx.xxx.xxx.xxx ping statistics —
7 packets transmitted, 5 received, 28% packet loss, time 6002ms
rtt min/avg/max/mdev = 266.775/281.424/296.513/10.778 ms

1 个赞

上有mysql、下游tidb负载都不高,网络延迟在200~300ms之间,io也不高

1 个赞

可能有一些不兼容的情况

上游mysql8 没有开启gtid_mode;binlog pos卡住的地方是在执行这条语句,不知道是不是跟这个有关系
image

我有5个上游mysql8实例在做dm同步,有1个同步正常,4个报同样的报错,其中1个卡了1天后,恢复了同步,目前又出现同样的报错。

5个mysql 8版本的实例配置都是一样的吗?

是的,都是报I/O timeout

如果5个mysql 8中有4个报错,1个正常,可以比对下配置有什么差异

配置都是一样,除了CPU、内存有些不同,不过有3个实例CPU、内存是一样的。

1)上游同步的实例是从还是主(包括双主)?
2)复制源的实例是否有开启gtid?
3) 上游实例是否有做过gtid的调整?
4) gtid_next=anonymous,是一种过度状态,推测做过gtid的调整

在关闭GTID的时候,主库在设置成OFF_PERMISSIVE后,不再生成GTID,备库在接受到带GTID 和不带GTID的事务都可以容忍。
主库在关闭GTID时,执行事务会产生一个Anonymous_Gtid事件,会在备库执行:
SET @@SESSION.GTID_NEXT= ‘ANONYMOUS’
备库在执行匿名事务时,就不会去尝试生成本地GTID了

1 个赞

dm 配置文件都是一样的

我说不是dm配置,说的是上游mysql实例的gtid配置。

1) 上游mysql实例从库
2)安装初始化时就未开启
3)没有

都是相同的