DM v1.0.6 binlog replication error 30012

The error is as follows:

"unresolvedDDLLockID": "",
"sync": {
"totalEvents": "0",
"totalTps": "0",
"recentTps": "0",
"masterBinlog": "(mysql-bin.000005, 547252327)",
"masterBinlogGtid": "",
"syncerBinlog": "(mysql-bin|000001.000003, 482531774)",
"syncerBinlogGtid": "",
"blockingDDLs": [
],
"unresolvedGroups": [
],
"synced": false
}
}
],
"relayStatus": {
"masterBinlog": "(mysql-bin.000005, 547252327)",
"masterBinlogGtid": "",
"relaySubDir": "aa5524b0-b7a1-11ea-98a0-00505695c7b8.000001",
"relayBinlog": "(mysql-bin.000003, 482531774)",
"relayBinlogGtid": "",
"relayCatchUpMaster": false,
"stage": "Paused",
"result": {
"isCanceled": false,
"errors": [
{
"Type": "UnknownError",
"msg": "",
"error": {
"ErrCode": 30012,
"ErrClass": 8,
"ErrScope": 1,
"ErrLevel": 3,
"Message": "start reader for UUID aa5524b0-b7a1-11ea-98a0-00505695c7b8.000001: start sync from position (mysql-bin.000003, 482531774): dial tcp 192.168.2.102:3306: connect: connection refused",
"RawCause": "dial tcp 192.168.2.102:3306: connect: connection refused"
}
}
],
"detail": null

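I tried changing the binlog file name in the relevant sync-checkpoint table, but it did not help: only syncerBinlog changed, and nothing under relayStatus changed in any meaningful way.

I found that the other two binlog files were never downloaded from the upstream master. What kind of bug is this? Does DM not download them automatically partway through replication? How can I resume replicating the data from here? Could a TiDB engineer please provide a solution as soon as possible?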

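Hello. Based on the information you provided, the error points to a problem with the privileges used to connect to the upstream. Please check whether the upstream user configured for dm-worker has the correct privileges. https://docs.pingcap.com/zh/tidb-data-migration/stable/dm-worker-intro#上游数据库用户权限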

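I configured the privileges long ago. Replication worked fine up to the 12th and then suddenly stopped working. How could this be a privilege problem?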

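The error shown above is connection refused, so we suggest checking the privileges first. Also, if convenient, please provide the dm-worker information and the DM version. Thanks.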
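The cause quoted in this reply can be read directly from the status output. Below is a minimal Python sketch, assuming the relayStatus object has already been parsed into a dict. The helper name relay_errors is hypothetical, and it relies only on field names that appear in the status output above (result, errors, error, ErrCode, RawCause, Message).

```python
def relay_errors(relay_status: dict) -> list:
    """Return (ErrCode, cause) pairs from a relayStatus dict.

    Falls back from RawCause to Message when RawCause is empty.
    """
    result = relay_status.get("result") or {}
    pairs = []
    for item in result.get("errors") or []:
        err = item.get("error") or {}
        cause = err.get("RawCause") or err.get("Message")
        pairs.append((err.get("ErrCode"), cause))
    return pairs


# Sample trimmed from the status output posted in this thread.
SAMPLE = {
    "stage": "Paused",
    "result": {
        "isCanceled": False,
        "errors": [
            {
                "Type": "UnknownError",
                "msg": "",
                "error": {
                    "ErrCode": 30012,
                    "RawCause": "dial tcp 192.168.2.102:3306: connect: connection refused",
                },
            }
        ],
    },
}

for code, cause in relay_errors(SAMPLE):
    print(code, cause)
```

A RawCause that begins with "dial tcp" and ends with "connection refused" means the TCP connection itself was rejected, so the network path and the upstream MySQL service are worth checking alongside the account.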

Below is the dm-worker information. The DM version is v1.0.6.

server-id = 101

source-id = "mysql-replica-01"

flavor = "mysql"

enable-gtid = false

relay-binlog-name = "mysql-bin.000001"

#charset of DSN of source mysql/mariadb instance
charset = ""
meta-dir = ""

[from]
host = "192.168.2.102"
user = "repl"
password = "+OAGTYteTKBbfD+fpyyEJ0C7mAZExCvxRsS/"
port = 3306

#relay log purge strategy
[purge]
interval = 3600
expires = 0
remain-space = 15

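Please also provide dm-worker.log, thanks. Have you verified the privileges? From the machine where dm-worker is deployed, can you connect to the upstream with the configured user, and are its privileges correct?

The privileges are fine. Everything was normal before the 12th. This task had been replicating all along and then suddenly started failing.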
[2020/08/06 12:39:58] [info] binlogsyncer.go:776 rotate to (mysql-bin.000001, 4)
[2020/08/11 07:39:43] [info] binlogsyncer.go:776 rotate to (mysql-bin.000002, 4)
[2020/08/11 07:39:43] [info] binlogsyncer.go:776 rotate to (mysql-bin.000002, 4)
[2020/08/12 09:24:17] [info] binlogsyncer.go:776 rotate to (mysql-bin.000003, 4)
[2020/08/12 09:24:17] [info] binlogsyncer.go:776 rotate to (mysql-bin.000003, 4)
[2020/08/12 15:10:53] [info] binlogsyncer.go:720 receive EOF packet, retry ReadPacket
[2020/08/12 15:10:53] [error] binlogsyncer.go:656 io.ReadFull(header) failed. err EOF: connection was bad
[2020/08/12 15:10:53] [warn] binlogsyncer.go:666 retry sync is disabled
[2020/08/12 15:10:53] [error] binlogstreamer.go:77 close sync with err: io.ReadFull(header) failed. err EOF: connection was bad
[2020/08/12 15:10:54] [info] binlogsyncer.go:175 syncer is closing…
[2020/08/12 15:10:54] [info] binlogsyncer.go:202 syncer is closed
[2020/08/12 15:10:54] [info] binlogsyncer.go:144 create BinlogSyncer with config {101 mysql 192.168.2.102 3306 repl false true false UTC true 0 30s 1m0s 0 true true 0}
[2020/08/12 15:10:54] [info] binlogsyncer.go:359 begin to sync binlog from position (mysql-bin.000003, 482531774)

The database suddenly became unreachable? Do I need to restart dm-worker?
[2020/08/12 15:10:50.181 +08:00] [ERROR] [syncer.go:2058] [“fail to estimate unreplicated binlog size”] [task=test-sync] [unit=“binlog replication”] [error="[code=10001:class=database:scope=not-set:level=high] database driver error: dial tcp 192.168.2.102:3306: connect: connection refused"] [errorVerbose="[code=10001:class=database:scope=not-set:level=high] database driver error: dial tcp 192.168.2.102:3306: connect: connection refused\ngithub.com/pingcap/dm/pkg/terror.(*Error).Delegate\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:271\ngithub.com/pingcap/dm/pkg/terror.DBErrorAdaptArgs\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:39\ github.com/pingcap/dm/pkg/terror.DBErrorAdapt\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:46\ github.com/pingcap/dm/syncer.getBinaryLogs\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/db.go:486\ github.com/pingcap/dm/syncer.countBinaryLogsSize\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/db.go:461\ github.com/pingcap/dm/syncer.(*UpStreamConn).countBinaryLogsSize\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/db.go:129\ngithub.com/pingcap/dm/syncer.(*Syncer).printStatus\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:2055\ngithub.com/pingcap/dm/syncer.(*Syncer).Run.func4\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1125\ runtime.goexit\ \t/usr/local/go/src/runtime/asm_amd64.s:1357"]
[2020/08/12 15:10:50.181 +08:00] [ERROR] [syncer.go:2077] [“fail to get master status”] [task=test-sync] [unit=“binlog replication”] [error="[code=10001:class=database:scope=not-set:level=high] database driver error: dial tcp 192.168.2.102:3306: connect: connection refused"]
[2020/08/12 15:10:50.181 +08:00] [INFO] [syncer.go:2088] [“binlog replication status”] [task=test-sync] [unit=“binlog replication”] [total_events=1] [total_tps=0] [tps=0] [master_position="(, 0)"] [master_gtid=NULL] [checkpoint="(mysql-bin|000001.000003, 482531774)(flushed (mysql-bin|000001.000003, 482217281))"][2020/08/12 15:10:52.997 +08:00] [WARN] [status.go:39] [“fail to get master status”] [task=test-sync] [unit=“binlog replication”] [error="[code=10001:class=database:scope=not-set:level=high] database driver error: dial tcp 192.168.2.102:3306: connect: connection refused"] [errorVerbose="[code=10001:class=database:scope=not-set:level=high] database driver error: dial tcp 192.168.2.102:3306: connect: connection refused\ngithub.com/pingcap/dm/pkg/terror.(*Error).Delegate\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:271\ngithub.com/pingcap/dm/pkg/terror.DBErrorAdaptArgs\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:39\ github.com/pingcap/dm/pkg/terror.DBErrorAdapt\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:46\ github.com/pingcap/dm/pkg/utils.GetMasterStatus\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/utils/db.go:147\ github.com/pingcap/dm/syncer.(*UpStreamConn).getMasterStatus\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/db.go:102\ngithub.com/pingcap/dm/syncer.(*Syncer).getMasterStatus\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:636\ngithub.com/pingcap/dm/syncer.(*Syncer).Status\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/status.go:37\ngithub.com/pingcap/dm/dm/worker.(*Worker).Status\ 
\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/status.go:104\ngithub.com/pingcap/dm/dm/worker.(*Worker).StatusJSON\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/status.go:125\ngithub.com/pingcap/dm/dm/worker.(*Worker).Start\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/worker.go:198\ngithub.com/pingcap/dm/dm/worker.(*Server).Start.func1\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/server.go:87\ runtime.goexit\ \t/usr/local/go/src/runtime/asm_amd64.s:1357"]
[2020/08/12 15:10:52.997 +08:00] [WARN] [status.go:44] [“fail to get flushed global point”] [task=test-sync] [unit=“binlog replication”] [error="[code=10001:class=database:scope=not-set:level=high] database driver error: dial tcp 192.168.2.102:3306: connect: connection refused"] [errorVerbose="[code=10001:class=database:scope=not-set:level=high] database driver error: dial tcp 192.168.2.102:3306: connect: connection refused\ngithub.com/pingcap/dm/pkg/terror.(*Error).Delegate\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:271\ngithub.com/pingcap/dm/pkg/terror.DBErrorAdaptArgs\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:39\ github.com/pingcap/dm/pkg/terror.DBErrorAdapt\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:46\ github.com/pingcap/dm/pkg/utils.GetMasterStatus\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/utils/db.go:147\ github.com/pingcap/dm/syncer.(*UpStreamConn).getMasterStatus\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/db.go:102\ngithub.com/pingcap/dm/syncer.(*Syncer).getMasterStatus\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:636\ngithub.com/pingcap/dm/syncer.(*Syncer).Status\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/status.go:37\ngithub.com/pingcap/dm/dm/worker.(*Worker).Status\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/status.go:104\ngithub.com/pingcap/dm/dm/worker.(*Worker).StatusJSON\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/status.go:125\ngithub.com/pingcap/dm/dm/worker.(*Worker).Start\ 
\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/worker.go:198\ngithub.com/pingcap/dm/dm/worker.(*Server).Start.func1\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/server.go:87\ runtime.goexit\ \t/usr/local/go/src/runtime/asm_amd64.s:1357"]
[2020/08/12 15:10:54.171 +08:00] [WARN] [relay.go:299] [“receive retryable error for binlog reader”] [component=“relay log”] [error="[code=30015:class=relay-unit:scope=upstream:level=high] TCPReader get relay event with error: io.ReadFull(header) failed. err EOF: connection was bad"][2020/08/12 15:10:54.172 +08:00] [ERROR] [relay.go:302] [“fail to close binlog event reader”] [component=“relay log”] [error="[code=10001:class=database:scope=upstream:level=high] kill connection 6197 for master 192.168.2.102:3306: database driver error: dial tcp 192.168.2.102:3306: connect: connection refused"] [errorVerbose="[code=10001:class=database:scope=upstream:level=high] kill connection 6197 for master 192.168.2.102:3306: database driver error: dial tcp 192.168.2.102:3306: connect: connection refused\ngithub.com/pingcap/dm/pkg/terror.(*Error).Delegate\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:271\ngithub.com/pingcap/dm/pkg/terror.DBErrorAdaptArgs\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:39\ github.com/pingcap/dm/pkg/terror.DBErrorAdapt\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:46\ github.com/pingcap/dm/pkg/utils.KillConn\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/utils/db.go:356\ github.com/pingcap/dm/pkg/binlog/reader.(*TCPReader).Close\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/binlog/reader/tcp.go:131\ngithub.com/pingcap/dm/relay/reader.(*reader).Close\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/reader/reader.go:126\ngithub.com/pingcap/dm/relay.(*Relay).process\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/relay.go:300\ngithub.com/pingcap/dm/relay.(*Relay).Process\ 
\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/relay.go:191\ngithub.com/pingcap/dm/dm/worker.(*realRelayHolder).run\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/relay.go:167\ngithub.com/pingcap/dm/dm/worker.(*realRelayHolder).Start.func1\ \t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/relay.go:143\ runtime.goexit\ \t/usr/local/go/src/runtime/asm_amd64.s:1357"]

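We suggest manually connecting to the upstream database from the machine where dm-worker is deployed. If that connection fails, check the upstream MySQL environment or the privileges. If the manual connection succeeds but dm-worker still cannot connect, try restarting dm-worker.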
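Before testing the account itself, it can help to confirm whether the upstream port is reachable at all from the dm-worker host. The following is a minimal sketch using only the Python standard library. The function name probe_tcp and the 3-second timeout are illustrative choices, not part of DM.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error".
    """
    try:
        # create_connection performs the TCP handshake only; no MySQL login is attempted.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

Running probe_tcp("192.168.2.102", 3306) on the dm-worker machine shows whether the "connection refused" seen in the logs can be reproduced outside DM. A result of "open" only proves that the port accepts connections; the user name, password, and privileges still need to be checked with a real MySQL client login.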

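I found that the password had been changed. I restarted dm-worker1 with the command below and then restarted the task as well, but nothing seems to have happened.
Is it OK to run ansible-playbook rolling_update.yml --tags=dm-worker1? The tag I gave this worker is dm-worker1.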

Please check whether the password in dm-worker.toml has been updated to the new password. We also suggest doing a rolling update of dm-master.

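No, it has not. Do I need to run a deploy with --tags=dm-worker1?

I only ran a rolling update for dm-worker1 from dm-ansible on the dm-master machine, but when I checked the actual worker I saw no changes. How should this update be done? Is a deploy not required? If a deploy is required, do I need to stop that dm-worker first?

Did the rolling update of dm-worker not take effect? Try following the steps in this thread: request to dm-worker ip:8262 is timeout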

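Thanks! It is solved. The root cause was that the password had been changed.
Steps taken to fix it:
1. Update the password in inventory.ini under dm-ansible.
2. Run deploy -l for the dm-worker that needs the change.
3. Run stop -l for that dm-worker, then start -l for it.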
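Steps 2 and 3 can be written out as explicit commands. The sketch below only builds and prints the command lines and executes nothing. It assumes the dm-ansible playbooks are named deploy.yml, stop.yml, and start.yml, and that dm-worker1 is the worker's alias in inventory.ini; neither is confirmed in this thread. Step 1, editing inventory.ini, remains a manual change.

```python
import shlex


def redeploy_commands(worker: str) -> list:
    """Build the ansible-playbook commands for steps 2 and 3, in order.

    Nothing is executed here; the caller decides how to run them.
    """
    # Assumed playbook file names; the thread only says "deploy", "stop", "start".
    playbooks = ["deploy.yml", "stop.yml", "start.yml"]
    # -l is ansible-playbook's --limit option, restricting the run to one host.
    return [["ansible-playbook", playbook, "-l", worker] for playbook in playbooks]


for command in redeploy_commands("dm-worker1"):
    print(shlex.join(command))
```

Note that -l limits the run by inventory host name, which is different from the --tags option tried earlier in the thread.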


:+1::+1:
