Task

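To help us respond efficiently, please provide the following information when asking a question. Clearly described problems are answered first.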

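  • [TiDB version]:
    DM version: 1.0.2
    TiDB version: 3.0.4
  • [Problem description]:
    query-status shows an error for the task, but query-error reports the task as normal. I tried resume-task office_dip_test, but the task does not recover. How can I recover it?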


For performance tuning or troubleshooting questions, please download and run the script, then select all of the terminal output and paste it here.

"msg": "[code=30012:class=relay-unit:scope=upstream:level=high] start reader for UUID 739342e7-4ec6-11e8-9f34-00505687548d.000001: start sync from position (mysql-bin.002392, 102514034): dial tcp 192.168.47.142:30112: connect: connection refused
github.com/pingcap/dm/pkg/terror.(*Error).Delegate /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:267
github.com/pingcap/dm/pkg/binlog/reader.(*TCPReader).StartSyncByPos /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/binlog/reader/tcp.go:79
github.com/pingcap/dm/relay/reader.(*reader).setUpReaderByPos /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/reader/reader.go:166
github.com/pingcap/dm/relay/reader.(*reader).Start /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/reader/reader.go:111
github.com/pingcap/dm/relay.(*Relay).setUpReader /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/relay.go:606
github.com/pingcap/dm/relay.(*Relay).process /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/relay.go:304
github.com/pingcap/dm/relay.(*Relay).Process /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/relay/relay.go:191
github.com/pingcap/dm/dm/worker.(*realRelayHolder).run /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/relay.go:164
github.com/pingcap/dm/dm/worker.(*realRelayHolder).Start.func1 /home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/worker/relay.go:140
runtime.goexit /usr/local/go/src/runtime/asm_amd64.s:1337"
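The bracketed prefix of the message carries the code, class, scope and level fields, and the tail names the address that could not be dialed. A minimal Python sketch for pulling those pieces out of such a message when scanning logs is shown here; the helper name `parse_dm_error` and the regular expressions are assumptions written against this one message, not part of DM.

```python
import re

# Bracketed header, e.g. "[code=30012:class=relay-unit:scope=upstream:level=high]".
HEADER_RE = re.compile(r"\[((?:[a-z]+=[^:\]]+)(?::[a-z]+=[^:\]]+)*)\]")
# Address inside a Go-style "dial tcp host:port" error.
DIAL_RE = re.compile(r"dial tcp (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def parse_dm_error(msg):
    """Return the header fields and dial target found in msg (hypothetical helper)."""
    fields = {}
    header = HEADER_RE.search(msg)
    if header:
        for pair in header.group(1).split(":"):
            key, value = pair.split("=", 1)
            fields[key] = value
    dial = DIAL_RE.search(msg)
    if dial:
        fields["dial_host"] = dial.group(1)
        fields["dial_port"] = int(dial.group(2))
    return fields


# First line of the message from this thread, without the stack trace.
sample = (
    "[code=30012:class=relay-unit:scope=upstream:level=high] "
    "start reader for UUID 739342e7-4ec6-11e8-9f34-00505687548d.000001: "
    "start sync from position (mysql-bin.002392, 102514034): "
    "dial tcp 192.168.47.142:30112: connect: connection refused"
)
info = parse_dm_error(sample)
```

Here `info["class"]` is `relay-unit`, which points at the relay unit rather than the sync unit, and the dial target is the address the worker failed to reach.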

Accessing the upstream database from the worker node works fine.

Is 192.168.47.142:30112 the IP address and port of the upstream MySQL?

Yes.
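Since the error is a refused TCP dial to that address, a quick reachability check run on the dm-worker host can help separate a network problem from a stale connection inside the worker. The following is a rough sketch only; `can_connect` is a hypothetical helper, and a successful TCP connect shows that the port accepts connections, not that MySQL authentication or binlog replication works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the address in this thread the call would be `can_connect("192.168.47.142", 30112)`. If it returns True while the relay unit still reports the error, the worker itself is the more likely culprit.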

What does the dm-worker node log show when you run the resume operation?

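I got it working again by restarting the worker node. The connection between the worker node and the upstream MySQL may have dropped without being re-established; after the restart it connected again. In a case like this the task status is Running but replication has actually stopped. Which metric should I monitor to detect this?

./scripts/stop_dm-worker.sh
./scripts/start_dm-worker.sh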

The error here comes from the Relay module. You can look at these monitoring metrics:

https://pingcap.com/docs-cn/stable/reference/tools/data-migration/monitor/#relay-log-1
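One way to turn the relay log metrics into a check for "task is Running but nothing is being pulled" is to compare the binlog file the upstream is currently writing with the file the relay unit has reached. The sketch below assumes both file names have already been read from monitoring or from query-status; the helper names and the `max_gap` threshold are illustrative assumptions, not DM settings.

```python
def binlog_index(file_name):
    """Return the numeric suffix of a binlog file name such as 'mysql-bin.002392'."""
    _, _, suffix = file_name.rpartition(".")
    if not suffix.isdigit():
        raise ValueError(f"unexpected binlog file name: {file_name!r}")
    return int(suffix)


def relay_is_lagging(master_file, relay_file, max_gap=1):
    """Return True if the relay is more than max_gap binlog files behind the upstream."""
    return binlog_index(master_file) - binlog_index(relay_file) > max_gap
```

For example, with the upstream at mysql-bin.002395 and the relay still at mysql-bin.002392, `relay_is_lagging` returns True. A gap that keeps growing while the task reports Running matches the situation described above.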
