DM replication task fails to start with an error

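To improve efficiency, please provide the following information when asking a question. Clearly described problems get priority responses.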

  • [TiDB version]: v4.0.0
  • [Problem description]:

Starting the DM task fails with the following error:

» start-task test_tidb.yaml
{
    "result": false,
    "msg": "rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 192.168.10.101:8265: connect: connection refused\"\
github.com/pingcap/errors.AddStack\
\t/go/pkg/mod/github.com/pingcap/errors@v0.11.4/errors.go:174\
github.com/pingcap/errors.Trace\
\t/go/pkg/mod/github.com/pingcap/errors@v0.11.4/juju_adaptor.go:15\
github.com/pingcap/dm/dm/master/workerrpc.callRPC\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/workerrpc/rawgrpc.go:124\
github.com/pingcap/dm/dm/master/workerrpc.(*GRPCClient).SendRequest\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/workerrpc/rawgrpc.go:64\
github.com/pingcap/dm/dm/master.(*Server).allWorkerConfigs.func3\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/server.go:1711\
github.com/pingcap/dm/dm/master.(*AgentPool).Emit\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/agent_pool.go:117\
runtime.goexit\
\t/usr/local/go/src/runtime/asm_amd64.s:1337\
fetch config of worker 192.168.10.101:8265",
    "workers": [
    ]
}

» check-task test_tidb.yaml
{
    "result": false,
    "msg": "rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 192.168.10.101:8265: connect: connection refused\"\
github.com/pingcap/errors.AddStack\
\t/go/pkg/mod/github.com/pingcap/errors@v0.11.4/errors.go:174\
github.com/pingcap/errors.Trace\
\t/go/pkg/mod/github.com/pingcap/errors@v0.11.4/juju_adaptor.go:15\
github.com/pingcap/dm/dm/master/workerrpc.callRPC\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/workerrpc/rawgrpc.go:124\
github.com/pingcap/dm/dm/master/workerrpc.(*GRPCClient).SendRequest\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/workerrpc/rawgrpc.go:64\
github.com/pingcap/dm/dm/master.(*Server).allWorkerConfigs.func3\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/server.go:1711\
github.com/pingcap/dm/dm/master.(*AgentPool).Emit\
\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/agent_pool.go:117\
runtime.goexit\
\t/usr/local/go/src/runtime/asm_amd64.s:1337\
fetch config of worker 192.168.10.101:8265"

Monitoring shows no errors:


Ports:

The gap in the middle is where I restarted DM; the error still occurs.
Logs:
dm-worker-stderr.log (24.7 KB)

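1. Please check whether the network is reachable and the dm-worker can be accessed normally.
2. Has the password been changed, so that access now fails? Thanks.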
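
Because the error above is a TCP `connection refused` on 192.168.10.101:8265, a reachability check is most useful when it targets that port directly; a host can answer ICMP while nothing is listening on the port. The following is a minimal Python sketch of such a check, not a DM tool. The helper name `check_tcp_port` and the 3-second timeout are assumptions made for illustration.

```python
import socket


def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connection and classify the outcome.

    Returns "open" if the connection succeeds, "refused" if the host
    actively rejects it (nothing listening on the port), and
    "unreachable: <reason>" for timeouts and other socket errors.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Same condition as "connect: connection refused" in the dmctl error.
        return "refused"
    except OSError as exc:
        # Covers timeouts, "no route to host", DNS failures, and similar.
        return f"unreachable: {exc}"


# Example, using the worker address from the error message:
#   print(check_tcp_port("192.168.10.101", 8265))
```

A result of "refused" would suggest that the dm-worker process is not running or not listening on that port, while "unreachable" would point more toward a firewall or routing problem.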

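1. The network is reachable; ping works.
2. No changes have been made recently.
Because a large number of DML statements failed to replicate, I started the replication over: I dropped all databases and re-synced, which produced primary key conflicts. After running stop-task, every attempt to start the task reports this error.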

I cleaned up all the data again, ran unsafe_cleanup.yml, and then ran deploy and start again. The task now starts successfully.

OK. If you run into further problems, please report back.