Failed to create a DM data source

[TiDB environment] Production / Test / POC
[TiDB version] dm-2.0
[Problem encountered] Failed to create a replication task
[Symptoms and impact]
[Attachments]
Error log:

[code=10001:class=database:scope=not-set:level=high], Message: database driver error, RawCause: dial tcp ip:3306: i/o timeout, Workaround: Please check the database connection and the database config in configuration file.",
"sources": [
]

Logging in to the remote MySQL from the local machine succeeds:

Check whether the configuration is correct.

Could you post the database configuration file and the task configuration file? Also, did you hit this error during operate-source create or during start-task?

During the operate-source create stage.

Configuration file source.yaml:

I have added the configuration.


Are those single quotes here?

Sorry, I am not very good at blurring out the sensitive parts.

They are double quotes:

Judging from the warning in the log, this looks like a network problem.

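It feels like the configuration as a whole is not taking effect. Try changing 3306 to some arbitrary port and see whether the next error shows the changed port.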

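It looks like DM cannot connect to the upstream database. Check whether the user in source.yaml matches the user configured in the database, including its IP (host) permissions. Then check whether the dm-master and dm-worker machines can log in to the database with the user configured in source.yaml.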

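Checked: the account, password, and network are all OK. As the screenshot in the post shows, connecting to MySQL with this account and password works.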

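It feels like a network problem, yet logging in directly works.
If the account or password were wrong, the error would say that the account or password is wrong.
The connection has probably not even reached the authentication step.
I will investigate further.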
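
To separate "the TCP connection never opens" from "authentication fails", a plain socket probe can be run on each DM host (every dm-master and dm-worker) against the upstream MySQL address. The sketch below is only an illustration: `probe_tcp` is a hypothetical helper, not part of DM or dmctl, and the host and port are whatever is set in source.yaml.

```python
import socket
import sys


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a bare TCP connection attempt, made before any MySQL handshake."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"   # TCP works; any later failure is at the MySQL layer
    except socket.timeout:
        return "timeout"         # same class of failure as "dial tcp ...: i/o timeout"
    except ConnectionRefusedError:
        return "refused"         # host reachable, but nothing listens on the port
    except OSError as exc:
        return f"error: {exc}"   # DNS failure, no route to host, and so on


# Usage (hypothetical file name): python probe.py <mysql-host> 3306
if __name__ == "__main__" and len(sys.argv) == 3:
    print(probe_tcp(sys.argv[1], int(sys.argv[2])))
```

If one host prints "connected" and another prints "timeout", the problem lies in the network path from that particular host, not in the account or password.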

Root cause located:
In the DM cluster, the leader has network connectivity to the upstream but the follower does not, which causes the timeout.

DM cluster environment: version 2.0, with 10+ production DM replication tasks already created.

The fix suggested in the community: upgrade to the nightly version.

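Question 1: Creating DM replication tasks used to work, so why does it fail now?
Question 2: Would upgrading to the nightly version cause any problems?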

  • Error when running operate-source create:
[2022/04/20 14:02:45.038 +08:00] [WARN] [task.go:926] ["session variable 'time_zone' is overwritten by default UTC timezone."] [time_zone=+00:00]
[2022/04/20 14:02:45.051 +08:00] [INFO] [scheduler.go:1850] ["found free worker when source bound"] [component=scheduler] [worker=dm-localhost-8270] [source=mysql-repl_press_geass]
[2022/04/20 14:02:45.053 +08:00] [INFO] [scheduler.go:1888] ["bound the source to worker"] [component=scheduler] [bound="{\"source\":\"mysql-repl_press_geass\",\"worker\":\"dm-localhost-8270\"}"]
[2022/04/20 14:02:45.092 +08:00] [INFO] [server.go:1602] ["fail to get expect operation result"] [retryNum=0] [task=] [source=mysql-repl_press_geass] [expect=Running] [resp="msg:\"[code=40070:class=dm-worker:scope=internal:level=high], Message: no mysql source is being handled in the worker\" sourceStatus:<worker:\"dm-localhost-8270\" > "]
[2022/04/20 14:03:16.094 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=1] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 14:03:47.097 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=2] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 14:04:18.098 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=3] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 14:04:49.100 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=4] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 14:05:20.101 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=5] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 14:05:51.102 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=6] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
  • Error when running operate-source show:
[2022/04/20 15:06:11.109 +08:00] [INFO] [server.go:2206] [payload="{HidePasswordObject=\"op:ShowSource config:\\\"source-id: \\\\\\\"mysql-repl_press_geass\\\\\\\"\\\\
\\\\
# \\\\346\\\\230\\\\257\\\\345\\\\220\\\\246\\\\345\\\\274\\\\200\\\\345\\\\220\\\\257 GTID\\\\
enable-gtid: true\\\\
\\\\
\\\\
from:\\\\
  host: \\\\\\\"\\\\\\\"\\\\
  port: 3306\\\\
  user: \\\\\\\"\\\\\\\"\\\\
  password: \\\\\\\"******\\\\\\\"\\\\
\\\\
purge:\\\\
   interval: 3600\\\\
   expires: 48\\\\
   remain-space: 50\\\\
\\\" \"}"] [request=OperateSource]
[2022/04/20 15:06:41.109 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=0] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 15:07:12.111 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=1] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 15:07:43.113 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=2] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 15:08:14.114 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=3] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2022/04/20 15:08:35.949 +08:00] [ERROR] [server.go:1542] ["fail to query operation"] [retryNum=4] [task=] [source=mysql-repl_press_geass] [expect=Running] [error="[code=38008:class=dm-master:scope=internal:level=high], Message: grpc request error, RawCause: rpc error: code = Canceled desc = context canceled"]
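
The `#` comment line in the logged payload above is UTF-8 text printed as escaped octal bytes, which makes it hard to read. A minimal sketch for decoding such runs follows; `decode_octal_escapes` is a hypothetical helper written for this log, and it assumes every escape is one or more backslashes followed by three octal digits.

```python
import re


def decode_octal_escapes(text: str) -> str:
    """Turn runs of octal byte escapes (backslashes + three octal digits) into UTF-8 text."""
    def repl(match: re.Match) -> str:
        # Collect the three-digit octal values in this run and rebuild the bytes.
        octets = re.findall(r"\\+([0-3][0-7]{2})", match.group(0))
        return bytes(int(o, 8) for o in octets).decode("utf-8", errors="replace")

    return re.sub(r"(?:\\+[0-3][0-7]{2})+", repl, text)


# The comment line from the payload, copied as it appears in the log.
sample = r"# \\\\346\\\\230\\\\257\\\\345\\\\220\\\\246\\\\345\\\\274\\\\200\\\\345\\\\220\\\\257 GTID"
print(decode_octal_escapes(sample))  # prints: # 是否开启 GTID
```

The decoded comment reads "whether to enable GTID", so the payload is the submitted source.yaml with its original Chinese comment, not corrupted data.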

This looks like a gRPC error. Can dm-master and dm-worker reach each other? Check with list-member.

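For the upgrade you do not need to go all the way to nightly. A reasonably recent version such as 5.4 or 6.0 will do. Version 2.0 really does make troubleshooting harder.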
