Before an upgrade, tiup cluster check xx --cluster keeps reporting Run Command Timeout

Mwkk · November 15, 2024 02:00

[Attachments: screenshots/logs/monitoring]

, ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin; /usr/bin/sudo -H bash -c "/tmp/tiup/bin/insight"}, cause: Run Command Timeout

Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2024-11-15-09-56-49.log.

Running tiup cluster exec renzheng-operation-log --command 'df -h' correctly returns disk usage for every node in the cluster, but running /tmp/tiup/bin/insight times out. How should I resolve this?

Lucien-卢西恩 · November 15, 2024 02:22

Log in to the target node xx.xx.56.3 and run /tmp/tiup/bin/insight manually to see what happens.

Mwkk · November 15, 2024 02:28

real    0m2.895s
user    0m0.232s
sys     0m0.392s

WalterWj · November 15, 2024 02:37

Check whether /tmp has run out of space on the target node or on the local tiup node.
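
A quick way to check this on either node is to read the usage of the filesystem that holds /tmp. The sketch below is only an illustration: the function name fs_usage_percent is made up here, and it assumes a POSIX df whose -P output puts the capacity in the fifth column.

```shell
# Print the usage percentage (without the % sign) of the filesystem
# that contains the given path. fs_usage_percent is a hypothetical helper.
fs_usage_percent() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

fs_usage_percent /tmp
```

A value close to 100 would point to a full /tmp. To run the same check on all nodes, the df command could be passed through tiup cluster exec with --command, in the same form used earlier in this thread.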

Mwkk · November 15, 2024 02:43

On the local tiup node, /tmp is under /. Current disk usage:

Filesystem                                       Size  Used Avail Use% Mounted on
/dev/sda2                                        182G  118G   64G  65% /

On the target node, /tmp is under /. Current disk usage:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       350G  3.3G  347G   1% /

老谭来了 · November 15, 2024 02:45

Which account did you run it with? Try running it with the tidb cluster account.

Mwkk · November 15, 2024 02:48

On the target node xx.xx.56.3 I ran it as the tidb user.

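nobody · November 15, 2024 02:50

This is most likely the insight command hitting the 120-second timeout. You can check the tiup log to see whether roughly 120s pass between the command being issued and the error.

To work around the problem, try setting the following parameter to a reasonable value when running tiup: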

--wait-timeout uint   Timeout in seconds to wait for an operation to complete, ignored for operations that don't fit. (default 120)
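
To confirm the 120-second theory, the gap between the timestamp of the log line where the command starts and the one where the error appears can be computed. A minimal sketch, assuming GNU date (for the -d option) and that both timestamps are copied by hand from the tiup debug log. The function name and the two sample timestamps are made up for illustration and are not taken from the actual log.

```shell
# Print the number of seconds between two timestamps (requires GNU date).
# elapsed_seconds is a hypothetical helper; both arguments must be in a
# format that `date -d` understands, e.g. "YYYY-MM-DD HH:MM:SS".
elapsed_seconds() {
  local start end
  start=$(date -d "$1" +%s) || return 1
  end=$(date -d "$2" +%s) || return 1
  echo $((end - start))
}

# Sample values only: replace them with the timestamps found in the log.
elapsed_seconds "2024-11-15 09:54:49" "2024-11-15 09:56:49"
```

If the result is close to 120, the command was probably cut off by the default --wait-timeout rather than failing on its own.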

nobody · November 15, 2024 03:05

tiup generally runs commands in the following form:

/usr/bin/sudo -H bash -c \"/tmp/tiup/bin/insight\"
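
To reproduce this invocation by hand under the same time limit, the command can be wrapped in coreutils timeout. This is a sketch, assuming GNU coreutils timeout is available on the target node; run_with_limit is a hypothetical helper name, and the 120-second limit in the usage comment mirrors the --wait-timeout default quoted earlier in the thread.

```shell
# Run a command with a time limit and report how long it took.
# run_with_limit is a hypothetical helper; `timeout` exits with 124
# when the limit is reached.
run_with_limit() {
  local limit="$1"; shift
  local start end rc=0
  start=$(date +%s)
  timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
  end=$(date +%s)
  if [ "$rc" -eq 124 ]; then
    echo "timed out after ${limit}s"
  else
    echo "finished in $((end - start))s (exit code $rc)"
  fi
  return "$rc"
}

# Usage on the target node, in the same form tiup uses:
#   run_with_limit 120 /usr/bin/sudo -H bash -c "/tmp/tiup/bin/insight"
```

Comparing this with a direct run of /tmp/tiup/bin/insight can show whether the sudo wrapper changes the timing.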

Mwkk · November 15, 2024 03:07

I switched to

tiup cluster check  --cluster --wait-timeout 1200

and got the result.

system · November 22, 2024 09:20

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.