Deploying TiFlash with tiup cluster scale-out fails with: Failed to execute command over SSH

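【TiDB environment】Production
【Overview】Deploying TiFlash with tiup cluster scale-out fails with: Failed to execute command over SSH. Before starting the deployment I had forgotten the tidb user's password, so I reset the password on every machine and reconfigured passwordless login from the control machine to the other machines. Now TiFlash cannot be deployed, and managing the cluster with any tiup cluster command reports the same error.
【Background】I confirmed that passwordless login works, and running commands remotely from the control machine also works.
【Symptom】Deploying TiFlash fails
【Business impact】None
【TiDB version】v4.0.9
【Attachments】
【Error message】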

[tidb@goldlion-pord-tidb1 ~]$ tiup cluster scale-out tidb-cluster ./tiflash-scale-out.yaml
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.9.2/tiup-cluster scale-out tidb-cluster ./tiflash-scale-out.yaml

  • Detect CPU Arch
    • Detecting node 10.0.1.121 … Done
      Please confirm your topology:
      Cluster type: tidb
      Cluster name: tidb-cluster
      Cluster version: v4.0.9
      Role Host Ports OS/Arch Directories

tiflash 10.0.1.121 9000/8123/3930/20170/20292/8234 linux/x86_64 /data/tidb_deploy/tiflash-9000,/data/tidb_data/tiflash-9000
Attention:
1. If the topology is not what you expected, check your yaml file.
2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y

  • [ Serial ] - SSHKeySet: privateKey=/home/tidb/.tiup/storage/cluster/clusters/tidb-cluster/ssh/id_rsa, publicKey=/home/tidb/.tiup/storage/cluster/clusters/tidb-cluster/ssh/id_rsa.pub
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.118
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.119
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.119
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.118
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.120
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.117
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.117
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.120
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.120
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.120
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.120
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.119
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.117
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.118
  • [Parallel] - UserSSH: user=tidb, host=10.0.1.121
  • Download TiDB components
    • Download tiflash:v4.0.9 (linux/amd64) … Done
  • Initialize target host environments
  • Deploy TiDB instance
    • Deploy instance tiflash -> 10.0.1.121:9000 … Error

Error: executor.ssh.execute_failed: Failed to execute command over SSH for 'tidb@10.0.1.121:22' {ssh_stderr: , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin /usr/bin/sudo -H bash -c "test -d /data || (mkdir -p /data && chown tidb:$(id -g -n tidb) /data)"}, cause: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2022-03-17-17-13-02.log.
Error: run /home/tidb/.tiup/components/cluster/v1.9.2/tiup-cluster (wd:/home/tidb/.tiup/data/T0KPEjP) failed: exit status 1

========================


tiup-cluster-debug-2022-03-17-17-13-02.log (28.3 KB)

Judging from the log, this is still an SSH problem.
See whether this post helps:
https://asktug.com/t/topic/95777

Try adding --ssh system.

Thanks for the reply. I have already tried --ssh system, and it still reports the same error.

Thanks, it is still not resolved. My setup has become rather complicated.

Try using -i to specify the key file shown by tiup cluster list.
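Some background on this suggestion: the SSHKeySet line in the scale-out output shows tiup cluster authenticating with a key pair stored under the cluster's own ssh directory, and the error lists publickey as the method that was attempted and rejected. One possible cause, not confirmed in this thread, is that the re-created passwordless login only authorized a new key from ~/.ssh. Below is a minimal sketch for checking that on a target host. The function name pubkey_is_authorized is hypothetical, and it assumes the cluster's id_rsa.pub has first been copied to the target.

```shell
# Hypothetical helper: succeed (exit 0) if the key material of a .pub file
# appears anywhere in an authorized_keys file.
#   $1 = public key file (assumed: a copy of the cluster's id_rsa.pub)
#   $2 = authorized_keys file to inspect (assumed: ~tidb/.ssh/authorized_keys)
pubkey_is_authorized() {
  local pub_file="$1" auth_file="$2" key_body
  # Field 2 of a .pub file is the base64 key body; the trailing comment
  # field may differ between copies, so only the body is compared.
  key_body=$(awk '{print $2; exit}' "$pub_file")
  [ -n "$key_body" ] && grep -qF -- "$key_body" "$auth_file"
}
```

A more direct check from the control machine is to run ssh with -i pointing at the cluster's id_rsa, plus -o IdentitiesOnly=yes and -o BatchMode=yes, so that ssh neither falls back to other keys nor prompts for a password.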


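Thank you so much for the pointer. I renamed the id_rsa and id_rsa.pub in the directory shown by tiup cluster list, adding an old suffix to keep them as backups, and then copied the newly generated id_rsa and id_rsa.pub into that same directory. That solved the problem. 365 thumbs up for you :+1: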
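The fix described here can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name replace_cluster_key is hypothetical, the backup suffix is assumed to be .old (the post only says an old suffix was added), and the cluster ssh directory is taken from the SSHKeySet line in the output above.

```shell
# Hypothetical helper: back up the key pair tiup uses for a cluster, then
# install a replacement pair.
#   $1 = cluster ssh directory, e.g.
#        /home/tidb/.tiup/storage/cluster/clusters/tidb-cluster/ssh
#   $2 = directory holding the new id_rsa and id_rsa.pub (assumed: ~/.ssh)
replace_cluster_key() {
  local cluster_ssh_dir="$1" new_key_dir="$2" f
  # Refuse to touch anything unless both new files exist.
  for f in id_rsa id_rsa.pub; do
    [ -f "$new_key_dir/$f" ] || { echo "missing $new_key_dir/$f" >&2; return 1; }
  done
  for f in id_rsa id_rsa.pub; do
    # Keep the previous file under an .old suffix (assumed naming).
    if [ -f "$cluster_ssh_dir/$f" ]; then
      mv "$cluster_ssh_dir/$f" "$cluster_ssh_dir/$f.old" || return 1
    fi
    cp "$new_key_dir/$f" "$cluster_ssh_dir/$f" || return 1
  done
  # ssh rejects private keys that other users can read.
  chmod 600 "$cluster_ssh_dir/id_rsa"
}
```

After the swap, rerunning a read-only command such as tiup cluster display tidb-cluster is a low-risk way to see whether tiup can reach the hosts again before retrying the scale-out.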

Impressive.
