grafana_server causes deploy to fail

[Version] v5.2.3, ARM platform
[Problem] When running deploy with a grafana server specified, the deployment fails with an SSH error.
grafana server configuration
grafana_servers:
  - host: 10.161.65.156
    ssh_port: 22
    port: 3000
    arch: arm64
    os: linux

Deployment process
bash-5.0$ tiup cluster deploy jftidb v5.2.3 cluster.yaml -u tidb -p
Starting component cluster: /data/.tiup/components/cluster/v1.8.1/tiup-cluster deploy jftidb v5.2.3 cluster.yaml -u tidb -p
Input SSH password:
Please confirm your topology:
Cluster type: tidb
Cluster name: jftidb
Cluster version: v5.2.3
Role Host Ports OS/Arch Directories


pd 10.172.65.185 23792/23811 linux/aarch64 /data/deploy/pd-23792,/data/data/pd-23792
pd 10.172.65.188 23792/23811 linux/aarch64 /data/deploy/pd-23792,/data/data/pd-23792
pd 10.172.65.190 23792/23811 linux/aarch64 /data/deploy/pd-23792,/data/data/pd-23792
tikv 10.172.65.156 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.157 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.160 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.161 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.164 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.165 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.183 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.184 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.185 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.188 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tikv 10.172.65.190 20161/20180 linux/aarch64 /data/deploy/tikv-20161,/data/data/tikv-20161
tidb 10.172.65.156 4000/20080 linux/aarch64 /data/deploy/tidb-4000 <-- configured as 65.156 here
tidb 10.172.65.157 4000/20080 linux/aarch64 /data/deploy/tidb-4000
tidb 10.172.65.160 4000/20080 linux/aarch64 /data/deploy/tidb-4000
tidb 10.172.65.161 4000/20080 linux/aarch64 /data/deploy/tidb-4000
tidb 10.172.65.164 4000/20080 linux/aarch64 /data/deploy/tidb-4000
tidb 10.172.65.165 4000/20080 linux/aarch64 /data/deploy/tidb-4000
tidb 10.172.65.183 4000/20080 linux/aarch64 /data/deploy/tidb-4000
tidb 10.172.65.184 4000/20080 linux/aarch64 /data/deploy/tidb-4000
prometheus 10.172.65.190 9090 linux/aarch64 /data/deploy/prometheus-9090,/data/data/prometheus-9090
grafana 10.161.65.156 3000 linux/aarch64 /data/deploy/grafana-3000 <-- configured as 65.156 here
Attention:
1. If the topology is not what you expected, check your yaml file.
2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y

  • Generate SSH keys … Done
  • Download TiDB components
    • Download pd:v5.2.3 (linux/arm64) … Done
    • Download tikv:v5.2.3 (linux/arm64) … Done
    • Download tidb:v5.2.3 (linux/arm64) … Done
    • Download prometheus:v5.2.3 (linux/arm64) … Done
    • Download grafana:v5.2.3 (linux/arm64) … Done
    • Download node_exporter: (linux/arm64) … Done
    • Download blackbox_exporter: (linux/arm64) … Done
  • Initialize target host environments
    • Prepare 10.172.65.185:22 … Done
    • Prepare 10.172.65.188:22 … Done
    • Prepare 10.172.65.190:22 … Done
    • Prepare 10.172.65.156:22 … Done <-- 65.156 is prepared here
    • Prepare 10.172.65.157:22 … Done
    • Prepare 10.172.65.160:22 … Done
    • Prepare 10.172.65.161:22 … Done
    • Prepare 10.172.65.164:22 … Done
    • Prepare 10.172.65.165:22 … Done
    • Prepare 10.172.65.183:22 … Done
    • Prepare 10.172.65.184:22 … Done
    • Prepare 10.161.65.156:22 … Error <-- 65.156 is prepared again here; every other IP appears only once

Error: Failed to initialize TiDB environment on remote host '10.161.65.156' (task.env_init.failed)
caused by: Failed to create '~/.ssh' directory for user 'tidb'
caused by: Failed to execute command over SSH for 'tidb@10.161.65.156:22'
caused by: dial tcp 10.161.65.156:22: i/o timeout

Verbose debug logs has been written to /data/.tiup/logs/tiup-cluster-debug-2022-01-17-22-35-55.log.
Error: run /data/.tiup/components/cluster/v1.8.1/tiup-cluster (wd:/data/.tiup/data/SumjRGP) failed: exit status 1
I tried several other IPs and ports and got the same error every time. After removing grafana_servers, the deployment succeeded.
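The last "caused by" line above (dial tcp ... i/o timeout) means the TCP connection to port 22 never completed, which happens before any SSH authentication. A quick way to confirm that independently of tiup is a plain TCP connect test from the control machine. The following is only a minimal sketch; the function name `can_connect` and the 5-second default timeout are my own choices, not anything tiup provides.

```python
import socket


def can_connect(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (hypothetical usage, run from the tiup control machine):
#   can_connect("10.161.65.156", 22)
```

If this returns False for a host, the address or the network path is the likely culprit rather than the SSH user or password.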


Check whether it is this kind of problem.

For tricky SSH problems we still need to call in our expert Zhou :grinning:


audit.log (426.2 KB)

Found the problem: the IP was wrong. The address should have been 10.172.x.x, but I typed 10.161.

:rofl::rofl::rofl:

A really basic mistake.
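Since the root cause was a single mistyped octet, a small pre-deploy check could flag any host that does not sit in the same network as the rest. Below is a rough sketch, assuming all nodes are expected to share one /16 network (true for this cluster, not necessarily for others). The function name `find_outlier_hosts` is hypothetical, and the host list is typed in by hand rather than parsed from cluster.yaml.

```python
import ipaddress
from collections import Counter


def find_outlier_hosts(hosts, prefix_len=16):
    """Return hosts whose network (at prefix_len) differs from the most common one."""
    # Map each host to the network it belongs to, e.g. 10.172.0.0/16
    networks = {
        h: ipaddress.ip_network(f"{h}/{prefix_len}", strict=False) for h in hosts
    }
    most_common_net, _ = Counter(networks.values()).most_common(1)[0]
    return sorted(h for h, net in networks.items() if net != most_common_net)


# A few hosts from the topology above, plus the mistyped grafana host
hosts = [
    "10.172.65.185",
    "10.172.65.188",
    "10.172.65.190",
    "10.172.65.156",
    "10.161.65.156",
]
print(find_outlier_hosts(hosts))  # prints ['10.161.65.156']
```

This is only a heuristic: a cluster that legitimately spans several networks would produce false positives, so treat the output as a prompt to double-check the topology file.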
