【TiDB environment】Production
【TiDB version】v6.1.0
【Operating system】CentOS 7
【Deployment method】Deployed directly on machines
【Cluster data size】
【Number of cluster nodes】
- Generate config prometheus → 10.173.17.4:9090 … Error
- Generate config grafana → 10.173.17.4:3000 … Error
- Generate config alertmanager → 10.173.17.4:9093 … Error
Error: init config failed: 10.173.17.4:9090: transfer from /root/.tiup/storage/cluster/clusters/tidb-iap/config-cache/prometheus-10.173.17.4-9090.service to /tmp/prometheus_d21b6d81-b2f7-4d71-9ed7-60228725b874.service failed: failed to scp /root/.tiup/storage/cluster/clusters/tidb-iap/config-cache/prometheus-10.173.17.4-9090.service to tidb@10.173.17.4:/tmp/prometheus_d21b6d81-b2f7-4d71-9ed7-60228725b874.service: ssh: handshake failed: read tcp 10.173.17.4:59468->10.173.17.4:22: read: connection reset by peer
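While scaling out a new PD node, the final step that updates and starts the monitoring components failed with the error above. The cluster status shows that the PD node itself was scaled out successfully, and I have since fixed the scp problem. How can I re-run the remaining steps of the process?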
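One possible way to finish the interrupted steps, sketched under assumptions rather than confirmed: since only the config generation for prometheus, grafana, and alertmanager failed, reloading just those roles might be enough. The function name `monitor_recovery_commands` is made up for illustration and only prints the commands (a dry run). It assumes the installed tiup supports the `-R` role filter on `tiup cluster reload`, and that `tiup` stands in for the `./bin/tiup` path used in this post.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of executing them.
# CLUSTER and MONITOR_HOST are taken from the error log above;
# the function name is hypothetical.
CLUSTER="tidb-iap"
MONITOR_HOST="10.173.17.4"

monitor_recovery_commands() {
  # 1. Confirm the SSH path that failed during scp now works for the tidb user.
  printf '%s\n' "ssh tidb@${MONITOR_HOST} true"
  # 2. Confirm the new PD node is listed and Up.
  printf '%s\n' "tiup cluster display ${CLUSTER}"
  # 3. Regenerate and push config for the monitoring roles only, then restart them.
  printf '%s\n' "tiup cluster reload ${CLUSTER} -R prometheus,grafana,alertmanager"
}

monitor_recovery_commands
```

Running the printed commands one at a time from the control machine makes it easy to stop as soon as one of them fails.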
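Because the earlier step failed, I tried to scale in this PD node and then scale it out again, but that produced a new error. The command I ran was: ./bin/tiup cluster scale-in tidb-iap --node 10.173.191.94:2379. The node is now in the Down state but cannot be removed. How can I resolve this?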
Stopping component pd
Stopping instance 10.173.191.94
Stop pd 10.173.191.94:2379 success
Destroying component pd
Destroying instance 10.173.191.94
Destroy 10.173.191.94 success
- Destroy pd paths: [/home/data/tidb-deploy/pd-2379/log /home/data/tidb-deploy/pd-2379 /etc/systemd/system/pd-2379.service /home/data/tidb-data/pd-2379]
Stopping component node_exporter
Stopping instance 10.173.191.94
Error: failed to destroy: failed to stop monitor: failed to stop: 10.173.191.94 node_exporter-9100.service, please check the instance’s log() for more detail.: timed out waiting for port 9100 to be stopped after 2m0s
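For the scale-in failure, a hedged sketch of what could be checked next: the log shows the PD instance was already stopped and destroyed, and only stopping node_exporter on port 9100 timed out. The function name `scale_in_cleanup_commands` is hypothetical and only prints the commands (a dry run). It assumes the `tidb` SSH user seen in the earlier scp error also exists on 10.173.191.94, and that forcing the removal with `--force` is acceptable for this node; a forced scale-in skips the graceful path, so that is a judgment call.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of executing them.
# CLUSTER and PD_NODE are taken from the post; the function name is hypothetical.
CLUSTER="tidb-iap"
PD_NODE="10.173.191.94:2379"
PD_HOST="${PD_NODE%%:*}"   # strip the port, leaving 10.173.191.94

scale_in_cleanup_commands() {
  # On the PD host: check whether node_exporter is still running
  # and which process, if any, is listening on port 9100.
  printf '%s\n' "ssh tidb@${PD_HOST} systemctl status node_exporter-9100.service"
  printf '%s\n' "ssh tidb@${PD_HOST} ss -lntp"
  # From the control machine: retry the scale-in, forcing removal from the topology.
  printf '%s\n' "tiup cluster scale-in ${CLUSTER} --node ${PD_NODE} --force"
  # Verify that the node no longer appears in the cluster.
  printf '%s\n' "tiup cluster display ${CLUSTER}"
}

scale_in_cleanup_commands
```

Before running the forced scale-in, it seems worth confirming with the display command that the remaining PD nodes are Up, since the forced path does not wait for a clean shutdown.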