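Hi advisors,
TiDB version: v4.0.8
TiUP version: v1.2.3
I recently deployed a TiDB cluster on GCP following the officially recommended Dev specifications, as follows:
Deploying and starting the cluster both worked without errors, but stopping the cluster through TiUP produces the following error:
(I have confirmed that SELinux and the firewall are disabled on every node.)
P.S. This did not happen when I previously deployed on local VMs.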
[tidb@dev-tidb-tidb1 ~]$ tiup cluster stop tidbcluster
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.2.3/tiup-cluster stop tidbcluster
- [ Serial ] - SSHKeySet: privateKey=/home/tidb/.tiup/storage/cluster/clusters/tidbcluster/ssh/id_rsa, publicKey=/home/tidb/.tiup/storage/cluster/clusters/tidbcluster/ssh/id_rsa.pub
- [Parallel] - UserSSH: user=tidb, host=10.210.1.116
- [Parallel] - UserSSH: user=tidb, host=10.210.1.114
- [Parallel] - UserSSH: user=tidb, host=10.210.1.115
- [Parallel] - UserSSH: user=tidb, host=10.210.1.116
- [Parallel] - UserSSH: user=tidb, host=10.210.1.111
- [Parallel] - UserSSH: user=tidb, host=10.210.1.116
- [Parallel] - UserSSH: user=tidb, host=10.210.1.112
- [Parallel] - UserSSH: user=tidb, host=10.210.1.113
- [ Serial ] - StopCluster
Stopping component alertmanager
Stopping instance 10.210.1.116
Stop alertmanager 10.210.1.116:9093 success
Stopping component grafana
Stopping instance 10.210.1.116
Stop grafana 10.210.1.116:3000 success
Stopping component prometheus
Stopping instance 10.210.1.116
Stop prometheus 10.210.1.116:9090 success
Stopping component node_exporter
retry error: operation timed out after 2m0s
** prometheus 10.210.1.116:9090 failed to stop: timed out waiting for port 9100 to be stopped after 2m0s**
Error: prometheus 10.210.1.116:9090 failed to stop: timed out waiting for port 9100 to be stopped after 2m0s: timed out waiting for port 9100 to be stopped after 2m0s
Verbose debug logs has been written to /home/tidb/logs/tiup-cluster-debug-2020-11-03-09-10-24.log.
Error: run /home/tidb/.tiup/components/cluster/v1.2.3/tiup-cluster (wd:/home/tidb/.tiup/data/SFEgypc) failed: exit status 1
The log file is as follows:
tiup-cluster-debug-2020-11-03-09-10-24.log (110.6 KB)
node_exporter.log from the Grafana host:
https://drive.google.com/file/d/1s0n8HgsBHqv902HCQVFievSDe6FrnvBm/view?usp=sharing
The main error I see in it:
time="2020-11-03T09:08:12+08:00" level=fatal msg="listen tcp :9100: bind: address already in use" source="node_exporter.go:114"
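Since the error says port 9100 is already in use, one way to narrow this down might be to check whether something is still accepting connections on that port after the stop attempt. The following is only a minimal sketch, assuming bash and coreutils `timeout` are available; `port_open` is a hypothetical helper name and is not part of TiUP.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TiUP): returns 0 if a TCP connection
# to host:port succeeds within 3 seconds, non-zero otherwise.
port_open() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device attempts a TCP connect; the child shell
  # exits right away, which closes the descriptor again.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# To see which process owns the listening socket, run this on the host
# itself (root is needed to show processes owned by other users):
#   sudo ss -lntp 'sport = :9100'
```

For example, `port_open 10.210.1.116 9100 && echo "9100 still in use"` could be run from the control machine right after the failed stop. If the port is still open, the `ss` command on 10.210.1.116 should show whether the listener is the node_exporter deployed by TiUP or a different process.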
The topology.yml file is as follows:
topology.yml (4.5 KB)