TiDB cluster fails to start after changing IPs

To get help faster, please provide the information below; a clear problem description gets resolved sooner:
[TiDB environment]
[Summary] Cluster fails to start after an IP change
[Background]
Steps taken:
1. Ops notified us that the machines' IPs would be changed
2. Stopped the cluster
3. Ops changed the IPs (from 192.168.0.* to 10.31.4.*)
4. Edited the IPs in the config file: vi /home/tidb/.tiup/storage/cluster/clusters/tidb-test/meta.yaml
5. tiup cluster start tidb-test
6. tiup cluster:v1.8.2 reload tidb-test -R pd --force
7. tiup cluster start tidb-test -R pd --force
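Step 4 above can also be scripted. A minimal sketch, operating as a filter so the real meta.yaml can be backed up first; the exact old-to-new host pairing below is an assumption for illustration, not confirmed in this thread:

```shell
# Rewrite old PD IPs to new ones in meta.yaml content read from stdin.
# The 192.168.0.x -> 10.31.4.x pairing is assumed -- adjust to your mapping.
rewrite_ips() {
  sed -e 's/192\.168\.0\.210/10.31.4.240/g' \
      -e 's/192\.168\.0\.211/10.31.4.241/g' \
      -e 's/192\.168\.0\.212/10.31.4.232/g'
}

# Usage (back up first):
#   META=/home/tidb/.tiup/storage/cluster/clusters/tidb-test/meta.yaml
#   cp "$META" "$META.bak" && rewrite_ips < "$META.bak" > "$META"
```

Note that editing meta.yaml alone is not sufficient: PD also persists member addresses in its own data directory, which is why the old IPs can keep reappearing.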
[Symptoms] Application and database symptoms
[Business impact]
The cluster cannot be started

After the IP change, PD is still using 192.168.0.*; the process listing is:
tidb      9274     1  0 10:10 ?        00:00:00 bin/pd-server --name=pd-192.168.0.212-2379 --client-urls=http://0.0.0.0:2379 --advertise-client-urls=http://192.168.0.212:2379 --peer-urls=http://0.0.0.0:2380 --advertise-peer-urls=http://192.168.0.212:2380 --data-dir=/data/tidb-data/pd-2379 --initial-cluster=pd-192.168.0.210-2379=http://192.168.0.210:2380,pd-192.168.0.211-2379=http://192.168.0.211:2380,pd-192.168.0.212-2379=http://192.168.0.212:2380 --config=conf/pd.toml --log-file=/data/tidb-deploy/pd-2379/log/pd.log
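A quick way to confirm which addresses the running pd-server was actually started with is to pull the advertise flags out of its command line; a small sketch (`extract_advertise` is just a hypothetical helper name):

```shell
# Print the --advertise-*-urls flags from a pd-server command line,
# e.g.:  ps -ef | grep '[p]d-server' | extract_advertise
extract_advertise() {
  tr ' ' '\n' | grep -e '--advertise-client-urls=' -e '--advertise-peer-urls='
}
```

In the listing above this would show both advertise URLs still pointing at 192.168.0.212, i.e. the old addresses baked into PD's start flags.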

[TiDB version] v5.2.0
[Attachments]
The cluster start log also contains the IP "192.168.122.1":
tiup-cluster-debug-2022-01-12-10-12-19.log (1.7 MB)

  1. TiUP Cluster Display output

  2. TiUP Cluster Edit Config output

global:
  user: tidb
  ssh_port: 22
  ssh_type: builtin
  deploy_dir: /data/tidb-deploy
  data_dir: /data/tidb-data
  os: linux
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
  deploy_dir: /data/tidb-deploy/monitor-9100
  data_dir: /data/tidb-data/monitor-9100
  log_dir: /data/tidb-deploy/monitor-9100/log
server_configs:
  tidb:
    log.level: error
    prepared-plan-cache.enabled: true
  tikv:
    log-level: error
    rocksdb.defaultcf.block-cache-size: 24GB
    rocksdb.writecf.block-cache-size: 6GB
  pd:
    replication.enable-placement-rules: true
  tiflash: {}
  tiflash-learner: {}
  pump: {}
  drainer: {}
  cdc: {}
tidb_servers:
- host: 10.31.4.240
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /data/tidb-deploy/tidb-4000
  log_dir: /data/tidb-deploy/tidb-4000/log
  arch: amd64
  os: linux
- host: 10.31.4.241
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /data/tidb-deploy/tidb-4000
  log_dir: /data/tidb-deploy/tidb-4000/log
  arch: amd64
  os: linux
- host: 10.31.4.232
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /data/tidb-deploy/tidb-4000
  log_dir: /data/tidb-deploy/tidb-4000/log
  arch: amd64
  os: linux
tikv_servers:
- host: 10.31.4.233
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tidb-deploy/tikv-20160
  data_dir: /data/tidb-data/tikv-20160
  log_dir: /data/tidb-deploy/tikv-20160/log
  arch: amd64
  os: linux
- host: 10.31.4.234
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tidb-deploy/tikv-20160
  data_dir: /data/tidb-data/tikv-20160
  log_dir: /data/tidb-deploy/tikv-20160/log
  arch: amd64
  os: linux
- host: 10.31.4.240
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tidb-deploy/tikv-20160
  data_dir: /data/tidb-data/tikv-20160
  log_dir: /data/tidb-deploy/tikv-20160/log
  arch: amd64
  os: linux
- host: 10.31.4.241
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tidb-deploy/tikv-20160
  data_dir: /data/tidb-data/tikv-20160
  log_dir: /data/tidb-deploy/tikv-20160/log
  arch: amd64
  os: linux
- host: 10.31.4.246
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tidb-deploy/tikv-20160
  data_dir: /data/tidb-data/tikv-20160
  log_dir: /data/tidb-deploy/tikv-20160/log
  arch: amd64
  os: linux
tiflash_servers:
- host: 10.31.4.245
  ssh_port: 22
  tcp_port: 9000
  http_port: 8123
  flash_service_port: 3930
  flash_proxy_port: 20170
  flash_proxy_status_port: 20292
  metrics_port: 8234
  deploy_dir: /data/tidb-deploy/tiflash-9000
  data_dir: /data/tidb-data/tiflash-9000
  log_dir: /data/tidb-deploy/tiflash-9000/log
  arch: amd64
  os: linux
pd_servers:
- host: 10.31.4.240
  ssh_port: 22
  name: pd-10.31.4.240-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /data/tidb-deploy/pd-2379
  data_dir: /data/tidb-data/pd-2379
  log_dir: /data/tidb-deploy/pd-2379/log
  arch: amd64
  os: linux
- host: 10.31.4.241
  ssh_port: 22
  name: pd-10.31.4.241-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /data/tidb-deploy/pd-2379
  data_dir: /data/tidb-data/pd-2379
  log_dir: /data/tidb-deploy/pd-2379/log
  arch: amd64
  os: linux
- host: 10.31.4.232
  ssh_port: 22
  name: pd-10.31.4.232-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /data/tidb-deploy/pd-2379
  data_dir: /data/tidb-data/pd-2379
  log_dir: /data/tidb-deploy/pd-2379/log
  arch: amd64
  os: linux
cdc_servers:
- host: 10.31.4.233
  ssh_port: 22
  port: 8300
  deploy_dir: /data/tidb-deploy/cdc-8300
  data_dir: /data/tidb-data/cdc-8300
  log_dir: /data/tidb-deploy/cdc-8300/log
  arch: amd64
  os: linux
- host: 10.31.4.234
  ssh_port: 22
  port: 8300
  deploy_dir: /data/tidb-deploy/cdc-8300
  data_dir: /data/tidb-data/cdc-8300
  log_dir: /data/tidb-deploy/cdc-8300/log
  arch: amd64
  os: linux
- host: 10.31.4.246
  ssh_port: 22
  port: 8300
  deploy_dir: /data/tidb-deploy/cdc-8300
  data_dir: /data/tidb-data/cdc-8300
  log_dir: /data/tidb-deploy/cdc-8300/log
  arch: amd64
  os: linux
monitoring_servers:
- host: 10.31.4.246
  ssh_port: 22
  port: 9090
  deploy_dir: /data/tidb-deploy/prometheus-9090
  data_dir: /data/tidb-data/prometheus-9090
  log_dir: /data/tidb-deploy/prometheus-9090/log
  external_alertmanagers: []
  arch: amd64
  os: linux
grafana_servers:
- host: 10.31.4.246
  ssh_port: 22
  port: 3000
  deploy_dir: /data/tidb-deploy/grafana-3000
  arch: amd64
  os: linux
  username: admin
  password: admin
  anonymous_enable: false
  root_url: ""
  domain: ""
alertmanager_servers:
- host: 10.31.4.246
  ssh_port: 22
  web_port: 9093
  cluster_port: 9094
  deploy_dir: /data/tidb-deploy/alertmanager-9093
  data_dir: /data/tidb-data/alertmanager-9093
  log_dir: /data/tidb-deploy/alertmanager-9093/log
  arch: amd64
  os: linux
  3. TiDB Overview dashboards
  • Logs of the relevant components (covering one hour before and after the issue)

For changing IPs you can refer to:

Uh... shouldn't this be done via scale-out/scale-in?
Since there is actually an SOP for it, better to follow the SOP.

Scale-out/scale-in is the way to go, thanks.

Thanks :handshake:

Same problem here: after changing the PD IP addresses the cluster won't start. I followed SOP series 12 and it still won't start.
Running tiup cluster:v1.10.3 reload tidb-test -R pd --force produces the error below:
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.10.3/tiup-cluster reload tidb-test -R pd --force
Will reload the cluster tidb-test with restart policy is true, nodes: , roles: pd.
Do you want to continue? [y/N]:(default=N) Y

  • [ Serial ] - SSHKeySet: privateKey=/home/tidb/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa, publicKey=/home/tidb/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa.pub
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.186
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.187
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.192
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.180
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.181
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.180
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.185
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.194
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.180
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.193
  • [Parallel] - UserSSH: user=tidb, host=192.168.1.180
  • [ Serial ] - UpdateTopology: cluster=tidb-test
    {"level":"warn","ts":"2022-08-19T21:53:12.896+0800","logger":"etcd-client","caller":"v3@v3.5.4/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0005bce00/192.168.1.192:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing dial tcp 192.168.1.194:2379: connect: connection refused\""}

Error: context deadline exceeded

Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2022-08-19-21-53-13.log.
[tidb@tidb180 tidb-test]$

A question: if it's a data-center move and all IP addresses change, can scale-out/scale-in still be used? The new IPs are not even in the same subnet.

Yes, as long as the networks can reach each other; there is no requirement that they be in the same subnet.

Scale-out/scale-in can be done online, which makes it the best approach.
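The scale-out/scale-in route sketched as tiup steps, for reference; the new host 10.31.4.250 and the old node address are placeholders, and the tiup calls are commented out because they need a live cluster:

```shell
# 1. Describe one new PD node in a topology fragment (placeholder host).
cat > scale-out-pd.yaml <<'EOF'
pd_servers:
  - host: 10.31.4.250
EOF

# 2. Add it, wait for it to become healthy, then remove one old node;
#    repeat node by node until all PDs are on the new IPs:
# tiup cluster scale-out tidb-test scale-out-pd.yaml
# tiup cluster scale-in  tidb-test --node 192.168.0.212:2379
# tiup cluster display   tidb-test
```

Doing this one PD at a time keeps a quorum alive throughout, which is why it works online where a bulk IP rewrite does not.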

This topic was automatically closed 1 minute after the last reply. New replies are no longer allowed.