Scaling out Pump nodes

[TiDB version]
4.0.8
[Problem description]
Cluster 1 has one Pump node and replicates data to cluster 2 through Drainer; both clusters run 4.0.8.

Now I want to add 2 more Pump nodes to cluster 1. How should I go about it? Does Drainer need to be stopped first?

The procedure is similar to scaling out TiCDC: edit a scale-out file such as scale-out.yaml, with content like:

pump_servers:
  - host: xx.xx.xx.xx
    port: xxx

Then run the scale-out:
tiup cluster scale-out <cluster-name> scale-out.yaml
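
After the scale-out finishes, you can confirm that the new Pump nodes have joined the cluster. A minimal check with TiUP's standard status command:

tiup cluster display <cluster-name>

The new pump instances should be listed with status Up.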

Cluster status before the scale-out (screenshot):

The scale-out file contents:
cat scale-out-sc-pump.yaml
pump_servers:
  - host: 10.97.6.44
    port: 8250
    deploy_dir: /data/deploy/pump
    data_dir: /data/deploy/pump/data.pump
    log_dir: /data/deploy/pump/log
  - host: 10.97.6.46
    port: 8250
    deploy_dir: /data/deploy/pump
    data_dir: /data/deploy/pump/data.pump
    log_dir: /data/deploy/pump/log

The scale-out command:
tiup cluster scale-out servicecloud_oltp scale-out-sc-pump.yaml

The command then hit the error below, but when I checked the cluster status afterwards, the new nodes had already been added.
The error output:
$ tiup cluster scale-out servicecloud_oltp scale-out-sc-pump.yaml
Found cluster newer version:

The latest version:         v1.4.1
Local installed version:    v1.2.3
Update current component:   tiup update cluster
Update all components:      tiup update --all

Starting component cluster: /home/appadmin/.tiup/components/cluster/v1.2.3/tiup-cluster scale-out servicecloud_oltp scale-out-sc-pump.yaml
Please confirm your topology:
Cluster type: tidb
Cluster name: servicecloud_oltp
Cluster version: v4.0.8
Type  Host        Ports  OS/Arch       Directories
----  ----        -----  -------       -----------
pump  10.97.6.44  8250   linux/x86_64  /data/deploy/pump,/data/deploy/pump/data.pump
pump  10.97.6.46  8250   linux/x86_64  /data/deploy/pump,/data/deploy/pump/data.pump
Attention:
1. If the topology is not what you expected, check your yaml file.
2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: y

+ [ Serial ] - SSHKeySet: privateKey=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/ssh/id_rsa, publicKey=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/ssh/id_rsa.pub
  - Download pump:v4.0.8 (linux/amd64) ... Done
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.47
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.44
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.47
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.47
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.46
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.45
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.46
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.52
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.45
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.51
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.47
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.44
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.46
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.50
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.45
+ [ Serial ] - UserSSH: user=tidb, host=10.97.6.46
+ [ Serial ] - UserSSH: user=tidb, host=10.97.6.44
+ [ Serial ] - Mkdir: host=10.97.6.44, directories='/data/deploy/pump','/data/deploy/pump/log','/data/deploy/pump/bin','/data/deploy/pump/conf','/data/deploy/pump/scripts'
+ [ Serial ] - Mkdir: host=10.97.6.46, directories='/data/deploy/pump','/data/deploy/pump/log','/data/deploy/pump/bin','/data/deploy/pump/conf','/data/deploy/pump/scripts'
+ [ Serial ] - Mkdir: host=10.97.6.44, directories='/data/deploy/pump/data.pump'
+ [ Serial ] - Mkdir: host=10.97.6.46, directories='/data/deploy/pump/data.pump'
+ [ Serial ] - CopyComponent: component=pump, version=v4.0.8, remote=10.97.6.44:/data/deploy/pump os=linux, arch=amd64
+ [ Serial ] - CopyComponent: component=pump, version=v4.0.8, remote=10.97.6.46:/data/deploy/pump os=linux, arch=amd64
+ [ Serial ] - ScaleConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.44, service=pump-8250.service, deploy_dir=/data/deploy/pump, data_dir=[/data/deploy/pump/data.pump], log_dir=/data/deploy/pump/log, cache_dir=
+ [ Serial ] - ScaleConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.46, service=pump-8250.service, deploy_dir=/data/deploy/pump, data_dir=[/data/deploy/pump/data.pump], log_dir=/data/deploy/pump/log, cache_dir=
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.46
+ [Parallel] - UserSSH: user=tidb, host=10.97.6.44
+ [ Serial ] - Save meta
+ [ Serial ] - StartCluster
    Starting component pump
    Starting instance pump 10.97.6.46:8250
    Starting instance pump 10.97.6.44:8250
    Start pump 10.97.6.46:8250 success
    Start pump 10.97.6.44:8250 success
    Starting component node_exporter
    Starting instance 10.97.6.44
    Start 10.97.6.44 success
    Starting component blackbox_exporter
    Starting instance 10.97.6.44
    Start 10.97.6.44 success
    Starting component node_exporter
    Starting instance 10.97.6.46
    Start 10.97.6.46 success
    Starting component blackbox_exporter
    Starting instance 10.97.6.46
    Start 10.97.6.46 success
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.47, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/alertmanager-9093.service, deploy_dir=/data/deploy/alert, data_dir=[/data/deploy/alert/data.alertmanager], log_dir=/data/deploy/alert/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.44, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/pd-2379.service, deploy_dir=/data/deploy/pd, data_dir=[/data/deploy/pd/data.pd], log_dir=/data/deploy/pd/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.46, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/pd-2379.service, deploy_dir=/data/deploy/pd, data_dir=[/data/deploy/pd/data.pd], log_dir=/data/deploy/pd/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.45, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/pd-2379.service, deploy_dir=/data/deploy/pd, data_dir=[/data/deploy/pd/data.pd], log_dir=/data/deploy/pd/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.44, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tidb-3306.service, deploy_dir=/data/deploy/tidb-server, data_dir=[], log_dir=/data/deploy/tidb-server/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.47, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tiflash-9000.service, deploy_dir=/data/deploy/tiflash-9000, data_dir=[/data/deploy/tiflash-data], log_dir=/data/deploy/tiflash-9000/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.46, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/drainer-8249.service, deploy_dir=/data/deploy/drainer, data_dir=[/data/deploy/drainer/data/drainer-8249], log_dir=/data/deploy/drainer/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.47, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/grafana-3000.service, deploy_dir=/data/deploy/grafana, data_dir=[], log_dir=/data/deploy/grafana/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.46, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/pump-8250.service, deploy_dir=/data/deploy/pump, data_dir=[/data/deploy/pump/data.pump], log_dir=/data/deploy/pump/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.44, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/pump-8250.service, deploy_dir=/data/deploy/pump, data_dir=[/data/deploy/pump/data.pump], log_dir=/data/deploy/pump/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.45, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/pump-8250.service, deploy_dir=/data/deploy/pump, data_dir=[/data/deploy/pump/data.pump], log_dir=/data/deploy/pump/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.50, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tikv-20160.service, deploy_dir=/data/deploy, data_dir=[/data/deploy/data], log_dir=/data/deploy/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.45, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tidb-3306.service, deploy_dir=/data/deploy/tidb-server, data_dir=[], log_dir=/data/deploy/tidb-server/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.52, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tikv-20160.service, deploy_dir=/data/deploy, data_dir=[/data/deploy/data], log_dir=/data/deploy/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.46, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tidb-3306.service, deploy_dir=/data/deploy/tidb-server, data_dir=[], log_dir=/data/deploy/tidb-server/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.47, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/prometheus-9090.service, deploy_dir=/data/deploy/prometheus, data_dir=[/data/deploy/prometheus/prometheus2.0.0.data.metrics], log_dir=/data/deploy/prometheus/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache
+ [ Serial ] - InitConfig: cluster=servicecloud_oltp, user=tidb, host=10.97.6.51, path=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tikv-20160.service, deploy_dir=/data/deploy, data_dir=[/data/deploy/data], log_dir=/data/deploy/log, cache_dir=/home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache

Error: init config failed: 10.97.6.50:20160: transfer from /home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tikv-10.97.6.50-20160.service to /tmp/tikv_f88f850a-fb6a-4b4f-bfee-2f5217bc026b.service failed: failed to scp /home/appadmin/.tiup/storage/cluster/clusters/servicecloud_oltp/config-cache/tikv-10.97.6.50-20160.service to tidb@10.97.6.50:/tmp/tikv_f88f850a-fb6a-4b4f-bfee-2f5217bc026b.service: Process exited with status 1

Verbose debug logs has been written to /home/appadmin/logs/tiup-cluster-debug-2021-04-20-14-34-57.log.
Error: run /home/appadmin/.tiup/components/cluster/v1.2.3/tiup-cluster (wd:/home/appadmin/.tiup/data/SV6LSMX) failed: exit status 1

Cluster status after the scale-out (screenshot):

Please check the log of the TiKV instance 10.97.6.50:20160 for the scale-out time window and see whether it contains any errors; if so, please share them here.

tikv.log (1.5 MB)

Running grep -i err tikv.log found no error entries, only some WARN messages like the excerpt below (see also the quick check sketched after it). I have uploaded tikv.log.
[2021/04/20 14:35:54.709 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 897, leader may Some(id: 899 store_id: 5)\" not_leader { region_id: 897 leader { id: 899 store_id: 5 } }"]
[2021/04/20 14:35:54.709 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 905, leader may Some(id: 908 store_id: 4)\" not_leader { region_id: 905 leader { id: 908 store_id: 4 } }"]
[2021/04/20 14:35:54.709 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 913, leader may Some(id: 916 store_id: 4)\" not_leader { region_id: 913 leader { id: 916 store_id: 4 } }"]
[2021/04/20 14:35:54.709 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 913, leader may Some(id: 916 store_id: 4)\" not_leader { region_id: 913 leader { id: 916 store_id: 4 } }"]
[2021/04/20 14:35:54.761 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 897, leader may Some(id: 899 store_id: 5)\" not_leader { region_id: 897 leader { id: 899 store_id: 5 } }"]
[2021/04/20 14:35:54.761 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 905, leader may Some(id: 908 store_id: 4)\" not_leader { region_id: 905 leader { id: 908 store_id: 4 } }"]
[2021/04/20 14:35:54.761 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 913, leader may Some(id: 916 store_id: 4)\" not_leader { region_id: 913 leader { id: 916 store_id: 4 } }"]
[2021/04/20 14:35:54.761 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 913, leader may Some(id: 916 store_id: 4)\" not_leader { region_id: 913 leader { id: 916 store_id: 4 } }"]
[2021/04/20 14:35:54.970 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 13353, leader may Some(id: 13355 store_id: 5)\" not_leader { region_id: 13353 leader { id: 13355 store_id: 5 } }"]
[2021/04/20 14:36:00.027 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 12045, leader may Some(id: 12048 store_id: 4)\" not_leader { region_id: 12045 leader { id: 12048 store_id: 4 } }"]
[2021/04/20 14:36:00.110 +08:00] [WARN] [endpoint.rs:527] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 12045, leader may Some(id: 12048 store_id: 4)\" not_leader { region_id: 12045 leader { id: 12048 store_id: 4 } }"]
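
For reference, a quick way to confirm the file contains no ERROR-level entries and only this not-leader noise (a sketch, assuming TiKV's standard bracketed log levels and the log file in the current directory):

grep -c '\[ERROR\]' tikv.log             # expect 0
grep -c 'not_leader' tikv.log            # count of the warnings above
grep 'not_leader' tikv.log | tail -n 5   # sample the most recent occurrences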

These warnings have no impact on the cluster; the failed requests are retried automatically. Now that the new Pump nodes are scaled out, you can test and verify that binlog replication works as expected.
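
One way to check, sketched below: binlogctl (run here through tiup ctl) is the documented tool for inspecting Pump/Drainer state, and SHOW PUMP STATUS / SHOW DRAINER STATUS are standard TiDB SQL statements. The PD and TiDB endpoints are taken from the InitConfig lines above; credentials and the exact ctl invocation may need adjusting for your tiup version:

tiup ctl:v4.0.8 binlog -pd-urls=http://10.97.6.44:2379 -cmd pumps
# all three Pump instances should report State: online

mysql -h 10.97.6.44 -P 3306 -u root -p -e 'SHOW PUMP STATUS; SHOW DRAINER STATUS;'
# Pump/Drainer State should be "online", and Drainer's Max_Commit_Ts should keep advancing

If the new Pumps show online and Drainer keeps replicating to cluster 2, the scale-out is functionally complete.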

OK, thanks!

The functionality looks normal so far.

You're welcome. If you run into other issues, feel free to open a new topic.
