TiDB 4.0: scaling in a TiKV node fails, and reload reports "contained unknown configuration options"

The topology is as shown in the attached figure. While scaling in one TiKV node, the operation failed with the error below; the full log is attached. What could be the cause?
Afterwards, tiup cluster display also showed the node as offline.

Starting component `cluster`: /home/tidb/.tiup/components/cluster/v0.6.0/cluster scale-in zhuashitidb --node 172.7.160.198:20160
This operation will delete the 172.7.160.198:20160 nodes in `zhuashitidb` and all their data.
Do you want to continue? [y/N]: y
Scale-in nodes...
+ [ Serial ] - SSHKeySet: privateKey=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/ssh/id_rsa, publicKey=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.16
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.37
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.36
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.220
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.36
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.16
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.198
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.26
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.216
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.235
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.37
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.16
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.216
+ [ Serial ] - ClusterOperate: operation=ScaleInOperation, options={Roles:[] Nodes:[172.7.160.198:20160] Force:false Timeout:300}
The component `tikv` will be destroyed when display cluster info when it become tombstone, maybe exists in several minutes or hours
+ [ Serial ] - UpdateMeta: cluster=zhuashitidb, deleted=`''`
+ [ Serial ] - Download: component=alertmanager, version=v0.17.0
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.37, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/pd-2379.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.pd, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - CopyComponent: component=alertmanager, version=v0.17.0, remote=172.7.160.16:/data/deploy
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.37, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tidb-4000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - Download: component=prometheus, version=v4.0.0-rc.1
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.235, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tikv-20160.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - CopyComponent: component=prometheus, version=v4.0.0-rc.1, remote=172.7.160.16:/data/deploy
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.26, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tikv-20160.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.216, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/pd-2379.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.pd, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.36, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tidb-4000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.36, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/pd-2379.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.pd, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - Download: component=grafana, version=v4.0.0-rc.1
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.220, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tikv-20160.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - CopyComponent: component=grafana, version=v4.0.0-rc.1, remote=172.7.160.16:/data/deploy
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.216, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tidb-4000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.16, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/alertmanager-9093.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.alertmanager, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.16, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/prometheus-9090.service, deploy_dir=/data/deploy, data_dir=/data/deploy/prometheus2.0.0.data.metrics, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.16, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/grafana-3000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config

Error: Process exited with status 1

Verbose debug logs has been written to /home/tidb/logs/tiup-cluster-debug-2020-05-09-07-04-01.log.
Error: run `/home/tidb/.tiup/components/cluster/v0.6.0/cluster` (wd:/home/tidb/.tiup/data/RyRNuoC) failed: exit status 1

tiup-cluster-debug-2020-05-09-07-04-01.log (69.2 KB)

After this error, reloading the cluster also fails; reload cannot complete at all.

[tidb@tidb9 ~]$ tiup cluster reload zhuashitidb
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v0.6.0/cluster reload zhuashitidb
+ [ Serial ] - SSHKeySet: privateKey=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/ssh/id_rsa, publicKey=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.16
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.37
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.235
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.26
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.216
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.36
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.220
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.16
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.216
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.37
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.16
+ [Parallel] - UserSSH: user=tidb, host=172.7.160.36
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.220
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.235
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.16
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.37
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.235, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tikv-20160.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.36
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.220, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tikv-20160.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.36
+ [ Serial ] - Download: component=alertmanager, version=v0.17.0
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.26
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.37
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.37, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tidb-4000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.216
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.16
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.16
+ [ Serial ] - Download: component=grafana, version=v4.0.0-rc.1
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.216, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/pd-2379.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.pd, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - UserSSH: user=tidb, host=172.7.160.216
+ [ Serial ] - CopyComponent: component=alertmanager, version=v0.17.0, remote=172.7.160.16:/data/deploy
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.216, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tidb-4000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.36, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tidb-4000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.37, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/pd-2379.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.pd, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.36, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/pd-2379.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.pd, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - Download: component=prometheus, version=v4.0.0-rc.1
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.26, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/tikv-20160.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - CopyComponent: component=prometheus, version=v4.0.0-rc.1, remote=172.7.160.16:/data/deploy
+ [ Serial ] - CopyComponent: component=grafana, version=v4.0.0-rc.1, remote=172.7.160.16:/data/deploy
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.16, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/alertmanager-9093.service, deploy_dir=/data/deploy, data_dir=/data/deploy/data.alertmanager, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.16, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/prometheus-9090.service, deploy_dir=/data/deploy, data_dir=/data/deploy/prometheus2.0.0.data.metrics, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config
+ [ Serial ] - InitConfig: cluster=zhuashitidb, user=tidb, host=172.7.160.16, path=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config/grafana-3000.service, deploy_dir=/data/deploy, data_dir=, log_dir=/data/deploy/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/zhuashitidb/config

Error: Process exited with status 1

Verbose debug logs has been written to /home/tidb/logs/tiup-cluster-debug-2020-05-09-07-37-23.log.
Error: run `/home/tidb/.tiup/components/cluster/v0.6.0/cluster` (wd:/home/tidb/.tiup/data/RyRWQmy) failed: exit status 1

tiup-cluster-debug-2020-05-09-07-37-23.log (72.8 KB)


Are you planning to take two TiKV nodes offline? The topology shows only 3 TiKV nodes in total; if two of them go offline, the cluster becomes unusable, because the Raft protocol requires a majority of replicas to be available.
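
For reference, PD can be asked directly how many stores it currently sees, and in what state, before a scale-in (a rough sketch, assuming pd-ctl is available on the control machine; the PD address is taken from the topology in the logs above):

    # Lists every TiKV store with its state (Up, Offline, Tombstone) and region counts
    pd-ctl -u http://172.7.160.37:2379 store

With the default of 3 replicas, PD needs at least three Up stores to place all replicas of each region.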

I have updated the topology figure: there are 4 TiKV nodes in total, and I only intend to take one offline, the one at 198. After the scale-in reported failure, the dashboard showed the node as already offline; the updated figure reflects the deployment currently in use. But reload still cannot run.

Current status: reload is impossible, and the website has become very sluggish.

The error log contains:

 /data/deploy/conf/tidb.toml contained unknown configuration options: log.file.log-rotate, pessimistic-txn.ttl, txn-local-latches, txn-local-latches.capacity, txn-local-latches.enabled
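
Those option names look like leftovers from a pre-4.0 configuration that the config check in v4.0 no longer accepts. A quick way to see where they still appear (a hedged sketch; the file path is taken from the error message itself):

    # Locate the legacy entries named in the error
    grep -nE 'log-rotate|pessimistic-txn|txn-local-latches' /data/deploy/conf/tidb.toml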

Could you provide the following information:

  1. Was the cluster imported from TiDB Ansible?
  2. Was it upgraded after being imported from Ansible?

Yes, it is an Ansible cluster upgraded to TiDB v4.0.0-rc.1. The upgrade was a week ago, and since then I have already scaled one TiKV node in and out successfully and run the cluster for several days.

The problem appeared just last night, when I added a new TiKV node. The cluster became slow right after the addition and was still sluggish this morning, so I decided to take the node offline first.

Could you send us a copy of the TiDB configuration file? We'd like to see exactly which field is causing this.

Which configuration file? What is its exact name?

Please send that configuration from a TiDB node and we'll investigate.

The node that was scaled in no longer has that directory. The scale-in reported failure, and the dashboard still shows the node as Pending Offline, but the directory no longer exists on that node.
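
(Note: per the message printed during the scale-in above, the tikv component is only destroyed when cluster info is displayed after the store becomes Tombstone. Once PD marks the store as Tombstone, running display again should clear the Pending Offline entry; a sketch using the cluster name from the logs above:)

    tiup cluster display zhuashitidb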

Only the control machine still has the configuration.

The control machine works too. Its copy of the configuration is in the ~/.tiup/storage/cluster/clusters/zhuashitidb/config directory. Thanks.
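
A quick way to list just the TiDB configuration files in that directory (a sketch):

    ls ~/.tiup/storage/cluster/clusters/zhuashitidb/config/tidb*.toml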

There are several files in there.

Only the tidb*.toml files are needed.

After several scale-in attempts on the 198 node all failed with errors and reload refused to run, I manually deleted the files in that directory that contained the 198 node's information, then tried reload again; it still failed.
The configuration left in the directory should now all be valid.

3个tidb.toml.txt (5.9 KB)

Please check your private messages. Thanks.

The problem is solved, thanks to the official team for their warm help :stuck_out_tongue_winking_eye:. The reason reload failed was that on the node serving as the monitoring host, the partition holding /var was full, so nothing could be written. After clearing out logs to free up disk space, reload worked again.
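
For anyone hitting the same symptom, a check along these lines on the monitoring host would surface it quickly (a minimal sketch; mount points and log locations may differ per deployment):

    # How full is the partition that holds /var?
    df -h /var

    # Largest entries under /var/log, i.e. candidates for cleanup
    du -sh /var/log/* 2>/dev/null | sort -h | tail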


:love_you_gesture:
