TiDB major version upgrade problem: v4.0.2 --> v5.0.0

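To speed things up, please provide the following information; a clearly described problem gets resolved faster:

[TiDB version] Current version: v4.0.2

[Problem description] I set up a new TiDB v4 cluster. The cluster environment is as follows:

I followed the official documentation: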
https://docs.pingcap.com/zh/tidb/stable/upgrade-tidb-using-tiup#1-升级兼容性说明

Running the command to upgrade to v5.0.0 reports an error. The log is as follows:
[tidb@Tikv03 ~]$
[tidb@Tikv03 ~]$ tiup cluster upgrade bx_tidb v5.0.0
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.4.2/tiup-cluster upgrade bx_tidb v5.0.0
This operation will upgrade tidb v4.0.2 cluster bx_tidb to v5.0.0.
Do you want to continue? [y/N]:(default=N) y
Upgrading cluster…

  • [ Serial ] - SSHKeySet: privateKey=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/ssh/id_rsa, publicKey=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/ssh/id_rsa.pub
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.175
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.164
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.165
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.175
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.175
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.166
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.168
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.162
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.165
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.166
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.167
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.163
  • [Parallel] - UserSSH: user=tidb, host=10.10.101.167
  • [ Serial ] - Download: component=alertmanager, version=, os=linux, arch=amd64
  • [ Serial ] - Download: component=tidb, version=v5.0.0, os=linux, arch=amd64
  • [ Serial ] - Download: component=tikv, version=v5.0.0, os=linux, arch=amd64
  • [ Serial ] - Download: component=pd, version=v5.0.0, os=linux, arch=amd64
  • [ Serial ] - Download: component=prometheus, version=v5.0.0, os=linux, arch=amd64
  • [ Serial ] - Download: component=grafana, version=v5.0.0, os=linux, arch=amd64
  • [ Serial ] - Download: component=tiflash, version=v5.0.0, os=linux, arch=amd64
  • [ Serial ] - BackupComponent: component=alertmanager, currentVersion=v4.0.2, remote=10.10.101.175:/tidb-deploy/alertmanager-9093
  • [ Serial ] - BackupComponent: component=pd, currentVersion=v4.0.2, remote=10.10.101.167:/tidb-deploy/pd-2379
  • [ Serial ] - BackupComponent: component=tidb, currentVersion=v4.0.2, remote=10.10.101.165:/tidb-deploy/tidb-4000
  • [ Serial ] - BackupComponent: component=tiflash, currentVersion=v4.0.2, remote=10.10.101.168:/tidb-deploy/tiflash-9000
  • [ Serial ] - BackupComponent: component=tidb, currentVersion=v4.0.2, remote=10.10.101.167:/tidb-deploy/tidb-4000
  • [ Serial ] - BackupComponent: component=prometheus, currentVersion=v4.0.2, remote=10.10.101.175:/tidb-deploy/prometheus-8249
  • [ Serial ] - BackupComponent: component=tikv, currentVersion=v4.0.2, remote=10.10.101.163:/tidb-deploy/tikv-20160
  • [ Serial ] - BackupComponent: component=tikv, currentVersion=v4.0.2, remote=10.10.101.164:/tidb-deploy/tikv-20160
  • [ Serial ] - BackupComponent: component=tikv, currentVersion=v4.0.2, remote=10.10.101.162:/tidb-deploy/tikv-20160
  • [ Serial ] - BackupComponent: component=grafana, currentVersion=v4.0.2, remote=10.10.101.175:/tidb-deploy/grafana-3000
  • [ Serial ] - BackupComponent: component=tidb, currentVersion=v4.0.2, remote=10.10.101.166:/tidb-deploy/tidb-4000
  • [ Serial ] - BackupComponent: component=pd, currentVersion=v4.0.2, remote=10.10.101.165:/tidb-deploy/pd-2379
  • [ Serial ] - BackupComponent: component=pd, currentVersion=v4.0.2, remote=10.10.101.166:/tidb-deploy/pd-2379
  • [ Serial ] - CopyComponent: component=tikv, version=v5.0.0, remote=10.10.101.164:/tidb-deploy/tikv-20160 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=pd, version=v5.0.0, remote=10.10.101.167:/tidb-deploy/pd-2379 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=tidb, version=v5.0.0, remote=10.10.101.167:/tidb-deploy/tidb-4000 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=tikv, version=v5.0.0, remote=10.10.101.163:/tidb-deploy/tikv-20160 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=tikv, version=v5.0.0, remote=10.10.101.162:/tidb-deploy/tikv-20160 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=pd, version=v5.0.0, remote=10.10.101.166:/tidb-deploy/pd-2379 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=pd, version=v5.0.0, remote=10.10.101.165:/tidb-deploy/pd-2379 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=tidb, version=v5.0.0, remote=10.10.101.165:/tidb-deploy/tidb-4000 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=tidb, version=v5.0.0, remote=10.10.101.166:/tidb-deploy/tidb-4000 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=alertmanager, version=, remote=10.10.101.175:/tidb-deploy/alertmanager-9093 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=prometheus, version=v5.0.0, remote=10.10.101.175:/tidb-deploy/prometheus-8249 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=grafana, version=v5.0.0, remote=10.10.101.175:/tidb-deploy/grafana-3000 os=linux, arch=amd64
  • [ Serial ] - CopyComponent: component=tiflash, version=v5.0.0, remote=10.10.101.168:/tidb-deploy/tiflash-9000 os=linux, arch=amd64
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.167, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/pd-2379.service, deploy_dir=/tidb-deploy/pd-2379, data_dir=[/tidb-data/pd-2379], log_dir=/tidb-deploy/pd-2379/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.167, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/tidb-4000.service, deploy_dir=/tidb-deploy/tidb-4000, data_dir=[], log_dir=/tidb-deploy/tidb-4000/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.164, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/tikv-20160.service, deploy_dir=/tidb-deploy/tikv-20160, data_dir=[/tidb-data/tikv-20160], log_dir=/tidb-deploy/tikv-20160/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.175, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/alertmanager-9093.service, deploy_dir=/tidb-deploy/alertmanager-9093, data_dir=[/tidb-data/alertmanager-9093], log_dir=/tidb-deploy/alertmanager-9093/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.175, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/prometheus-9090.service, deploy_dir=/tidb-deploy/prometheus-8249, data_dir=[/tidb-data/prometheus-8249], log_dir=/tidb-deploy/prometheus-8249/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.175, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/grafana-3000.service, deploy_dir=/tidb-deploy/grafana-3000, data_dir=[], log_dir=/tidb-deploy/grafana-3000/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.165, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/pd-2379.service, deploy_dir=/tidb-deploy/pd-2379, data_dir=[/tidb-data/pd-2379], log_dir=/tidb-deploy/pd-2379/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.166, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/pd-2379.service, deploy_dir=/tidb-deploy/pd-2379, data_dir=[/tidb-data/pd-2379], log_dir=/tidb-deploy/pd-2379/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.166, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/tidb-4000.service, deploy_dir=/tidb-deploy/tidb-4000, data_dir=[], log_dir=/tidb-deploy/tidb-4000/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.165, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/tidb-4000.service, deploy_dir=/tidb-deploy/tidb-4000, data_dir=[], log_dir=/tidb-deploy/tidb-4000/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.162, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/tikv-20160.service, deploy_dir=/tidb-deploy/tikv-20160, data_dir=[/tidb-data/tikv-20160], log_dir=/tidb-deploy/tikv-20160/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.163, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/tikv-20160.service, deploy_dir=/tidb-deploy/tikv-20160, data_dir=[/tidb-data/tikv-20160], log_dir=/tidb-deploy/tikv-20160/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache
  • [ Serial ] - InitConfig: cluster=bx_tidb, user=tidb, host=10.10.101.168, path=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache/tiflash-9000.service, deploy_dir=/tidb-deploy/tiflash-9000, data_dir=[/tidb-data/tiflash-9000], log_dir=/tidb-deploy/tiflash-9000/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/bx_tidb/config-cache

Error: init config failed: 10.10.101.164:20160: executor.ssh.execute_failed: Failed to execute command over SSH for 'tidb@10.10.101.164:22' {ssh_stderr: unknown configuration options: log
, ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin /tidb-deploy/tikv-20160/bin/tikv-server --config-check --config=/tidb-deploy/tikv-20160/conf/tikv.toml --pd=""}, cause: Process exited with status 1: check config failed

Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2021-05-12-19-08-40.log.
Error: run /home/tidb/.tiup/components/cluster/v1.4.2/tiup-cluster (wd:/home/tidb/.tiup/data/SXC5E5L) failed: exit status 1
[tidb@Tikv03 ~]$

I then checked the log:

However, a minor-version upgrade works without any problem; after that upgrade the version is 4.0.12.

My question: does the config file need any changes when upgrading from v4 to v5? My parameters are almost all defaults at the moment:
topology.yaml (10.5 KB)

https://asktug.com/t/topic/68547/6

We listed the upgrade parameters there. Please check whether following those upgrade steps still causes a problem.

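Hi, I ran the upgrade with exactly the command above... and it failed with this error as soon as it finished.


After the error, I tried a minor-version upgrade instead, and going from 4.0.2 to 4.0.12 worked fine!

That is why I posted the question. tiup-cluster-debug-2021-05-13-14-34-32.log (228.1 KB)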

  1. The error contains ssh_stderr: unknown configuration options: log
  2. Please post the complete output of tiup cluster edit-config <cluster-name>. Thanks.
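
As a rough aid for reading errors like the one in point 1, here is a minimal Python sketch that pulls the rejected option names out of the error text. The helper name `unknown_options` is hypothetical, and splitting on commas is an assumption about how several names would be listed; the thread only shows a single name, log.

```python
import re

def unknown_options(error_text):
    """Return the option names listed after 'unknown configuration options:'.

    Assumption: several names, if present, are comma-separated on one line.
    """
    match = re.search(r"unknown configuration options:\s*([^\n]+)", error_text)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]

# Fragment of the error text quoted in this thread.
sample = "{ssh_stderr: unknown configuration options: log\n, ssh_stdout: , ssh_command: ...}"
print(unknown_options(sample))
```

The name it reports is the top-level section that the TiKV config check rejected, which is the thing to look for in the edit-config output.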

[tidb@Tikv03 ~]$
[tidb@Tikv03 ~]$ tiup cluster edit-config bx_tidb
Found cluster newer version:

The latest version:         v1.4.3
Local installed version:    v1.4.2
Update current component:   tiup update cluster
Update all components:      tiup update --all

Starting component cluster: /home/tidb/.tiup/components/cluster/v1.4.2/tiup-cluster edit-config bx_tidb

global:
  user: tidb
  ssh_port: 22
  ssh_type: builtin
  deploy_dir: /tidb_deploy
  data_dir: /tidb_data
  os: linux
  arch: amd64
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
  deploy_dir: /tidb_deploy/monitored_9100
  data_dir: /tidb_data/monitored_9100
  log_dir: /tidb_deploy/monitored_9100/log
tidb_servers:
- host: 10.10.101.165
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /tidb-deploy/tidb-4000
  log_dir: /tidb-deploy/tidb-4000/log
  arch: amd64
  os: linux
- host: 10.10.101.166
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /tidb-deploy/tidb-4000
  log_dir: /tidb-deploy/tidb-4000/log
  arch: amd64
  os: linux
- host: 10.10.101.167
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /tidb-deploy/tidb-4000
  log_dir: /tidb-deploy/tidb-4000/log
  arch: amd64
  os: linux
tikv_servers:
- host: 10.10.101.162
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /tidb-deploy/tikv-20160
  data_dir: /tidb-data/tikv-20160
  log_dir: /tidb-deploy/tikv-20160/log
  config:
    log.level: warn
  arch: amd64
  os: linux
- host: 10.10.101.163
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /tidb-deploy/tikv-20160
  data_dir: /tidb-data/tikv-20160
  log_dir: /tidb-deploy/tikv-20160/log
  config:
    log.level: warn
  arch: amd64
  os: linux
- host: 10.10.101.164
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /tidb-deploy/tikv-20160
  data_dir: /tidb-data/tikv-20160
  log_dir: /tidb-deploy/tikv-20160/log
  config:
"/tmp/2694761109" 159L, 3595C

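Could you check whether something is wrong with the config file? I changed almost no parameters; everything is default.

Let me confirm with you: in log.level: warn, the separator is a dot (.) and not a hyphen (-), right? I checked the config file template, and it looks like what you wanted here is log-level = "warning"?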
https://github.com/tikv/tikv/blob/v5.0.1/etc/config-template.toml
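
To illustrate why the error names log rather than log.level, here is a minimal Python sketch. It assumes a dotted key in the topology file is expanded into a nested section (so log.level becomes a log section containing level) and that the config check reports unknown top-level names. The `KNOWN_TOP_LEVEL` set is an illustrative subset chosen for this example, not the actual TiKV v5.0 option list, and both function names are hypothetical.

```python
def expand_dotted(flat):
    """Expand {"log.level": "warn"} into {"log": {"level": "warn"}}."""
    nested = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested

def unknown_top_level(flat, known_top_level):
    """Return top-level names that are not in the known set, sorted."""
    return sorted(set(expand_dotted(flat)) - set(known_top_level))

# Illustrative subset only (assumption), not the real TiKV v5.0 option list.
KNOWN_TOP_LEVEL = {"log-level", "log-file", "server", "storage", "raftstore", "rocksdb"}

print(unknown_top_level({"log.level": "warn"}, KNOWN_TOP_LEVEL))     # ['log']
print(unknown_top_level({"log-level": "warning"}, KNOWN_TOP_LEVEL))  # []
```

Under these assumptions, the dot form produces a top-level log section that the check does not recognise, while the hyphen form stays a single top-level key.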

Yes. I generated a config file template with a command and then edited it to make this config file.

The command was:
tiup cluster template > topology.yaml

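One more question: some parameters differ between v4 and v5. Are there separate templates for each version, so that I can create the matching config file?
I am a beginner and cannot tell which parameters differ!

Find the template for the version you need at this link. Where possible, use the template as a reference and edit the file by hand.

OK. Then another question: will restarting the cluster apply the new config file? Or, in a situation like mine, do I need to tear everything down and start over?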

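  1. Just change the parameter in edit-config. Or remove that parameter line first and see whether the upgrade succeeds.
  2. This is a test environment, right? Then you can delete the line first and try the upgrade; edit-config lets you edit the parameters.

Yes, it is. I happen to have some physical machines to experiment with! I also wanted to try out a version upgrade.

Hi, the problem is solved and I can now upgrade to v5. The directories in the config file were written incorrectly.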

The following:
deploy_dir: "/tidb-deploy"

# TiDB Cluster data storage directory

data_dir: "/tidb-data"

had been written as:
deploy_dir: "/tidb_deploy"

# TiDB Cluster data storage directory

data_dir: "/tidb_data"

The other directories still used -, so the directories were inconsistent!
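
A mismatch like this is easy to miss by eye. The following is a minimal Python sketch of a check for it, using the paths posted in this thread. The function name `separator_mismatches` is hypothetical, and this is not something tiup does itself; it only flags instance paths whose first component matches a global directory once _ and - are treated as the same character.

```python
def separator_mismatches(global_dirs, instance_dirs):
    """Return instance paths whose root differs from a global root only by '_' vs '-'."""
    def first_component(path):
        return path.strip("/").split("/")[0]

    def normalize(name):
        return name.replace("_", "-")

    global_roots = {first_component(p) for p in global_dirs}
    mismatches = []
    for path in instance_dirs:
        root = first_component(path)
        if root in global_roots:
            continue  # exact match, nothing to report
        if any(normalize(root) == normalize(g) for g in global_roots):
            mismatches.append(path)
    return mismatches

# Values taken from the edit-config output posted in this thread.
global_dirs = ["/tidb_deploy", "/tidb_data"]
instance_dirs = [
    "/tidb-deploy/tikv-20160",
    "/tidb-data/tikv-20160",
    "/tidb_deploy/monitored_9100",
]
print(separator_mismatches(global_dirs, instance_dirs))
# ['/tidb-deploy/tikv-20160', '/tidb-data/tikv-20160']
```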

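Hmm, were the directories also wrong when you upgraded to 4.0.12? Did that have no effect at the time?

Right, the minor-version upgrade was not affected. The error only appeared when I manually upgraded from 4.0.12 to v5, which is rather strange...

So here is a suggestion: if directories or similar settings are inconsistent in the config file, could that be shown in the error message, or could cluster creation simply be refused?

I took a look. In theory, with wrong directories even startup should fail... so the upgrade to v4.0.12 should not have succeeded either. :joy: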
