'tidb_version' is undefined

【Overview】When starting the TiDB cluster, the prepare step ansible-playbook local_prepare.yml fails with:
TASK [local : create packages.yml] *********************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'tidb_version' is undefined"}
to retry, use: --limit @/home/tidb/tidb-ansible-v3.0.0/local_prepare.retry

【TiDB version】I tried both v3.0.0 and latest; both give this error.
【Attachment】
The inventory.ini configuration is as follows:

## TiDB Cluster Part

[tidb_servers]
172.25.132.150

[tikv_servers]
172.25.132.150
172.25.132.152
172.25.132.174

[pd_servers]
172.25.132.150
172.25.132.152
172.25.132.174

[spark_master]

[spark_slaves]

[lightning_server]

[importer_server]

## Monitoring Part
# prometheus and pushgateway servers
[monitoring_servers]
172.25.132.128

[grafana_servers]
172.25.132.128

# node_exporter and blackbox_exporter servers

[monitored_servers]
172.25.132.150
172.25.132.152
172.25.132.174
172.25.132.128

[alertmanager_servers]
172.25.132.128

[kafka_exporter_servers]

## Binlog Part

[pump_servers]

[drainer_servers]

## Group variables
[pd_servers:vars]
location_labels = ["zone","rack","host"]

## Global variables
[all:vars]
deploy_dir = /home/tidb/deploy

## Connection
# ssh via normal user
ansible_user = tidb

cluster_name = test-cluster

tidb_version = v3.0.0

# process supervision, [systemd, supervise]
process_supervision = systemd

timezone = Asia/Shanghai

enable_firewalld = False
# check NTP service
enable_ntpd = True
set_hostname = False

# binlog trigger
enable_binlog = False

# kafka cluster address for monitoring, example:
# kafka_addrs = "192.168.0.11:9092,192.168.0.12:9092,192.168.0.13:9092"
kafka_addrs = ""

# zookeeper address of kafka cluster for monitoring, example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181"
zookeeper_addrs = ""

# enable TLS authentication in the TiDB cluster
enable_tls = False

# KV mode
deploy_without_tidb = False

# wait for region replication complete before start tidb-server.
wait_replication = True

# Optional: Set if you already have a alertmanager server.
# Format: alertmanager_host:alertmanager_port
# alertmanager_target = ""

grafana_admin_user = “admin”
grafana_admin_password = “admin”

# Collect diagnosis
collect_log_recent_hours = 2

enable_bandwidth_limit = True
# default: 10Mb/s, unit: Kbit/s
collect_bandwidth_limit = 10000

Please check whether tidb_version is missing or misdefined in the inventory file. We also suggest uploading the file itself; copy-pasted text can get distorted.
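Two quick local checks before re-uploading: confirm the variable is actually defined (and not commented out), and look for non-ASCII bytes, since curly quotes introduced by copy-paste silently break ini values. A minimal sketch, using a throwaway file in place of the real inventory.ini (substitute the actual path, e.g. /home/tidb/tidb-ansible-v3.0.0/inventory.ini):

```shell
# Throwaway sample standing in for the real inventory.ini (assumption)
INV=$(mktemp)
printf '[all:vars]\ntidb_version = v3.0.0\n' > "$INV"

# 1. The variable must be defined, uncommented, under [all:vars]
grep -n '^tidb_version' "$INV"

# 2. Curly quotes or other non-ASCII bytes break value parsing
grep -nP '[^\x00-\x7F]' "$INV" || echo "no non-ASCII bytes"
```

If the second grep prints any lines, retype those lines with plain ASCII quotes.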

For a first installation we recommend v5.0 or later: performance is much improved and the feature set is richer~


inventory.ini (1.9 KB)
Please take another look, thank you.
python 3.7.2

I compared them: the ini file matches the original template. That version is quite old, and I don't have an environment here to verify it.

Let's see whether other users have a solution~

(PS: again, we recommend v5.0 or later.)

It turned out to be a permissions problem on the /home/tidb/tidb-ansible directory; after chmod 755 it worked.
But starting the cluster now fails:
[172.25.132.50]: Ansible UNREACHABLE! => playbook: bootstrap.yml; TASK: pre-ansible : disk space check - fail when disk is full; message: {"changed": false, "msg": "Failed to connect to the host via ssh: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).", "unreachable": true}
172.25.132.50 is the control machine. The command ansible 172.19.132.50 -m ping fails, but ansible localhost -m ping succeeds.
(We originally configured everything for 3.0 and adjusted a lot on the system; switching to 5.0 would mean reconfiguring it all, so we will get 3.0 working first and then upgrade. Thanks.)
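The chmod fix above, sketched on a throwaway directory standing in for /home/tidb/tidb-ansible (the mode values are the ones reported in this thread):

```shell
# Stand-in for the tidb-ansible directory (assumption)
DIR=$(mktemp -d)
chmod 700 "$DIR"        # broken state: only the owner can traverse it
chmod 755 "$DIR"        # the fix applied in this thread
stat -c '%a' "$DIR"     # prints 755
```

The point is that group/other need read and execute on the directory so files under it can be read during the prepare step.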

This is clearly a case of SSH mutual trust not being set up. We recommend following the guide strictly:

https://docs.pingcap.com/zh/tidb/v3.0/online-deployment-using-ansible#如何手工配置-ssh-互信及-sudo-免密码
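For reference, the mutual-trust setup from that guide can be roughly sketched as below. The host list comes from the PLAY RECAP in this thread (substitute your own), and the ssh-copy-id commands are only printed here, to be run by hand:

```shell
# Run as the tidb user on the control machine.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa -q

# The control machine (172.25.132.50) must be included: it appears in the
# inventory itself, so Ansible connects to it over ssh like any other host.
for host in 172.25.132.50 172.25.132.128 172.25.132.52 172.25.132.74; do
    echo "ssh-copy-id tidb@$host"   # run each printed command by hand
done
```

Afterwards, ansible all -m ping from the tidb-ansible directory should reach every host, the control machine included.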

My target machines are fine; only the control machine fails. 172.25.132.50 is the control machine.
PLAY RECAP *********************************************************************
172.25.132.128 : ok=21 changed=0 unreachable=0 failed=0
172.25.132.50 : ok=0 changed=0 unreachable=1 failed=0
172.25.132.52 : ok=21 changed=0 unreachable=0 failed=0
172.25.132.74 : ok=21 changed=0 unreachable=0 failed=0
localhost : ok=3 changed=0 unreachable=0 failed=0

The control machine also needs passwordless sudo, right?

To configure passwordless sudo for the tidb user, add tidb ALL=(ALL) NOPASSWD: ALL to the end of the file.

https://docs.pingcap.com/zh/tidb/v3.0/online-deployment-using-ansible#第-2-步在中控机上创建-tidb-用户并生成-ssh-key
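Concretely, on every machine (control machine included), the rule goes at the end of the sudoers file, edited as root with visudo:

```
# /etc/sudoers (edit with visudo as root); last line of the file:
tidb ALL=(ALL) NOPASSWD: ALL
```

Afterwards, running sudo -n true as the tidb user should succeed without a password prompt.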

Also, during deployment pay close attention to which user each step is supposed to run as, and switch users accordingly.