tiup cluster list returns nothing
How can I find the name of the TiDB cluster that is currently running?
Only clusters deployed with tiup cluster deploy show up in tiup cluster list. Did you deploy this one with tiup cluster deploy?
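I installed it with the following command, which installs the TiDB components from the control machine:
4.1) tiup cluster deploy mytidb v4.0.8 mini.yaml --user root -p **8
-p is the root password.
tiup cluster list finds nothing, and tiup cluster stop mytidb also reports an error.
Could you upload the deployment log? First find the corresponding deploy command: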
JoshuadeMacBook-Pro:~ joshua$ tiup cluster audit|head
Starting component `cluster`: /Users/joshua/.tiup/components/cluster/v1.2.5/tiup-cluster audit
ID Time Command
-- ---- -------
fvJhyJsdH4T 2020-12-10T10:27:31+08:00 /Users/joshua/.tiup/components/cluster/v1.2.5/tiup-cluster audit
fvJhs7Dw4Mt 2020-12-10T10:25:53+08:00 /Users/joshua/.tiup/components/cluster/v1.2.5/tiup-cluster deploy test v4.0.8 /Users/joshua/test.yaml -p
Then run tiup cluster audit fvJhs7Dw4Mt to view the log (fvJhs7Dw4Mt is the ID of the deploy command).
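When the audit history is long, picking out the deploy entries by eye gets tedious. Here is a minimal sketch of a filter for it, assuming the audit table has the three columns shown above (ID, Time, Command) and that the subcommand is the word right after the tiup-cluster path. The function name audit_ids_for is made up for this example and is not part of TiUP.

```shell
# audit_ids_for SUBCOMMAND
# Reads `tiup cluster audit` output on stdin and prints the ID of every
# entry whose subcommand equals SUBCOMMAND (for example: deploy, start).
# Assumes columns: ID, Time, path to tiup-cluster, subcommand, arguments.
audit_ids_for() {
  awk -v want="$1" '$3 ~ /tiup-cluster$/ && $4 == want { print $1 }'
}
```

Usage would look like tiup cluster audit | audit_ids_for deploy, and each printed ID can then be passed to tiup cluster audit <ID>.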
tiup cluster audit fvxY1fHvVy7
- OPERATION TIME: 2020-12-06T00:22:49 -
/root/.tiup/components/cluster/v1.2.5/tiup-cluster deploy mytidb v4.0.0 mini.yaml --user root -p
2020-12-06T00:22:42.099+0800 INFO Execute command {“command”: “tiup cluster deploy mytidb v4.0.0 mini.yaml --user root -p”}
2020-12-06T00:22:42.105+0800 INFO Please confirm your topology:
2020-12-06T00:22:42.105+0800 WARN Attention:
2020-12-06T00:22:42.105+0800 WARN 1. If the topology is not what you expected, check your yaml file.
2020-12-06T00:22:42.105+0800 WARN 2. Please confirm there is no port/directory conflicts in same host.
2020-12-06T00:22:48.917+0800 ERROR SSHCommand {“host”: “172.19.120.84”, “port”: “22”, “cmd”: “export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin sudo -H bash -c "id -u tidb > /dev/null 2>&1 || (/usr/sbin/groupadd -f tidb && /usr/sbin/useradd -m -s /bin/bash -g tidb tidb) && echo ‘tidb ALL=(ALL) NOPASSWD:ALL’ > /etc/sudoers.d/tidb"”, “error”: “ssh: handshake failed: ssh: unable to authenticate, attempted methods [none], no supported methods remain”, “stdout”: “”, “stderr”: “”}
2020-12-06T00:22:48.918+0800 ERROR SSHCommand {“host”: “172.19.120.83”, “port”: “22”, “cmd”: “export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin sudo -H bash -c "id -u tidb > /dev/null 2>&1 || (/usr/sbin/groupadd -f tidb && /usr/sbin/useradd -m -s /bin/bash -g tidb tidb) && echo ‘tidb ALL=(ALL) NOPASSWD:ALL’ > /etc/sudoers.d/tidb"”, “error”: “ssh: handshake failed: ssh: unable to authenticate, attempted methods [none], no supported methods remain”, “stdout”: “”, “stderr”: “”}
2020-12-06T00:22:48.918+0800 ERROR SSHCommand {“host”: “172.19.120.85”, “port”: “22”, “cmd”: “export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin sudo -H bash -c "id -u tidb > /dev/null 2>&1 || (/usr/sbin/groupadd -f tidb && /usr/sbin/useradd -m -s /bin/bash -g tidb tidb) && echo ‘tidb ALL=(ALL) NOPASSWD:ALL’ > /etc/sudoers.d/tidb"”, “error”: “ssh: handshake failed: ssh: unable to authenticate, attempted methods [none], no supported methods remain”, “stdout”: “”, “stderr”: “”}
2020-12-06T00:22:48.918+0800 INFO Execute command finished {“code”: 1, “error”: “task.env_init.failed: Failed to initialize TiDB environment on remote host ‘172.19.120.84’, cause: module.user.user_add_failed: Failed to create new system user ‘tidb’ on remote host, cause: executor.ssh.execute_failed: Failed to execute command over SSH for ‘root@172.19.120.84:22’ {ssh_stderr: , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin sudo -H bash -c "id -u tidb > /dev/null 2>&1 || (/usr/sbin/groupadd -f tidb && /usr/sbin/useradd -m -s /bin/bash -g tidb tidb) && echo ‘tidb ALL=(ALL) NOPASSWD:ALL’ > /etc/sudoers.d/tidb"}, cause: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none], no supported methods remain”, “errorVerbose”: “task.env_init.failed: Failed to initialize TiDB environment on remote host ‘172.19.120.84’, cause: module.user.user_add_failed: Failed to create new system user ‘tidb’ on remote host, cause: executor.ssh.execute_failed: Failed to execute command over SSH for ‘root@172.19.120.84:22’ {ssh_stderr: , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin sudo -H bash -c "id -u tidb > /dev/null 2>&1 || (/usr/sbin/groupadd -f tidb && /usr/sbin/useradd -m -s /bin/bash -g tidb tidb) && echo ‘tidb ALL=(ALL) NOPASSWD:ALL’ > /etc/sudoers.d/tidb"}, cause: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none], no supported methods remain
at github.com/pingcap/tiup/pkg/cluster/executor.(*EasySSHExecutor).Execute()
\tgithub.com/pingcap/tiup@/pkg/cluster/executor/ssh.go:153
at github.com/pingcap/tiup/pkg/cluster/module.(*UserModule).Execute()
\tgithub.com/pingcap/tiup@/pkg/cluster/module/user.go:126
at github.com/pingcap/tiup/pkg/cluster/task.(*EnvInit).execute()
\tgithub.com/pingcap/tiup@/pkg/cluster/task/env_init.go:67
at github.com/pingcap/tiup/pkg/cluster/task.(*EnvInit).Execute()
\tgithub.com/pingcap/tiup@/pkg/cluster/task/env_init.go:46
at github.com/pingcap/tiup/pkg/cluster/task.(*Serial).Execute()
\tgithub.com/pingcap/tiup@/pkg/cluster/task/task.go:191
at github.com/pingcap/tiup/pkg/cluster/task.(*StepDisplay).Execute()
\tgithub.com/pingcap/tiup@/pkg/cluster/task/step.go:85
at github.com/pingcap/tiup/pkg/cluster/task.(*Parallel).Execute.func1()
\tgithub.com/pingcap/tiup@/pkg/cluster/task/task.go:236
at runtime.goexit()
\truntime/asm_amd64.s:1357”}
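The log above ends with "code": 1, and every SSHCommand entry fails with an SSH authentication error, so this deploy run did not complete. That would be consistent with tiup cluster list showing nothing for mytidb, though that is only my reading of the log. The sketch below pulls the failing hosts out of a saved audit log. It assumes the real log uses plain ASCII double quotes (the curly quotes above look like a forum rendering artifact). The function name failed_ssh_hosts is made up for this example.

```shell
# failed_ssh_hosts
# Reads an audit log on stdin and prints each host that appears in an
# "ERROR SSHCommand" line, one per line, without duplicates.
# Assumes the JSON payload is written with plain ASCII double quotes.
failed_ssh_hosts() {
  grep 'ERROR' | grep 'SSHCommand' |
    sed -n 's/.*"host": "\([^"]*\)".*/\1/p' |
    sort -u
}
```

For example: tiup cluster audit fvxY1fHvVy7 | failed_ssh_hosts.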
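There were errors, but the cluster was deployed and is running; connections work normally.
Connecting to the database with Navicat also works.
This cluster doesn't look like it was deployed by TiUP. Could you show the contents of /etc/systemd/system/tikv-20160.service?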
[Unit]
Description=tikv service
After=syslog.target network.target remote-fs.target nss-lookup.target
[Service]
LimitNOFILE=1000000
LimitSTACK=10485760
User=tidb
ExecStart=/tidb-deploy/tikv-20160/scripts/run_tikv.sh
Restart=always
RestartSec=15s
[Install]
WantedBy=multi-user.target
On the TiKV machine, could you head the log to see the startup information?
head /tidb-deploy/tikv-20160/log/tikv.log
[root@iZuf6ikybnkbb4w1xvsmfxZ soft]# head /tidb-deploy/tikv-20160/log/tikv.log
[2020/12/10 10:59:11.477 +08:00] [INFO] [gc_manager.rs:456] [“gc_worker: finished auto gc”] [processed_regions=348]
[2020/12/10 11:06:04.248 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://172.19.120.85:2379]
[2020/12/10 11:06:04.248 +08:00] [INFO] [] [“New connected subchannel at 0x7fe029269480 for subchannel 0x7fe0292e3680”]
[2020/12/10 11:06:04.250 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://172.19.120.84:2379]
[2020/12/10 11:06:04.251 +08:00] [INFO] [util.rs:484] [“connected to PD leader”] [endpoints=http://172.19.120.84:2379]
[2020/12/10 11:06:04.251 +08:00] [INFO] [util.rs:190] [“heartbeat sender and receiver are stale, refreshing …”]
[2020/12/10 11:06:04.252 +08:00] [WARN] [util.rs:209] [“updating PD client done”] [spend=4.052198ms]
[2020/12/10 11:06:04.252 +08:00] [INFO] [client.rs:433] [“cancel region heartbeat sender”]
[2020/12/10 11:09:11.502 +08:00] [INFO] [gc_manager.rs:416] [“gc_worker: start auto gc”] [safe_point=421414574195474432]
[2020/12/10 11:09:11.960 +08:00] [INFO] [gc_manager.rs:456] [“gc_worker: finished auto gc”] [processed_regions=348]
[root@iZuf6ikybnkbb4w1xvsmfxZ soft]# dir /tidb-deploy/
monitor-9100 pd-2379 tidb-4000 tikv-20160
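That doesn't look like the beginning of the file. Try grep -r "Welcome to TiKV" /tidb-deploy/tikv-20160/log/ and see what it finds.
The audit log shows that tiup cluster start mytidb was never run, so it is almost certain that this cluster was not started by tiup. Is something else controlling this cluster?
It really was started with tiup cluster start, which controls the other 2 servers.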
[root@iZuf6ikybnkbb4w1xvsmfvZ /]# grep -r "Welcome" /tidb-deploy/tikv-20160/log/
/tidb-deploy/tikv-20160/log/tikv.log.2020-12-08-10:56:06.504514139:[2020/12/07 10:55:56.506 +08:00] [INFO] [lib.rs:92] ["Welcome to TiKV"]
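The log there shows that TiKV started on December 8, but the uploaded audit log contains no tiup cluster start. The only restart was run on December 10 (and that restart command was wrong and could not have started the cluster). So the cluster is not controlled from this machine.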
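What we can conclude so far: the running cluster you see was not deployed by the root user of the machine where you are running tiup cluster. Please check two things:
- Confirm that the machine where you run tiup cluster is this cluster's control machine.
- Confirm that root is the user who deployed this cluster.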
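To compare when TiKV actually started against the audit history, it helps to list every startup timestamp in the TiKV log directory. This is a minimal sketch, assuming each startup writes a "Welcome to TiKV" line that begins with a bracketed timestamp, as in the grep result earlier in this thread. The function name tikv_start_times is made up for this example.

```shell
# tikv_start_times LOG_DIR
# Prints the bracketed timestamp of every "Welcome to TiKV" line found
# under LOG_DIR (including rotated log files), sorted ascending.
tikv_start_times() {
  grep -rh 'Welcome to TiKV' "$1" |
    sed -n 's/^\[\([^]]*\)\].*/\1/p' |
    sort
}
```

On the TiKV host this would be run as tikv_start_times /tidb-deploy/tikv-20160/log/. A start time with no matching start entry in tiup cluster audit on a given machine suggests the cluster was started from somewhere else.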
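How could the control machine have a Welcome message?
Why look for startup information on the control machine? TiDB is normally started on the TiDB hosts; the control machine only does the controlling.
But the fact is that I started 2 TiDB servers from the control machine, and they are running now.
I don't understand why the cluster name can't be found.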
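The TiDB cluster you say is reachable is running on tidb227, right?
The first tab in your Xshell, web62_tb_中控, should be the control machine, and it is a different machine from tidb227, right?
But the machine where you ran tiup cluster audit while troubleshooting also appears to be tidb227, not web62_tb_中控.
I think you mixed them up.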
On your control machine, check what this directory contains:
ls $TIUP_HOME/storage/cluster/clusters
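Since the open question is which user deployed the cluster, the same check can be repeated for several home directories at once. This is a minimal sketch, assuming TIUP_HOME was left at its default of .tiup inside each user's home directory. The function name list_registered_clusters is made up for this example.

```shell
# list_registered_clusters HOME_DIR...
# For each given home directory, prints "<home>: <cluster name>" for every
# directory found under <home>/.tiup/storage/cluster/clusters.
# Assumes the default TIUP_HOME location; homes without it are skipped.
list_registered_clusters() {
  for home in "$@"; do
    dir="$home/.tiup/storage/cluster/clusters"
    [ -d "$dir" ] || continue
    for c in "$dir"/*/; do
      [ -d "$c" ] || continue
      printf '%s: %s\n' "$home" "$(basename "$c")"
    done
  done
}
```

Run as root, for example list_registered_clusters /root /home/*. If nothing is printed on a machine, no user covered by those paths has a cluster registered with TiUP there.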
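What I meant was to run that grep on the TiKV machine; I never said to run it on the control machine. In your earlier reply you did run it on the TiKV machine and got a result. Judging from that result, the machine you believe is the "control machine" is not the actual control machine. We also need to check whether tiup cluster deploy was run as a different user.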
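Step 4: Install the TiDB components on the control machine
4.1) tiup cluster deploy mytidb v4.0.8 mini.yaml --user root -p **8
-p is the root password.
4.2) Start TiDB
tiup cluster start mytidb
These are my notes; this is how I ran and started it at the time.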