Starting a single-machine cluster with tiup cluster start fails: retry error: operation timed out after 1m0s

Command run: tiup cluster start cluster-tidb

Starting cluster cluster-tidb...
+ [ Serial ] - SSHKeySet: privateKey=/root/.tiup/storage/cluster/clusters/cluster-tidb/ssh/id_rsa, publicKey=/root/.tiup/storage/cluster/clusters/cluster-tidb/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [Parallel] - UserSSH: user=tidb, host=192.168.11.26
+ [ Serial ] - ClusterOperate: operation=StartOperation, options={Roles:[] Nodes:[] Force:false SSHTimeout:5 OptTimeout:60 APITimeout:300}
Starting component pd
	Starting instance pd 192.168.11.26:2379
	Start pd 192.168.11.26:2379 success
Starting component node_exporter
	Starting instance 192.168.11.26
	Start 192.168.11.26 success
Starting component blackbox_exporter
	Starting instance 192.168.11.26
	Start 192.168.11.26 success
Starting component tikv
	Starting instance tikv 192.168.11.26:20162
	Starting instance tikv 192.168.11.26:20160
	Starting instance tikv 192.168.11.26:20161
retry error: operation timed out after 1m0s
	tikv 192.168.11.26:20162 failed to start: timed out waiting for port 20162 to be started after 1m0s, please check the log of the instance
retry error: operation timed out after 1m0s
	tikv 192.168.11.26:20161 failed to start: timed out waiting for port 20161 to be started after 1m0s, please check the log of the instance

Run Command Timeout!

Error: failed to start: failed to start tikv: tikv 192.168.11.26:20162 failed to start: timed out waiting for port 20162 to be started after 1m0s, please check the log of the instance: timed out waiting for port 20162 to be started after 1m0s

Verbose debug logs has been written to /tidb-deploy/logs/tiup-cluster-debug-2020-06-01-17-23-22.log.
Error: run /root/.tiup/components/cluster/v1.0.0/cluster (wd:/root/.tiup/data/S0eNeLy) failed: exit status 1
[root@openstack-controller tidb-deploy]#

Error log: tiup-cluster-debug-2020-06-01-17-23-22.log.log (614.4 KB)

Reference document: https://pingcap.com/docs-cn/stable/quick-start-with-tidb/ (the third option: using TiUP cluster to simulate the production deployment steps on a single machine)
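The start output above reports "timed out waiting for port 20162 to be started", so a quick first check is whether anything is reachable on the PD and TiKV ports at all. The following is a minimal Python sketch of such a check using a plain TCP connect. It is only a rough client-side approximation and is not how TiUP itself performs the wait; the helper names `port_is_open` and `check_ports` are made up for illustration, and the host and port numbers are the ones from this thread.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_ports(host: str, ports, timeout: float = 2.0) -> dict:
    """Map each port to True (reachable) or False (closed or unreachable)."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Example call with the addresses from this thread (PD client port and the
# three TiKV ports):
#   check_ports("192.168.11.26", [2379, 20160, 20161, 20162])
```

A port that is reachable only shows that some process is listening there; it does not show which process, so the instance logs are still needed.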

  1. Please upload the TiKV log for tikv 192.168.11.26:20161. Thanks.
  2. Are you using a virtual machine, such as VMware?

Yes, I am using a VMware virtual machine.

Please find tikv.log attached:

tikv.rar (165.7 KB)

1. The log shows the following errors:

[2020/06/01 16:41:22.000 +08:00] [INFO] [util.rs:358] ["PD failed to respond"] [err="Grpc(RpcFailure(RpcStatus { status: 12-UNIMPLEMENTED, details: Some("unknown service pdpb.PD") }))"] [endpoints=192.168.11.26:2379]
[2020/06/01 16:41:22.000 +08:00] [WARN] [client.rs:56] ["validate PD endpoints failed"] [err="Other("[components/pd_client/src/util.rs:389]: PD cluster failed to respond")"]
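When scanning a large TiKV log for entries like the ones above, it can help to pull out the level, source location, message, and any gRPC status name. Below is a minimal Python sketch that does this with regular expressions. It assumes the bracketed layout seen in the quoted entries and accepts both straight and curly quotes, since forum rendering may have altered them; `parse_entry` is a hypothetical helper, not part of any TiDB tooling. In gRPC, status 12 (UNIMPLEMENTED) generally means the server that answered does not implement the requested service or method.

```python
import re

# Leading fields of an entry: [time] [LEVEL] [file:line] ["message"]
HEADER = re.compile(
    r'^\[(?P<time>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]+)\] '
    r'\[["“](?P<message>[^"”]*)["”]\]'
)
# gRPC status as it appears in the err field, e.g. "status: 12-UNIMPLEMENTED"
GRPC_STATUS = re.compile(r'status: (?P<code>\d+)-(?P<name>[A-Z_]+)')


def parse_entry(line: str):
    """Return a dict with time, level, source, message, grpc_status; None if no match."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    status = GRPC_STATUS.search(line)
    entry["grpc_status"] = status.group("name") if status else None
    return entry


# The first entry quoted in this thread, used as sample input.
SAMPLE = (
    '[2020/06/01 16:41:22.000 +08:00] [INFO] [util.rs:358] '
    '["PD failed to respond"] [err="Grpc(RpcFailure(RpcStatus { status: '
    '12-UNIMPLEMENTED, details: Some("unknown service pdpb.PD") }))"] '
    '[endpoints=192.168.11.26:2379]'
)
```

Calling `parse_entry(SAMPLE)` yields the message "PD failed to respond" together with the status name UNIMPLEMENTED, which makes it easy to filter a long log down to the PD-related failures.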

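  1. Did you rename anything in the configuration file? Where was pdpb changed? Deploying exactly as described in the documentation should be enough.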

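Hello! Which configuration file do you mean exactly? I deployed step by step following the official documentation.
The only change I made was at the very beginning: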

  1. Edit /etc/ssh/sshd_config and raise MaxSessions to 20.

Apart from that, I made no special changes.

  1. Please upload your yaml file. Thanks.
  2. Run tiup cluster display and check the output.
  3. Try setting MaxSessions back to its original value. Thanks.

Please find the yaml file attached: topo.yaml (1.0 KB)

After changing MaxSessions back and restarting sshd, the same error still occurs:

Starting component tikv
	Starting instance tikv 192.168.11.26:20162
	Starting instance tikv 192.168.11.26:20160
	Starting instance tikv 192.168.11.26:20161
retry error: operation timed out after 1m0s
	tikv 192.168.11.26:20162 failed to start: timed out waiting for port 20162 to be started after 1m0s, please check the log of the instance
retry error: operation timed out after 1m0s
	tikv 192.168.11.26:20161 failed to start: timed out waiting for port 20161 to be started after 1m0s, please check the log of the instance

Run Command Timeout!

Error: failed to start: failed to start tikv: tikv 192.168.11.26:20162 failed to start: timed out waiting for port 20162 to be started after 1m0s, please check the log of the instance: timed out waiting for port 20162 to be started after 1m0s

Verbose debug logs has been written to /root/logs/tiup-cluster-debug-2020-06-02-09-21-25.log.
Error: run /root/.tiup/components/cluster/v1.0.0/cluster (wd:/root/.tiup/data/S0iGqIj) failed: exit status 1

Output of display: display执行.log (1.8 KB)

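  1. Please confirm that the firewall is fully disabled.
  2. Please upload the latest debug log so I can confirm whether it is still the same error as before. Thanks.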

[root@openstack-controller ~]# firewall-cmd --state
not running

tiup-cluster-debug-2020-06-02-09-21-25.log.log (611.0 KB)

Hello,

Please upload the logs of the TiKV nodes whose startup timed out so we can check them for error messages.

Please find the TiKV node logs attached:

download.rar (3.7 MB)

Hello,

Please run display to check the cluster status; the logs show that the connection to PD failed.

[root@openstack-controller log]#  tiup cluster  display  cluster-tidb
Starting component `cluster`: /root/.tiup/components/cluster/v1.0.0/cluster display cluster-tidb
TiDB Cluster: cluster-tidb
TiDB Version: v4.0.0
ID                   Role        Host           Ports                            OS/Arch       Status    Data Dir                    Deploy Dir
--                   ----        ----           -----                            -------       ------    --------                    ----------
192.168.11.26:3000   grafana     192.168.11.26  3000                             linux/x86_64  inactive  -                           /tidb-deploy/grafana-3000
192.168.11.26:2379   pd          192.168.11.26  2379/2380                        linux/x86_64  Down      /tidb-data/pd-2379          /tidb-deploy/pd-2379
192.168.11.26:9090   prometheus  192.168.11.26  9090                             linux/x86_64  inactive  /tidb-data/prometheus-9090  /tidb-deploy/prometheus-9090
192.168.11.26:4000   tidb        192.168.11.26  4000/10080                       linux/x86_64  Down      -                           /tidb-deploy/tidb-4000
192.168.11.26:9000   tiflash     192.168.11.26  9000/8123/3930/20170/20292/8234  linux/x86_64  Down      /tidb-data/tiflash-9000     /tidb-deploy/tiflash-9000
192.168.11.26:20160  tikv        192.168.11.26  20160/20180                      linux/x86_64  Down      /tidb-data/tikv-20160       /tidb-deploy/tikv-20160
192.168.11.26:20161  tikv        192.168.11.26  20161/20181                      linux/x86_64  Down      /tidb-data/tikv-20161       /tidb-deploy/tikv-20161
192.168.11.26:20162  tikv        192.168.11.26  20162/20182                      linux/x86_64  Down      /tidb-data/tikv-20162       /tidb-deploy/tikv-20162
[root@openstack-controller log]#
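With many instances, it can be easier to summarize the display table above by status than to read it row by row. The following is a minimal Python sketch that parses text in the column layout shown above and groups instance IDs by their Status value. It assumes whitespace-separated columns with Status in the sixth position, as in this thread's v1.0.0 output; `parse_display` and `group_by_status` are hypothetical helpers, and the sample contains only three of the rows.

```python
def parse_display(text: str) -> list:
    """Parse the instance table of `tiup cluster display` output into dicts."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:3] == ["ID", "Role", "Host"]:
            in_table = True  # header row found; data rows follow
            continue
        # Skip everything before the header, the dashed separator row, and
        # short lines such as a trailing shell prompt.
        if not in_table or len(fields) < 6 or set(fields[0]) == {"-"}:
            continue
        rows.append(
            {"id": fields[0], "role": fields[1], "host": fields[2], "status": fields[5]}
        )
    return rows


def group_by_status(rows: list) -> dict:
    """Group instance IDs by their Status column value."""
    groups = {}
    for row in rows:
        groups.setdefault(row["status"], []).append(row["id"])
    return groups


# Three rows copied from the output in this thread, used as sample input.
SAMPLE = """\
ID                   Role        Host           Ports        OS/Arch       Status    Data Dir               Deploy Dir
--                   ----        ----           -----        -------       ------    --------               ----------
192.168.11.26:3000   grafana     192.168.11.26  3000         linux/x86_64  inactive  -                      /tidb-deploy/grafana-3000
192.168.11.26:2379   pd          192.168.11.26  2379/2380    linux/x86_64  Down      /tidb-data/pd-2379     /tidb-deploy/pd-2379
192.168.11.26:20160  tikv        192.168.11.26  20160/20180  linux/x86_64  Down      /tidb-data/tikv-20160  /tidb-deploy/tikv-20160
"""
```

For this sample, `group_by_status(parse_display(SAMPLE))` lists the pd and tikv instances under "Down" and grafana under "inactive".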

Please find the PD log attached:
pd.rar (367.5 KB)