Error: failed to start tikv: failed to start: tikv 10.10.1.52:20160, please check the instance's log(/tidb-deploy/tikv-20160/log) for more detail.: timed out waiting for port 20160 to be started after 2m0s


[TiDB version]
4.0.0, 5.0.0, 5.0.1
[Problem description]
Error: failed to start tikv: failed to start: tikv 10.10.1.52:20160, please check the instance’s log(/tidb-deploy/tikv-20160/log) for more detail.: timed out waiting for port 20160 to be started after 2m0s
I have read the answers to similar topics, and none of the issues mentioned there apply here. Server OS: Ubuntu 16.04.1 LTS
TiKV log attached:
tikv.log (746.8 KB)


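The TiKV log contains a version-incompatibility error. Make sure all TiKV instances run the same version. Note that downgrading is not supported after upgrading to 5.0.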
[err=“Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(“version should compatible with version 5.0.0, got 4.0.0”) }))”]
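To see whether the same mismatch shows up elsewhere in a TiKV log, a small scan like the one below may help. It is only a sketch: the function name `find_version_mismatches` is made up for illustration, and the pattern is based solely on the message quoted above, so other wordings of the error would not be matched.

```python
import re

# Pattern for the message quoted in this thread; quote characters are ignored on purpose.
_MISMATCH = re.compile(r"version should compatible with version ([\d.]+), got ([\d.]+)")


def find_version_mismatches(log_text):
    """Return the sorted, unique (required, got) version pairs found in log_text."""
    return sorted(set(_MISMATCH.findall(log_text)))


if __name__ == "__main__":
    # Sample line shaped like the error in this thread.
    sample = 'details: Some("version should compatible with version 5.0.0, got 4.0.0")'
    for required, got in find_version_mismatches(sample):
        print(f"cluster requires {required}, this instance reports {got}")
```

To check a real file, read it with `open(path, errors="replace").read()` and pass the text to the function.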


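Thanks for the reply. I switched back to 5.0.1, and now I get a different error.

I destroyed the original cluster with tiup destroy, killed all tidb processes with killall, deleted the tidb deploy directory, and redeployed 5.0.1. The error is as follows: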
Error: failed to start tidb: failed to start: tidb 10.10.1.52:4000, please check the instance’s log(/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s

Verbose debug logs has been written to /root/.tiup/logs/tiup-cluster-debug-2021-05-29-22-13-33.log.
Error: run /root/.tiup/components/cluster/v1.4.4/tiup-cluster (wd:/root/.tiup/data/SYrAAvG) failed: exit status 1

The log directory named in the error (/tidb-deploy/tidb-4000/log) contains no log output at all. I am uploading the TiKV log again.

tikv.log (31.2 KB)

The error indicates that TiDB itself did not start properly. First check the tiup log and the tidb log for related errors.

2021-05-29T23:53:13.294-0400 DEBUG retry error: operation timed out after 2m0s
2021-05-29T23:53:13.294-0400 DEBUG TaskFinish {“task”: “StartCluster”, “error”: “failed to start tidb: failed to start: tidb 10.10.1.52:4000, please check the instance’s log(/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s”, “errorVerbose”: “timed out waiting for port 4000 to be started after 2m0s\ngithub.com/pingcap/tiup/pkg/cluster/module.(*WaitFor).Execute\n\tgithub.com/pingcap/tiup/pkg/cluster/module/wait_for.go:91\ngithub.com/pingcap/tiup/pkg/cluster/spec.PortStarted\n\tgithub.com/pingcap/tiup/pkg/cluster/spec/instance.go:114\ngithub.com/pingcap/tiup/pkg/cluster/spec.(*BaseInstance).Ready\n\tgithub.com/pingcap/tiup/pkg/cluster/spec/instance.go:145\ngithub.com/pingcap/tiup/pkg/cluster/operation.startInstance\n\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:363\ngithub.com/pingcap/tiup/pkg/cluster/operation.StartComponent.func1\n\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:484\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1371\nfailed to start: tidb 10.10.1.52:4000, please check the instance’s log(/tidb-deploy/tidb-4000/log) for more detail.\nfailed to start tidb”}
2021-05-29T23:53:13.294-0400 INFO Execute command finished {“code”: 1, “error”: “failed to start tidb: failed to start: tidb 10.10.1.52:4000, please check the instance’s log(/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s”, “errorVerbose”: “timed out waiting for port 4000 to be started after 2m0s\ngithub.com/pingcap/tiup/pkg/cluster/module.(*WaitFor).Execute\n\tgithub.com/pingcap/tiup/pkg/cluster/module/wait_for.go:91\ngithub.com/pingcap/tiup/pkg/cluster/spec.PortStarted\n\tgithub.com/pingcap/tiup/pkg/cluster/spec/instance.go:114\ngithub.com/pingcap/tiup/pkg/cluster/spec.(*BaseInstance).Ready\n\tgithub.com/pingcap/tiup/pkg/cluster/spec/instance.go:145\ngithub.com/pingcap/tiup/pkg/cluster/operation.startInstance\n\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:363\ngithub.com/pingcap/tiup/pkg/cluster/operation.StartComponent.func1\n\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:484\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1371\nfailed to start: tidb 10.10.1.52:4000, please check the instance’s log(/tidb-deploy/tidb-4000/log) for more detail.\nfailed to start tidb”}
This is what the tiup log shows. It says to check /tidb-deploy/tidb-4000/log, but as mentioned above, there are no logs there.
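The tiup error only says that port 4000 never came up within 2 minutes. To confirm this independently of tiup, a plain TCP probe can be used. The sketch below is not how tiup performs its own check; the helper name `port_is_open` and the 2-second timeout are assumptions chosen for illustration.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Host and port taken from the error message in this thread.
    print(port_is_open("10.10.1.52", 4000))
```

If this reports False while the process is running, the server may be listening on a different address or port than the one tiup expects.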

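Check whether the tidb deploy directory contains the complete binary, startup script, and configuration file. Then try bringing the service up manually with systemctl start, or start only that node with tiup cluster start -N.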
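The file check can also be scripted, which is handy when several instances need checking. This is a minimal sketch: the `EXPECTED` list mirrors the tidb-4000 layout posted in this thread and is an assumption for any other component or version, and `missing_files` is a made-up helper name.

```python
from pathlib import Path

# Relative paths expected under the instance's deploy directory.
# Assumption: taken from the tidb-4000 layout in this thread; adjust for other components.
EXPECTED = ["bin/tidb-server", "conf/tidb.toml", "scripts/run_tidb.sh"]


def missing_files(deploy_dir, expected=EXPECTED):
    """Return the expected relative paths that are not regular files under deploy_dir."""
    root = Path(deploy_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # An empty list means every expected file is present.
    print(missing_files("/tidb-deploy/tidb-4000"))
```

This only checks that the files exist, not that the binary is executable or the configuration is valid.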

root@ubuntu:/tidb-deploy/tidb-4000# tree *
bin
└── tidb-server
conf
└── tidb.toml
log
└── tidb_stderr.log
scripts
└── run_tidb.sh

0 directories, 4 files
The file structure under /tidb-deploy/tidb-4000 is shown above.

Starting that node on its own gives the following:
root@ubuntu:/tidb-deploy/tidb-4000# tiup cluster start -N tidb-test 10.10.1.52:4000
Starting component cluster: /root/.tiup/components/cluster/v1.4.4/tiup-cluster start -N tidb-test 10.10.1.52:4000
Starting cluster 10.10.1.52:4000…

Error: tidb cluster 10.10.1.52:4000 not exists

Verbose debug logs has been written to /root/.tiup/logs/tiup-cluster-debug-2021-06-01-11-27-10.log.
Error: run /root/.tiup/components/cluster/v1.4.4/tiup-cluster (wd:/root/.tiup/data/SZ3AIoY) failed: exit status 1
The contents of /root/.tiup/logs/tiup-cluster-debug-2021-06-01-11-27-10.log are as follows:

2021-06-01T11:27:10.764+0800 INFO Execute command {“command”: “tiup cluster start -N tidb-test 10.10.1.52:4000”}
2021-06-01T11:27:10.764+0800 DEBUG Environment variables {“env”: [“TIUP_HOME=/root/.tiup”, “TIUP_WORK_DIR=/tidb-deploy/tidb-4000”, “TIUP_USER_INPUT_VERSION=”, “TIUP_VERSION=1.4.4”, “TIUP_INSTANCE_DATA_DIR=/root/.tiup/data/SZ3AIoY”, “TIUP_COMPONENT_DATA_DIR=/root/.tiup/storage/cluster”, “TIUP_COMPONENT_INSTALL_DIR=/root/.tiup/components/cluster/v1.4.4”, “TIUP_TELEMETRY_STATUS=enable”, “TIUP_TELEMETRY_UUID=6f3c3d33-48d7-497e-a837-6b3c59222d0d”, “TIUP_TELEMETRY_SECRET=c0754397834f749ef6cda539be8a9679”, “TIUP_TAG=SZ3AIoY”, “XDG_SESSION_ID=187”, “TERM=xterm”, “SHELL=/bin/bash”, “SSH_CLIENT=122.224.228.133 7357 22”, “SSH_TTY=/dev/pts/0”, “USER=root”, “LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.Z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.jpg=01;35:.jpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.ogv=01;35:.ogx=01;35:.aac=00;36:.au=00;36:.flac=00;36:.m4a=00;36:.mid=00;36:.midi=00;36:.mka=00;36:.mp3=00;36:.mpc=00;36:.ogg=00;36:.ra=00;36:.wav=00;36:.oga=00;36:.opus=00;36:.spx=00;36:.xspf=00;36:”, “MAIL=/var/mail/root”, 
“PATH=/root/.tiup/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin”, “PWD=/tidb-deploy/tidb-4000”, “LANG=en_US.UTF-8”, “SHLVL=1”, “HOME=/root”, “LOGNAME=root”, “XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop”, “SSH_CONNECTION=122.224.228.133 7357 10.10.1.52 22”, “LESSOPEN=| /usr/bin/lesspipe %s”, “XDG_RUNTIME_DIR=/run/user/0”, “LESSCLOSE=/usr/bin/lesspipe %s %s”, “_=/root/.tiup/bin/tiup”, “OLDPWD=/tidb-deploy/tidb-4000/log”, “TIUP_TELEMETRY_EVENT_UUID=dd558567-dcca-4a41-a9e5-86d551fd54db”, “TIUP_MIRRORS=https://tiup-mirrors.pingcap.com”]}
2021-06-01T11:27:10.769+0800 INFO Starting cluster 10.10.1.52:4000…
2021-06-01T11:27:10.770+0800 INFO Execute command finished {“code”: 1, “error”: “tidb cluster 10.10.1.52:4000 not exists”, “errorVerbose”: “tidb cluster 10.10.1.52:4000 not exists\ngithub.com/pingcap/tiup/pkg/cluster/manager.(*Manager).meta\n\tgithub.com/pingcap/tiup/pkg/cluster/manager/manager.go:65\ngithub.com/pingcap/tiup/pkg/cluster/manager.(*Manager).StartCluster\n\tgithub.com/pingcap/tiup/pkg/cluster/manager/basic.go:81\ngithub.com/pingcap/tiup/components/cluster/command.newStartCmd.func1\n\tgithub.com/pingcap/tiup/components/cluster/command/start.go:39\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.1.3/command.go:852\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.1.3/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.1.3/command.go:897\ngithub.com/pingcap/tiup/components/cluster/command.Execute\n\tgithub.com/pingcap/tiup/components/cluster/command/root.go:264\nmain.main\n\tgithub.com/pingcap/tiup/components/cluster/main.go:23\nruntime.main\n\truntime/proc.go:225\nruntime.goexit\n\truntime/asm_amd64.s:1371”}

tiup cluster start tidb-test -N 10.10.1.52:4000

The error is still:
Error: failed to start tidb: failed to start: tidb 10.10.1.52:4000, please check the instance’s log(/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s

Verbose debug logs has been written to /root/.tiup/logs/tiup-cluster-debug-2021-06-01-18-58-23.log.
Error: run /root/.tiup/components/cluster/v1.4.4/tiup-cluster (wd:/root/.tiup/data/SZ4zM9s) failed: exit status 1
The logs under /tidb-deploy/tidb-4000/log are empty.

Log in to 10.10.1.52 and try running systemctl start tidb-4000.service, then check whether /var/log/message shows any errors.

I am on Ubuntu 16.04, and there is no /var/log/message log. dmesg did not produce any related entries either.

Alternatively, look at the output of journalctl -u tidb-4000.service. Also check whether tidb-server -V runs normally.

Hi~ Please log in to the failing machine and go to the deploy directory (enter it as the tidb user), then:

cp scripts/run_tidb.sh /tmp
vim /tmp/run_tidb.sh, delete the --log* option on the last line, and delete the trailing \ at the end of the second-to-last line
Then start tidb manually: sh /tmp/run_tidb.sh
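The manual edit above can be sketched as a small text transformation. This is an illustration only: it assumes the script ends with one command split over backslash-continued lines whose last line starts with a `--log` option, as the steps above imply. The sample script in the demo is hypothetical, not the real run_tidb.sh, and `strip_trailing_log_option` is a made-up name.

```python
def strip_trailing_log_option(script_text):
    """Drop a final '--log...' line and the continuation backslash on the line before it."""
    lines = script_text.rstrip("\n").split("\n")
    if not lines[-1].lstrip().startswith("--log"):
        # Layout differs from the assumption; leave the script untouched.
        return script_text
    lines.pop()
    if lines:
        last = lines[-1].rstrip()
        if last.endswith("\\"):
            # Remove the backslash that used to continue onto the deleted line.
            last = last[:-1].rstrip()
        lines[-1] = last
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical script tail, for illustration only.
    sample = (
        "exec bin/tidb-server \\\n"
        "    --config=conf/tidb.toml \\\n"
        '    --log-file="log/tidb.log" 2>> "log/tidb_stderr.log"\n'
    )
    print(strip_trailing_log_option(sample), end="")
```

With the log options removed, any startup error is printed to the terminal instead of being redirected to a file.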

Thanks, it starts this way. Should the remaining TiFlash nodes be handled the same way, and how can this be turned into a routine procedure?

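That step is only meant to show why it cannot start (if startup fails, the error is printed to the screen). If it starts manually but not under systemd, check the last lines of journalctl -u tidb-4000.service.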

The last output is shown below. Also, tidb-4000 shows as up for a short while, then goes down again a few hours later.
Jun 02 23:32:15 ubuntu systemd[1]: tidb-4000.service: Service hold-off time over, scheduling restart.
Jun 02 23:32:15 ubuntu systemd[1]: Stopped tidb service.
Jun 02 23:32:15 ubuntu systemd[1]: Started tidb service.
Jun 02 23:32:15 ubuntu systemd[1]: tidb-4000.service: Main process exited, code=exited, status=1/FAILURE
Jun 02 23:32:15 ubuntu systemd[1]: tidb-4000.service: Unit entered failed state.
Jun 02 23:32:15 ubuntu systemd[1]: tidb-4000.service: Failed with result ‘exit-code’.
Jun 02 23:32:30 ubuntu systemd[1]: tidb-4000.service: Service hold-off time over, scheduling restart.
Jun 02 23:32:30 ubuntu systemd[1]: Stopped tidb service.
Jun 02 23:32:30 ubuntu systemd[1]: Started tidb service.
Jun 02 23:32:30 ubuntu systemd[1]: tidb-4000.service: Main process exited, code=exited, status=1/FAILURE
Jun 02 23:32:30 ubuntu systemd[1]: tidb-4000.service: Unit entered failed state.
Jun 02 23:32:30 ubuntu systemd[1]: tidb-4000.service: Failed with result ‘exit-code’.
Jun 02 23:32:45 ubuntu systemd[1]: tidb-4000.service: Service hold-off time over, scheduling restart.
Jun 02 23:32:45 ubuntu systemd[1]: Stopped tidb service.
Jun 02 23:32:45 ubuntu systemd[1]: Started tidb service.
Jun 02 23:32:45 ubuntu systemd[1]: tidb-4000.service: Main process exited, code=exited, status=1/FAILURE
Jun 02 23:32:45 ubuntu systemd[1]: tidb-4000.service: Unit entered failed state.
Jun 02 23:32:45 ubuntu systemd[1]: tidb-4000.service: Failed with result ‘exit-code’.
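The excerpt shows the main process exiting with status 1 and systemd restarting it every 15 seconds. For a longer journal, the interval between failures can be computed with a short script. This is a sketch: `failure_intervals` is a made-up helper, and because journal lines in this format carry no year, the year is passed in (2021 here, matching the tiup log dates in this thread).

```python
from datetime import datetime


def failure_intervals(journal_text, year=2021):
    """Return the seconds between consecutive 'Main process exited' journal entries."""
    times = []
    for line in journal_text.splitlines():
        if "Main process exited" in line:
            stamp = " ".join(line.split()[:3])  # e.g. "Jun 02 23:32:15"
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    # Two failure lines copied from the journal output in this thread.
    sample = (
        "Jun 02 23:32:15 ubuntu systemd[1]: tidb-4000.service: Main process exited, code=exited, status=1/FAILURE\n"
        "Jun 02 23:32:30 ubuntu systemd[1]: tidb-4000.service: Main process exited, code=exited, status=1/FAILURE\n"
    )
    print(failure_intervals(sample))
```

A steady, short interval suggests the process fails immediately on every start rather than running for a while and then crashing.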

Did tidb write any log output during those few hours?

No output at all.

After it starts, is there any application traffic connecting to the TiDB server? Does it restart, or does it simply go down?