TiDB cluster fails to start: failed to start tikv

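【TiDB Environment】
【Overview】: The cluster runs on virtual machines. Logs filled the disk, so I moved the directory; the cluster then failed to restart.
【Background】: The disk filled up, so I stopped the cluster, moved the log directory (tidb-deploy), and created a symlink. The network connection dropped partway through. After I moved the log directory again, starting the TiDB cluster reports an error.
【Symptoms】: The error message is as follows: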


failed to start tikv: failed to start: 172.16.100.13 tikv-20160.service, please check the instance’s log(/tidb-deploy/tikv-20160/log) for more detail.: timed out waiting for port 20160 to be started after 2m0s
【Problem】: 2 of the 3 TiKV nodes fail to start
【Business impact】: The cluster is unusable
【TiDB version】: v5.1.0
【Attachments】:

  • Relevant logs
    [2021/10/08 20:37:09.361 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.13:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=1]
    [2021/10/08 20:37:09.361 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.13:20160] [store_id=1]
    [2021/10/08 20:37:09.364 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.11:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=5]
    [2021/10/08 20:37:09.364 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.11:20160] [store_id=5]
    [2021/10/08 20:37:14.364 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.13:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=1]
    [2021/10/08 20:37:14.366 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.13:20160] [store_id=1]
    [2021/10/08 20:37:14.367 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.11:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=5]
    [2021/10/08 20:37:14.368 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.11:20160] [store_id=5]
    [2021/10/08 20:37:19.369 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.13:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=1]
    [2021/10/08 20:37:19.369 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.13:20160] [store_id=1]
    [2021/10/08 20:37:19.373 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.11:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=5]
    [2021/10/08 20:37:19.373 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.11:20160] [store_id=5]
    [2021/10/08 20:37:24.374 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.13:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=1]
    [2021/10/08 20:37:24.374 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.13:20160] [store_id=1]
    [2021/10/08 20:37:24.377 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.11:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=5]
    [2021/10/08 20:37:24.377 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.11:20160] [store_id=5]
    [2021/10/08 20:37:29.377 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.13:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=1]
    [2021/10/08 20:37:29.377 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.13:20160] [store_id=1]
    [2021/10/08 20:37:29.379 +08:00] [ERROR] [raft_client.rs:407] [“connection aborted”] [addr=172.16.100.11:20160] [receiver_err=“Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: [] }))”] [sink_error=Some(RemoteStopped)] [store_id=5]
    [2021/10/08 20:37:29.379 +08:00] [ERROR] [raft_client.rs:707] [“connection abort”] [addr=172.16.100.11:20160] [store_id=5]

  • Configuration file

  • Grafana monitoring (https://metricstool.pingcap.com/)
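
The log excerpt above repeats the same two errors every five seconds, so it can help to condense it into one count per peer. The following is a minimal sketch, assuming the bracketed `[addr=...]` and `[store_id=...]` fields appear as in the lines quoted above; the function name `summarize_connection_errors` is hypothetical and not part of any TiDB tool.

```python
import re
from collections import Counter

# Patterns for the bracketed fields seen in the quoted TiKV log lines.
ADDR_RE = re.compile(r"\[addr=([^\]]+)\]")
STORE_RE = re.compile(r"\[store_id=(\d+)\]")


def summarize_connection_errors(lines):
    """Count ERROR lines per (addr, store_id) pair."""
    counts = Counter()
    for line in lines:
        if "[ERROR]" not in line:
            continue
        addr = ADDR_RE.search(line)
        store = STORE_RE.search(line)
        if addr and store:
            counts[(addr.group(1), int(store.group(1)))] += 1
    return counts


if __name__ == "__main__":
    # Three lines in the shape of the excerpt above (quotes normalized).
    sample = [
        '[2021/10/08 20:37:09.361 +08:00] [ERROR] [raft_client.rs:707] ["connection abort"] [addr=172.16.100.13:20160] [store_id=1]',
        '[2021/10/08 20:37:09.364 +08:00] [ERROR] [raft_client.rs:707] ["connection abort"] [addr=172.16.100.11:20160] [store_id=5]',
        '[2021/10/08 20:37:14.366 +08:00] [ERROR] [raft_client.rs:707] ["connection abort"] [addr=172.16.100.13:20160] [store_id=1]',
    ]
    for (addr, store_id), n in sorted(summarize_connection_errors(sample).items()):
        print(f"store {store_id} at {addr}: {n} connection errors")
```

In the excerpt, every error points at 172.16.100.13:20160 (store 1) or 172.16.100.11:20160 (store 5), which matches the report that two of the three TiKV nodes are down. These lines only show that the running node cannot reach its peers; the cause of the startup failure is more likely to be found in the logs of the two nodes that did not start.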

Before relocating the data directory, see the SOP series articles first:
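
Separately from those articles, a quick sanity check of the relocated directory may help narrow things down, since the move was interrupted and the log path is now a symlink. The sketch below is only an illustration under assumptions: `check_log_dir` and `port_is_open` are hypothetical helper names, the path is the one printed in the error message, and the script should be run on the affected host as the same OS user that runs TiKV so the write-permission check is meaningful.

```python
import os
import socket


def check_log_dir(path):
    """Return a list of human-readable problems found with a log directory."""
    problems = []
    # A symlink whose target is missing is a common leftover of an interrupted move.
    if os.path.islink(path) and not os.path.exists(path):
        problems.append(f"{path} is a dangling symlink -> {os.readlink(path)}")
        return problems
    real = os.path.realpath(path)
    if not os.path.isdir(real):
        problems.append(f"{path} does not resolve to a directory (resolved: {real})")
        return problems
    # The process needs write and search permission to create log files here.
    if not os.access(real, os.W_OK | os.X_OK):
        problems.append(f"{real} is not writable by the current user")
    return problems


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Path and address taken from the error message in this thread.
    for problem in check_log_dir("/tidb-deploy/tikv-20160/log"):
        print("PROBLEM:", problem)
    print("port 20160 open:", port_is_open("172.16.100.13", 20160))
```

An empty problem list does not prove the directory is healthy (for example, the target filesystem could still be full); it only rules out a broken symlink or a missing write permission.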
