[2020/07/28 19:34:09.572 +09:00] [WARN] [backoff.go:319] ["pdRPC backoffer.maxSleep 20000ms is exceeded, errors:\
loadStore from PD failed, id: 291001, err: rpc error: code = Unknown desc = invalid store ID 291001, not found at 2020-07-28T19:34:05.198024208+09:00\
loadStore from PD failed, id: 291001, err: rpc error: code = Unknown desc = invalid store ID 291001, not found at 2020-07-28T19:34:06.836664819+09:00\
loadStore from PD failed, id: 291001, err: rpc error: code = Unknown desc = invalid store ID 291001, not found at 2020-07-28T19:34:09.572894318+09:00"]
[2020/07/28 19:34:09.573 +09:00] [FATAL] [session.go:1849] ["check bootstrapped failed"] [error="[tikv:9005]Region is unavailable"] [stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion\
\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/session/session.go:1849\
github.com/pingcap/tidb/session.BootstrapSession\
\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/session/session.go:1649\
main.createStoreAndDomain\
\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/tidb-server/main.go:295\
main.main\
\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/tidb-server/main.go:181\
runtime.main\
\t/usr/local/go/src/runtime/proc.go:203"]
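What happened:
The cluster originally had three TiKV nodes. After an unexpected incident, two of the three TiKV nodes failed to start and only one remained intact. I rebuilt the cluster by scaling in first and then scaling out. After the rebuild, tidb will not start.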
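If 2 of the 3 TiKV nodes cannot start, a normal scale-in is bound to fail: some regions will have lost a majority of their replicas, so their replicas cannot be migrated. I recommend rebuilding the cluster by repairing the TiKV nodes that fail to start.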
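The majority argument above can be sketched in a few lines. This is only an illustration, not TiKV code: `has_quorum` is a hypothetical helper, and the store IDs 1, 2, 3 are made up. It assumes a region stays available only while a strict majority of its peers are on healthy stores.

```python
def has_quorum(peer_stores, failed_stores):
    """Return True if a strict majority of a region's peers sit on healthy stores."""
    healthy = [s for s in peer_stores if s not in failed_stores]
    return len(healthy) > len(peer_stores) // 2


# Hypothetical store IDs, for illustration only.
peers = [1, 2, 3]  # a 3-replica region with one peer on each store

print(has_quorum(peers, failed_stores={2}))     # one store down: True
print(has_quorum(peers, failed_stores={2, 3}))  # two stores down: False
```

In a cluster with exactly three TiKV stores and three replicas, every region has one peer on each store, so losing two stores leaves every region without a majority.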
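The current state is that I already forced the scale-in with --force, and the scale-out through tiup also completed. But tidb still will not start.
More than seven thousand regions currently do not have three replicas:
/home/tidb/tidb4/bin/pd-ctl -u http://10.0.110.63:2379 region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length != 3)}"|wc -l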
7954
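For a more detailed breakdown than a single count, the same check can be done offline on the JSON that pd-ctl prints. This is a rough sketch, assuming the output of `pd-ctl ... region` has been saved to a file and has the shape the jq filter above relies on (a top-level `regions` array whose entries carry `id` and `peers[].store_id`). The function name `summarize_regions` and the sample data are hypothetical; store ID 291001 is the one the log reports as not found.

```python
import json
from collections import Counter


def summarize_regions(region_json, expected_replicas=3, missing_stores=()):
    """Group regions by peer count and flag regions that reference missing stores.

    region_json: text of the JSON object printed by `pd-ctl region`
                 (assumed shape: {"regions": [{"id": ..., "peers": [{"store_id": ...}]}]}).
    """
    data = json.loads(region_json)
    missing = set(missing_stores)
    by_peer_count = Counter()
    abnormal = []    # regions whose peer count differs from expected_replicas
    on_missing = []  # regions that still list a peer on a missing store
    for region in data.get("regions", []):
        stores = [peer["store_id"] for peer in region.get("peers", [])]
        by_peer_count[len(stores)] += 1
        if len(stores) != expected_replicas:
            abnormal.append(region["id"])
        if missing.intersection(stores):
            on_missing.append(region["id"])
    return by_peer_count, abnormal, on_missing


# Made-up sample data; real input would come from a saved pd-ctl dump.
sample = json.dumps({
    "regions": [
        {"id": 10, "peers": [{"store_id": 1}, {"store_id": 4}, {"store_id": 5}]},
        {"id": 11, "peers": [{"store_id": 1}]},
        {"id": 12, "peers": [{"store_id": 1}, {"store_id": 291001}]},
    ]
})

counts, abnormal, on_missing = summarize_regions(sample, missing_stores=[291001])
print(dict(counts))  # number of regions per peer count
print(abnormal)      # region IDs without exactly 3 peers
print(on_missing)    # region IDs still pointing at store 291001
```

Listing the regions that still reference the removed store may help narrow down which regions keep tidb from bootstrapping, but it only reports state; it does not repair anything.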