TiKV node cannot be taken offline even though leader_count and region_count are already 0

Also, for the two Regions mentioned above, you only need to unsafe-recover those two Regions: use -r, there is no need for --all-regions.

Thanks! But the cleanup fails with an error; the instance has already been shut down.

./tikv-ctl --data-dir /data1/tidb-deploy/data/tikv-20160/db/ unsafe-recover remove-fail-stores -s 89241 -r 18201,18277
[2021/10/15 06:41:17.343 +00:00] [INFO] [mod.rs:118] ["encryption: none of key dictionary and file dictionary are found."]
[2021/10/15 06:41:17.343 +00:00] [INFO] [mod.rs:479] ["encryption is disabled."]
[2021/10/15 06:41:17.345 +00:00] [WARN] [config.rs:587] ["compaction guard is disabled due to region info provider not available"]
[2021/10/15 06:41:17.345 +00:00] [WARN] [config.rs:682] ["compaction guard is disabled due to region info provider not available"]
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', cmd/tikv-ctl/src/main.rs:121:57
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The leaders have been migrated off the instance and it is shut down, but running the command still fails. Any advice?

Did you shut down the whole cluster, or just the TiKV node?

Just the single TiKV node.
Shutting down the whole cluster seems like overkill, doesn't it?

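unsafe recover requires shutting down all TiKV nodes (do not shut down the PD nodes) and then running the command on every TiKV node in turn. It is inherently a high-risk operation, which is why it is called unsafe... Please be careful in a production environment; the way you are going about this is a bit too casual :joy: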
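The procedure described above can be sketched as a dry run. This is only an illustration, not an official procedure: the host names are hypothetical placeholders, the data directory and the store and Region IDs are copied as-is from the command earlier in this thread, and the script only prints what would be run on each node instead of executing anything.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the per-node commands, executes nothing.
# Assumption: every TiKV node uses the same data directory as in this thread.
DATA_DIR="${DATA_DIR:-/data1/tidb-deploy/data/tikv-20160/db/}"

# build_recover_cmd DATA_DIR FAILED_STORE REGIONS
# Prints the tikv-ctl invocation quoted in this thread for one node.
build_recover_cmd() {
  local data_dir="$1" failed_store="$2" regions="$3"
  printf './tikv-ctl --data-dir %s unsafe-recover remove-fail-stores -s %s -r %s\n' \
    "$data_dir" "$failed_store" "$regions"
}

# plan_recover FAILED_STORE REGIONS HOST...
# Prints one command per TiKV host. All TiKV nodes must already be stopped;
# the PD nodes stay up.
plan_recover() {
  local failed_store="$1" regions="$2"
  shift 2
  local host
  for host in "$@"; do
    printf '# on %s (TiKV stopped, PD still running)\n' "$host"
    build_recover_cmd "$DATA_DIR" "$failed_store" "$regions"
  done
}
```

For example, `plan_recover 89241 18201,18277 tikv-1 tikv-2 tikv-3` (hypothetical hosts) prints the same command three times, once per surviving node, which makes it easy to review the plan before touching production.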

https://docs.pingcap.com/zh/tidb/stable/tikv-control#强制-region-从多副本失败状态恢复服务

Force Regions to recover service from a multi-replica failure state

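The unsafe-recover remove-fail-stores command removes failed machines from the peer list of the specified Regions. Before running the command, the target TiKV must stop its service so that the file lock is released.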

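As I read the article, only the service on the instance being operated on needs to be stopped so that the file lock is released, not the whole cluster. Could you please confirm?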

Otherwise I will just wait for your online repair feature to ship. I hope the October release does not slip again.

One more question, if I may: do the other nodes that hold none of the failed node's Regions also need to be shut down for this operation?

All TiKV nodes...

OK, thanks then.

Was this resolved in the end? I have run into the same problem.
