tiup cluster audit fJnrZ8TsHCB
tiup cluster audit fJnrXNnW5PL
Please save the output of each of the two audit commands above to its own file, and upload both files.
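Here is the current situation: store 302321060 was scaled in with --force, so its process and data-dir were cleaned up. Before its state had changed to tombstone, a new store 7748745185 was scaled out using the same IP and port as store 302321060. As a result, the new store has been restarting continuously since the scale-out. Please stop the TiKV instance of store 7748745185, then use the commands below to check whether the cluster has any regions without a leader: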
1. Regions that have no leader
pd-ctl -u http://ip:port region --jq '.regions[]|select(has("leader")|not)|{id: .id, peer_stores: [.peers[].store_id]}'
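If jq is not convenient, the same filter can be applied to saved pd-ctl output. The following is a minimal Python sketch that mirrors the jq expression above, assuming the output of `region` is a JSON object with a `regions` array (which is what the jq filter itself relies on). The function name `regions_without_leader` and the sample data are hypothetical and are not taken from this cluster.

```python
import json


def regions_without_leader(pd_region_output):
    """Mirror the jq filter: keep regions that have no "leader" key
    and report each region's id plus the store ids of its peers."""
    result = []
    for region in pd_region_output.get("regions", []):
        if "leader" not in region:
            result.append({
                "id": region["id"],
                "peer_stores": [peer["store_id"] for peer in region.get("peers", [])],
            })
    return result


# Hypothetical sample (NOT from the cluster in this thread):
# region 1 has a leader, region 2 does not.
sample = json.loads("""
{"count": 2, "regions": [
  {"id": 1,
   "peers": [{"id": 11, "store_id": 101}, {"id": 12, "store_id": 102}],
   "leader": {"id": 11, "store_id": 101}},
  {"id": 2,
   "peers": [{"id": 21, "store_id": 101}, {"id": 22, "store_id": 103}]}
]}
""")

leaderless = regions_without_leader(sample)
print(leaderless)
```

An empty list means that every region in the saved output reports a leader.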
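2. Use tikv-ctl to check the state of the regions on store 302321060:
1) Run tiup ctl:v5.0.0 pd -u pd_ip:pd_port region store 302321060 to get the regions currently on that store:
2) Based on the region information returned, use tikv-ctl to check the state of 3 to 4 regions:
For example, suppose the regions returned are 451990729, 451990890, and 451990991: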
{
"id": 451990729,
"start_key": "7480000000000000FFBC5F698000000000FF0000010419AAA0FAFFC000000003800000FF02EC537D9A000000FC",
"end_key": "7480000000000000FFBC5F698000000000FF0000010419AAA0FBFFC000000003800000FF02EC6F32C0000000FC",
"epoch": {
"conf_ver": 9443,
"version": 1182
},
"peers": [
{
"id": 451990730,
"store_id": 302321062
},
{
"id": 451990731,
"store_id": 302321060
},
{
"id": 451990732,
"store_id": 49878450
}
],
"leader": {
"id": 451990731,
"store_id": 302321060
},
"written_bytes": 0,
"read_bytes": 0,
"written_keys": 0,
"read_keys": 0,
"approximate_size": 0,
"approximate_keys": 0
},
{
"id": 451990890,
"start_key": "7480000000000000FFBC5F698000000000FF0000010419AAA0FCFF4000000003800000FF02EC80D244000000FC",
"end_key": "7480000000000000FFBC5F698000000000FF0000010419AAA0FDFF4000000003800000FF02EC9224BE000000FC",
"epoch": {
"conf_ver": 9443,
"version": 1184
},
"peers": [
{
"id": 451990891,
"store_id": 302321062
},
{
"id": 451990892,
"store_id": 302321060
},
{
"id": 451990893,
"store_id": 49878450
}
],
"leader": {
"id": 451990892,
"store_id": 302321060
},
"written_bytes": 0,
"read_bytes": 0,
"written_keys": 0,
"read_keys": 0,
"approximate_size": 0,
"approximate_keys": 0
},
{
"id": 451990991,
"start_key": "7480000000000000FFBC5F698000000000FF0000010419AAA0FDFF4000000003800000FF02EC9224BE000000FC",
"end_key": "7480000000000000FFBC5F698000000000FF0000010419AAA0FEFF4000000003800000FF02ECA729BD000000FC",
"epoch": {
"conf_ver": 9443,
"version": 1185
},
"peers": [
{
"id": 451990992,
"store_id": 302321062
},
{
"id": 451990993,
"store_id": 302321060
},
{
"id": 451990994,
"store_id": 49878450
}
],
"leader": {
"id": 451990993,
"store_id": 302321060
},
"written_bytes": 0,
"read_bytes": 0,
"written_keys": 0,
"read_keys": 0,
"approximate_size": 0,
"approximate_keys": 0
}
Run tikv-ctl on the stores that hold the two peers not located on store 302321060. Taking region 451990991 as an example, you can get the corresponding region information on store 49878450 or store 302321062:
https://docs.pingcap.com/zh/tidb/stable/tikv-control#查看-raft-状态机的信息
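To decide where to run tikv-ctl for each region, the peer list from the pd-ctl output can be filtered to exclude the removed store. The sketch below does this for the three example regions above, trimmed to the fields it uses. It only lists the stores to inspect and does not run tikv-ctl itself. The helper names `stores_to_inspect` and `regions_led_by` are hypothetical.

```python
FAILED_STORE_ID = 302321060  # the store that was scaled in with --force

# The three example regions from the pd-ctl output, trimmed to id, peers, leader.
regions = [
    {"id": 451990729,
     "peers": [{"id": 451990730, "store_id": 302321062},
               {"id": 451990731, "store_id": 302321060},
               {"id": 451990732, "store_id": 49878450}],
     "leader": {"id": 451990731, "store_id": 302321060}},
    {"id": 451990890,
     "peers": [{"id": 451990891, "store_id": 302321062},
               {"id": 451990892, "store_id": 302321060},
               {"id": 451990893, "store_id": 49878450}],
     "leader": {"id": 451990892, "store_id": 302321060}},
    {"id": 451990991,
     "peers": [{"id": 451990992, "store_id": 302321062},
               {"id": 451990993, "store_id": 302321060},
               {"id": 451990994, "store_id": 49878450}],
     "leader": {"id": 451990993, "store_id": 302321060}},
]


def stores_to_inspect(regions, failed_store_id):
    """Map each region id to the stores that hold a peer other than the failed store."""
    return {
        region["id"]: [peer["store_id"] for peer in region["peers"]
                       if peer["store_id"] != failed_store_id]
        for region in regions
    }


def regions_led_by(regions, store_id):
    """Return ids of regions whose recorded leader is on the given store."""
    return [region["id"] for region in regions
            if region.get("leader", {}).get("store_id") == store_id]


plan = stores_to_inspect(regions, FAILED_STORE_ID)
for region_id, stores in plan.items():
    print(f"region {region_id}: check with tikv-ctl on stores {stores}")
print("leader still recorded on the failed store:",
      regions_led_by(regions, FAILED_STORE_ID))
```

In the example output, PD still records the leader of all three regions on store 302321060, so the state reported by the two remaining peers is what matters here.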