TiKV scale-in and scale-out issue


【TiDB version】
4.0.10
【Problem description】
The cluster has 3 replicas and 4 TiKV nodes. I ran `store delete` in pd-ctl on two of the TiKV stores; 48 hours later they are still in Offline state. There are roughly 150 regions and about 4 GB of data, with little write traffic.
Questions:
1. Why, after 48 hours, have the regions on the deleted TiKV stores still not been scheduled away?
2. How can the two Offline TiKV stores be turned into Tombstone?



By default, PD will not schedule multiple peers of the same region onto one TiKV instance. After deleting two of the four stores, the remaining two nodes cannot host three replicas on distinct stores, so PD keeps trying to migrate the regions away, never succeeds, and the stores stay Offline.

Two days after scaling out to 5 nodes, the two deleted stores are still Offline. With only a few GB of data in total, balancing should not be this slow.

This is not a matter of slow migration: three replicas cannot be placed on the two remaining stores, so every migration attempt fails. For the migration to succeed, the number of remaining stores must be greater than or equal to the replica count.
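The arithmetic behind this can be made explicit. A minimal sketch, assuming the default `replication.max-replicas` of 3 and the two schedulable stores this cluster had left; the pd-ctl commands in the comments use a placeholder PD address:

```shell
# Values from this thread (assumed): 4 stores minus the 2 deleted ones.
up_stores=2
max_replicas=3   # default replication.max-replicas

# PD never places two peers of one region on the same store, so draining
# an Offline store requires up_stores >= max_replicas.
if [ "$up_stores" -ge "$max_replicas" ]; then
  echo "migration can proceed"
else
  echo "stuck: only $up_stores Up stores for $max_replicas replicas"
fi

# The live values can be checked with (placeholder PD address):
#   pd-ctl -u http://127.0.0.1:2379 config show replication
#   pd-ctl -u http://127.0.0.1:2379 store
```

With 2 Up stores and 3 replicas this takes the "stuck" branch, which is exactly the state the cluster sat in for 48 hours.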

The cluster now has 5 TiKV nodes with 3 replicas, and two TiKV stores were deleted with `store delete`. In this case the remaining store count equals the replica count, so the regions on the deleted stores should be balanced away. In practice, two days later, there are still a scattered one or two regions left on them.

    "capacity": "845.3GiB",
    "available": "817.5GiB",
    "used_size": "181.4MiB",
    "leader_count": 0,
    "leader_weight": 1,
    "leader_score": 0,
    "leader_size": 0,
    "region_count": 6,
    "region_weight": 1,
    "region_score": 6,
    "region_size": 6,

Run `region store <store_id>` in pd-ctl to see which regions remain on the node.

» store 5
{
"store": {
"id": 5,
"state": 1,
"git_hash": "4ac5e7ea1839d63163e911e2e1164d663f49592b",
"start_timestamp": 1617336897,
"last_heartbeat": 1618281725876663070,
"state_name": "Offline"
},
"status": {
"capacity": "845.3GiB",
"available": "817.3GiB",
"used_size": "181.9MiB",
"leader_count": 0,
"leader_weight": 1,
"leader_score": 0,
"leader_size": 0,
"region_count": 6,
"region_weight": 1,
"region_score": 6,
"region_size": 6,
"start_ts": "2021-04-02T12:14:57+08:00",
"last_heartbeat_ts": "2021-04-13T10:42:05.87666307+08:00",

» region store 5
{
"count": 6,
"regions": [
{
"id": 12,
"start_key": "7480000000000000FF0500000000000000F8",
"end_key": "7480000000000000FF0700000000000000F8",
"epoch": {
"conf_ver": 5,
"version": 3
},
"peers": [
{
"id": 13,
"store_id": 1
},
{
"id": 14,
"store_id": 4
},
{
"id": 15,
"store_id": 5
}
],
"leader": {
"id": 13,
"store_id": 1
},
"pending_peers": [
{
"id": 15,
"store_id": 5
}
],
"written_bytes": 0,
"read_bytes": 0,
"written_keys": 0,
"read_keys": 0,
"approximate_size": 1,
"approximate_keys": 0
},
{
"id": 64,
"start_key": "7480000000000000FF1F00000000000000F8",
"end_key": "7480000000000000FF2100000000000000F8",
"epoch": {
"conf_ver": 5,
"version": 16
},
"peers": [
{
"id": 65,
"store_id": 1
},
{
"id": 66,
"store_id": 4
},
{
"id": 67,
"store_id": 5
}
],
"leader": {
"id": 66,
"store_id": 4
},
"pending_peers": [
{
"id": 67,
"store_id": 5
}
],
"written_bytes": 41,
"read_bytes": 0,
"written_keys": 1,
"read_keys": 0,
"approximate_size": 1,
"approximate_keys": 0
},
{
"id": 68,
"start_key": "7480000000000000FF2100000000000000F8",
"end_key": "7480000000000000FF2300000000000000F8",
"epoch": {
"conf_ver": 5,
"version": 17
},
"peers": [
{
"id": 69,
"store_id": 1
},
{
"id": 70,
"store_id": 4
},
{
"id": 71,
"store_id": 5
}
],
"leader": {
"id": 70,
"store_id": 4
},
"pending_peers": [
{
"id": 71,
"store_id": 5
}
],
"written_bytes": 41,
"read_bytes": 0,
"written_keys": 1,
"read_keys": 0,
"approximate_size": 1,
"approximate_keys": 0
},
{
"id": 76,
"start_key": "7480000000000000FF2500000000000000F8",
"end_key": "7480000000000000FF2700000000000000F8",
"epoch": {
"conf_ver": 5,
"version": 19
},
"peers": [
{
"id": 77,
"store_id": 1
},
{
"id": 78,
"store_id": 4
},
{
"id": 79,
"store_id": 5
}
],
"leader": {
"id": 77,
"store_id": 1
},
"pending_peers": [
{
"id": 79,
"store_id": 5
}
],
"written_bytes": 0,
"read_bytes": 0,
"written_keys": 0,
"read_keys": 0,
"approximate_size": 1,
"approximate_keys": 0
},
{
"id": 80,
"start_key": "7480000000000000FF2700000000000000F8",
"end_key": "7480000000000000FF2900000000000000F8",
"epoch": {
"conf_ver": 5,
"version": 20
},
"peers": [
{
"id": 81,
"store_id": 1
},
{
"id": 82,
"store_id": 4
},
{
"id": 83,
"store_id": 5
}
],
"leader": {
"id": 82,
"store_id": 4
},
"pending_peers": [
{
"id": 83,
"store_id": 5
}
],
"written_bytes": 41,
"read_bytes": 0,
"written_keys": 1,
"read_keys": 0,
"approximate_size": 1,
"approximate_keys": 0
},
{
"id": 84,
"start_key": "7480000000000000FF2900000000000000F8",
"end_key": "7480000000000000FF2B00000000000000F8",
"epoch": {
"conf_ver": 5,
"version": 21
},
"peers": [
{
"id": 85,
"store_id": 1
},
{
"id": 86,
"store_id": 4
},
{
"id": 87,
"store_id": 5
}
],
"leader": {
"id": 85,
"store_id": 1
},
"pending_peers": [
{
"id": 87,
"store_id": 5
}
],
"written_bytes": 0,
"read_bytes": 0,
"written_keys": 0,
"read_keys": 0,
"approximate_size": 1,
"approximate_keys": 7166
}
]
}

>> operator add remove-peer 1 2                         // remove a replica of Region 1 from store 2

Manually remove the pending_peers of these regions and see whether the migration then succeeds.
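One way to drive this by hand, sketched below: loop over the six regions from the `region store 5` output above and emit a remove-peer operator for each. The PD address is a placeholder and the loop only echoes the commands; drop the echo to actually submit them:

```shell
pd="http://127.0.0.1:2379"   # placeholder PD endpoint
# Region ids taken from the `region store 5` output above.
for region_id in 12 64 68 76 80 84; do
  echo "pd-ctl -u $pd operator add remove-peer $region_id 5"
done
# If PD rejects an operator because removal would leave the region under
# max-replicas, add a replacement peer on a healthy store first:
#   pd-ctl -u $pd operator add add-peer <region_id> <healthy_store_id>
```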

Thanks!

{
**"id": 84**,
"start_key": "7480000000000000FF2900000000000000F8",
"end_key": "7480000000000000FF2B00000000000000F8",
"epoch": {
"conf_ver": 5,
"version": 21
},
"peers": [
{
"id": 85,
"store_id": 1
},
{
"id": 86,
"store_id": 4
},
{
"id": 87,
"store_id": 5
}
],
"leader": {
"id": 85,
"store_id": 1
},
"pending_peers": [
{
**"id": 87**,
"store_id": 5
}
],
"written_bytes": 0,
"read_bytes": 0,
"written_keys": 0,
"read_keys": 0,
"approximate_size": 1,
"approximate_keys": 7166
}

Should I run
operator add remove-peer 84 5
or
operator add remove-peer 87 5

My understanding is the former, since 84 is the region id while 85/86/87 are only peer ids; I just want to confirm.

» operator add remove-peer 84 5
Failed! [500] "failed to add operator, maybe already have one"
» operator add remove-peer 87 5
Failed! [500] "region 87 not found"
@GangShen
Thanks

» store 5
{
"store": {
"id": 5,
"state": 1,
"start_timestamp": 1617336897,
"last_heartbeat": 1618366884726423950,
"state_name": "Offline"
},
"status": {
"capacity": "845.3GiB",
"available": "815.8GiB",
"used_size": "200.9MiB",
"leader_count": 0,
"leader_weight": 1,
"leader_score": 0,
"leader_size": 0,
"region_count": 6,
"region_weight": 1,
"region_score": 6,
"region_size": 6,
"start_ts": "2021-04-02T12:14:57+08:00",
"last_heartbeat_ts": "2021-04-14T10:21:24.72642395+08:00",
"uptime": "286h6m27.72642395s"
}
}

@GangShen

operator add remove-peer 84 5
The first argument should be the region id, not a peer id.

Right, and running it errors out:
» operator add remove-peer 84 5
Failed! [500] "failed to add operator, maybe already have one"
» operator add remove-peer 87 5
Failed! [500] "region 87 not found"
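The first error usually means PD already holds an operator for region 84 (the replace-offline-replica operator that shows up in the PD leader log), and the second confirms that 87 is a peer id, not a region id. A sketch of inspecting and cancelling the stuck operator before retrying, using pd-ctl's `operator show` / `operator remove` subcommands with a placeholder PD address (commands are echoed, not executed):

```shell
pd="http://127.0.0.1:2379"   # placeholder PD endpoint
region=84
# Inspect current operators; a long-lived replace-offline-replica on
# region 84 would explain the "maybe already have one" rejection.
echo "pd-ctl -u $pd operator show"
# Cancel it, then retry the manual remove-peer.
echo "pd-ctl -u $pd operator remove $region"
echo "pd-ctl -u $pd operator add remove-peer $region 5"
```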

The PD leader log shows:
[INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7589 on store 2117"] [source="active push"]
(this line repeats many times)
[2021/04/14 11:02:34.602 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7589 on store 2117"] [source="active push"]
[2021/04/14 11:02:34.602 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7589 on store 2117"] [source="active push"]
[2021/04/14 11:02:39.602 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7589 on store 2117"] [source="active push"]
[2021/04/14 11:02:39.602 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7589 on store 2117"] [source="active push"]
[2021/04/14 11:02:44.602 +08:00] [INFO] [operator_controller.go:560] ["operator timeout"] [region-id=84] [takes=10m0.453800896s] [operator="replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:84(21,5), createAt:2021-04-14 10:52:44.148557749 +0800 CST m=+1034683.573833541, startAt:2021-04-14 10:52:44.148704226 +0800 CST m=+1034683.573980082, currentStep:0, steps:[add learner peer 7589 on store 2117, promote learner peer 7589 on store 2117 to voter, remove peer on store 4]) timeout"]
[2021/04/14 11:02:44.637 +08:00] [INFO] [operator_controller.go:424] ["add operator"] [region-id=84] [operator="replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:84(21,5), createAt:2021-04-14 11:02:44.637516694 +0800 CST m=+1035284.062792450, startAt:0001-01-01 00:00:00 +0000 UTC, currentStep:0, steps:[add learner peer 7595 on store 2117, promote learner peer 7595 on store 2117 to voter, remove peer on store 4])"] ["additional info"=]
[2021/04/14 11:02:44.637 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7595 on store 2117"] [source=create]
[2021/04/14 11:02:50.102 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7595 on store 2117"] [source="active push"]
[2021/04/14 11:02:50.102 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7595 on store 2117"] [source="active push"]
[2021/04/14 11:02:55.102 +08:00] [INFO] [operator_controller.go:620] ["send schedule command"] [region-id=84] [step="add learner peer 7595 on store 2117"] [source="active push"]

Grep the logs on each of the TiKV nodes for the keyword region_id=84 and check why the operation on this region timed out on the TiKV side.
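A runnable sketch of that grep against a synthetic two-line log sample: anchoring on the `region_id` field followed by a non-digit avoids false hits on ids like 840 (the log path and sample contents are illustrative only):

```shell
# Write a tiny sample resembling TiKV log lines (illustrative only).
cat > /tmp/tikv_sample.log <<'EOF'
[2021/04/02 12:17:47.728 +08:00] [WARN] [endpoint.rs:537] [error-response] [err="... not_leader { region_id: 84 leader { id: 86 store_id: 4 } }"]
[2021/04/02 12:17:48.000 +08:00] [INFO] [peer.rs:100] [message] [region_id=840]
EOF

# Match "region_id: 84" or "region_id=84" but not region 840.
grep -c 'region_id[=:] *84[^0-9]' /tmp/tikv_sample.log   # prints 1
```

On a real node, point the same pattern at the deploy directory's tikv.log instead of the sample file.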

I can't find that keyword on the TiKV nodes.
The closest hits are lines like this:
[2021/04/02 12:17:47.728 +08:00] [WARN] [endpoint.rs:537] [error-response] [err="Region error (will back off and retry) message: \"peer is not leader for region 84, leader may Some(id: 86 store_id: 4)\" not_leader { region_id: 84 leader { id: 86 store_id: 4 } }"]

  1. In pd-ctl, run region 84.
  2. With tikv-ctl, on the TiKV node whose store id is 4, run ./tikv-ctl --host ${store_ip}:${tikv_port} raft region -r 84.

Then share the results.

Running tikv-ctl --host ${store_5_host}:${store_5_port} raft region -r 84 on store 5:
region id: 84
region state key: \001\003\000\000\000\000\000\000\000T\001
region state: Some(region { id: 84 start_key: 7480000000000000FF2900000000000000F8 end_key: 7480000000000000FF2B00000000000000F8 region_epoch { conf_ver: 5 version: 21 } peers { id: 85 store_id: 1 } peers { id: 86 store_id: 4 } peers { id: 87 store_id: 5 } })
raft state key: \001\002\000\000\000\000\000\000\000T\002
raft state: Some(hard_state { term: 9 vote: 86 commit: 11 } last_index: 11)
apply state key: \001\002\000\000\000\000\000\000\000T\003
apply state: Some(applied_index: 11 last_commit_index: 10 commit_index: 11 commit_term: 9 truncated_state { index: 5 term: 5 })

Running tikv-ctl --host ${store_4_host}:${store_4_port} raft region -r 84 on store 4:
region id: 84
region state key: \001\003\000\000\000\000\000\000\000T\001
region state: Some(region { id: 84 start_key: 7480000000000000FF2900000000000000F8 end_key: 7480000000000000FF2B00000000000000F8 region_epoch { conf_ver: 5 version: 21 } peers { id: 85 store_id: 1 } peers { id: 86 store_id: 4 } peers { id: 87 store_id: 5 } })
raft state key: \001\002\000\000\000\000\000\000\000T\002
raft state: Some(hard_state { term: 9 vote: 86 commit: 11 } last_index: 11)
apply state key: \001\002\000\000\000\000\000\000\000T\003
apply state: Some(applied_index: 11 last_commit_index: 10 commit_index: 11 commit_term: 9 truncated_state { index: 5 term: 5 })

region_5_before_restart.txt (3.6 KB) region_5.txt (2.4 KB)
I restarted store 5 this morning and then ran region store 5; the region count dropped from 6 to 3, but region 84 is still stuck on store 5.

Output before restarting store 5:
» operator show admin
null

» operator show leader
[
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:leader,region,replica, region:80(20,5), createAt:2021-04-15 08:31:18.102705515 +0800 CST m=+1112597.527981307, startAt:2021-04-15 08:31:18.102901128 +0800 CST m=+1112597.528176942, currentStep:0, steps:[transfer leader from store 4 to store 1, add learner peer 8378 on store 2117, promote learner peer 8378 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:leader,region,replica, region:64(16,5), createAt:2021-04-15 08:31:31.617549662 +0800 CST m=+1112611.042825460, startAt:2021-04-15 08:31:31.617830293 +0800 CST m=+1112611.043106100, currentStep:0, steps:[transfer leader from store 4 to store 1, add learner peer 8379 on store 2117, promote learner peer 8379 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:leader,region,replica, region:68(17,5), createAt:2021-04-15 08:31:40.123580365 +0800 CST m=+1112619.548856157, startAt:2021-04-15 08:31:40.123758136 +0800 CST m=+1112619.549033941, currentStep:0, steps:[transfer leader from store 4 to store 1, add learner peer 8380 on store 2117, promote learner peer 8380 on store 2117 to voter, remove peer on store 4])"
]

» operator show region
[
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:12(3,5), createAt:2021-04-15 08:29:02.669176388 +0800 CST m=+1112462.094452151, startAt:2021-04-15 08:29:02.669277765 +0800 CST m=+1112462.094553685, currentStep:0, steps:[add learner peer 8377 on store 2117, promote learner peer 8377 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:leader,region,replica, region:64(16,5), createAt:2021-04-15 08:31:31.617549662 +0800 CST m=+1112611.042825460, startAt:2021-04-15 08:31:31.617830293 +0800 CST m=+1112611.043106100, currentStep:0, steps:[transfer leader from store 4 to store 1, add learner peer 8379 on store 2117, promote learner peer 8379 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:leader,region,replica, region:68(17,5), createAt:2021-04-15 08:31:40.123580365 +0800 CST m=+1112619.548856157, startAt:2021-04-15 08:31:40.123758136 +0800 CST m=+1112619.549033941, currentStep:0, steps:[transfer leader from store 4 to store 1, add learner peer 8380 on store 2117, promote learner peer 8380 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:76(19,5), createAt:2021-04-15 08:28:34.643890686 +0800 CST m=+1112434.069166477, startAt:2021-04-15 08:28:34.644057689 +0800 CST m=+1112434.069333491, currentStep:0, steps:[add learner peer 8376 on store 2117, promote learner peer 8376 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:84(21,5), createAt:2021-04-15 08:37:18.650708129 +0800 CST m=+1112958.075983922, startAt:2021-04-15 08:37:18.650849519 +0800 CST m=+1112958.076125322, currentStep:0, steps:[add learner peer 8381 on store 2117, promote learner peer 8381 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:leader,region,replica, region:80(20,5), createAt:2021-04-15 08:31:18.102705515 +0800 CST m=+1112597.527981307, startAt:2021-04-15 08:31:18.102901128 +0800 CST m=+1112597.528176942, currentStep:0, steps:[transfer leader from store 4 to store 1, add learner peer 8378 on store 2117, promote learner peer 8378 on store 2117 to voter, remove peer on store 4])"
]

Results after restarting store 5:
» operator show leader
null

» operator show region
[
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:12(3,5), createAt:2021-04-15 09:09:10.199676873 +0800 CST m=+1114869.624952666, startAt:2021-04-15 09:09:10.19983761 +0800 CST m=+1114869.625113411, currentStep:0, steps:[add learner peer 8404 on store 2117, promote learner peer 8404 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:76(19,5), createAt:2021-04-15 09:08:47.679064779 +0800 CST m=+1114847.104340573, startAt:2021-04-15 09:08:47.679254629 +0800 CST m=+1114847.104530504, currentStep:0, steps:[add learner peer 8403 on store 2117, promote learner peer 8403 on store 2117 to voter, remove peer on store 4])",
"replace-offline-replica {mv peer: store [4] to [2117]} (kind:region,replica, region:84(21,5), createAt:2021-04-15 09:07:23.196152053 +0800 CST m=+1114762.621427846, startAt:2021-04-15 09:07:23.196312753 +0800 CST m=+1114762.621588572, currentStep:0, steps:[add learner peer 8402 on store 2117, promote learner peer 8402 on store 2117 to voter, remove peer on store 4])"
]