Recovering TiKV after its data directory was deleted

Test: delete the data directory of one node in a 3-node TiKV cluster.

After deleting one node's data folder, I scaled in that TiKV node:
[tidb@tidb1 ~]$ tiup cluster scale-in tidb-jiantest -N 192.168.135.133:20160 --force
After the scale-in succeeded, I scaled the node back out, and it shows as Offline:
192.168.135.133:20160 tikv 192.168.135.133 20160/20180 linux/x86_64 Offline /tidb-data/tikv-20160

TiKV reports this error:
[2021/12/10 10:38:48.002 +08:00] [FATAL] [server.rs:843] ["failed to start node: Grpc(RpcFailure(RpcStatus { code: 2-UNKNOWN, message: "duplicated store address: id:13004 address:\"192.168.135.133:20160\" version:\"5.3.0\" status_address:\"192.168.135.133:20180\" git_hash:\"6c1424706f3d5885faa668233f34c9f178302f36\" start_timestamp:1639103922 deploy_path:\"/tidb-deploy/tikv-20160/bin\" , already registered by id:5 address:\"192.168.135.133:20160\" state:Offline version:\"5.3.0\" status_address:\"192.168.135.133:20180\" git_hash:\"6c1424706f3d5885faa668233f34c9f178302f36\" start_timestamp:1639039505 deploy_path:\"/tidb-deploy/tikv-20160/bin\" last_heartbeat:1639102743243670223 ", details: [] }))"]

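I tried to remove store id:5 by setting its state to Tombstone through the PD API, but every attempt fails with "invalid state tombstone":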

[tidb@tidb1 ~]$ curl -X POST http://192.168.135.132:2379/pd/api/v1/store/5/state?state=Tombstone
"invalid state Tombstone"
[tidb@tidb1 ~]$ curl -X POST http://192.168.135.132:2379/pd/api/v1/store/5/state?state=tombstone
"invalid state tombstone"
[tidb@tidb1 ~]$ curl -X POST http://192.168.135.132:2379/pd/api/v1/store/5/state?state="tombstone"
"invalid state tombstone"
[tidb@tidb1 ~]$ curl -X POST http://192.168.135.132:2379/pd/api/v1/store/5/state?state="Tombstone"
"invalid state Tombstone"

Deleting the store through pd-ctl does not remove it either:
"stores": [
  {
    "store": {
      "id": 5,
      "address": "192.168.135.133:20160",
      "state": 1,
      "version": "5.3.0",
      "status_address": "192.168.135.133:20180",
      "git_hash": "6c1424706f3d5885faa668233f34c9f178302f36",
      "start_timestamp": 1639039505,
      "deploy_path": "/tidb-deploy/tikv-20160/bin",
      "last_heartbeat": 1639102743243670223,
      "state_name": "Offline"
    },
    "status": {
      "capacity": "16.99GiB",
      "available": "11.31GiB",
      "used_size": "33.55MiB",
      "leader_count": 0,
      "leader_weight": 1,
      "leader_score": 0,
      "leader_size": 0,
      "region_count": 3,
      "region_weight": 1,
      "region_score": 7738497.246537877,
      "region_size": 3,
      "slow_score": 1,
      "start_ts": "2021-12-09T16:45:05+08:00",
      "last_heartbeat_ts": "2021-12-10T10:19:03.243670223+08
      "uptime": "17h33m58.243670223s"
    }
  },
» store delete 5
Success!
» store
{
  "count": 4,
  "stores": [
    {
      "store": {
        "id": 1,
        "address": "192.168.135.134:20160",
        "version": "5.3.0",
        "status_address": "192.168.135.134:20180",
        "git_hash": "6c1424706f3d5885faa668233f34c9f178302f36",
        "start_timestamp": 1639039505,
        "deploy_path": "/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1639105274047365214,
        "state_name": "Up"
      },
      "status": {
        "capacity": "16.99GiB",
        "available": "11.31GiB",
        "used_size": "34.01MiB",
        "leader_count": 2,
        "leader_weight": 1,
        "leader_score": 2,
        "leader_size": 2,
        "region_count": 3,
        "region_weight": 1,
        "region_score": 7738871.373530997,
        "region_size": 3,
        "slow_score": 1,
        "start_ts": "2021-12-09T16:45:05+08:00",
        "last_heartbeat_ts": "2021-12-10T11:01:14.047365214+08:00",
        "uptime": "18h16m9.047365214s"
      }
    },
    {
      "store": {
        "id": 4,
        "address": "192.168.135.132:20160",
        "version": "5.3.0",
        "status_address": "192.168.135.132:20180",
        "git_hash": "6c1424706f3d5885faa668233f34c9f178302f36",
        "start_timestamp": 1639039505,
        "deploy_path": "/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1639105275230512557,
        "state_name": "Up"
      },
      "status": {
        "capacity": "16.99GiB",
        "available": "11.3GiB",
        "used_size": "33.68MiB",
        "leader_count": 1,
        "leader_weight": 1,
        "leader_score": 1,
        "leader_size": 1,
        "region_count": 3,
        "region_weight": 1,
        "region_score": 7739282.703694476,
        "region_size": 3,
        "slow_score": 1,
        "start_ts": "2021-12-09T16:45:05+08:00",
        "last_heartbeat_ts": "2021-12-10T11:01:15.230512557+08:00",
        "uptime": "18h16m10.230512557s"
      }
    },
    {
      "store": {
        "id": 5,
        "address": "192.168.135.133:20160",
        "state": 1,
        "version": "5.3.0",
        "status_address": "192.168.135.133:20180",
        "git_hash": "6c1424706f3d5885faa668233f34c9f178302f36",
        "start_timestamp": 1639039505,
        "deploy_path": "/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1639102743243670223,
        "state_name": "Offline"
      },
      "status": {
        "capacity": "16.99GiB",
        "available": "11.31GiB",
        "used_size": "33.55MiB",
        "leader_count": 0,
        "leader_weight": 1,
        "leader_score": 0,
        "leader_size": 0,
        "region_count": 3,
        "region_weight": 1,
        "region_score": 7738497.246537877,
        "region_size": 3,
        "slow_score": 1,
        "start_ts": "2021-12-09T16:45:05+08:00",
        "last_heartbeat_ts": "2021-12-10T10:19:03.243670223+08:00",
        "uptime": "17h33m58.243670223s"
      }
    },

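1) What exactly are you trying to achieve? Is the problem that the TiKV with store=5 never became Tombstone?
2) With replica=3 and only 3 TiKV nodes, taking a node offline runs into some expected problems; for example, Regions are not migrated automatically.
3) To solve this, first scale out by one TiKV node, then scale in following the standard procedure.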
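
The check behind points 2) and 3) can be sketched roughly as follows. This is only an illustration, not PD's actual logic: the helper name `extra_stores_needed` is hypothetical, the input mimics the shape of the pd-ctl `store` output posted above (trimmed to the fields used), and `max_replicas=3` is taken from the replica count mentioned in this thread.

```python
import json


def extra_stores_needed(store_output: str, max_replicas: int) -> int:
    """Return how many more Up TiKV stores are needed so that every
    Region can keep max_replicas copies on distinct Up stores.

    Hypothetical helper: a simplified reading of the advice in this
    thread, not code taken from PD.
    """
    stores = json.loads(store_output)["stores"]
    up = sum(1 for s in stores if s["store"].get("state_name") == "Up")
    return max(0, max_replicas - up)


# Trimmed to the fields used here; ids and states are from the thread.
sample = json.dumps({
    "stores": [
        {"store": {"id": 1, "state_name": "Up"}},
        {"store": {"id": 4, "state_name": "Up"}},
        {"store": {"id": 5, "state_name": "Offline"}},
    ]
})

# Number of TiKV nodes to scale out before retrying the scale-in.
print(extra_stores_needed(sample, max_replicas=3))
```

With the store list from this thread, two stores are Up and the replica count is 3, so the sketch reports that one more TiKV node is needed, which matches the advice to scale out one node first.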

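From other posts, it looks like I could first set the store=5 TiKV to Tombstone and then delete it, after which the node could be added back normally. But I don't know how to set its state to Tombstone here.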

Alternatively, could I recover at this point by changing replica to 2, or an even lower replica count? If so, could you share the steps?

One more question: when I ran delete 5 in pd-ctl, why did it report success even though the store was not deleted?

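As explained above, the delete did succeed, but it only takes the store offline (Offline state) and triggers Region migration. Once the Regions have migrated and none are left on the store, it becomes Tombstone. In this case, however, the Regions on this store were never migrated, because the number of available TiKV nodes is smaller than the replica count. I recommend scaling out one more TiKV node first.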
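
The lifecycle described in this reply can be modeled with a small sketch. It is a deliberately simplified model of the explanation above, not PD's implementation: the function names are hypothetical, and the assumption that all Regions drain as soon as enough Up stores remain ignores scheduling time and other placement rules.

```python
def next_state(state: str, region_count: int) -> str:
    """An Offline store turns Tombstone only once it holds no Regions."""
    if state == "Offline" and region_count == 0:
        return "Tombstone"
    return state


def simulate_store_delete(region_count: int,
                          up_stores_remaining: int,
                          max_replicas: int) -> str:
    """Model what happens after `store delete` is accepted.

    Assumption: Regions can leave the store only if the remaining Up
    stores can still hold max_replicas copies of each Region.
    """
    state = "Offline"  # `store delete` reports Success and marks the store Offline
    if up_stores_remaining >= max_replicas:
        region_count = 0  # simplified: every replica is rebuilt elsewhere
    return next_state(state, region_count)


# Situation in this thread: 3 Regions on store 5, 2 Up stores left, 3 replicas.
print(simulate_store_delete(region_count=3, up_stores_remaining=2, max_replicas=3))

# After scaling out one more TiKV node: 3 Up stores are available.
print(simulate_store_delete(region_count=3, up_stores_remaining=3, max_replicas=3))
```

In this model the first call stays at Offline because the Regions have nowhere to go, while the second call reaches Tombstone, which is why scaling out first is suggested.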
