After scaling in TiKV, the tiup status stays at Pending Offline

[Overview] Scenario and problem summary
After scaling in a TiKV node, its status stays at Pending Offline.
[Background] Operations performed
tiup cluster scale-in tidb-cluster --node ip:20160


Checking the stores in PD with tiup ctl:v5.0.1 pd store -u http://10.33.2.43:2379:
[root@zabbixserver2 ~]# tiup ctl:v5.0.1 pd store -u http://10.33.2.43:2379
Starting component ctl: /root/.tiup/components/ctl/v5.0.1/ctl pd store -u http://10.33.2.43:2379
{
  "count": 4,
  "stores": [
    {
      "store": {
        "id": 47009,
        "address": "10.33.2.42:20160",
        "version": "5.0.1",
        "status_address": "10.33.2.42:20180",
        "git_hash": "e26389a278116b2f61addfa9f15ca25ecf38bc80",
        "start_timestamp": 1624554400,
        "deploy_path": "/tikv/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1624607693273480869,
        "state_name": "Up"
      },
      "status": {
        "capacity": "177GiB",
        "available": "134.6GiB",
        "used_size": "19.55GiB",
        "leader_count": 944,
        "leader_weight": 1,
        "leader_score": 944,
        "leader_size": 52356,
        "region_count": 1906,
        "region_weight": 1,
        "region_score": 324784.33387210406,
        "region_size": 107217,
        "start_ts": "2021-06-25T01:06:40+08:00",
        "last_heartbeat_ts": "2021-06-25T15:54:53.273480869+08:00",
        "uptime": "14h48m13.273480869s"
      }
    },
    {
      "store": {
        "id": 47026,
        "address": "10.33.2.41:20160",
        "version": "5.0.1",
        "status_address": "10.33.2.41:20180",
        "git_hash": "e26389a278116b2f61addfa9f15ca25ecf38bc80",
        "start_timestamp": 1624438931,
        "deploy_path": "/tikv/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1624607689940555143,
        "state_name": "Up"
      },
      "status": {
        "capacity": "197GiB",
        "available": "126.9GiB",
        "used_size": "19.64GiB",
        "leader_count": 962,
        "leader_weight": 1,
        "leader_score": 962,
        "leader_size": 54861,
        "region_count": 1906,
        "region_weight": 1,
        "region_score": 320481.71191295166,
        "region_size": 107217,
        "start_ts": "2021-06-23T17:02:11+08:00",
        "last_heartbeat_ts": "2021-06-25T15:54:49.940555143+08:00",
        "uptime": "46h52m38.940555143s"
      }
    },
    {
      "store": {
        "id": 58198,
        "address": "10.33.2.48:3930",
        "labels": [
          {
            "key": "engine",
            "value": "tiflash"
          }
        ],
        "version": "v5.0.1",
        "peer_address": "10.33.2.48:20170",
        "status_address": "10.33.2.48:20292",
        "git_hash": "1821cf655bc90e1fab6e6154cfe994c19c75d377",
        "start_timestamp": 1624438954,
        "deploy_path": "/tidb-deploy/tiflash-9000/bin/tiflash",
        "last_heartbeat": 1624607692200559639,
        "state_name": "Up"
      },
      "status": {
        "capacity": "16.99GiB",
        "available": "16.99GiB",
        "used_size": "258KiB",
        "leader_count": 0,
        "leader_weight": 1,
        "leader_score": 0,
        "leader_size": 0,
        "region_count": 0,
        "region_weight": 1,
        "region_score": 6602783.587947488,
        "region_size": 0,
        "start_ts": "2021-06-23T17:02:34+08:00",
        "last_heartbeat_ts": "2021-06-25T15:54:52.200559639+08:00",
        "uptime": "46h52m18.200559639s"
      }
    },
    {
      "store": {
        "id": 1,
        "address": "10.33.2.43:20160",
        "state": 1,
        "version": "5.0.1",
        "status_address": "10.33.2.43:20180",
        "git_hash": "e26389a278116b2f61addfa9f15ca25ecf38bc80",
        "start_timestamp": 1624464666,
        "deploy_path": "/tikv/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1624591587889106889,
        "state_name": "Offline"
      },
      "status": {
        "capacity": "39.25GiB",
        "available": "9.206GiB",
        "used_size": "19.25GiB",
        "leader_count": 0,
        "leader_weight": 1,
        "leader_score": 0,
        "leader_size": 0,
        "region_count": 1906,
        "region_weight": 1,
        "region_score": 10729966.729996301,
        "region_size": 107217,
        "start_ts": "2021-06-24T00:11:06+08:00",
        "last_heartbeat_ts": "2021-06-25T11:26:27.889106889+08:00",
        "uptime": "35h15m21.889106889s"
      }
    }
  ]
}
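
With several stores in the output it is easy to miss which one is stuck. The following is a minimal Python sketch, not an official tool, that picks out every store whose state_name is not "Up" and shows how many regions and leaders it still holds. The helper names (parse_pd_json, stores_not_up) and the trimmed SAMPLE are made up for illustration; only the field names come from the output above. It skips any banner text such as the "Starting component ctl" line before parsing.

```python
import json

# Trimmed sample in the shape of the `pd store` output above (illustrative only).
SAMPLE = """Starting component ctl: /root/.tiup/components/ctl/v5.0.1/ctl pd store
{
  "count": 2,
  "stores": [
    {"store": {"id": 47009, "address": "10.33.2.42:20160", "state_name": "Up"},
     "status": {"leader_count": 944, "region_count": 1906}},
    {"store": {"id": 1, "address": "10.33.2.43:20160", "state_name": "Offline"},
     "status": {"leader_count": 0, "region_count": 1906}}
  ]
}
"""


def parse_pd_json(raw: str) -> dict:
    """Parse pd-ctl output, ignoring any banner text before the first '{'."""
    return json.loads(raw[raw.index("{"):])


def stores_not_up(raw: str) -> list:
    """Return a short summary for every store whose state_name is not 'Up'."""
    summaries = []
    for entry in parse_pd_json(raw)["stores"]:
        store, status = entry["store"], entry["status"]
        if store["state_name"] != "Up":
            summaries.append({
                "id": store["id"],
                "address": store["address"],
                "state_name": store["state_name"],
                "region_count": status["region_count"],
                "leader_count": status["leader_count"],
            })
    return summaries


for item in stores_not_up(SAMPLE):
    print(item)
```

In this sample, store 1 has already given up all its leaders (leader_count 0) but still holds 1906 regions, which is the part that has to drain before the scale-in can finish.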
[Symptoms] Application and database symptoms

[Business impact]

[TiDB version]
v5.0.1


Reply from 田帅萌 (TUG group)

Query store 1, wait one minute, then query store 1 again and see whether the values change.

Does region_count get smaller?

For how to query store 1, see:
https://docs.pingcap.com/zh/tidb/stable/pd-control

tiup ctl:v5.0.2 pd -u http://pd:2379 store 1
If region_count is decreasing, just wait a while and it will finish.
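
The check suggested here (query the store, wait, query again, compare region_count) can be sketched in a few lines of Python. This is only an illustration: region_count_trend is a hypothetical helper, and the two snapshots are hand-written dicts in the shape of the store output rather than live data.

```python
def region_count_trend(before: dict, after: dict) -> str:
    """Compare region_count between two snapshots of the same store.

    Both arguments are parsed JSON objects in the shape printed by
    `pd store <id>` (a "status" object containing "region_count").
    """
    old = before["status"]["region_count"]
    new = after["status"]["region_count"]
    if new < old:
        return "decreasing"   # regions are moving off the store; keep waiting
    if new > old:
        return "increasing"   # unexpected for a store that is going offline
    return "stalled"          # nothing moved between the two snapshots


# Hand-written snapshots taken "a minute apart" (illustrative numbers).
first = {"status": {"region_count": 1906}}
second = {"status": {"region_count": 1850}}
print(region_count_trend(first, second))
```

A single "stalled" result is not conclusive on its own, since a minute may be too short on a busy cluster; repeating the comparison a few times gives a clearer picture.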

It moved at first, then stopped. I restarted PD, and now the TiKV store shows 0 (capacity, available, and used_size are all 0B):

[root@zabbixserver2 ~]# tiup ctl:v5.0.1 pd -u http://10.33.2.43:2379 store 1
Starting component ctl: /root/.tiup/components/ctl/v5.0.1/ctl pd -u http://10.33.2.43:2379 store 1
{
  "store": {
    "id": 1,
    "address": "10.33.2.43:20160",
    "state": 1,
    "version": "5.0.1",
    "status_address": "10.33.2.43:20180",
    "git_hash": "e26389a278116b2f61addfa9f15ca25ecf38bc80",
    "start_timestamp": 1624464666,
    "deploy_path": "/tikv/tidb-deploy/tikv-20160/bin",
    "last_heartbeat": 1624591587889106889,
    "state_name": "Offline"
  },
  "status": {
    "capacity": "0B",
    "available": "0B",
    "used_size": "0B",
    "leader_count": 0,
    "leader_weight": 1,
    "leader_score": 0,
    "leader_size": 0,
    "region_count": 1948,
    "region_weight": 1,
    "region_score": 110010,
    "region_size": 110010,
    "start_ts": "2021-06-24T00:11:06+08:00",
    "last_heartbeat_ts": "2021-06-25T11:26:27.889106889+08:00",
    "uptime": "35h15m21.889106889s"
  }
}
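
One thing worth noticing in these outputs: store 1's last_heartbeat_ts is stuck at 11:26:27, while the Up stores in the earlier listing report around 15:54. The last_heartbeat field is the same moment expressed as nanoseconds since the Unix epoch, so the gap can be computed directly. Below is a small Python sketch of that arithmetic; heartbeat_age_seconds and format_age are hypothetical helper names, and using another store's heartbeat as "now" is just a convenient assumption for the example.

```python
NANOS_PER_SECOND = 1_000_000_000


def heartbeat_age_seconds(store_info: dict, now_ns: int) -> int:
    """Whole seconds between a store's last heartbeat and `now_ns`.

    `last_heartbeat` is in nanoseconds since the Unix epoch, which matches
    the last_heartbeat_ts values printed next to it in the outputs above.
    """
    last_ns = store_info["store"]["last_heartbeat"]
    return (now_ns - last_ns) // NANOS_PER_SECOND


def format_age(seconds: int) -> str:
    """Render a duration in seconds as e.g. '4h28m25s'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"


# Values copied from the outputs in this thread.
store_1 = {"store": {"id": 1, "last_heartbeat": 1624591587889106889}}
now_ns = 1624607693273480869  # last_heartbeat of store 47009, used as "now"

age = heartbeat_age_seconds(store_1, now_ns)
print(format_age(age))  # 4h28m25s
```

A store that has not sent a heartbeat for hours is probably not talking to PD at all, which would be consistent with the region count no longer moving; checking the TiKV process and its log on that node is a reasonable next step.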

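I looked at other similar issues. Some say this is caused by insufficient disk space, but I checked and the other TiKV nodes have plenty of free space, so I ruled that out. Could you suggest how to troubleshoot this? Many thanks.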

After I restarted PD and TiKV, the capacity started decreasing again. I'll keep waiting patiently.

Later I observed that the used space is increasing rather than decreasing, and region_count has gone up too. Only region_score is decreasing.

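Problem solved. The troubleshooting steps were:
1. Use the following command to check whether region_count and used_size are decreasing.
tiup ctl:v5.0.1 pd store 68057
2. If the values are not changing and the store is stuck, check whether the TiKV log of that store and the PD log contain errors. In my case, PD first reported that the TiKV nodes being migrated to did not have enough space. After I freed up space, TiKV then reported an error. The fix: restart the stuck TiKV node, then watch whether the TiKV log is printing again. The log shows messages about deleting snapshots, deleting files, and the like. After that, just wait patiently. Once the deletion finishes, the store enters the Tombstone state, and running tiup cluster display on the cluster then shows a hint with the command for removing the node.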
tiup cluster restart tidb-zabbix --node 10.33.2.49:20160
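
For anyone who wants to automate step 1 instead of rerunning the command by hand, here is a rough Python sketch under a few assumptions: tiup is on PATH, the ctl version and PD address are the ones you would type manually, and a store that has finished going offline reports state_name "Tombstone". The function names (fetch_store, wait_for_tombstone) and the thresholds are made up for illustration. The fetcher is passed in as a callable so the polling logic can be tried without a cluster.

```python
import json
import subprocess
import time


def fetch_store(pd_url: str, store_id: int, version: str = "v5.0.1") -> dict:
    """Run `tiup ctl:<version> pd -u <pd_url> store <id>` and parse its JSON."""
    cmd = ["tiup", f"ctl:{version}", "pd", "-u", pd_url, "store", str(store_id)]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    # Skip any banner line such as "Starting component ctl: ..." before the JSON.
    return json.loads(out[out.index("{"):])


def wait_for_tombstone(fetch, max_polls=60, interval_s=60, stall_polls=5,
                       sleep=time.sleep) -> str:
    """Poll a store until it is Tombstone, looks stuck, or polls run out.

    Returns "tombstone", "stalled" (region_count failed to drop for
    `stall_polls` consecutive polls), or "timeout".
    """
    last_count = None
    not_decreasing = 0
    for _ in range(max_polls):
        info = fetch()
        if info["store"]["state_name"] == "Tombstone":  # assumed state name
            return "tombstone"
        count = info["status"]["region_count"]
        if last_count is not None and count >= last_count:
            not_decreasing += 1
        else:
            not_decreasing = 0
        if not_decreasing >= stall_polls:
            return "stalled"  # time to read the TiKV and PD logs (step 2)
        last_count = count
        sleep(interval_s)
    return "timeout"


# Example wiring (not executed here because it needs a live cluster):
# result = wait_for_tombstone(lambda: fetch_store("http://10.33.2.43:2379", 1))
```

A "stalled" result only says the numbers stopped moving; the actual cause still has to come from the TiKV and PD logs as described in step 2.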


cool~

I ran into this too, but even after the delete operation finished, I had to wait a long time.

How can this TiDB scale-in be sped up?
