[FATAL] [server.rs:428] ["panic_mark_file /data12/tidb/data/tikv-20162/panic_mark_file exists, there must be something wrong with the db. Do not remove the panic_mark_file and force the TiKV node to restart. Please contact TiKV maintainers to investigate the issue. If needed, use scale in and scale out to replace the TiKV node
[Attachments: screenshots/logs/monitoring]
Just scale out first and then scale in. That is the safest approach, and it is also quick.
use scale in and scale out to replace the TiKV node
The message says to scale in first and then scale out. You can give that a try.
[2024/05/22 12:07:06.165 +08:00] [INFO] [region_cache.go:2377] ["[health check] check health error"] [store=10.114.26.112:20162] [error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.114.26.112:20162: connect: connection refused""]
[2024/05/22 12:07:06.165 +08:00] [INFO] [region_request.go:785] ["mark store's regions need be refill"] [id=183060412] [addr=10.114.26.112:20162] [error="context deadline exceeded"]
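The first error log already suggests replacing the TiKV node by scaling in and then scaling out.
Handle it by scaling in and scaling out.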
region_cache.go:2377] ["[health check] check health error"] [store=10.114.26.112:20161] [error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.114.26.112:20161: connect: connection refused""]
port conflict for '20162' between 'tikv_servers:10.114.26.112.port'
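What did your deployment look like before the problem appeared?
Can you post a screenshot of your machine configuration? How many TiKV instances do you have?
It was fine before, with 7 machines. Yesterday one of the machines went down, and today it will not restart. Scaling out also reports a port conflict: "code": 1, "error": "port conflict for '20162' between 'tikv_servers:10.114.26.112.port' and 'tikv_servers:10.114.26.112.port'"}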
use scale in and scale out to replace the TiKV node
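Taking the TiKV node offline failed: linux/x86_64 Pending Offline
If you scale out on the same machine, you have to use a different port. If you have 7 TiKV nodes and one of them panicked, you can try scaling in first and then scaling out.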
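To illustrate why the port has to change when scaling out on the same machine, here is a minimal Python sketch. It assumes the conflict check simply compares the (host, port) pairs of instances already in the topology against the ones being added. This is not TiUP's actual implementation; `find_port_conflicts` is a hypothetical helper, and port 20163 is just an example value that does not appear in the logs above.

```python
def find_port_conflicts(existing, new):
    """Return the (host, port) pairs in `new` that are already taken.

    A pair counts as taken if it appears in `existing` or earlier in `new`.
    """
    used = set(existing)
    conflicts = []
    for instance in new:
        if instance in used:
            conflicts.append(instance)
        used.add(instance)
    return conflicts


# Instances on the affected host, taken from the store addresses in the logs.
existing = [("10.114.26.112", 20161), ("10.114.26.112", 20162)]

# Re-adding the same host and port collides with the old instance.
print(find_port_conflicts(existing, [("10.114.26.112", 20162)]))

# A different port on the same host (20163 is a hypothetical choice) does not.
print(find_port_conflicts(existing, [("10.114.26.112", 20163)]))
```

Under this assumption, as long as the old 10.114.26.112:20162 instance is still listed in the topology (for example while it is Pending Offline), adding a new instance on the same host and port would be rejected, while a different port would pass.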
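Scaling in first and then scaling out is worth a try. I ran into this once before, and that is how it was resolved.
I am wondering: after scaling out, can the faulty TiKV instance actually be scaled in successfully?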