9005 - Region is unavailable

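[TiDB environment] Production
[TiDB version] 8.1.0
[Reproduction path] The cluster originally had 3 TiDB servers. One of them (hosting 2 TiKV nodes) ran out of disk space, which crashed the server. After we restarted the server and its services, everything ran without problems for a while. We later added a new TiDB server (with one new TiKV node), so we wanted to shut down one TiKV node on the server that hosts two and scale in. During the scale-in we found that some Regions could not be migrated manually because they have no leader. The error is: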
cannot build operator for region with no leader
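
The message suggests PD has no leader recorded for those Regions, so a first step is usually to list exactly which ones are affected. Below is a minimal sketch, not an official tool: it assumes you have saved the JSON printed by pd-ctl's `region` command, and that the output has a top-level `regions` array whose entries carry `id`, `peers`, and an optional `leader` object. Those field names are my assumption from typical pd-ctl output, so check them against your version. The function name and sample data are hypothetical.

```python
import json

def regions_without_leader(pd_region_json: str) -> list[int]:
    """Return the IDs of Regions whose 'leader' is missing or empty."""
    doc = json.loads(pd_region_json)
    leaderless = []
    for region in doc.get("regions", []):
        leader = region.get("leader") or {}
        # Assumption: a leaderless Region either omits 'leader'
        # or reports it without a usable store_id.
        if not leader.get("store_id"):
            leaderless.append(region["id"])
    return leaderless

# Hypothetical sample shaped like pd-ctl output, for illustration only.
sample = json.dumps({
    "count": 3,
    "regions": [
        {"id": 101, "peers": [{"id": 1, "store_id": 1}],
         "leader": {"id": 1, "store_id": 1}},
        {"id": 102, "peers": [{"id": 2, "store_id": 4}]},
        {"id": 103, "peers": [{"id": 3, "store_id": 5}], "leader": {}},
    ],
})

print(regions_without_leader(sample))
```

With the real output you would pass the file contents instead of `sample`, then inspect the listed Regions one by one in pd-ctl.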
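[Problem: symptoms and impact]
Because of this, the database can no longer return the full data of one business-critical table, and the business is affected. How can we restore service now? Failing that, is there a way to take a complete backup of the data? At the moment a full backup also fails with "9005 - Region is unavailable".
[Resource configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of that page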

[Attachments: screenshots/logs/monitoring]

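First confirm whether this Region is an index Region. If it is, things are not too bad: you can get the full data through the TiKV scan API. If it is not, then scan is not an option.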

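How to confirm: if it is an index Region, a count query goes through the index and returns the wrong number of rows, but when you look up one of the missing rows with where ID=** the row can be found. That lookup reads from TiKV by row ID and does not use the index.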
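
To narrow down which part of the table is affected, the same idea can be applied range by range instead of row by row. The sketch below is only an illustration under assumptions: `probe_ranges` and `fake_fetch` are hypothetical names, and in practice `fetch` would run something like a `WHERE id >= start AND id < end` query through your MySQL client. Whether such a range read avoids the broken Region depends on which Region is actually unavailable.

```python
def probe_ranges(lo, hi, step, fetch):
    """Call fetch(start, end) for each half-open range [start, end).

    Returns (ok, failed): ranges that were read successfully, and
    ranges that raised, together with the error text.
    """
    ok, failed = [], []
    for start in range(lo, hi, step):
        end = min(start + step, hi)
        try:
            fetch(start, end)
            ok.append((start, end))
        except Exception as exc:  # e.g. the client's error for code 9005
            failed.append((start, end, str(exc)))
    return ok, failed

def fake_fetch(start, end):
    """Stand-in for a real query; pretends IDs in [250, 300) are unreadable."""
    if start < 300 and end > 250:
        raise RuntimeError("Region is unavailable")

ok, failed = probe_ranges(0, 400, 100, fake_fetch)
print(ok)
print(failed)
```

Ranges that succeed can be exported right away, and the failing ranges tell you which key span to look up in pd-ctl.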

Thanks. I just tried a backup with Dumpling and it failed as well. Here is the log:
[2025/01/10 21:11:56.136 +08:00] [ERROR] [main.go:78] ["dump failed error stack info"] [error="Error 9005 (HY000): Region is unavailable"] [errorVerbose="Error 9005 (HY000): Region is unavailable\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/errors.go:178\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/juju_adaptor.go:15\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).nextRows.func1\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:87\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).nextRows\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:99\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).Next\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:153\ngithub.com/pingcap/tidb/dumpling/export.WriteInsert\n\tgithub.com/pingcap/tidb/dumpling/export/writer_util.go:247\ngithub.com/pingcap/tidb/dumpling/export.FileFormat.WriteInsert\n\tgithub.com/pingcap/tidb/dumpling/export/writer_util.go:668\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).tryToWriteTableData\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:243\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).WriteTableData.func1\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:228\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry.func1\n\tgithub.com/pingcap/tidb/br/pkg/utils/retry.go:216\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetryV2[…]\n\tgithub.com/pingcap/tidb/br/pkg/utils/retry.go:234\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\tgithub.com/pingcap/tidb/br/pkg/utils/retry.go:215\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).WriteTableData\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:192\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).handleTask\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:115\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).run\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:93\ngithub.com/pingcap/tidb/dumpling/export.(*Dumper).startWriters.func4\n\tgithub.com/pingcap/tidb/dumpling/export/dump.go:376\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.7.0/errgroup/errgroup.go:78\nruntime.goexit\n\truntime/asm_amd64.s:1650"]

dump failed: Error 9005 (HY000): Region is unavailable

Running a count on this table also returns the Region unavailable error.

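Before you do any of this, first free up some disk space on the server that filled up, and only then carry out the operations below.
Otherwise you will run into all sorts of problems.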


Take a look at the column article for reference. Most likely this was caused by the server crash and restart.


If that does not work, use the tiup ctl tool to intervene manually and force PD to reschedule the Regions.

Run region check miss-peer in pd-ctl to see which Regions are missing replicas.
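
If that check returns a long list, a small script can summarize it. This is a rough sketch under assumptions: it expects the JSON output of the check saved as text, with a top-level `regions` array whose entries have `id` and `peers` (each peer carrying a `store_id`). Please verify those field names against your pd-ctl version. `summarize_miss_peer` and the sample data are hypothetical, and `expected_replicas=3` assumes the default replica count has not been changed.

```python
import json
from collections import Counter

def summarize_miss_peer(output_json: str, expected_replicas: int = 3):
    """Return (short, stores) for Regions with fewer peers than expected.

    short  maps Region ID -> number of peers it still has.
    stores maps store ID  -> how many of those Regions keep a peer there.
    """
    doc = json.loads(output_json)
    short = {}
    stores = Counter()
    for region in doc.get("regions", []):
        peers = region.get("peers", [])
        if len(peers) < expected_replicas:
            short[region["id"]] = len(peers)
            stores.update(peer["store_id"] for peer in peers)
    return short, dict(stores)

# Hypothetical sample shaped like pd-ctl output, for illustration only.
sample_check = json.dumps({
    "count": 3,
    "regions": [
        {"id": 201, "peers": [{"id": 11, "store_id": 1},
                              {"id": 12, "store_id": 4}]},
        {"id": 202, "peers": [{"id": 21, "store_id": 4}]},
        {"id": 203, "peers": [{"id": 31, "store_id": 1},
                              {"id": 32, "store_id": 4},
                              {"id": 33, "store_id": 5}]},
    ],
})

short, stores = summarize_miss_peer(sample_check)
print(short)
print(stores)
```

Regions that are down to a single peer deserve attention first, and the per-store tally shows which TiKV stores still hold the surviving copies.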