TiDB 5.4.2: dumpling on a partitioned table with 120 million rows fails with [error="Error 9005: Region is unavailable"]

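Bug report
Describe the problem you found clearly and accurately; any steps that may reproduce it help the developers handle it promptly.
【Impact of the bug】
The full-table backup export aborts abnormally.
【Possible steps to reproduce】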
./dumpling -h 127.0.0.1 -P 4000 -u backup -p wPjNXy9z8lrApDVb -B yixintui_operate --tables-list "yixintui_operate.Agent_material_report_cost" --filetype sql --threads 1 -o ${Bak_dir}/${Ip}/${Port}/${Time} -F 1024MiB --compress gz --params "tidb_distsql_scan_concurrency=1,tidb_mem_quota_query=8589934592" --where=" report_date >='${lastMonth1}'" >> $Bak_log 2
【Unexpected behavior observed】
[2022/08/08 02:46:39.794 +08:00] [WARN] [writer_util.go:181] ["fail to dumping table(chunk), will revert some metrics and start a retry if possible"] [database=yixintui_operate] [table=Agent_material_report_cost] ["finished rows"=2024142] ["finished size"=757458552] [error="Error 9005: Region is unavailable"]
[2022/08/08 02:46:39.796 +08:00] [INFO] [collector.go:203] ["backup failed summary"] [total-ranges=1] [ranges-succeed=0] [ranges-failed=1] [unit-name="dump table data"] [error="Error 9005: Region is unavailable"] [errorVerbose="Error 9005: Region is unavailable\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/juju_adaptor.go:15\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).nextRows.func1\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:87\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).nextRows\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:107\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).Next\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:155\ngithub.com/pingcap/tidb/dumpling/export.WriteInsert\n\tgithub.com/pingcap/tidb/dumpling/export/writer_util.go:233\ngithub.com/pingcap/tidb/dumpling/export.FileFormat.WriteInsert\n\tgithub.com/pingcap/tidb/dumpling/export/writer_util.go:625\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).tryToWriteTableData\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:216\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).WriteTableData.func1\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:201\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\tgithub.com/pingcap/tidb/br/pkg/utils/retry.go:60\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).WriteTableData\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:172\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).handleTask\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:105\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).run\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:85\ngithub.com/pingcap/tidb/dumpling/export.(*Dumper).startWriters.func4\n\tgithub.com/pingcap/tidb/dumpling/export/dump.go:302\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1371"]
[2022/08/08 02:46:39.796 +08:00] [ERROR] [client.go:752] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]
[2022/08/08 02:46:39.797 +08:00] [ERROR] [main.go:76] ["dump failed error stack info"] [error="Error 9005: Region is unavailable"] [errorVerbose="Error 9005: Region is unavailable\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/juju_adaptor.go:15\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).nextRows.func1\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:87\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).nextRows\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:107\ngithub.com/pingcap/tidb/dumpling/export.(*multiQueriesChunkIter).Next\n\tgithub.com/pingcap/tidb/dumpling/export/ir_impl.go:155\ngithub.com/pingcap/tidb/dumpling/export.WriteInsert\n\tgithub.com/pingcap/tidb/dumpling/export/writer_util.go:233\ngithub.com/pingcap/tidb/dumpling/export.FileFormat.WriteInsert\n\tgithub.com/pingcap/tidb/dumpling/export/writer_util.go:625\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).tryToWriteTableData\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:216\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).WriteTableData.func1\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:201\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\tgithub.com/pingcap/tidb/br/pkg/utils/retry.go:60\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).WriteTableData\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:172\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).handleTask\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:105\ngithub.com/pingcap/tidb/dumpling/export.(*Writer).run\n\tgithub.com/pingcap/tidb/dumpling/export/writer.go:85\ngithub.com/pingcap/tidb/dumpling/export.(*Dumper).startWriters.func4\n\tgithub.com/pingcap/tidb/dumpling/export/dump.go:302\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1371"]
dump failed: Error 9005: Region is unavailable

【Related components and versions】
All components are on version 5.4.2.

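Another, smaller table also fails to back up: it has an estimated 2.08 million rows, and the export crashed after a little over a hundred thousand. This table backed up fine a while ago.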

Check the cluster status with tiup cluster display, and see whether you can find the ID of the failing region in tidb.log.
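For anyone following along, here is a rough sketch of that check. The cluster name, the log path, and the PD address are placeholders, and the pattern assumes the region ID shows up in the log as something like regionID=123, region_id=123, or region id: 123; the exact wording in your tidb.log may differ, so adjust the pattern to what you actually see.

```shell
# Pull candidate region IDs out of log text read from stdin.
# ASSUMPTION: IDs appear after "regionID", "region_id" or "region id",
# followed by "=", ":" or spaces. Adapt the pattern to your log format.
extract_region_ids() {
  grep -oiE 'region[ _]?id[=: ]+[0-9]+' | grep -oE '[0-9]+$' | sort -un
}

# Usage (placeholders in angle brackets are hypothetical):
#   tiup cluster display <cluster-name>
#   grep -i 'region' </path/to/tidb.log> | extract_region_ids
# Each ID can then be inspected with pd-ctl, for example:
#   tiup ctl:v5.4.2 pd -u http://<pd-address>:2379 region <region-id>
```

The function only filters text, so it is safe to run against a copy of the log on another machine.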

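I've sent you my contact details by private message.
If I split the export by month with the where condition and back up one month at a time, it does work. (Previously I was backing up a little under two months of data, starting from 2022-07-01.)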
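The month-by-month workaround can be scripted roughly as follows. This is only a sketch: it prints the dumpling commands instead of running them, the output root and the partNN directory naming are made up for illustration, and the password and the other options from the original command are left for you to add back.

```shell
# Print one --where predicate per calendar month, starting at YYYY-MM.
# Function bodies use ( ) so their variables stay local to the call.
month_predicates() (
  year=${1%-*}
  month=${1#*-}
  month=${month#0}   # strip a leading zero so 08 and 09 are not read as octal
  i=0
  while [ "$i" -lt "$2" ]; do
    next_year=$year
    next_month=$((month + 1))
    if [ "$next_month" -gt 12 ]; then
      next_month=1
      next_year=$((year + 1))
    fi
    printf "report_date >= '%04d-%02d-01' AND report_date < '%04d-%02d-01'\n" \
      "$year" "$month" "$next_year" "$next_month"
    year=$next_year
    month=$next_month
    i=$((i + 1))
  done
)

# Dry run: print one dumpling command per month.
# $1 = output root (hypothetical), $2 = first month as YYYY-MM, $3 = number of months.
print_monthly_dumps() (
  n=0
  month_predicates "$2" "$3" | while IFS= read -r predicate; do
    n=$((n + 1))
    printf './dumpling -h 127.0.0.1 -P 4000 -u backup -B yixintui_operate --tables-list "yixintui_operate.Agent_material_report_cost" --filetype sql --threads 1 -o %s/part%02d --where "%s"\n' \
      "$1" "$n" "$predicate"
  done
)

# Example: print_monthly_dumps /data/backup 2022-07 2
```

Each window is half-open (>= first day of the month, < first day of the next month), so no row is exported twice and none falls between two windows.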

Why does dumpling insist on adding a condition on _tidb_rowid?
The execution plan goes through TiFlash; is that related?
SELECT * FROM yixintui_operate.Agent_material_report_cost WHERE ( report_date >='2022-07-01') AND (_tidb_rowid>=2305843016533029100 and _tidb_rowid<2305843016533314278) ORDER BY _tidb_rowid
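One way to confirm whether a chunk query like this is really served by TiFlash is to run EXPLAIN on it and look at the task column, where TiDB marks TiFlash operators with a [tiflash] suffix (for example cop[tiflash]). A small sketch, assuming the mysql client is available and reading the plan text from stdin; the helper name is made up:

```shell
# Succeeds (exit status 0) if the EXPLAIN output on stdin contains a TiFlash task.
plan_uses_tiflash() {
  grep -qE '\[tiflash\]'
}

# Usage (connection details follow the dumpling command in this thread):
#   mysql -h 127.0.0.1 -P 4000 -u backup -p -e "EXPLAIN SELECT * FROM yixintui_operate.Agent_material_report_cost WHERE (report_date >= '2022-07-01') AND (_tidb_rowid >= 2305843016533029100 AND _tidb_rowid < 2305843016533314278) ORDER BY _tidb_rowid" \
#     | plan_uses_tiflash && echo "plan reads from TiFlash"
#
# To compare against a TiKV-only plan in the same mysql session, the engines can be
# restricted first:
#   SET SESSION tidb_isolation_read_engines = 'tikv,tidb';
```

If the failure goes away with TiFlash excluded, that would point at the TiFlash replica; this thread does not establish that either way.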

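dumpling splits the work into parallel tasks on a numeric column; if the table has something like a primary key ID, it uses that. My guess is that the export is putting too much load on the cluster. Try lowering -F 1024MiB and see.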
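To illustrate the idea of splitting on a numeric column, here is a simplified sketch that cuts a _tidb_rowid range into equal-width chunks and prints predicates of the same shape as the query above. It is not dumpling's actual splitting code, and the function name and the equal-width policy are assumptions made for illustration only.

```shell
# Split the half-open range [lo, hi) into at most `chunks` equal-width pieces.
# $1 = lo, $2 = hi, $3 = chunks (must be a positive integer).
rowid_chunks() (
  lo=$1
  hi=$2
  chunks=$3
  step=$(( (hi - lo + chunks - 1) / chunks ))   # ceiling division
  start=$lo
  while [ "$start" -lt "$hi" ]; do
    end=$((start + step))
    if [ "$end" -gt "$hi" ]; then
      end=$hi
    fi
    printf '(_tidb_rowid>=%s and _tidb_rowid<%s)\n' "$start" "$end"
    start=$end
  done
)

# Example: rowid_chunks 0 10 3
```

Each predicate can be combined with the user's own --where condition, which is why the chunk query carries both the report_date filter and a _tidb_rowid range.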

After removing these two parameters, the export still failed after a bit over 19 million rows:
--params "tidb_distsql_scan_concurrency=1,tidb_mem_quota_query=8589934592"

Check Region health in the PD monitoring and see whether any regions are in an abnormal state.
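If the Grafana panel is not handy, pd-ctl can list regions in the common abnormal states from the command line. A sketch that only prints the commands so they can be reviewed first; the PD address in the example is a placeholder, and the helper name is made up:

```shell
# Print one pd-ctl "region check" command per abnormal peer state.
# $1 = cluster version tag (e.g. v5.4.2), $2 = PD endpoint URL.
region_check_commands() (
  for state in miss-peer extra-peer down-peer pending-peer; do
    printf 'tiup ctl:%s pd -u %s region check %s\n' "$1" "$2" "$state"
  done
)

# Example (the address is a placeholder):
#   region_check_commands v5.4.2 http://127.0.0.1:2379
# Review the printed commands, then pipe them to sh to run them:
#   region_check_commands v5.4.2 http://127.0.0.1:2379 | sh
```

Any region IDs that show up here can be cross-checked against the ones found in tidb.log.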

The export load looks quite low. This TiDB cluster is basically dedicated to database backups and runs no production traffic.

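I'll try changing the thread count to 2, or back to the default of 4, and see whether the backup succeeds.
It succeeded with 2 threads. I'll try the default 4 threads next.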


The default 4 threads also works.

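Hi, does cluster performance fluctuate during a backup? I backed up a table of a few million rows directly on the production cluster; the backup took a bit over ten seconds, and queries against other tables also stalled for that long.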

I usually run backups in the early morning, so I don't pay much attention to this. I also use a dedicated TiDB instance for backups.
