TiDB: error when restoring a cluster with BR

【TiDB environment】
TiDB v4.0.9
BR v4.0.9

【Overview】

I backed up the cluster data with the BR command-line tool and then restored it into a different cluster environment.

【Problem】
Error: failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch

Error log:
[2021/12/29 22:20:23.856 +00:00] [INFO] [domain.go:622] ["domain closed"] ["take time"=7.982087774s]
[2021/12/29 22:20:23.863 +00:00] [INFO] [collector.go:188] ["Database restore Failed summary : total restore files: 4803, total success: 4803, total failed: 0"] ["split region"=2h46m46.681634392s] ["restore checksum"=42h48m33.103512213s] ["restore ranges"=4044] [Size=24012833189]
[2021/12/29 22:20:23.864 +00:00] [ERROR] [restore.go:24] ["failed to restore"] [error="failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch"] [errorVerbose="[BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch\nfailed to validate checksum\ngithub.com/pingcap/br/pkg/restore.(*Client).execChecksum\n\tgithub.com/pingcap/br@/pkg/restore/client.go:796\ngithub.com/pingcap/br/pkg/restore.(*Client).GoValidateChecksum.func1.2\n\tgithub.com/pingcap/br@/pkg/restore/client.go:742\ngithub.com/pingcap/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\tgithub.com/pingcap/br@/pkg/utils/worker.go:63\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357"] [stack="github.com/pingcap/br/cmd.runRestoreCommand\n\tgithub.com/pingcap/br@/cmd/restore.go:24\ngithub.com/pingcap/br/cmd.newDBRestoreCommand.func1\n\tgithub.com/pingcap/br@/cmd/restore.go:106\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:887\nmain.main\n\tgithub.com/pingcap/br@/main.go:58\nruntime.main\n\truntime/proc.go:203"]
[2021/12/29 22:20:23.864 +00:00] [ERROR] [main.go:59] ["br failed"] [error="failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch"] [errorVerbose="[BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch\nfailed to validate checksum\ngithub.com/pingcap/br/pkg/restore.(*Client).execChecksum\n\tgithub.com/pingcap/br@/pkg/restore/client.go:796\ngithub.com/pingcap/br/pkg/restore.(*Client).GoValidateChecksum.func1.2\n\tgithub.com/pingcap/br@/pkg/restore/client.go:742\ngithub.com/pingcap/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\tgithub.com/pingcap/br@/pkg/utils/worker.go:63\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357"] [stack="main.main\n\tgithub.com/pingcap/br@/main.go:59\nruntime.main\n\truntime/proc.go:203"]
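
A note on reading the log: the summary line reports 4803 of 4803 files restored with 0 failed, so the failure is raised by the checksum validation step rather than by the file restore itself. Below is a minimal Python sketch for pulling those counters out of a BR log line. The function parse_restore_summary and its regular expressions are hypothetical helpers written for this thread, not part of BR, and they assume the summary format seen in this v4.0.9 log.

```python
import re

# Summary line copied from the log in this thread (quotes written as ASCII).
SUMMARY_LINE = (
    '[2021/12/29 22:20:23.863 +00:00] [INFO] [collector.go:188] '
    '["Database restore Failed summary : total restore files: 4803, '
    'total success: 4803, total failed: 0"] '
    '["split region"=2h46m46.681634392s] '
    '["restore checksum"=42h48m33.103512213s] '
    '["restore ranges"=4044] [Size=24012833189]'
)

# Assumed format of the counters and size field, based only on the line above.
COUNTS_RE = re.compile(
    r"total restore files: (\d+), total success: (\d+), total failed: (\d+)"
)
SIZE_RE = re.compile(r"\[Size=(\d+)\]")


def parse_restore_summary(line):
    """Return file counters and size from a BR restore summary line, or None."""
    counts = COUNTS_RE.search(line)
    if counts is None:
        return None
    total, success, failed = (int(g) for g in counts.groups())
    size = SIZE_RE.search(line)
    return {
        "total_files": total,
        "success": success,
        "failed": failed,
        "size_bytes": int(size.group(1)) if size else None,
    }


summary = parse_restore_summary(SUMMARY_LINE)
print(summary)
```

If failed is 0 while BR still exits with ErrRestoreChecksumMismatch, the data files were written and the mismatch has to be investigated at the table checksum level.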



Is the new_collations_enabled_on_first_bootstrap parameter set to the same value on both clusters?

Yes, it is the same. Both are False.
mysql> SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME='new_collation_enabled';
+----------------+
| VARIABLE_VALUE |
+----------------+
| False          |
+----------------+
1 row in set (0.00 sec)

Did the new cluster already contain databases or tables with the same names before the restore?

No. Neither the databases nor the tables had been created.

Is the tidb_enable_clustered_index parameter set to the same value on the source and target clusters?

Also, are the TiDB versions of the source and target clusters the same?

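Yes, they are the same. I set up this cluster environment specifically for the BR restore, and it closely replicates the source.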
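
The checks asked about in this thread (version, new_collation_enabled, and so on) all amount to comparing the same settings on the source and target clusters. As a rough sketch, once the values have been collected by hand from each cluster, a small Python helper can list any that differ. The function diff_settings and the dictionary keys are hypothetical names chosen for illustration; the sample values are the ones reported in this thread.

```python
def diff_settings(source, target):
    """Return {name: (source_value, target_value)} for settings that differ."""
    names = sorted(set(source) | set(target))
    return {
        name: (source.get(name), target.get(name))
        for name in names
        if source.get(name) != target.get(name)
    }


# Values reported in this thread; collect them manually from each cluster.
source_cluster = {"version": "v4.0.9", "new_collation_enabled": "False"}
target_cluster = {"version": "v4.0.9", "new_collation_enabled": "False"}

print(diff_settings(source_cluster, target_cluster))  # prints {}
```

An empty result means the listed settings match, which is consistent with the answers given so far and points away from a configuration mismatch as the cause.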

Hi, does the problem still occur after retrying?