【TiDB environment】Production / Test / PoC
Production
【TiDB version】
4.0.15
【Problem encountered】
[2022/06/07 09:30:36.995 +08:00] [ERROR] [restore.go:34] [“failed to restore”] [error=“split region failed: err=message:"Coprocessor [components/raftstore/src/coprocessor/split_observer.rs:154]: no valid key found for split." : [BR:Restore:ErrRestoreSplitFailed]fail to split region”] [errorVerbose=“[BR:Restore:ErrRestoreSplitFailed]fail to split region
split region failed: err=message:"Coprocessor [components/raftstore/src/coprocessor/split_observer.rs:154]: no valid key found for split."
github.com/pingcap/br/pkg/restore.(*pdClient).sendSplitRegionRequest
\tgithub.com/pingcap/br@/pkg/restore/split_client.go:287
github.com/pingcap/br/pkg/restore.(*pdClient).BatchSplitRegionsWithOrigin
\tgithub.com/pingcap/br@/pkg/restore/split_client.go:332
github.com/pingcap/br/pkg/restore.(*pdClient).BatchSplitRegions
\tgithub.com/pingcap/br@/pkg/restore/split_client.go:371
github.com/pingcap/br/pkg/restore.(*RegionSplitter).splitAndScatterRegions
\tgithub.com/pingcap/br@/pkg/restore/split.go:271
github.com/pingcap/br/pkg/restore.(*RegionSplitter).Split
\tgithub.com/pingcap/br@/pkg/restore/split.go:131
github.com/pingcap/br/pkg/restore.SplitRanges
\tgithub.com/pingcap/br@/pkg/restore/util.go:390
github.com/pingcap/br/pkg/restore.(*tikvSender).splitWorker
\tgithub.com/pingcap/br@/pkg/restore/pipeline_items.go:234
runtime.goexit
\truntime/asm_amd64.s:1357”] [stack=“github.com/pingcap/br/cmd.runRestoreCommand
\tgithub.com/pingcap/br@/cmd/restore.go:34
github.com/pingcap/br/cmd.newDBRestoreCommand.func1
\tgithub.com/pingcap/br@/cmd/restore.go:131
github.com/spf13/cobra.(*Command).execute
\tgithub.com/spf13/cobra@v1.0.0/command.go:842
github.com/spf13/cobra.(*Command).ExecuteC
\tgithub.com/spf13/cobra@v1.0.0/command.go:950
github.com/spf13/cobra.(*Command).Execute
\tgithub.com/spf13/cobra@v1.0.0/command.go:887
main.main
\tgithub.com/pingcap/br@/main.go:58
runtime.main
\truntime/proc.go:203”]
[2022/06/07 09:30:36.995 +08:00] [ERROR] [main.go:59] [“br failed”] [error=“split region failed: err=message:"Coprocessor [components/raftstore/src/coprocessor/split_observer.rs:154]: no valid key found for split." : [BR:Restore:ErrRestoreSplitFailed]fail to split region”] [errorVerbose=“[BR:Restore:ErrRestoreSplitFailed]fail to split region
split region failed: err=message:"Coprocessor [components/raftstore/src/coprocessor/split_observer.rs:154]: no valid key found for split."
github.com/pingcap/br/pkg/restore.(*pdClient).sendSplitRegionRequest
\tgithub.com/pingcap/br@/pkg/restore/split_client.go:287
github.com/pingcap/br/pkg/restore.(*pdClient).BatchSplitRegionsWithOrigin
\tgithub.com/pingcap/br@/pkg/restore/split_client.go:332
github.com/pingcap/br/pkg/restore.(*pdClient).BatchSplitRegions
\tgithub.com/pingcap/br@/pkg/restore/split_client.go:371
github.com/pingcap/br/pkg/restore.(*RegionSplitter).splitAndScatterRegions
\tgithub.com/pingcap/br@/pkg/restore/split.go:271
github.com/pingcap/br/pkg/restore.(*RegionSplitter).Split
\tgithub.com/pingcap/br@/pkg/restore/split.go:131
github.com/pingcap/br/pkg/restore.SplitRanges
\tgithub.com/pingcap/br@/pkg/restore/util.go:390
github.com/pingcap/br/pkg/restore.(*tikvSender).splitWorker
\tgithub.com/pingcap/br@/pkg/restore/pipeline_items.go:234
runtime.goexit
\truntime/asm_amd64.s:1357”] [stack=“main.main
\tgithub.com/pingcap/br@/main.go:59
runtime.main
\truntime/proc.go:203”]
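Both entries above carry the same information: a BR error code and the TiKV source location that rejected the split. A minimal sketch for pulling those fields out of log lines in this format is shown below. The function name `summarize_entry` and the regular expressions are illustrative assumptions and are not part of BR. `SAMPLE` is the first entry above, truncated after the `error=` field.

```python
import re

# Leading "[timestamp] [LEVEL] [file:line]" fields of a log entry.
HEADER_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]+)\]"
)
# BR error codes such as [BR:Restore:ErrRestoreSplitFailed].
ERROR_CODE_RE = re.compile(r"\[(BR:[A-Za-z]+:[A-Za-z]+)\]")
# TiKV source locations such as [components/.../split_observer.rs:154].
TIKV_LOC_RE = re.compile(r"\[(components/[^\]]+\.rs:\d+)\]")

# First log entry from this post, truncated after the error= field.
SAMPLE = (
    '[2022/06/07 09:30:36.995 +08:00] [ERROR] [restore.go:34] '
    '[“failed to restore”] [error=“split region failed: err=message:'
    '"Coprocessor [components/raftstore/src/coprocessor/split_observer.rs:154]: '
    'no valid key found for split." : '
    '[BR:Restore:ErrRestoreSplitFailed]fail to split region”]'
)


def summarize_entry(line):
    """Return header fields, BR error codes and TiKV locations of one log line.

    Returns None for continuation lines (stack frames) that do not start
    a new log entry.
    """
    header = HEADER_RE.match(line)
    if header is None:
        return None
    return {
        "timestamp": header.group("ts"),
        "level": header.group("level"),
        "source": header.group("source"),
        "error_codes": sorted(set(ERROR_CODE_RE.findall(line))),
        "tikv_locations": sorted(set(TIKV_LOC_RE.findall(line))),
    }


if __name__ == "__main__":
    print(summarize_entry(SAMPLE))
```

Run over a whole log file line by line, this skips the stack-frame lines and keeps one summary per entry, which makes it easier to see that both failures report the same error code and the same TiKV location.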
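【Reproduction steps】What operations led to the problem
- June 4, 20:00: took a full backup.
- June 6, 10:00: started restoring the full backup; the restore finished in the early hours of the next day.
- June 7, 10:00: started the incremental restore (the increment from June 4, 20:00 to June 6, 17:00), which failed with the error in the log above.
At that time the GC time was set fairly long, at 168h (7 days).
After some discussion, it was suggested that the overly long GC setting might be causing the problem, so I reduced the GC time and ran a compaction. The disks then filled up and the cluster went down, so I rebuilt the cluster and redid the full restore.
After the second full restore completed, the GC-related information was as follows:
- June 9, shortly after 10:00: started the incremental restore again, and it failed with the same error.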