Data migration errors out and stops

To get a faster response, please provide the following information when asking a question; a clearly described problem can be prioritized.

  • [TiDB version]: 4.0.4, deployed with TiUP
  • [Problem description]: data migration errors out and stops

```
[2020/10/23 22:17:47.715 +08:00] [ERROR] [local.go:1012] ["split & scatter ranges failed"] [error="split region failed: region=id:11153 start_key:"t\200\000\000\000\000\000\006\377\225_r\2050\253\260\222\377\221@\000\000\000\000\000\000\372" end_key:"t\200\000\000\000\000\000\006\377\225_r\2050\270\225\373\377\221@\000\000\000\000\000\000\372" region_epoch:<conf_ver:5 version:908 > peers:<id:11154 store_id:1 > peers:<id:11155 store_id:4 > peers:<id:11156 store_id:5 > , err=message:"EpochNotMatch [region 11153] 11155 epoch changed conf_ver: 5 version: 909 != conf_ver: 5 version: 908, retry later" epoch_not_match:<current_regions:<id:11153 start_key:"t\200\000\000\000\000\000\006\377\225_r\2050\260.\372\377\021@\003\000\000\000\000\000\372" end_key:"t\200\000\000\000\000\000\006\377\225_r\2050\270\225\373\377\221@\000\000\000\000\000\000\372" region_epoch:<conf_ver:5 version:909 > peers:<id:11154 store_id:1 > peers:<id:11155 store_id:4 > peers:<id:11156 store_id:5 > > > "] [errorVerbose="split region failed: region=id:11153 start_key:"t\200\000\000\000\000\000\006\377\225_r\2050\253\260\222\377\221@\000\000\000\000\000\000\372" end_key:"t\200\000\000\000\000\000\006\377\225_r\2050\270\225\373\377\221@\000\000\000\000\000\000\372" region_epoch:<conf_ver:5 version:908 > peers:<id:11154 store_id:1 > peers:<id:11155 store_id:4 > peers:<id:11156 store_id:5 > , err=message:"EpochNotMatch [region 11153] 11155 epoch changed conf_ver: 5 version: 909 != conf_ver: 5 version: 908, retry later" epoch_not_match:<current_regions:<id:11153 start_key:"t\200\000\000\000\000\000\006\377\225_r\2050\260.\372\377\021@\003\000\000\000\000\000\372" end_key:"t\200\000\000\000\000\000\006\377\225_r\2050\270\225\373\377\221@\000\000\000\000\000\000\372" region_epoch:<conf_ver:5 version:909 > peers:<id:11154 store_id:1 > peers:<id:11155 store_id:4 > peers:<id:11156 store_id:5 > > > \ngithub.com/pingcap/br/pkg/restore.(*pdClient).sendSplitRegionRequest\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/pkg/mod/github.com/pingcap/br@v0.0.0-20200909093836-36281d93ab13/pkg/restore/split_client.go:273\ngithub.com/pingcap/br/pkg/restore.(*pdClient).BatchSplitRegions\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/pkg/mod/github.com/pingcap/br@v0.0.0-20200909093836-36281d93ab13/pkg/restore/split_client.go:319\ngithub.com/pingcap/tidb-lightning/lightning/backend.(*local).BatchSplitRegions\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/src/github.com/pingcap/tidb-lightning/lightning/backend/localhelper.go:156\ngithub.com/pingcap/tidb-lightning/lightning/backend.(*local).SplitAndScatterRegionByRanges\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/src/github.com/pingcap/tidb-lightning/lightning/backend/localhelper.go:68\ngithub.com/pingcap/tidb-lightning/lightning/backend.(*local).ImportEngine\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/src/github.com/pingcap/tidb-lightning/lightning/backend/local.go:1010\ngithub.com/pingcap/tidb-lightning/lightning/backend.(*ClosedEngine).Import\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/src/github.com/pingcap/tidb-lightning/lightning/backend/backend.go:329\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*TableRestore).importKV\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:1582\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*TableRestore).importEngine\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:1145\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*TableRestore).restoreEngines.func1\n\t/home/jenkins/agent/workspace/ld_lightning_multi_branch_v4.0.7/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:948\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"]
[2020/10/23 22:17:47.715 +08:00] [WARN] [backend.go:334] ["import spuriously failed, going to retry again"] [engineTag=top_thread_game3.top_thread_game_order:1] [engineUUID=0d1460ea-21f1-529b-aae1-c51ba6e46f3b] [retryCnt=0] [error="split region failed: region=id:11153 start_key:"t\200\000\000\000\000\000\006\377\225_r\2050\253\260\222\377\221@\000\000\000\000\000\000\372" end_key:"t\200\000\000\000\000\000\006\377\225_r\2050\270\225\373\377\221@\000\000\000\000\000\000\372" region_epoch:<conf_ver:5 version:908 > peers:<id:11154 store_id:1 > peers:<id:11155 store_id:4 > peers:<id:11156 store_id:5 > , err=message:"EpochNotMatch [region 11153] 11155 epoch changed conf_ver: 5 version: 909 != conf_ver: 5 version: 908, retry later" epoch_not_match:<current_regions:<id:11153 start_key:"t\200\000\000\000\000\000\006\377\225_r\2050\260.\372\377\021@\003\000\000\000\000\000\372" end_key:"t\200\000\000\000\000\000\006\377\225_r\2050\270\225\373\377\221@\000\000\000\000\000\000\372" region_epoch:<conf_ver:5 version:909 > peers:<id:11154 store_id:1 > peers:<id:11155 store_id:4 > peers:<id:11156 store_id:5 > > > "]
```

Hello,

Please briefly describe the current state of the cluster, and provide the name and version of the import tool you are using.

```toml
[lightning]
#region-concurrency =
level = "info"
file = "tidb-lightning.log"

[tikv-importer]
backend = "local"
sorted-kv-dir = "/root/kong"

[mydumper]
data-source-dir = "/root/backup-tidb"

[tidb]
host = "192.168.18.67"
port = 4000
user = "root"
password = "top789456"
status-port = 10080
pd-addr = "192.168.18.67:2379"
```

The cluster was installed with TiUP and was just initialized; all components are in the Up state.
The tool is this one: wget https://download.pingcap.org/tidb-toolkit-v4.0.3-linux-amd64.tar.gz

Since you are using Lightning's local backend mode, please upgrade the toolkit to v4.0.7 and see whether that resolves the problem.

Please also share lightning.log so we can check the context around the error in the log.

cuowu.log (225.3 KB)

I could only download this one: wget https://download.pingcap.org/tidb-toolkit-v4.0.7-linux-amd64.tar.gz

Could you also upload the PD and TiKV logs from the corresponding time period?

Uh, the logs from that time are gone. Could you help with a rough analysis of where the problem might be? At the time it was a freshly initialized cluster, 4 machines in total: one control node and the other three running the database.

Then could you describe this: was the data backed up with mydumper? Can you tell us the size of the generated files? It would be best to paste the command. If the backup was made with mydumper, we suggest increasing the mydumper.batch-size parameter (the default is 10G).

The command is as follows: `./bin/mydumper -h 127.0.0.1 -P 4000 -u root -F 256m -f '*.*' -f '!top_log*.*' --filetype csv --threads 32 --skip-tz-utc -o /data/my_database/`. The largest file shown by `ls -lh` is 257M.

OK. If the environment is still available, we suggest changing the Lightning parameter: the configuration file has a [mydumper] section that contains the batch-size parameter. Increase that parameter and try again.

OK.

The export worked fine; the error happens when importing into another freshly initialized environment. Is batch-size the parameter I should try changing?

Yes, that's right. Change it in the Lightning parameter configuration file; you can refer to https://github.com/pingcap/tidb-lightning/blob/master/tidb-lightning.toml#L101
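As a sketch, the suggested change is an extra batch-size line in the [mydumper] section of the Lightning configuration posted earlier in this thread. The parameter is specified in bytes; the 200 GiB value below is only an illustrative example, not a size recommended here, so tune it to your dataset:

```toml
[mydumper]
data-source-dir = "/root/backup-tidb"
# Size of each import batch, in bytes.
# 214_748_364_800 bytes = 200 GiB (illustrative value only).
batch-size = 214_748_364_800
```

After editing the file, rerun tidb-lightning with the same configuration.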