Lightning物理模式导入数据时报错error="peer 62039, store 4, region 62037, epoch conf_ver:5 version:172 , when close write stream: rpc error: code = Unknown desc = RequestTooOld(\"region 62037 is not found\")"并一直重试

【 TiDB 使用环境】测试
【 TiDB 版本】V8.1.1
【复现路径】做过哪些操作出现的问题
dumpling导出测试库,然后清空原集群数据库,lightning再导入到原测试集群
【遇到的问题:问题现象及影响】
[2024/11/04 12:19:40.070 +08:00] [INFO] [local.go:1316] [“pause pd scheduler of table scope”]
[2024/11/04 12:19:40.235 +08:00] [INFO] [local.go:1354] [“start import engine”] [uuid=ae650d2b-7f05-57f3-97ef-39844e3c72ea] [“region ranges”=1] [count=10] [size=1210]
[2024/11/04 12:19:40.235 +08:00] [INFO] [local.go:871] [“import engine ranges”] [count=1]
[2024/11/04 12:19:40.297 +08:00] [WARN] [local.go:1240] [“meet retryable error when writing to TiKV”] [error=“peer 57054, store 1, region 57053, epoch conf_ver:5 version:170 , when close write stream: rpc error: code = Unknown desc = RequestTooOld("request region 57053 is staler than local region, local epoch conf_ver: 5 version: 172, request epoch conf_ver: 5 version: 170")”] [“job stage”=needRescan]
[2024/11/04 12:19:40.303 +08:00] [WARN] [split.go:140] [“failed to scan region, retrying”] [error=“scan region return empty result, startKey: 7480000000000001FF485F728000000000FF0000010000000000FA, endKey: 7480000000000001FF485F728000000000FF0075370000000000FA: [BR:PD:ErrPDBatchScanRegion]batch scan region”] [regionLength=0]
[2024/11/04 12:19:40.316 +08:00] [INFO] [local.go:1447] [“put job back to jobCh to retry later”] [startKey=7480000000000001485F728000000000000001] [endKey=7480000000000001485F728000000000007537] [stage=regionScanned] [retryCount=1] [waitUntil=2024/11/04 12:19:42.316 +08:00]
[2024/11/04 12:19:42.328 +08:00] [WARN] [local.go:1240] [“meet retryable error when writing to TiKV”] [error=“peer 62039, store 4, region 62037, epoch conf_ver:5 version:172 , when send data: rpc error: code = Unknown desc = RequestTooOld("region 62037 is not found")”] [“job stage”=needRescan]
[2024/11/04 12:19:42.330 +08:00] [INFO] [local.go:1447] [“put job back to jobCh to retry later”] [startKey=7480000000000001485F728000000000000001] [endKey=7480000000000001485F728000000000007537] [stage=regionScanned] [retryCount=1] [waitUntil=2024/11/04 12:19:44.330 +08:00]
[2024/11/04 12:19:44.340 +08:00] [WARN] [local.go:1240] [“meet retryable error when writing to TiKV”] [error=“peer 62039, store 4, region 62037, epoch conf_ver:5 version:172 , when send data: rpc error: code = Unknown desc = RequestTooOld("region 62037 is not found")”] [“job stage”=needRescan]
[2024/11/04 12:19:44.343 +08:00] [INFO] [local.go:1447] [“put job back to jobCh to retry later”] [startKey=7480000000000001485F728000000000000001] [endKey=7480000000000001485F728000000000007537] [stage=regionScanned] [retryCount=1] [waitUntil=2024/11/04 12:19:46.343 +08:00]
[2024/11/04 12:19:46.354 +08:00] [WARN] [local.go:1240] [“meet retryable error when writing to TiKV”] [error=“peer 62039, store 4, region 62037, epoch conf_ver:5 version:172 , when send data: rpc error: code = Unknown desc = RequestTooOld("region 62037 is not found")”] [“job stage”=needRescan]
【资源配置】进入到 TiDB Dashboard -集群信息 (Cluster Info) -主机(Hosts) 截图此页面
【附件:截图/日志/监控】


lightning的版本也是8.1.1嘛?

[2024/11/04 12:19:40.297 +08:00] [WARN] [local.go:1240] [“meet retryable error when writing to TiKV”] [error=“peer 57054, store 1, region 57053, epoch conf_ver:5 version:170 , when close write stream: rpc error: code = Unknown desc = RequestTooOld(“request region 57053 is staler than local region, local epoch conf_ver: 5 version: 172, request epoch conf_ver: 5 version: 170”)”] [“job stage”=needRescan]
[2024/11/04 12:19:40.303 +08:00] [WARN] [split.go:140] [“failed to scan region, retrying”] [error=“scan region return empty result, startKey: 7480000000000001FF485F728000000000FF0000010000000000FA, endKey: 7480000000000001FF485F728000000000FF0075370000000000FA: [BR:PD:ErrPDBatchScanRegion]batch scan region”] [regionLength=0]

从这个错误上看,像是lightning导入某个表的时候有其他人也在这个表上做了变更。

版本一样的,就很简单的一个测试表,只有几行记录,导入前将数据库删除清空了。

全部清除临时文件后,又回到最初的错误了

把集群全部销毁,彻底重建再尝试一下。

起码在lightning的测试用例上看,这个错误和导入以及insert anto_random有明确的相关性。

RequestTooOld具体的报错位置还没找到。

1 个赞

确认是磁盘满了嘛?


确实小于5GB了

1 个赞

这个应该是我清除断点续传没清除干净,清除干净后没了

1 个赞

此话题已在最后回复的 7 天后被自动关闭。不再允许新回复。