TiSpark error


【TiDB version】
4.0.11
【Problem description】

Querying a real-time table through TiSpark fails with the following error:

Job aborted due to stage failure: Task 3 in stage 176282.0 failed 4 times, most recent failure: Lost task 3.3 in stage 176282.0 (TID 4726207, bigdata.5.123, executor 29999): com.pingcap.tikv.exception.TiClientInternalException: Error reading region:
at com.pingcap.tikv.operation.iterator.DAGIterator.doReadNextRegionChunks(DAGIterator.java:189)
at com.pingcap.tikv.operation.iterator.DAGIterator.readNextRegionChunks(DAGIterator.java:166)
at com.pingcap.tikv.operation.iterator.DAGIterator.hasNext(DAGIterator.java:112)
at org.apache.spark.sql.tispark.TiRowRDD$$anon$1.hasNext(TiRowRDD.scala:69)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage66.coprocessorrdd_nextBatch$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage66.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: com.pingcap.tikv.exception.RegionTaskException: Handle region task failed:
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.pingcap.tikv.operation.iterator.DAGIterator.doReadNextRegionChunks(DAGIterator.java:184)
… 16 more
Caused by: com.pingcap.tikv.exception.RegionTaskException: Handle region task failed:
at com.pingcap.tikv.operation.iterator.DAGIterator.process(DAGIterator.java:233)
at com.pingcap.tikv.operation.iterator.DAGIterator.lambda$submitTasks$1(DAGIterator.java:90)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
… 3 more
Caused by: com.pingcap.tikv.exception.GrpcException: Request range exceeds bound, request range:[748000000000005DFF735F728000000002FF3365080000000000FA, end:748000000000005DFF735F728000000002FF35B9110000000000FA), physical bound:[748000000000005DFF735F728000000002FF3494C10000000000FA, 748000000000005DFF735F728000000002FF35B9110000000000FA)
at com.pingcap.tikv.region.RegionStoreClient.handleCopResponse(RegionStoreClient.java:713)
at com.pingcap.tikv.region.RegionStoreClient.coprocess(RegionStoreClient.java:660)
at com.pingcap.tikv.operation.iterator.DAGIterator.process(DAGIterator.java:219)
… 7 more

Driver stacktrace:
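
The exception message carries both the requested key range and the Region's physical bound, so the mismatch can be checked mechanically. The snippet below is a minimal sketch in Python, not part of TiSpark or TiKV: `RANGE_RE` and `check_range` are hypothetical helper names, and it assumes the message keeps exactly the format shown in the stack trace above and that both end keys are non-empty.

```python
import re

# Hypothetical helper: extracts the "[start, end)" pairs from a
# "Request range exceeds bound" message in the format seen in this thread.
RANGE_RE = re.compile(
    r"request range:\[(?P<req_start>[0-9A-F]+), end:(?P<req_end>[0-9A-F]+)\), "
    r"physical bound:\[(?P<bound_start>[0-9A-F]+), (?P<bound_end>[0-9A-F]+)\)"
)


def check_range(message: str) -> dict:
    """Report on which side the request range leaves the physical bound."""
    m = RANGE_RE.search(message)
    if m is None:
        raise ValueError("no range information found in message")
    # Keys are compared as raw bytes, the same order TiKV sorts them in.
    keys = {name: bytes.fromhex(value) for name, value in m.groupdict().items()}
    return {
        "start_before_bound": keys["req_start"] < keys["bound_start"],
        "end_after_bound": keys["req_end"] > keys["bound_end"],
    }


# Message copied from the stack trace in this post.
MESSAGE = (
    "Request range exceeds bound, request range:"
    "[748000000000005DFF735F728000000002FF3365080000000000FA, "
    "end:748000000000005DFF735F728000000002FF35B9110000000000FA), "
    "physical bound:[748000000000005DFF735F728000000002FF3494C10000000000FA, "
    "748000000000005DFF735F728000000002FF35B9110000000000FA)"
)

if __name__ == "__main__":
    print(check_range(MESSAGE))
```

For the message in this report, the request starts before the Region's start key while the two end keys coincide. That would be consistent with the client working from a Region range that is wider than the one TiKV currently holds, though the check alone does not prove the cause.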



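Please take a look at the referenced post. Also, following this post, please provide more details, such as the TiSpark version, the SQL query that hits the error, and the CREATE TABLE statement. Thanks.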

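Here is the TiKV error log. It looks like the Region had already been merged, but the latest Region information was not picked up.
The error occurred when the data was queried at around 16:00, yet the corresponding Region had in fact already been merged into a new Region at 14:28.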

[err="Request range exceeds bound, request range:[748000000000005DFFD75F728000000002FF4F58980000000000FA, end:748000000000005DFFD75F728000000002FF4FFFFF0000000000FA), physical bound:[748000000000005DFFD75F728000000002FF4FC7320000000000FA, 748000000000005DFFD75F728000000002FF4FFFFF0000000000FA)"]

[2021/05/28 14:28:54.223 +08:00] [INFO] [apply.rs:1977] ["execute CommitMerge"] [source_region="id: 62378668 start_key: 748000000000005DFFD75F728000000002FF4FC7320000000000FA end_key: 748000000000005DFFD75F728000000002FF4FFFFF0000000000FA region_epoch { conf_ver: 682 version: 18605 } peers { id: 62378669 store_id: 2 } peers { id: 62502516 store_id: 1 } peers { id: 62511892 store_id: 3 }"] [index=24] [term=6] [entries=1] [commit=22] [peer_id=62415234] [region_id=62415233]
[2021/05/28 14:28:54.224 +08:00] [INFO] [peer.rs:2538] ["notify pd with merge"] [target_region="id: 62415233 start_key: 748000000000005DFFD75F728000000002FF4F92310000000000FA end_key: 748000000000005DFFD75F728000000002FF4FFFFF0000000000FA region_epoch { conf_ver: 678 version: 18613 } peers { id: 62415234 store_id: 2 } peers { id: 62415236 store_id: 3 } peers { id: 62482095 store_id: 1 }"] [source_region="id: 62378668 start_key: 748000000000005DFFD75F728000000002FF4FC7320000000000FA end_key: 748000000000005DFFD75F728000000002FF4FFFFF0000000000FA region_epoch { conf_ver: 682 version: 18605 } peers { id: 62378669 store_id: 2 } peers { id: 62502516 store_id: 1 } peers { id: 62511892 store_id: 3 }"] [peer_id=62415234] [region_id=62415233]
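
To see which table and row handles these hex keys refer to, they can be decoded. The following is a rough sketch in Python, assuming the keys are ordinary row keys of the form `t{table_id}_r{handle}` with an integer handle, wrapped in TiKV's memcomparable byte encoding (8-byte groups, each followed by a marker byte). `decode_memcomparable` and `decode_row_key` are illustrative names, not TiKV or TiSpark APIs.

```python
from typing import Tuple


def decode_memcomparable(data: bytes) -> bytes:
    """Undo the 8-byte-group + marker-byte encoding (assumed key format)."""
    out = bytearray()
    for i in range(0, len(data), 9):
        chunk = data[i:i + 9]
        if len(chunk) < 9:
            raise ValueError("truncated key")
        pad = 0xFF - chunk[8]  # marker byte encodes the number of pad bytes
        if pad > 8:
            raise ValueError("invalid marker byte")
        out += chunk[:8 - pad]
        if pad:  # a padded group is always the last one
            break
    return bytes(out)


def decode_row_key(hex_key: str) -> Tuple[int, int]:
    """Return (table_id, handle) for a t{table_id}_r{handle} row key."""
    raw = decode_memcomparable(bytes.fromhex(hex_key))
    if len(raw) != 19 or raw[:1] != b"t" or raw[9:11] != b"_r":
        raise ValueError("not an integer-handle row key")
    # int64 values are stored big-endian with the sign bit flipped.
    table_id = int.from_bytes(raw[1:9], "big") - (1 << 63)
    handle = int.from_bytes(raw[11:19], "big") - (1 << 63)
    return table_id, handle


if __name__ == "__main__":
    # Start keys copied from the stack trace and from the TiKV log in this thread.
    for key in (
        "748000000000005DFF735F728000000002FF3365080000000000FA",
        "748000000000005DFFD75F728000000002FF4F58980000000000FA",
    ):
        print(key, decode_row_key(key))
```

Under those assumptions, the decoded table ID can be compared with the table the failing query reads. The key from the stack trace and the key from the TiKV log appear to decode to different table IDs, so it may be worth confirming that this log entry belongs to the same query.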

The TiSpark package in use is the following:

tispark-assembly-2.3.13-SNAPSHOT.jar

  1. Is the error caused by one specific SQL query, or do many SQL queries hit it?
  2. If it is one specific SQL query, could you share the query and the table schema?

Is there a conclusion? I ran into this problem too.