A basic write returns an error: WARN RowIDAllocator: error during allocating row id

【TiDB environment】
A self-built cluster: 3 TiDB, 3 PD, 3 TiSpark, and 3 TiKV instances, spread across three machines.

【Overview】 Scenario + problem summary
Writing to the database from Spark fails.

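【Background】 Operations performed
Following the documentation (https://docs.pingcap.com/zh/tidb/stable/get-started-with-tispark), I imported tispark-sample-data into the TiDB database. I then read the data, created a new table, and tried to write the data I had read into it. The write failed with:
"WARN RowIDAllocator: error during allocating row id"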

The script that creates the target table:

CREATE TABLE target_customer (
C_CUSTKEY int(11) NOT NULL,
C_NAME varchar(25) NOT NULL,
C_ADDRESS varchar(40) NOT NULL,
C_NATIONKEY int(11) NOT NULL,
C_PHONE char(15) NOT NULL,
C_ACCTBAL decimal(15,2) NOT NULL,
C_MKTSEGMENT char(10) NOT NULL,
C_COMMENT varchar(117) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

The Java code that writes to the database:

    SparkConf conf = new SparkConf()
            .setMaster("local[*]")
            .setAppName("TiSparkJavaExample")
            .set("spark.sql.extensions", "org.apache.spark.sql.TiExtensions")
            .set("spark.tispark.pd.addresses", "127.0.0.1:2379");
    SparkSession spark = SparkSession
            .builder()
            .config(conf)
            .getOrCreate();

    Dataset<Row> customers = spark.sql("select * from customer");

    customers.write().format("tidb")
            .option("tidb.addr", "127.0.0.1")
            .option("tidb.port", "4000")
            .option("tidb.user", "root")
            .option("tidb.password", "")
            .option("database", "tpch_001")
            .option("table", "target_customer")
            .mode("append")
            .save();

    spark.stop();

【Symptoms】 Business and database symptoms
The write fails.
【Problem】 Current problem
The write fails. The error log is:

21/10/11 09:35:49 WARN RowIDAllocator: error during allocating row id
java.lang.RuntimeException: java.io.EOFException
at com.pingcap.tikv.codec.CodecDataInput.readLong(CodecDataInput.java:158)
at com.pingcap.tikv.allocator.RowIDAllocator.updateMeta(RowIDAllocator.java:159)
at com.pingcap.tikv.allocator.RowIDAllocator.updateHash(RowIDAllocator.java:196)
at com.pingcap.tikv.allocator.RowIDAllocator.udpateAllocateId(RowIDAllocator.java:230)
at com.pingcap.tikv.allocator.RowIDAllocator.initSigned(RowIDAllocator.java:273)
at com.pingcap.tikv.allocator.RowIDAllocator.doCreate(RowIDAllocator.java:109)
at com.pingcap.tikv.allocator.RowIDAllocator.create(RowIDAllocator.java:90)
at com.pingcap.tispark.write.TiBatchWriteTable.getRowIDAllocator(TiBatchWriteTable.scala:395)
at com.pingcap.tispark.write.TiBatchWriteTable.preCalculate(TiBatchWriteTable.scala:309)
at com.pingcap.tispark.write.TiBatchWrite$$anonfun$1.apply(TiBatchWrite.scala:203)
at com.pingcap.tispark.write.TiBatchWrite$$anonfun$1.apply(TiBatchWrite.scala:203)
at scala.collection.immutable.List.map(List.scala:284)
at com.pingcap.tispark.write.TiBatchWrite.doWrite(TiBatchWrite.scala:203)
at com.pingcap.tispark.write.TiBatchWrite.com$pingcap$tispark$write$TiBatchWrite$$write(TiBatchWrite.scala:88)
at com.pingcap.tispark.write.TiBatchWrite$.write(TiBatchWrite.scala:45)
at com.pingcap.tispark.write.TiDBWriter$.write(TiDBWriter.scala:40)
at com.pingcap.tispark.TiDBDataSource.createRelation(TiDBDataSource.scala:57)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
at com.zhiyun.sparkjob.FirstSpark.main(FirstSpark.java:48)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readLong(DataInputStream.java:416)
at com.pingcap.tikv.codec.CodecDataInput.readLong(CodecDataInput.java:156)
... 49 more
21/10/11 09:35:52 INFO TwoPhaseCommitter: prewrite primary key * successfully
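
For readers following the trace: the innermost frames are plain JDK code. `DataInputStream.readLong` needs 8 bytes and throws `EOFException` when the stream ends first. That suggests the value being decoded in `RowIDAllocator.updateMeta` was empty or shorter than 8 bytes, though this is an inference from the trace and is not confirmed in this thread. The sketch below is not TiSpark code. It only reproduces the JDK behavior, and the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustration only: reproduces the JDK behavior at the bottom of the stack
// trace (DataInputStream.readLong -> EOFException). Not TiSpark code.
class ReadLongEofDemo {

    // Returns true if an 8-byte long can be decoded from the buffer,
    // false if the stream ends before 8 bytes were read.
    static boolean canReadLong(byte[] buffer) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer));
        try {
            in.readLong(); // requires 8 bytes
            return true;
        } catch (EOFException e) {
            return false; // same exception type as in the log above
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canReadLong(new byte[8])); // a full 8-byte value
        System.out.println(canReadLong(new byte[0])); // an empty value
        System.out.println(canReadLong(new byte[4])); // a truncated value
    }
}
```

Only the 8-byte buffer decodes. The empty and truncated buffers both end in `EOFException`.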

【Business impact】
Cannot write to the database.
【TiDB version】
v5.2.1

【Attachments】 Relevant logs and monitoring (https://metricstool.pingcap.com/)



Which version of TiSpark are you using?

Are its version requirements met?

See the version requirements on GitHub:
https://github.com/pingcap/tispark


It was deployed with tiup. I checked, and the Spark version is 2.4.3:

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.3
      /_/

Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_262
Branch
Compiled by user  on 2019-05-01T05:08:38Z
Revision
Url
Type --help for more information.

This should be a known issue, and it has already been fixed on all branches. The fix will ship with the next release.

In that case:
1) Roughly when will the fix be released?
2) How should I write to TiDB in the meantime? Directly through JDBC?

So, if I need to write data to TiDB, how should I do it? Writing directly through JDBC does not work either.

What error does the JDBC write report?

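Right. Many of the tutorials online are unreliable. I tried a lot of approaches, and once I removed this option (commented out below), the write succeeded.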

    // .option("driver", "com.mysql.jdbc.driver")  // the option that was removed
    jdbcDF.write().mode(SaveMode.Append)
            .format("jdbc")
            .option("url", "jdbc:mysql://127.0.0.1:4000/test?rewriteBatchedStatements=true")
            .option("useSSL", "false")
            .option("dbtable", "target_customer") // database name and table name here
            .option("isolationLevel", "NONE") // set isolationLevel to NONE
            .option("user", "root") // TiDB user here
            .save();
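
One plausible reason the removed option broke the write (an assumption, not confirmed in this thread): Java class names are case-sensitive, and MySQL Connector/J's driver class is `com.mysql.jdbc.Driver` (or `com.mysql.cj.jdbc.Driver` in 8.x), not `com.mysql.jdbc.driver`. A class name with the wrong case cannot be loaded. The sketch below shows the effect using a JDK class as a stand-in, so it runs without a MySQL driver on the classpath. The class and method names are hypothetical.

```java
// Illustration only: uses java.lang.String as a stand-in for the JDBC driver
// class, so no MySQL driver is needed to run it.
class DriverNameCaseDemo {

    // Returns true if a class with exactly this name can be loaded.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, which some
            // case-insensitive file systems produce instead.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.lang.String")); // correct case
        System.out.println(isLoadable("java.lang.string")); // wrong case
    }
}
```

If you do want to set the driver explicitly, calling `isLoadable` with the exact class name before the write is a quick way to confirm that the name is spelled correctly and that the driver is on the classpath.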

Thanks for your reply.

Roughly when will the release be out?