TiSpark: error when running spark-sql

To help us respond faster, please provide the following information; clearly described problems get priority.

  • [TiDB version]:
    v3.0.8
  • [Problem description]:
    20/01/14 17:21:02 INFO audit: ugi=tidb ip=unknown-ip-addr cmd=get_database: poit_dev
    20/01/14 17:21:03 WARN Utils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
    Error in query: TiDBRelation(com.pingcap.tikv.TiSession@46a5aff,TiTableReference(poit_dev,prod_s_fact_data_extract,9223372036854775807),com.pingcap.tispark.MetaManager@695f382c,null) does not allow insertion.;;
    'InsertIntoTable Relation[ecode#36,collectime#37,type#38,version#39L,equipment#40,metrictag#41,metricval#42,devname#43,eid#44,pt#45L] TiDBRelation(com.pingcap.tikv.TiSession@46a5aff,TiTableReference(poit_dev,prod_s_fact_data_extract,9223372036854775807),com.pingcap.tispark.MetaManager@695f382c,null), true, false
    +- Project [ecode#0, collectime#1, type#2, version#3L, equipment#4, metrictag#5, metricval#6, devname#7, eid#9, pt#8L]
       +- Filter ((NOT (((eid#9 = f9e81a33193011e897097cd30adfba16) && (equipment#4 = MF_WM_11)) && ((pt#8L < cast(20190720 as bigint)) && (metricval#6 > cast(1000000 as double)))) && NOT (((eid#9 = 959179dde93711e7b15d7cd30ac334e0) && (equipment#4 = HXKTWJ)) && ((metrictag#5 = TR_LJ) && (second(cast(collectime#1 as timestamp), Some(Asia/Shanghai)) > 0)))) && (NOT (((eid#9 = d377f2f4b1534effbaa121881fa04aa9) && (equipment#4 = GM_TRQ2)) && ((metrictag#5 = TR_LJ) && (second(cast(collectime#1 as timestamp), Some(Asia/Shanghai)) > 0))) && NOT (((eid#9 = 432498ac8891409d8e264202d672afab) && (equipment#4 = PM_01_1HBYQ)) && ((metrictag#5 = E) && (second(cast(collectime#1 as timestamp), Some(Asia/Shanghai)) > 0)))))
          +- Join Inner, (ecode#0 = codename#13)
             :- SubqueryAlias ps
             :  +- Project [ecode#0, collectime#1, type#2, version#3L, equipment#4, metrictag#5, metricval#6, devname#7, pt#8L]
             :     +- Filter (((pt#8L >= cast(20191001 as bigint)) && (pt#8L <= cast(20191030 as bigint))) && metrictag#5 IN (E,HEAT_FLUX_TTL,F_TTL,TR_LJ))
             :        +- SubqueryAlias prod_s_fact_data
             :           +- Relation[ecode#0,collectime#1,type#2,version#3L,equipment#4,metrictag#5,metricval#6,devname#7,pt#8L] TiDBRelation(com.pingcap.tikv.TiSession@46a5aff,TiTableReference(poit_dev,prod_s_fact_data,112981554494),com.pingcap.tispark.MetaManager@695f382c,null)
             +- SubqueryAlias ee
                +- SubqueryAlias ent_i_enterprise
                   +- Relation[eid#9,esco_id#10,name#11,fullname#12,codename#13,city_code#14,description#15,props#16,linkman#17,linkman_pn#18,tel#19,group_id#20,day_account_time#21L,day_account_flag#22L,rec_status#23L,update_time#24,address#25,sampling#26L,sequence_prefix#27,max_collection_point_num#28L,customization#29L,title#30,logo_img#31,default_title#32L,… 3 more fields] TiDBRelation(com.pingcap.tikv.TiSession@46a5aff,TiTableReference(poit_dev,ent_i_enterprise,137790),com.pingcap.tispark.MetaManager@695f382c,null)

For performance-tuning or troubleshooting questions, please download and run the diagnostic script, then select all of the terminal output and paste it here.


Please check how the spark.debug.maxToStringFields parameter is set.
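For reference, spark.debug.maxToStringFields only controls how many fields Spark includes when stringifying a plan for logs; raising it silences the truncation warning but is unrelated to the insertion error itself. A minimal sketch of setting it at session startup (the value 200 is just an example):

```scala
import org.apache.spark.sql.SparkSession

// Raise the field-count limit used when rendering query plans in logs.
// This only affects log output; it does not change query behavior.
val spark = SparkSession.builder()
  .appName("example")
  .config("spark.debug.maxToStringFields", "200")
  .getOrCreate()
```

The same property can also be passed on the command line, e.g. `--conf spark.debug.maxToStringFields=200` to spark-submit or spark-shell.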

I saw in an earlier Q&A that TiSpark does not support INSERT statements?

scala> spark.sql("insert into table poit_dev.test2 select * from poit_dev.test").show
org.apache.spark.sql.AnalysisException: TiDBRelation(com.pingcap.tikv.TiSession@3a182eaf,TiTableReference(poit_dev,test2,9223372036854775807),com.pingcap.tispark.MetaManager@4bc15d49,null) does not allow insertion.;;
'InsertIntoTable Relation[name#3,age#4L,sex#5] TiDBRelation(com.pingcap.tikv.TiSession@3a182eaf,TiTableReference(poit_dev,test2,9223372036854775807),com.pingcap.tispark.MetaManager@4bc15d49,null), false, false
+- Project [name#0, age#1L, sex#2]
   +- SubqueryAlias test
      +- Relation[name#0,age#1L,sex#2] TiDBRelation(com.pingcap.tikv.TiSession@3a182eaf,TiTableReference(poit_dev,test,1276),com.pingcap.tispark.MetaManager@4bc15d49,null)

at org.apache.spark.sql.execution.datasources.PreWriteCheck$.failAnalysis(rules.scala:442)
at org.apache.spark.sql.execution.datasources.PreWriteCheck$$anonfun$apply$14.apply(rules.scala:465)
at org.apache.spark.sql.execution.datasources.PreWriteCheck$$anonfun$apply$14.apply(rules.scala:445)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:117)
at org.apache.spark.sql.execution.datasources.PreWriteCheck$.apply(rules.scala:445)
at org.apache.spark.sql.execution.datasources.PreWriteCheck$.apply(rules.scala:440)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$2.apply(CheckAnalysis.scala:386)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$2.apply(CheckAnalysis.scala:386)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:386)
at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:95)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:108)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:105)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
... 49 elided

Currently TiSpark does not support writing data directly into a TiDB cluster, but you can write through Spark's native JDBC support.

https://pingcap.com/docs-cn/stable/reference/tispark/#通过-jdbc-将-dataframe-写入-tidb
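A minimal sketch of the JDBC-write approach the linked doc describes. The host, port, database, table, and credentials below are placeholders, and this needs a reachable TiDB instance plus the MySQL JDBC driver on the classpath:

```scala
import java.util.Properties
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().appName("tidb-jdbc-write").getOrCreate()

// Source data; here read back via TiSpark (reads are supported, only writes are not).
val df = spark.sql("select * from poit_dev.test")

// Placeholder connection details: replace host, port, user, password.
// TiDB speaks the MySQL protocol, so the standard MySQL JDBC URL works.
val url = "jdbc:mysql://tidb-host:4000/poit_dev?rewriteBatchedStatements=true"
val props = new Properties()
props.put("user", "root")
props.put("password", "")

df.write
  .mode(SaveMode.Append)                // append rows into the existing table
  .option("isolationLevel", "NONE")     // avoid wrapping the whole write in one large transaction
  .option("batchsize", "10000")         // batch inserts for throughput
  .jdbc(url, "test2", props)
```

This replaces the unsupported `insert into table … select …` statement: the read still goes through TiSpark, while the write goes over JDBC.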

Is there a detailed example doc for writing to TiDB through JDBC with TiSpark? The client-setup example on GitHub doesn't feel very detailed.

You can search for examples of native Spark writing through JDBC.

How does native Spark connect to a TiDB cluster?

val sparkSession: SparkSession = SparkSession.builder()
		.appName("spark")
		.master("spark://ip:7077")
		.getOrCreate()
For this part, how do I connect to TiDB or the cluster? After connecting I only see the default database. There's no hive-site file or anything like that here, so how is this supposed to be set up?
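One likely explanation: a plain Spark session has no TiDB catalog, so `show databases` only lists Spark's own `default` database; TiDB databases only appear in the catalog when TiSpark (or a Hive metastore) is configured. With native Spark you instead point `spark.read.jdbc` at each TiDB table and register it as a view. A minimal sketch, with placeholder host, port, credentials, and table name:

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("spark")
  .master("spark://ip:7077")
  .getOrCreate()

// Placeholder connection details: replace host, port, user, password.
val url = "jdbc:mysql://tidb-host:4000/poit_dev"
val props = new Properties()
props.put("user", "root")
props.put("password", "")

// Pull the TiDB table over JDBC and expose it to Spark SQL as a temp view,
// instead of expecting it to show up under a catalog database.
val df = spark.read.jdbc(url, "prod_s_fact_data", props)
df.createOrReplaceTempView("prod_s_fact_data")
spark.sql("select count(*) from prod_s_fact_data").show()
```

If you want TiDB databases to appear directly in spark-sql, that is TiSpark's job (configured via `spark.tispark.pd.addresses` per its docs), not something native Spark JDBC provides.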

My earlier configuration was wrong. It connects now, but I still only see the default database.

OK, got it.

This topic was automatically closed 1 minute after the last reply. New replies are no longer allowed.