TiSpark error: Failed to find data source

1. Following the TiSpark usage described on GitHub:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// TiSpark and cluster settings; setIfMissing keeps any value already supplied by spark-submit
val sparkConf = new SparkConf()
  .setIfMissing("spark.tispark.write.allow_spark_sql", "true")
  .setIfMissing("spark.master", "spark://xxxxx:7077")
  .setIfMissing("spark.app.name", getClass.getName)
  .setIfMissing("spark.sql.extensions", "org.apache.spark.sql.TiExtensions")
  .setIfMissing("spark.tispark.pd.addresses", "xxxx:2379,xxxx:2379,xxxx:2379")
  .setIfMissing("spark.tispark.tidb.addr", "tidb")
  .setIfMissing("spark.tispark.tidb.password", "xxxxx*")
  .setIfMissing("spark.tispark.tidb.port", "4000")
  .setIfMissing("spark.tispark.tidb.user", "xxxx")
val spark = SparkSession.builder.config(sparkConf).getOrCreate()
val sqlContext = spark.sqlContext
val tidbOptions: Map[String, String] = Map()
// Read the table through the "tidb" data source format (this is the call that fails)
val df = sqlContext.read
  .format("tidb")
  .options(tidbOptions)
  .option("database", "energyanalysis")
  .option("table", "building")
  .load()
Startup fails with:
sted exception is java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
19/08/09 13:53:52 ERROR SpringApplication: Application run failed
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'hisDataService': Invocation of init method failed; nested exception is java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:139)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:414)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1770)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:593)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:515)
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:320)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:318)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:199)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:845)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:877)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:549)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:742)
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:389)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:311)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1213)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1202)
at com.sribs.ems.sparksumserver.SparksumserverApplication.main(SparksumserverApplication.java:10)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:47)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:86)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:639)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:190)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
at com.sribs.ems.sparksumserver.scala.service.CalcService.calcHour(CalcService.scala:41)
at com.sribs.ems.sparksumserver.service.HisDataService.init(HisDataService.java:27)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:363)
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:307)
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:136)
… 35 more
Caused by: java.lang.ClassNotFoundException: tidb.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:92)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23$$anonfun$apply$15.apply(DataSource.scala:622)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23$$anonfun$apply$15.apply(DataSource.scala:622)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23.apply(DataSource.scala:622)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23.apply(DataSource.scala:622)
at scala.util.Try.orElse(Try.scala:84)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:622)
… 46 more
The key errors are:
Caused by: java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
Caused by: java.lang.ClassNotFoundException: tidb.DefaultSource

2. Using:
val df = sparkSession.sql("select * from energyanalysis.building")
This approach works without problems.
Note: the TiSpark extension settings have already been added to spark-defaults.conf:
spark.tispark.pd.addresses $your_pd_servers
spark.sql.extensions org.apache.spark.sql.TiExtensions
Submit command: spark-submit --master spark://XXX:7077 --jars tispark-core-2.1.1-spark_2.3-jar-with-dependencies.jar sparksumserver-1.0.0.jar

3. Why does approach 1 report "Failed to find data source: tidb"? And when I call sparkSession.sql directly, is TiSpark being used under the hood?

4. Additional note: I integrated Spring Boot and mix Java and Scala. Does that have any effect?

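  1. The usage in point 1 is not supported. TiSpark does not connect to TiDB as a regular Spark data source, so the data source loading syntax will not succeed.
  2. Point 2 works, and it is the approach recommended for the current version.
  3. For point 3, see the reply to point 1.
  4. Point 4: mixing the two should not cause any problems; TiSpark itself is a mix of Java and Scala.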
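
For reference, the recommended SQL-based approach can be sketched end to end as below. This is a minimal sketch, not an official TiSpark sample: the object name `TiSparkSqlExample`, the app name, and the PD address `pd0:2379` are placeholders, and the two `config` calls may be dropped if the same keys are already set in spark-defaults.conf as described in the question. It assumes the TiSpark JAR is passed via `--jars` and a TiDB cluster is reachable.

```scala
import org.apache.spark.sql.SparkSession

object TiSparkSqlExample {
  def main(args: Array[String]): Unit = {
    // Register the TiSpark extension and point it at PD.
    // "pd0:2379" is a placeholder address, not a real endpoint.
    val spark = SparkSession.builder()
      .appName("tispark-sql-example")
      .config("spark.sql.extensions", "org.apache.spark.sql.TiExtensions")
      .config("spark.tispark.pd.addresses", "pd0:2379")
      .getOrCreate()

    // Query the TiDB table through Spark SQL instead of read.format("tidb").
    val df = spark.sql("select * from energyanalysis.building")
    df.show(10)

    spark.stop()
  }
}
```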

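Thanks. However, https://github.com/pingcap/tispark/blob/master/docs/datasource_api_userguide.md on GitHub describes the first approach. If the second approach is the recommended one, I suggest updating the documentation on GitHub.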
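
Whether `format("tidb")` can resolve depends on what the TiSpark JAR on the classpath actually provides. The stack trace shows Spark falling back to loading a class named `tidb.DefaultSource` after finding no data source registered under the short name `tidb`. The sketch below mirrors only that fallback so it can be run inside the same application to see which candidate classes are loadable; the object and method names are hypothetical, and it is not Spark's own lookup code.

```scala
import scala.util.Try

object DataSourceLookupCheck {
  // Class names Spark falls back to when no registered short name matches:
  // the format string itself, then "<format>.DefaultSource" (as seen in the trace).
  def candidateClasses(format: String): Seq[String] =
    Seq(format, s"$format.DefaultSource")

  // True if the class can be loaded by the given class loader (without initializing it).
  def isLoadable(className: String, loader: ClassLoader): Boolean =
    Try(Class.forName(className, false, loader)).isSuccess

  def main(args: Array[String]): Unit = {
    val loader = Thread.currentThread().getContextClassLoader
    candidateClasses("tidb").foreach { name =>
      println(s"$name loadable: ${isLoadable(name, loader)}")
    }
  }
}
```

If neither candidate is loadable and no data source registers the short name `tidb`, the `load()` call fails with the `ClassNotFoundException` reported in the question.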

Please check whether multiple TiSpark versions are present. When using the spark-submit --jars option, make sure there is no older TiSpark JAR under $SPARK_HOME/jars.
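
A quick way to perform this check is to list the TiSpark JARs in that directory. The following is a minimal sketch using only the Java standard library; the object name `TiSparkJarCheck` and the assumption that TiSpark JAR file names start with `tispark` (as in the `tispark-core-2.1.1-spark_2.3-jar-with-dependencies.jar` used above) are illustrative.

```scala
import java.io.File

object TiSparkJarCheck {
  // Returns the sorted names of files in `dir` that look like TiSpark JARs.
  // An unreadable or missing directory yields an empty result.
  def findTiSparkJars(dir: File): Seq[String] =
    Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
      .map(_.getName)
      .filter(n => n.startsWith("tispark") && n.endsWith(".jar"))
      .sorted

  def main(args: Array[String]): Unit = {
    sys.env.get("SPARK_HOME").map(home => new File(home, "jars")) match {
      case Some(dir) =>
        val found = findTiSparkJars(dir)
        if (found.isEmpty) println(s"No TiSpark JARs in $dir")
        else found.foreach(n => println(s"Found in $dir: $n"))
      case None =>
        println("SPARK_HOME is not set")
    }
  }
}
```

Any JAR it reports there with a version different from the one passed to `--jars` is a candidate for removal.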
