TiSpark error: Failed to find data source

1. Following the TiSpark usage shown on GitHub:

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    val sparkConf = new SparkConf()
      .setIfMissing("spark.tispark.write.allow_spark_sql", "true")
      .setIfMissing("spark.master", "spark://xxxxx:7077")
      .setIfMissing("spark.app.name", getClass.getName)
      .setIfMissing("spark.sql.extensions", "org.apache.spark.sql.TiExtensions")
      .setIfMissing("spark.tispark.pd.addresses", "xxxx:2379,xxxx:2379,xxxx:2379")
      .setIfMissing("spark.tispark.tidb.addr", "tidb")
      .setIfMissing("spark.tispark.tidb.password", "xxxxx*")
      .setIfMissing("spark.tispark.tidb.port", "4000")
      .setIfMissing("spark.tispark.tidb.user", "xxxx")
    val spark = SparkSession.builder.config(sparkConf).getOrCreate()
    val sqlContext = spark.sqlContext
    val tidbOptions: Map[String, String] = Map()
    // Read the table through the "tidb" data source format
    val df = sqlContext.read
      .format("tidb")
      .options(tidbOptions)
      .option("database", "energyanalysis")
      .option("table", "building")
      .load()

The application fails at startup with:

    sted exception is java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
    19/08/09 13:53:52 ERROR SpringApplication: Application run failed
    org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'hisDataService': Invocation of init method failed; nested exception is java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
        at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:139)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:414)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1770)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:593)
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:515)
        at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:320)
        at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
        at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:318)
        at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:199)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:845)
        at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:877)
        at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:549)
        at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:742)
        at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:389)
        at org.springframework.boot.SpringApplication.run(SpringApplication.java:311)
        at org.springframework.boot.SpringApplication.run(SpringApplication.java:1213)
        at org.springframework.boot.SpringApplication.run(SpringApplication.java:1202)
        at com.sribs.ems.sparksumserver.SparksumserverApplication.main(SparksumserverApplication.java:10)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:47)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:86)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
        at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
        at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:639)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:190)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
        at com.sribs.ems.sparksumserver.scala.service.CalcService.calcHour(CalcService.scala:41)
        at com.sribs.ems.sparksumserver.service.HisDataService.init(HisDataService.java:27)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:363)
        at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:307)
        at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:136)
        ... 35 more
    Caused by: java.lang.ClassNotFoundException: tidb.DefaultSource
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:92)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23$$anonfun$apply$15.apply(DataSource.scala:622)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23$$anonfun$apply$15.apply(DataSource.scala:622)
        at scala.util.Try$.apply(Try.scala:192)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23.apply(DataSource.scala:622)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23.apply(DataSource.scala:622)
        at scala.util.Try.orElse(Try.scala:84)
        at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:622)
        ... 46 more

The key errors are:

    Caused by: java.lang.ClassNotFoundException: Failed to find data source: tidb. Please find packages at http://spark.apache.org/third-party-projects.html
    Caused by: java.lang.ClassNotFoundException: tidb.DefaultSource

2. Using it like this works without any problem:

    var df = sparkSession.sql("select * from energyanalysis.building")

Note: the TiSpark extension settings have already been added to spark-defaults.conf:

    spark.tispark.pd.addresses $your_pd_servers
    spark.sql.extensions org.apache.spark.sql.TiExtensions

Submit command:

    spark-submit --master spark://XXX:7077 --jars tispark-core-2.1.1-spark_2.3-jar-with-dependencies.jar sparksumserver-1.0.0.jar

3. Why does approach 1 report "Failed to find data source: tidb"? And does calling sparkSession.sql directly like this use TiSpark under the hood?

4. Additional note: I have integrated Spring Boot and mix Java and Scala. Does that have any impact?

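  1. The usage described in point 1 is not supported. TiSpark does not connect to TiDB as a regular data source, so the data source loading syntax will not succeed.
  2. Point 2 works, and it is the approach recommended for the current version.
  3. For point 3, see the reply to point 1.
  4. Point 4: mixing the two should not cause any problems; TiSpark itself is a mix of Java and Scala.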
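
To illustrate why the trace ends in `ClassNotFoundException: tidb.DefaultSource`: when a format name such as `tidb` does not match any registered data source short name, Spark falls back to loading a class with that name, and then a class named `<format>.DefaultSource`. The following is only a simplified sketch of that fallback, not Spark's actual code; `DataSourceLookupSketch`, `candidateClassNames`, and `resolve` are hypothetical names introduced for illustration.

```scala
import scala.util.Try

// Simplified, hypothetical model of the class-name fallback used when a
// format name has no registered short name. This is not Spark's own code.
object DataSourceLookupSketch {

  // Class names tried, in order, for a given format name.
  def candidateClassNames(format: String): Seq[String] =
    Seq(format, s"$format.DefaultSource")

  // Returns the first candidate the class loader can load, or None if no
  // candidate exists on the classpath.
  def resolve(format: String, loader: ClassLoader): Option[Class[_]] =
    candidateClassNames(format).iterator
      .map(name => Try[Class[_]](loader.loadClass(name)).toOption)
      .collectFirst { case Some(cls) => cls }
}
```

With the TiSpark version used in this thread, neither `tidb` nor `tidb.DefaultSource` is a loadable class, so a lookup like `DataSourceLookupSketch.resolve("tidb", ClassLoader.getSystemClassLoader)` finds nothing, which matches the last `Caused by` line in the trace.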

Thanks. However, https://github.com/pingcap/tispark/blob/master/docs/datasource_api_userguide.md on GitHub describes the first method. If the second one is recommended, I suggest updating the instructions on GitHub.


Please check whether multiple TiSpark versions are present. When using the spark-submit --jars option, make sure there is no old TiSpark JAR under $SPARK_HOME/jars.
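
One way to run this check is to list the TiSpark JARs in the `jars` directory before submitting. The sketch below assumes TiSpark JAR file names start with `tispark` and end with `.jar`, as the JAR in the submit command above does; older builds may be named differently. `TiSparkJarCheck` and `tisparkJars` are hypothetical names introduced for illustration.

```scala
import java.io.File

// Hypothetical helper: lists JARs in a directory whose names look like TiSpark JARs.
object TiSparkJarCheck {

  // Assumption: TiSpark JAR file names start with "tispark" and end with ".jar".
  def tisparkJars(dir: File): Seq[String] =
    Option(dir.listFiles())            // listFiles() returns null if dir is missing
      .map(_.toSeq)
      .getOrElse(Seq.empty[File])
      .map(_.getName)
      .filter(name => name.startsWith("tispark") && name.endsWith(".jar"))
      .sorted

  def main(args: Array[String]): Unit = {
    // Falls back to the current directory when SPARK_HOME is not set.
    val jarsDir = new File(sys.env.getOrElse("SPARK_HOME", "."), "jars")
    val found = tisparkJars(jarsDir)
    if (found.isEmpty) println(s"No TiSpark JAR found in $jarsDir")
    else {
      println(s"TiSpark JARs in $jarsDir:")
      found.foreach(name => println(s"  $name"))
    }
  }
}
```

Any JAR listed here is loaded in addition to the one passed with `--jars`, so an unexpected entry is worth removing before resubmitting.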