cat numpy_test.py
#!/usr/bin/python3
import pyspark
from pyspark.sql import SparkSession
import platform
spark = SparkSession.builder.getOrCreate()
# ship the virtualenv directory to the executors; recursive must be passed as a bool
spark.sparkContext.addFile("/data3/pytispark_env", recursive=True)
from numpy import *
import numpy
print(eye(2))
print(numpy.__version__)
print(platform.python_version())
print(pyspark.__version__)
from pytispark import pytispark as pti
import pytispark
ti=pti.TiContext(spark)
print(pytispark.__version__)
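
For reference, a minimal sketch of how such a script is typically submitted as a Livy batch over the REST API (the Livy endpoint below is a placeholder, not taken from the actual setup):

import json
import requests

livy_url = "http://livy-host:8998/batches"  # hypothetical Livy endpoint

payload = {
    "file": "/data3/numpy_test.py",  # the test script shown above
}

resp = requests.post(
    livy_url,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)
print(resp.status_code, resp.json())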
Submitted via Livy; the log is as follows:
[[1. 0.]
 [0. 1.]]
1.16.0
3.7.1
2.2.3
Traceback (most recent call last):
  File "/data3/numpy_test.py", in <module>
    ti = pti.TiContext(spark)
  File "/data/aiops/aiops/jars/pytispark/pytispark.py", line 21, in __init__
    self.ti = gw.jvm.TiExtensions.getInstance(sparkSession._jsparkSession).getOrCreateTiContext(sparkSession._jsparkSession)
TypeError: 'JavaPackage' object is not callable
Note: /data3/pytispark_env is an environment created with Python's virtualenv; it contains the numpy and pytispark dependencies.
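
As a side note on how addFile-distributed content is resolved, here is a minimal sketch using pyspark's SparkFiles (the directory name follows from the addFile call above; the rest is illustrative, not from the actual job):

from pyspark import SparkFiles

def locate_env(_):
    # SparkFiles.get returns the local path, on the current executor,
    # of a file/directory shipped with sparkContext.addFile.
    return SparkFiles.get("pytispark_env")

# e.g.: spark.sparkContext.parallelize([0]).map(locate_env).collect()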
Submitting with pyspark's spark-submit has been tested successfully without the following change:
jsparkSession = self._jvm.SparkSession(self._jsc.sc())
to
jsparkSession = self._jvm.SparkSession.builder().getOrCreate()
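
For context, the patched line lives in SparkSession.__init__ in pyspark/sql/session.py; the snippet below is abridged from memory for pyspark 2.x, so treat it as a sketch rather than the exact source:

def __init__(self, sparkContext, jsparkSession=None):
    self._sc = sparkContext
    self._jsc = self._sc._jsc
    self._jvm = self._sc._jvm
    if jsparkSession is None:
        # the line replaced by the workaround above
        jsparkSession = self._jvm.SparkSession(self._jsc.sc())
    self._jsparkSession = jsparkSession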
After making the above change, submitting via Livy still fails with the same error.
I'd like to know whether this is an environment issue, a configuration issue, or something else.