Ubuntu ships with Python 2.7 by default,
but we want PySpark to use Anaconda Python 3.6 by default.
You can specify the version of Python for the driver and the workers by setting the appropriate environment variables in the ./conf/spark-env.sh file. If it doesn't already exist, you can start from the provided spark-env.sh.template file, which also includes lots of other variables.
Here is a simple example of a spark-env.sh file to set the relevant Python environment variables:
#!/usr/bin/env bash
# This file is sourced when running various Spark programs.
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/ipython
In this case it sets the version of Python used by the workers/executors to Python 3 and the driver's Python to IPython, for a nicer shell to work in.
In other words, rename ./conf/spark-env.sh.template in the Spark folder to spark-env.sh,
then add the following:
# This file is sourced when running various Spark programs.
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/ipython
Then restart Spark.
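
The steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: `write_spark_env` is a hypothetical function name, not part of Spark, and the Anaconda path in the example call is a guess at a typical install location that you would need to replace with your own. Unlike the example above, it points both variables at the same interpreter, so the driver and the executors run the same Python.

```shell
#!/usr/bin/env bash
# Sketch: create conf/spark-env.sh from the bundled template (if needed)
# and point PySpark at a chosen Python interpreter.
# write_spark_env is a hypothetical helper, not something Spark provides.

write_spark_env() {
    local conf_dir="$1"      # Spark's conf directory, e.g. "$SPARK_HOME/conf"
    local python_bin="$2"    # interpreter to use for the driver and executors
    local env_file="$conf_dir/spark-env.sh"

    # Start from the template when spark-env.sh does not exist yet.
    if [ ! -f "$env_file" ] && [ -f "$env_file.template" ]; then
        cp "$env_file.template" "$env_file"
    fi

    # Append the two variables; the file is sourced top to bottom,
    # so these lines override any earlier definitions in it.
    {
        printf 'export PYSPARK_PYTHON=%s\n' "$python_bin"
        printf 'export PYSPARK_DRIVER_PYTHON=%s\n' "$python_bin"
    } >> "$env_file"
}

# Example call (assumed install locations, adjust to your machine):
# write_spark_env "$SPARK_HOME/conf" "$HOME/anaconda3/bin/python"
```

Copying the template instead of renaming it keeps the original around for reference. To get IPython as the driver shell, as in the example above, set PYSPARK_DRIVER_PYTHON to the ipython executable from the same environment.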