[BUG]: Spark submit fails: No such file or directory: '/opt/spark/python/pyspark/./bin/spark-submit' #60

Description

@x1linwang

Describe the bug

I'm trying to run Spark in a Jupyter notebook using the spylon-kernel, but whenever I run any code, it gets stuck at "Initializing scala interpreter...". The error from the Ubuntu terminal (I'm a Windows user running Spark in WSL, Ubuntu 18.04) is attached below.

To Reproduce

Steps to reproduce the behavior:

  1. Install Anaconda3-2021.11-Linux-x86_64, Java 8 (openjdk-8-jdk), Spark 3.2.0, and spylon-kernel using the steps described in the attached file: spark installation instructions for Windows users.pdf
  2. Open a Jupyter notebook and run sc.version (or any other code)
  3. Observe that it is stuck at "Initializing scala interpreter..."
  4. Check the Ubuntu 18.04 terminal for the error (see Additional context below)

I used the following to set up the Spark environment variables (Spark is installed in /opt/spark, and my Python path is the Anaconda3 Python path):

echo "export SPARK_HOME=/opt/spark" >> ~/.profile
echo "export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin" >> ~/.profile
echo "export PYSPARK_PYTHON=/home/lai/anaconda3/bin/python" >> ~/.profile 
source ~/.profile
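
Since ~/.profile is only read by login shells, the Jupyter server may not inherit these variables unless it is started from a shell that has sourced them. As a minimal sanity check (nothing beyond the paths above is assumed), the variables can be verified in the terminal session that launches Jupyter:

# Confirm the exports are visible in the shell that will launch Jupyter
echo "SPARK_HOME=$SPARK_HOME"
echo "PYSPARK_PYTHON=$PYSPARK_PYTHON"
which spark-submit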

Expected behavior
I expect the Scala interpreter to initialize without problems, and sc.version should output 3.2.0.

Screenshots
A screenshot from the Jupyter notebook is attached, showing the cell stuck at "Initializing scala interpreter...".

The error from the Ubuntu terminal is included in the Additional context section.

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Chrome (for jupyter notebook)
  • Version: Java 8, Python 3.9, Spark 3.2.0, Hadoop 3.2

Additional context
The full error traceback from the terminal is as follows:

[MetaKernelApp] ERROR | Exception in message handler:
Traceback (most recent call last):
  File "/home/lai/anaconda3/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 353, in dispatch_shell
    await result
  File "/home/lai/anaconda3/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 643, in execute_request
    reply_content = self.do_execute(
  File "/home/lai/anaconda3/lib/python3.9/site-packages/metakernel/_metakernel.py", line 397, in do_execute
    retval = self.do_execute_direct(code)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_kernel.py", line 141, in do_execute_direct
    res = self._scalamagic.eval(code.strip(), raw=False)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_magic.py", line 157, in eval
    intp = self._get_scala_interpreter()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_magic.py", line 46, in _get_scala_interpreter
    self._interp = get_scala_interpreter()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 568, in get_scala_interpreter
    scala_intp = initialize_scala_interpreter()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 163, in initialize_scala_interpreter
    spark_session, spark_jvm_helpers, spark_jvm_proc = init_spark()
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 99, in init_spark
    spark_context = conf.spark_context(application_name)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon/spark/launcher.py", line 521, in spark_context
    return pyspark.SparkContext(appName=application_name, conf=spark_conf)
  File "/opt/spark/python/pyspark/context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/opt/spark/python/pyspark/context.py", line 339, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/opt/spark/python/pyspark/java_gateway.py", line 98, in launch_gateway
    proc = Popen(command, **popen_kwargs)
  File "/home/lai/anaconda3/lib/python3.9/site-packages/spylon_kernel/scala_interpreter.py", line 94, in Popen
    spark_jvm_proc = subprocess.Popen(*args, **kwargs)
  File "/home/lai/anaconda3/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/lai/anaconda3/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/opt/spark/python/pyspark/./bin/spark-submit'
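
The path in the last line suggests that pyspark could not find SPARK_HOME in the kernel's environment and fell back to its own package directory (/opt/spark/python/pyspark), which contains no bin/spark-submit. If that is the cause, a possible workaround (a sketch only, assuming /opt/spark is the correct install root) is to export the variables in the same shell session and launch Jupyter from it, so the kernel process inherits them:

# Workaround sketch: export in the current shell, then start Jupyter
# from that same shell so the spylon-kernel process inherits the values.
export SPARK_HOME=/opt/spark
export PYSPARK_PYTHON=/home/lai/anaconda3/bin/python
jupyter notebook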

Thanks for your help!
