martinprobson/spark-parallel-execution

Spark Parallel Execution

Experiments in executing Spark queries in parallel.

Three methods are used here:

ParallelWithFutures

To pause the code so that the Spark UI can be viewed, uncomment the following line:

    // Uncomment the following line to pause the code and allow the Spark UI to be viewed
    //scala.io.StdIn.readLine()
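As a rough illustration of the Futures approach, the sketch below submits several queries to a fixed-size thread pool and waits for all of them. It is not the repository's code: `runQuery`, `runAll` and the `parallelism` parameter are hypothetical names, and `runQuery` only sleeps in place of a real Spark action so that the example runs with the Scala standard library alone.

```scala
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object ParallelWithFuturesSketch {

  // Hypothetical stand-in for a Spark action such as spark.sql(...).count();
  // it only sleeps so that the sketch runs without a Spark dependency.
  def runQuery(id: Int): String = {
    Thread.sleep(100)
    s"query-$id done"
  }

  // Submit every query as a Future on a fixed-size pool, then wait for all results.
  // The pool size caps how many queries are in flight at once.
  def runAll(ids: Seq[Int], parallelism: Int): Seq[String] = {
    val pool = Executors.newFixedThreadPool(parallelism)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val futures = ids.map(id => Future(runQuery(id)))
      // Future.sequence keeps the results in the same order as the inputs.
      Await.result(Future.sequence(futures), 1.minute)
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit =
    runAll(1 to 10, parallelism = 4).foreach(println)
}
```

With a real SparkSession, each Future would trigger its own Spark job, and the jobs submitted concurrently should appear side by side in the Spark UI.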

ParallelWithCats/ParallelWithZIO

Note: Use the closeSparkSessionWithPause method to keep the Spark UI active.

Example Spark Job Output with parTraverseN(1)

Screenshot

Example Spark Job Output with parTraverseN(10)

Screenshot
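The two outputs above differ only in the argument passed to parTraverseN, which bounds how many effects run at the same time. A minimal sketch of that idea, assuming Cats Effect 3 is on the classpath, might look like the following. `runQuery` and `runAll` are hypothetical names, and the sleep stands in for a real Spark action.

```scala
import cats.effect.{IO, IOApp}
import cats.effect.implicits._
import cats.syntax.all._
import scala.concurrent.duration._

object ParTraverseNSketch extends IOApp.Simple {

  // Hypothetical stand-in for a Spark action wrapped in IO.
  def runQuery(id: Int): IO[String] =
    IO.sleep(100.millis).as(s"query-$id done")

  // parTraverseN(n) runs at most n of the effects concurrently:
  // n = 1 is effectively sequential, n = 10 allows up to ten at once.
  def runAll(n: Int): IO[List[String]] =
    (1 to 10).toList.parTraverseN(n)(id => runQuery(id))

  def run: IO[Unit] =
    runAll(10).flatMap(results => results.traverse_(r => IO.println(r)))
}
```

Changing `runAll(10)` to `runAll(1)` should give the one-job-at-a-time behaviour of the first example.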

Notes

Running on Java 17+

Add the following to the Java options when running on Java 17 or later:

--add-exports java.base/sun.nio.ch=ALL-UNNAMED
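If the project is run through sbt (an assumption; the build tool is not stated here), one way to pass this option is a pair of settings like the ones below. This is only a sketch. sbt applies `javaOptions` only to forked JVMs, so `fork` has to be enabled as well.

```scala
// build.sbt (sketch): fork a separate JVM so that javaOptions take effect on `sbt run`
run / fork := true
run / javaOptions += "--add-exports=java.base/sun.nio.ch=ALL-UNNAMED"
```

When launching with the `java` command directly, the option can be placed on the command line before the main class or `-jar` argument.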

ToDo

  1. Lift the SparkEnv config into a ZIO layer as well. See the zio.config package.
