Spark Deployment Modes
Please see chapters 10 and 11 of the book Spark in Action for a more detailed discussion.
Spark supports four deployment modes, each with its own characteristics with respect to where Spark's components run within a Spark cluster. Of these modes, local mode, which runs on a single host, is by far the simplest.
As a beginner or intermediate developer, you don't need to know this elaborate matrix right away. It is here for your reference, and the links provide additional information. Furthermore, Step 5 is a deep dive into all aspects of the Spark architecture.
Standalone mode uses a simple cluster manager included with Spark that makes it easy to set up a cluster.
In addition to running on the Mesos or YARN cluster managers, Spark also provides a simple standalone deploy mode. You can launch a standalone cluster either manually, by starting a master and workers by hand, or by using the launch scripts that ship with Spark. It is also possible to run these daemons on a single machine for testing.
You can start a standalone master server by executing:
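A minimal sketch, assuming Spark is installed and `SPARK_HOME` points at the installation directory:

```shell
# Start a standalone master daemon on the current machine.
# The script is shipped with Spark under sbin/.
$SPARK_HOME/sbin/start-master.sh
```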
Once started, the master will print out a spark://HOST:PORT URL for itself, which you can use to connect workers to it, or pass as the "master" argument to SparkContext. You can also find this URL on the master's web UI, which is http://localhost:8080 by default.
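For example, you could point an interactive Spark shell at the standalone master; the spark:// URL below is a placeholder for the one your master actually prints:

```shell
# Connect an interactive Spark shell to the standalone master.
# Replace the host and port with the URL printed by your master.
$SPARK_HOME/bin/spark-shell --master spark://127.0.0.1:7077
```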
Similarly, you can start one or more workers and connect them to the master via:
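A sketch of starting a single worker, again assuming `SPARK_HOME` is set and the master URL shown is a placeholder for the one your master printed:

```shell
# Start a worker daemon and register it with a running master.
# Replace the URL with the spark://HOST:PORT printed by your master.
$SPARK_HOME/sbin/start-worker.sh spark://127.0.0.1:7077
```

Note that older Spark releases (before 3.1) named this script `start-slave.sh`.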
Once you have started a worker, look at the master's web UI (http://localhost:8080 by default). You should see the new node listed there, along with its number of CPUs and amount of memory (minus one gigabyte left for the OS).
Finally, configuration options can be passed to the master and worker scripts; see the Spark standalone-mode documentation for the full list.
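As an illustration, a few of the options documented for the standalone scripts (the core and memory limits apply to workers only; the master URL below is a placeholder):

```shell
# Start a worker with explicit resource caps:
#   -c / --cores    total CPU cores the worker may allocate to applications
#   -m / --memory   total memory the worker may allocate (e.g. 4G)
#   --webui-port    port for the worker's own web UI
$SPARK_HOME/sbin/start-worker.sh spark://127.0.0.1:7077 \
  --cores 4 --memory 4G --webui-port 8082
```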