Why Use a Cluster Manager

The primary advantage of Spark Standalone is that it is easy to configure, so you can get a cluster up and running quickly with minimal effort. Its primary limitation is that it provides no way to share the cluster's physical resources with non-Spark applications.

Suppose several groups share one cluster: some run Hive, some use HBase or Impala, and others run Spark. In that case you need a cluster manager that can dynamically distribute resources among these applications. Without one, resource allocation has to be managed by hand for each application, which quickly becomes cumbersome and complex.
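In practice, the choice of cluster manager shows up in the master URL that a Spark application is given. The sketch below is only an illustration of that mapping, not part of Spark itself: the helper names `cluster_manager_for` and `shares_with_non_spark` are hypothetical, and the set of prefixes covers only the common master URL forms (`local`, `spark://`, `yarn`, `mesos://`, `k8s://`).

```python
# Hypothetical helpers for illustration; Spark does not ship these functions.

# Managers that can also schedule non-Spark workloads on the same cluster.
GENERAL_PURPOSE_MANAGERS = {"yarn", "mesos", "kubernetes"}


def cluster_manager_for(master_url: str) -> str:
    """Return the cluster manager implied by a Spark master URL."""
    if master_url.startswith("local"):
        # e.g. "local", "local[4]", "local[*]": everything runs in one JVM.
        return "local"
    if master_url.startswith("spark://"):
        # e.g. "spark://host:7077": Spark Standalone master.
        return "standalone"
    if master_url == "yarn":
        # Cluster location comes from the Hadoop configuration, not the URL.
        return "yarn"
    if master_url.startswith("mesos://"):
        return "mesos"
    if master_url.startswith("k8s://"):
        return "kubernetes"
    raise ValueError(f"unrecognized master URL: {master_url!r}")


def shares_with_non_spark(master_url: str) -> bool:
    """True if the implied manager can share resources with non-Spark apps."""
    return cluster_manager_for(master_url) in GENERAL_PURPOSE_MANAGERS


if __name__ == "__main__":
    for url in ["spark://master:7077", "yarn", "mesos://master:5050"]:
        print(url, cluster_manager_for(url), shares_with_non_spark(url))
```

A Standalone URL such as `spark://master:7077` maps to a manager that serves Spark only, whereas `yarn` maps to one that can also arbitrate resources for workloads like Hive or HBase on the same machines.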


