Why Use a Cluster Manager

The primary advantage of Spark Standalone is that it is easy to configure, so you can get a cluster up and running quickly with minimal effort. Its primary limitation is that it provides no way to share the cluster's physical resources with non-Spark applications.

Suppose multiple groups share a cluster: some run Hive, some use HBase or Impala, and others run Spark. In that case you need a cluster manager that can dynamically distribute resources among these applications. Otherwise, managing resource allocation among them becomes cumbersome and complex.
