Notes
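Every Spark application consists of a single driver process and one or more executor processes. The executor processes run on the worker nodes of the cluster. Where the driver runs depends on the deploy mode: in cluster mode it runs on a node inside the cluster, and in client mode it runs on the machine that submitted the application (often the master node).
Transformations are lazy: they only describe the computation, and the actual work runs on the executors once an action is called. An action triggers that work and returns its result to the driver, which is why the final value ends up on the driver.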
From Spark in Action:
You can access an accumulator’s value only from within the driver. If you try to access it from an executor, an exception will be thrown.
How should the DAG shown in the Spark UI be read?
reduceByKey to find averages: the solution uses tuples; understand how they are used.
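One way to see why the averaging solution needs tuples: an average cannot be merged pairwise, but a running (sum, count) pair can. The sketch below is plain Python, not Spark, and only imitates the per-key merging that reduceByKey performs. The names combine, average_by_key, and scores are made up for illustration. In PySpark the same combine function would typically be passed to rdd.mapValues(...).reduceByKey(combine).mapValues(...).

```python
def combine(a, b):
    """Merge two (sum, count) tuples.

    This merge is associative and commutative, which is what
    reduceByKey needs, since partial results can arrive in any order.
    """
    return (a[0] + b[0], a[1] + b[1])


def average_by_key(pairs):
    """Plain-Python stand-in for mapValues -> reduceByKey -> mapValues."""
    # Step 1 (mapValues in Spark): wrap each value as (value, 1).
    wrapped = [(key, (value, 1)) for key, value in pairs]

    # Step 2 (reduceByKey in Spark): merge the tuples that share a key.
    totals = {}
    for key, pair in wrapped:
        totals[key] = combine(totals[key], pair) if key in totals else pair

    # Step 3 (mapValues in Spark): divide each sum by its count.
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical sample data: (key, score) pairs.
scores = [("a", 10), ("b", 4), ("a", 20), ("b", 6), ("a", 30)]
averages = average_by_key(scores)
print(averages)
```

Carrying the count alongside the sum is the whole trick: reducing the raw values would only give a sum, and reducing averages directly would give wrong answers whenever the partial groups differ in size.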