collect, take, top, and first Actions

Whereas the count action returns a count of data elements in the RDD, the collect, take, top, and first actions return data from the RDD.

collect()

Syntax:

RDD.collect()

The collect action returns all of the elements in an RDD to the Spark driver as a list. Because collect does not limit the size of its output, a large RDD can cause out-of-memory errors on the driver; the action is therefore typically useful only for small RDDs or during development.
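
One way to guard against the out-of-memory risk described above is to check the RDD's size before collecting it. The following is a minimal sketch, not part of the Spark API: the helper name collect_if_small and its max_elements parameter are hypothetical, and the code assumes only that the object passed in provides the standard RDD take(n) and collect() methods.

```python
def collect_if_small(rdd, max_elements):
    """Return every element of `rdd` as a list, but refuse if the RDD
    holds more than `max_elements` elements.

    Hypothetical helper for illustration; `rdd` is assumed to expose the
    standard RDD methods take(n) and collect().
    """
    # take(n) returns at most n elements to the driver, so asking for one
    # element beyond the limit reveals whether the RDD is too large
    # without pulling all of it back.
    probe = rdd.take(max_elements + 1)
    if len(probe) > max_elements:
        raise ValueError(
            "RDD holds more than %d elements; refusing to collect" % max_elements
        )
    # The RDD is small enough: collect() returns all elements as a list.
    return rdd.collect()
```

Given an existing SparkContext named sc, a call such as collect_if_small(sc.parallelize(range(5)), 10) would return the five elements as a list, while a limit smaller than the RDD's size would raise ValueError instead of collecting. Note that the probe runs a separate Spark job, so the guard costs a little extra work on each call.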
