first

RDD.first()

The first action returns the first element in this RDD. Like take and collect, and unlike top, first does not sort the elements: it returns whichever element comes first in the RDD's current partition layout. The result can therefore be non-deterministic, particularly in fully distributed environments.

As you can see from Listing 10.13, the primary difference between first and take(1) is that first returns an atomic data element, whereas take (even if n = 1) returns a list of data elements. The first action is useful for inspecting the output of an RDD during development or data exploration.
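The difference in return types, and the dependence on element order, can be sketched without a cluster. The following is a plain-Python model, not Spark code: the partitions list and the helper functions first, take, and top are hypothetical stand-ins for an RDD and its actions, assuming that elements are scanned partition by partition in order.

```python
from heapq import nlargest

# Hypothetical stand-in for an RDD: a list of partitions, each a list of elements.
partitions = [[], [3, 1], [2, 5]]

def take(parts, n):
    """Return up to n elements as a list, scanning partitions in order."""
    out = []
    for part in parts:
        for item in part:
            if len(out) == n:
                return out
            out.append(item)
    return out

def first(parts):
    """Return a single element (not a list): the first one take finds."""
    found = take(parts, 1)
    if not found:
        # Assumption: the model treats an empty dataset as an error.
        raise ValueError("RDD is empty")
    return found[0]

def top(parts, n):
    """Return the n largest elements in descending order, whatever the layout."""
    return nlargest(n, (item for part in parts for item in part))

first_result = first(partitions)          # a bare element
take_result = take(partitions, 1)         # a one-element list
top_result = top(partitions, 1)           # sorted, so layout does not matter
reordered = first([[2, 5], [], [3, 1]])   # same data, different layout
```

In this model, first and take(1) change their answer when the same data is laid out differently, while top does not. With PySpark the corresponding calls would be rdd.first(), rdd.take(1), and rdd.top(1).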
