spark notes

How Many Partitions Does An RDD Have

Reference (Databricks Spark Knowledge Base): https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/performance_optimization/how_many_partitions_does_an_rdd_have.html


Last updated 6 years ago
