# SparkContext and SparkSession

## SparkContext

SparkContext is the first object a Spark program must create in order to access the cluster. In spark-shell, it is directly accessible via `spark.sparkContext`.

Here's how you can programmatically create SparkContext in your Scala code:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

val conf = new SparkConf().setAppName("my app").setMaster("master url")
val sc = new SparkContext(conf)
```
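As a runnable sketch of the snippet above, the following uses `local[*]` as a stand-in master URL (it runs Spark in-process, using all local cores) and exercises the resulting SparkContext with a trivial RDD job:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkContextExample {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark in-process; in a real cluster you would pass a
    // master URL such as spark://host:7077, yarn, or k8s://...
    val conf = new SparkConf().setAppName("my app").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Distribute a small collection as an RDD and run a simple job on it.
    val rdd = sc.parallelize(Seq(1, 2, 3, 4))
    val total = rdd.map(_ * 2).reduce(_ + _)
    println(total) // 20

    sc.stop() // release cluster resources when done
  }
}
```

Calling `sc.stop()` at the end matters: only one active SparkContext is allowed per JVM.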

## SparkSession - Starting from Spark 2.x

SparkContext, though still supported, was more relevant when working directly with RDDs. As you will see, different libraries had their own wrappers around SparkContext: for example, HiveContext/SQLContext for Spark SQL, StreamingContext for Streaming, and so on.

As all the libraries are moving toward Dataset/DataFrame, it makes sense to have a unified entry point for all of them as well, and that is SparkSession. SparkSession is available as `spark` in the spark-shell.

Here's how you do it:

```scala
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder.master("master url").appName("my app").getOrCreate()
```
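To see the unified entry point in action, here is a small self-contained sketch (again using `local[*]` as a placeholder master URL) that builds a SparkSession, creates a DataFrame from a local collection, and shows that the underlying SparkContext is still reachable through the session:

```scala
import org.apache.spark.sql.SparkSession

object SparkSessionExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")   // placeholder; use your real master URL
      .appName("my app")
      .getOrCreate()

    // The implicits enable the toDF conversion on local Scala collections.
    import spark.implicits._
    val df = Seq(("alice", 1), ("bob", 2)).toDF("name", "id")
    println(df.count()) // 2

    // SparkSession wraps SparkContext, so RDD-level access is still available.
    println(spark.sparkContext.appName)

    spark.stop()
  }
}
```

Note that `getOrCreate()` returns an existing session if one is already active in the JVM, which is why it, rather than a constructor, is the idiomatic way to obtain a SparkSession.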

