Scala is a relatively new language that runs on the JVM. The main difference between Scala and other “Object Oriented Languages” is that everything in Scala is an object. Types that are primitives in Java, such as int or boolean, are objects in Scala. Functions are treated as objects, too. As objects, they can be passed as arguments, which allows a functional programming approach to writing applications for Apache Spark.
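As a small illustration of the "everything is an object" point, the sketch below calls methods directly on number and boolean values and passes a function value as an argument. The names `twice` and `applyTo` are made up for this example.

```scala
// Operators on numbers are ordinary method calls on objects.
val sum = (1).+(2)            // same as 1 + 2
val text = 42.toString        // calling a method on an Int
val flag = true.&&(false)     // Boolean operators are methods too

// A function value is also an object; calling it invokes its apply method.
val twice: Int => Int = x => x * 2

// applyTo is a hypothetical helper that takes a function as an argument.
def applyTo(f: Int => Int, n: Int): Int = f(n)

val viaHelper = applyTo(twice, 21)
val viaApply = twice.apply(21)   // same as twice(21)
```

Both `viaHelper` and `viaApply` end up holding the same value, because `twice(21)` is just shorthand for `twice.apply(21)`.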

If you have programmed in Java or C#, you should feel right at home with Scala with very little effort.

You can also run or compile Scala programs from the command line or from IDEs such as Eclipse.

To learn and experiment with data, I prefer the interactivity of the Scala shell. Let’s launch the Spark shell as it is a fully capable Scala shell.

spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m


Values can be either mutable or immutable. Mutable values are expressed by the var keyword.

var a: Int = 5
a = a + 1

Immutable values are expressed by the val keyword.

val b: Int = 7
b = b + 1 // Error: reassignment to val

###Type Inference

Like Java, Scala is a strongly typed language. However, its type inference is good enough that programmers often do not need to specify types explicitly.

var c = 9

val d = 9.9

val e = "Hello"
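To see that inference still produces fixed static types, the following sketch assigns inferred values to explicitly typed ones. The names used here (`count`, `ratio`, `greeting` and so on) are chosen only for illustration.

```scala
val count = 9            // inferred as Int
val ratio = 9.9          // inferred as Double
val greeting = "Hello"   // inferred as String

// These type annotations compile only because the inferred types match.
val countAsInt: Int = count
val ratioAsDouble: Double = ratio
val greetingAsString: String = greeting

// Inference does not loosen the type system:
// val wrong: Int = greeting   // would not compile: type mismatch
```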


Functions are defined with the def keyword. In Scala, the last expression in the function body is returned without the need of the return keyword.

def cube(x: Int): Int = {
  val x2 = x * x * x
  x2 // the last expression is the return value
}


You can write the function more succinctly by leaving out the braces and the return type, since the return type can be easily inferred.

def cube(x: Int) = x*x*x

###Anonymous Functions

Anonymous functions can be assigned to a var or a val. They can also be passed to and returned from other functions.

val sqr: Int => Int = x => x * x

Or, anonymous functions can be further shortened to

val thrice: Int => Int = _ * 3

where _ is shorthand for the input argument.
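Passing anonymous functions to other functions, and returning them, might look like the sketch below. `applyTwice` and `multiplyBy` are hypothetical helpers written for this example, not part of any library.

```scala
// applyTwice takes a function as an argument and applies it two times.
def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

// multiplyBy returns a new anonymous function that captures n.
def multiplyBy(n: Int): Int => Int = _ * n

val triple = multiplyBy(3)
val nine = applyTwice(triple, 1)          // triple(triple(1))
val sixteen = applyTwice(x => x * x, 2)   // squares 2, then squares the result
```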


Scala has a very convenient set of collections, including Lists, Sets, Tuples, Iterators and Options. Combining these data structures with anonymous functions and closures makes for very expressive code.

val strs = Array("This", "is", "happening")

strs.reduce(_+" "+_)
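A few more of these collection types can be combined with anonymous functions and a closure, as in this rough sketch. The variable names are illustrative only.

```scala
val words = List("This", "is", "happening")

// map and filter take anonymous functions.
val lengths = words.map(_.length)
val longWords = words.filter(_.length > 2)

// A closure: the function passed to filter captures minLength.
val minLength = 5
val atLeastMin = words.filter(_.length >= minLength)

// Sets drop duplicates; tuples group values of different types.
val unique = Set(1, 2, 2, 3)
val pair = ("happening", 9)

// find returns an Option: Some(value) on a match, None otherwise.
val firstShort = words.find(_.length < 3)
val missing = words.find(_.length > 20)
val orDefault = missing.getOrElse("n/a")
```

Using `getOrElse` on the `Option` avoids a null check when nothing matches.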

This is far from comprehensive. To learn more visit

Saptak Sen

If you enjoyed this post, you should check out my book: Starting with Spark.
