Over the past fifteen years Java has been a phenomenally popular programming language, but it is starting to show its age and programmers are increasingly looking at more modern languages. The purpose of this article is to explain why Scala is the most likely successor to Java and how it can make you more productive. Rather than simply listing the features that Scala has, I’ve included a number of comparisons between Java and Scala code, to demonstrate how the different Scala language features enable you to implement the same functionality more quickly in Scala than Java.
Since Scala compiles to Java bytecode and runs on the JVM, programs written in Scala can benefit from the huge amount of library code already written in Java. However, by using Scala you get the following benefits:
- Mandatory boilerplate code is gone – no getters and setters, no checked exceptions.
- More powerful constructs that allow you to do more with less code, such as case classes, Option and tuples.
- More powerful code reuse – the elements of code that you can reuse are smaller. Rather than classes with single inheritance, you have traits and functions.
I’ll work through these points in turn, giving examples of each.
No getters and setters
In Java, a class to represent a person with a name and age might be:
public class Person {
    private String firstName = null;
    private String lastName = null;
    private int age = 0;

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }
    public String getFirstName() {
        return firstName;
    }
    public void setLastName(String lastName) {
        this.lastName = lastName;
    }
    public String getLastName() {
        return lastName;
    }
    // The age field needs its own accessors too
    public void setAge(int age) {
        this.age = age;
    }
    public int getAge() {
        return age;
    }
}
In Scala, most likely you would write this class as:
class Person {
  var firstName = ""
  var lastName = ""
  var age = 0
}
The variables in this class are public. If you come from a Java background this sounds worrying - doesn't it mean that if we ever need getters and setters with additional code in them we'll have to change all of the code that uses this class? Not in Scala. This is because Scala has a very flexible method syntax, so we can write a method that looks the same as accessing the variables directly. An example would be:
class Person {
  var firstName = ""
  var lastName = ""
  private var theAge = 0
  // Getter: called as p.age
  def age: Int = theAge
  // Setter: called as p.age = value; non-positive ages are ignored
  def age_=(newAge: Int): Unit = {
    if (newAge > 0) theAge = newAge
  }
}
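In this example, I've renamed the variable to theAge and made it private, but written a getter and setter. The getter method is called age so you can write p.age to get the age, just like before. The setter is called age_=. The trailing _= is a naming convention that the compiler recognises: when a class defines both age and age_=, an assignment to p.age is rewritten as a call to p.age_=. (The underscore is there because an identifier cannot otherwise mix letters with operator characters such as =.) This means that when you write: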
val p = new Person()
p.age = 33
you are actually invoking the new setter method.
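To make the rewriting concrete, here is a minimal runnable sketch of the same class together with a caller. The PersonDemo object and the values used are only for illustration.

```scala
class Person {
  private var theAge = 0
  def age: Int = theAge
  def age_=(newAge: Int): Unit = {
    if (newAge > 0) theAge = newAge // silently ignore non-positive ages
  }
}

object PersonDemo {
  def main(args: Array[String]): Unit = {
    val p = new Person
    p.age = 33     // compiled as p.age_=(33)
    p.age_=(34)    // the explicit form of the same call
    p.age = -5     // goes through the setter, which rejects it
    println(p.age) // the age is still 34
  }
}
```

Code that was written against the original public var keeps compiling unchanged, because the call site looks identical in both versions.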
No checked exceptions
When Java was invented, it seemed like a good idea to force developers to deal with certain possible error conditions, which led to the concept of checked exceptions. Scala has removed these. If you want to catch an exception, you can do so, but you're not forced to insert try/catch statements throughout your code if you don't want to.
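As a small sketch of what this looks like in practice, the example below calls java.text.SimpleDateFormat.parse, which declares the checked ParseException in Java. The DateParsing object and its method names are made up for illustration. The first method simply lets the exception propagate, and the second shows a caller choosing to handle it.

```scala
import java.text.{ParseException, SimpleDateFormat}
import java.util.Date

object DateParsing {
  // In Java this method would need "throws ParseException" or a try/catch.
  // In Scala neither is required; the exception just propagates.
  def parseDate(s: String): Date =
    new SimpleDateFormat("yyyy-MM-dd").parse(s)

  // A caller that does want to handle the failure can still catch it.
  def parseDateOption(s: String): Option[Date] =
    try Some(parseDate(s))
    catch { case _: ParseException => None }
}
```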
Case classes
Case classes, combined with pattern matching, are like an enhanced version of the Java switch statement. They are small classes, usually defined together in a single source file, that the compiler equips for pattern matching. Unlike switch, a match expression can distinguish between different object types and extract data from them. Consider a scenario in which you are iterating over a tree structure that represents an organisation chart for a company. The nodes in the tree are either of type Group or Employee. If you find a Group node, you want to print out the name of the group and the size. If you find an employee, you want to print out their name and job title. In Java, your code would look something like:
if (node instanceof Group) {
    Group g = (Group)node;
    System.out.println("Group name: " + g.getName() + " Size: " + g.getSize());
}
else if (node instanceof Employee) {
    Employee e = (Employee)node;
    System.out.println("Name: " + e.getName() + " Job: " + e.getJob());
}
In Scala this would be:
node match {
  case g: Group => println("Group name: " + g.name + " Size: " + g.size)
  case e: Employee => println("Name: " + e.name + " Job: " + e.job)
}
In this example I've used a "typed pattern" match, which avoids the type casts required in Java. If this were the only thing pattern matching could do, it wouldn't be that impressive, but it can do much more. There are several different sorts of pattern matching, the most powerful of which is probably a "constructor pattern". By matching against the constructor for a class, you can nest additional pattern matches against the values that have been passed into that constructor. These patterns can themselves be constructor matches, allowing you to match as deeply as you want. Continuing the example above, suppose that in addition to a manager, some groups also have a project manager. You want to find all groups that have a manager who is in salary band 10 and a project manager who is in salary band 9. In Java you'll need something like:
if (node instanceof Group) {
    Group g = (Group)node;
    Manager m = g.getManager();
    ProjectManager pm = g.getProjectManager();
    if (m.getJobBand() == 10 && pm != null && pm.getJobBand() == 9) {
        System.out.println("Group: " + g.getName());
    }
}
In Scala, with the appropriate case classes, this would be:
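node match {
  // g @ binds the whole matched Group so that we can still use it
  case g @ Group(Manager(10), ProjectManager(9)) =>
    println("Group: " + g.name)
  case _ => // any other node: do nothing
}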
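To show what "the appropriate case classes" might look like, here is a self-contained sketch covering both the typed pattern and the constructor pattern. The class shapes and field names are assumptions made for this example, and the optional project manager is modelled as an Option rather than a nullable reference.

```scala
// Hypothetical org-chart model; the fields are assumptions for this sketch.
sealed trait Node
case class Manager(jobBand: Int)
case class ProjectManager(jobBand: Int)
case class Employee(name: String, job: String) extends Node
case class Group(
    name: String,
    size: Int,
    manager: Manager,
    projectManager: Option[ProjectManager] // not every group has one
) extends Node

object OrgChart {
  // Typed patterns: pick a branch by runtime type, with no casts.
  def describe(node: Node): String = node match {
    case g: Group    => "Group name: " + g.name + " Size: " + g.size
    case e: Employee => "Name: " + e.name + " Job: " + e.job
  }

  // Constructor patterns: nested patterns look inside the Group, its
  // Manager and its optional ProjectManager in a single case.
  def hasSeniorLeads(node: Node): Boolean = node match {
    case Group(_, _, Manager(10), Some(ProjectManager(9))) => true
    case _                                                 => false
  }
}
```

Because Node is sealed, the compiler can warn if describe forgets one of the node types, which is something the chain of instanceof checks cannot offer.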
Option
In Java, it can be painful having to perform a != null check each time you get a variable that might be null. For example:
List<Item> items = shoppingBasket.getItems();
if (items != null) {
    for (Item i : items) {
        // process each item
    }
}
else {
    System.out.println("No items in shopping basket.");
}
The Scala standard library provides a class called Option, which has two subtypes, Some and None. Some is a container that wraps a value of whatever class you are using, and None represents the absence of a value. The basic pattern is that methods that could return null in Java return an Option, which will be either Some or None. Then calling code can use pattern matching on the returned value:
val i = shoppingBasket.items
i match {
  case Some(items) => items.foreach { item => /* process each item */ }
  case None => println("No items in shopping basket")
}
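With the Java code, you can forget to insert the != null check, which can then lead to a NullPointerException at runtime. With the Scala code, the return type itself tells the caller that the value may be missing, and the items cannot be reached without first unwrapping the Option, so the empty case is much harder to overlook.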
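A runnable version of this pattern might look like the sketch below. Item, ShoppingBasket and Checkout are hypothetical names, and the basket is assumed to hold None until something has been added to it.

```scala
case class Item(name: String)

// Hypothetical basket: items is None when nothing has been added yet.
class ShoppingBasket(val items: Option[List[Item]])

object Checkout {
  // Pattern matching forces both cases to be spelled out.
  def summary(basket: ShoppingBasket): String = basket.items match {
    case Some(items) => items.map(_.name).mkString(", ")
    case None        => "No items in shopping basket"
  }

  // The same idea using Option's own methods instead of a match.
  def itemCount(basket: ShoppingBasket): Int =
    basket.items.map(_.size).getOrElse(0)
}
```

For simple cases like itemCount, map and getOrElse are often shorter than a full match expression.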
Tuples
How many times have you written a method in Java, only to find that you really want to return two things from the method, not one? In Java the standard way to fix this is to create a small class that just contains the return values, but then you are bloating your code by having a class when all you really need to do is specify that the method returns multiple things. Scala has exactly this concept with tuples. A tuple is common in functional languages and is simply a fixed-size group of values whose types can differ. It is written using parentheses, so a tuple composed of the integer 5 and string "hello" would be written:
(5,"hello")
If you want to return multiple values from a method, you simply pass them back as a tuple like this.
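As a short sketch, the hypothetical minMax method below returns two values at once, and the caller unpacks them in a single line.

```scala
object Stats {
  // Returns the smallest and largest value as an (Int, Int) tuple.
  // Assumes xs is non-empty.
  def minMax(xs: List[Int]): (Int, Int) = (xs.min, xs.max)

  def main(args: Array[String]): Unit = {
    // Unpack both results directly into two named values.
    val (lowest, highest) = minMax(List(3, 9, 1))
    println("lowest = " + lowest + ", highest = " + highest)

    // Elements can also be read by position.
    val pair = (5, "hello")
    println(pair._1 + " " + pair._2)
  }
}
```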
Traits
In an effort to avoid the problems of multiple inheritance as it was defined in C++, Java eschewed multiple inheritance entirely. This can make reusing code from two places very difficult. In Scala you can only inherit from a single class but you can also mix in as many "traits" as you want. A "trait" can be thought of as similar to an abstract class in Java, except that a class can mix in several of them.
class MyQueue extends BasicIntQueue with Incrementing with Filtering
In C++, the above sort of statement could result in ambiguity as to which method to invoke. If Incrementing and Filtering both inherit from the same base class A, you have the "diamond problem" whereby there are two instances of class A. C++ addresses this by giving you the virtual keyword which ensures that there is only a single instance of A. In Scala, traits can extend other traits or classes, but Scala always has a defined order in which methods must be invoked, by using a linearization algorithm (similar to other languages such as Python). This means you get the code reuse benefits of inheriting from multiple places without the problems caused by non-virtual inheritance.
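The class declaration above follows the queue example from "Programming in Scala". The definitions below are a sketch of what the surrounding classes and traits could look like, so treat the method bodies as assumptions rather than the book's exact code.

```scala
import scala.collection.mutable.ArrayBuffer

abstract class IntQueue {
  def get(): Int
  def put(x: Int): Unit
}

class BasicIntQueue extends IntQueue {
  private val buf = new ArrayBuffer[Int]
  def get(): Int = buf.remove(0)
  def put(x: Int): Unit = { buf += x }
}

// Each trait modifies put and then delegates to whatever comes next
// in the linearization order via super.
trait Incrementing extends IntQueue {
  abstract override def put(x: Int): Unit = super.put(x + 1)
}

trait Filtering extends IntQueue {
  abstract override def put(x: Int): Unit = if (x >= 0) super.put(x)
}

// Linearization runs right to left: Filtering, then Incrementing,
// then BasicIntQueue.
class MyQueue extends BasicIntQueue with Incrementing with Filtering
```

With this ordering, negative numbers are dropped first and the survivors are then incremented. Swapping the two traits in the declaration would increment first and filter afterwards, so the order of the mix-ins is part of the behaviour.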
Functions as closures
In Java, if you want to allow callers to pass code into your class to be invoked, you have to create an interface or concrete class before writing a callback method. Suppose that you have a class which contains a collection of Person objects. You want to write a method that will iterate over all of the Person objects and run some code that has been passed in, which will produce a summary of each Person as a String. In Java you would first have to declare an interface:
public interface PersonSummariser {
    public String summarise(Person p);
}
Then you can write your callback method, specifying that code to be passed in must implement this interface:
public void summarisePeople(PersonSummariser summariser)
You've been forced to write an interface, and the person using your class has been forced to create a class (at best they might be able to create an anonymous class so they don't need a full class definition), just to pass in code that could be as short as a single line. In Scala, this would be handled by passing in a function, often written as a closure. A closure is a function value that captures ("closes over") any variables it uses from the scope in which it was defined, so it can be handed to other code and invoked later with those variables still bound. The Scala method definition would be:
def summarisePeople(s : Person => String)
Here we have written a method that accepts a function s, which takes a single parameter of type Person, and returns a value of type String. No need to create any additional interfaces or classes.
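Here is a minimal sketch of such a method in use. The Team class, the immutable Person case class and the List[String] return type are assumptions for this example, since the signature above does not say what the method returns.

```scala
// Hypothetical, simplified Person for this sketch.
case class Person(firstName: String, lastName: String, age: Int)

class Team(people: List[Person]) {
  // Applies the supplied function to every person in the team.
  def summarisePeople(s: Person => String): List[String] = people.map(s)
}

object TeamDemo {
  def main(args: Array[String]): Unit = {
    val team = new Team(List(Person("Ada", "Lovelace", 36)))

    // An anonymous function passed inline.
    println(team.summarisePeople(p => p.firstName + " " + p.lastName))

    // A closure: the function captures the local value suffix.
    val suffix = " (staff)"
    println(team.summarisePeople(p => p.lastName + suffix))
  }
}
```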
Closures are used extensively to perform operations on collections in functional languages. Here are just a few of the methods Scala provides in its collection classes which allow you to pass in a function to perform various operations:
- map - transform a collection of type A to another collection of type B by applying a function to each element
- filter - reduce the collection by filtering out all elements that don't meet a boolean condition
- foldLeft - combine the elements of the collection into a single result by applying a function to a running total and each element in turn, e.g. sum the squares of every integer in a list
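The three methods can be sketched on a small list as follows. The CollectionOps object and its values are purely illustrative.

```scala
object CollectionOps {
  val numbers = List(1, 2, 3, 4)

  // map: one output element per input element, possibly of another type.
  val squares: List[Int] = numbers.map(n => n * n)
  val labels: List[String] = numbers.map(n => "#" + n)

  // filter: keep only the elements for which the function returns true.
  val evens: List[Int] = numbers.filter(n => n % 2 == 0)

  // foldLeft: start from 0 and add the square of each element in turn.
  val sumOfSquares: Int = numbers.foldLeft(0)((acc, n) => acc + n * n)
}
```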
Standalone functions
In the example above, if you declare the Scala summarise function inline, you are creating an anonymous function (and a closure, if it refers to variables from the surrounding scope). But functions are first-class entities in Scala, so you can define them independently and reuse them wherever you want. Suppose you had multiple classes holding Person objects, such as OrganisationChart, Company, Team and so on. If you wanted to define a function to summarise Person objects that you could pass into any method with the same signature, you could do so, anywhere in your code:
def summarise(p : Person) : String = p.firstName + " " + p.lastName
In fact, you don't need to declare that the return type on the above method is String, as the Scala compiler will infer it, but I added it for clarity. A class is no longer the smallest element of reuse you have: you can define individual functions and pass them around as you wish.
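As a sketch of that reuse, the same summarise definition is passed to two unrelated holder classes below. Team, Company and the Summaries object are hypothetical, and Person is simplified to a case class.

```scala
case class Person(firstName: String, lastName: String)

object Summaries {
  // Defined once, independently of any class that holds people.
  def summarise(p: Person): String = p.firstName + " " + p.lastName
}

class Team(people: List[Person]) {
  def summarisePeople(s: Person => String): List[String] = people.map(s)
}

class Company(departments: List[List[Person]]) {
  def summarisePeople(s: Person => String): List[String] =
    departments.flatten.map(s)
}

object ReuseDemo {
  def main(args: Array[String]): Unit = {
    val ada = Person("Ada", "Lovelace")
    val alan = Person("Alan", "Turing")
    // The method is converted to a function value at each call site.
    println(new Team(List(ada)).summarisePeople(Summaries.summarise))
    println(new Company(List(List(ada), List(alan))).summarisePeople(Summaries.summarise))
  }
}
```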
Currying
Suppose you're writing code that calculates economic statistics for countries. You have a function that takes a population size and a GDP value. What if you wanted to invoke this multiple times with a fixed population size but differing GDP values? You might expect to have to repeat the first argument whenever you use the function:
calculateStats(pop1, 2000)
calculateStats(pop1, 10000)
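In fact, you can create a new version of the function in which all but one of the parameters have already been supplied. Strictly speaking this is called partial application; currying is the closely related technique of rewriting a function of several parameters as a chain of functions that each take one parameter, which makes this kind of partial application natural: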
val cs = calculateStats(pop1, _ : Double)
Here we have supplied a value for the first parameter, but used the underscore to show that we're not supplying a value for the second parameter. You can then invoke this new "cs" function to calculate statistics specifically for countries of a specified size. At first this might not seem that powerful - surely we're just saving ourselves a bit of typing? However, consider that in Scala, a method parameter doesn't have to be a simple object, it can itself be a function. This makes it very easy to write code that is both powerful and flexible. You can write functions that perform specific tasks and combine them however you want.
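The sketch below puts the two styles side by side. The body of calculateStats (GDP per head) and the value of pop1 are assumptions, since the article does not say what the function computes.

```scala
object EconomicStats {
  // Hypothetical statistic: GDP per head of population.
  def calculateStats(population: Long, gdp: Double): Double =
    gdp / population

  // The same calculation written in curried form, with one
  // parameter list per argument.
  def calculateStatsCurried(population: Long)(gdp: Double): Double =
    gdp / population

  val pop1 = 1000L

  // Partial application: fix the population, leave the GDP open.
  val cs = calculateStats(pop1, _: Double)

  // With the curried form, supplying only the first parameter list
  // yields a function of the remaining parameter.
  val csCurried: Double => Double = calculateStatsCurried(pop1)
}
```

Both cs and csCurried are ordinary function values of type Double => Double, so either can be passed to methods such as map on a list of GDP figures.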
Summary
If you haven't used Scala before, hopefully this article has persuaded you that it's worth investigating. We've seen that:
- It doesn't require all of the boilerplate code that is needed in Java, such as getters, setters and checked exceptions.
- It has powerful constructs that allow you to do more with less code, such as case classes and tuples.
- It gives you better code reuse with traits and functions.
It's worthwhile explaining why I haven't mentioned a couple of things that Scala is known for - actors and parser combinators. Actors are a powerful mechanism for multi-threaded programming that avoids some of the problems of locks. Parser combinators are a way of writing language parsers by combining lots of small parsers, rather than writing (or more likely generating) a single parser from a BNF grammar. Whilst both of these topics are interesting, I'm not sure either of them is necessarily indicative of the power of the Scala language. Each of them can be implemented in Java using an appropriate library - Kilim for actors and jparsec for parser combinators. By contrast, the topics I've covered above show things that have to be implemented within a language itself and cannot be provided by library code.
Hang on - what about languages like Ruby, Groovy or Clojure?
All of these languages are good, powerful languages that can make you more productive. However, you can't necessarily learn all four. Why should you choose Scala over the others?
The feature set of Ruby, Groovy and Scala is broadly the same. They have all done away with getters and setters, and checked exceptions. They all have more functional concepts than Java, such as closures and first class functions. They all offer a form of multiple inheritance via traits (mixins). However, both Ruby and Groovy are dynamically typed scripting languages, so type errors only surface at runtime. Whilst they are good for small tasks such as automation, they don't lend themselves to constructing large enterprise applications as well as Scala and Clojure. Scala has a very powerful type system and compiler so many bugs can be found at compile time. Scala is a hybrid of object oriented and functional concepts, so its syntax is broadly object oriented, whereas Clojure is a Lisp variant and hence uses Lisp's parenthesised prefix notation, which is a very different syntax. Finally, Scala does have some concepts which don't really appear in the other languages, of which the most obvious example is case classes, which offer a very powerful syntax for matching objects by type and extracting data from them.
Okay, you got me. How do I find out more about Scala?
If you want a comprehensive overview of the entire language, the first edition of the book "Programming in Scala" is available free online:
Programming in Scala
In particular, some of the topics I've mentioned above are:
Traits
Case classes
The Eclipse Scala IDE is available from:
http://scala-ide.org/
Daniel Spiewak's blog has numerous good posts on Scala, such as:
Function currying in Scala
The Option pattern