50
CHAPTER 3
■ COLLECTIONS AND THE JOY OF IMMUTABILITY
Spreadsheets are far at the what end of the spectrum. Spreadsheets contain formulas
that define the relationship between cells (so we tell the computer what we want to do).
The order of evaluating the cells, the cells that are calculated based on changes in other
cells, and so on, are not specified by the user but are inferred by the spreadsheet engine
based on the relationships among the cells (the computer does the how part for us). In a C
program, one always thinks about changing memory. In a spreadsheet (which is a program;
Excel is the most popular programming language in the world), one thinks about altering
input (changing nonformula cells) and seeing the output (what is recalculated).
Java evolved from imperative roots and spouted mostly mutable data structures in
its standard libraries. The number of mutable classes in
java.util.* far outnumber the
immutable classes. By default, variables in Java are mutable, and you have to explicitly
declare them as
final if you want them to be assign-once. Despite that I’ve written a
couple of commercial spreadsheets,
1
and should have understood the benefits of the
functional “what” approach, until I spent a lot of time with Scala, I did not think about
immutable data structures. I thought about flow of control. It took over a year of practicing
immutability and avoiding flow-of-control imperatives in my code before I really grokked
immutability and what-oriented coding. So, why doesn’t Java have more immutable data
structures? Because it’s not obvious that Java needs them until you code with them for a
while, and very few Java developers I know spent a lot of time with Lisp, ML, or Haskell.
But immutability is better.
With a good garbage collector like the one in the JVM, immutable data structures tend
to perform better than mutable data structures. For example, Scala’s
List, which is an
immutable linked list, tends to perform better than Java’s
ArrayList using real-world data.
This advantage exists for a couple of reasons.
ArrayList pre-allocates an internal array of
10 slots to put items in. If you store only two or three items in the
ArrayList, seven or eight
slots are wasted. If you exceed the default 10 slots, there’s an O(n) copy operation to move
the references from the old array to the new array. Contrast this with Scala’s
List, which
is a linked list. Adding elements is a constant-time, O(1), operation. The only memory
consumed for storing items is the number of items being stored. If you have hundreds of
items or if you’re going to do random access on the collection, an
Array is a better way to
store data. But, most real-world applications are moving two to five items around in a
collection and accessing them in order. In this case, a linked list is better.
Immutable data structures are part of the formula for more stable applications. As you
start thinking about immutable data structures, you also start reducing the amount of
state that is floating around in your application. There will be less and less global state.
There will be fewer things that can be changed or mutated. Your methods will rely less and
less on setting global state or changing the state of parameters, and your methods will
become more and more transformative. In other words, your methods will transform the
1. Mesa for NextStep, which is still available for Mac OS X (http://www.plsys.co.uk/mesa), and Mesa 2 for
OS/2 and the Integer multiuser spreadsheet engine for the JVM.
19897ch03.fm Page 50 Thursday, April 16, 2009 4:54 PM