Be Lazy!

Or, why lazy evaluation is such a good thing

I was recently challenged, while teaching a Scala class, to explain exactly why lazy evaluation is important. That had me looking at the use cases enumerated in the course notes, and it wasn’t really compelling. Sure, there are some obvious reasons that lazy evaluation makes sense — but I really think there are even more compelling reasons to use lazy evaluation.

What it means to be lazy

Most languages are not inherently lazy. Java and Scala, for instance, depend on specific syntax to use lazy evaluation:

lazy val later = calculateTheNumberz()

By using the lazy keyword we defer evaluation of an expression until the defined name is first referenced. In other words, in the above declaration the value of later won’t be calculated until we first reference later. At that point, it will be calculated (just once) and any future reference will return the calculated value. In contrast, this expression:

val later = calculateTheNumberz()

Will immediately calculate the value of later. The obvious advantage here is really twofold:

  1. Potentially, we never call calculateTheNumberz() and, presuming this is an expensive operation, we never incur the overhead of that calculation.
  2. If we do reference later, we at least defer calling calculateTheNumberz() until we actually needed it. This can mean faster program startup, for instance.

So, lazy evaluation is a method of evaluation. It means that expressions are not evaluated when they are bound to variables, but their evaluation is deferred until the results are needed. Consequently, arguments are not evaluated before they are passed to a function, but only when their values are actually used. This kind of evaluation is call-by-name, as opposed to call-by-value or eager evaluation.

def someComputation(x: => Int, y: => Int) = if (x > 0) x else y

In the above example, the Scala function someComputation() takes two parameters. The => notation indicates lazy evaluation (or “call-by-name”), so the value of x and y will not be computed until referenced. In the expression if (x > 0) x else y the value x will be computed and, if positive, y will never be computed. If we imagine that y is a potentially expensive operation, this can be a huge performance saver.

Lazy by nature

Some languages are lazy by nature. These tend to be functional languages, such as Haskell. A potential advantage to languages that evaluate everything lazily is that you don’t have to remember to use the lazy syntax.

Another hidden advantage of “lazy by nature” is that programs tend to be more efficient without forethought. You can bang out a potentially expensive expression, like this:

if (someCondition) expensiveCall() else heftyCall()

We don’t have to worry about such an expression, because only one of expensiveCall() or heftyCall() will be evaluated. In contrast, languages that default to eager evaluation may evaluate both functions before deciding which one to call. Some languages are smart enough to avoid unnecessary function calls, but in a truly “lazy by nature” setting, even expressions such as 2 * 2 will not evaluate unless referenced.

The less obvious

Clearly, lazy evaluation readily translates into pragmatic efficiency — most obviously, by avoiding unnecessary computation. But what else does it do?

One benefit is working with infinitely large data sets. For instance, you can model an infinite list of numbers or an infinite stream of data. In a lazy evaluation model, only the head element of such a stream will be evaluated, whereas an eager evaluation will calculate the entire data set:

def listRange(lo: Int, hi: Int): List[Int] = {
	if (lo >= hi) Nil
	else lo :: listRange(lo + 1, hi)

scala> listRange(5, 10)
5678910 res0: List[Int] = List(5, 6, 7, 8, 9)

In this example, listRange() creates a list of integers from a low to a high range. Since it uses call-by-value evaluation, the entire data set is calculated (as we can see from the output). If this were, say, a huge or infinite data set we would have a problem.

Lazy evaluation let’s us solve this problem. Here’s the same implementation, using a Scala Stream:

def streamRange(lo: Int, hi: Int): Stream[Int] = {
	print(s"${lo} ")
	if (lo >= hi) Stream.empty
	else Stream.cons(lo, streamRange(lo + 1, hi))

scala> streamRange(5, 10)
5 res0: Stream[Int] = Stream(5, ?)

In this case, the lazy Stream class only calculates the head value of the data set, returning the first integer. The tail calculation is left for later, giving us an easy (and efficient) way to work with arbitrarily large data sets. Only when we ask for the next value will the stream calculate whether we are done with the set.

This works because the Stream class uses a very simple call-by-name lazy calculation similar to the following:

class Stream[T]
object Stream {
	def cons[T](h: T, t: => Stream[T]) = new Stream[T] {
		def isEmpty = false
		def head = h
		def tail = t

Notice how the tail of the stream is constructed using call-by-name semantics: t: => Stream[T]. The value t is not calculated until we ask for it some time in the future.

But there’s more!

We’ve discussed some very practical advantages to using lazy evaluation, but I think there are even more important reasons to use it liberally.

Lazy languages are pure, because it is very hard to reason about side effects in a lazy language. By “pure” I mean pure in the sense of functional programming: pure functions are referentially transparent. In a language that doesn’t evaluate expressions until they are referenced, it’s more natural to think in terms of deferring evaluation — and that strikes me as conducive to functionally pure reasoning.

Also, laziness allows you to define new control structures in the language or implement DSLs. You can’t write conditional evaluators (think if-then-else expressions) in a strictly evaluated language. Likewise it’s even harder when you consider how to deal with loops or infinite data sets.

And finally — and maybe most important — the best approach to dealing with side-effects in the type system, like monads, can only be expressed effectively by relying on laziness. Think about how hard it would be to implement monadic operations without violating at least one of the monad laws (for instance, mapping over a Future).

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s