Elegance Can Be the Enemy of Efficiency

I’ve been auditing some of the advanced Scala courses offered by École Polytechnique Fédérale de Lausanne, and as always am curious about my own solutions compared against what other people come up with.

One of the things I love about Scala is its elegance. Once you understand the language, you can achieve some really complex things quickly and easily — and, elegantly. So, a problem posed in the coursework is to write a function that, given a character in a game field, returns the first index of that character in the field.

After writing my own solution, I Googled a few other solutions. Here’s one:

def findChar(c: Char, levelVector: Vector[Vector[Char]]): Pos = {
   val ys = levelVector map (_ indexOf c)
   val x = ys indexWhere (_ >= 0)
   Pos(x, ys(x))
}

This is pretty elegant. In fact, I was first taken by how functional it looked — more functional than my own code. But there’s a problem. This elegant code is potentially very inefficient.

Let’s assume the game field is big. It consists of a Vector of Vectors, and the above code searches the entire game field. That is, even if the character matches the very first element in the field, all the rest of the elements will be scanned:

  1. ys maps over the entire Vector of Vector scanning for the index of c.
  2. x then scans the vector ys for a positive hit.
  3. Then the coordinate is returned from the points described by x and ys(x).

So the obvious problem here is our elegant, functional, good looking code could be terribly inefficient. Here’s my implementation:

def findChar(c: Char, levelVector: Vector[Vector[Char]]): Pos = { 
   @tailrec def iterate(x: Int): Pos = levelVector(x).indexOf(c) match { 
      case y if y > -1 => Pos(x, y) 
      case _ => iterate(x + 1) 
   } 
   iterate(0) 
}

Ok, it’s not as pretty, it’s longer, but — it’s still functional by design. More important, it will look for the character using a linear search, and stops when it finds the character. It’s efficient:

  1. We start in the first Vector, checking for the index of the character.
  2. As soon as we find it, the function returns. Otherwise, it recurses and moves to the next Vector.

It’s not as pretty but it preserves efficient design. Just because we can write pretty, elegant code doesn’t mean we always should.

Incidentally, you could also achieve nearly the same efficiency using something like this:

val row = levelVector.indexWhere(_.contains(c))
val col = levelVector(row).indexOf(c)

Personally, I prefer to avoid any potential inefficiency — but, over-optimizing can also be a problem. It’s probably best to find a happy balance. Don’t ignore efficiency, but don’t spend all your time optimizing situations that, quite likely, don’t really need the effort.

You’re Testing It Wrong…

I really like test driven development. It lets me be lazy. I don’t have to worry about my software quality, or that something I did broke some other thing. And with good dependency injection to make sure every component is working right, “it just works.” Now I code using TDD (writing my tests first, then coding to fulfill them), and I focus our QA efforts on making sure we have great test plans, and great coverage.

Closed System
A closed system wants to be tested.

So, when one of my project teams kept telling me they couldn’t write tests because the database wasn’t ready I got worried. Our team had been immersed in TDD for months — and every single engineer had nodded vigorously when I set expectations. The team leader recited the definition of “dependency injection,” just to drive home how ready they were to embrace it!

But when I asked to see the what was wrong, I knew we had a problem. The team’s tests were not injecting mock objects the right way. The idea behind dependency injection is to replace the smallest component possible in a closed system with another object, a “mock.” That mock can then monitor the system around it, inject different behaviors, and create desired results.

For example, let’s say we have a program that connects to a gizmo — your home thermostat. The thermostat itself is a separate component that lives outside your program. We can expect the thermostat to behave like a thermostat should… reporting the current temperature, and letting the home owner enter a desired room temperature. Pretty straight forward.

So the first step is to write a program that talks to the thermostat. We can wire up a real thermostat, but we’ve got a problem right off the bat. We want to know how our program behaves as the ambient temperature changes — 65 degrees, 32 degrees, or 100 degrees. But a real thermostat is only going to report the actual room temperature, and making the room frigid or boiling just isn’t going to be very comfortable or practical.

Not Mocking
Faking is not mocking.

This is where dependency injection comes in — wouldn’t it be great if we could inject a new gizmo, one that behaves according to our test plan?

It turns out that my team had been taking the wrong approach — but one that is pretty easy to make if you’re new to the idea of mocking and dependency injection. Unfortunately, it meant that they weren’t really testing the application. The were testing something else entirely.

Once we walked through the system, the mistake was clear. During application start up, it created a connection to a database. My team’s approach had been to add a “mocking” variable to the application. In effect, it created a test condition; if the application was in “mocking mode” it would only simulate database calls. If the application was not in “mocking mode” it sent real requests to a real database. Sounds great, right?

But it’s all wrong. Here’s the problem: The application was faking real world behavior. That is, throughout the program there were dozens of little tests, in effect asking, “if we are mocking, then don’t check the database for a user record, instead just return this fake user record.”

This meant that the real program — the actual application logic that would be deployed to the real world — was never tested. Instead, an alternate branch of logic was tested — a fake program, if you will. So two things happened:

  1. We weren’t testing the real program, we were testing something else altogether.
  2. The program itself became terribly complicated because of all the checks to find out “are we mocking?” and the subsequent code to do something else entirely.

And all of that is why my team said they couldn’t really test the system, because the database wasn’t up and running.

So what does real dependency injection look like? It’s simple: You want to change the actual gizmo, but change it in the most subtle way possible — and then you want to put that actual gizmo right back into your program.

Mocking
Real mocking doesn’t affect the original program flow.

Getting back to the thermostat example, an ideal solution would be to modify a real thermostat. You could crack it open, remove the temperature sensor, and add a little dial to it that lets you change the reported temperature. Then you plug the “mock thermostat” into your program, and you change the temperature manually! A potentially better approach would be to change the software that talks to your thermostat, and instrument it so that you can override the actual reported temperature. Your program would still think it’s talking to a real thermostat, and the connecting software could change the actual temperature before handing it off to your program.

In our case, the right solution could be injecting a simple mock component at the just the right point in our program.

For example, lets say our application uses an Authenticator object to log in users. The Authenticator checks the validity of a user in the database, and then returns a properly constructed User object. We can use dependency injection to substitute our own test data by overriding the single function we care about:

object fakeAuthenticator extends Authenticator {
    override def getUser(id: Int): Option[User] = {
        Some(User(id: -1, name: "Fake User"))
    }
}

On line 2, we replace the real Authenticator’s getUser function. The overridden method returns a hard-wired User object (in this case, one that clearly doesn’t represent a valid user account). By overriding the Authenticator in the test package only, the original program is not altered — all that’s left is to inject our altered Authenticator into the program.

The old fashioned way of doing injection is still reliable: Don’t tell, ask. Use a factory object to ask for the Authenticator. Given a factory in the application (let’s call it the AuthenticatorFactory) we can override what the factory actually returns in our test case only:

AuthenticatorFactory.setAuthenticatorInstance(fakeAuthenticator)

A slightly more modern approach is to use a dependency injection framework, but the underlying principle is exactly the same.

Likewise we can take the concept of mock objects further by using frameworks such as Mockito (a framework that works wonderfully with specs2). Mockito makes it easy to instrument real objects with test driven behavior. For example, Mockito will produce a mock object that acts just like a real object, but fulfills expectations (such as testing to make sure that a specific function is called a certain number of times).

Whatever tools and frameworks you use, test driven development has proven itself over the past decade. My own experience is the same: Every TDD project has produced more predictable results, has better velocity, and has been more reliable overall. It’s why I don’t do any coding without following TDD.

Quick! Explain “Microservices”

Recently Martin Fowler (a well-known co-author of the Agile Manifesto) wrote an in-depth and very lucid explanation of Microservices, along with co-author James Lewis. With new technologies, new fads, and new twists on old terminology popping up every month, it can be hard to weed out just another buzzword from something tangible.

Fowler’s definition is, in short, that Microservices are “an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API.”

So what does that mean? Isn’t any program, ultimately, a bunch of small services working together? Here’s a quick synopsis for the TL;DR types:

  1. Microservice architectures embody small, independently deployable services (usually business functions). Such functions tend to have a small scope, and can sustain themselves. They don’t need a big, monolithic application framework — they are “small services.”
  2. Such services are independently scalable. In other words, you don’t have to invest a huge amount of resource scaling your entire application — instead, you can focus on making only those “important or difficult” services scalable.
  3. They provide explicit (component) interfaces. This means that each small service provides a clean, usable API. This tends to promote excellent encapsulation and also stability across a suite of services.
  4. This same attribute (explicit interfaces) also promotes segmented development. Since each service is, by definition, independent and each offers a stable API, each can be built and maintained on its own.
  5. Change cycles tend to accelerate, again because each service is independent. Since each service stands on its own, a single service can be changed, tested, and redeployed without a monolithic build/test/fix/deploy cycle.
  6. This architecture also tends to lead to robustness. Since each service is independent, networks of services tend to test for, isolate, and handle failure gracefully. In other words, since there is no single monolithic system, trust is low. Services tend to anticipate and handle failure well.

Microservices tend to be very agile-friendly, too — leading to the question, which was first, the microservice or agile (it was probably agile). The point is, since services can be quickly iterated, we can make small changes more easily. And as Martin Fowler points out, we also benefit from great architectural flexibility:

Splitting the monolith’s components out into services we have a choice when building each of them. You want to use Node.js to standup a simple reports page? Go for it. C++ for a particularly gnarly near-real-time component? Fine. You want to swap in a different flavour of database that better suits the read behaviour of one component? We have the technology to rebuild him.

 

For a more in-depth explanation of Microservices, I’d recommend reading Fowler’s article.