Purify Your Tests: 2 Parametric, 2 Declarative

In the last part we learned how to purify our tests using type parameters. In this and the following parts we'll see some further benefits of adding type parameters this way.

Let's dig in!


Recall that last time we ended up with the following code [1]:

class UberService[Bookkept, Stored]( // 1
    fetcher: Fetcher,
    enricher: Enricher,
    bookkeeper: Bookkeeper[Bookkept], // 2
    storage: Storage[Stored]):

  def fetchAndStore(user: UserID): (Bookkept, Stored) = // 3
    val data = fetcher.fetch(user)
    val enriched = enricher.enrich(user, data)

    val bookkept = bookkeeper.bookkeep(data, enriched)
    val stored = storage.store(enriched)

    (bookkept, stored)

This is our UberService, doing some uber-work. Note the "fake" type parameters (1). These are used by Bookkeeper and Storage (2) in place of Unit that they used to return. The fetchAndStore method returns these values (3), so that we can inspect them in the test (by setting the type parameters to something useful), and discard them in production (by setting them to Unit).

So far, what we gained is the ability to write pure, stateless mocks in our tests, and avoid mutable state altogether. Is that all though?

NFT Type Parameters

There's something interesting going on with the type parameters we introduced. Unless we're cheating [2], from the point of view of UberService there's no way of knowing what Bookkept and Stored are going to be, and so it cannot "fake" these values into existence [3]. UberService is forced to call Bookkeeper and Storage in order to comply with the return type of fetchAndStore.

This is what makes the resulting type signature so declarative:

def fetchAndStore(user: UserID): (Bookkept, Stored)

We know for a fact that there's no way to produce these values without using the provided Bookkeeper and Storage instances.

This property of type parameters being both "non-fungible" and declarative is something that we can take further advantage of.
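
To make the "non-fungible" claim a bit more concrete, here's a minimal sketch. The names Producer, useProducer, StringProducer and UnitProducer are made up for illustration and are not part of the UberService code. Since useProducer knows nothing about A, the only honest way for it to return an A is to call the producer it was handed.

```scala
// Hypothetical names, for illustration only.
trait Producer[A]:
  def produce(): A

// The body knows nothing about A, so (barring nulls, casts and exceptions)
// the only way to return an A is to call producer.produce().
def useProducer[A](producer: Producer[A]): A =
  producer.produce()

// The caller decides what A is: something inspectable in a test...
object StringProducer extends Producer[String]:
  def produce(): String = "invoked"

// ...or Unit when the result is irrelevant.
object UnitProducer extends Producer[Unit]:
  def produce(): Unit = ()

val inspected: String = useProducer(StringProducer)
val discarded: Unit = useProducer(UnitProducer)
```

The same code path serves both callers. Only the choice of A differs.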

A New Requirement

As sometimes happens, we just got a new requirement for our UberService. Turns out that the Bookkeeper service can sometimes fail by throwing an exception [4]. The new requirement is that it's no longer allowed to store the data unless bookkeeping succeeded. Can we guarantee this is the case?

The type signature of fetchAndStore says nothing about the ordering of bookkeeping and storage. Both Bookkept and Stored are returned, and there's no way of knowing which was invoked first. We could flip the invocation order, or even run the two calls in parallel, and still have the same (Bookkept, Stored) result type.
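
Here's a small sketch of that problem, assuming simplified stand-ins for the types from the previous part. The helper names bookkeepThenStore, storeThenBookkeep, EchoBookkeeper and EchoStorage are hypothetical. Both orderings compile against the very same signature, and pure mocks produce identical results for both.

```scala
// Hypothetical stand-ins for the types from the previous part.
final case class UserData(value: String)
final case class EnrichedUserData(value: String)

trait Bookkeeper[A]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): A

trait Storage[A]:
  def store(data: EnrichedUserData): A

// The ordering we want.
def bookkeepThenStore[Bookkept, Stored](
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Stored])(
    data: UserData,
    enriched: EnrichedUserData): (Bookkept, Stored) =
  val bookkept = bookkeeper.bookkeep(data, enriched)
  val stored = storage.store(enriched)
  (bookkept, stored)

// The ordering we don't want: same signature, compiles just as happily.
def storeThenBookkeep[Bookkept, Stored](
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Stored])(
    data: UserData,
    enriched: EnrichedUserData): (Bookkept, Stored) =
  val stored = storage.store(enriched)
  val bookkept = bookkeeper.bookkeep(data, enriched)
  (bookkept, stored)

// Pure mocks that simply echo their inputs.
object EchoBookkeeper extends Bookkeeper[(UserData, EnrichedUserData)]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): (UserData, EnrichedUserData) =
    (original, enriched)

object EchoStorage extends Storage[EnrichedUserData]:
  def store(data: EnrichedUserData): EnrichedUserData = data

val sampleData = UserData("data: 5")
val sampleEnriched = EnrichedUserData("enriched: 5 - data: 5")

val wanted = bookkeepThenStore(EchoBookkeeper, EchoStorage)(sampleData, sampleEnriched)
val unwanted = storeThenBookkeep(EchoBookkeeper, EchoStorage)(sampleData, sampleEnriched)
```

Comparing wanted and unwanted tells us nothing: with pure mocks the two orderings are indistinguishable.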

That's okay, not everything can be enforced by types. Can we test that the new requirement holds?

Bad, Bad Tests!

We can actually check the invocation order of our mocks. But that will require us to use some form of side-effect/mutation. Like so:

import scala.collection.mutable.ListBuffer

class TestBookkeeper(invocations: ListBuffer[String]) // 1
  extends Bookkeeper[(UserData, EnrichedUserData)]:

  def bookkeep(original: UserData, enriched: EnrichedUserData): (UserData, EnrichedUserData) =
    invocations += "bookkeep" // 2

    (original, enriched)

class TestStorage(invocations: ListBuffer[String]) // 3
  extends Storage[EnrichedUserData]:

  def store(data: EnrichedUserData): EnrichedUserData =
    invocations += "store" // 4

    data

In these new mocks we now take a constructor argument (1 and 3): a mutable list that records the invocations performed during the test. Then, when actually invoking the bookkeep and store methods, we add the appropriate invocation name to our list of invocations (2 and 4).

In the test itself we can use the new mocks:

"The uber-service" should:
  "invoke bookkeeping before storage" in:
    val invocations = ListBuffer.empty[String] // 1

    // 2
    val bookkeeper = new TestBookkeeper(invocations) 
    val storage = new TestStorage(invocations)

    val service = new UberService(TestFetcher, TestEnricher, bookkeeper, storage)

    val _ = service.fetchAndStore(UserID(5))

    invocations shouldBe List("bookkeep", "store") // 3

This new test first initializes an empty list of invocations (1). We then pass that empty list to our two new mocks (2). After invoking fetchAndStore we verify that the resulting invocations are in the correct order (3).

The test works, but yet again it sucks. We just spent a whole blog post eradicating mutation from our mocks. All that for nothing? Look how quickly mutation crept back in.

You might say that these concerns are just theoretical, the test works, leave it at that. And yet I won't leave it alone. Not only am I personally offended by mutability, but in the presence of concurrency this is a flaky test waiting to torment you in your CI pipeline forever. In a concurrent setting, which is quite likely for such an esteemed UberService, you might get arbitrary interleavings of bookkeep and store, and the test will randomly succeed or fail accordingly.

Are you willing to tolerate this?

Be Declarative Young Padawan

Remember way back when, a couple of paragraphs ago when I lamented that not everything can be enforced by types? Well, shame on me, I shouldn't have given up on them so quickly.

You see, there is a way to enforce invocation order using types. And it's one of the most elementary things that we have – functions.

Suppose I have a function f: A => B and a function g: B => C. If I start with A and end up with C, there's only one order in which f and g can be invoked. You must first invoke f to produce a B value, and only then can you pass it to g to produce the final C value.

But. This handwavy argument [5] only works if B is "non-fungible". That is, if we have no way of obtaining B without invoking f.
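
As a minimal sketch of this argument (the names run, length and describe are made up for illustration), consider a function that is handed f and g while A, B and C are all type parameters. Short of cheating, there's only one way to implement it.

```scala
// B is a type parameter, so `run` cannot conjure a B on its own:
// it has to call f before it can call g.
def run[A, B, C](a: A, f: A => B, g: B => C): C =
  g(f(a))

// Hypothetical concrete functions chosen by the caller.
val length: String => Int = _.length
val describe: Int => String = n => s"length is $n"

val result: String = run("uber", length, describe)
```

Here the caller picked B to be Int, but run itself never finds that out, which is exactly what keeps the ordering honest.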

All we need to do now is to actually declare our intentions with the types, and to make sure they are non-fungible, or, in our case, parametric.

To that end, we are going to explicitly encode the relationship between Bookkeeper and Storage.

First we'll make our storage accept a precondition:

trait Storage[Precondition, A]:
  def store(precondition: Precondition, data: EnrichedUserData): A

Now Storage takes another type parameter, which stands for a precondition that should hold before we can call store. We then modify store to accept the Precondition as an argument. That means that there is no way to call store without first obtaining a Precondition value. This signature declares our intentions more precisely, directly in the types. A Precondition value is proof that a function producing Precondition was invoked.

While Bookkeeper can remain the same, we'll now modify UberService to take advantage of the new precondition:

class UberService[Bookkept, Stored]( // 1
    fetcher: Fetcher,
    enricher: Enricher,
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Bookkept, Stored]): // 2

  def fetchAndStore(user: UserID): Stored = // 3
    val data = fetcher.fetch(user)
    val enriched = enricher.enrich(user, data)

    val bookkept = bookkeeper.bookkeep(data, enriched) // 4
    val stored = storage.store(bookkept, enriched) // 5

    stored // 6

UberService still takes the same two type parameters (1), but we now declare that the precondition for calling store is the presence of a Bookkept value (2). For reasons that will be clear a bit later, fetchAndStore no longer returns a tuple. Instead, it returns only a Stored value (3).

Now for the main part. After we invoke bookkeep (4) and obtain a Bookkept value, we pass that value to store (5). We then return the Stored result (6).

Note how we no longer have a choice: if we want to obtain a Stored value, we must invoke store after we successfully invoked bookkeep. Because the precondition for Storage is now the Bookkept value, and because the only way to obtain the (non-fungible) Bookkept value is by calling bookkeep, we are forced to observe our new ordering requirement.

Stated more explicitly, we now have the following chain:

bookkeep: (UserData, EnrichedUserData) => Bookkept
store: (Bookkept, EnrichedUserData) => Stored

Since Bookkept cannot be faked into existence, the only way to obtain a Stored value is to chain bookkeep with store.

If we try to flip the order of store and bookkeep, we will fail to compile. If we forget to invoke bookkeep [12], we will fail to compile. We really have no choice. This is also the reason why we removed Bookkept from the return type of fetchAndStore: we already know for sure that bookkeep was called, so there's no need to assert it again.
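
If you'd like to see the compiler's verdict without actually breaking the build, one option is Scala 3's scala.compiletime.testing.typeChecks, which reports whether a snippet type-checks at the call site. The sketch below assumes simplified stand-ins for the types in this post. compilerVerdicts, EchoBookkeeper and EchoStorage are hypothetical names.

```scala
import scala.compiletime.testing.typeChecks

// Hypothetical stand-ins for the types used in this post.
final case class UserData(value: String)
final case class EnrichedUserData(value: String)

trait Bookkeeper[A]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): A

trait Storage[Precondition, A]:
  def store(precondition: Precondition, data: EnrichedUserData): A

// Hypothetical helper: asks the compiler which call shapes it accepts while
// Bookkept and Stored are still abstract, just like inside UberService.
def compilerVerdicts[Bookkept, Stored](
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Bookkept, Stored],
    data: UserData,
    enriched: EnrichedUserData): (Boolean, Boolean) =
  // bookkeep first, then store
  val chained = typeChecks("storage.store(bookkeeper.bookkeep(data, enriched), enriched)")
  // store without ever obtaining a Bookkept value: () is a Unit, not a Bookkept
  val skipped = typeChecks("storage.store((), enriched)")
  (chained, skipped)

// Pure mocks, only needed so that we can call the helper.
object EchoBookkeeper extends Bookkeeper[(UserData, EnrichedUserData)]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): (UserData, EnrichedUserData) =
    (original, enriched)

object EchoStorage extends Storage[(UserData, EnrichedUserData), EnrichedUserData]:
  def store(precondition: (UserData, EnrichedUserData), data: EnrichedUserData): EnrichedUserData =
    data

val verdicts = compilerVerdicts(
  EchoBookkeeper,
  EchoStorage,
  UserData("data: 5"),
  EnrichedUserData("enriched: 5 - data: 5"))
```

The first flag says that the chained call is accepted, the second that skipping bookkeep is rejected. Note that both snippets are checked against the abstract Bookkept, not against whatever the caller later plugs in.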

That's the power of having declarative types. By thoroughly declaring our requirements in the type system, we now have the compiler to enforce our invariants. Better yet, we can say that we "made an illegal state unrepresentable" [7].

All the while the production code can still remain completely oblivious to this:

class ProductionStorage(/* ... */) extends Storage[Unit, Unit]

The production implementation doesn't care about any preconditions, and just sets it to Unit. Nonetheless this cannot be abused by UberService to circumvent our rules. Since it's completely parametric, it cannot know that the real implementation has such an easily forgeable [8] precondition, and so it cannot abuse that fact.
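
For completeness, here's a rough sketch of what production-style wiring could look like. Everything except the shape of UberService and Storage is an assumption: the Fetcher and Enricher signatures are guessed from their usage, and StubFetcher, StubEnricher, PrintingBookkeeper and PrintingStorage are hypothetical placeholders that just print instead of talking to real systems.

```scala
// Hypothetical stand-ins for the types used in this post.
final case class UserID(id: Int)
final case class UserData(value: String)
final case class EnrichedUserData(value: String)

trait Fetcher:
  def fetch(user: UserID): UserData

trait Enricher:
  def enrich(user: UserID, data: UserData): EnrichedUserData

trait Bookkeeper[A]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): A

trait Storage[Precondition, A]:
  def store(precondition: Precondition, data: EnrichedUserData): A

class UberService[Bookkept, Stored](
    fetcher: Fetcher,
    enricher: Enricher,
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Bookkept, Stored]):

  def fetchAndStore(user: UserID): Stored =
    val data = fetcher.fetch(user)
    val enriched = enricher.enrich(user, data)
    val bookkept = bookkeeper.bookkeep(data, enriched)
    storage.store(bookkept, enriched)

// Hypothetical placeholder implementations.
object StubFetcher extends Fetcher:
  def fetch(user: UserID): UserData = UserData(s"data: ${user.id}")

object StubEnricher extends Enricher:
  def enrich(user: UserID, data: UserData): EnrichedUserData =
    EnrichedUserData(s"enriched: ${user.id} - ${data.value}")

// Effects only: both type parameters collapse to Unit.
object PrintingBookkeeper extends Bookkeeper[Unit]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): Unit =
    println(s"bookkeeping: $original, $enriched")

object PrintingStorage extends Storage[Unit, Unit]:
  def store(precondition: Unit, data: EnrichedUserData): Unit =
    println(s"storing: $data")

val production: UberService[Unit, Unit] =
  new UberService(StubFetcher, StubEnricher, PrintingBookkeeper, PrintingStorage)

val done: Unit = production.fetchAndStore(UserID(5))
```

The body of fetchAndStore is identical for tests and production. Only the wiring knows that the precondition happens to be Unit here.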

What's even better, we no longer need to write those terrible, mutable tests to check the invocation ordering requirement anymore. The compiler has our back. As they say [6]:

The best tests are the ones you never had to write.

Even More Type Parameters?

The rush you get from deleting tests with impunity...

Can we delete more tests? Can we delete the original test we started with? Unfortunately, no, the original test is verifying that we are invoking Bookkeeper and Storage with the correct input. Unlike the outputs, the inputs of UserData and EnrichedUserData can be faked into existence, as they are concrete types.

Seeing the success we had so far with using fake type parameters, maybe we can apply this to the inputs as well. Indeed we can, but the proliferation of type parameters might not be worth it. In the next part I will find some justification to fake the inputs as well. But for now, we'll keep them as is and instead adapt the existing test to the new type signatures. This can be instructive.

Every Day I'm Refactoring...

Recall that we no longer have Bookkept as the output of fetchAndStore. Previously we used the Bookkept parameter as the vehicle to carry data about the invocation of bookkeep. How can we obtain this information now?

Swimmingly, it turns out.

On the one hand Bookkept is being piped into store, on the other, type parameters are so flexible we can put anything we want into them. The solution here would be to add the info from Bookkept directly to the Stored result.

The Bookkeeper mock remains exactly as before, it just produces its inputs as the output:

object TestBookkeeper extends Bookkeeper[(UserData, EnrichedUserData)]:

  def bookkeep(original: UserData, enriched: EnrichedUserData): (UserData, EnrichedUserData) =
    (original, enriched)

Now we know that this type (UserData, EnrichedUserData) is going to be fed to our Storage mock. Let's use it:

object TestStorage extends Storage[
  (UserData, EnrichedUserData), // 1
  ((UserData, EnrichedUserData), EnrichedUserData)]: // 2

  def store(
      bookkeepingResult: (UserData, EnrichedUserData), // 3
      data: EnrichedUserData): ((UserData, EnrichedUserData), EnrichedUserData) =
    (bookkeepingResult, data) // 4

This mock looks a bit scary now, but it's still essentially the identity function.

We align the Precondition type parameter to be the output of Bookkeeper: (UserData, EnrichedUserData) (1). The output type now pairs that Bookkeeper output, (UserData, EnrichedUserData), with the original output of the Storage mock, EnrichedUserData (2). This way we have access both to the data from Bookkeeper and the data from Storage in a single output.

In the store implementation, we get the output from Bookkeeper (3), which we then produce as part of the output of store (4).

With the flexibility of type parameters we managed to pipe all the data we need to be able to test the fetchAndStore flow. And now the test remains exactly the same as before:

"The uber-service" should:
  "fetch the user data, enrich it, and store the results" in:
    val service = new UberService(TestFetcher, TestEnricher, TestBookkeeper, TestStorage)

    val expectedUserData = UserData(s"data: 5")
    val expectedEnriched = EnrichedUserData("enriched: 5 - data: 5")

    val (bookkeeperResult, storageResult) = service.fetchAndStore(UserID(5)) // 1

    bookkeeperResult shouldBe (expectedUserData, expectedEnriched)
    storageResult shouldBe expectedEnriched

The output of fetchAndStore (1) has the same type and meaning as the output when Bookkept was part of the signature. The only difference is that now it comes through the store call. By plumbing the type parameters correctly we managed to reproduce the old functionality [9].
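
For the paranoid among us, one possible way to keep the Storage mock ignorant of the precondition is to leave it as a type parameter and just pipe it through. This is only a sketch: PassThroughStorage is a hypothetical name, and the types are simplified stand-ins.

```scala
// Hypothetical stand-ins for the types used in this post.
final case class UserData(value: String)
final case class EnrichedUserData(value: String)

trait Storage[Precondition, A]:
  def store(precondition: Precondition, data: EnrichedUserData): A

// The mock knows nothing about Precondition, so it can only pass it along.
final class PassThroughStorage[Precondition]
    extends Storage[Precondition, (Precondition, EnrichedUserData)]:

  def store(precondition: Precondition, data: EnrichedUserData): (Precondition, EnrichedUserData) =
    (precondition, data)

val userData = UserData("data: 5")
val enriched = EnrichedUserData("enriched: 5 - data: 5")

// The test decides what the precondition is, here the Bookkeeper mock's output type.
val storage = new PassThroughStorage[(UserData, EnrichedUserData)]
val stored = storage.store((userData, enriched), enriched)
```

The result has the same shape as before, ((UserData, EnrichedUserData), EnrichedUserData), so a test that destructures it should not need to change.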

A Silver Bullet?

This is all great. Unfortunately, it's not a silver bullet for all possible requirements we might get. As it stands, for example, there is no way to prevent the code from invoking bookkeep more than once. We are forcing the code to invoke bookkeep at least once, but it can accidentally (or on purpose) invoke it more than once.
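
A quick sketch of how a duplicate invocation slips through, with simplified stand-ins for the types in this post. The helper names bookkeepOnce, bookkeepTwice, EchoBookkeeper and EchoStorage are hypothetical. Both versions compile, and with pure mocks they are indistinguishable.

```scala
// Hypothetical stand-ins for the types used in this post.
final case class UserData(value: String)
final case class EnrichedUserData(value: String)

trait Bookkeeper[A]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): A

trait Storage[Precondition, A]:
  def store(precondition: Precondition, data: EnrichedUserData): A

def bookkeepOnce[Bookkept, Stored](
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Bookkept, Stored])(
    data: UserData,
    enriched: EnrichedUserData): Stored =
  storage.store(bookkeeper.bookkeep(data, enriched), enriched)

def bookkeepTwice[Bookkept, Stored](
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Bookkept, Stored])(
    data: UserData,
    enriched: EnrichedUserData): Stored =
  val _ = bookkeeper.bookkeep(data, enriched) // accidental extra invocation
  val bookkept = bookkeeper.bookkeep(data, enriched)
  storage.store(bookkept, enriched)

// Pure mocks that simply echo their inputs.
object EchoBookkeeper extends Bookkeeper[(UserData, EnrichedUserData)]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): (UserData, EnrichedUserData) =
    (original, enriched)

object EchoStorage extends Storage[
  (UserData, EnrichedUserData),
  ((UserData, EnrichedUserData), EnrichedUserData)]:

  def store(
      bookkeepingResult: (UserData, EnrichedUserData),
      data: EnrichedUserData): ((UserData, EnrichedUserData), EnrichedUserData) =
    (bookkeepingResult, data)

val sampleData = UserData("data: 5")
val sampleEnriched = EnrichedUserData("enriched: 5 - data: 5")

val once = bookkeepOnce(EchoBookkeeper, EchoStorage)(sampleData, sampleEnriched)
val twice = bookkeepTwice(EchoBookkeeper, EchoStorage)(sampleData, sampleEnriched)
```

Since once and twice are equal, neither the types nor a pure test can tell the two implementations apart.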

I'm not aware of a way to use Scala's current type system to enforce such an invariant [10]. Maybe one day substructural types will enter the mainstream, and will let us enforce such basic invariants. One can only hope...

For now, we'll have to make do with the unpleasant mutable mocks if we want to check for multiple invocations. If you come up with a way to avoid them in this scenario as well, I'll be glad to hear about it.


Type parameters are great, they are flexible, and enable us to enforce some non-trivial invariants. By making our code more declarative, we managed to promote a requirement from a runtime test into a compile-time verified fact [11].

More generally, what was it about our code that enabled us to apply these powerful techniques?

I would posit that a lot of the well-factored, testable code that we write tends to end up turning into glue code that is ignorant of the specific context it is being used in. It basically just orchestrates other code (like UberService), instead of doing much on its own. Type parameters let us promote this ignorance to the type level, where we can reap the benefits that we just saw: declarative, type-safe code and pure tests.

In the next part we'll continue reaping the benefits of type parameters and handle those pesky inputs.

Till next time!

  1. The full code for the examples is available on GitHub.
  2. By "cheating" I mean things like subverting the type system with nulls, casts, and the like. See the Scalazzi subset of Scala. I'll assume we are not cheating for the rest of the post.
  3. This is a loose application of "parametricity".
  4. We could use a more functional approach to error handling, but the same point I'm about to make will still stand.
  5. This is yet another loose application of "parametricity".
  6. I'm sure someone, somewhere said that.
  7. As per Yaron Minsky.
  8. I mean, it's not that difficult to obtain a value of type Unit.
  9. If we are being paranoid, it's true that since we are dealing with concrete types in the test, the Storage mock can now fake the output of Bookkeeper. If indeed we submit to the paranoia, we can easily make the mock "ignorant" of the input, and keep Precondition as a type-parameter, and just pipe that to the output. This will prevent TestStorage from messing around with the data flowing through it, just like it did in UberService. Incidentally, this will make the mock more flexible and reusable in other contexts as well. I leave this as an exercise to the curious reader.
  10. Maybe something with capture checking, which is still in research? Reach out if you have any suggestion on how this can be achieved.
  11. For a more involved example along these lines, see the last part of my "Make Illegal States Unrepresentable" talk.
  12. E.g., after an accidental refactoring.