Purify Your Tests III: Lean, Mean Testing Machine

In the last part we saw how to make our code more declarative, and the tests more functional by introducing type parameters for the inner data flowing through our code. In this part we are going to make our test inputs leaner.

Previously...

Recall that last time we ended up with the following code[1]:

class UberService[Bookkept, Stored](
    fetcher: Fetcher,
    enricher: Enricher,
    bookkeeper: Bookkeeper[Bookkept],
    storage: Storage[Bookkept, Stored]): // 1

  def fetchAndStore(user: UserID): Stored =
    val data = fetcher.fetch(user)
    val enriched = enricher.enrich(user, data)

    val bookkept = bookkeeper.bookkeep(data, enriched) // 2
    val stored = storage.store(bookkept, enriched) // 3

    stored

The key insight from the last post is that we can use type parameters to enforce the call order between the Bookkeeper and Storage instances (1). We are forced to produce a Bookkept value (2) before being able to call store (3), and the only way to produce a Bookkept value is by calling bookkeep. This makes our code more declarative, and that much more type-safe. So much so, that it enabled us to delete a test.

Last time I suggested and quickly dismissed the idea of turning the inputs into type parameters as well. I (quite reasonably) claimed that the proliferation of type parameters might not be worth the potential gains. That may be the case in this specific example. But what if we imagine a slightly different scenario, one where our inputs are much more cumbersome?

This Is All Way Too Much

Till now, these were the classes representing the inputs to our tests:

case class UserID(value: Int)
case class UserData(value: String)
case class EnrichedUserData(value: String)

As you can see, these inputs are quite lean, just some wrappers around strings and ints. It's not a big deal to create fake instances of them in a test. This was perfectly fine for a blog post example, but it's usually not representative of real code.

Let's imagine that we had some more realistic inputs.

First stop, UserID is actually defined like so:

case class UserID private (value: Int) // 1

object UserID:
  def make(value: Int): Option[UserID] = ... // 2

Notice the private constructor (1), and the smart constructor make (2). UserID cannot be created willy-nilly: there are some invariants[9] to be maintained, and a smart constructor to enforce them. But do you care about that in the test? Do you want to deal with creating valid instances and handling that smart constructor just to pass data around in the test?
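To make the annoyance concrete, here is a minimal sketch of what a test has to do just to get hold of an ID. The body of make is an assumption on my part: it uses the value > 0 invariant mentioned in the footnote, and the real check may well be different. The testUser name is purely illustrative.

```scala
case class UserID private (value: Int)

object UserID:
  // Assumed invariant (value > 0); the real validation is not shown in the post.
  def make(value: Int): Option[UserID] =
    if value > 0 then Some(new UserID(value)) else None

// Even a throwaway test ID has to be unwrapped from the Option first.
val testUser: UserID =
  UserID.make(5).getOrElse(sys.error("invalid test ID"))
```

None of this ceremony has anything to do with what the test is actually trying to check.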

Next, the user-data is actually this monster:

case class UserData(
  id: UserID,
  username: String,
  email: Email,
  password: Hashed[Password],
  firstName: FirstName,
  lastName: LastName,
  bio: String,
  location: Location,
  birthDate: LocalDate,
  gender: Gender,
  profilePicture: Option[Media],
  coverPhoto: Option[Media],
  education: List[Education],
  workExperience: List[WorkExperience],
  interests: List[String],
  socialMediaAccounts: List[SocialMediaAccount],
  privacy: Privacy,
  followers: List[UserID],
  following: List[UserID],
  posts: List[Post],
  notifications: List[Notification],
  createdAt: Instant,
  lastLogin: Instant,
  isVerified: Boolean,
  // ... another 20 fields...
)

Not only does it have way too many fields, some of those are nested classes as well (like Post). How cumbersome would it be to create fake instances of this data in your test[2]? And for what, just to pass it around? Or maybe you're a well-disciplined functional programmer, and use randomly generated data a la property-based testing. That's great, but you will still have to create a gnarly generator for this purpose. The fact that you can use random data and not care might be hinting at something...

You can imagine that EnrichedUserData is just as bad, maybe it has even more fields. I won't daunt you with an example, just add it to our overly heavy baggage.

The ongoing theme here is that we are paying the price in the test for something we really don't care about.

Hopefully, by now I'm becoming predictable and the solution intuitive. When code "doesn't care" about something it might be a fertile ground to introduce a new type parameter. Encoding the code's indifference at the type-level.

Every piece of data whose details we don't care about in our flow, and which we want to avoid faking in the tests, can become a brand-new type parameter[3][13].

Fair warning, some individuals may find the number of type parameters in the following section disturbing. Reader discretion is advised.

Type Parameter Galore

Here's the new Fetcher trait:

trait Fetcher[UserID, UserData]: // 1
  def fetch(user: UserID): UserData // 2

Note the new type parameters, UserID and UserData. I'm breaking the FP convention of naming type parameters with single letters[4]. The reasoning is that these are not "really" generic things, they are just stand-ins for the real UserID and UserData classes. They exist merely to facilitate testing.

The real production implementation is going to set the parameters to the original classes, and otherwise remain unchanged:

class ProductionFetcher(/* ... */) extends Fetcher[UserID, UserData]:
  def fetch(user: UserID): UserData = ...

Just as before, the production code doesn't care about the type parameters, and can keep using the original classes. The actual benefit of using the type parameters will become apparent when we write down the tests. But first, let's parametrize the rest of the classes:

trait Enricher[UserID, UserData, EnrichedUserData]: // 1
  def enrich(user: UserID, data: UserData): EnrichedUserData

trait Bookkeeper[A, UserData, EnrichedUserData]: // 2
  def bookkeep(original: UserData, enriched: EnrichedUserData): A

trait Storage[Precondition, A, EnrichedUserData]: // 3
  def store(precondition: Precondition, data: EnrichedUserData): A

Similarly to Fetcher, we are replacing every occurrence of UserID, UserData, and EnrichedUserData with the type parameters (1, 2, 3). We then use them consistently in all the methods. Just like with Fetcher, the production classes can remain unchanged by setting the type parameters to the original classes.
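As a rough sketch of what "setting the type parameters to the original classes" looks like for one of the other traits, here is a hypothetical production Enricher. The ProductionEnricher name and its body are made up for illustration, and the lean case classes from earlier stand in for the real, much bigger ones.

```scala
// Lean stand-ins for the real production classes.
case class UserID(value: Int)
case class UserData(value: String)
case class EnrichedUserData(value: String)

// Inside the trait, the names refer to the type parameters.
trait Enricher[UserID, UserData, EnrichedUserData]:
  def enrich(user: UserID, data: UserData): EnrichedUserData

// Outside the trait, the same names refer to the concrete classes again.
// Hypothetical implementation; the real enrichment logic is not shown in the post.
class ProductionEnricher extends Enricher[UserID, UserData, EnrichedUserData]:
  def enrich(user: UserID, data: UserData): EnrichedUserData =
    EnrichedUserData(s"${data.value} (user ${user.value})")
```

Because the type parameters share their names with the classes, the production signature reads exactly as it did before the trait was parametrized.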

Now for the main bit, UberService:

class UberService[UserID, UserData, EnrichedUserData, Bookkept, Stored]( // 1
    fetcher: Fetcher[UserID, UserData], // 2
    enricher: Enricher[UserID, UserData, EnrichedUserData], // 3
    bookkeeper: Bookkeeper[Bookkept, UserData, EnrichedUserData], // 4
    storage: Storage[Bookkept, Stored, EnrichedUserData]): // 5

  def fetchAndStore(user: UserID): Stored = // 6
    val data = fetcher.fetch(user)
    val enriched = enricher.enrich(user, data)

    val bookkept = bookkeeper.bookkeep(data, enriched)
    val stored = storage.store(bookkept, enriched)

    stored

UberService now takes an impressive total of five type parameters[5] (1). We then consistently pass these parameters around the different classes (2, 3, 4, 5). But despite this apparent mess, the actual implementation code (6) remained unchanged[6].

Now we are ready for the payoff.

Lean Fakes

To make things slightly less confusing[7] I'm going to define the following aliases for the test:

type UserID = Int
type UserData = String
type EnrichedUserData = String

What I mean by that is that in the context of the test, the different type parameters are going to be chosen to be these simple aliases. All we have to do now is to reimplement our mocks using these parameters. To wit:

object TestFetcher extends Fetcher[UserID, UserData]: // 1
  def fetch(user: UserID): UserData = s"data: $user" // 2

The new Fetcher mock now uses the aliases UserID (actually Int) and UserData (actually String) for the type that it's going to be processing (1). And the implementation of fetch (2) can now use a plain string as its output. No need to deal with the huge "real" UserData class. Our fake data couldn't be much leaner than that.

Similarly, Enricher:

object TestEnricher extends Enricher[UserID, UserData, EnrichedUserData]:
  def enrich(user: UserID, data: UserData): EnrichedUserData =
    s"enriched: $user - $data"

Again, these are just the simple aliases, not the real classes, so we can use a lean string instead of some huge data class.

Finally, the Bookkeeper and Storage mocks. Recall that they were a bit of a mess before because we had to pipe various tuples of UserData and EnrichedUserData through them. But now that we are no longer bound to these specific types, we can choose some other way of reflecting the data that flowed through UberService[8].

The new Bookkeeper mock:

object TestBookkeeper extends Bookkeeper[List[String], UserData, EnrichedUserData]: // 1

  def bookkeep(original: UserData, enriched: EnrichedUserData): List[String] = // 2
    List(s"bookkept: $original", s"bookkept: $enriched") // 3

In this new mock we choose the output of Bookkeeper to be List[String] (1). Since both the UserData and EnrichedUserData types are actually plain strings now (1), we can just place them in a list, instead of using a tuple.

Accordingly, bookkeep now returns a List[String] (2). And we construct that list (3) in the implementation, by placing both inputs to bookkeep in the result. To make it a bit more obvious when asserting, we are adding a marker "bookkept" prefix to each string. This is possible because everything here is actually a String; we are not relying on the toString representation of the real UserData and EnrichedUserData classes, or anything evil like that.

Similarly, we modify the Storage mock to use List[String]:

object TestStorage extends Storage[List[String], List[String], EnrichedUserData]: // 1

  def store(bookkeepingResult: List[String], data: EnrichedUserData): List[String] =
    bookkeepingResult :+ s"stored: $data" // 2

To be consistent with the TestBookkeeper we are setting the input to TestStorage to be List[String], as well as the output (1).

The final result from store is now a compound list with both the Bookkeeper result and the entry from the storage process (2). Again, we are adding a string prefix to make it easier to see what's going on when we finally assert on the data. Which we are now ready to do.

Here's what the new test looks like now:

"The uber-service" should:
  "fetch the user data, enrich it, and store the results" in:
    val service = new UberService(TestFetcher, TestEnricher, TestBookkeeper, TestStorage)

    val result = service.fetchAndStore(5) // 1

    result shouldBe List( // 2
      "bookkept: data: 5",
      "bookkept: enriched: 5 - data: 5",
      "stored: enriched: 5 - data: 5"
    )

First we are running fetchAndStore (1) with the ID 5. Note that this is a plain Int. No need to try to circumvent the smart constructor of the real UserID type. After running fetchAndStore we have a single result which is a List[String]. We can now assert on it (2) and see that we indeed have all the relevant outputs, both from Bookkeeper and Storage, as plain strings.

And that's it. Keep in mind what we didn't have to do: we never initialized any instances of "real" production classes, and we never had to come up with a multitude of fake values for all the numerous fields of UserData. All we had to do was use some simple strings and ints, and yet this test is powerful enough to test real production code, code that can work with real, messy data.

This is the power of type parameters, they blend immense flexibility with extraordinary type-safety.
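For anyone who wants to try this out without setting up a test framework, here is a sketch that assembles the snippets from this post into one self-contained file. The only additions are the service and result values at the bottom, which stand in for the ScalaTest body; everything else is copied from above.

```scala
// The parametrized interfaces from this post.
trait Fetcher[UserID, UserData]:
  def fetch(user: UserID): UserData

trait Enricher[UserID, UserData, EnrichedUserData]:
  def enrich(user: UserID, data: UserData): EnrichedUserData

trait Bookkeeper[A, UserData, EnrichedUserData]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): A

trait Storage[Precondition, A, EnrichedUserData]:
  def store(precondition: Precondition, data: EnrichedUserData): A

class UberService[UserID, UserData, EnrichedUserData, Bookkept, Stored](
    fetcher: Fetcher[UserID, UserData],
    enricher: Enricher[UserID, UserData, EnrichedUserData],
    bookkeeper: Bookkeeper[Bookkept, UserData, EnrichedUserData],
    storage: Storage[Bookkept, Stored, EnrichedUserData]):

  def fetchAndStore(user: UserID): Stored =
    val data = fetcher.fetch(user)
    val enriched = enricher.enrich(user, data)
    val bookkept = bookkeeper.bookkeep(data, enriched)
    storage.store(bookkept, enriched)

// The lean test aliases.
type UserID = Int
type UserData = String
type EnrichedUserData = String

// The lean fakes.
object TestFetcher extends Fetcher[UserID, UserData]:
  def fetch(user: UserID): UserData = s"data: $user"

object TestEnricher extends Enricher[UserID, UserData, EnrichedUserData]:
  def enrich(user: UserID, data: UserData): EnrichedUserData =
    s"enriched: $user - $data"

object TestBookkeeper extends Bookkeeper[List[String], UserData, EnrichedUserData]:
  def bookkeep(original: UserData, enriched: EnrichedUserData): List[String] =
    List(s"bookkept: $original", s"bookkept: $enriched")

object TestStorage extends Storage[List[String], List[String], EnrichedUserData]:
  def store(bookkeepingResult: List[String], data: EnrichedUserData): List[String] =
    bookkeepingResult :+ s"stored: $data"

// Stand-in for the test body: wire the fakes together and run the flow.
val service = new UberService(TestFetcher, TestEnricher, TestBookkeeper, TestStorage)
val result: List[String] = service.fetchAndStore(5)
```

Pasting this into a REPL or worksheet and inspecting result should show the same three strings the test asserts on.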

Was It All Really Necessary?

Recall that in the last post we talked about "non-fungible" type parameters. That is, in code that uses type parameters we cannot conjure parametric values into existence out of nowhere. We are forced to "follow the types" and call the relevant producers of each type (assuming we are not cheating[10]).
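As a tiny reminder of what that means in practice, consider the following sketch. The names run, produce, and rendered are made up for this illustration and do not appear elsewhere in the series.

```scala
// With A and B fully abstract, the body has no way to conjure up a B:
// there is no constructor to call and no default value to fall back on.
// Short of cheating (null, casts, exceptions), calling `produce` is the only option.
def run[A, B](input: A, produce: A => B): B =
  produce(input)

val rendered: String = run(5, (n: Int) => s"user $n")
```

The caller is free to pick any concrete types, but the implementation of run has exactly one honest way to produce its result.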

This allowed us to avoid testing various aspects of the code. Like the fact that we definitely called the Bookkeeper before calling Storage.

Now that we have even more type parameters, can we remove more tests?

If we look at the code of fetchAndStore again:

def fetchAndStore(user: UserID): Stored = // 1
  val data = fetcher.fetch(user) // 2
  val enriched = enricher.enrich(user, data) // 3

  val bookkept = bookkeeper.bookkeep(data, enriched) // 4
  val stored = storage.store(bookkept, enriched) // 5

  stored

Is it even possible to not call the Fetcher or Enricher? Is it possible to pass the wrong inputs to Bookkeeper and Storage?

If we follow the types backwards from the return type, we observe the following.

  • We must return a Stored value (1), but this is a type parameter, so it cannot be faked. The only way to obtain it is to call store (5).
  • The only way to do that is to pass it a Bookkept value[11] and an EnrichedUserData value. But EnrichedUserData is now a type parameter, so there's no way to fake it either.
  • To obtain an EnrichedUserData value we must call Enricher (3), but to do that we must provide it with a UserID and a UserData instance. Both are now type parameters.
  • UserData can only be obtained from a call to Fetcher (2).
  • UserID can only be obtained from the input to fetchAndStore (1).

Following this chain we see that the answer to both questions above is "no": we cannot mess these things up. The type signatures force us into a certain chain of calls that cannot be avoided.
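The chain above can be sketched in a compact form. To keep it short, this version is a standalone function whose function-typed arguments stand in for the four traits; that simplification is mine and is not how UberService is actually defined. The commented-out lines are the kind of mistakes the compiler should reject, since inside the function UserData and EnrichedUserData are unrelated types.

```scala
def fetchAndStore[UserID, UserData, EnrichedUserData, Bookkept, Stored](
    fetch: UserID => UserData,
    enrich: (UserID, UserData) => EnrichedUserData,
    bookkeep: (UserData, EnrichedUserData) => Bookkept,
    store: (Bookkept, EnrichedUserData) => Stored
)(user: UserID): Stored =
  val data = fetch(user)
  val enriched = enrich(user, data)

  // Skipping the enricher does not typecheck: `data` is not an EnrichedUserData.
  //   store(bookkeep(data, data), enriched)
  // Storing the raw data does not typecheck for the same reason.
  //   store(bookkeep(data, enriched), data)

  store(bookkeep(data, enriched), enriched)

// With lean test types both UserData and EnrichedUserData become String,
// but that is only known to the caller, never to the body of fetchAndStore.
val leanResult: List[String] =
  fetchAndStore[Int, String, String, List[String], List[String]](
    user => s"data: $user",
    (user, data) => s"enriched: $user - $data",
    (original, enriched) => List(s"bookkept: $original", s"bookkept: $enriched"),
    (bookkept, data) => bookkept :+ s"stored: $data"
  )(5)
```

Note that the confusion the types rule out happens inside the generic code; the fakes themselves are still free to be as sloppy as they like.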

This means that we could actually remove the test we just wrote, as it doesn't seem to test anything that is not already enforced by the type-system. Our types are so declarative that the compiler can just check everything for us. It's as if the types have become the code's specification.

If you're like me though, and have a slight measure of paranoia[17], maybe one little sanity test like the one we wrote is not that bad. Especially when the resulting test is both pure and uses very lean fake data.

Conclusion

Following the pains of mocking data we made our code so parametric that it no longer cares about anything, allowing us to use the leanest possible fake data in our tests. This might seem an extreme measure, but depending on how difficult it is to create the fake data, it might be worth it.

Even if you don't end up using so many type parameters, the mere fact that you can do it tells us something important about the code: it's agnostic to all the data flowing through it, it's pure orchestration. Every time you see such indifference[14] in your code you can take advantage of it to make your tests both leaner and purer.

On a different note. I'm not a big believer in Test Driven Development[12]. I don't think that it's necessary to write tests upfront, and I'm probably guilty of skipping the occasional test. But I'm a firm believer that making code testable tends to improve the overall quality of the code.

If writing tests for some piece of code is painful, maybe that code can be improved to be more testable, and as a side-effect[15] the quality of the code itself will be improved.

I hope that this post was a case in point. Our quest to make the code more easily testable made it so type-safe that it's nigh impossible to write it incorrectly. I think that's a win[16].

In the next part we'll see how we can leverage our fake type parameters directly for the benefit of the production code, and not just for tests.

See you next time!


  1. The full code for the examples is available on GitHub.
  2. Don't fall for the luring trap of an Object Mother, only pain and misery can come from that.
  3. As if we don't have enough of them already...
  4. Or the dubious Java convention of naming parameters with T, U, and V.
  5. Are we getting closer to the glorious septifunctor?..
  6. Partly due to the unconventional naming of the type parameters, we can still pretend that we are working with the same classes as before.
  7. Or more.
  8. Though the old way with more granularly typed tuples is still possible.
  9. Like value > 0.
  10. By "cheating" I mean things like subverting the type system with nulls, casts, and the like. See the Scalazzi subset of Scala. I'll assume we are not cheating for the rest of the post.
  11. I'm ignoring Bookkept now because we already discussed it at length in the last installment.
  12. Prepare the pitchforks...
  13. Actually the burning desire to avoid annoying fake data in tests was the initial trigger that got me into this approach of using type parameters to help with testing. You wouldn't believe the size of the class I wanted to avoid...
  14. And it tends to be more common than you might guess.
  15. Pun intended.
  16. Although the exact number of type parameters such that the cost/benefit of using them is sufficiently worthwhile is up for debate.
  17. This is fairly simple code, but in the presence of something like IO we might want, for example, to sanity test that we are not accidentally short-circuiting where we are not supposed to. One might ponder whether introducing a type parameter instead of IO will enable us to skip these tests as well...