Random Scala Tip #624: The Pitfalls of Option Blindness

Given this class:

case class User(
  id: UserId,
  email: Option[Email],
  address: Option[Address],
  posts: Option[List[Post]],
  lastLogin: Option[Timestamp])

What does Option on each of the fields actually mean?

Well, after spending hours digging in the code, documentation¹, and use sites, I gleaned the following:

email is actually mandatory, but that requirement was only added recently, so the field was made optional for backwards compatibility with older clients
address is taken from user input; it's not mandatory and can be missing
posts is fetched from a database, but sometimes the database is down, so we fall back to None, which is distinct from Some(Nil) and should be handled appropriately
lastLogin will be empty if the user never logged in previously; we need to provide a special greeting when a user logs in for the first time

We have four distinct meanings for Option, but only one data type to represent them all. This phenomenon is called "Option blindness".
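To make the problem concrete, here is a minimal sketch of what a consumer of this class can get away with. The type aliases and the summary function are hypothetical, invented only for illustration, and the id field is left out for brevity. The point is that all four fields accept the exact same treatment, and the compiler has no way to object:

```scala
// Hypothetical stand-ins for the real domain types.
type Email = String
type Address = String
type Post = String
type Timestamp = Long

// A cut-down User with the same Option-typed fields as above (id omitted).
final case class User(
  email: Option[Email],
  address: Option[Address],
  posts: Option[List[Post]],
  lastLogin: Option[Timestamp])

// Four different meanings, one identical-looking treatment.
def summary(user: User): String =
  val email   = user.email.getOrElse("n/a")                     // legacy client, or a bug?
  val address = user.address.getOrElse("n/a")                   // genuinely optional
  val posts   = user.posts.map(_.size).getOrElse(0)             // "database down" reads as "0 posts"
  val login   = user.lastLogin.map(_.toString).getOrElse("n/a") // first login goes unnoticed
  s"$email | $address | $posts posts | last login: $login"
```

Note how a user whose posts could not be fetched (None) and a user with no posts at all (Some(Nil)) produce the same summary, even though the two cases are supposed to be handled differently.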

Differently Typed Blindnesses

Boolean blindness is a well-known phenomenon. It has a somewhat lesser-known cousin called algebraic blindness. That's when we use very generic algebraic data types, like Option, Either, and These/Ior², to represent domain-specific data, and in doing so lose that domain-specific context in our code.

Here I'm picking specifically on Option because it's so common. As it's often touted as a solution to the "null problem", it's easy to think of it as a "never wrong" solution, even more so if you're new to functional programming. On top of that, Option enjoys a privileged status in the wider Scala ecosystem, especially in various serialization libraries, where it commonly gets special handling. That makes it all the more tempting to use it all over the place.

The Tip

Before wrapping anything in Option, consider whether you should use a more meaningful domain-specific data type instead.

Examples

There are plenty of domain-specific meanings that an Option can take, but I'll use the four examples above. Hopefully this will illustrate the general principle well enough.

Backwards Compatibility

This one is particularly annoying, as it tends to get more widespread as the system evolves over time. You add a new field to a class that's constrained by backwards compatibility (e.g., it is used for communication, or stored in a database), and then you mark it as Option in perpetuity. Time passes, and you no longer have any idea whether the field can be missing "for real", or whether it's just some vestigial Option supporting a no longer relevant version of the code.

Instead, I would suggest using a custom generic type, isomorphic to Option, to communicate the backwards compatibility concern directly³:

enum BackCompat[+A]:
  case Present(value: A)
  case Missing

How you end up treating such BackCompat values is very domain-dependent. Maybe you can replace them with A over time, maybe not. But in any case, at least now you know why they are there and can treat them accordingly.
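As one possible treatment, here is a small sketch that resolves a BackCompat value at the boundary of the system. The resolveLegacy helper and its legacyDefault parameter are hypothetical names, and whether a sensible default even exists depends on your domain:

```scala
enum BackCompat[+A]:
  case Present(value: A)
  case Missing

// Hypothetical helper: decide once, at the boundary, what a field that
// an older client did not send should turn into.
def resolveLegacy[A](field: BackCompat[A], legacyDefault: => A): A =
  field match
    case BackCompat.Present(value) => value
    case BackCompat.Missing        => legacyDefault
```

Once no older clients remain, searching for usages of BackCompat (or of a helper like this one) lists every place that can be simplified to a plain A, something that is much harder to do for a bare Option.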

Missing User Inputs

For user input that is not mandatory I'm on the fence. This might be the case where just using Option is the right fit, with all the convenience that you get from using a standard data type. The data is there or not, and that's all you care about. On the other hand, knowing that the data came from user input, and that it was explicitly not provided, might be relevant to your domain, in which case a custom Option-like data type (see above) may be in order.

This dilemma highlights the point that choosing the correct way to model your data is to an extent an art rather than just science.

Feature Toggle

The next example is using Option to indirectly communicate that a certain feature in the system (data fetching in the example) was turned off. That means that we have to remember that None has a special meaning. It's not just an empty value, but instead it's empty because a feature was turned off, and we might want to apply special handling to it. Using the terminology from the boolean blindness post, the provenance of the Option is just as important as the value of the data itself.

Instead of using Option we can communicate the provenance with yet another domain-specific, Option-like data type:

enum FetchedData[+A]:
  case Available(value: A)
  case Disabled

Now, every time someone has a FetchedData value, they have to decide what to do with it based on the informatively named cases. We no longer have to remember the special meaning of None; the type system will guide us in the right direction.

One day, if and when we have more data fetching toggles, it'll be easy to add them to the custom FetchedData type, and then let the compiler guide us with fixing up the code⁴. Piggybacking on Option for this might get cumbersome.
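Here is a sketch of what that could look like. The Maintenance case and the postsBanner function are made up for the sake of the example, and posts are represented as plain strings to keep the snippet self-contained:

```scala
enum FetchedData[+A]:
  case Available(value: A)
  case Disabled
  case Maintenance // hypothetical second toggle, added later

// Hypothetical consumer: every case has to be addressed by name.
def postsBanner(posts: FetchedData[List[String]]): String =
  posts match
    case FetchedData.Available(Nil)   => "No posts yet"
    case FetchedData.Available(items) => s"Posts: ${items.size}"
    case FetchedData.Disabled         => "Post fetching is switched off"
    case FetchedData.Maintenance      => "Posts are under maintenance"
```

If the Maintenance line were missing from the match, the compiler would flag the match as non-exhaustive, pointing at every use site that needs updating.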

System State

The last example is when Option is used to represent different states of the system. Now None means that the user never logged in, which is information that we have to hold in our heads rather than in the type system⁵.

Unsurprisingly, the solution would be to create a domain-specific data type. But to add some variety, for this case I will go with a non-generic type. The "last logged in" state doesn't have a meaningful value that's not a Timestamp (at least in my interpretation of this example).

The state machine for this part of the system can be represented as:

enum LastLogin:
  case At(value: Timestamp)
  case Never

With this little state machine, the compiler itself will remind us of whatever special behavior we need to apply when a user logs in for the first time.
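For instance, the special greeting could be sketched as follows. The greeting function and its wording are hypothetical, and Timestamp is assumed here to be an alias for java.time.Instant, which may differ from the real type:

```scala
// Assumption: Timestamp is java.time.Instant; the real type may differ.
type Timestamp = java.time.Instant

enum LastLogin:
  case At(value: Timestamp)
  case Never

// Hypothetical greeting logic: the first-login case is impossible to overlook.
def greeting(name: String, lastLogin: LastLogin): String =
  lastLogin match
    case LastLogin.Never     => s"Welcome aboard, $name!"
    case LastLogin.At(value) => s"Welcome back, $name. Last login: $value"
```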

Trade Offs

Like most things in life, using custom data types is a trade-off. Even if we ignore the boilerplate of actually defining a new type every time, we still pay in at least two ways:

We lose all the standard Option functions
We cannot automatically participate in whatever special treatment that Option gets in various libraries

We can try to mitigate the first point by defining conversions to/from Option and then using those to gain back some of the Option functionality⁶. Even better, if you're into functional programming, you can use some standard typeclasses to imbue your custom types with Option-like functionality⁷. Some libraries will even let you derive those typeclasses automatically.
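A minimal sketch of the first mitigation, using the FetchedData type from above. The method names toOption, fromOption, and map are my own choice here, and the hand-written map merely stands in for what a Functor instance from a typeclass library would provide:

```scala
enum FetchedData[+A]:
  case Available(value: A)
  case Disabled

  // Conversion to Option: borrow the standard combinators when needed.
  def toOption: Option[A] = this match
    case Available(value) => Some(value)
    case Disabled         => None

  // Hand-written map, the operation a Functor instance would give us.
  def map[B](f: A => B): FetchedData[B] = this match
    case Available(value) => Available(f(value))
    case Disabled         => Disabled

object FetchedData:
  // Conversion from Option, for code that still produces plain Options.
  def fromOption[A](option: Option[A]): FetchedData[A] = option match
    case Some(value) => Available(value)
    case None        => Disabled
```

With these in place, something like fetched.toOption.getOrElse(Nil) is available wherever the distinction between the cases truly doesn't matter, while the explicit conversion keeps that decision visible in the code.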

For the second point, special Option handling is very common in serialization libraries, e.g., converting None to null in JSON. If you're lucky, whatever special functionality Option gets is guided by implicits, in which case custom data types can participate just as well (at the price of some more boilerplate). If you're not that lucky, and Option handling is hardcoded into the library, then you can try to petition the library authors to change that. Feel free to use this post as a form of justification.
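To illustrate the "lucky" scenario, here is a sketch built around a deliberately tiny, hypothetical JsonEncoder typeclass. It is not the API of any real JSON library, and the choice to encode Disabled as null is just an assumption made for the example:

```scala
// Hypothetical, minimal encoder typeclass; real libraries differ in detail.
trait JsonEncoder[A]:
  def encode(value: A): String

enum FetchedData[+A]:
  case Available(value: A)
  case Disabled

given intEncoder: JsonEncoder[Int] with
  def encode(value: Int): String = value.toString

// Option's "special" handling is just another instance...
given optionEncoder[A](using inner: JsonEncoder[A]): JsonEncoder[Option[A]] with
  def encode(value: Option[A]): String = value match
    case Some(a) => inner.encode(a)
    case None    => "null"

// ...so a custom type can opt into the same treatment with its own instance.
given fetchedEncoder[A](using inner: JsonEncoder[A]): JsonEncoder[FetchedData[A]] with
  def encode(value: FetchedData[A]): String = value match
    case FetchedData.Available(a) => inner.encode(a)
    case FetchedData.Disabled     => "null" // assumption: same encoding as None

def toJson[A](value: A)(using encoder: JsonEncoder[A]): String =
  encoder.encode(value)
```

The extra instance is the boilerplate mentioned above. In exchange, the encoding of Disabled becomes an explicit decision that can later diverge from how None is encoded.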

Despite these cons, I still think that the maintainability gains we get from domain-specific, Option-like data types often outweigh the downsides.

May you never be blinded by Option ever again, till next time!

¹ Obviously unmaintained and misleading.
² Like here, or here.
³ I'm using Scala 3 syntax which is pleasantly compact. The same applies to Scala 2 albeit with noisier syntax.
⁴ Assuming that you have pattern match exhaustivity errors turned on, and if not, go now and enable them. What are you still doing here? Go!
⁵ Going down the route of moving system states into the type system is a good first step towards "making illegal states unrepresentable".
⁶ If you're feeling particularly adventurous you can make those conversions implicit.
⁷ Things like Functor, Applicative, or even Optional.
