ANY Overloading Considered Evil

This is not just clickbait. I really do mean that ANY form of overloading in programming should be considered evil.

By "overloading" I shall refer to any instance of using one name to refer to multiple different things¹. A common example would be operator overloading, where the operator + can refer both to integer addition and to floating point addition.

Although I cannot possibly make an exhaustive list of all possible means of overloading, I shall try to illustrate this point with a few examples of different forms of it². Hopefully, by the end of this post you too will be of the opinion that overloading is quite evil. Though possibly a form of necessary evil.

On to our first example.

Operator Overloading

Overloading the meaning of arithmetic operators is so common that we don't even notice it. Even in high-school one can find the + operator overloaded to mean both real number addition and complex number addition.

Despite this innocuous-seeming usage, in sufficiently complex applications, evil may lurk. Consider the following³:

record Foo (int bar, double baz) {} // 1

record Bar (int qux) {} // 2

double doStuff(Foo foo, Bar bar) { // 3
  return (foo.bar() / bar.qux()) * foo.baz(); // 4
}

In this code we have two records Foo (1) and Bar (2), containing some numbers. We then have a method doStuff (3) that takes instances of Foo and Bar and does some arithmetic (4). This code looks confusing, and that's on purpose. In a large enough codebase, following things might get difficult.

Now we can invoke this code as follows:

doStuff(new Foo(1, 2), new Bar(3))

And the result will be:

0.0

This is because we're doing integer division with 1 and 3. Nothing particularly interesting here. But now imagine that we are refactoring our code and deciding for some reason that the qux field should actually be a double:

record Bar (double qux) {}

Can you guess what the very same call to doStuff will return now?

0.6666666666666666

As we changed the type of qux to double, the meaning of the / operator changed from integer division to floating point division. And so the result is now completely different.

Whether this is a bug in our application or not depends on context. But the point here is that we weren't warned about it in any way. The meaning of / changed silently without any kind of compile-time warning.

And given that the definition of doStuff can be far away from the definition of Bar, this is also an example of spooky action at a distance. Nobody likes spooky action at a distance...

It's reasonable for things to change when changing types, but it's not reasonable for it to happen silently and only be visible at run time.

The culprit here is the overloaded meaning of /.
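One way to make this particular change loud, sketched here under the assumption that integer division really was what we wanted: give the intermediate result an explicitly typed local. The class name ExplicitDivisionSketch and the local quotient are my own additions for illustration. With the int annotation in place, changing Bar.qux to double can no longer flip the meaning of / quietly. The assignment stops compiling (a lossy double-to-int conversion) and somebody has to make a decision.

```java
final class ExplicitDivisionSketch {
    record Foo(int bar, double baz) {}

    record Bar(int qux) {}

    static double doStuff(Foo foo, Bar bar) {
        // The declared type pins "/" to integer division. If Bar.qux is later
        // changed to double, this line is rejected by the compiler instead of
        // silently switching to floating point division.
        int quotient = foo.bar() / bar.qux();
        return quotient * foo.baz();
    }

    public static void main(String[] args) {
        System.out.println(doStuff(new Foo(1, 2), new Bar(3))); // prints 0.0
    }
}
```

This is a workaround rather than a cure: it only helps where someone remembered to write the annotation.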

Let's move on to the next example.

Literal Overloading

This was hinted at in the previous section. Note this call:

doStuff(new Foo(1, 2), new Bar(3))

What are the types of 1, 2, and 3?

The answer is: it depends. Depends on the expected types of the arguments to the Foo and Bar constructors.

As we changed the type of Bar.qux the meaning of 3 changed as well. Yet again, silently. If we had some compile-time warning about that change, we might've caught the change in the doStuff call. But alas, that's not the case. As the meaning of integer numeric literals is overloaded, we just silently changed behavior at run time.

I won't dwell on this point, as from my experience code usually doesn't rely too much on literals beyond tests and examples. And in any case, in most languages literal overloading is usually fairly limited⁴.

Method Overloading

A more common example of overloading, at least in the Java world, is the classic method overloading. That's when one class contains multiple methods with the same name but different type signatures. To wit, Arrays.sort:

public static void sort(int[] a)
public static void sort(long[] a)
public static void sort(char[] a)
public static void sort(byte[] a)
public static void sort(double[] a)
// ...

As you can see, there are many overloads of the sort method, accommodating different array types. All aptly named sort for convenience's sake.

Now imagine the following scenario. You have an array of prices, represented as double[]⁵. And you have some complicated logic to process them like so:

public void processPrices(double[] prices) {
  // ... do stuff with prices

  Arrays.sort(prices);

  // ... do more stuff with prices
}

Deep inside the processing logic we have an Arrays.sort hiding. But that's okay, it all works perfectly fine:

double[] prices = { 2, 5, 4 };

processPrices(prices);

After running processPrices the contents of prices is as expected:

[2.0, 4.0, 5.0]

Move along, nothing to see here...

But then we think to ourselves, is it good practice to represent business entities like "price" using raw doubles? Of course not.

Instead we decide to make a small domain wrapper record to represent the concept of "price" in our system:

record Price(double value) {}

This is a good practice, as we now have a point of reference and a single source of truth for anything that relates to the concept of "price" in our system.

As part of this refactor we now have to change the processPrices method as well:

public void processPrices(Price[] prices) {
  // ... the rest as before...
}

Great, this now all compiles and we can run the code from before:

Price[] prices = { new Price(2), new Price(5), new Price(4) };

processPrices(prices);

Any guesses as to the result of running this code?

...

Exception in thread "main" java.lang.ClassCastException: class overloading_post.Price cannot be cast to class java.lang.Comparable (Price is in unnamed module of loader 'app'; java.lang.Comparable is in module java.base of loader 'bootstrap')
        at java.base/java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:320)
        ...

Right, we got a class cast exception. That makes perfect sense.

How did that happen? Well, there is no Arrays.sort overload for Price, but one of the overloads is this one:

public static void sort(Object[] a)

It takes any array of Objects and tries to sort them. How can you possibly sort Objects? Obviously you can't as Object doesn't have any sorting methods. What you can do instead is to cross your fingers and cast an Object to Comparable and then sort that.

In our case though, Price did not implement Comparable, and the cast failed spectacularly.

Now a lot of bad things are going on here: using Object directly, array covariance, relying on casting⁶, and of course, method overloading.

And yet, method overloading in Java is resolved statically. The compiler knew that we had just changed the type from double to Price, and it decided, at compile-time, to use the Object overload of sort. In another world, the same compiler making all these decisions could've warned us that we were doing something suspicious and asked us to sign off on it⁷. This, sadly, is not the world we live in.
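For what it's worth, here is a rough sketch of two ways to get the compiler back on our side in this specific case. Both use only the standard library. The helper sortComparable, the method sortedByValue, and the class name are hypothetical names of mine. The first wraps Arrays.sort behind a Comparable bound, so a Price[] is rejected at compile time. The second picks the Comparator-taking overload of Arrays.sort, which forces us to say how prices are ordered.

```java
import java.util.Arrays;
import java.util.Comparator;

final class SafeSortSketch {
    record Price(double value) {}

    // Hypothetical helper: the type bound rejects element types that are not
    // Comparable, so the Object[] overload can no longer be reached by accident.
    static <T extends Comparable<? super T>> void sortComparable(T[] items) {
        Arrays.sort(items);
    }

    // Uses the Arrays.sort(T[], Comparator) overload on a copy of the input,
    // so the ordering is stated explicitly and no cast to Comparable is needed.
    static Price[] sortedByValue(Price[] prices) {
        Price[] copy = prices.clone();
        Arrays.sort(copy, Comparator.comparingDouble(Price::value));
        return copy;
    }

    public static void main(String[] args) {
        Price[] prices = { new Price(2), new Price(5), new Price(4) };

        // sortComparable(prices); // does not compile: Price is not Comparable

        System.out.println(Arrays.toString(sortedByValue(prices)));
        // prints [Price[value=2.0], Price[value=4.0], Price[value=5.0]]
    }
}
```

Neither option removes the overloads from Arrays. They only keep us away from the dangerous one.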

The things we do for a bit of overloaded convenience⁸...

To paraphrase Benjamin Franklin:

Those who would give up essential Safety, to purchase a little temporary Convenience, deserve neither Safety nor Convenience.

Runtime Polymorphism

A staple of Object Oriented Programming, runtime polymorphism is arguably one of the most characteristic features of doing OOP. But it too falls under my loose definition of "overloading" given above: we are using a single name for an (abstract) method to provide two or more different implementations.

Unlike the examples above, runtime polymorphism, as the name implies, only happens at runtime and is not determined at compile-time. As a result, chaos must obviously ensue.

Continuing our Price example from above, suppose we need to do some further processing: we have to avoid using illegal prices (like negative ones). We also decided that using arrays is so 90s. We should modernize and program to an interface, the List interface.

As a result we have this fine method:

void removeIllegalPrices(List<Price> prices) {
  prices.removeIf(price -> price.value() < 0);
}

And we can use it like so:

List<Price> prices = new ArrayList<>();
prices.add(new Price(1));
prices.add(new Price(2));
prices.add(new Price(-1));

removeIllegalPrices(prices);

As you might expect, the result will be:

[Price[value=1.0], Price[value=2.0]]

Again, plain, old, boring Java code.

But then yet again the winds of change are blowing. Somehow you ended up with a copy of that newfangled book everybody's talking about, Effective Java, and you read about item 15⁹:

Minimize mutability

Sure enough, that rings true in the depths of your immutable soul, and you go on implementing this in your code:

List<Price> prices = new ArrayList<>();
prices.add(new Price(1));
prices.add(new Price(2));
prices.add(new Price(-1));

prices = Collections.unmodifiableList(prices);

There you go, item 15 successfully applied. And what's super convenient is that unmodifiableList returns an instance of the same List interface. So we can happily keep on using it like we did before.

I hope that by now you see where the path of convenience is leading us...

Running this:

removeIllegalPrices(prices);

Can only end in tears:

Exception in thread "main" java.lang.UnsupportedOperationException
        at java.base/java.util.Collections$UnmodifiableCollection.removeIf(Collections.java:1096)
        ...

The remove method (and, by extension, the removeIf method) on List is overloaded to mean different things depending on the concrete runtime implementation. In one case it removes an element, in another it throws an exception. All that the two scenarios have in common is a name. And that's plenty enough to wreak havoc in our code.

Silent breakage haunts us yet again. This time, though, we can't blame the compiler. Runtime polymorphism was not meant to be tracked at compile-time. The culprit is the (runtime) overloaded meaning of the remove method.
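If we can't get the compiler to track which List we are holding, one way out is to stop calling mutators at all. Below is a minimal sketch under that assumption. The method name withoutIllegalPrices and the class name are mine. It builds a new list with a stream instead of calling removeIf, so it behaves the same whether the argument is an ArrayList or an unmodifiable wrapper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class NonMutatingFilterSketch {
    record Price(double value) {}

    // Returns a new list and never calls a mutator on the argument.
    static List<Price> withoutIllegalPrices(List<Price> prices) {
        return prices.stream()
                .filter(price -> !(price.value() < 0)) // keep what removeIf would have kept
                .toList();
    }

    public static void main(String[] args) {
        List<Price> mutable = new ArrayList<>();
        mutable.add(new Price(1));
        mutable.add(new Price(2));
        mutable.add(new Price(-1));
        List<Price> prices = Collections.unmodifiableList(mutable);

        System.out.println(withoutIllegalPrices(prices));
        // prints [Price[value=1.0], Price[value=2.0]]
    }
}
```

Note that this sidesteps the problem rather than solving it: remove and removeIf are still there on List, waiting for the next caller.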

Just Random Things Breaking All Over

Okay, okay. This all sucks. But maybe, maybe it's just this outdated Object Oriented thing? Who does that anymore? I hear that all the young people are doing something called "Functional Oriented Programming", or some such. Surely this kind of legacy mess doesn't happen there.

We sure can join that party. Except we are still in the confines of our enterprise cage. So baby steps, baby steps. Let's use Functional Java. It's quite an oldie, so maybe our corporate masters won't object too much.

You're really sold on the whole functional thing. And you learned to "make illegal state unrepresentable". As a result you're using the NonEmptyList type religiously. This way you can enforce the presence of at least one item in your lists at compile-time, which can be handy for some business requirements.

For example:

record Debtor(NonEmptyList<Payment> payments) { // 1
  Payment maxPayment() {
    return payments.maximum(Payment.ord); // 2
  }
}

As unfortunately we are working for an evil corporation, we have to model some debtors. A Debtor has a list of payments owed to us. And here we make an illegal state unrepresentable: the list of payments must be non-empty, as otherwise there is no actual debt. This is modeled by the NonEmptyList type (1). We are enforcing at compile-time the business requirement we have in our code. And all is well.

Furthermore, this Debtor entity has a utility method maxPayment that returns the maximum from the list of payments owed by the debtor (2). We use this for reporting purposes¹⁰. Notice that unlike the legacy Java code, despite being able to invoke maximum on any instance of NonEmptyList, we are obligated to provide an Ord instance that tells the list how to actually choose the maximum. So no unsafe casting is involved.

Even more importantly though, this is completely safe. Since the list is enforced to be non-empty at the type level, the call to maximum can never fail: there must be at least one element in our list.

Safety... What a breath of fresh air. Enter the product manager...

Well actually, there's a new business feature, we now sometimes consider a person a debtor without an actual debt. Don't ask questions, it was all cleared with the accountants.

Right.

Our careful modelling now goes down the drain, and we obediently replace our NonEmptyList with the List type from Functional Java. As we still want to retain that immutable functional goodness with all the cool data processing APIs.

Now we have this:

record Debtor(List<Payment> payments) {
  // the rest as before
}

Where the List type here is not the builtin one, but rather fj.data.List. We can now support the new business requirement of debtless debtors. Well done.

Some time later...

Our reporting mechanism is randomly crashing. You dig into the logs and find this:

Exception in thread "main" java.lang.Error: Undefined: foldLeft1 on empty list
        at fj.Bottom.error(Bottom.java:29)
        at fj.data.List.foldLeft1(List.java:972)
        at fj.data.List.maximum(List.java:1624)
        ...

Recall our cool maximum method? Well apparently it's not safe for empty lists, as you cannot in any way produce an actual element (maximal or not). And yet the fj.data.List class defines the maximum method for, um, convenience. The reports for our debtless debtors now trigger maximum on an empty List and the result is this exception.

We can question the choice of having the patently unsafe maximum method on List. But be that as it may, the real culprit here is, yet again, overloading. List and NonEmptyList are completely unrelated classes, they only share and overload the method name maximum "by coincidence". There's no common interface that forces them to do that. As we changed the type of the payments field we silently changed the meaning of maximum in our code. The compiler knew what we were doing but chose to let us shoot ourselves in the foot.
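For contrast, here is a sketch of what a total maximum looks like, written against the plain Java standard library rather than Functional Java. The Payment record with its cents field and the class name are hypothetical, and List here is java.util.List. Stream.max returns an Optional, so the empty case shows up in the return type. Had maxPayment been written this way, swapping the collection type would have forced every caller to deal with a debtless debtor at compile time.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class TotalMaximumSketch {
    // Hypothetical payment model, for illustration only.
    record Payment(long cents) {}

    record Debtor(List<Payment> payments) {
        // The Optional return type makes "no payments" a visible case
        // instead of a runtime error.
        Optional<Payment> maxPayment() {
            return payments.stream().max(Comparator.comparingLong(Payment::cents));
        }
    }

    public static void main(String[] args) {
        Debtor debtor = new Debtor(List.of(new Payment(100), new Payment(250)));
        Debtor debtless = new Debtor(List.of());

        System.out.println(debtor.maxPayment());   // prints Optional[Payment[cents=250]]
        System.out.println(debtless.maxPayment()); // prints Optional.empty
    }
}
```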

The silence of the compiler is deafening...

And Now For Something Completely Different

You're finally tired of all these Java shenanigans and your enterprise cage and decide to move to a hip new startup.

What a wondrous world, they use the very niche Haskell language. The pinnacle of "functional oriented programming" and pretty much the polar opposite of Java, in being both principled and safe.

Even better, function overloading is forbidden in Haskell. Every imported function name must refer to exactly one function, otherwise you get a compile-time ambiguity error. It might be a bit tedious to muck around with all the qualified imports, but at least now you're safe from the woes of overloading.

But one day you discover that actually, Haskell supports a form of ad-hoc polymorphism called "type classes". And the moment you see a footgun you quickly go and shoot yourself in the foot.

As follows¹¹:

newtype Debtor = Debtor { payments :: NonEmptyList Payment } -- 1

maxPayment :: Debtor -> Payment
maxPayment = maximum . payments -- 2

Turns out you work for a debt collection startup...

We use AI and the blockchain to make debt collection as pleasant as possible to all parties involved.

This is basically exactly the same example as before but rewritten in Haskell. We now have a Debtor type with a single field of a NonEmptyList of payments (1). We then define a utility function maxPayment (2) which uses the maximum function to compute the maximum of the non-empty list of payments.

What is notable here is that maximum is part of the Foldable type class. And so is actually overloaded to mean different things depending on the target type.

As a result, when once again our dear product manager¹² forces us to break our invariants we end up with this:

newtype Debtor = Debtor { payments :: List Payment }

-- the rest is the same

And once again our reports explode when stumbling on debtless debtors:

Prelude.maximum: empty list

Overloading just keeps haunting us. We changed types, and a function silently changed its meaning without warning. Since type classes are resolved at compile-time this is pretty much the same pitfall as we had with method overloading in Java¹³. The compiler knew, but kept mum.

Does the flap of a butterfly's wings in Brazil set off a tornado in Texas?

Maybe we should quit programming altogether, maybe physical debt collection is a less overloaded field.

Can We Do Better?

I mean, we can forgo overloading altogether and completely forbid it at the language level. But the truth is, overloading is very convenient. Can you imagine instead of using + you'd have to write +_int, +_double, +_short?

On the other hand, all the pitfalls I showed above stem from the fact that the code silently changes meaning in the presence of overloaded definitions. I can imagine a language where every time an overloaded definition changes its meaning the compiler will flag it and ask for confirmation¹⁴. Maybe something like a magic import that states the current type of the overloaded definition, and if the import and actual type diverge, the compiler will fail compilation until the import is fixed. At which point you can decide whether you're doing the right thing when changing types.
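A crude approximation of that "magic import" is already possible in Java today, at least for statically resolved overloads. This is only a sketch, and the constant SORT_DOUBLES and the class name are made up for illustration. A method reference with an explicit target type pins the overloaded name Arrays.sort to exactly one of its overloads. If the array type later changes to Price[], the call through the pinned reference stops compiling instead of quietly drifting to the Object[] overload.

```java
import java.util.Arrays;
import java.util.function.Consumer;

final class PinnedOverloadSketch {
    // The declared target type plays the role of the "magic import":
    // it states which overload of Arrays.sort we mean, here sort(double[]).
    static final Consumer<double[]> SORT_DOUBLES = Arrays::sort;

    static double[] processPrices(double[] prices) {
        double[] copy = prices.clone();
        // Passing anything other than a double[] here is a compile-time error.
        SORT_DOUBLES.accept(copy);
        return copy;
    }

    public static void main(String[] args) {
        double[] prices = { 2, 5, 4 };
        System.out.println(Arrays.toString(processPrices(prices)));
        // prints [2.0, 4.0, 5.0]
    }
}
```

It's far from ergonomic, which is rather the point of the exercise.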

Making this sufficiently ergonomic is left as an exercise to the reader...

Circling back to the original statement, I stand by the claim that ANY form of overloading should be considered evil. It may be necessary and even convenient, but make no mistake, where overloading exists, evil lurks not far behind.

This statement goes beyond any specific languages and even programming in general¹⁵ ¹⁶.

I purposefully chose the word "evil" because there's something sinister about the way it strikes you. Waiting for you to feel cozy with the typechecker, lulling you into a false sense of security, only to strike you when you least expect it.

Although I mostly used various Java warts to illustrate my point, this holds even without Java or those warts. Whenever you change types and your compiler lets behavior change silently and without your knowledge, it's a recipe for disaster.

The consequences of overloading can be more subtle than actual exceptions. Especially as code grows larger and we rely more and more on the compiler to track the correctness of our code. Like a silent performance degradation due to a change of collection types in some unrelated area of the code¹⁷. Or some accidental parallelism where none was present. Or improperly sorted data that you discover far away from the point of sorting. Or poorly formatted strings in your critical financial reports. Or anything really, let your imagination (paranoia?) run wild!
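To make the performance case a bit more concrete, here is a small sketch of one defensive option in Java. The helper sumByIndex and the class name are hypothetical. An intersection bound with the standard RandomAccess marker interface lets a method demand, at compile time, a list type that declares itself fast at indexed access. ArrayList implements RandomAccess, while LinkedList does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.RandomAccess;

final class RandomAccessBoundSketch {
    // Hypothetical helper: only accepts lists whose static type is marked
    // RandomAccess, so index-based loops are not handed a linked list.
    static <L extends List<Integer> & RandomAccess> int sumByIndex(L numbers) {
        int sum = 0;
        for (int i = 0; i < numbers.size(); i++) {
            sum += numbers.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> numbers = new ArrayList<>(List.of(1, 2, 3));
        System.out.println(sumByIndex(numbers)); // prints 6

        // sumByIndex(new java.util.LinkedList<>(numbers)); // does not compile
    }
}
```

The catch is that the argument's static type has to expose RandomAccess, so a variable declared as plain List<Integer> is rejected too. Programming to the interface and tracking this property pull in opposite directions.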

Thank you for reading this far, and may your future not be overloaded...

If you enjoyed this, reach out for a workshop on Functional Programming, where I teach how to apply Functional Programming to improve code quality in any language.

¹ For our purposes here, Java-style method overriding is also a form of "overloading", as it too ascribes potentially different meanings to the same name.
² This is mostly in the context of statically typed languages as well, although it also holds to a degree in more dynamic languages. It's just that in statically typed languages we do expect to have more assistance from the compiler, which makes the evilness of overloading more apparent.
³ I know this may seem a bit contrived, but in real code this can come up naturally on its own. It's just difficult to condense into blog form.
⁴ A notable exception would be the Haskell language with its very flexible and overloaded numeric, string, and other literals.
⁵ Of course one should never represent actual financial numbers with double, but we'll ignore this for the sake of the example.
⁶ You could argue that I'm picking on geriatric code. The Arrays class began life in Java 1.2 (circa 1998), so it's probably older than some of the readers of this blog. And yet if you take a look at Stream, despite its veneer of modern respectability it has the sorted method, which works generically for any Stream<T>, and so it too can potentially throw a class cast exception.
⁷ The Scala tool Wartremover has something that kind of does that, the Any wart (where Any is the Scala equivalent of Object).
⁸ Method overloading can also exist with signatures of different arity. Although in the spirit of this post I should say something against this as well, for the moment I can't come up with a likely scenario where overloading over arity will break things silently. Reach out if you have examples of that as well.
⁹ Or 17, depending on edition. I guess that numbering wasn't immutable...
¹⁰ So as to depress our clients with the size of their debts.
¹¹ Here and below I'm using type aliases with type NonEmptyList a = NonEmpty a and type List a = [a]. This is not essential, I'm using the aliases just to make it a bit easier to relate the Haskell code to Java code above.
¹² Is it the same one we had in our enterprise cage? I can't tell them all apart anymore...
¹³ Unlike Java though it's encouraged in Haskell for type classes to come equipped with laws. The laws help us reason about the behavior of the different implementations generically, making sure that all implementations "make sense". That's great, and better than nothing. But I still think you can stumble on overloading pitfalls even in the presence of laws.
¹⁴ This can help with the statically resolved overloading cases, but won't help with all instances of runtime polymorphism (although some can be caught at compile-time, like changing the receiver type of toString in a non-generic context).
¹⁵ I hear that human languages are known to be notoriously overloaded. Can you think of any problems that can cause?
¹⁶ It's not uncommon for mathematicians to rely on overloading to make derivations more concise. Unfortunately for them, most of them don't even have a typechecker to begin with.
¹⁷ The runtime polymorphic get method on List can have wildly different performance characteristics depending on the concrete implementation. Some algorithms may become unacceptably slow when you swap constant-time indexing with linear indexing.