Maybe - a pragmatic approach

vhsven

2016-12-18

Last time, I introduced the concept of Maybe from a theoretical point of view. In this text I would like to take a more pragmatic approach. We will start from the same premise: the fact that null is an abomination and we do not want to deal with it. What are our alternatives?

One simple, out-of-the-box way to represent a value that may or may not exist is to use the old-fashioned IEnumerable<T>. If we have a value that we want to return, we wrap it in an enumerable like so:

return Enumerable.Repeat(value, 1);

If, on the other hand, we cannot return a value, we simply return an empty enumerable:

return Enumerable.Empty<T>();
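To make the two return statements concrete, here is a minimal sketch that puts them behind a pair of helpers. The names Optional, Some, None and TryParseInt are hypothetical and chosen purely for illustration; only Enumerable.Repeat and Enumerable.Empty come from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Optional
{
    // Wraps an existing value in a single-element enumerable.
    public static IEnumerable<T> Some<T>(T value) => Enumerable.Repeat(value, 1);

    // Represents the absence of a value as an empty enumerable.
    public static IEnumerable<T> None<T>() => Enumerable.Empty<T>();

    // Hypothetical example method: returns the parsed number,
    // or nothing at all if the text is not a valid integer.
    public static IEnumerable<int> TryParseInt(string text) =>
        int.TryParse(text, out var number) ? Some(number) : None<int>();
}
```

With these helpers, Optional.TryParseInt("42") yields a single element, while Optional.TryParseInt("abc") yields none. No null is involved in either case.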

The naive way to go about using these return values is to check if anything has been returned, and use it if that is the case.

if(result.Any())
{
    // ...
}
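Filled in, the naive check might look like the sketch below. NaiveDemo and Describe are hypothetical names. Note that Any() followed by First() starts enumerating the source twice, which is one more reason to prefer the LINQ-based style.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NaiveDemo
{
    public static string Describe(IEnumerable<int> result)
    {
        if (result.Any())
        {
            // First() cannot throw here because Any() confirmed there is an element.
            int value = result.First();
            return "Got " + value;
        }

        return "Got nothing";
    }
}
```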

A more elegant solution is to use one of the built-in LINQ methods.

IEnumerable<TOut> newResult = result.Select(value => ...);

This way, the lambda function is executed only if the enumerable contains the value we are after. If the enumerable is empty, we simply get another empty enumerable in return.
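As a small illustration of this behaviour, the sketch below doubles a number that may or may not be present. SelectDemo and Double are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectDemo
{
    // Doubles the wrapped number if there is one; an empty input stays empty.
    public static IEnumerable<int> Double(IEnumerable<int> maybeNumber) =>
        maybeNumber.Select(n => n * 2);
}
```

Calling SelectDemo.Double(Enumerable.Repeat(21, 1)) produces an enumerable holding 42, whereas SelectDemo.Double(Enumerable.Empty<int>()) produces an empty one. Keep in mind that Select is lazy: the lambda only runs once the result is actually enumerated.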

At some point you will probably end up with two such IEnumerables. Say you already have a method for combining their underlying values:

TOut Combine<T1, T2, TOut>(T1 a, T2 b) { ... }

Based on this, how can we combine IEnumerable<T1> with IEnumerable<T2>? First, we should consider how we want this new combine function to behave. If both parameter values are empty, simply return an empty enumerable. If both contain a value, combine them using the method above and wrap the result in a new enumerable. The real question is what we would like to happen if one but not both values are empty. For our case, let's just say we again return an empty enumerable. Is there an existing LINQ method that can help us with this? There is, and it is called Zip!

list1.Zip(list2, f) takes two lists, say [1, 2, 3, 4] and [a, b, c], and combines them to [f(1, a), f(2, b), f(3, c)]. If the lists are of unequal length, the tail of the longer list is ignored. In our case both enumerables can only have length zero or one. If they both contain one element, the Zip operation also returns an enumerable containing one element: the combination of the two input elements. In all other cases, Zip returns an empty enumerable, just as we described in our specification above.

first.Zip(second, (a, b) => Combine(a, b));

or alternatively:

first.Zip(second, Combine);
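Here is a minimal sketch of this parallel combination with concrete types. ZipDemo, Describe and DescribeBoth are hypothetical names; Describe merely stands in for the Combine method mentioned earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipDemo
{
    // Stand-in for Combine: merges a name and an age into one string.
    public static string Describe(string name, int age) => $"{name} ({age})";

    // Yields one description if both inputs hold a value, and nothing otherwise.
    public static IEnumerable<string> DescribeBoth(
        IEnumerable<string> name, IEnumerable<int> age) =>
        name.Zip(age, (n, a) => Describe(n, a));
}
```

If either argument is empty, Zip has no pair to hand to the lambda, so the result is empty as well. This matches the specification we settled on above.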

We just learned how to combine IEnumerables in parallel, i.e. two inputs, one output. Other times, we would like to chain IEnumerables in series. Say you have two methods:

IEnumerable<T> Bar();
IEnumerable<T2> Foo(T input);

Simply calling Foo(Bar()) obviously will not compile, because the output type of Bar does not match the input type of Foo. However, as we have seen above, a Select often helps in this case.

var result = Bar().Select(bar => Foo(bar));

or more concisely

var result = Bar().Select(Foo);

At first sight, everything looks okay. Upon closer inspection, however, we notice that the actual type of result is IEnumerable<IEnumerable<T2>>. Either the outer IEnumerable is empty, or it contains one other IEnumerable. In the latter case, this inner IEnumerable in turn can also contain either zero or one item. All in all, this construction is no more expressive than a single IEnumerable. Consequently, we would like to flatten this structure. We could do this manually, but why get our hands dirty when LINQ has a built-in method to deal with this? And this method is... you guessed it: SelectMany!

IEnumerable<T2> result = Bar().SelectMany(Foo);
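To see the chaining at work with concrete types, consider the sketch below. Parse and Reciprocal are hypothetical stand-ins for Bar and Foo: each step can fail on its own, and SelectMany flattens the nested result into a single enumerable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainDemo
{
    // Hypothetical first step: the parsed number, or nothing for invalid text.
    public static IEnumerable<int> Parse(string text) =>
        int.TryParse(text, out var n) ? Enumerable.Repeat(n, 1) : Enumerable.Empty<int>();

    // Hypothetical second step: the reciprocal, or nothing for zero.
    public static IEnumerable<double> Reciprocal(int n) =>
        n != 0 ? Enumerable.Repeat(1.0 / n, 1) : Enumerable.Empty<double>();

    // Chains both steps in series; a failure in either one gives an empty result.
    public static IEnumerable<double> ParseThenInvert(string text) =>
        Parse(text).SelectMany(n => Reciprocal(n));
}
```

ChainDemo.ParseThenInvert("4") yields 0.25, while both "0" and "abc" yield an empty enumerable. The caller does not need to know which step gave up.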

We have now found generic solutions for combining and chaining as many IEnumerables as we want. Best of all, they use only out-of-the-box features.

If you have already seen the Maybe API, a lot of the above probably looked strangely familiar. This is no coincidence. There exists a very specific mapping between certain Enumerable and Maybe methods.

IEnumerable<T>.Any() <=> Maybe<T>.HasValue;

IEnumerable<T>.Select<TOut>(Func<T, TOut> selector) <=> Maybe<T>.Select<TOut>(Func<T, TOut> selector);

IEnumerable<T>.SelectMany<TOut>(Func<T, IEnumerable<TOut>> selector) <=> Maybe<T>.Bind<TOut>(Func<T, Maybe<TOut>> selector);

IEnumerable<T>.Zip<T2, TOut>(IEnumerable<T2> b, Func<T, T2, TOut> f) <=> Maybe<T>.Combine<T2, TOut>(Maybe<T2> b, Func<T, T2, TOut> f);

In fact, you could repeat this exercise with Func<T>, Lazy<T> or Task<T> instead of IEnumerable<T> (just to name a few) and still come to the same conclusion. If this piques your interest, consider reading this series of blog posts by Eric Lippert, a former member of the C# compiler team: http://ericlippert.com/2013/02/21/monads-part-one.

I think the word neat aptly describes these symmetries we have just discovered. However, it is clear that IEnumerable<T> is not the ideal candidate for this purpose. The most important inconvenience is that IEnumerable<T> can contain more than one item, which would be considered an invalid state in our scenario. Good code explicitly checks this and throws an exception if an operation is about to cause an invalid state. Great code, however, makes invalid states entirely unrepresentable. This is why we need a dedicated solution such as Maybe<T>. It offers all the same advantages, but cannot be put in an invalid state. On top of that, it also makes much more sense from a semantic point of view.
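To illustrate what "unrepresentable" means here, below is a minimal sketch of such a type. It is not the actual Maybe implementation, only an assumption of what one could look like. The members HasValue, Select, Bind and Combine follow the mapping listed earlier, while Some, None and GetValueOrDefault are hypothetical additions for the sake of the example.

```csharp
using System;

// Illustrative sketch only: holds either exactly one value or none at all.
public readonly struct Maybe<T>
{
    private readonly T _value;

    public bool HasValue { get; }

    private Maybe(T value)
    {
        _value = value;
        HasValue = true;
    }

    // Hypothetical factory: rejects null so that "has a value" really means it.
    public static Maybe<T> Some(T value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        return new Maybe<T>(value);
    }

    // The default struct value has HasValue == false.
    public static Maybe<T> None => default(Maybe<T>);

    public Maybe<TOut> Select<TOut>(Func<T, TOut> selector) =>
        HasValue ? Maybe<TOut>.Some(selector(_value)) : Maybe<TOut>.None;

    public Maybe<TOut> Bind<TOut>(Func<T, Maybe<TOut>> selector) =>
        HasValue ? selector(_value) : Maybe<TOut>.None;

    public Maybe<TOut> Combine<T2, TOut>(Maybe<T2> other, Func<T, T2, TOut> f) =>
        HasValue && other.HasValue
            ? Maybe<TOut>.Some(f(_value, other._value))
            : Maybe<TOut>.None;

    // Hypothetical escape hatch for reading the value at the edge of the program.
    public T GetValueOrDefault(T fallback) => HasValue ? _value : fallback;
}
```

Because the only ways to obtain an instance are Some and None, there is simply no way to construct a Maybe<T> holding two or more items. Making it a struct additionally means the Maybe<T> itself can never be null.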