Monad is not difficult

In Haskell, the typeclass Monad is a way for programmers to customize how sequences of function calls are run. Originally the goal was just to make IO-heavy code easier to read, but it turns out that Monad-shaped APIs are wonderfully flexible and can simplify all sorts of programming problems.

For example, say we want to define a function to look up two people in a database, and return information about the older one (or the first, if both are the same age). The database API already defines a way to look up a person by name, which returns NULL if the requested name isn’t found. Our function should return NULL if either name cannot be found.

The imperative-style implementation would look like this:

Maybe<Person> findOldest(Database db, Name name1, Name name2) {
    Maybe<Person> maybePerson1 = findByName(db, name1);
    if (maybePerson1 == Nothing) {
        return Nothing;
    }
    Person person1 = maybePerson1.Value();

    Maybe<Person> maybePerson2 = findByName(db, name2);
    if (maybePerson2 == Nothing) {
        return Nothing;
    }
    Person person2 = maybePerson2.Value();

    if (person1.BirthDate > person2.BirthDate) {
        return person2;
    }
    return person1;
}

Other than verbosity, this code is fairly readable, because the language has built-in support for returning early from a function.

But what happens if we write it in a declarative-style language? In most declarative languages, a function is a single expression, and every branch must return a value. This prevents the programmer from returning early.

findOldest :: Database -> Name -> Name -> Maybe Person
findOldest db name1 name2 =
    case findByName db name1 of
        Nothing -> Nothing
        Just person1 -> case findByName db name2 of
            Nothing -> Nothing
            Just person2 -> if birthDate person1 > birthDate person2
                then Just person2
                else Just person1

You can see that each time the programmer wants to check for an empty result, they have to add another level of indentation. This sort of code quickly becomes difficult to read and maintain.

Monad provides a solution, because it allows the programmer to customize how to handle functions that return an empty value. If the programmer instructs the compiler to return early when an empty value is encountered, then she doesn’t have to write all those checks manually.

instance Monad Maybe where
    return x = Just x
    val >>= f = case val of
        Nothing -> Nothing
        Just x -> f x

findOldest :: Database -> Name -> Name -> Maybe Person
findOldest db name1 name2 = do
    person1 <- findByName db name1
    person2 <- findByName db name2
    return (if birthDate person1 > birthDate person2
        then person2
        else person1)

The same pattern extends to all sorts of code that needs extra control over the execution sequence. With only a small tweak, the instance for Maybe can be used for Either:

instance Monad (Either a) where
    return x = Right x
    val >>= f = case val of
        Left err -> Left err
        Right x -> f x

Monad instances can also be defined for types which change how functions are called, rather than whether they are called. The Reader type adds an implicit environment to a function:

data Reader env a = Reader (env -> a)

ask :: Reader env env
ask = Reader (\env -> env)

runReader :: Reader env a -> env -> a
runReader (Reader run) env = run env

instance Monad (Reader env) where
   return x = Reader (\_ -> x)
   val >>= f = Reader (\env -> runReader (f (runReader val env)) env)

Either is used instead of exceptions, and Reader is used instead of global state, because their behavior can be mechanically checked by the compiler. By offloading concerns about error and state handling to the compiler, the programmer can focus on the higher-level behavior of their program.

Exercise for the reader: Given this definition, how would you define a Monad instance for the State type?

data State st a = State (st -> (st, a))

get :: State st st
get = State (\st -> (st, st))

put :: st -> State st ()
put st = State (\_ -> (st, ()))