Last Modified on February 17, 2021
- (2021.01.18) Added MonadFail subsection to Maybe on Hackage
- (2021.02.17-now) More Links (list of related links)
- (2022.08.30) reclassified under patterns-of-erroneous-code tag
Motivating References:
I was very happy to find Michael Snoyman’s Haskell The Bad Parts series.
I was also motivated by Elementary Programming post (reddit).
Similar: reddit data_maybe_harmful
This post is Haskell specific.
This post treats the term error colloquially; it does not distinguish between exceptions and errors. In particular, error information loss refers to exceptions, not errors.
Nutshell
Maybe is the functional answer to null, the billion dollar mistake. I claim that using Maybe can still be problematic.
IMO Maybe is often overused. I have started to question the use of Maybe every time I see it in the code base I maintain. The result is either accepting its usage or rewriting the code to use Either. This approach has been effective in creating more robust code. I am not claiming that Maybe has no place in well-written code, only that its use should be closely examined. I have seen brilliant code that has been hard to maintain because of its overuse of Maybe. This post shares a perspective of someone who maintains a complex Haskell code base.
Maybe improves over null. But it does not supersede it. Languages that have null also have easy access to logging, stack traces etc.
Maybe typically represents data that can be missing or a computation that can result in an unknown error. What you typically care about is what data is missing and what the error is.
Code correctness and reasoning about code are, arguably, the defining aspects of FP. Reasoning about code typically refers to some advanced use of the type system or formal methods. IMO “reasoning about code” should start with reasoning about errors and corner cases (like missing data). This is why the use of Maybe needs to be examined and questioned. In my experience this aspect of reasoning about code is often overlooked.
Reasoning about errors is not easy: The type system can’t help with errors that bypass it (e.g. error :: String -> a). It can’t help with exceptions which were intentionally suppressed into Nothing (the focus of this post). And, the list goes on…
My points / pleas are:
- The ecosystem would be better off without offering convenience combinators that return Maybe if an equivalent returning Either exists
- Examples, tutorials, and blog posts should favor Either over Maybe
- Developers should be careful about not overusing Maybe
Error Clarity Rule
What does Nothing mean? If the reason behind it can be disambiguated to one root cause, then I consider the use of Maybe justified. Otherwise, I question its use.
Consider this code:
import Data.Map
import Prelude hiding (lookup)
type Key = String
type Value = ...
-- OK
phone :: Map Key Value -> Maybe Value
phone = lookup "phone"
-- OK
email :: Map Key Value -> Maybe Value
email = lookup "email"
-- OK
creditCardNum :: Map Key Value -> Maybe Value
creditCardNum = lookup "card-number"
data FormData = FormData {
  fdPhone :: Value
  , fdEmail :: Value
  , fdCardNum :: Value
  }
-- less OK
formData :: Map Key Value -> Maybe FormData
formData map = FormData <$> phone map <*> email map <*> creditCardNum map
You can clearly explain the first 3 functions:
explain :: err -> Maybe a -> Either err a
explain = ...
data Err = MissingEmail | MissingPhone | MissingCardNum
email' :: Map Key Value -> Either Err Value
email' = explain MissingEmail . email
How do I explain formData? I am stuck with:
data UnknownFieldMissing = UnknownFieldMissing
formData' :: Map Key Value -> Either UnknownFieldMissing FormData
formData' = explain UnknownFieldMissing . formData
and that “Unknown” is not a field name.
The bigger the record, the bigger the problem.
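One way out, sketched here under stated assumptions, is to explain each lookup while its root cause is still known and only then combine the results. In this sketch `Value` is assumed to be `String`, `explain` is implemented with `maybe`, and the names `phone'`, `creditCardNum'` and `formDataE` are illustrative rather than taken from any library.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

type Key = String
type Value = String -- assumption: the post leaves Value unspecified

data Err = MissingEmail | MissingPhone | MissingCardNum
  deriving (Eq, Show)

data FormData = FormData
  { fdPhone :: Value
  , fdEmail :: Value
  , fdCardNum :: Value
  } deriving (Eq, Show)

-- Attach an explanation to a Nothing.
explain :: err -> Maybe a -> Either err a
explain err = maybe (Left err) Right

-- Explain each field where the root cause is still unambiguous ...
phone', email', creditCardNum' :: Map Key Value -> Either Err Value
phone' = explain MissingPhone . Map.lookup "phone"
email' = explain MissingEmail . Map.lookup "email"
creditCardNum' = explain MissingCardNum . Map.lookup "card-number"

-- ... and only then combine. The first missing field is reported.
formDataE :: Map Key Value -> Either Err FormData
formDataE m = FormData <$> phone' m <*> email' m <*> creditCardNum' m
```

The `Either` applicative stops at the first `Left`, so this version reports one missing field, not all of them. That is still more than `Nothing` can say.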
Harmful Real-World Examples
These are in no particular order, other than this presentation reuses types defined earlier.
Maybe on Hackage
servant-multipart example:
If you used older versions of servant-multipart you are familiar with
status code 400, message “fromMultipart returned Nothing”.
You must have noticed that your logs have been silent as well.
The fix was implemented in 0.11.6
New version (much better):
-- version 0.11.6
class FromMultipart tag a where
  fromMultipart :: MultipartData tag -> Either String a
Old version:
-- version 0.11.5
class FromMultipart tag a where
  fromMultipart :: MultipartData tag -> Maybe a
Any typo, missed form field, wrong form field type submitted from the calling program resulted in a meaningless 400 error.
To work around this issue I ended up implementing fromMultipart in the Either MultiformError monad and converting it to Maybe with something like this:
loggedMultipartMaybe :: Either MultipartException a -> Maybe a
loggedMultipartMaybe (Left err) = do
  let logDetails = ...
  seq
    (debugLogger logDetails) -- uses unsafePerformIO to match your logging style
    Nothing
loggedMultipartMaybe (Right r) = Just r
to, at least, get some logs.
In just one project, that saved me hours in both new development and troubleshooting cost.
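A minimal sketch of the same idea using only `base`. Here `Debug.Trace.trace` stands in for the logger (an assumption made to keep the example self-contained; the code above uses its own `debugLogger`), and `loggedMaybe` is a hypothetical name.

```haskell
import Debug.Trace (trace)

-- Convert Either to Maybe, but leave a trace of the error behind.
-- `trace` prints to stderr when the result is evaluated; it is a
-- debugging aid standing in for a real logger (assumption).
loggedMaybe :: Show err => Either err a -> Maybe a
loggedMaybe (Left err) = trace ("conversion to Maybe dropped: " ++ show err) Nothing
loggedMaybe (Right r) = Just r
```

This is a stopgap, not a design: the error still does not reach the caller, it only reaches the logs.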
For a more complex multipart form that implements both FromMultipart and ToMultipart by hand, verifying that fromMultipart . toMultipart is the identity would have been very hard without some information about errors. If the multipart is called from a different program, different language …
Convenience Combinators:
It should be noted that many popular packages offer convenience Maybe functions even though it is very easy to write this natural transformation:
unExplain :: Either err a -> Maybe a
Why is that? Why not just provide Either versions? Looking at aeson as an example:
decode :: FromJSON a => ByteString -> Maybe a
eitherDecode :: FromJSON a => ByteString -> Either String a
parseEither :: (a -> Parser b) -> a -> Either String b
parseMaybe :: (a -> Parser b) -> a -> Maybe b
The name decode is suggestive of being the one commonly used.
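To illustrate how little a `Maybe` convenience function adds, here is a sketch that stays within `base`: `Text.Read.readEither` plays the role of the `Either`-returning function, and a `Maybe` variant is derived from it in one line. The name `readMaybe'` is illustrative (`base` already ships `readMaybe`).

```haskell
import Text.Read (readEither)

-- Forget the error: the natural transformation from Either err to Maybe.
unExplain :: Either err a -> Maybe a
unExplain = either (const Nothing) Just

-- A Maybe-returning parser derived from the Either-returning one.
readMaybe' :: Read a => String -> Maybe a
readMaybe' = unExplain . readEither
```

Going the other way is not possible: once the error is dropped, no function can recover it.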
Having aeson in the spotlight, I like this part of their documentation:
The basic ways to signal a failed conversion are as follows:
- fail yields a custom error message: it is the recommended way of reporting a failure;
- empty (or mzero) is uninformative: use it when the error is meant to be caught by some (<|>);
Overuse of mzero in parsing code is as bad as the overuse of Maybe.
MonadFail and Maybe:
A number of packages try to be polymorphic and use a MonadFail constraint to provide information about unexpected errors (e.g. time, mongoDB). Sadly, base provides no standard way to retrieve this information. The ticket to add an Either String instance is a no-go for now.
The packages which use MonadFail do not offer convenience MonadFail monads either. It seems wrong and asymmetric to force the caller to define their type for retrieving error information.
But MonadFail has a Maybe instance! I strongly believe in the “make writing good code easy, bad code hard” design principle. This is clearly violated here.
Also, notice this part of documentation (in base Control.Monad.Fail):
If your Monad is also MonadPlus, a popular definition is
fail _ = mzero
(quiet sob)
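For illustration, this is roughly the type a caller ends up defining to get the message back. `ErrM` and `parseAge` are hypothetical names; `parseAge` stands in for a library function that is polymorphic in `MonadFail`. The sketch assumes a GHC where `MonadFail` is exported from the Prelude (base 4.13 or later).

```haskell
-- Hypothetical caller-defined wrapper: Either String with a MonadFail instance.
newtype ErrM a = ErrM { runErrM :: Either String a }

instance Functor ErrM where
  fmap f (ErrM x) = ErrM (fmap f x)

instance Applicative ErrM where
  pure = ErrM . Right
  ErrM f <*> ErrM x = ErrM (f <*> x)

instance Monad ErrM where
  ErrM x >>= f = ErrM (x >>= runErrM . f)

-- Keep the message that `fail` was given instead of dropping it.
instance MonadFail ErrM where
  fail = ErrM . Left

-- Stand-in for a library function that reports errors through MonadFail.
parseAge :: MonadFail m => Int -> m Int
parseAge n
  | n < 0 = fail ("negative age: " ++ show n)
  | otherwise = pure n
```

Run at `Maybe`, `parseAge (-1)` is just `Nothing`; run at `ErrM`, the same call yields the message. The easy path is the lossy one.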
Cut catMaybes
Replacing
catMaybes :: [Maybe a] -> [a]
with
partitionEithers :: [Either e a] -> ([e], [a])
very often improves the robustness of code.
Consider a Contact record type with a cEmail :: Maybe Email field. We can get
- a list of emails by using catMaybes
- a list of emails and the information about which Contact-s do not have an email using partitionEithers.
The Contact list could come from parsed JSON and could contain a list of company employees, or it could be a parsed mail-mime CC: header. Missing email should be rare.
import Control.Arrow ((&&&))
import Data.Either (partitionEithers)
import Data.Maybe (catMaybes)
-- maybe version
getEmails :: [Contact] -> [Email]
getEmails = catMaybes . map cEmail
-- either version
data MissingData a = MissingEmailData a | ...
getEmails' :: [Contact] -> ([MissingData Contact], [Email])
getEmails' = partitionEithers . map (uncurry explain . (MissingEmailData &&& cEmail))
These are “it is rare, therefore can be ignored” vs “it is rare and therefore cannot be ignored” approaches.
I do not know what is the proper name for software that does not handle corner cases. I call it expensive ;)
I question the use of catMaybes every time I see it.
I tried to be terse in the above example, but it is clear that Either is more work. Terseness and ease of programming are IMO some of the reasons for Maybe overuse.
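A self-contained version of the comparison, as a sketch: `Email` is assumed to be a plain `String`, `Contact` gets a hypothetical `cName` field so the example has something to show, and `MissingData` is reduced to a single constructor.

```haskell
import Data.Either (partitionEithers)
import Data.Maybe (catMaybes)

type Email = String -- assumption: simplified for the sketch

data Contact = Contact
  { cName :: String -- hypothetical field, only here for readability
  , cEmail :: Maybe Email
  } deriving (Eq, Show)

newtype MissingData a = MissingEmailData a
  deriving (Eq, Show)

-- Maybe version: contacts without an email silently disappear.
getEmails :: [Contact] -> [Email]
getEmails = catMaybes . map cEmail

-- Either version: contacts without an email are returned next to the emails.
getEmails' :: [Contact] -> ([MissingData Contact], [Email])
getEmails' = partitionEithers . map toEither
  where
    toEither c = maybe (Left (MissingEmailData c)) Right (cEmail c)
```

The caller of `getEmails'` can still choose to ignore the first component, but now that is a visible decision at the call site.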
HKD pattern
The Higher-Kinded Data pattern is super cool and can be very useful. This post explains what it is: reasonablypolymorphic on HKD pattern. My example follows the reasonablypolymorphic blog closely.
In a nutshell, we can create a record type like
data Person f = Person {
  pName :: f String
  , pAge :: f Int
  -- imagine a lot more fields here
  }
parametrized by a type of kind * -> *, e.g. Maybe or Identity. All or some of the fields in that record type have an f in front of them.
The HKD pattern is about using generic programming to transform that record based on operations that work on the fields.
For example, we can hoist forall a . f a -> g a functions to hkd f -> hkd g (here Person f -> Person g).
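Written by hand for the two-field `Person`, the hoisting looks like the sketch below. `hoistPerson` is a hypothetical name; the HKD pattern derives the equivalent generically so that it does not have to be repeated per record.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Maybe (maybeToList)

data Person f = Person
  { pName :: f String
  , pAge :: f Int
  }

-- Apply the same natural transformation to every field.
-- The argument must work for all `a`, hence the rank-2 type.
hoistPerson :: (forall a. f a -> g a) -> Person f -> Person g
hoistPerson nt (Person n a) = Person (nt n) (nt a)

-- Example: turn a Person Maybe into a Person [].
personAsLists :: Person Maybe -> Person []
personAsLists = hoistPerson maybeToList
```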
Imagine that Person has a long list of fields and there is a web form for entering them.
The post describes a completely generic validation function that, when restricted to our Person type, looks like this:
validate :: Person Maybe -> Maybe (Person Identity)
I hate when this is done to me: it took 5 minutes to enter the information, the submit button is grayed out, and I see no way to move forward. Typically, when that happens, it is caused by a JavaScript error.
In this example it is not a programming bug, it is a design decision: a very sophisticated way to check that the user entered all fields that does not provide information about which fields they missed.
Web form data entry aside, I challenge you to find one meaningful example where the above validate is useful. One-off data science code that processes a massive amount of data and requires all fields to be present to be useful? All examples I can come up with seem far-fetched and still would benefit from having Either.
Questions I am asking:
- Would you expect production code somewhere out there that validates user input using the HKD pattern and actually uses Maybe?
- Did reasonablypolymorphic confuse or simplify things by using Maybe in its example?
In the reasonablypolymorphic post Maybe is not just in the validate function. The post defines the whole GValidate boilerplate that assumes Maybe.
Fortunately, the approach can be generalized to other f types.
A meaningful validation would have a type
data FieldInfo = ...
validate :: Person Maybe -> Either [FieldInfo] (Person Identity)
This is arguably more work to do and beyond what HKD pattern can offer. However, it is quite possible to do something like this generically:
data FieldInfo = ...
validate :: Person (Either FieldInfo) -> Either [FieldInfo] (Person Identity)
Check out the documentation for the barbies package, it comes with exactly this example! Notice, some boilerplate work is still needed to annotate missing values with field information:
addFieldInfo :: Person Maybe -> Person (Either FieldInfo)
addFieldInfo would need to happen outside of the HKD pattern.
One can argue that a better solution would be not to use Person Maybe and convert user form data entry directly to Person (Either FieldInfo).
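To make the two signatures concrete, here is a hand-written sketch for the two-field `Person`. It does not use barbies; `FieldInfo` is assumed to be just a field name, and both function bodies are illustrative, not the generic implementations the package provides.

```haskell
import Data.Either (lefts)
import Data.Functor.Identity (Identity (..))

data Person f = Person
  { pName :: f String
  , pAge :: f Int
  }

-- Assumption: FieldInfo only carries the name of the field.
newtype FieldInfo = FieldInfo String
  deriving (Eq, Show)

-- The boilerplate step: annotate each missing value with its field name.
addFieldInfo :: Person Maybe -> Person (Either FieldInfo)
addFieldInfo (Person n a) = Person (note "pName" n) (note "pAge" a)
  where
    note field = maybe (Left (FieldInfo field)) Right

-- Succeed only if every field is present; otherwise list all missing fields.
validate :: Person (Either FieldInfo) -> Either [FieldInfo] (Person Identity)
validate (Person (Right n) (Right a)) = Right (Person (Identity n) (Identity a))
validate (Person n a) = Left (lefts [() <$ n, () <$ a])
```

The hand-written `validate` has to be extended for every new field, which is exactly the repetition a generic solution removes.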
Traversable with Maybe
barbies validation of Person required a traversal of the HKD type. So, maybe, we should consider a somewhat simpler design where fields are unified into one type. Keeping up with the reasonablypolymorphic example:
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
data Person' a = Person' {
  pName' :: a
  , pAge' :: a
  -- imagine a lot more fields here
  } deriving (..., Functor, Foldable, Traversable)
And “now we are cooking with gas”!
data FieldData = ... -- unifying type for Person' fields
validateMaybe :: Person' (Maybe FieldData) -> Maybe (Person' FieldData)
validateMaybe = traverse id
validateEither :: Person' (Either FieldInfo FieldData) -> Either FieldInfo (Person' FieldData)
validateEither = traverse id
The new validateEither is not as good as the barbies version. It gives the user only one of the fields they missed.
IMO validateMaybe is useless for any data entry form validation.
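If reporting every missing field matters, the traversal can be run in an applicative that accumulates errors instead of stopping at the first one. The sketch below hand-rolls such an applicative; `Validation` and `validateAll` are illustrative names (similar types exist in several packages, none of which is assumed here).

```haskell
{-# LANGUAGE DeriveTraversable #-}

data Person' a = Person'
  { pName' :: a
  , pAge' :: a
  } deriving (Eq, Show, Functor, Foldable, Traversable)

-- Either with an Applicative that combines errors instead of short-circuiting.
newtype Validation e a = Validation { runValidation :: Either e a }

instance Functor (Validation e) where
  fmap f (Validation x) = Validation (fmap f x)

instance Semigroup e => Applicative (Validation e) where
  pure = Validation . Right
  Validation (Left e1) <*> Validation (Left e2) = Validation (Left (e1 <> e2))
  Validation (Left e1) <*> Validation (Right _) = Validation (Left e1)
  Validation (Right f) <*> Validation x = Validation (fmap f x)

-- Collect every error found during the traversal, in field order.
validateAll :: Person' (Either e a) -> Either [e] (Person' a)
validateAll = runValidation . traverse (Validation . either (Left . pure) Right)
```

There is no lawful `Monad` instance matching this `Applicative`, which is one reason the accumulating behaviour is not what `Either` gives by default.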
The validateMaybe example is here for a reason. It directly mimics the example discussed in Elementary Programming, which used
mapMaybe :: (a -> Maybe b) -> [a] -> Maybe [b]
mapMaybe = traverse
as its only example.
Maybe is viral.
Questioning Record Types with all Maybe Fields
Besides being a natural fit for data coming from something like a web form, there are other reasons for designing record types with many Maybe fields.
Here is my attempt at debunking some of them.
Recreating Java Beans with Maybe ;)
Defining record types with many Maybe fields makes it easy to construct such records if you care about only some of the fields.
This can be done via some empty defaulting mechanism (I will use the Person type defined above to serve as an example):
emptyPerson :: Person Maybe
emptyPerson = Person Nothing Nothing
or with the use of Monoid and mempty. emptyPerson could be defined in a generic way as well (see hkd-default).
Say your code cares about pAge only; you can just set pAge:
isDrinkingAge :: Person Maybe -> Bool
isDrinkingAge p = maybe False (>= 21) (pAge p) -- assumption: a missing age counts as not of drinking age
test10YearOld = emptyPerson {pAge = Just 10}
test = isDrinkingAge test10YearOld
I do not like this approach. It feels like poorly typed code.
It also reminds me of null, hence the Java Bean reference. (Java Bean was a popular pattern in the Java ecosystem. A Bean needs to have an empty constructor, and a setter / getter method for each field. It seems very similar to a record type with lots of Maybe fields.)
An easy improvement would be to create an Age type
newtype Age = Age Int
data Person'' f = Person'' {
  ...
  , pAge'' :: f Age
  -- imagine a lot more fields here
  }
isDrinkingAge' :: Age -> Bool
isDrinkingAge' (Age a) = a >= 21
If we feel strongly about checking age on the Person type, we can use Haskell’s ability to program with polymorphic fields:
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE DuplicateRecordFields #-}
import Data.Functor.Identity (Identity (..))
import GHC.Records
isDrinkingAge :: HasField "pAge" p (Identity Int) => p -> Bool
isDrinkingAge p = runIdentity (getField @"pAge" p) >= 21
newtype AgeTest = AgeTest { pAge :: Identity Int}
test40YearOld = isDrinkingAge $ AgeTest (Identity 40)
IMO creating record types with lots of Maybe fields for the benefit of easy construction is not a good pattern.
Maybe-First Monoid fields
Maybe (First _) is a valid Monoid.
Using Maybe-First semantics, mappend selects the first non-Nothing element. Appending can be implemented as:
import Data.Semigroup (First (..))
a <> b = fmap getFirst $ fmap First a <> fmap First b
-- or in a more "elementary" way as:
Nothing <> b = b
a <> Nothing = a
a <> b = a
You can use this approach on each field to define Monoid instances for large record types that consist of Maybe fields.
This pattern provides a convenient defaulting mechanism and allows setting groups of fields at once using mappend.
overrides <> record
This approach can also result in very weird data combinations if one is not careful:
gd = (mempty :: Person Maybe) {pName = Just "grandpa"} -- missing age
baby = (mempty :: Person Maybe) {pName = Just "baby", pAge = Just 1}
grampatoddler = gd <> baby -- it is easy to create surprising data
The reverse: a First (Maybe _) Monoid instance is far less convenient to use but is much less surprising.
I do not think that the Maybe-First Monoid is necessarily bad. There is simply a trade-off between the conveniences it offers and its gotchas. I prefer designs that provide more safety over accidental bugs.
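The surprising combination can be reproduced with a small monomorphic record. In this sketch `PersonM` is a hypothetical stand-in for `Person Maybe` whose fields are wrapped in `Maybe (First _)`, so that the field-wise `Semigroup` instance can be written directly.

```haskell
import Data.Semigroup (First (..))

-- Every field uses Maybe-First semantics: the first non-Nothing value wins.
data PersonM = PersonM
  { pName :: Maybe (First String)
  , pAge :: Maybe (First Int)
  } deriving (Eq, Show)

instance Semigroup PersonM where
  PersonM n1 a1 <> PersonM n2 a2 = PersonM (n1 <> n2) (a1 <> a2)

instance Monoid PersonM where
  mempty = PersonM Nothing Nothing

gd, baby, grampatoddler :: PersonM
gd = mempty { pName = Just (First "grandpa") } -- missing age
baby = mempty { pName = Just (First "baby"), pAge = Just (First 1) }
grampatoddler = gd <> baby -- name from gd, age from baby
```

Each field is merged independently, so nothing stops a one-year-old "grandpa" from being constructed.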
Alternative typeclass
Alternative (<|>) is a very convenient tool often used with parsers. It can be dangerous on its own merit. It can suppress error information in questionable ways if overused (I wrote a separate post about it).
There is currently no Alternative instance for Either err but there is one for Maybe. This creates a temptation to unExplain the err…
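A fallback combinator for `Either` that keeps the errors is short to write by hand. The sketch below is one possible design, not a standard function: `orElse` is a hypothetical name, and combining errors with a `Semigroup` is an assumption about what the caller wants.

```haskell
import Control.Applicative ((<|>))

-- Maybe's Alternative: if both sides fail, nothing is known about why.
viaMaybe :: Maybe Int
viaMaybe = Nothing <|> Nothing

-- Hypothetical Either counterpart: try the left side, then the right,
-- and keep both errors if both fail.
orElse :: Semigroup e => Either e a -> Either e a -> Either e a
orElse (Left e1) (Left e2) = Left (e1 <> e2)
orElse (Left _) r = r
orElse r _ = r
```

Note that `orElse` still discards the left error when the right side succeeds; whether that is acceptable depends on the use case.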
Good uses of Maybe
I think the following old design principle (Postel’s law) is still valid (on the implementation side)
Lenient input, Strict Output.
This means Maybe is great as an input parameter, less so in the result. The above Error Clarity Rule should be the overriding factor here. If the call-site can disambiguate what Nothing is, then Maybe results are fine.
In particular, prisms are typically used on non-nested coproducts, thus, the call-site can disambiguate at which level the pattern match failed. lookup in Data.Map and find for a Foldable are all perfectly good choices for a Maybe result type.
Why Maybe is Overused? Possible Explanations
IMO these are the main causes of the overuse:
- Using Maybe is simpler than Either. If doing the right thing takes more time and effort it will often not be done.
- Coding with Maybe is terser. Thus, coding with Maybe may seem more elegant.
- Maybe is more expressive. Examples: the Alternative instance; to use Monad, Applicative with Either err you need to unify on the err type, which is extra work.
- Sophisticated abstractions can obscure common sense. Maybe is likely to fit the abstraction more often and more easily than Either. Oversimplifications are nothing new in mathematical modeling. Anyone who studied, for example, mathematical physics has seen a lot of crazy oversimplifications. Code design appears not that different.
- Non production code. Lots of Haskell code is about CS research. Lots of Haskell code is pet projects. Such code does not need to be maintained in production. Maybe is good enough.
If the developer can disambiguate the reason for Nothing, then the use of Either is optional. This is not an overuse case and is justified.
I started with a link to the Elementary Programming post and want to end with it. Would more explicit “elementary” programs help in spotting obvious things like error information loss? I think they could. Starting from requirements and going back to the most elementary solution would probably never arrive at:
mapMaybe :: (a -> Maybe b) -> [a] -> Maybe [b]
if only the requirements cared about errors.
I do not advocate avoiding abstractions, just do not forget errors on the way. Don’t throw the baby out with the bathwater.
I am sure I do not have a full understanding of why and how Maybe is overused. The intent of this post is to start a discussion.
Discussion links:
- github discussions
More Links
(Links and information added later, after the original post)
The Trouble with Typed Errors - My post is often read as “Use Either” (I would prefer just “Do not overuse Maybe”). Matt Parsons’s great article talks about problems with monolithic error types and more. Extensible error types are an important topic worth its own github awesome page.
Haskell Weekly Podcast has discussed this post. Thank you so much!