Last Modified on February 17, 2021
- (2021.01.18) Added MonadFail subsection to Maybe on Hackage
- (2021.02.17-now) More Links (list of related links)
- (2022.08.30) reclassified under patterns-of-erroneous-code tag
Motivating References:
I was very happy to find Michael Snoyman’s Haskell The Bad Parts series.
I was also motivated by the Elementary Programming post (reddit).
Similar: reddit data_maybe_harmful
This post is Haskell specific.
This post treats the term error colloquially; it does not distinguish between exceptions and errors. In particular, error information loss refers to exceptions, not errors.
Nutshell
Maybe is the functional answer to null - the billion dollar mistake. I claim that using Maybe can still be problematic.
IMO Maybe is often overused. I have started to question the use of Maybe every time I see it in the code base I maintain. The result is either accepting its usage or rewriting the code to use Either. This approach has been effective in creating more robust code. I am not claiming that Maybe has no place in a well written code, only that its use should be closely examined. I have seen brilliant code that has been hard to maintain because of its overuse of Maybe. This post shares a perspective of someone who maintains a complex Haskell code base.
Maybe improves over null. But it does not supersede it. Languages that have null also have easy access to logging, stack traces etc.
Maybe typically represents data that can be missing or a computation that can result in an unknown error.
What you typically care about is what data is missing and what is the error.
Code correctness and reasoning about code are, arguably, the defining aspects of FP. Reasoning about code typically refers to some advanced use of the type system or formal methods. IMO “reasoning about code” should start with reasoning about errors and corner cases (like missing data). This is why the use of Maybe needs to be examined and questioned. In my experience this aspect of reasoning about code is often overlooked.
Reasoning about errors is not easy: The type system can’t help with errors that bypass it (e.g. error :: String -> a). It can’t help with exceptions which were intentionally suppressed into Nothing (the focus of this post). And, the list goes on…
My points / pleas are:
- The ecosystem would be better off without offering convenience combinators that return Maybe if an equivalent returning Either exists
- Examples, tutorials, and blog posts should favor Either over Maybe
- Developers should be careful about not overusing Maybe
Error Clarity Rule
What does Nothing mean? If the reason behind it can be disambiguated to one root cause, then I consider the use of Maybe justified. Otherwise, I question its use.
Consider this code:
import Data.Map
import Prelude hiding (lookup)
type Key = String
type Value = ...
-- OK
phone :: Map Key Value -> Maybe Value
phone = lookup "phone"
-- OK
email :: Map Key Value -> Maybe Value
email = lookup "email"
-- OK
creditCardNum :: Map Key Value -> Maybe Value
creditCardNum = lookup "card-number"
data FormData = FormData {
    fdPhone :: Value
  , fdEmail :: Value
  , fdCardNum :: Value
  }
-- less OK
formData :: Map Key Value -> Maybe FormData
formData map = FormData <$> phone map <*> email map <*> creditCardNum map
You can clearly explain the first 3 functions:
explain :: err -> Maybe a -> Either err a
explain = ...
data Err = MissingEmail | MissingPhone | MissingCardNum
email' :: Map Key Value -> Either Err Value
email' = explain MissingEmail . email
How do I explain formData? I am stuck with:
data UnknownFieldMissing = UnknownFieldMissing
formData' :: Map Key Value -> Either UnknownFieldMissing FormData
formData' = explain UnknownFieldMissing . formData
and that “Unknown” is not a field name.
The bigger the record, the bigger the problem.
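For contrast, here is a minimal sketch of the same form assembled from Either-returning lookups, so that a failure names the missing field. It assumes Value is a String (the post leaves it unspecified), and the names phoneE, emailE, cardNumE and formDataE are hypothetical, chosen only to avoid clashing with the definitions above.

```haskell
import           Data.Map (Map)
import qualified Data.Map as Map

type Key   = String
type Value = String -- assumption: the post leaves Value unspecified

data Err = MissingEmail | MissingPhone | MissingCardNum
  deriving (Show, Eq)

data FormData = FormData
  { fdPhone   :: Value
  , fdEmail   :: Value
  , fdCardNum :: Value
  } deriving (Show, Eq)

-- | Attach an explanation to a missing value.
explain :: err -> Maybe a -> Either err a
explain err = maybe (Left err) Right

-- Each lookup has exactly one reason to fail, so it can be named.
phoneE, emailE, cardNumE :: Map Key Value -> Either Err Value
phoneE   = explain MissingPhone   . Map.lookup "phone"
emailE   = explain MissingEmail   . Map.lookup "email"
cardNumE = explain MissingCardNum . Map.lookup "card-number"

-- | The record is built from explained fields, so the error survives.
formDataE :: Map Key Value -> Either Err FormData
formDataE m = FormData <$> phoneE m <*> emailE m <*> cardNumE m
```

With this version, a map that lacks the "email" key yields Left MissingEmail instead of Nothing. Either still stops at the first missing field; reporting all of them needs an accumulating applicative.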
Harmful Real-World Examples
These are in no particular order, other than this presentation reuses types defined earlier.
Maybe on Hackage
servant-multipart example:
If you used older versions of servant-multipart you are familiar with
status code 400, message “fromMultipart returned Nothing”.
You must have noticed that your logs have been silent as well.
The fix was implemented in 0.11.6
New version (much better):
-- version 0.11.6
class FromMultipart tag a where
  fromMultipart :: MultipartData tag -> Either String a
Old version:
-- version 0.11.5
class FromMultipart tag a where
  fromMultipart :: MultipartData tag -> Maybe a
Any typo, missed form field, wrong form field type submitted from the calling program resulted in a meaningless 400 error.
To work around this issue I ended up implementing fromMultipart in the Either MultipartException monad and converting it to Maybe with something like this:
loggedMultipartMaybe :: Either MultipartException a -> Maybe a
loggedMultipartMaybe (Left err) = do
  let logDetails = ...
  seq
    (debugLogger logDetails) -- uses unsafePerformIO to match your logging style
    Nothing
loggedMultipartMaybe (Right r) = Just r
to, at least, get some logs.
In just one project, that saved me hours in both new development and troubleshooting cost.
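A self-contained variant of this workaround can be sketched with Debug.Trace from base, swapped in here for the debugLogger helper above. The name loggedMaybe and the Show constraint on the error type are assumptions made for illustration. trace writes to stderr as a side effect of evaluation, so this is a debugging aid rather than a logging strategy.

```haskell
import Debug.Trace (trace)

-- | Collapse an Either into a Maybe, but report the error first.
-- 'trace' prints its message to stderr when the result is evaluated.
loggedMaybe :: Show err => Either err a -> Maybe a
loggedMaybe (Left err) = trace ("fromMultipart failed: " ++ show err) Nothing
loggedMaybe (Right r)  = Just r
```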
For a more complex multipart form that implements both FromMultipart and ToMultipart by hand, verifying that fromMultipart . toMultipart is the identity would have been very hard without some information about errors. If the multipart is called from a different program, different language …
Convenience Combinators:
It should be noted that many popular packages offer convenience Maybe functions even though it is very easy to write this natural transformation:
unExplain :: Either err a -> Maybe a
Why is that? Why not just provide Either versions? Looking at aeson as an example:
decode :: FromJSON a => ByteString -> Maybe a
eitherDecode :: FromJSON a => ByteString -> Either String a
parseEither :: (a -> Parser b) -> a -> Either String b
parseMaybe :: (a -> Parser b) -> a -> Maybe b
The name decode is suggestive of being the one commonly used.
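To make the asymmetry concrete: a Maybe-returning variant can always be recovered from the Either-returning one in a single line, while the reverse direction cannot bring back the lost message. The sketch below uses only base; parseAgeEither and parseAgeMaybe are hypothetical stand-ins for a library's Either / Maybe pair, not functions from aeson.

```haskell
import Text.Read (readEither)

-- | Forget the error: the natural transformation from Either err to Maybe.
unExplain :: Either err a -> Maybe a
unExplain = either (const Nothing) Just

-- | Hypothetical Either-returning parser; a Left carries readEither's message.
parseAgeEither :: String -> Either String Int
parseAgeEither = readEither

-- | The convenience version can be derived by the caller when it is wanted.
parseAgeMaybe :: String -> Maybe Int
parseAgeMaybe = unExplain . parseAgeEither
```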
Having aeson in the spotlight, I like this part of their documentation:
The basic ways to signal a failed conversion are as follows:
- fail yields a custom error message: it is the recommended way of reporting a failure;
- empty (or mzero) is uninformative: use it when the error is meant to be caught by some (<|>);
Overuse of mzero in parsing code is as bad as the overuse of Maybe.
MonadFail and Maybe:
A number of packages try to be polymorphic and use a MonadFail constraint to provide information about unexpected errors (e.g. time, mongoDB). Sadly, base provides no standard way to retrieve this information. The ticket to add an Either String instance is a no-go for now.
The packages which use MonadFail do not offer convenience MonadFail monads either. It seems wrong and asymmetric to force the caller to define their type for retrieving error information.
But MonadFail has a Maybe instance! I strongly believe in the “make writing good code easy, bad code hard” design principle. This is clearly violated here.
Also, notice this part of documentation (in base Control.Monad.Fail):
If your Monad is also MonadPlus, a popular definition is
fail _ = mzero
(quiet sob)
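As a sketch of what callers currently have to write themselves, here is a small MonadFail monad that keeps the message. The type Failing and the function parsePositive are hypothetical and are not part of base or of any package mentioned above. The code assumes GHC 8.8 or later, where MonadFail is exported from the Prelude.

```haskell
-- | Hypothetical caller-defined monad that keeps the 'fail' message.
newtype Failing a = Failing { runFailing :: Either String a }
  deriving (Show, Eq)

instance Functor Failing where
  fmap f (Failing e) = Failing (fmap f e)

instance Applicative Failing where
  pure = Failing . Right
  Failing f <*> Failing a = Failing (f <*> a)

instance Monad Failing where
  Failing (Left e)  >>= _ = Failing (Left e)
  Failing (Right a) >>= f = f a

instance MonadFail Failing where
  fail = Failing . Left

-- | Hypothetical library-style function, polymorphic in the failure monad.
parsePositive :: MonadFail m => Int -> m Int
parsePositive n
  | n > 0     = pure n
  | otherwise = fail ("not positive: " ++ show n)
```

Instantiated at Maybe, parsePositive (-1) is Nothing and the message is discarded. Instantiated at Failing, runFailing (parsePositive (-1)) is Left "not positive: -1".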
Cut catMaybes
Replacing
catMaybes :: [Maybe a] -> [a]
with
partitionEithers :: [Either e a] -> ([e], [a])
very often improves the robustness of code.
Consider a Contact record type with a cEmail :: Maybe Email field. We can get
- a list of emails by using catMaybes
- a list of emails and the information about which Contact-s do not have an email by using partitionEithers.
The Contact list could come from parsed JSON and contain a list of company employees, or it could be a parsed mail-mime CC: header. Missing email should be rare.
import Control.Arrow ((&&&))
-- maybe version
getEmails :: [Contact] -> [Email]
getEmails = catMaybes . map cEmail
-- either version
data MissingData a = MissingEmailData a | ...
getEmails' :: [Contact] -> ([MissingData Contact], [Email])
getEmails' = partitionEithers . map (uncurry explain . (MissingEmailData &&& cEmail))
These are “it is rare, therefore can be ignored” vs “it is rare and therefore cannot be ignored” approaches.
I do not know what is the proper name for software that does not handle corner cases. I call it expensive ;)
I question the use of catMaybes every time I see it.
I tried to be terse in the above example, but it is clear that Either is more work. Terseness and ease of programming are IMO some of the reasons for Maybe overuse.
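A runnable version of the comparison, spelled out a little less tersely, might look like the sketch below. The Contact fields, the String representation of Email, and the single-constructor MissingData are assumptions made to keep the example self-contained.

```haskell
import Data.Either (partitionEithers)
import Data.Maybe (catMaybes)

type Email = String -- assumption: a plain String stands in for Email

data Contact = Contact
  { cName  :: String
  , cEmail :: Maybe Email
  } deriving (Show, Eq)

-- Only the constructor needed here; the post's type has more.
newtype MissingData a = MissingEmailData a
  deriving (Show, Eq)

explain :: err -> Maybe a -> Either err a
explain err = maybe (Left err) Right

-- | Maybe version: contacts without an email silently disappear.
getEmails :: [Contact] -> [Email]
getEmails = catMaybes . map cEmail

-- | Either version: contacts without an email are returned alongside.
getEmails' :: [Contact] -> ([MissingData Contact], [Email])
getEmails' = partitionEithers . map (\c -> explain (MissingEmailData c) (cEmail c))
```

Both functions return the same emails. Only getEmails' also tells the caller which contacts were skipped, which is what makes the rare case visible.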
HKD pattern
Higher-Kinded Data pattern is super cool and can be very useful. This post explains what it is: reasonablypolymorphic on HKD pattern. My example follows reasonablypolymorphic blog closely.
In a nutshell, we can create a record type like
data Person f = Person {
    pName :: f String
  , pAge :: f Int
  -- imagine a lot more fields here
  }
parametrized by a type of kind * -> * e.g. Maybe or Identity. All or some of the fields in that record type have an f in front of them.
The HKD pattern is about using generic programming to transform that record based on operations that work on the fields.
For example, we can hoist forall a . f a -> g a functions to hkd f -> hkd g (here Person f -> Person g).
Imagine that Person has a long list of fields and there is a web form for entering them.
The post describes a completely generic validation function that, when restricted to our Person type, looks like this:
validate :: Person Maybe -> Maybe (Person Identity)
I hate when this is done to me: it took 5 minutes to enter the information, the submit button is grayed out, and I see no way to move forward. Typically, when that happens, it is caused by a JavaScript error.
In this example it is not a programming bug, it is a design decision: a very sophisticated way to check that the user entered all fields, one that does not provide information about which fields they missed.
Web form data entry aside, I challenge you to find one meaningful example where the above validate is useful. A one off data science code that processes massive amount of data and requires all fields to be present to be useful? All examples I can come up with seem far-fetched and still would benefit from having Either.
Questions I am asking:
- Would you expect production code somewhere out there that validates user input using the HKD pattern and actually uses Maybe?
- Did reasonablypolymorphic confuse or simplify things by using Maybe in its example?
In reasonablypolymorphic post Maybe is not just in the validate function. The post defines the whole GValidate boilerplate that assumes Maybe.
Fortunately, the approach can be generalized to other f types.
A meaningful validation would have a type
data FieldInfo = ...
validate :: Person Maybe -> Either [FieldInfo] (Person Identity)
This is arguably more work to do and beyond what HKD pattern can offer. However, it is quite possible to do something like this generically:
data FieldInfo = ...
validate :: Person (Either FieldInfo) -> Either [FieldInfo] (Person Identity)
Check out the documentation for the barbies package, it comes with exactly this example! Notice, some boilerplate work is still needed to annotate missing values with field information:
addFieldInfo :: Person Maybe -> Person (Either FieldInfo)
addFieldInfo would need to happen outside of the HKD pattern.
One can argue that a better solution would be not to use Person Maybe and convert user form data entry directly to Person (Either FieldInfo).
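For a small record, the accumulating validation can also be written by hand, without generics. The sketch below is not the barbies API: Validation, required, and the String payload of FieldInfo are assumptions made for illustration, and the error-accumulating applicative is defined locally to keep the example self-contained.

```haskell
import Data.Functor.Identity (Identity (..))

data Person f = Person
  { pName :: f String
  , pAge  :: f Int
  }

-- Assumption: a field is identified by its name.
newtype FieldInfo = FieldInfo String
  deriving (Show, Eq)

-- | Either with an Applicative that combines errors instead of stopping
-- at the first one.
newtype Validation e a = Validation { runValidation :: Either e a }

instance Functor (Validation e) where
  fmap f (Validation x) = Validation (fmap f x)

instance Semigroup e => Applicative (Validation e) where
  pure = Validation . Right
  Validation (Left e1) <*> Validation (Left e2) = Validation (Left (e1 <> e2))
  Validation (Left e1) <*> Validation (Right _) = Validation (Left e1)
  Validation (Right f) <*> Validation r         = Validation (fmap f r)

-- | Annotate a possibly missing field with its name.
required :: String -> Maybe a -> Validation [FieldInfo] (Identity a)
required name = Validation . maybe (Left [FieldInfo name]) (Right . Identity)

-- | Hand-written validation that reports every missing field.
validate :: Person Maybe -> Either [FieldInfo] (Person Identity)
validate (Person n a) =
  runValidation (Person <$> required "pName" n <*> required "pAge" a)
```

Here validate (Person Nothing Nothing) returns Left [FieldInfo "pName", FieldInfo "pAge"], so the user learns about both fields at once. The cost is one required call per field, which is exactly the boilerplate the generic approach tries to remove.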
Traversable with Maybe
barbies validation of Person required a traversal of the HKD type. So, maybe, we should consider a somewhat simpler design where fields are unified into one type. Keeping up with the reasonablypolymorphic example:
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
data Person' a = Person' {
    pName' :: a
  , pAge' :: a
  -- imagine a lot more fields here
  } deriving (..., Functor, Foldable, Traversable)
And “now we are cooking with gas”!
data FieldData = ... -- unifying type for Person' fields
validateMaybe :: Person' (Maybe FieldData) -> Maybe (Person' FieldData)
validateMaybe = traverse id
validateEither :: Person' (Either FieldInfo FieldData) -> Either FieldInfo (Person' FieldData)
validateEither = traverse id
New validateEither is not as good as the barbies version. It gives the user only one of the fields they missed.
IMO validateMaybe is useless for any data entry form validation.
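One hedged way to report all missing fields while keeping the simpler Traversable design is to collect the Left values through Foldable before traversing. In this sketch FieldInfo and FieldData are given placeholder String representations, and validateAll is a hypothetical name.

```haskell
{-# LANGUAGE DeriveTraversable #-}

import Data.Either (lefts)
import Data.Foldable (toList)

data Person' a = Person'
  { pName' :: a
  , pAge'  :: a
  } deriving (Show, Eq, Functor, Foldable, Traversable)

newtype FieldInfo = FieldInfo String -- assumption: identified by name
  deriving (Show, Eq)

type FieldData = String -- assumption: placeholder for the unifying type

-- | Stops at the first Left, as in the post.
validateEither :: Person' (Either FieldInfo FieldData)
               -> Either FieldInfo (Person' FieldData)
validateEither = sequenceA

-- | Collects every Left, in field order, before deciding.
validateAll :: Person' (Either FieldInfo FieldData)
            -> Either [FieldInfo] (Person' FieldData)
validateAll p = case lefts (toList p) of
  []   -> either (Left . (: [])) Right (sequenceA p) -- no Lefts, so this is Right
  errs -> Left errs
```

For a record where both fields are missing, validateEither returns only the first FieldInfo, while validateAll returns both.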
The validateMaybe example is here for a reason. It directly mimics the example discussed in Elementary Programming, which used
mapMaybe :: (a -> Maybe b) -> [a] -> Maybe [b]
mapMaybe = traverse
as its only example.
Maybe is viral.
Questioning Record Types with all Maybe Fields
Besides being a natural fit for data coming from something like a web form, there are other reasons for designing record types with many Maybe fields.
Here is my attempt at debunking some of them.
Recreating Java Beans with Maybe ;)
Defining record types with many Maybe fields allows one to construct such records easily if you care about only some of the fields.
This can be done via some empty defaulting mechanism (I will use the Person type defined above to serve as an example):
emptyPerson :: Person Maybe
emptyPerson = Person Nothing Nothing
or with a use of Monoid and mempty. emptyPerson could be defined in a generic way as well (see hkd-default).
Say, your code cares about pAge only, you can just set pAge:
isDrinkingAge :: Person Maybe -> Maybe Bool
isDrinkingAge p = (>= 21) <$> pAge p
test10YearOld = emptyPerson {pAge = Just 10} -- emptyPerson is a Person Maybe
test = isDrinkingAge test10YearOld
I do not like this approach. It feels like poorly typed code.
It also reminds me of null and hence the Java Bean reference. (Java Bean was a popular pattern in the Java ecosystem. A Bean needs to have an empty constructor, and a setter / getter method for each field. It seems very similar to a record type with lots of Maybe fields.)
An easy improvement would be to create Age type
newtype Age = Age Int
data Person'' f = Person'' {
    ...
  , pAge'' :: f Age
  -- imagine a lot more fields here
  }
isDrinkingAge' :: Age -> Bool
isDrinkingAge' (Age a) = a >= 21
If we feel strongly about checking age on the Person type, we can use Haskell’s ability to program with polymorphic fields:
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE DuplicateRecordFields #-}
import GHC.Records
isDrinkingAge :: HasField "pAge" p (Identity Int) => p -> Bool
isDrinkingAge p = runIdentity (getField @"pAge" p) >= 21
newtype AgeTest = AgeTest { pAge :: Identity Int}
test40YearOld = isDrinkingAge $ AgeTest (Identity 40)
IMO creating record types with lots of Maybe fields for the benefit of easy construction is not a good pattern.
Maybe-First Monoid fields
Maybe (First _) is a valid Monoid.
Using Maybe-First semantics, mappend selects the first non-Nothing element. Appending can be implemented as:
import Data.Semigroup (First (..))
a <> b = fmap getFirst $ fmap First a <> fmap First b
-- or in more "elementary" way as:
Nothing <> b = b
a <> Nothing = a
a <> b = a
You can use this approach on each field to define Monoid instances for large record types that consist of Maybe fields.
This pattern provides a convenient defaulting mechanism and allows setting groups of fields at once using mappend.
overrides <> record
This approach can also result in very weird data combinations if one is not careful:
gd = (mempty :: Person Maybe) {pName = Just "grandpa"} -- missing age
baby = (mempty :: Person Maybe) {pName = Just "baby", pAge = Just 1}
grampatoddler = gd <> baby -- it is easy to create surprising data
The reverse: First (Maybe _) Monoid instance is far less convenient to use but is much less surprising.
I do not think that Maybe-First Monoid is necessarily bad. There is simply a trade-off between the conveniences it offers and its gotchas. I prefer designs that provide more safety over accidental bugs.
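The surprising merge can be reproduced with a small, self-contained sketch. It assumes a monomorphic Person with Maybe fields (a simplification of Person Maybe) and a hypothetical helper firstJust that implements the "elementary" Maybe-First append shown earlier.

```haskell
data Person = Person
  { pName :: Maybe String
  , pAge  :: Maybe Int
  } deriving (Show, Eq)

-- | Maybe-First semantics: keep the left value unless it is Nothing.
firstJust :: Maybe a -> Maybe a -> Maybe a
firstJust Nothing b = b
firstJust a       _ = a

-- Field-wise Maybe-First append.
instance Semigroup Person where
  Person n1 a1 <> Person n2 a2 = Person (firstJust n1 n2) (firstJust a1 a2)

instance Monoid Person where
  mempty = Person Nothing Nothing

gd, baby, grampatoddler :: Person
gd            = mempty { pName = Just "grandpa" } -- missing age
baby          = mempty { pName = Just "baby", pAge = Just 1 }
grampatoddler = gd <> baby
```

The result is Person {pName = Just "grandpa", pAge = Just 1}: the name comes from one record and the age from the other, and nothing in the types flags the combination as suspicious.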
Alternative typeclass
Alternative (<|>) is a very convenient tool often used with parsers. It can be dangerous on its own merit. It can suppress error information in questionable ways if overused (I wrote a separate post about it).
There is currently no Alternative instance for Either err but there is one for Maybe. This creates temptation to unExplain the err …
Good uses of Maybe
I think the following old design principle (Postel’s law) is still valid (on the implementation side)
Lenient input, Strict Output.
This means Maybe is great as input parameter, less so in the result. The above Error Clarity Rule should be the overriding factor here. If the call-site can disambiguate what Nothing is, then Maybe results are fine.
In particular, prisms are typically used on non-nested coproducts; thus, the call-site can disambiguate at which level the pattern match failed. lookup in Data.Map and find for a Foldable are both perfectly good choices for a Maybe result type.
Why Maybe is Overused? Possible Explanations
IMO these are the main causes of the overuse:
- Using Maybe is simpler than Either. If doing the right thing takes more time and effort it will often not be done.
- Coding with Maybe is terser. Thus, coding with Maybe may seem more elegant.
- Maybe is more expressive. Examples: Alternative instance; to use Monad, Applicative with Either err you need to unify on the err type which is extra work.
- Sophisticated abstractions can obscure common sense. Maybe is likely to fit the abstraction more often and easier than Either.
Oversimplifications are nothing new in mathematical modeling. Anyone who studied, for example, mathematical physics has seen a lot of crazy oversimplifications. Code design appears not that different.
- Non production code. Lots of Haskell code is about CS research. Lots of Haskell code is pet projects. Such code does not need to be maintained in production. Maybe is good enough.
If the developer can disambiguate the reason for Nothing, then the use of Either is optional. This is not an overuse case and is justified.
I started with a link to the Elementary Programming post and want to end with it. Would more explicit “elementary” programs help in spotting obvious things like error information loss? I think they could. Starting from requirements and going back to the most elementary solution would probably never arrive at:
mapMaybe :: (a -> Maybe b) -> [a] -> Maybe [b]
if only the requirements cared about errors.
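The difference shows up as soon as one element can fail for a reason. In the sketch below the same traverse is used twice, once at Maybe and once at Either String. The names parseAllMaybe and parseAllEither and the error message format are hypothetical.

```haskell
import Text.Read (readMaybe)

-- | Maybe version: one bad element turns the whole result into Nothing.
parseAllMaybe :: [String] -> Maybe [Int]
parseAllMaybe = traverse readMaybe

-- | Either version: the same traversal, but the offending input is kept.
parseAllEither :: [String] -> Either String [Int]
parseAllEither = traverse parseOne
  where
    parseOne s = maybe (Left ("not a number: " ++ show s)) Right (readMaybe s)
```

For the input ["1", "x", "3"], parseAllMaybe returns Nothing, while parseAllEither returns a Left whose message names the element "x".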
I do not advocate avoiding abstractions, just do not forget errors on the way. Don’t throw the baby out with the bathwater.
I am sure I do not have a full understanding of why and how Maybe is overused. The intent of this post is to start a discussion.
Discussion links:
- github discussions
More Links
(Links and information added later, after the original post)
The Trouble with Typed Errors - My post is often read as “Use Either” (I would prefer just “Do not overuse Maybe”). Matt Parsons’s great article talks about problems with monolithic error types and more. Extensible error types are an important topic worth its own github awesome page.
Haskell Weekly Podcast has discussed this post. Thank you so much!