Algebraic Data Types (Sum & Product types)
==========================================

Learning Goal
-------------

In this chapter we'll understand the downside of **not** being able to
communicate *semantics* of a function through the built-in types that we have
seen till now (in previous chapters). We'll try to address that problem, first
by using ``newtype`` and then via product-types. 

After that we will see how to *restrict* the values that a function can accept
by using sum-types, thus making our code type-safe, correct, and less
error-prone. 

We will also be covering pattern-matching along the way.

By the end of this chapter, and after solving the exercises, you should be able
to appreciate when to use vanilla built-in types, when to use custom types (like
newtypes, sum-types, and product-types) in your code, and the pros & cons of
each approach.

Structure vs Semantics - Downside of "regular" type-signatures
--------------------------------------------------------------

Let's try to understand *why* we need more "complicated" types than the regular
built-in types that we've seen till now (i.e. ``String``, ``Char``, ``Bool``,
``Int``, etc). Consider our ``calculateEndTime`` function from the previous
chapter:

.. code-block:: haskell

  calculateEndTime :: Int -> Int -> Int -> (Int, Int)
  calculateEndTime hr mn durationInMins = 
    let (addHour, finalMins) = divMod (mn + durationInMins) 60
    in (hr + addHour, finalMins)

If we look at *only* the function signature, and not the variable names or the
actual function definition, here is what we'll see:

.. code-block:: haskell

  calculateEndTime :: Int -> Int -> Int -> (Int, Int)

By looking at *just* this type signature can you be sure about what each ``Int``
argument represents in real life? Which one is for the hour? Which one is for
the minute? Which one is for the duration? Is the duration in seconds or
minutes?

This is what is meant by "semantics". By the way, here are a couple of
dictionary meanings of the word:

.. code-block:: text

  semantics 

    1. the meaning, or an interpretation of the meaning, of a word, sign, sentence, etc.: 
    2. the meaning of a word, phrase, or text.
    3. the branch of linguistics and logic concerned with meaning.

While the *structure* of the ``calculateEndTime`` function is absolutely clear -
i.e. it takes three integer arguments and returns a 2-tuple of two integers -
the *semantics* are not clear. What does each integer argument really *mean*?
What does each represent in real life?

Let's take another example. Say, you are working on a shopping-cart project, and
one of your team mates has written a function to prepare the content/HTML for an
order confirmation email. Here's what the type-signature looks like:

.. code-block:: haskell

  prepareConfirmationEmail :: String -> String -> [String] -> [String] -> String

I'm pretty certain that you'll be pulling your hair out if you had to work with
just this! Again, the structure is clear, and the compiler is going to enforce
it for you, but that doesn't help you much, does it? 

Since, it's your team's code (and not some closed-source library), you can peek
into the definition of the function and see this:

.. code-block:: haskell

  prepareConfirmationEmail :: String -> String -> [String] -> [String] -> String
  prepareConfirmationEmail orderId subject to cc = _todo

Aha! Just by looking at the variable names you can understand the *semantics* of
the function call. Given the type-signature **along with** the variable names,
now you know what you need to pass where, **and** the compiler will enforce that
you are passing the correct types. (i.e. you are not passing a list of strings
to the ``subject`` argument).

.. note:: 

  By the way **this** is why naming is very, very important in computer
  engineering. It is a well-known saying - **There are two hard problems in
  computer science: cache invalidation and naming things**

So, that should solve the problem then, right? Well, not quite...

Problem with API docs
^^^^^^^^^^^^^^^^^^^^^

Firstly, the standard doc generation tool in Haskell - haddock [1]_ - **does
not** generate API docs with variable names. This is one of the problems that
you might already be facing. Compare the API docs of the ``filter`` function
from Clojure vs Haskell:

.. code-block:: clojure

  ; clojure

  (filter pred coll)

.. code-block:: haskell

  -- haskell

  filter :: (a -> Bool) -> [a] -> [a] 

If you aren't accustomed to reading type-signatures, you'll definitely struggle
with the Haskell API docs. On the other hand Clojure's API docs, don't contain
any type signature, but because of the variable names - ``pred`` (for predicate)
and ``coll`` for collection - make intuitive sense. The flip-side of this is, in
Clojure's case, without seeing usage examples, you cannot really be sure what
type of values to pass to ``pred`` & ``coll``. Does it work for *all* types of
collections? i.e. lists, vectors, hashmaps, trees, or only certain kinds of
collections?

So, the Haskell way of giving more importance to type-signatures is
*structurally superior*, but *semantically inferior.*

But, why can't Haskell generate API docs with variable names, like how Java
does? The *probable* reason has to do with "function currying" and "point-free
syntax", but explaining those would be quite a detour for this chapter.

Variable names alone, do not prevent bugs adequately
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Secondly, just because you know variable names does not protect you adequately
from certain classes of bugs. My favourite example for this is the `Mars Climate
Orbiter failure <https://en.wikipedia.org/wiki/Mars_Climate_Orbiter#Cause_of_failure>`_:

  "...communication with the spacecraft was lost as the spacecraft went into orbital insertion, due to ground-based computer software which produced output in non-SI units of pound (force)-seconds (lbf·s) instead of the SI units of newton-seconds (N·s) specified in the contract between NASA and Lockheed..."

If you're writing a program where one ``Int`` argument is *semantically*
different from another ``Int`` argument, say the ``x`` co-ordinate from the
``y`` co-ordinate -- or in the Mars Climate Orbiter's case, one ``Double``
(force-seconds) from another ``Double`` (newton-seconds) -- just knowing the
variable names will not protect you from inadvertently doing an incorrect
computation. For example: 

- adding the ``x`` co-ordinate to the ``y`` co-ordinate
- dividing a force in ``newtown-second`` with a force in ``pound-seconds``
- substituting someone's name with their email address
- adding something in INR with something in USD

**Thus, we use Haskell's advanced type-system to design custom-types that help
us write better programs.** Let us see how...

.. _positional_product_types:

Positional product types
------------------------

.. note:: 

  In academic literature there are only "sum" types and "product" types.
  However, I have further split product types into two parts - positional and
  named/records - just to make them easier to refer to.

Let us go back to the ``calculateEndTime`` example and fix the *semantics*
problem in the type-signature, by changing this...

.. code-block:: haskell

  calculateEndTime :: Int -> Int -> Int -> (Int, Int)
  calculateEndTime = _todo

...to the following:

.. code-block:: haskell

  data TimeOfDay = MkTimeOfDay Int Int deriving (Eq, Show, Ord)
  
  calculateEndTime :: TimeOfDay -> Int -> TimeOfDay
  calculateEndTime (MkTimeOfDay hr mn) durationInMins = 
    let (addHour, finalMins) = divMod (mn + durationInMins) 60
    in MkTimeOfDay (hr + addHour) finalMins

  -- GHCi> calculateEndTime (MkDuration 10 30) 45
  -- MkDuration 11 15

Let's break this down line-by-line:

.. code-block:: haskell

  data TimeOfDay = MkTimeOfDay Int Int deriving (Eq, Show, Ord)

- We've defined a new "positional product-type"
- The *type* is called ``TimeOfDay`` (which is now at the same *level* as the built-in types like ``Int`` or ``Char``). This means that we can actually use it in type-signatures.
- The *constructor* of the ``TimeOfDay`` *type* is called ``MkTimeOfDay``. This is unlike a lot of other OOP languages where the constructors must either have the *same* name as the type (or class) OR should have a *special* name. For example, here's what a constructor looks like in Java:

  .. code-block:: java
    :emphasize-lines: 5,6,7

    class Person {
      private String name;
      private String email;

      // Cosntructors in Java must have the same name as the class
      //
      Person(String name, String email) {
        this.name = name;
        this.email = email;
      }
    }

  And here is what a constructor in Ruby looks like:

  .. code-block:: ruby
    :emphasize-lines: 2,3,4

    class Person
      # Constructors in Ruby must be named `initialize`
      #
      def initialize(name, email)
        @name = name
        @email = email
      end
    end

  When we talk about "sum" types we'll see why is it possible in Haskell to have constructors with different names for the same type.
- The ``MkTimeOfDay`` constructor "holds" two ``Int`` values. 
- Finally there is a ``deriving (Eq, Show, Ord)`` boilerplate, which we'll understand when we cover type-classes.

  .. important::

    Whenever you are defining custom types, please put ``deriving (Eq, Show,
    Ord)`` at the tne end, otherwise GHCi won't be able to print values of your
    custom type to the terminal. Why exactly is this required will become clearer
    when we cover type-classes.

The reason why I'm calling this type a **positional** product-type is because
there is no *explicit* differentiation between the two ``Int`` values. One of
them holds the hour and the other minute. But, by just looking at the type
definition, we cannot be sure which holds which. We have to fallback on a strong
"positional convention", whenever you're writing the time-of-day, the first one
is hour and the second one is minutes.

Now, before we break down the new definition of ``calcluateEndTime`` let's see
what this type actually is, and how to define values of this type. Let's play
around with it in GHCi:

.. code-block:: haskell

  GHCi> data TimeOfDay = MkTimeOfDay Int Int deriving (Eq, Show, Ord)
  
  GHCi> let y = MkTimeOfDay 10 30
  GHCi> y
  MkTimeOfDay 10 30

  GHCi> :t y
  y :: TimeOfDay

  GHCi> MkTimeOfDay 12 45
  MkTimeOfDay 12 45

  GHCi> y + 20
  <interactive>:10:1: error:
      • No instance for (Num TimeOfDay) arising from a use of ‘+’
      • In the expression: y + 20
        In an equation for ‘it’: it = y + 20

Notice the last part where we tried to do ``y + 20``. The compiler prevented us
from evaluating that expression becuase it doesn't know how to add a
``TimeOfDay`` to an ``Int``, even though ``y`` internally holds two integers.
Some type-safety has started showing up...

Also notice the output of ``y`` and ``:t y``. The first printed the *value*,
i.e. ``MkTimeOfDay 10 30`` and the latter printed the *type*, i.e.
``TimeOfDay``. This is completely analogous to, say, printing ``123`` (the
*value*) and ``Int`` (the *type*).

Now, we have "wrapped" our two integers into a custom data-type called
``TimeOfDay`` that we can't do numerical computations on. However, we *need* to
do numerical computations on the hour & minute values that have been wrapped.
How do we do this?

If this were Java, we would be reaching out for things like getters/setters (for
private variables in a class), or would be directly manipulating the variables
(for public variables):

.. code-block:: java

  Person p = Person("Saurabh", "saurabh@vacationlabs.com");
  
  // either this....
  System.out.println(p.name);

  // ... or this...
  System.out.println(p.getName());

However, Haskell doesn't have this concept of getters/setters. In fact, it
doesn't even have the concept of public/private variables with respect to custom
data types. Instead it has, what is called, **pattern matching**

Pattern matching
^^^^^^^^^^^^^^^^

Pattern-matching allows you to extract "inner / wrapped" values from
constructors of various types. Let's see some pattern-matches in action:

**Pattern-matching a tuple:** This works because ``(,)``, ``(,,)``, ``(,,,)``,
and so on, are actually constructors for 2-tuples, 3 tuples, 4 tuples,
respectively. It's just that tuples are built into the language and you can use
them in the more intuitive infix notation as well (i.e. what we have been using
till now).

.. code-block:: haskell

  -- using the regular "infix" notation for the (,) constructor
  -- 
  GHCi> let (x, y) = (10, 'a')
  GHCi> x
  10
  GHCi> y
  'a'

  -- Using the prefix notation for the (,) constructor
  --
  GHCi> (,) 'a' 'b'
  ('a', 'b')
  GHCi> let (x, y, z) = (,,) 10 20 'a'
  GHCi> x
  10
  GHCi> y
  20
  GHCi> z
  'a'

  -- Even this works...
  GHCi> let (,,) x y z = (,,) 10 20 'a'

  -- And this works as well...
  GHCi> let (,,) x y z = (10, 20, 'b')


**Pattern-matching a positional product type:** *Usually*, it is not possible to
use the constructor for a positional product type in an infix notation. So,
sticking to the prefix notation, pattern-matching for a positional product type
works similarly:

.. code-block:: haskell

  GHCi> let (MkTimeOfDay hr mn) = MkTimeOfDay 10 30
  GHCi> hr
  10
  GHCi> mn
  30

  GHCi> let t = MkTimeOfDay 12 45
  GHCi> let (MkTimeOfDay hr mn) = t
  GHCi> hr
  12
  GHCi> mn
  45

.. note:: 

  It is always a good idea to put parentheses around your pattern matches as
  shown in the example above. While it can work without parentheses some times,
  a lot of times the compiler gets confused if you use multiple-pattern matches
  in function arguments without parentheses. This leads to very complicated
  compiler errors. So, **always put parentheses around pattern matches**

**Pattern-matches can be nested:** What if we had a value of type ``(Char, Int,
TimeOfDay)``?

.. code-block:: haskell

  GHCi> let (x, y, MkTimeOfDay hr mn) = ('a', 'b', MkTimeOfDay 10 30)
  GHCi> x 
  'a'
  GHCi> y
  'b'
  GHCi> hr
  10
  GHCi> mn
  30

**It is not necessary to pattern-match:** It is not always necessary to
"destructure" a constructor into its "inner values" if you don't need them. For
example:

.. code-block:: haskell

  GHCi> let (x, y, t) = ('a', 'b', MkTimeOfDay 10 30)
  GHCi> x 
  'a'
  GHCi> y
  'b'
  GHCi> t
  MkTimeOfDay 10 30

**You can pattern-match the entire "wrapped" value as well as the inner values:**

.. code-block:: haskell
  :emphasize-lines: 1

  GHCi> let (x, y, t@(MkTimeOfDay hr mn)) = ('a', 'b', MkTimeOfDay 10 30)
  GHCi> x 
  'a'
  GHCi> y
  'b'
  GHCi> t
  MkTimeOfDay 10 30
  GHCi> hr
  10
  GHCi> mn
  30

With this detour of pattern-matching out of the way, re-read the function
definition of ``calculateEndtime`` would be making more sense now:

.. code-block:: haskell

  calculateEndTime :: TimeOfDay -> Int -> TimeOfDay
  calculateEndTime (MkTimeOfDay hr mn) durationInMins = 
    let (addHour, finalMins) = divMod (mn + durationInMins) 60
    in (MkTimeOfDay (hr + addHour) finalMins)

By the way, this is functionally equivalent to the following:

.. code-block:: haskell
  :emphasize-lines: 2,3

  calculateEndTime :: TimeOfDay -> Int -> TimeOfDay
  calculateEndTime t durationInMins = 
    let (MkTimeOfDay hr mn) = t
        (addHour, finalMins) = divMod (mn + durationInMins) 60
    in (MkTimeOfDay (hr + addHour) finalMins)

(The difference is *where* we are doing the pattern match. In the former, we are
pattern matching the function arguments directly. In the latter the function
argument is ``t`` and is then pattern-matched in the ``let`` binding).

While this is a significantly better than our original function signature, there
is still a slight problem with our function signature. Notice the ``Int``
argument there? Just by looking at the type-signature we still don't know
whether that is supposed to be in hours, minutes, or seconds. Let's fix that
using another custom type:

.. code-block:: haskell

  data TimeOfDay = MkTimeOfDay Int Int deriving (Eq, Show, Ord)
  data DurationInMins = MkDuration Int deriving (Eq, Show, Ord)

  calculateEndTime :: TimeOfDay -> DurationInMins -> TimeOfDay
  calculateEndTime (MkTimeOfDay hr mn) (MkDuration d) = 
    let (addHour, finalMins) = divMod (mn + d) 60
    in (MkTimeOfDay (hr + addHour) finalMins)  

  -- GHCi> calculateEndTime (MkTimeOfDay 10 30) (MkDuration 45)
  -- MkTimeOfDay 11 15

Now, in our code, if there are some places where we are representing event
duration in seconds and in some places we are representing it in minutes, the
compiler will prevent us from passing duration-in-seconds to this function, by
mistake.

.. _immutability:

Detour: Immutability
^^^^^^^^^^^^^^^^^^^^

Let's go back to our ``calculateEndTime`` code. Did you notice that we didn't
really "update" the value of the hour & minute fields? Certainly not in the
sense that you would probably be used to in languages like Java, Javascript,
Ruby, or Python. For example, consider the same function written in *typical*
Javascript:

.. code-block:: javascript

  function calculateEndTime(t, duration) {
    // btw, Javascript doesn't have a built-in divMod function...
    //
    var addHour = Math.floor((t.mins + duration)/60);
    var finalMins = (t.mins + duration) % 60;
    t.hour = t.hour + addHour;
    t.mins = finalMins;
    return t;
  }

(Notice, that without reading the code (or documentation) you cannot really know
*what* structure to pass for the ``t`` parameter).

Now, try the following with this function in Javascript:

.. code-block:: javascript
  :emphasize-lines: 8,9

  var start = {hour: 10, mins: 30};
  var end = calculateEndTime(start, 45);
  console.log("end=", end);

  // OUTPUT:
  // end={hour: 11, mins: 15}

  console.log("start=", start);
  // start={hour: 11, mins: 15}
  //
  // ... WTF?!

.. important:: 
  
  **The Javascript function is "mutating" the arguments that you pass to it.**
  "Mutation" of this kind is one of the biggest source of bugs. 

Say, you have a ``startTime`` and ``duration``, and you want to calculate the
``endTime`` to display all the three values (startTime, duration, endTime) to
the user. However, with such bugs (originating due to mutation), you'll end up
with your ``startTime`` being sneakily changed behind your back! In small
code-bases this kind of unintended mutation may be easy to manage, but it
becomes increasingly hard as the code base grows - the place where a bug is
being observed, and the place where the bug is being created could very well be
completely different modules, written by different people!

Now, going back to ``calculateEndTime`` in Haskell, if you notice, we aren't
really mutating the hour & minute fields in the ``TimeOfDay`` that has been
passed to the function. We are creating a **new** ``TimeOfDay`` by invoking the
``MkTimeOfDay`` constructor.

.. code-block:: haskell
  :emphasize-lines: 7

  data TimeOfDay = MkTimeOfDay Int Int deriving (Eq, Show, Ord)
  data DurationInMins = MkDuration Int deriving (Eq, Show, Ord)

  calculateEndTime :: TimeOfDay -> DurationInMins -> TimeOfDay
  calculateEndTime (MkTimeOfDay hr mn) (MkDuration d) = 
    let (addHour, finalMins) = divMod (mn + d) 60
    in (MkTimeOfDay (hr + addHour) finalMins)  

Try replicating the same Javascript bug in Haskell. You will be unable to!
**Haskell's immutability protects us from such bugs.**

.. code-block:: haskell

  GHCi> let startsAt = MkTimeOfDay 10 30
  GHCi> let endsAt = calculateEndTime startsAt (DurationInMins 45)
  GHCi> endsAt
  MkTimeOfDay 11 15
  GHCi> startsAt
  MkTimeOfDay 10 30

.. important::

  And this is what makes Haskell an **immutable language**. Once a variable is
  defined you **cannot** change it's value. However, using that variable, you
  can compute/create a new value, and assign it to a new variable (or return it
  as a result of a function). In fact, it's a good idea to stop calling them
  variables, because of the connotation that you would have already built up.
  It's better to start calling them **bindings** instead - once a name is bound
  to a value, it cannot be bound to a new value **within the same scope.**

  The last part is very important. You can have the same variable name across
  different funtions, because they are different scopes. In fact, this is how
  you can keep doing ``let x = _something`` over & over again in GHCi. It is
  because GHCi creates a new scope for every statement that you issue. This is
  to make interactive evaluation simpler. However, try doing the followoing in a
  Haskell source file (the compiler won't allow it):

  .. code-block:: haskell

    someFunction :: Int -> Int
    someFunction x = 
      let var1 = _something
          var1 = _somethingElse
      in _todo

  One point to note is that nested functions are technically different scopes.
  So you *can* have bindings with the same name in the inner & outer functions,
  but the compiler will emit a warning when you do that. This is because the
  binding in the inner function is ambiguous and *may* be the cause of
  inadvertent bugs. For example:

  .. code-block:: haskell
    :emphasize-lines: 4

    import Data.Char (toUpper)

    toUppercase :: String -> String
    toUppercase str = map (\str -> toUpper str) str

  While the example given above will compile, you can probably spot the inherent
  ambiguity in what ``str`` refers to inside the lambda passed to ``map``. The
  compiler will emit a warning, but will still go-ahead and *implicitly* use the
  innermost ``str``. This means, in the ``toUppercase`` function-body ``str``
  will refer to the value passed as the first argument, **except** within the
  lambda, where ``str`` would refer to the value passed to the lambda, instead.
  This implicit assumption is not always a good idea and *can* be the cause of
  subtle bugs.

  .. code-block:: text

    +------------------------------------------------------+
    | OUTER SCOPE           +-----------------------+      |
    |                       | INNER SCOPE           |      |
    | toUppercase str = map | (\str -> toUpper str) | str  |
    |                       +-----------------------+      |
    +------------------------------------------------------+

  This ambiguity is easily resolved by one of two approaches:

  1. Do not use *overlapping* binding names. Again, this does not mean that you can not, or should not, use the same name in *different* functions/scopes. Just don't use same names in overlapping, or nested, scopes.
  2. If you cannot come up with a completely new name to disambiguate (remember, naming things is a hard problem in computer science!), you can use a suffix of ``_`` or ``'`` to create a different name, eg. ``customer`` and ``customer_``, or ``user`` and ``user'``


Named product types OR records
------------------------------

Let's go back to the original defintion of ``TimeOfDay`` and what we said about it:

  "The reason why I'm calling this type a **positional** product-type is because
  there is no *explicit* differentiation between the two ``Int`` values. One of
  them holds the hour and the other minute. But, by just looking at the type
  definition, we cannot be sure which holds which. We have to fallback on a strong
  "positional convention", whenever you're writing the time-of-day, the first one
  is hour and the second one is minutes."

With ``TimeOfDay`` we could get away with *positional semantics* because the
convention of stating the hour before the minute is almost universal. However,
that is not always the case. Take the example of dates. Some countries use
``dd/mm/yy`` (eg. India), while some countries use ``mm/dd/yy`` (eg. UK?), and
some countries even use ``yy/mm/dd`` (eg. Japan). What happens if you use a
positional product type to represent a date in your application?

.. code-block:: haskell

  data Date = MkDate Int Int Int deriving (Eq, Show, Ord)

If you're an Indian, you would *assume* ``dd/mm/yy`` *semantics*, whereas your
team-mate, who's working out of UK, would *assume* ``mm/dd/yy``. If you were on
the Mars Orbiter team, you'd surely end-up causing a major confusion around
launch dates!

Thus, the motivation for using named product types aka records for cases where
*positional semantics* are not universal, or can confusion at a later date (or
when the maintainer of the code changes).

Let's see how we can use records in the date example above:

.. code-block:: haskell

  data Date = MkDate {
    dtDay :: Int
  , dtMonth :: Int
  , dtYear :: Int
  } deriving (Eq, Show, Ord)

- We created a new record-type, called ``Date``
- The constructor is called ``MkDate``
- The constructor wraps three **named fields** called ``dtDay``, ``dtMonth``, & ``dtYear`` all of the type ``Int``
- If you notice, all the fields are prefixed with ``dt`` and are not simply called ``day``, ``month``, and ``year``. The reason will be explained in :ref:`field_names`

Let's play around with this type in GHCi:

.. code-block:: haskell

  GHCi> data Date = MkDate {dtDay :: Int, dtMonth :: Int, dtYear :: Int} deriving (Eq, Show, Ord)
  GHCi> let d = MkDate{dtDay = 1, dtMonth = 1, dtYear = 2017}
  GHCi> d
  MkDate {dtDay = 1, dtMonth = 1, dtYear = 2017}

  -- Notice these "field-accesor" functions that the compiler gives us for free...
  GHCi> :t dtDay
  dtDay :: Date -> Int

  GHCi> :t dtMonth
  dtMonth :: Date -> Int

  GHCi> :t dtYear
  dtYear :: Date -> Int

  -- Let's use these field-accesssors to access individial fields...
  GHCi> dtDay d
  1
  GHCi> dtMonth d
  1
  GHCi> dtYear d
  2017

The field-accessor functions that the compiler gives us for free (for each
record field), is the closes that you can get to Java's "getters" in Haskell.

Pattern matching
^^^^^^^^^^^^^^^^

Since ``Date`` also has a constructor it should be possible to pattern-match it,
right? Absolutely right!

.. code-block:: haskell

  GHCi> let dt = MkDate{dtDay = 1, dtMonth = 1, dtYear = 2017}
  GHCi> let (MkDate {dtDay = d, dtMonth = m, dtYear = y}) = dt
  GHCi> d
  1
  GHCi> m
  1
  GHCi> y
  2017

However, unlike positional product-types where you are forced to pattern-match
*all values* of a constructor, pattern-matching a record-constructor allows us
to omit fields. Example:

.. code-block:: haskell
  :emphasize-lines: 1,4,8-10

  GHCi> let (MkTimeOfDay hr) = MkTimeOfDay 10 30

  <interactive>:94:6: error:
      • The constructor ‘MkTimeOfDay’ should have 2 arguments, but has been given 1
      • In the pattern: MkTimeOfDay hr
        In a pattern binding: (MkTimeOfDay hr) = MkTimeOfDay 10 30

  -- We are not pattern matching dtDay & dtMonth
  -- 
  GHCi> let (MkDate {dtYear = y}) = MkDate{dtDay=31, dtMonth=12, dtYear=2017}
  GHCi> y
  2017

Record updates
^^^^^^^^^^^^^^

In :ref:`immutability` we saw how we can't *mutate* values in Haskell. Using
existing values to create new values (rather than mutating existing values),
wasn't a problem with positional products, because, realistically, positional
product types are only used with a small number of fields/values. For example,
it's not a good idea to use a positional product for representing a ``Customer``
value:

.. code-block:: haskell

  data Customer = Customer Int String String String Int String String deriving (Eq, Show, Ord)

You'd much rather use named products for a type with more than five values
(it's a guideline, not a hard rule):

.. code-block:: haskell

  data Customer = Customer
    { customerId :: Int
    , customerFirstName :: String
    , customerLastName :: String
    , customerEmail :: String
    , customerBirthYear :: Int
    , customerCity :: String
    , customerCountry :: String
    } deriving (Eq, Show, Ord)

So, for small positional products you can do the following without much hassle:

.. code-block:: haskell

  -- Representing weigth in pounds & ounces. Take a look at 
  -- https://en.wikipedia.org/wiki/Imperial_units#Mass_and_weight to
  -- understand what's going on.
  --
  data ImperialWeight = MkImperialWeight Int Int Int Int deriving (Eq, Show, Ord)

  addWeights :: ImperialWeight -> ImperialWeight -> ImperialWeight
  addWeights (MkImperialWeight p1 o1 d1 g1) (MkImperialWeight p2 o2 d2 g2) = 
    let (addDrachm, finalGrain) = divMod (g1 + g2) 7000
        (addOunce, finalDrachm) = divMod (d1 + d2 + addDrachm) 256
        (addPound, finalOunce) = divMod (o1 + o2 + addOunce) 16
    in (MkImperialWeight (p1 + p2 + addPound) finalOunce finalDrachm finalGrain)

However, for large records, doing something similar is going to get irritating rather quickly:

.. code-block:: haskell

  import Data.List (dropWhile, dropWhileEnd)
  import Data.Char (isSpace)

  data Customer = MkCustomer
    { customerId :: Int
    , customerFirstName :: String
    , customerLastName :: String
    , customerEmail :: String
    , customerBirthYear :: Int
    , customerCity :: String
    , customerCountry :: String
    } deriving (Eq, Show, Ord)

  removeSpaces :: String -> String
  removeSpaces s = dropWhileEnd (\x -> isSpace x) (dropWhile (\x -> isSpace x) s)

  cleanupName :: Customer -> Customer
  cleanupName MkCustomer{customerId=cid, customerFirstName=fn, customerLastName=ln, customerEmail=e, customerBirthYear=by, customerCity=cy, customerCountry=co} = 
    MkCustomer 
      { customerId = cid
      , customerFirstName = (removeSpaces fn)
      , customerLastName = (removeSpaces ln)
      , customerEmail = e
      , customerBirthYear = by
      , customerCity = cy
      , customerCountry = co
      }

This is solved by the "record update syntax". **However, please note, no real
"updation" or "mutation" happens even when you use the record update syntax.** A
new record is **always created**, just that certain fields will contain
different values.

Let's see this in action - rewriting the ``cleanupName`` function shown above:

.. code-block:: haskell

  cleanupName :: Customer -> Customer
  cleanupName cust@MkCustomer{customerFirstName=fn, customerLastName=ln} =
    cust 
      { customerFirstName = (removeSpaces fn)
      , customerLastName = (removeSpaces ln)
      }

Or, another way of writing the same thing with accessor functions, instead of pattern-matching:

.. code-block:: haskell

  cleanupName :: Customer -> Customer
  cleanupName cust =
    cust 
      { customerFirstName = (removeSpaces (customerFirstName cust))
      , customerLastName = (removeSpaces (customerLastName cust))
      }

- Instead of using the ``MkCustomer`` constructor, we using an existing ``cust`` value to update
- Intead of providing all the fields within the curly braces, we provided only those fields that we wanted to change.

.. important::

  Please verify that the ``Customer`` record being passed to the ``cleanupName``
  function is not getting mututated irrespective of which technique you use to
  write the code.

.. note::

  This section makes it sound as if positional products don't have any way to
  "update" them and that you always have to recreate them from scratch. That is
  not completely true. We will see a *unified* way of changing values in any
  data-structure using "lenses" in a later chapter.

.. _field_names:

Gotcha: Field names
^^^^^^^^^^^^^^^^^^^

If you've noticed, in all our code examples, records always had field-names prefixed with a common name. Example:

.. code-block:: haskell

  -- all field names prefixed with `dt`
  -- 
  data Date = MkDate {dtDay :: Int, dtMonth :: Int, dtYear :: Int} deriving (Eq, Show, Ord)

  -- instead of
  --
  data Date = MkDate {day :: Int, month :: Int, year :: Int} deriving (Eq, Show, Ord)

This is because of a limitation in Haskell's record-types. You cannot define
records with the **same field-name in the same module.** For exampe, try doing
the following in your editor and **loading your file in GHCi:** (note, doing
this in a file, and loading the file in GHCi is important. Directly typing out
this code in GHCi will not uncover this limitation)

.. code-block:: haskell

  module MyRecordTypes where

  data Date = MkDate {day :: Int, month :: Int, year :: Int} deriving (Eq, Show, Ord)

  data Copyright = MkCopyright {author :: String, year :: Int} deriving (Eq, Show, Ord)

You will get an error, and the compiler won't allow it. One way to intuitively
understand this limitation is to see what's going on with respect to
field-accessor functions. The ``Date`` type will automatically give you the
following three field accessors:

.. code-block:: haskell
  :emphasize-lines: 6

  -- Inspect the types of `day`, `month`, and `year`, via `:t` in GHCi
  -- to confirm this
  --
  day :: Date -> Int
  month :: Date -> Int
  year :: Date -> Int

Similarly, the ``Copyright`` type will also *attempt* to give you the following
two field accessors...

.. code-block:: haskell
  :emphasize-lines: 2

  author :: Copyright -> String
  year :: Copyright -> Int

... and, thus, cause a compile error. You can't define functions having the same
name but different type-signatures in Haskell. This is not to say that function
overlaoding is not possible in Haskell. It is possible, but it is *very
different* from how it's done in Java or C++. We will cover it along with
type-classes and will see how it is possible to use the same "name" for
different record fields, via lenses.

.. important::

  To sum it up, if you're sticking with basic Haskell, and not going crazy with
  advanced type-level hackery, it is **strongly recommended** you prefix all
  fields in your record with a common name. And use camelCase. For example, if
  your field is ``firstName``, call it ``userFirstName``, and not
  ``userfirstName``, or ``user_firstName``.

Sum types
---------

Let's motivate the *need* for sum-types through an example. Say, we were
building a *typical* user signup flow and needed a way to represent a user in
our code. We could try using a positional-product...

.. code-block:: haskell

  data Customer = MkCustomer String String String String String String deriving (Eq, Show, Ord)

...but that wouldn't take us very far, because we would keep getting confused
about the *semantics* of each ``String`` value. Positional products would not be
appropriate because the *semantics* of each field would not be clear. So, we go
with a named product type, aka, record:

.. code-block:: haskell
  :emphasize-lines: 11

  data User = MkUser
    { userEmail :: String
    , userFulllName :: String

    -- NOTE: DON'T ever do this in production code. You should be using an
    -- appropriate type exported by the `bcrypt` library for passwords.
    --
    , userPassword :: String 

    , userPostalCode :: String
    , userStatus :: String
    , userVerificationCode :: String
    } deriving (Eq, Show, Ord)

Now, as most *typical* user signups go, a user cannot fill up a registration
form and start using your service/app immediately. There is an verification
step, where a code is send via email or SMS, which the user has to enter on a
webpage. Once the verification step is complete, the user is deemed as *active*
or *verified*. Further, due to misuse of your app/service/platform, you may want
to deactivate a user account as well, thus preventing any further access.

All of this complexity is usually represented by a value stored in the
``userStatus`` field.

Now, let's try writing the ``registerUser``, ``verifyUser``, and
``deactivateUser`` functions using this ``User`` record:

.. _user_registration:

.. code-block:: haskell

  module UserRegistration where

  import Data.List (break, null, filter, take)

  data Email = MkEmail String deriving (Eq, Show, Ord)

  data User = MkUser
    { userEmail :: Email
    , userFullName :: String

    -- NOTE: DON'T ever do this in production code. You should be using an
    -- appropriate type exported by the `bcrypt` library for passwords.
    --
    , userPassword :: String

    , userPostalCode :: String
    , userStatus :: String
    , userVerificationCode :: String
    } deriving (Eq, Show, Ord)

  data NewUser = MkNewUser
    { nuserEmail :: Email
    , nuserFullName :: String
    , nuserPassword :: String
    , nuserPostalCode :: String
    } deriving (Eq, Show, Ord)

  registerUser :: NewUser -> [User] -> [User]
  registerUser newUser@MkNewUser{nuserEmail=(MkEmail email)} userDb =
    let user = MkUser
                { userEmail = (nuserEmail newUser)
                , userFullName = (nuserFullName newUser)
                , userPassword = (nuserPassword newUser)
                , userPostalCode = (nuserPostalCode newUser)
                , userStatus = "unverified"

                -- NOTE: Pretty lame way of generating a verification code, but
                -- let's stick with this, for now, in order to not complicate the
                -- example.
                --
                , userVerificationCode = (take 2 email) ++ (take 2 (nuserFullName newUser)) ++ (take 2 (nuserPostalCode newUser))
                }
    in (user:userDb)

  verifyUser :: Email -> String -> [User] -> (Bool, String, [User])
  verifyUser e code userDb =
    let existingUsers = Data.List.filter (\u -> (userEmail u) == e) userDb
    in  if (Data.List.null existingUsers)
          then (False, "No such user", userDb)
          else  let existingUser = head existingUsers
                in  if (code==(userVerificationCode existingUser))
                      then  let verifiedUser = existingUser{userStatus="active"}
                                newUserDb = replaceUserInDb verifiedUser userDb
                            in  (True, "Verified", newUserDb)
                      else (False, "Incorrect verification code", userDb)

  deactivateUser :: Email -> [User] -> (Bool, String, [User])
  deactivateUser e userDb =
    let existingUsers = Data.List.filter (\u -> (userEmail u) == e) userDb
    in  if (Data.List.null existingUsers)
          then (False, "No such user", userDb)
          else  let existingUser = head existingUsers
                    deactiveUser = existingUser{userStatus = "deactivated"}
                in  (True, "User deactivated", replaceUserInDb deactiveUser userDb)

  replaceUserInDb :: User -> [User] -> [User]
  replaceUserInDb u userDb =
    let (a, b) = Data.List.break (\x -> (userEmail x)==(userEmail u)) userDb
    in if (Data.List.null b) -- NOTE: Handling the edge-case where the user may not be in the DB
        then  (a ++ [u])
        else  (a ++ (u:(tail b)))

.. note:: 

  Notice how the ``replaceUserInDb`` function is written. The concept of
  "immutability" cuts across the *entire* language. In this particular function,
  we aren't mutating the ``[User]`` list (which represents the user DB). In
  fact, even if we wanted to, the language wouldn't allow us. So, we accept a
  ``[User]`` list and return a *new* ``[User]`` list in all our functions.
  **Please verify this for yourself in GHCi.**

Let's play around with this in GHCi:

.. code-block:: haskell

  GCHi> :l UserRegistration
  GHCi> let db1 = []
  GCHi> let db2 = registerUser MkNewUser{nuserEmail = (MkEmail "saurabhnanda@gmail.com"), nuserFullName = "Saurabh Nanda", nuserPassword = "unsafe", nuserPostalCode = "110001"} db1
  GHCi> db2
  [MkUser {userEmail = MkEmail "saurabhnanda@gmail.com", userFullName = "Saurabh Nanda", userPassword = "unsafe", userPostalCode = "110001", userStatus = "unverified", userVerificationCode = "saSa11"}]

  GHCi> let db3 = registerUser MkNewUser{nuserEmail = (MkEmail "john@doe.com"), nuserFullName = "John Doe", nuserPassword = "unsafeagain", nuserPostalCode = "220002"} db2
  GHCi> db3
  [MkUser {userEmail = MkEmail "john@doe.com", userFullName = "John Doe", userPassword = "unsafeagain", userPostalCode = "220002", userStatus = "unverified", userVerificationCode = "joJo22"},MkUser {userEmail = MkEmail "saurabhnanda@gmail.com", userFullName = "Saurabh Nanda", userPassword = "unsafe", userPostalCode = "110001", userStatus = "unverified", userVerificationCode = "saSa11"}]

  -- NOTE the pattern-matching in this let-binding. `verifyUser` returns a 3-tuple.
  --
  GHCi> let (result, reason, db4) = verifyUser (MkEmail "saurabhnanda@gmail.com") "incorrect" db3
  GHCi> result
  False
  GHCi> reason
  "Incorrect verification code"

  -- NOTE no change in the user DB
  --
  GHCi> db4
  [MkUser {userEmail = MkEmail "john@doe.com", userFullName = "John Doe", userPassword = "unsafeagain", userPostalCode = "220002", userStatus = "unverified", userVerificationCode = "joJo22"},MkUser {userEmail = MkEmail "saurabhnanda@gmail.com", userFullName = "Saurabh Nanda", userPassword = "unsafe", userPostalCode = "110001", userStatus = "unverified", userVerificationCode = "saSa11"}]

  -- Let's try this again, with the correct verification code, and notice that `db4` changes...
  --
  GHCi> let (result, reason, db4) = verifyUser (MkEmail "saurabhnanda@gmail.com") "saSa11" db3
  GHCi> db4
  [MkUser {userEmail = MkEmail "john@doe.com", userFullName = "John Doe", userPassword = "unsafeagain", userPostalCode = "220002", userStatus = "unverified", userVerificationCode = "joJo22"},MkUser {userEmail = MkEmail "saurabhnanda@gmail.com", userFullName = "Saurabh Nanda", userPassword = "unsafe", userPostalCode = "110001", userStatus = "active", userVerificationCode = "saSa11"}]

  GHCi> let (result, reason, db5) = deactivateUser (MkEmail "incorrect@incorrect.com") db4
  GHCi> result
  False
  GHCi> reason
  "No such user"

  GHCi> let (result, reason, db5) = deactivateUser (MkEmail "john@doe.com") db4
  GHCi> db5
  [MkUser {userEmail = MkEmail "john@doe.com", userFullName = "John Doe", userPassword = "unsafeagain", userPostalCode = "220002", userStatus = "deactivated", userVerificationCode = "joJo22"},MkUser {userEmail = MkEmail "saurabhnanda@gmail.com", userFullName = "Saurabh Nanda", userPassword = "unsafe", userPostalCode = "110001", userStatus = "active", userVerificationCode = "saSa11"}]

Sum-types save you from typos
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now, let's write a function that returns the number of users in ``verified``,
``inactive``, and ``deactive`` statuses:

.. code-block:: haskell

  countUsers :: String -> [User] -> Int
  countUsers status userDb = length (filter (\u -> (userStatus u) == status) userDb)

  userStatusSummary :: [User] -> [(String, Int)]
  userStatusSummary userDb = 
    let statuses = ["verified", "inactive", "deactive"]
    in map (\status -> (status, countUsers status userDb)) statuses

Look fine, right? **WRONG!** Can  you spot the error in the ``userStatusSummary``
function given above?

.. container:: toggle

  .. container:: header

    **Reveal the error**

  .. container::

    The ``UserRegistration`` module expects the ``userStatus`` field to have the following values:

    * ``active`` for active/verified users
    * ``deactivated`` for inactive/deactivated users
    * ``unverified`` for new users

    However, in our ``userStatusSummary`` function we have used similarly named,
    but different, values of - ``verified``, ``inactive`` and ``deactive``

This is the **first use-case** for sum-types. Whenever you have a piece of
data/field/value that can only have a fixed set of values, **always** use a
sum-type, instead of an ``Int`` or ``String`` to represent all possible values.
In other languages this is possible by using ``enum`` types, but as we will see,
Haskell's sum-types are more powerful than standard ``enum`` types.

So, let's create a sum-type for our ``userStatus`` - 

.. code-block:: haskell
  :emphasize-lines: 1-4, 14

  data Status = Active
              | Inactive
              | Deactive
              deriving (Eq, Show, Ord)

  data User = MkUser
    { userEmail :: Email
    , userFullName :: String

    -- NOTE: Bad idea alert!
    , userPassword :: String

    , userPostalCode :: String
    , userStatus :: Status
    , userVerificationCode :: String
    } deriving (Eq, Show, Ord)

Let's unwrap this:

1. We've created a *sum type* called ``Status``. This is at the same *level* as ``Int``, ``String``, or ``Bool``, which means it can be used in type signatures.
2. The ``Status`` *type* can have only three *values*: ``Active``, ``Inactive`` and ``Deactive``. **Please make sure you understand this to avoid confusion going forward.** ``Active``, ``Inactive`` and ``Deactive`` are **not types, but values.** This is similar to how ``Int`` is a type and ``1``, ``5``, ``10``, are values of the type.

Let's try and appreciate the second point about types vs values a little more:

.. code-block:: haskell

  GHCi> data Status = Active | Inactive | Deactive deriving (Eq, Show, Ord)
  GHCi> :t Active
  Active :: Status

  GHCi> :t Inactive
  Inactive :: Status

  GHCi> :t Deactive
  Deactive :: Status

  GHCi> let x = Active
  GHCi> :t x
  x :: Status

  
  -- Try inspecting the type of Status and notice the error that you'll get.
  -- This is because Status itself is a type. To find more information about 
  -- types you can use `:i` instead of `:t`
  -- 
  -- GHCi> :t Status
  -- <interactive>:1:1: error: Data constructor not in scope: Status
  --
  -- GHCi> :i Status
  -- data Status = Active | Inactive | Deactive
  --          Defined at <interactive>:13:1
  -- instance [safe] Eq Status -- Defined at <interactive>:13:54
  -- instance [safe] Ord Status -- Defined at <interactive>:13:64
  -- instance [safe] Show Status -- Defined at <interactive>:13:58  

  -- Now, notice the similarity with Int & Bool. Try `:t` and `:i`
  -- with `Int` and `Bool` as well
  --
  GHCi> let y = 10
  :t y

  GHCi> :t True
  True :: Bool

  GHCi> :t False
  False :: Bool

  GHCi> let z = True
  GHCi> :t z
  z :: Bool

Reinforcing this *types* vs *values*, yet again, please see if you can understand why the following is incorrect:

.. code-block:: haskell

  data Status = Active | Inactive | Deactive deriving (Eq, Show, Ord)

  someFunc :: Active -> String
  someFunc = _todo

.. container:: toggle

  .. container:: header

    **Reveal the error**

  .. container::

    ``Active`` is a *value* of *type* ``Status``. The type signature of ``someFunc`` given above is on the lines of the following, which you intuitively understand to be nonsensical:

    .. code-block:: haskell

      someFunc :: 1 -> String
      someFunc = _todo

    You would rather write the following:

    .. code-block:: haskell

      someFunc :: Int -> String
      someFunc = _todo

    Similarly, you'd write the following, when it comes to the ``Status`` type:

    .. code-block:: haskell

      someFunc :: Status -> String
      someFunc = _todo

Now, let's use the ``Status`` type in the ``UserRegistration`` module given above, and include the the ``userStatusSummary`` function, as well:

.. code-block:: haskell
  :emphasize-lines: 6, 17, 35, 47, 58, 68-74

  module UserRegistration where

  import Data.List (break, null, filter, take)

  data Email = MkEmail String deriving (Eq, Show, Ord)
  data Status = Active | Inactive | Deactive deriving (Eq, Show, Ord)

  data User = MkUser
    { userEmail :: Email
    , userFullName :: String

    -- Bad idea alert!
    --
    , userPassword :: String

    , userPostalCode :: String
    , userStatus :: Status
    , userVerificationCode :: String
    } deriving (Eq, Show, Ord)

  data NewUser = MkNewUser
    { nuserEmail :: Email
    , nuserFullName :: String
    , nuserPassword :: String
    , nuserPostalCode :: String
    } deriving (Eq, Show, Ord)

  registerUser :: NewUser -> [User] -> [User]
  registerUser newUser@MkNewUser{nuserEmail=(MkEmail email)} userDb =
    let user = MkUser
                { userEmail = (nuserEmail newUser)
                , userFullName = (nuserFullName newUser)
                , userPassword = (nuserPassword newUser)
                , userPostalCode = (nuserPostalCode newUser)
                , userStatus = Inactive
                , userVerificationCode = (take 2 email) ++ (take 2 (nuserFullName newUser)) ++ (take 2 (nuserPostalCode newUser))
                }
    in (user:userDb)

  verifyUser :: Email -> String -> [User] -> (Bool, String, [User])
  verifyUser e code userDb =
    let existingUsers = Data.List.filter (\u -> (userEmail u) == e) userDb
    in  if (Data.List.null existingUsers)
          then (False, "No such user", userDb)
          else  let existingUser = head existingUsers
                in  if (code==(userVerificationCode existingUser))
                      then  let verifiedUser = existingUser{userStatus=Active}
                                newUserDb = replaceUserInDb verifiedUser userDb
                            in  (True, "Verified", newUserDb)
                      else (False, "Incorrect verification code", userDb)

  deactivateUser :: Email -> [User] -> (Bool, String, [User])
  deactivateUser e userDb =
    let existingUsers = Data.List.filter (\u -> (userEmail u) == e) userDb
    in  if (Data.List.null existingUsers)
          then (False, "No such user", userDb)
          else  let existingUser = head existingUsers
                    deactiveUser = existingUser{userStatus = Deactive}
                in  (True, "User deactivated", replaceUserInDb deactiveUser userDb)

  replaceUserInDb :: User -> [User] -> [User]
  replaceUserInDb u userDb =
    let (a, b) = Data.List.break (\x -> (userEmail x)==(userEmail u)) userDb
    in if (Data.List.null b) -- NOTE: Handling the edge-case where the user may not be in the DB
        then  (a ++ [u])
        else  (a ++ (u:(tail b)))

  countUsers :: Status -> [User] -> Int
  countUsers status userDb = length (filter (\u -> (userStatus u) == status) userDb)

  userStatusSummary :: [User] -> [(Status, Int)]
  userStatusSummary userDb = 
    let statuses = [Active, Inactive, Deactive]
    in map (\status -> (status, countUsers status userDb)) statuses

Sum-types force you to consider all known cases
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Firstly, take a moment to appreciate that using a sum-type forced us to list
down (or enumerate) all *known values* (or valid values) of a given type. As
long as the ``userStatus`` field was a ``String`` it could potentially have
*any* value, even though all throughout the code we had *implicitly* assumed
that only three values were possible - ``active``, ``verified``, and
``deactivated``. By changing the type of ``userStatus`` to ``Status`` we made
this *implcit* assumption absolutely *explicit* and, more imporantly,
*self-documenting*. If you were inheriting this code, what kind of functions
would you rather deal with:

.. code-block:: haskell

  countUsers :: Status -> [User] -> Int
  
  -- or

  countUsers :: String -> [User] -> Int

Now, let's build-up a use-case for why we would *want* to be forced to consider
all known-cases.

Say, this user-registration module were part of a larger app, where users could
comment on posts (something like Reddit), and we needed to write a function that
would go through all the comments in the DB for a specific ``PostId`` and
display them on a webpage. Here's a starting point for our hypothetical
``fetchCommentsForPost`` function, whose result will be given to the UI:

.. code-block:: haskell

  module Post where

  -- NOTE: Importing the previous UserRegistration module
  --
  import UserRegistration
  import Data.List as DL

  data PostId = MkPostId Int deriving (Eq, Show, Ord)

  data Comment = MkComment
    { commentPostId :: PostId
    , commentUserEmail :: Email
    , commentBody :: String
    } deriving (Eq, Show, Ord)

  fetchCommentsForPost :: [Comment] -> PostId -> [Comment]
  fetchCommentsForPost commentDb postId =
    filter (\c -> (commentPostId c) == postId) commentDb

Now, if we have deactivated any user we don't want his/her comments to show-up
on the post's page. Here's *one possible way* we could achieve this:

.. code-block:: haskell

  fetchCommentsForPost :: [Comment] -> [User] -> PostId -> [Comment]
  fetchCommentsForPost commentDb userDb postId = 
    filter (\c -> ((commentPostId c) == postId) && (userNotDeactivated userDb (commentUserEmail c))) commentDb

  userNotDeactivated :: [User] -> Email -> Bool
  userNotDeactivated userDb email = 
    let deactivatedUsers = DL.filter (\u -> ((userEmail u) == email) && ((userStatus u) == Deactive)) userDb
    in (DL.null deactivatedUsers)

So far, so good.

Now, after 3 months a new user-status is added to the system, called
``ShadowBanned``. In most message forums, comments by shadow-banned users are
visible *only to the shadow-banned* user himself/herself. Banned/deactivated
users immediately come to know because they are locked out of the app
completely. On the other hand, shadow-banned users can keep commenting (or
sharing, or posting) to their heart's content without *anyone else* ever seeing
their content. It's a pretty devilish scheme, if you come to think of it!

Another team member goes and adds the following code to the ``UserRegistration``
module:

.. code-block:: haskell

  -- First adding another value to the Status type:
  data Status = Active | Inactive | Deactive | ShadowBanned deriving (Eq, Show, Ord)

  -- Then, adding another function to shadow-ban a user:
  shadowbanUser :: Email -> [User] -> (Bool, String, [User])
  shadowbanUser e userDb =
    let existingUsers = Data.List.filter (\u -> (userEmail u) == e) userDb
    in  if (Data.List.null existingUsers)
          then (False, "No such user", userDb)
          else  let existingUser = head existingUsers
                    shadowBannedUser = existingUser{userStatus = ShadowBanned}
                in  (True, "User deactivated", replaceUserInDb shadowBannedUser userDb)

You probably know where this is going.

The other team-member won't immediately know where else in the app
``userStatus`` has been been used to take important decisions. The actual
implementation of shadow-banning would end-up becoming a whack-a-mole debugging
session, with the developer having to manually test the entire app to discover
places where shadow-banning isn't working and patching the code one-by-one.

Let's use sum-types to solve this problem...


.. _enable_compiler_warnings:

.. important::

  **First ensure that all compiler warnings are enabled**

  What're we're about to do next, is technically called **exhaustiveness
  checking** and for it to work, we need to tell enable some compiler-level
  warnings. In your stack project open the ``package.yaml`` file and locate the
  following section and ensure the highlighted line is there:

  .. code-block:: yaml
    :emphasize-lines: 4

    library:
      source-dirs: src
      ghc-options:
      - -Wall

  Now, go exit your ``stack ghci`` session and run the ``hpack`` command from your project's root directory:

  .. code-block:: shell

    $ cd exercises
    $ hpack
    generated exercises.cabal

    $ stack ghci
    GHCi>

  ==> Now **restart your IDE as well**. <==

Let's rewrite our ``fetchCommentsForPost`` function leveraging case-matching for sum-types:

.. code-block:: haskell

  fetchCommentsForPost :: [Comment] -> [User] -> PostId -> [Comment]
  fetchCommentsForPost commentDb userDb postId = 
    filter 
      (\c -> ((commentPostId c) == postId) 
               && (DL.all (\x -> shouldDisplayCommentsForUser x) userDb))))
      commentDb

  shouldDisplayCommentsForUser :: User -> Bool
  shouldDisplayCommentsForUser user = 
    case (userStatus user) of
      -- NOTE: We are displaying comments for inactive users because they may
      -- have been forced to re-verify their accounts. And we wouldn't want their comments
      -- to suddenly disappear from the site just because they haven't yet *re-verified* their account.
      -- 
      Inactive -> True
      Active -> True
      Deactive -> False

Go back and add the ``ShadowBanned`` value to the ``Status`` type and see what
the compiler says for the ``shouldDisplayCommentsForUser`` function.

- Till now, it looks deceptively similar to an ``enum`` type from other languages, but here's what it makes this different. ``Active``, ``Inactive`` and ``Deactive`` are not only *values*, they are also *zero-parameter data-constructors* of the ``Status`` type.
- Referring back to what was written in :ref:`positional_product_types`, you will now begin to understand why it's necessary in Haskell to *allow* a type and it's constructor to have different names:

  .. code-block:: text

    "...This is unlike a lot of other OOP languages where the constructors must
    either have the same name as the type (or class) OR should have a special
    name...When we talk about 'sum' types we’ll see why is it possible in
    Haskell to have constructors with different names for the same type."

.. todo::

  complete this section

Sum-types can help you to always consider edge-cases
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. todo::

  complete this section

Sum types that you've alredy been using
---------------------------------------

Bool
^^^^

.. todo::

  complete this section

Lists
^^^^^

.. todo::

  complete this section

Exercises
---------

Rails fence cipher
^^^^^^^^^^^^^^^^^^

Revisit the ``encodeMessage`` function in :ref:`rails_fence_cipher`. It is a
classic case of structure-vs-sematics problem described in this chapter. Rewrite
the ``encodeMessage`` function using various techniques discussed in this
chapter and describe the pros-and-cons of each solution.

You should try rewriting the function using 2-3 different approaches, and
actually **write down** the pros & cons as comments.

Pagination
^^^^^^^^^^

Revisit the ``displayPagination`` function in :ref:`pagination`. There we had
used **type synonyms** to clarify the semantics of the function signature.
However, with type-synonyms, the compiler doesn't prevent us from passing
``TotalItems`` to something that is expecting ``ItemsPerPage``.

Modify your solution to the problem by using techniques discussed in this
chapter. Try rewriting the function using 2-3 different approaches, and actually
**write down** the pros & cons as comments.

Exam score processing
^^^^^^^^^^^^^^^^^^^^^

Revisit all your solution to the :ref:`exam_score_processing` problem. How can
they benefit from using product and/or sum types discussed in this chapter?
Reimplement *all* your solutions using the new techniques that you have learnt.
Try rewriting your solutions using 2-3 different approaches, and actually
**write down** the pros & cons as comments.

Date arithmetic
^^^^^^^^^^^^^^^

Write a function that adds number-of-days to a given date, while taking care of a few edge-cases:

- Spill-over of days is added to months
- Spill-over of months is added to years
- How are you determining which month has how many days? What about February in leap years?

Solve this problem using the following techniques:

1. Using plain built-in types:

  .. code-block:: haskell

    addDays :: (Int, Int, Int) -> Int -> (Int, Int, Int)

2. Using separate types for date/month/year:

  .. code-block:: haskell

    data Day = MkDay Int deriving (Eq, Show, Ord)
    data Month = MkMonth Int deriving (Eq, Show, Ord)
    data Year = MkYear Int deriving (Eq, Show, Ord)

    addDays :: (Day, Month, Year) -> Int -> (Day, Month, Year)

3. Using a single positional-product type:

  .. code-block:: haskell

    data Date = MkDate Int Int Int deriving (Eq, Show, Ord)

    addDays :: Date -> Int -> Date

4. Using a record type, and **pattern matching** on the ``MkDate`` constructor to extract the day, month, and year fields

  .. code-block:: haskell

    data Date = MkDate
      { dtDay :: Int
      , dtMonth :: Int
      , dtYear :: Int
      } deriving (Eq, Show, Ord)

    addDays :: Date -> Int -> Date

5. Using the same record type as (4) above, but using accessor functions to access the day, month, and year fields.

Linked lists
------------

Write a program to create an manipulate a linked list of integers. Implement the following operations on linked lists:

- Inserting an integer at the front of the linked-list
- Inserting an integer at the end of the linked-list
- Inserting an integer at the n-th position of a linked-list
- removing an integer at the n-th position of a linked-list
- Obtaining the length of a linked list
- Reversing a linked list
- Returning the index of an element in the linked list
- Breaking a linked list into two separate linked lists at the n-th position


What is the core data type that you will use for representing a linked list? If
you have implemented linked lists in languages like C/C++/Java, then you
probably know that the core data-type has to be recursive in nature.

.. container:: toggle

  .. container:: header

    **Reveal the core data-type**

  .. container::

    .. code-block:: haskell

      data LinkedList = LLEmpty
                      | LLNode Int LinkedList
                      deriving (Eq, Show, Ord)

Stack
-----

Write a program to create and manipulate a stack of integers. Implement the following operations for your stack:

- Inserting an element at the top of the stack
- Popping an element from the top of the stack

Can you re-use your linked-list code to create your stack?

Reverse polish notation calculator
----------------------------------

Write a program that can calculate a mathematical expression expressed in
`reverse polish notation
<https://en.wikipedia.org/wiki/Reverse_Polish_notation>`_ One possible core
data-type for your calculator can be the following:

.. code-block:: haskell

  data Expr = ELit Float Expr
            | EMul Expr
            | EAdd Expr
            | ESubtract Expr
            | EDivide Expr
            | ENone
            deriving (Eq, Show, Ord)

  expr1 :: Expr
  expr1 = ELit 10.0 (ELit 20.0 (EAdd ENone)) 

  -- NOTE: This is an invalid expression because it would lead to to ``10`` and
  -- ``50`` remaining on the stack with no operand to process them further.
  -- 
  expr2 :: Expr
  expr2 = ELit 10.0 (ELit 20.0 (ELit 30.0 (EAdd ENone)))

  -- NOTE: This is an invalid expression because it would lead to to ``10 / 0`` 
  -- 
  expr2 :: Expr
  expr2 = ELit 10.0 (ELit 20.0 (ELit 20.0 (EMinus (EDivide ENone))))


  calculate :: Expr -> (Bool, String, Float)
  calculate e = _todo

  -- GHCi> calcualte expr1
  -- (True, "Valid expression", 30.0)

  -- GHCi> calcualte expr2
  -- (True, "Invalid expression", 0.0)

  -- GHCi> calcualte expr3
  -- (True, "Divide by zero", 0.0)

You can also choose to keep your ``Expr`` data-type separate and a stack of ``Expr`` separate.

Run-length encoding
-------------------

Run-length encoding (RLE) is a simple form of data compression, where runs
(consecutive data elements) are replaced by just one data value and count.

For example we can represent the original 53 characters with only 13:

.. code-block:: text

  "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"  ->  "12WB12W3B24WB"

And from the compressed string, we can get the original string back, without loss of any data. eg:

.. code-block:: text

  "12WB12W3B24WB" -> "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"

Can you think of a sum-type that will help you in writing a correct codec easily?

1. Write a program that can run-length encode and decode any string comprising of alphabets alone.
2. Write a program that can run-length encode and decode any string comprising of alphabets **and** numbers. What makes this part of the problem tougher than the earlier part? What do you need to change in the problem statement to be able to solve encode/decoding of strings containing alphabets **and** numbers? Make your suggesting change and solve the problem.

Scratch section - PLEASE IGNORE
-------------------------------

In this video we'll cover:

- 
- The downside of tuples
- Fixing the downside by using "product types"
- Using product-types (and tuples) via pattern matching
- Difference between types, data-constructors, and actual values.
- Downside of "tagging" product types and fixing them with newtypes.
- Downside of product-types and fixing them with records
- Sum types and why does one needs them
- Exhaustiveness checking
- Creating our own type using a mix of sum, product, and records
- Why are these called sum & product types
- Named Puns


Exercises:
- Run length encoding
- 

Core motivation:


We touched upon tuples previously and outlined three reasons why they're used:

- Returning multiple values from functions
- Passing multiple values to high-order functions where a lambda just allows a single parameter
- Ad-hoc way to associate data or elements

If you think about it, the first two are really specific use-cases of the last.

One of the problems with tuples like (Int, Int) is that when you're looking at
type-signatures you don't really come to know what each of the Int's means. The
structure of the value is known, but the semantics are not known. Discuss about
what semantics means.

An (Int, Int) could mean anything -- hour / min, x / y coordinates, row / col
coordinates, 

PRODUCT TYPE
- Canonical example of representing a point in 2-d space - (x, y) coordinates OR (row, col) coordinates
  
  - data Point = MkPoint Double Double deriving (Eq, Show)
  - equivalent to writing: data Point where MkPoint :: Double -> Double -> Point

- Inspect the type of the data-constructors in GHCi. They are just functions -- for most part.
- Pattern matching against **constructors**
- Representing an hour/min pair: data TimeOfDay = TimeOfDay Int Int
- This can also be represented as (Int, Int) ==> TimeOfDay Int Int

- Similar to tuples, but different because of semantics. They are structurally the same, but semantically different. They mean very, very different things.
- For example, the Mars Climate Orbiter disaster due to confusion between pounds (imperial units) and kilograms (metric, SI units)

SUM TYPE
- Designing an Enum -- status of an Order, status of a Payment, presence/absence of a field (say social media profiles -- basically a hand-written Maybe)
- Advantages of using exhaustiveness checking
- That is called a sum-type

Mix of Sum & Product:
- Example of representing a person's full-name:

  data Title = Mr | Mrs | Dr | Miss | Other String
  data FullName = FullName Title String String

Need for records.
- But there's a problem here. What if you were building the data-structure for 

- Presence / absence of fields. Why Haskell doesn't have NULL values.
- 

.. [1] Haddock is to Haskell, what Javadocs is to Java.