While there are some other, quite wonderful, answers here, they all miss your first question somewhat. To be clear: values simply do not exist and have no meaning in the category Hask. That's just not what Hask is there to talk about.
The above may seem a dismissive thing to say, but I bring it up because it's important to note that category theory provides just one lens for examining the much more complex interactions and structures available in something as sophisticated as a programming language. It's not fruitful to expect all of that structure to be subsumed by the rather simple notion of a category. [1]
Another way of saying this is that we're trying to analyze a complex system and it's sometimes useful to view it as a category in order to seek out interesting patterns. It's this mindset that lets us introduce Hask, check that it really forms a category, notice that `Maybe` seems to behave like a `Functor`, and then use all of those mechanics to write down coherence conditions:
```haskell
fmap id      = id
fmap f . fmap g = fmap (f . g)
```
These rules would make sense regardless of whether we introduce Hask, but by seeing them as simple consequences of a simple structure discoverable within Haskell, we understand their importance.
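As a quick sanity check we can test these laws for `Maybe` at a few sample values. This is only a spot check, not a proof, since the laws quantify over all functions and all inputs; the helper names below are mine:

```haskell
-- Spot-checking the functor laws for Maybe at a few sample values.
-- Evidence, not a proof: the real laws quantify over all f, g, and x.
checkIdentity :: Maybe Int -> Bool
checkIdentity x = fmap id x == id x

checkComposition :: (Int -> Int) -> (Int -> Int) -> Maybe Int -> Bool
checkComposition f g x = (fmap f . fmap g) x == fmap (f . g) x

main :: IO ()
main = print ( all checkIdentity [Nothing, Just 3]
            && all (checkComposition (+ 1) (* 2)) [Nothing, Just 3] )
```

Running `main` prints `True`.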
As a technical note, the entirety of this answer assumes Hask is actually "platonic" Hask, i.e. we get to ignore bottom (`undefined` and non-termination) as much as we like. Without that, almost the entire argument falls apart.
Let's examine those laws more closely, since they seem almost to run against my initial statement---they're patently operating at the value level, yet "values don't exist in Hask", right?
Well, one answer is to take a closer look at what a categorical functor is. Explicitly, it's a mapping between two categories (say C and D) which takes objects of C to objects of D and arrows of C to arrows of D. It's worth noting that in general these mappings are not categorical arrows—they simply form a relation between the categories and do not necessarily share in the structure of the categories.
This is important because even when considering Haskell `Functor`s, which are endofunctors on Hask, we have to be careful. In Hask the objects are Haskell types and the arrows are Haskell functions between those types.
Let's look at `Maybe` again. If it's going to be an endofunctor on Hask then we need a way to take all types in Hask to other types in Hask. This mapping is not a Haskell function even though it might feel like one: `pure :: a -> Maybe a` doesn't qualify because it operates at the value level. Instead, our object mapping is `Maybe` itself: for any type `a` we can form the type `Maybe a`.
This already highlights the value of working in Hask without values---we really do want to isolate a notion of `Functor` which doesn't depend on `pure`.
We'll develop the rest of `Functor` by examining the arrow mapping of our `Maybe` endofunctor. Here we need a way to map the arrows of Hask to the arrows of Hask. Let's for now assume that this is not a Haskell function—it doesn't have to be—so to emphasize that we'll write it differently: if `f` is a Haskell function `a -> b` then `Maybe[f]` is some other Haskell function `Maybe a -> Maybe b`.
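To make the arrow mapping concrete, here it is written out as an ordinary Haskell function, deliberately avoiding the `Functor` class; the name `mapMaybeArrow` is mine, not standard:

```haskell
-- The arrow mapping of the Maybe endofunctor, written by hand.
-- Given an arrow f : a -> b in Hask, it produces the arrow
-- Maybe[f] : Maybe a -> Maybe b.
mapMaybeArrow :: (a -> b) -> (Maybe a -> Maybe b)
mapMaybeArrow _ Nothing  = Nothing
mapMaybeArrow f (Just x) = Just (f x)
```

For example, `mapMaybeArrow show` takes `Just 3` to `Just "3"` and `Nothing` to `Nothing`.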
Now, it's hard not to skip ahead and just start calling `Maybe[f]` "`fmap f`", but we can do a bit more work before making that jump. `Maybe[f]` needs to satisfy certain coherence conditions. In particular, for any type `a` in Hask we have an identity arrow. In our metalanguage we might call it `id[a]`, and we happen to know that it also goes by the Haskell name `id :: a -> a`. Altogether, we can use these to state the endofunctor coherence conditions:
- For every object `a` in Hask, we have that `Maybe[id[a]] = id[Maybe a]`.
- For any two composable arrows `f` and `g` in Hask, we have that `Maybe[f . g] = Maybe[f] . Maybe[g]`.
The final step is to notice that `Maybe[_]` just happens to be realizable as a Haskell function itself, as a value of the Hask object `forall a b . (a -> b) -> (Maybe a -> Maybe b)`. That gives us `Functor`.
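Packaged up inside Haskell, that realization is what the `Functor` class captures. Here's a sketch with the class and method renamed (`MyFunctor`/`myFmap` stand in for the Prelude's `Functor`/`fmap`) to avoid clashing with the Prelude:

```haskell
-- A renamed sketch of the Functor class: the arrow mapping of an
-- endofunctor, reified as a Haskell value.
class MyFunctor f where
  myFmap :: (a -> b) -> f a -> f b

-- The Maybe instance is exactly the Maybe[_] arrow mapping from above.
instance MyFunctor Maybe where
  myFmap _ Nothing  = Nothing
  myFmap f (Just x) = Just (f x)
```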
While the above was rather technical and complex, there's an important point in keeping the notions of Hask and categorical endofunctors straight and separate from their Haskell instantiation. In particular, we can discover all of this structure without introducing a need for `fmap` to exist as a real Haskell function. Hask is a category without introducing anything at all at the value level.
That's where the real beating heart of viewing Hask as a category lives. The notation that identifies endofunctors on Hask with `Functor` requires much more line blurring.
This line blurring is justified because Hask has exponentials. This is a tricky way of saying that there's a unification between whole bundles of categorical arrows and particular, special objects in Hask.
To be more explicit, we know that for any two objects of Hask, say `a` and `b`, we can talk about the arrows between those two objects, often denoted as Hask(`a`, `b`). This is just a mathematical set, but we know that there's another type in Hask which is closely related to Hask(`a`, `b`): `(a -> b)`!
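One consequence of this identification is that arrows become first-class: they can be stored in data structures and handed around like any other value. A small illustration (the names `arrows` and `applyAll` are mine):

```haskell
-- Arrows of Hask(Int, Int) living inside Hask as values of the
-- object Int -> Int: they can sit in a list like any other value.
arrows :: [Int -> Int]
arrows = [id, (+ 1), (* 2)]

-- Apply a single argument to every stored arrow.
applyAll :: a -> [a -> b] -> [b]
applyAll x = map ($ x)
```

Here `applyAll 3 arrows` yields `[3, 4, 6]`.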
So this is strange.
I originally declared that general Haskell values have absolutely no representation in the categorical presentation of Hask. Then I went on to demonstrate that we can do a whole lot with Hask by using only its categorical notions and without actually sticking those pieces inside of Haskell as values.
But now I'm noting that the values of a type like `a -> b` actually do exist as all of the arrows in the metalinguistic set Hask(`a`, `b`). That's quite a trick, and it's exactly this metalinguistic blurring that makes categories with exponentials so interesting.
We can do a little better, though! Hask also has a terminal object. We can talk about this metalinguistically by calling it 1, but we also know it as the Haskell type `()`. If we look at any Hask object `a`, we know that there's a whole set of categorical arrows in Hask(`()`, `a`). Further, we know that these correspond to the values of the type `() -> a`. Finally, since given any function `f :: () -> a` we can immediately get an `a` by applying `f` to `()`, one might want to say that the categorical arrows in Hask(`()`, `a`) are exactly the Haskell values of type `a`.
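That claimed correspondence can be written down as a pair of Haskell functions (the names are mine). Ignoring bottom, as the platonic-Hask caveat above lets us, they are mutually inverse:

```haskell
-- The correspondence between values of type a and arrows () -> a,
-- as a pair of functions that are inverse to each other
-- (up to bottom, which platonic Hask ignores).
toArrow :: a -> (() -> a)
toArrow x = \() -> x

fromArrow :: (() -> a) -> a
fromArrow f = f ()
```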
Which ought to either be utterly confusing or incredibly mind-blowing.
I'm going to end this somewhat philosophically by sticking with my initial statement: Hask doesn't talk about Haskell values at all. It really doesn't as a pure category---categories are interesting precisely because they're very simple and thus don't need all these extra-categorical notions of types and values and `typeOf` inclusion and the like.
But I also, perhaps poorly, showed that even as strictly just a category, Hask has something that looks very, very similar to all of the values of Haskell: the arrows of Hask(`()`, `a`) for each Hask object `a`.
Philosophically we might argue that these arrows aren't really the Haskell values we're looking for---they're just stand-ins, categorical mocks. You might argue that they're different things but simply happen to be in one-to-one correspondence with Haskell values.
I actually think that's a really important idea to keep in mind. These two things are different, they just behave similarly.
Very similarly. Any category lets you compose arrows, so let's pick some arrow in Hask(`a`, `b`) and some arrow in Hask(`()`, `a`). If we combine these arrows with categorical composition we get an arrow in Hask(`()`, `b`). Turning this all on its head a bit, we might say that what I just did was find a value of type `a -> b`, find a value of type `a`, and then combine them to produce a value of type `b`.
In other words, if we look at things sideways we can see categorical arrow composition as a generalized form of function application.
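That sideways view can be written out directly (the names `point` and `applyVia` are mine): represent a value as an arrow out of `()`, compose, then evaluate at `()`:

```haskell
-- Function application recovered from composition: a value x :: a
-- becomes the arrow point x in Hask((), a), and applying f to x is
-- composing f with that arrow and then evaluating at ().
point :: a -> (() -> a)
point x = \() -> x

applyVia :: (a -> b) -> a -> b
applyVia f x = (f . point x) ()
```

So `applyVia (+ 1) 3` is `4`, computed by composition rather than by direct application.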
This is what makes categories like Hask so interesting. Broadly, these kinds of categories are called Cartesian Closed Categories, or CCCs. Due to having terminal objects and exponentials (which require products as well) they have structures which completely model the typed lambda calculus.
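The product/exponential interplay behind CCCs is visible in the Prelude's `curry` and `uncurry`, which witness the correspondence between arrows out of a product and arrows into an exponential; here they are re-derived under primed names:

```haskell
-- In a CCC, arrows (a, b) -> c correspond one-to-one with arrows
-- a -> (b -> c): the product on the left, the exponential on the right.
-- These mirror the Prelude's curry and uncurry exactly.
curry' :: ((a, b) -> c) -> (a -> b -> c)
curry' f a b = f (a, b)

uncurry' :: (a -> b -> c) -> ((a, b) -> c)
uncurry' f (a, b) = f a b
```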
But they still don't have values.
[1] If you're reading this before reading the rest of my answer, then keep reading. It turns out that while it's absurd to expect that to happen, it actually kind of does. If you're reading this after reading my whole answer then let's just reflect on how cool CCCs are.