Author: | Raphael ‘kena’ Poss |
---|---|
Date: | April 2014 |
Contents
Note
The latest version of this document can be found online at http://science.raphael.poss.name/categories-from-scratch.html. Alternate formats: Source, PDF.
The concept of category from mathematics happens to be useful to computer programmers in many ways. Unfortunately, all “good” explanations of categories so far have been designed by mathematicians, or at least theoreticians with a strong background in mathematics, and this makes categories especially inscrutable to external audiences.
More specifically, the common explanatory route to approach categories is usually: “here is a formal specification of what a category is; then look at these known things from maths and theoretical computer science, and admire how they can be described using the notions of category theory.” This approach is only successful if the audience can fully understand a conceptual object using only its formal specification.
In practice, quite a few people only adopt conceptual objects by abstracting from two or more contexts where the concepts are applicable, instead. This is the road taken below: reconstruct the abstractions from category theory using scratches of understanding from various fields of computer engineering.
The rest of this document is structured as follows:
Many introductory texts about category theory use another group of examples: type systems for programming languages, matrices from linear algebra and sometimes directed graphs. I did not choose them here as I believe they are already too abstract for many computing engineers. Instead, I believe these examples should only be mentioned as additional instances of categories after the concept has been properly recognized.
Pipes: In a Unix system, processing activities are organized in processes. Nearly all communication between processes and with I/O devices is organized via file descriptors: to real files on disk, to terminals with the user, but also FIFO buffers between processes. Communication can be monitored directly on terminals, or by looking at the contents of files on disk, or interleaving the program tee between programs chained by FIFOs. |
Compilers: A compiler is a program that takes another program as input, and transforms it to produce another program as output. The set of valid inputs for a compiler is its input language, and the set of possible outputs is its output language. For example, a “C compiler” accepts C code as input and produces assembly code as output (for some specific ISA). |
Circuits: An electronic circuit is usually recognized when one sees a plastic enclosure with some metal bits sticking out. Signal processors are a specific type of circuit, with a notion of “input” and “output” connectors: For example, a hi-fi amplifier has distinct pins for the “audio source” and one or more “speaker” output connectors. When plugged in to an active input signal, the circuit is itself activated and starts driving its output pins. The output signal can be exploited by further plugging to other circuits, or output devices. It can also be measured, for example using an oscilloscope. |
Each topic has two "groups" of things: things that can be observed from the “outside”, and things that appear (mostly) as black boxes from the outside but are “connected” between the things in the first group:
Example: | Pipes | Compilers | Circuits |
Observables: | The data streams going through file descriptors. | The set of all possible programs written in the input languages or generated in the output language. | The electric signals over time. |
Black boxes: | The running processes with their internal state (program counter, stack, heap, etc.). | The program transformation algorithms. | The circuits themselves. |
Each topic provides a mechanism to plug the "black box" things together:
Pipes: Given the two commands "grep -v foo" and "grep -v bar", one can write the command "grep -v foo | grep -v bar", which combines the two behaviors. |
Compilers: Given a compiler from C to assembly text, and a compiler from assembly text to machine code, one can combine them (eg. by means of a script) to create a compiler from C to machine code. |
Circuits: Given the a DVI-VGA adapter and a VGA-SVideo adapter, it is possible to plug them together to form a DVI-SVideo adapter. |
Each topic has a notion of "good" compositions that "make sense", and "bad" compositions that nonsensical and not expected to "work properly":
Pipes: Piping ls with grep foo is sensical. |
Compilers: Connecting the output of a C-to-assembly compiler to the input of a assembly-to-code compiler is sensical. |
Circuits: Plugging a USB-serial adapter to a DB9-DB25 serial adapter (with a 9-pin interface between them) is sensical. |
Piping ls with gv (PostScript viewer) is nonsensical. | Connecting the output of a C-to-C compiler to the input of an assembly-to-code compiler is nonsensical. | Plugging a USB-serial adapter to an EGA display (physically possible as they share the same DB9 connector) is nonsensical. |
To identify "good" from "bad" compositions, each topic places a large emphasis on the notion of interface:
Pipes: Both ls and grep operate on plain text streams, which is why they compose well with pipes. In contrast gv expects PostScript as input, which ls cannot produce. |
Compilers: Each compiler has a notion of input language for the set of accepted input programs and output language for the set of possible outputs. The languages must match when composing the compilers together. |
Circuits: The USB-serial adapter and serial-serial adapter plug well together because they both use the same standard (RS232) signalling protocol at the interface. |
Remarkably, users understand or conceptualize interfaces, despite the fact they are not always defined explicitly beforehand.
Each topic has some instances of black box things that "do nothing", ie have a neutral behavior:
Pipes: when the input and output data streams are the same (byte-wise). |
Compilers: when the input and output languages are the same. |
Circuits: when the output signals are measured the same as the input. |
The knowledge of whether a black box is neutral can be gained in either or three ways.
Either it is known to be neutral by construction, because the specification is available for scrutinity and can be proven to define a neutral behavior:
Pipes: The program source code contains a loop that iteratively inputs a byte and outputs the same byte, stopping at end-of-stream. |
Compilers: The transformation algorithm does not change the semantics of the program. |
Circuits: The blue print defines direct links between the input and output pins. |
Or, the knowledge is provided externally, eg. by fiat:
Pipes: The manual page for a command specifies that the process will replicate its input to its output unchanged. |
Compilers: The documentation says it is a “source-to-source” compiler, or explicitly indicates that its input and output language are the same. |
Circuits: The manufacturer guarantees that the circuit is fully pass-through. |
Or, it is discovered: to find out whether a black box thing A is neutral, assuming an observer has access to a pre-existing, valid "observable thing" that can be fed to A, then the observer can deduce the behavior of A is "neutral" if the observable as a result of A's activity is the same as the original observable:
Pipes: An unknown program xxx is known a priori to only read from its standard input and write to its standard output. So one can run the command "xxx <iN >oN" for various input files iN and compare whether the contents of each file oN are equal to the corresponding iN. If so, xxx appears to be "neutral". |
Compilers: The input language is known, but not the output language. So one can generate some random but valid input programs and feed them to the compiler. If the output programs are also valid input programs, then original program piece appears to be "neutral". |
Circuits: A signal processor appears to have the same number of input and output pins, and its input signal specification is known a priori. So one can use various valid input signals, feed them to the circuit, and measure the output. If the output signals measures the same as the input every time, the circuit appears to be "neutral". |
Once a black box thing N is known to be neutral, then its composition "left" and "right" with another black box thing A can be assumed to have the same behavior as A on its own:
Pipes: Both "xxx | grep foo" and "grep foo | xxx" can be assumed to behave like grep foo once xxx is known to be neutral. |
Compilers: A compiler built by composing a neutral compiler N either before or after another compiler C will have the same input and output languages as C. |
Circuits: Plugging a neutral circuit on either the input or output side of another circuit A will process signals as A alone would. |
When one composes three black box things A, B and C together (assuming the compositions are point-wise sensical), the order in which the composition is realized does not change the behavior:
Pipes: The commands "(ls | grep -v foo) | grep -v bar" and "ls | (grep -v foo | grep -v bar)" have equivalent behavior on their data streams. |
Compilers: If a script A invokes a Scheme-to-C compiler C1 followed by a C-to-assembly compiler C2, and another script B invokes A and then an assembly-to-code compiler C3, then B has the same input and output language as a script C that calls C1 then D, where D calls C2 then C3. |
Circuits: Whether a DVI-VGA adapter is plugged into a VGA-SVideo adapter, and then the result is plugged into a TV screen with SVideo input, or if a VGA-SVideo adapter is first plugged into the TV, and then plugged to a DVI-VGA adapter, both resulting circuits are working TVs from a DVI input signal. |
The previous section has introduced the following key concepts:
These are the concepts manipulated in category theory.
First, the observables are named objects. The black boxes things are named morphisms or arrows.
Example: | Pipes | Compilers | Circuits |
Objects: | Data streams. | Languages. | Signals. |
Morphisms: | Processes. | Compilers. | Circuits. |
Categories are defined over mathematical sets of objects and morphisms. Sets are different from simple “collections” (or “bag”) of things from the real world: all their elements are distinct, according to some equivalence relation.
So in order to talk about categories over things from the real world, we must first choose how to define the mathematical sets. This choice is called a model and multiple models are possible for the same collection of things.
The most focus should be given to the set of morphisms. The set of objects is simply derived from it once the morphisms are properly identified. For example, in our pipes example, considering what happens to data streams. What does it mean for two processes to be equal or different?
We can choose for example “data stream equality”. By this standard, two processes that filter out lines containing the text “foo” over any data stream are the “same thing.” So “sed '/foo/d'” and “grep -v foo” are the same morphism.
If we choose this definition for morphisms, then the objects are not individual files (or time-particular datastreams over FIFOs), but rather entire classes of all possible data streams that compare equal to each other byte by byte. For example, a stream that delivers “helloworld” in one go is the same stream as another that delivers “hello” and then “world” 5 seconds later.
Another possible choice for a definition is "physical equality". By this standard, two processes that run at different times or in different physical regions of the system are distinct, even if they perform the same task. So two processes run from the same command (eg cat) at different times end up as different morphisms in the set.
If we choose this definition, then the mathematical objects are not only data streams, but also where and when the bytes are physically encoded. So two streams that deliver “helloworld” in different places/times are distinct objects.
The rule of thumb while choosing a definition is the following: if one wants to talk about categories over sets of objects and morphisms that are already mathematically defined, then all is well. If one wants to use category theory over things that are not yet mathematical, be careful to explain clearly and explicitly how they are modeled using mathematical sets, and which equivalence relation is used.
For the next sections, we use the following definitions:
Pipes: We use “data stream equality” as defined above. With this, the commands “sed '/foo/d'” and “grep -v foo” define the same morphism; so do “tr x y” and “sed s/x/y/g”. With this definition, each morphism may have multiple names (different commands to define it). This is ok. |
Compilers: We use “program equality”: two compilers are the same morphism if they produce the same output program from the same input program. By this definition, two different C-to-assembly compilers (eg. gcc and clang) are different morphisms, but a compiler defined by a script combining cpp with gcc is the same morphism as gcc on its own. Again, with this definition, each morphism may have multiple names. |
Circuits: We use “interface and protocol equality”. With this, both the USBGEAR/USBG-232FT-1 and MCT/U232-P9 are the same morphism, as they both interface USB to RS232. However, a USB-to-EGA adapter would not be the same morphism, because it uses a different signal protocol even though the interface (DB9) is the same. |
The arrow notation describes a morphism: “A → B” is a description for a morphism from object A to object B. The two sides of a morphism can be named “input” and “output”, but are usually named “source” and “target”.
There may be multiple morphisms between any two objects, so the arrow notation does not identify a particular morphism; instead, it can be seen as an “interface” or “type” for the set of all morphisms between the designated objects:
Pipes: Both the processes resulting from running “tr x y <f1 >f2” and “tr yz <f1 >f2” are different morphisms, and both can be described by the arrow “f1 → f2”. |
Compilers: Both the gcc and clang programs are morphisms, and both can be described by the arrow “C → assembly”. |
Circuits: Both the USBGEAR/USBG-232FT-1 and MCT/U232-P9 are morphisms, and both can be described by the arrow “USB → RS232”. |
If two morphisms have the same intermediary object, it is possible to compose them together (cf. Composition semantics above). This is abstracted by an composition operator noted "⋅": given two compatible morphisms f and g, "g⋅f" designates their composition, ie. (g⋅f)(x) = g(f(x)).
By construction, if f can be described by A → B and g by B → C, then g⋅f can be described by A → C.
Composition is associative: for any f, g, h, (f⋅g)⋅h = f⋅(g⋅h).
For any object, there must exist at least one morphism that keeps the object unchanged. Each such identity morphism for an object x is called “id_{x}” (sometimes also “1_{x}”) and can be described as x → x; it must satisfy the following property: for every morphism f : A → B, id_{B}⋅f = f = f⋅id_{A}.
So there must be at least as many identity morphisms as there are objects.
For categories defined by modeling over concrete things, it may be necessary to extend the mathematical set of morphisms with “theoretical” identity morphisms, when there are no concrete identities.
For example, it is not possible to build a concrete identity circuit: any one-to-one pairing of physical input and output connectors with direct wires between them is bound to introduce noise in the signal due to the physical distance. However, the mathematical set modeling circuits can be naturally extended to include “virtual” identity circuits that preserve signals unchanged.
Hopefully, with many categories the identity morphisms can be concretely constructed in the application domain:
Pipes: For any data stream, the morphism defined from the commands “cat” or “grep '.*'” is an identity. |
Compilers: For the C language, the preprocessor (cpp) is an identity. |
Circuits: see above. |
A category is an algebraic structure formed over:
Category: | Pipes | Compilers | Circuits |
Objects: | Data streams. | Languages. | Signals. |
Morphisms: | Stream transformers. | Compilers. | Models of circuits. |
Identities: | Running cat or similar pass-through commands. | cpp for C, in general cat for any langage. | Virtual pass-through circuits. |
Composition: | Chaining processes via FIFO buffers, eg by running them with the pipe operator in commands. | Creating a script that invokes two existing compilers, applying the 2nd on the output of the 1st. | Plugging the circuits together. |
Although the existence of identity morphisms is a prerequisite to form a category (axiomatic), it is possible to prove within category theory that each identity is unique.
You can do this as follows.
Suppose you have two morphisms id1_{x} and id2_{x} that preserve object x and satisfy the axiomatic identity properties:
In equation #1, replace f by id2_{x}, and you find that id1_{x}⋅id2_{x} = id2_{x}.
In equation #2, replace g by id1_{x}, and you find that id1_{x}⋅id2_{x} = id1_{x}.
Since both left-hand sides are equal, you have proven that id2_{x} = id1_{x}. ■
What this means in practice: if you can construct/define two morphisms in a category and prove they satisfy the identity laws, then you have proven they are the same morphism. In our examples, that means identity commands (pipes), compilers or circuits become interchangeable with regards to their properties in category theory. This can be used to simplify formulas that use complex morphisms into simpler ones.
The general notion of invertibility for a morphism can be expressed purely in the vocabulary of category theory:
f : A → B is invertible if there exists g : B → A such that g⋅f : id_{A} and f⋅g : id_{B}.
Invertible morphisms are also called isomorphisms.
By extension, two objects A and B are isomorphic if there exists at least one isomorphism described by A → B.
For any category C, it is possible to define mathematically another category C^{op} where the source and target of every morphism are interchanged. This is called the “opposite category” or dual category of C.
Duality is “invertible”: (C^{op})^{op} is the same category as C.
The dual category of a category C of concrete objects may be purely abstract, ie. without concrete representations for morphisms in the application domain of C, for example if the morphisms in C are not invertible.
Nevertheless, duality serves an important purpose: say, you have demonstrated a property that holds within a category C, which you can express within the language of category theory using a formula σ (some text string).
If you then replace all occurences of “source” by “target” and vice-versa in σ, and all occurences of f⋅g by g⋅f, you obtain a new formula σ^{op}. By construction, this formula is true in C^{op}. It is said to be the dual property of σ. Conversely, if you know a property σ to be true in C^{op}, then σ^{op} will be true in C as well.
Why this is useful: many important/useful results and properties of mathematics come in pairs that are expressed using “symmetric” formulas, which are dual in category theory. For example, monomorphisms and epimorphisms are morphisms for which different properties hold, but their definitions are dual. From this, if one can derive another unrelated property φ that relies on the fact a morphism is monomorphic, then thanks to duality, automatically the dual property φ^{op} is also proven over epimorphisms. Thanks to category theory, many pairs of results/theorems in algebra can be obtained with half the effort.
The observation that there are some common features between different categories intuitively brings the idea to transform one category into another, while preserving its structure.
For example, our category of “pipes” over Unix data streams can be transformed into a category of “networked services” over network data streams trivially, by attaching the program nc around each Unix processes.
Such a transformation of a category into another is called a functor. Generally, a functor F from a category C to a category D has the following properties:
(This specific flavor of functors is called “covariant.” If F maps each arrow in C to an arrow with opposite direction in D, it is called “contravariant” instead. Covariant and contravariant functors are dual.)
Side property: Because of the properties of functors, the algebraic structure formed by 1) a set of categories 2) a set of functors over these categories 3) the identity functors that leave each category in set #1 unchanged, and 4) the natural generalization of composition, together, is itself a category.
I am grateful to Karst Koymans for helping develop my understanding of categories and denouncing major problems with an earlier version of this document.
Copyright © 2014, Raphael ‘kena’ Poss. Permission is granted to distribute, reuse and modify this document according to the terms of the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/.
SC fingerprint: fp:boaT01NNxsK5g1kZfT_03hFFhdOCwUzw4NjT1mCwBbIl2g