Cubical Type Theory

Posted on March 7, 2021
7068

Hello, everyone! It’s been a while, hasn’t it? Somehow, after every post, I manage to convince myself that I’m gonna be better and not let a whole season go by between posts, but it never happens.

For the last two posts I’ve been going on at length about fancy type theories, and this post, as the title implies, is no exception. In fact, two posts ago I mentioned, offhand, Cubical type theory as a possibility for realising HoTT in a constructive way, but 128 days ago I did not understand cubical type theory in the slightest.

Now, however, I do! I still don’t know what the hell the word “fibration” is supposed to mean, or indeed “fibrant”, but we can gloss over that and present cubical type theory with as little category-theoretical jargon as possible. In fact, I have a mostly1-complete implementation of cubical type theory for us to use as a discussion framework.

As mentioned in Reflections on Equality, the main idea of Cubical Type Theory is the type of paths, so let’s talk about that at length.

Paths

Even in boring old Martin-Löf type theory, as soon as we have equalities and induction, we can prove a very interesting theorem: Every function preserves paths. This is actually a simplification of a more fundamental fact in MLTT, its groupoid structure, in which functions are interpreted as functors. Like a category-theoretical functor has an action on objects and an action on morphisms, a type-theoretical function has an action on values and an action on paths.

Using path induction, we can prove it (roughly) like this. Suppose (given a f:ABf : A \to B), there is a path p:xAyp : x \equiv_A y. By induction, we may assume yy is xx and pp is refl\mathrm{refl}, in which case what we need to prove is f(x)Bf(x)f(x) \equiv_{B} f(x). But this is what refl\mathrm{refl} states. This isn’t a complicated proof, because it’s not a complicated theorem: the images of equal elements are equal, big deal.

This is where things get a little mind-bending. What would happen if we had a type with “two” values, with a path between them? The values of the function at either end could be different, but they would still be… equal. This is the main idea of cubical type theory: We add an interval type, I\mathbb{I}, which denotes the interval object [0,1][0,1] in our model. Then we can drop the inductive definition of equalities as generated by refl\mathrm{refl} and simply define equalities in AA as functions IA\mathbb{I} \to A.

Let’s not get ahead of ourselves, though, and talk a bit more about the interval type. It has two elements, i0i0 and i1i1, but it’s not isomorphic to the type of booleans: Internally to the type theory, we have no way of distinguishing i0i0 and i1i1, since every function must be continuous.

Since it denotes [0,1][0,1], we can define a lattice operations on elements of the interval, enough to equip it with the structure of a De Morgan algebra, but not a boolean algebra. We have meets, aba \land b, the logical operation of “and”, interpreted as min(a,b)\min(a,b); Joins, aba \lor b, the logical operation of “or”, interpreted as max(a,b)\max(a,b); And an involution, ¬a\neg a, which denotes the algebraic operation 1a1 - a.2

These operations follow the usual laws of Boolean logic save for two: In general, min(x,1x)min(x, 1 - x) is not 00 and max(x,1x)max(x, 1 - x) is not 11, only for the endpoints. While internally to the type theory we have no element representing “half”, since the object I\mathbb{I} denotes does have these filler points, we can’t in general expect those equations to hold. Hence, De Morgan algebra, not Boolean.

Another thing to keep in mind is that, while the interval is an expression which other expressions have as a type (namely, Γi0:I\Gamma \vdash i0 : \mathbb{I} and Γi1:I\Gamma \vdash i1 : \mathbb{I}), we do not call it a type. We reserve the word type for objects with more structure (which we will discuss later). For now, it’s enough to think of the interval as a “pre”type, something which is almost, but not quite, a type. Cubical type theory has plenty of these pretypes so we include a separate universe Uω\mathscr{U}_{\omega} to classify them.

Now that we’re familiar with the interval, we can discuss the actual title of this section, paths. We define the type of paths in AA as a refinement of the function space f:IAf : \mathbb{I} \to A, where the values of f(i0)f(i0) and f(i1)f(i1) are indicated in the type. Hence the formation rule, on the left:

Γ,i:Ie:AΓλi.e:Path A e[i0/i] e[i1/i]\frac{\Gamma, i : \mathbb{I} \vdash e : A}{\Gamma \vdash \lambda i. e : \mathrm{Path}\ A\ e[i0/i]\ e[i1/i]}

Γp:Path A x yΓi:IΓp(i):A\frac{\Gamma \vdash p : \mathrm{Path}\ A\ x\ y\quad \Gamma \vdash i : \mathbb{I}}{\Gamma \vdash p(i) : A}

On the right is the elimination rule, which says that if we have an element of the interval we can project the value the path takes at that point. Alternatively we could represent paths by the type with an inclusion inP:(f:IA)Path A f(i0) f(i1)\mathrm{inP} : \prod{(f : \mathbb{I} \to A)} \to \mathrm{Path}\ A\ f(i0)\ f(i1) and projection outP:Path A x yIA\mathrm{outP} : \mathrm{Path}\ A\ x\ y \to \mathbb{I} \to A. Furthermore, we impose a pair of “regularity” equations, which state that p(i0)=xp(i0) = x and p(i1)=yp(i1) = y for paths p:Path A x yp : \mathrm{Path}\ A\ x\ y.

One important difference between functions out of the interval and paths is that, while the former would be put in the universe Uω\mathscr{U}_{\omega} by virtue of its domain being a pretype, paths do have the required additional structure to be in the universe U\mathscr{U} of “proper types”, as long as the type AA of the endpoints does.

Using the algebraic structure of the interval we can define some operations on paths, which we may represent diagramatically. For simplicity, paths will be drawn as direct lines between their endpoints, and the type will be left to be inferred from the context; A path whose bound variable is ii will be drawn in the left-to-right direction, and a path whose bound variable is jj will be drawn in the upwards direction.

Since bound interval variables are variables, they have all the same structural rules as normal variables! In particular, weakening lets us drop an interval variable to have a constant path. This is a proof of reflexivity, which we diagram as follows:

reflexivity
The reflexivity path for aa is represented by a constant path.

Given a path pp with endpoints xx and yy (concisely written as p:xyp : x \equiv y) we compute its inversion, sym(p):yxsym(p) : y \equiv x by “precomposition” with the interval involution:

p from a to b the inverse of p from b to a
By inverting the interval argument, we can invert paths.

The meet and join operations on the interval let us define two kinds of squares called connections, which let us concisely turn a one-dimensional path into a two-dimensional square, which gives us paths between paths (paths in the second dimension). The connection generated by iji \land j is going to be especially helpful in a bit, when we prove that singletons are contractible, and hence that paths are a suitable definition of equality.

and connection
The square generated by λi j.p(ij)\lambda i\ j. p(i \land j)
or connection
The square generated by λi j.p(ij)\lambda i\ j. p(i \lor j)

Let’s walk through the construction of the left square, keeping in mind that ii goes right and jj goes up. Since the top and bottom faces vary in the ii direction but not the jj direction, they’ll all have a prefixed λi\lambda i; The left and right faces just correspond to applying the outermost lambda inside the square. For the faces, we have:

You can see that in either the ii or jj direction the inside of this square connects the path pp with the constant path at its left endpoint. This is exactly what we need for the following proof that singletons are contractible:

singContr : {A : Type} {a : A} -> isContr (Singl A a)
singContr {A} {a} = ((a, \i -> a), \y i -> (y.2 i, \j -> y.2 (iand i j)))

This proof is written syntactically, in the language of cubical. This proof appears on line 114 of the massive source file which has everything I’ve tried to prove with this so far. What’s a module system? The actual proof file has some preliminaries which would be interesting if you care about how cubical type theory is actually implemented.

Another operation on equalities which is very hard in MLTT, but trivial with cubes, is function extensionality. You can see why this would be simple if you consider that a pointwise equality between functions would be an element of AIBA \to \mathbb{I} \to B, while an equality between functions themselves is an element of IAB\mathbb{I} \to A \to B. By simply swapping the binders, we get the naive function extensionality.

The proof of full function extensionality as per the HoTT book is also very simple, but it requires quite a bit more infrastructure to talk about; For now, rather than saying happly (line 667) is an equivalence, we can simply say that happly has funext as right and left inverses, and the proof is trivial in both directions (line 675).

With the infrastructure so far we can’t prove a whole lot, though. For instance, we have prove that singletons are contractible, but this doesn’t freely get us axiom J; Neither can we prove that every property respects equalities, or anything like that. For that sort of proof, we need to introduce a transport operation, which, given the left endpoint of a path of types, returns the right endpoint. However, cubical type theory refuses to be simple.

Quick sidenote, path application corresponds to the eliminator for I\mathbb{I}, since it conceptually has the type in the box below. We use here the type of dependent paths, PathP.

iElim : {A : I -> Type} {x : A i0} {y : A i1} -> PathP A x y
     -> (i : I) -> A i
iElim p i = p i

Simplicity is disallowed

While providing a primitive transp:(A:I)UA(i0)A(i1)\mathrm{transp} : \prod{(A : \mathbb{I}) \to \mathscr{U}} \to A(i0) \to A(i1) might seem like all we need to make paths a sensible notion of equality, reality is not that simple. In particular, transport on paths is hard to define with such an operation, so, as is tradition in type theory, we make things simpler by making them more general. Rather than providing a primitive transport, we provide a primitive composition operation, which generalises transport and composition of paths.

Composition expresses the funny-sounding principle that “every open box has a lid”. No, that is not a joke; That’s actually what we’re talking about. A description in (almost!) English would be to say that composition, given a shape, a partial cube of that shape, and a face (which must agree with the partial cube), returns the opposite face. If you think that description is nonsensical, strap in, because interpreting it type-theoretically requires another 67 lines of definitions in the code! For reference, the almost 2000 words which precede this paragraph covered roughly 20 lines of actual code.

Crying over what I still have to write won’t help me get this blog post out any sooner though, so let’s get to it.

Cofibrations

Again that god damn word. In addition to the interval object, to define a cubical model of type theory, we need a notion of cofibration, which is a fancy way of saying “shape of a partial cube”. In the papers which introduced cubical type theory, they use a “face lattice”, F\mathbb{F}. However, this turns out to be needlessly complicated, as we can get this structure from the interval object.

To each element ϕ:I\phi : \mathbb{I} (referred to as a formula) we assign a cofibrant proposition [ϕ]:Uω[\phi] : \mathscr{U}_{\omega}3 which is inhabited when ϕ=i1\phi = i1. In the code, we write IsOne phi for [ϕ][\phi] and it is inhabited by a distinguished element itIs1 : IsOne i1. This family of types is definitionally proof-irrelevant, which means that any two inhabitants of IsOne phi are equal.

A note on terminology

Throughout the rest of this post I’ll refer to elements of the interval as either “endpoints” or “formulas” depending on how they’re used. These aren’t technical terms, and are meant to be just indicative. The convention is roughly that, if i:Ii : \mathbb{I} is used as the argument to a path, or to a filler, or it’s the bound variable in a composition (or etc), it’s called an endpoint; If it’s used to denote a restriction (i.e., there might reasonably be an element of [ϕ][\phi] in the context), it’s called a formula.

Also I apologise for the garbled terminology (or even ambiguity) when talking about [ϕ][\phi] vs ϕ\phi, since both can reasonably be called formulas.

We can interpret these propositions as being shapes of partial cubes. For instance, the proposition [i¬i][i \lor \neg i] (for i:Ii : \mathbb{I}) represents a “line” which is defined when i=i0i = i0 or i=i1i = i1, but not in the middle; This isn’t a line as much as it is a pair of points.

Thinking back to the “human-readable” description of the composition operation, the proposition ϕ\phi specifies the shape of the open box, but not the box itself.

Partial Elements

We call a function f:[ϕ]Af : [\phi] \to A a partial element of AA, that is, an element of AA which is only defined when [ϕ][\phi] is inhabited. For these we have a special pattern-matching notation, termed a system, which is written between brackets.

partialBool : (i : I) -> Partial (ior i (inot i)) Bool
partialBool = \i [ (i = i0) -> false, (i = i1) -> true ]

The element partialBool above is a boolean with different values when i = i0 or i = i1. However, this does not lead to a contradiction, because to extract the underlying bool we need to apply partialBool not only to an element of the interval, but also to an inhabitant of IsOne (ior i (inot i)). This is why it’s critical that the type checker distinguishes between i¬ii \lor \neg i and i1i1!

As another implementation note, the type Partial phi A is a version of IsOne phi -> A with a more extensional equality. Two elements of Partial phi A are equal when they represent the same subcube, i.e., they take equal values for every assignment of variables which makes phi = i1.

Furthermore, there is a dependent version of Partial, PartialP4, which allows the type A itself to be a partial element of U\mathscr{U}. This will be used later when we introduce the glueing operation.

In the composition operation, the partial element with shape ϕ\phi specifies the open box itself.

Extensibility

Given a type AA and a partial element u:[ϕ]Au : [\phi] \to A, we can define the type of elements a:Aa : A which extend uu. These are total elements, in that their existence does not depend on the inhabitation of [ϕ][\phi] (for any ϕ\phi). To say they extend uu is to say that, given [ϕ][\phi], we have that u(1is1)=au(\mathtt{1is1}) = a. In the theory, where we have All the fancy symbols, we write A[ϕu]A[\phi \to u] for the type of extensions of uu, but in the code, where we’re limited to boring ASCII, we just write Sub A phi u.

We can make any total element u : A into a partial element, with any formula that we want, by ignoring the proof. The constructor inS for the Sub-types expresses that this partial element agrees with u on any phi that we choose.

inS : {A : Type} {phi : I} (u : A) -> Sub A phi (\x -> u)

We also have a projection operation for Sub types, which undoes inS. Furthermore, outS {A} {i1} {u} x computes to u i1 itIs1, since x agrees with u on phi.

outS : {A : Type} {phi : I} {u : Partial phi A} -> Sub A phi u -> A

With the idea of a cubical Subtype we can express the type of the fourth argument of the composition operation, the “bottom” face of an open box with agrees with (extends!) the partial element specifying the sides.

Composition

As stated before, the composition operation takes as input the description of an open cube with a face removed and computes that missing face. However this is not a helpful definition if we do not yet have intuition for what “cubes with missing faces” look like! So before explaining the computational behaviour of the composition operation (which is… quite something), let me show you some examples.

Before we get to the examples, for reference, this is the type of the composition operation, written out in syntax:

comp : (A : I -> Type) {phi : I} (u : (i : I) -> Partial phi (A i))
    -> (a0 : Sub (A i0) phi (u i0))
    -> A i1

A trivial use of composition is one where we take the formula ϕ=i0\phi = i0, that is, the partial cube specifying the sides is defined nowhere. In this case we may illustrate the input face of the composition operation as agreeing with… nothing.

transport, illustrated
The arguments to comp A {i0} (\k []), illustrated.

That’s right, in the case where the formula is always false and the partial cube is empty, the input of the composition operation is just a point a0 : A i0, the left endpoint of a path. And by looking at the type of the composition operation, or thinking about its description, you can see where this is going! We give it a0 : A i0, and it gives us an element comp A {i0} (\k []) a0 : A i1!

That’s right, by ignoring the extra power which the composition operation gives us over boring transport, we get back boring transport. Not too surprising, let’s keep going.

For an example which illustrates composition with a cube, suppose we have three points, xx, yy, and zz, all in some type AA. Furthermore suppose that we have paths p:xyp : x \equiv y and q:yzq : y \equiv z. By the transitive property of equality, we know there should be a path between yy and zz. Furthermore, we know that transporting along this composite should be equivalent to transporting along pp then along qq. But how can we, using cubical methods, build the composite of pp and qq?

If you guessed the answer was “using composition”, you… don’t get a lot of extra points. It was heavily implied. But you can still have a cookie, since I suppose it can’t be helped. To create this composite we need to draw a square with 3 lines, such that the missing line connects xx and zz. Furthermore, the requirement that transporting along the composite transports along both constituent paths will guide us in creating this drawing. We only have two paths, though!

composition of paths
The lid of this square gives us the composite qpq \circ p of pp and qq.

Turns out that only having two paths is not an issue, since we can always take the reflexivity path to get the side we didn’t have. To make it clearer, the partial element u:(j:I)Partial (¬ii) Au : (j : \mathbb{I}) \to \mathrm{Partial}\ (\neg i \lor i)\ A is the tube with sides aa and q(j)q(j), and the input p(i):Ap(i) : A is the bottom side. These agree because when jj (the direction of composition) is i0i0 (the base), uu has left endpoint aa and right endpoint bb; A path between these is exactly what p(i)p(i) (ii is the direction of the path) is.

trans : {A : Type} {x : A} {y : A} {z : A}
     -> Path x y
     -> Path y z
     -> Path x z
trans {A} {x} p q i =
  comp (\i -> A)
    {ior i (inot i)}
    (\j [ (i = i0) -> x, (i = i1) -> q j ])
    (inS (p i))

This expression is a syntactic representation of the composition drawn above; The dotted line in that diagram is the result of the composition operation.

Cubical Complication 2: Computing Compositions

It doesn’t suffice to describe the composition operation in types, we also need to describe how it computes when applied to enough arguments. The composition operation reduces to a canonical element of the type A(i1)A(i1) based on the structure of the function A:IUA : \mathbb{I} \to \mathscr{U}, by cases. For example, when AA computes to a function type, the composition will evaluate to a lambda expression; When AA is a \sum-type, it computes to a pair, etc.

Before we get started, one thing to note is that, since we have the iji \land j operation on elements of the interval, the composition operation can compute not only missing faces, but the missing inside of a cube, which we call its filler. For instance, the filler fill A {i0} (\k []) a0 i connects a0 and comp A {i0} (\k []) a0 in the i direction, since it is the 1-dimensional cube (path) between the given and missing faces.

fill : (A : I -> Type) {phi : I}
       (u : (i : I) -> Partial phi (A i))
       (a0 : Sub (A i0) phi (u i0))
    -> (i : I) -> A i
fill A {phi} u a0 i =
    comp (\j -> A (iand i j))
         {ior phi (inot i)}
         (\j [ (phi = i1) as p -> u (iand i j) p, (i = i0) -> outS a0 ])
         (inS (outS a0))

Fillers will be fundamental in reducing compositions in dependent types, including pairs, functions, and general inductive types.

Simple types

A good place to start is composition for inductive types without parameters, since that is trivial. For instance, any composition in the booleans just evaluates to argument. This is also the case for many other types: the natural numbers, the integers, the rational numbers, etc.

comp (λi.Bool) [ϕu] a0=a0\mathrm{comp}\ (\lambda i. \mathrm{Bool})\ [\phi \to u]\ a0 = a0

For parametrised types like lists, we need to explain composition by recursion. In the nil case it’s trivial, we can just return nil. In the cons case, though, we need to recursively apply composition in the head and the tail, to end up with a list of the right type, agreeing with the right faces.

comp (λi.List(A)) [ϕcons x xs] (cons a as)=cons (comp (λi.A)[ϕx] a)(comp (λi.List(A))[ϕxs] as) \mathrm{comp}\ (\lambda i. \mathrm{List}(A))\ [\phi \to \mathtt{cons}\ x\ xs]\ (\mathtt{cons}\ a\ as) =\\ \mathtt{cons}\ (\mathrm{comp}\ (\lambda i. A) [\phi \to x]\ a) (\mathrm{comp}\ (\lambda i. \mathrm{List}(A)) [\phi \to xs]\ as)

Dependent functions

Starting with the full reduction rule for composition in functions would be a lot, so I’ll build it up incrementally. First, I’ll explain transport in simple functions. Then, transport in dependent functions. After I’ve explained those two we can add back the sides to get the full composition for functions.

So, consider for starters transport in a line of ABA \to B, where both are functions IU\mathbb{I} \to \mathscr{U}. We’re given a function f:A(i0)B(i0)f : A(i0) \to B(i0) and want to compute a function f:A(i1)B(i1)f : A(i1) \to B(i1). Start by introducing a λ\lambda abstraction binding a single variable x:A(i1)x : A(i1), under which we’ll work.

Since to get any sort of element of BB we need to apply ff, we must first transport xx to get an element of A(i0)A(i0), to be the argument of ff. The line λi.A(¬i)\lambda i. A(\neg i) connects A(i1)A(i1) and A(i0)A(i0), so that’s what we transport over. Take x=comp (λi.A(¬i)) (λk[]) xx\prime = \mathrm{comp}\ (\lambda i. A (\neg i))\ (\lambda k [])\ x.

The application f xf\ x\prime has type B(i0)B(i0), and we need to transport that to an element of B(i1)B(i1). Again we invoke the trivial composition to get y=comp B (λk[]) (f x)y = \mathrm{comp}\ B\ (\lambda k [])\ (f\ x\prime). Since we have computed an element of B(i1)B(i1), we’re done; Define the composition Thus, we can take comp (λi.AB) (λk[]) f=λx.y\mathrm{comp}\ (\lambda i. A \to B)\ (\lambda k [])\ f = \lambda x. y.

To see the details of how composition generalises to dependent functions, consult the appendix, since it’s a bit verbose to be here.

Dependent pairs

The composition for pairs is what you’d expect. We have to transport the first element of the pair, and use a filler when transporting the second element to make sure the endpoints line up. Again, the details are in the appendix if knowing more about composition strikes your fancy, but it’s not too necessary to follow the proofs.

To be concise here, a simple equation that should clarify the behaviour of transport on pairs is the simply-typed definition of transport:

transp (λi.A×B) (x,y)=(transp (λi.A) x,transp (λi.B) y) \mathrm{transp}\ (\lambda i. A \times B)\ (x, y) =\\ (\mathrm{transp}\ (\lambda i. A)\ x, \mathrm{transp}\ (\lambda i. B)\ y)

Paths

In the case of paths, composition is composition. We’re given a path p0:Path A u vp0 : Path\ A\ u\ v, where all of AA, uu and vv can depend on a variable i:Ii : \mathbb{I}, which is the direction of composition. Furthermore we have a family of partial paths pp with which p0p0 agrees, and with which the result must also agree.

We start by assuming the existence of a dimension j:Ij : \mathbb{I}, which will be bound later. When j=i0j = i0, the resulting composition has to have value u(i)u(i), and when j=i1j = i1, the result must be v(i)v(i). Furthermore, when phiphi, the result must have the same value as p(j)p(j). We can package these constraints straightforwardly in the partial element [ϕp(j),(j=i0)u,(j=i1)v][ \phi \to p(j), (j = i0) \to u, (j = i1) \to v ], again abusing notation for the applications of u(i)u(i) and v(i)v(i).

comp (λi.Path A(i) u(i) v(i)) [ϕp] p0=λj.comp A [ϕp(j),(j=i0)u,(j=i1)v] (p0(j))\mathrm{comp}\ (\lambda i. \mathrm{Path}\ A(i)\ u(i)\ v(i))\ [\phi \to p]\ p0 =\\ \lambda j. \mathrm{comp}\ A\ [ \phi \to p(j), (j = i0) \to u, (j = i1) \to v ]\ (p0(j))

A note on naming: Pretypes

All of the types we explained composition for above are, well, types. In cubical type theory, or at least in this presentation, we reserve the word type for those objects which have a composition structure. The ones which don’t have a composition structure are called pretypes.

Alternatively we could call the types for which we have composition the fibrant types, since they have a fibrancy structure, as in the CHM paper: They have a transport structure and a homogenous composition structure, with which we can assemble a composition structure as above.

All of the type formers inherited from MLTT (\prod and \sum), the path types, and every inductive and higher inductive type made out of types are fibrant, leaving only the cubical primitives (the interval, partial elements, and cubical subtypes) as pretypes. However, we could consider an extension of type theory where both sorts are given equal importance: This would be a two-level type theory, a realisation of Voevodsky’s Homotopy Type System.

Auxiliary Definitions

In this section we’re going to talk about a handful of operations, which can be defined in terms of what we have so far, which will be used in discussing the Glue\mathrm{Glue} types, which are used in interpreting the univalence axiom. In contrast to the CCHM paper, I’ll only talk about the notions which are mandatory for defining the glueing operation. Composition for glue is very complex, and needlessly detailed for the purposes of this post.

Contractible Types

We define a type AA to be contractible if, and only if, there exists an element x:Ax : A (called the centre of contraction) to which all other elements y:Ay : A are Path-equal. Cubically, we can give an alternative formulation of contractibility: AA is contractible iff. every partial element u:Partial ϕ Au : \mathrm{Partial}\ \phi\ A is extensible.

Let pp be the proof that AA is contractible, a pair containing the centre of contraction and the proof that any element of the type is equal to the centre. We define contr [ϕu]=comp (λi.A) [ϕ(p.2 u)(i)] p.1\mathrm{contr}\ [\phi \to u] = \mathrm{comp}\ (\lambda i. A)\ [\phi \to (p.2\ u)(i)]\ p.1.

Conversely, if we have an extension for any partial element, we can prove that type is contractible in the typical sense: Take the centre of contraction to be contr []\mathrm{contr}\ [] and the proof that any yy is equal to that is given by extending the partial element [(i=i0)contr [],(i=i1)y][ (i = i0) \to \mathrm{contr}\ [], (i = i1) \to y].

As an example of contractible types, we have already seen Singl A a, the type of “elements of A equal to a”. This has a centre at (a, refl), which can be proven by a connection. The unit (or top) type is also contractible, having tt as a centre, which can be proven by induction. It can be proven that any contractible type is equivalent to the unit type, making all of them maximally uninteresting.

Equivalences

Since we have the univalence axiom, it is important for soundness that we define a notion of equivalence for which “being an equivalence” is a mere proposition: Either a function is an equivalence, or it isn’t. We choose one which is cubically convenient, namely that of “contractible fibers”.

The fiber of a function f:ABf : A \to B at a point y:By : B is a pair of an input x:Ax : A together with a proof that f(x)yf(x) \equiv y. We define ff to be an equivalence if for every element y:By : B, the fiber fiber f y\mathrm{fiber}\ f\ y is contractible. That means that, for every element in the range, there is a corresponding element in the domain, and this element is unique.

Using this notion of equivalence we can prove that every equivalence has an inverse, by taking the first element of the centre of contraction for every fiber:

inverse : {A : Type} {B : Type} {f : A -> B} -> isEquiv f -> B -> A
inverse eqv y = (eqv y) .1 .1

Furthermore, this function is an actual inverse:

section : {A : Type} {B : Type} (f : A -> B) (eqv : isEquiv f)
       -> Path (\x -> f (inverse eqv x)) id
section f eqv i y = (eqv y) .1 .2 i

We can also formulate the requirement that a function has contractible fibers cubically: A function is an equivalence iff every one of its partial fibers is extensible.

Glueing & Univalence

Since I like quoting the impenetrable definitions of the paper, glueing expresses that “extensibility is invariant under equivalence”. Concretely, though, it’s better to think that the Glue\mathrm{Glue} operation “glues” together a partial type TT onto a total type AA (which we call the base) to get a total type which extends TT. We can’t do this freely, though, so we require an extra datum: A (partial) equivalence between TT and AA.

Glue : (A : Type) {phi : I} -> Partial phi ((T : Type) * Equiv T A) -> Type

The type Glue A [ϕ(T,f)]\mathrm{Glue}\ A\ [\phi \to (T, f)] extends TT in the sense that, when ϕ=i1\phi = i1, Glue A [ϕ(T,f)]=T\mathrm{Glue}\ A\ [\phi \to (T, f)] = T.

The “user-friendly” typing rule for Glue is as presented above. Internally we separate the type TT from the equivalences ff to make defining composition in Glue simpler. These types come with a constructor, glue\mathrm{glue}, which says that, given an inhabitant t:PartialP ϕ Tt : \mathrm{PartialP}\ \phi\ T, and a total element a:Aa : A which extends the image of f tf\ t (the equivalence), we can make an inhabitatnt of Glue A [ϕ(T,f)]\mathrm{Glue}\ A\ [\phi \to (T, f)].

Conversely there is a projection, unglue\mathrm{unglue}, which extracts a value of AA from a value of Glue A [ϕ(T,f)]\mathrm{Glue}\ A\ [\phi \to (T, f)]. When applied to an element constructed with glue\mathrm{glue}, unglueing simply extracts it; When applied to a neutral value, as long as ϕ=i1\phi = i1, the value of the glued type will be a value of TT, and the equivalence is defined; We can then apply the equivalence to get a value of AA.

Using the boundary conditions for Glue\mathrm{Glue} we can define, from any equivalence ABA \simeq B, a path ABA \equiv B.

univalence : {A : Type} {B : Type} -> Equiv A B -> Path A B
univalence {A} {B} equiv i =
  Glue B (\[ (i = i0) -> (A, equiv),
             (i = i1) -> (B, the B, idEquiv {B}) ]) 

For the proof that transporting along this path has the effect of applying the equivalence, I’ll need to handwave some stuff about the behaviour of transport in Glue\mathrm{Glue}. First, we can illustrate the Glue done above as the dotted line in the square below:

reflexivity
This square represents the glueing used for univalence. The left and right sides are equivalences.

How would one go about transporting an element across the dotted line there? Well, I have a three-step program, which, since we’re talking about squares, has to be rounded up to a neat 4. Suppose we have an element x:Ax : A which we want to turn into an inhabitant of BB.

We’d be done here, but since transport is a special case of composition, we need to compose along the line λi.B\lambda i. B with the faces of the overall composition to get a proper element of the type BB. Of course, in this case, the faces are trivial and the system is empty, but we still have to do it.

To construct a Path (transp (λi.univalence f i)) f.1\mathrm{Path}\ (\mathrm{transp}\ (\lambda i. \mathrm{univalence}\ f\ i))\ f.1, there is a bit of cubical trickery which needs to be done. This proof is commented in the repository here, so I recommend you read it there for the details. The short of it is that univalence\mathrm{univalence} plus this path, which we call univalenceβ\mathrm{univalence}\beta, implies the full univalence axiom, namely that (AB)(AB)(A \simeq B) \simeq (A \equiv B).

Proofs using univalence

With univalence, and a proof that isomorphisms give rise to equivalences, we can get to proving some stuff about types! That’s exciting, right? I’m excited. The proof that isomorphisms give rise to equivalences is, uh, very complicated, so I won’t explain it here. Full disclosure, it seems like this proof is a bit of folklore: I got it from the cubicaltt repo, and I think the version in Cubical Agda’s base library is the same!

One very simple use of univalence, which doesn’t require more fancy types, is proving that the universe U\mathscr{U} is not a set, in the sense of HoTT. Recall that a set (or h-set, to be more precise), is a type where any parallel equalities are themselves equal. In a type:

isHSet : Type -> Type
isHSet A = {x : A} {y : A} (p : Path x y) (q : Path x y) -> Path p q

We are going to prove that any inhabitant of isHSet U\mathrm{isHSet}\ \mathscr{U} is baloney. For this, we must define the type of booleans, the discrete space with two points:

data Bool : Type where
  true : Bool
  false : Bool

First, we can prove that true≢false\mathrm{true} \not\equiv \mathrm{false}. For this, suppose it were: Given a proof p:truefalsep : \mathrm{true} \equiv \mathrm{false}, we can build the path λi.if p(i) then Bool else \lambda i. \mathrm{if}\ p(i)\ \mathrm{then}\ Bool\ \mathrm{else}\ \bot, which connects Bool\mathrm{Bool} (an arbitrary choice) and \bot. Transporting true\mathrm{true} (another arbitrary choice) along this path gives us an inhabitant transp (λi.) true:\mathrm{transp}\ (\lambda i. \dots)\ true : \bot, which is what we wanted.5 \blacksquare

Define the function not x=if x then false else true\mathrm{not}\ x = \mathrm{if}\ x\ \mathrm{then}\ \mathrm{false}\ \mathrm{else}\ \mathrm{true}. By induction, one can prove that not (not x)x\mathrm{not}\ (\mathrm{not}\ x) \equiv x for any boolean, and thus not\mathrm{not} is its own inverse. Appealing to the fact that isomorphisms are equivalences, and then to univalence, we get a path notp:BoolBool\mathrm{notp} : \mathrm{Bool} \equiv \mathrm{Bool} such that transp notp x=not x\mathrm{transp}\ \mathrm{notp}\ x = \mathrm{not}\ x.

Now we assume an inhabitant sure\mathrm{sure} (to be read in a very sarcastic voice) of isHSet U\mathrm{isHSet}\ \mathscr{U} and derive a contradiction, that is, an inhabitant of \bot. The path sure notp refl\mathrm{sure}\ \mathrm{notp}\ \mathrm{refl} connects notp\mathrm{notp} and refl\mathrm{refl} in the direction ii. From this we build the path λi.transp (sure notp refl)(i) false\lambda i. \mathrm{transp}\ (\mathrm{sure}\ \mathrm{notp}\ \mathrm{refl})(i)\ \mathrm{false}, which has as endpoints truetrue and falsefalse. To see this, compute:

Applying the proof that true≢false\mathrm{true} \not\equiv \mathrm{false} we have a contradiction, which is exactly what we wanted.6\blacksquare

“Big deal,” I hear you say. “So what, the universe isn’t a set?” Well, you’re right. This isn’t an exciting fact, or an exciting proof. To read. Getting this to go through was incredibly satisfying. But if we want to prove non-trivial facts using univalence, we’re going to need a bigger boat universe. Ours doesn’t have enough types.

Higher Induction

To say that our universe U\mathscr{U} with its infinitely many types is lacking some is… weird, I’ll admit. However, it’s missing a lot of them! A countably infinite amount, in fact. While we have all inductive types, we only have the zero-dimensional inductive types, and not the higher inductive types!

I’ve written about these before a bit in the previous post, about induction. In short, while inductive types allow us to define types with points, higher inductive types let us define types with points and paths. Full disclosure, of time of writing, the implementation of HITs in cubical is partial, in that their fibrancy structure is a big error. However we can still write some simple proofs involving them.

The Interval

Wait, didn’t we talk about this before? No, no, this is the right interval. We’re still on track.

The Interval\mathrm{Interval} is the inductive type freely generator by two constructors, ii0\mathrm{ii0} and ii1\mathrm{ii1}, and a path seg\mathrm{seg} connecting them. Well, that’s the theory, but the reality is a bit different. In order to support eliminating (read: pattern matching on) inductive types, we can’t simply assume paths exist, even in cubical type theory. What we end up with instead is a constructor parametrised by some interval (that’s I\mathbb{I}) variables, and an attached boundary condition.

In the case of the Interval, we have this definition:

data Interval : Type where
  ii0 : Interval
  ii1 : Interval
  seg i : Interval [ (i = i0) -> ii0
                   , (i = i1) -> ii1
                   ]

This says that seg i0 is definitionally equal to ii0, and seg i1 is definitionally equal to ii1. We can get a path connecting them by abstracting over the ii variable: λi.seg i:Path ii0 ii1\lambda i. \mathrm{seg}\ i : \mathrm{Path}\ \mathrm{ii0}\ \mathrm{ii1}. To pattern match on an element of the interval we need three (really, four, but one is details—and automated) things:

To express the type of cseg, we need to power up our path types a bit. Conceptually, just like a Path\mathrm{Path} is a specialised version of IA\mathbb{I} \to A, we need a dependent path, called PathP\mathrm{PathP}, which specialises (i:I)A i\prod{(i : \mathrm{I})} A\ i, that is, the type of the endpoints is allowed to depend on the interval variable. With that, the type of p becomes PathP (\i -> P (seg i)) c0 c1, since c0 : P (seg i0) and c1 : P (seg i1).

As for that fourth thing I mentioned? In addition to preserving each of the constructor data, a map between Interval-algebras needs to be fibrancy preserving: Compositions in the domain are mapped to the “appropriate” compositions in the range. In implementations of cubical type theory, this is automatic, since the range has a fibrancy structure (since it is in U\mathscr{U}), and preserving compositions can be done automatically and uniformly.

Since we already have an interval pretype I\mathbb{I}, having an interval type isn’t too interesting. One thing we can do is prove function extensionality… again… reproducing an argument from the HoTT book.

iFunext : {A : Type} {B : A -> Type}
          (f : (x : A) -> B x)
          (g : (x : A) -> B x)
       -> ((x : A) -> Path (f x) (g x)) -> Path f g
iFunext f g p i = h' (seg i) where
  h : (x : A) -> Interval -> B x
  h x = \case
    ii0 -> f x
    ii1 -> g x
    seg i -> p x i

  h' : Interval -> (x : A) -> B x
  h' i x = h x i 

I’m pretty sure that I had reproduced this proof in the previous blog post as well, so you can check there for a more thorough explanation. Let’s move on to some more exciting higher inductive types.

Synthetic Homotopy Theory: S1\mathbb{S}^1

I am not a homotopy type theorist, but I am a homotopy type theorist, which means I am qualified to prove some facts about spaces. A particularly simple space, which is nonetheless non trivial, is the circle, S1\mathbb{S}^1, the type freely generated by a point and a loop.

data S1 : Type where
  base : S1
  loop i : S1 [ (i = i1) -> base, (i = i0) -> base ]

We can illustrate this type like this:

The circle
The circle.

The elimination principle for this is just like for the interval. We need a point b : P base and a dependent path l : PathP (\i -> P (loop i)) b b (since loop i0 = loop i1 = base the dependent path is not strictly necessary). For example, to define a function S1U\mathbb{S}^1 \to \mathscr{U}, we need to pick a type X:UX : \mathscr{U} and a path XXX \equiv X. All non-trivial paths in types are going to be generated by univalence on some interesting equivalence.

Allow me one paragraph’s worth of digression before we get to the point. The type of integers is defined as the coproduct of N+N\mathbb{N} + \mathbb{N}7, were inl x\mathrm{inl}\ x is interpreted as +x+x and inr x\mathrm{inr}\ x is (x+1)-(x + 1). With this representation, one can define the functions sucZ=x+1\mathrm{sucZ} = x + 1 and predZ=x1\mathrm{predZ} = x - 1, and prove that they are inverses, such that sucZ\mathrm{sucZ} is an autoequivalence of Z\mathbb{Z}.

Consider the function helix:S1U\mathrm{helix} : \mathbb{S}^1 \to \mathscr{U} which maps base\mathrm{base} to Z\mathbb{Z} and loop(i)\mathrm{loop}(i) to (univalence sucZ)(i)(\mathrm{univalence}\ sucZ)(i). It’s easy to check that this definition is type-correct (and boundary-correct), so we can apply it to elements of the circle and get back types and equivalences. Now we can define the function winding:basebaseZwinding : \mathrm{base} \equiv \mathrm{base} \to \mathbb{Z} by

winding : Path base base -> Int
winding p = transp (\i -> helix (p i)) (pos zero)

This map counts, for any loop x:basebasex : \mathrm{base} \equiv \mathrm{base}, the number of times x “goes around” the loop\mathrm{loop}. For example, going around it once:

windingLoop : Path (winding (\i -> loop i)) (pos (succ zero))
windingLoop = refl

or once in the other direction:

windingSymLoop : Path (winding (\i -> loop (inot i))) (neg zero)
windingSymLoop = refl

or no times at all:

windingBase : Path (winding (\i -> base)) (pos zero)
windingBase = refl

If we also write a function wind:Zbasebasewind : \mathbb{Z} \to \mathrm{base} \equiv \mathrm{base} and prove that they are inverses, what we end up with is a fully synthetic, machine-checked proof that Ω(S1)Z\Omega(\mathbb{S}^1) \equiv \mathbb{Z}. Of course, we could also define Z\mathbb{Z} as Ω(S1)\Omega(\mathbb{S}^1), but in that case the proof is a lot less interesting!

Category Theory: The Homotopy Pushout

Category theory has the notion of limits and colimits of diagrams, which give rise to lots of important concepts. A full explanation of colimits is not due here, but it should suffice to say that if we want to do mathematics internally to cubical type theory, a complete and co-complete category is a fine setting to do it. Given a diagram like the one on the left, a cocone under it is a diagram like the one on the right, which commutes. The pushout of a span is its colimt, that is, the “smallest” such cocone.

Span
A span is a triple of types AA, BB, CC with maps f:ABf : A \to B and g:ACg : A \to C
Colimit of a span
A cocone under a span is a type PP and inclusions i1:BPi_1 : B \to P and i2:CPi_2 : C \to P such that i1f=i2gi_1 \circ f = i_2 \circ g

Normal Martin-Löf type theory does not give us the tools to define pushouts, but, as you will have guessed, cubical type theory does. We can define pushouts as a higher inductive type, like this:

data Pushout {A B C : Type} (f : A -> B) (g : A -> C) : Type where
  inl : (x : B) -> Pushout f g
  inr : (y : C) -> Pushout f g
  push i : (a : A) -> Pushout f g [ (i = i0) -> inl (f a)
                                  , (i = i1) -> inr (g a) ]

The push path constructor is parametrised by an element a:Aa : A and an endpoint i:Ii : \mathbb{I}. Applying function extensionality, one can turn this into a path between finlf \circ \mathrm{inl} and ginrg \circ \mathrm{inr}, which is what we need for the diagram to commute. Homotopy pushouts are very general and can be used to define a number of homotopy-theoretic constructions. Quoting the HoTT book, section 6.8, we have:

  • The pushout of 1A11 \leftarrow A \to 1 is the suspension ΣA\Sigma A
  • The pushout of AA×BBA \leftarrow A \times B \to B is the join of AA and BB, written ABA * B
  • The pushout of 1AfB1 \leftarrow A \xrightarrow{f} B is the cone or cofiber of ff

The big file with all the proofs in cubical features a proof that the suspension ΣA\Sigma A defined directly as a HIT is the same as the one defined by the pushout of 1A11 \leftarrow A \to 1.

But Why?

The motivation for cubical type theory was made explicit two posts ago, when I was talking about equality for the first time, but it’s worth mentioning it again, especially after all8 of its complexity has been exposed like this. And let me be clear, it is very complex. No amount of handwaving away details can make cubical type theory seem like a “natural” extension: It’s not something we found, like the groupoid interpretation of type theory. It’s something we found.

And what did we find? A type system with great computational behaviour for all of Homotopy Type Theory. In particular, an argument based on the cubical set model of type theory, rather than on the syntax, proves that cubical type theory enjoys canonicity: Every boolean in the empty context is strictly equal to either true\mathrm{true} or false\mathrm{false}, and other types enjoy similar properties for their canonical elements.

The big failing of Homotopy Type Theory before the cubes came to save us was that there were closed inhabitants of types not equal to any of their constructors. In particular, any construction with path induction would get stuck on the terms ua(e)\mathrm{ua}(e) of the univalence axiom. Cubical type theory solves this twofold: It gives us ways of working with paths directly, using operations on the interval and composition, and it explains what the computational behaviour of univalence\mathrm{univalence} is.

So, if you ask me, the complexity is justified. It’s one of those things that took me a while to get my head around, but where the learning process and the result (knowing about cubes) were beneficial. And god, did it take a while. The first time I encountered the cubical type theory paper was in mid 2019, almost two years ago! It took me that long to go from “what the hell is this” to “this is neat but it confuses me” to “I understand this” to “I can implement this” (we are here).

Writing about it has been my white whale for that long—I’ll need a new one, suggestions welcome! Maybe I should write a monad tutorial? Heard those are complicated, too.

If you made it this far, I thank you deeply. This post is a behemoth! In fact the next word is the 7000th, which almost makes this post longer than my two previous longest posts combined! If you haven’t abandoned me yet, I swear: I will never make you read this much again. However, if you made it this far and understood everything, I only have one thing to say: Go forth, dear reader, and fill those cubes.

Well, that sounded weird. I won’t say it again.


Appendix: Blog/EXTRA CCC

CCC stands for “computing cubical compositions”, I’m so sorry

Functions

Now we add one step of generalisation, and consider transport in a line of (x:A)B x\prod{(x : A)} B\ x, where A:IUA : \mathbb{I} \to \mathscr{U} as before but B:(i:I)A(i)UB : \prod{(i : \mathbb{I})} \to A(i) \to \mathscr{U}. A given f:(x:A(i0))B(i0) xf : (x : A(i0)) \to B(i0)\ x will become, through a trick of magic, a function (x:A(i1))B(i1) x(x : A(i1)) \to B(i1)\ x.

The first step is to define x:A(i0)x\prime : A(i0) as before, apply ff to get an element B(i0) xB(i0)\ x\prime, then cast the result of the application along λi.B(i)\lambda i. B(i)… Wait. The function is dependent. Can we cast along λ.B(i) x\lambda. B(i)\ x? No, not quite. x:A(i1)x : A(i1), but we need an element of A(i)A(i). λ.B(i) x\lambda. B(i)\ x\prime won’t do either, since that has type A(i0)A(i0).

What we need is a line, dependent on ii, which connects xx\prime and xx, call it pp; Then we can transport along λi.B(i) p(i)\lambda i. B(i)\ p(i) to get the element of B(i1) xB(i1)\ x\prime which we want. The filler of the composition which generated xx\prime is exactly what we need. Define v(i)=fill (λi.A(¬i)) (λj[]) xv(i) = \mathrm{fill}\ (\lambda i. A (\neg i))\ (\lambda j [])\ x, so that we may define y=comp (λi.B(i) v(i)) (λk[]) (f x)y = \mathrm{comp}\ (\lambda i. B(i)\ v(i)) \ (\lambda k [])\ (f\ x\prime), and the composition is λx.y\lambda x. y as before.

To generalise this to non-empty compositions only requires a very small change. If you think of functions as extensional black boxes, like we do, one thing to realise is that it doesn’t really matter how we turn xx into an argument to the function, as long as we do; The only thing which needs to respect the constraints of the composition is the overall function, that is, its result. So we can simply take xx\prime, v(i)v(i) as in the case for dependent compositions and define the full composition to be:

comp (λi.(x:A(i))B(i) x) [ϕu] a0=λx.comp (λi.B(i) v(i)) [ϕu v] (a0 x) \mathrm{comp}\ (\lambda i. \prod{(x : A(i))} B(i)\ x)\ [\phi \to u]\ a0 =\\ \lambda x. \mathrm{comp}\ (\lambda i. B(i)\ v(i))\ [\phi \to u\ v]\ (a0\ x\prime)

Note the light abuse of notation we use in the mathematics; More properly, the system of sides in the resulting composition would be written λi x.u(i)(x) v(i)\lambda i\ x. u(i)(x)\ v(i).

Pairs

Assume, we’re given an element p:(x:A)B xp : \sum{(x : A)} B\ x, and take x=p.1x = p.1 and y=p.2y = p.2. Just like in the case for dependent functions, AA is a line and BB is a dependent line; What we want is an element p:(x:A(i1))B(i1) xp\prime : \sum{(x : A(i1))} B(i1)\ x\prime for some xx\prime.

To define comp (λi.(x:A(i))B(i) x) [ϕu] p\mathrm{comp}\ (\lambda i. \sum{(x : A(i))} B(i)\ x)\ [\phi \to u]\ p, first define v(i)=fill A [ϕu.1] xv(i) = \mathrm{fill}\ A\ [\phi \to u.1]\ x, which is a line connecting xx and comp A [ϕu.1] x\mathrm{comp}\ A\ [\phi \to u.1]\ x. For the second element we’ll do the same thing as we did for dependent functions, and define y=comp (λi.B(i) v(i)) [ϕu.2] yy\prime = \mathrm{comp}\ (\lambda i. B(i)\ v(i))\ [\phi \to u.2]\ y. Then we can define composition as follows:

comp (λi.(x:A(i))B(i) x) [ϕu] p=(v(i1),y)\mathrm{comp}\ (\lambda i. \textstyle\sum{(x : A(i))} B(i)\ x)\ [\phi \to u]\ p = (v(i1), y\prime)


  1. At the time of writing, in the very early AM between Saturday and Sunday, the only thing missing is the implementation of composition for higher inductive types. However, this is mostly because I’m hella bored of writing code and wanted to write words instead. This way I can have more fun!↩︎

  2. If you, like me, are always confused by why aba \land b is min and aba \lor b is max, check out these Desmos links: min and max. Keep these in mind the next time you’re confused :)↩︎

  3. As a funny sidenote, the object in a category (if it exists) which corresponds to the type-theoretical universe of propositions is called the subobject classifier, written Ω\Omega. So [][] is a family of maps IΩω\mathbb{I} \to \Omega_{\omega}. If only we could fit another Ω\Omega in there…↩︎

  4. By convention we call the dependent versions of cubical primitives their name suffixed with a big P. PathP, PartialP, etc. Don’t ask me why.↩︎

  5. No, there is no reason to use the QED symbol here. It’s my blog, though!↩︎

  6. I know, I know, I have to stop. Did you know I had to add the word “exactly” there so the paragraph overflew onto the next line and the QED symbol would show up right? It’s terrible!↩︎

  7. In the implementation, this definition is unfolded↩︎

  8. Actually, the most complex part of Cubical Type Theory is the definition of composition for Glue\mathrm{Glue}, which is far too hardcore for a blog post, even for its appendix.↩︎