Types, morphisms, and concepts in C++

03 Dec 2016

C++ is a marvelous language that allows one to express ideas in code. An idea is represented by a type (also known as class in C++). You cannot see an idea, but you can see its realizations in objects. Union of objects forms an idea. Type templates are morphisms of ideas. Intersection of types defines a concept.

Set theory turns out to be very well fitted for describing the foundations of C++. In this post, we will look at a variety of language constructs, including types, morphisms, and concepts, through the prism of set theory. You will see how beautiful, simple, and flexible these language facilities are, and how easy it is to write superb generic code once you understand the fundamental principles.

Types

We will start our journey with the basic building blocks of every C++ program: types and objects.

A type $T$ (also referred to as class in C++) is a set of objects $V$ (also referred to as values) together with a set of functions $F$ from $V$ and to $V$.

For an explanation why this definition makes sense, refer to Types in C++ as sets of objects. What is important for us here is that objects are immutable entities. The only thing you can mutate in a program is a variable, and you are mutating a variable by placing a different object inside it.

Type `Point`

The following little program defines and uses type Point.

#include <iostream>

struct Point {
  double x;
  double y;
  Point() : x{0}, y{0} {}
  void move_to(double a, double b) {
    x = a;
    y = b;
  }
};

void print(Point p) {
  std::cout << p.x << " " << p.y << std::endl;
}

int main() {
  Point p{};        // (1) default constructor
  p.move_to(4, 6);  // (2) member function
  print(p);         // (3) non-member function
}

`Point` objects

Every object of type Point (or $T$ for short) is uniquely identified by two numbers of type double. Therefore, the set of all objects of type Point is the Cartesian product of two copies of the set of real numbers, $V = \mathbb{R} \times \mathbb{R}$.

Type double is a finite-precision representation of the field of real numbers. I prefer mathematical notation, so I will often write $\mathbb{R}$ instead of double; it should not cause confusion.

`Point` functions

I want to emphasize that objects do not possess functions. We put some functions inside a type declaration for convenience, but those functions exist independently of objects. So, conceptually, there is only one function void Point::move_to(double, double) in our example, and all objects will call that function when they need it.

You may argue that void Point::move_to(double, double) will be inlined because it is defined inside the type declaration. You are correct, it will be inlined, but it will belong to the variable which holds an object and not to the object itself. So, when you are defining inline functions, you are making the box into which you are putting an object bigger and smarter. Bigger because it contains the code of the inline functions; smarter because it can do something with the object directly.

We have explicitly defined three functions. Let’s carefully specify domain and codomain of each of them.

Point::Point() is the default constructor. It seemingly has no arguments, but we should rather say that it takes a unit set as its input (read about Types and functions to learn why it is a good idea). We will denote the unit set and its element by $1$. What is the output of the constructor? It is an object of type Point—namely, the default Point. So, we have a well-defined function $f : 1 \rightarrow T$ that acts on the unit set $1$ and returns a fixed object $t \in T$. Although you might find a function that always returns the same object quite boring, the constructor plays a very important role.
void Point::move_to(double, double) is a member function. It seemingly produces no output; however, we should rather say that it returns an object $x \in T$. The reason is the following. Every member function of a type can only be called using a variable that holds an object of that type, e.g. p.move_to(4, 6);. Implicitly, a pointer to the variable is passed to the member function, such that the member function can put a new object into the variable. If we call this function $g$, then $g : \mathbb{R} \times \mathbb{R} \rightarrow T$.
void print(Point p) is an ordinary function that gets a Point as its input and returns a singleton (by our convention, void is the same as $1$). What is interesting about this function is that it is not a pure function—it changes something in the global state of the program; in particular, it prints some numbers on the screen. Such functions are somewhat messy to deal with if we insist on strictly following our functional approach. On the other hand, it doesn’t hurt too much to allow functions to do things on the side, provided that they don’t mess with the objects in our own code. Then, if we allow such side effects, we can write $h : T \rightarrow 1$.

All these functions belong to the set $F$ (i.e., not only member functions belong to a type). Thus, we have a complete definition of the type $T = (V, F)$; it consists of the set of objects $V = \mathbb{R} \times \mathbb{R}$ and the set of functions $F = \left\{ f, g, h \right\}$.

I have ignored some details to simplify exposition of the core ideas. For example, every type gets five implicitly generated functions by default, and they should all be included in $F$—not just the constructor that we explicitly defined. I also didn’t say a word about pointers and references, although they are very important in C++ programming. I omitted them because they do not add anything new to the discussion. Once you learn to discern between an object and a variable, you realize that pointers and references are simply different ways to refer to a variable, therefore they do not affect our discussion of objects and types.

Morphisms

The notion of a function is a very general one. If desired, you can even view sets as functions in a one-object category. We will not go that deep into category theory, but we will borrow some terminology. Certain mappings will be called morphisms in order to discern them from ordinary functions in C++.

In the previous section, we have seen several examples of functions already. In general, we can say that every function in C++ is a map from a finite Cartesian product of types to a type. (As already mentioned, we silently ignore side effects.) Symbolically,

$\begin{equation} \label{fun} f : T_1 \times \cdots \times T_n \rightarrow T. \end{equation}$

Till now, we only looked at functions like \eqref{fun}, that act on types. Now it’s time to ascend to the next level.

Type templates

Consider the class $\mathbb{T}$ of all types $T$. You may take a moment to meditate on how big it is. It encompasses all imaginable types! Previously, we said that an object is an element of a type, $x \in T$. Now, types themselves are elements of an even bigger entity, $T \in \mathbb{T}$.

A type template is a morphism $$ \begin{equation} \label{type_temp} \phi : T \times \mathbb{T} \rightarrow \mathbb{T}. \end{equation} $$

A type template is a morphism between types that can optionally take objects as arguments. Morphisms can be composed.

In C++, one has to discern between compile-time computations and run-time computations. Morphisms are executed at compile time; functions are executed at run time. That is why there are certain restrictions on template parameters and template arguments. See Templates as partial evaluation for more information.

`Point<float>`

A t-t (type -> type) morphism $\lambda : \mathbb{T} \rightarrow \mathbb{T}$. You may omit an argument in \eqref{type_temp}.

template <typename T>
struct Point {
  T x;  // use type parameter
  T y;
  Point() : x{0}, y{0} {}
  void move_to(T a, T b) {  // use type parameter
    x = a;
    y = b;
  }
};
// ...
int main() {
  Point<float> p{};  // pass type argument
  // ...
}

`Point<float, 1, 2>`

A too-t (type, object, object -> type) morphism $\lambda’ : \mathbb{T} \times T \times T \rightarrow \mathbb{T}$. You may add an argument in \eqref{type_temp}.

template <typename T, int k, int m>
struct Point {
  T x;
  T y;
  Point() : x{k}, y{m} {}  // use object parameters
  // ...
};
// ...

Function templates

Consider the class $\mathbb{F}$ of all functions $f$ of the form \eqref{fun}.

A function template is a morphism $$ \begin{equation} \label{fun_temp} \psi : T \times \mathbb{T} \rightarrow \mathbb{F}. \end{equation} $$

`f<int>(3.14)`

A t-f morphism $\mu : \mathbb{T} \rightarrow \mathbb{F}$. You may omit an argument in \eqref{fun_temp}.

template<typename T>
void f(T x) {
  std::cout << x << std::endl;
}
int main() {
  std::cout << f<int>(3.14) << std::endl;
}

`f<2>(3.14)`

A o-f morphism $\mu’ : T \rightarrow \mathbb{F}$. You don’t have to provide a type argument in \eqref{fun_temp}.

template<int i>
float f(float x) {
  return i*x;
}
int main() {
  std::cout << f<2>(3.14) << std::endl;
}

Object templates

An object template is a morphism $$ \begin{equation} \label{obj_temp} \chi : T \times \mathbb{T} \rightarrow T. \end{equation} $$

I purposefully call it object template (instead of using the official name “variable template”) in order to keep the separation between objects and variables clear.

`pi<int>`

A t-o morphism $\nu : \mathbb{T} \rightarrow T$. You may omit an argument in \eqref{obj_temp}.

template<typename T>
constexpr T pi = T(3.1415926535897932385);

`pi_times<2>`

A o-o morphism $\nu’ : T \rightarrow T$. You don’t have to provide a type argument in \eqref{obj_temp}.

template<int i>
constexpr float pi_times = i*pi<float>;
int main() {
  std::cout << pi_times<2> << std::endl;
}

The last example shows that the boundary between functions and morphisms (in the sense in which they are used here) is nominal. It is up to you to decide when to perform computations: during compile time or during run time.

Concepts

A concept is a set of constraints on a type. Each concept is a predicate, evaluated at compile time, $$ \begin{equation*} C : \mathbb{T} \rightarrow \left\{0, 1\right\}. \end{equation*} $$

Concepts allow one to express requirements on types. In a sense, they do what abstract classes do, but in a more elegant way. This is an extremely powerful abstraction. If you are interested, you can learn more here, here, and here.

Boris Belousov

Types, morphisms, and concepts in C++

Types

Type Point

Point objects

Point functions