UofT ECE 467 (2022 Fall) - A Bit About Types



The website and starter code are being pre-emptively updated!

I don't know if I'll be teaching the course again, but if I wait until next year I'll forget feedback I've received. Note that more test cases have been added (if you pull the new starter code); those are for the (potential) future!


In the time before time, the great beings created languages like C, C++, and Java, where you said “int x”, and you had a variable “x” that was an int. Then, JavaScript, Python, and Ruby came along and said: “what if we didn’t know that x was an int?”. The Pythonistas, Rubyists, and, whatever the JavaScript developers were, laughed at the C programmers clumsily writing down the types of their variables. Then, time passed. Code grew. Humans… evolved. Dropbox ended up with over 4 million lines of untyped Python code in a single repository. In a spark of unprecedented, divine inspiration, some programmers said “what if we did know what a variable was?”. And so Python introduced type annotations. Ruby introduced type annotations. A whole new language TypeScript was invented to be JavaScript with types. The message of my lecture today is: please let the compiler help you, by choosing a language with a strong type system.

Introduction

A type describes a value. Intuitively, the type of a value defines what you can do with it. For example, in C++, if the value named by x is an int, it means you can multiply x with another value of type int using the multiplication (*) operator. If the value named by s is an std::string, you cannot multiply it with anything.
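
For instance, the following minimal C++ snippet (nothing here beyond the standard library) shows the compiler enforcing exactly this: multiplying two ints is fine, while attempting to multiply an std::string is rejected.

#include <string>

int main() {
	int x = 6;
	int y = 7;
	int product = x * y; // fine: both operands are ints
	std::string s = "hello";
	// std::string bad = s * 2; // error: no operator* is defined for std::string
	return product == 42 ? 0 : 1;
}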

Often, for a given type, there are many possible values; there are over 4 billion 32-bit integers, and, barring physical memory limitations, there are infinitely many possible std::strings. Since it would be impossible to reason about every single possible value (as just mentioned - there are infinitely many possible std::strings), a compiler reasons about types. In fact, a type is a generalization of a value, and a single type defines a set of values. In some sense, values in the same set - of the same type - are the same. If x and y are ints, it doesn’t matter what their values are; simply because they are ints, the compiler knows you can always multiply them.

Internally, for optimizations, a compiler may use more fine-grained types than what is visible to the programmer. For example, given the following loops, a compiler may assign i a type like “[0, 10)”, and j a type like “[0, 20)”.

for (int i = 0; i < 10; i++) {
	for (int j = 0; j < 20; j++) {
		if (i * j >= 200) {
			printf("big\n");
		}
	}
}

At this point, it is (more) straightforward for the compiler to reason that i * j has type “[0, 200)”, and so printf never gets called.

Note that if the compiler were to reason specifically about individual values, it would require more advanced/ad-hoc analysis to understand the different values that i and j could take on in the two loops.

In an extreme case, a compiler can assign a dedicated type for each unique value, and reason about them in an arbitrarily advanced/ad-hoc way.

At this point, hopefully I’ve convinced you that types are a powerful tool for compilers.

Exhibit: distinguishing between nullable and non-nullable pointers.

An entire class of bugs in C++ - null pointer dereferences - can be eliminated by replacing (nullable) pointers with (non-nullable) references. Unfortunately, due to language limitations, composing a nullable reference type using std::optional in C++ is incredibly verbose and looks like this: std::optional<std::reference_wrapper<int>>. While this code would be more robust (making it harder to use a nulled value), it’s unsurprising that most programmers end up using an int*.
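
For concreteness, here is a minimal sketch of what passing around such a “nullable reference” looks like (the function maybe_increment is made up for illustration):

#include <functional>
#include <optional>

// A "nullable reference" to an int, spelled out with the standard library.
void maybe_increment(std::optional<std::reference_wrapper<int>> x) {
	if (x) {
		x->get() += 1; // must explicitly unwrap before using the value
	}
}

int main() {
	int value = 41;
	maybe_increment(value);        // passes a (non-null) reference to value
	maybe_increment(std::nullopt); // passes "null"; safely does nothing
	return value == 42 ? 0 : 1;
}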

A type system can describe more than values.

Many languages have achieved memory safety (among other things, preventing use of freed memory) using some form of automatic garbage collection. We may call this dynamic memory safety, since it involves checks and work at execution time. Rust is a relatively new programming language that achieves memory safety statically:

Roughly, this is accomplished by keeping track of lifetimes. A lifetime is a static region of code for which a value is valid - more or less the static scope of a variable.

For example, given the following C++ code, the equivalent Rust code is forbidden in the “obvious” way: by inspecting the lifetime/scope of y, and comparing it to the lifetime/scope of x.

int* x = nullptr;
if (...) {
	int y = ...;
	x = &y;
}
printf("x is %d.\n", *x);

This example is “obvious” because it doesn’t require analyzing the interaction with other functions (it does not require inter-procedural analysis). In general, it is infeasible to reason about the interactions of different functions in a program. For example, it is impossible to know whether the following code is correct without seeing the implementation of foo.

int* x = nullptr;
if (...) {
	int y = ...;
	x = foo(&y); // foo returns a pointer
}
printf("x is %d.\n", *x);

Note that the following implementation of foo is not incorrect in itself, so it would be undesirable to forbid all such code.

int* foo(int* y) {
	return y;
}

Rust’s solution is to augment the declaration/signature of the function foo to indicate that the returned pointer has the same lifetime as the argument y, which in Rust may (verbosely) look like this.

fn foo<'a>(y: &'a i32) -> &'a i32 {
	return y;
}

The ampersand (&) denotes a pointer/reference type, and the apostrophe followed by an identifier ('a) introduces a named lifetime; the signature of foo says that the return value must have the same lifetime as the lifetime of the input y.

Note that foo can be validated independently; for example, the following would be forbidden because the local z does not have the lifetime of y.

fn foo<'a>(y: &'a i32) -> &'a i32 {
	let z: i32 = ...;
	return &z;
}

As well, the original code (shown now in Rust below) can be validated based on the signature of foo, without looking at the implementation of foo.

let mut x: Option<&i32> = None; // nullable pointer
if (...) {
	let y: i32 = ...;
	x = Some(foo(&y)); // the return value of foo has the same lifetime as y, which is shorter than the lifetime of x; forbidden!
}
println!("x is {:?}.", x);

Lifetimes are types.

Lifetimes are types, and as the angled brackets suggest, they can be generic parameters. Strictly speaking, two different calls to foo (as defined previously) are two separate instantiations of a generic function, for which the generic types may be different.

To illustrate, the “same thing” is going on with the two calls to foo and the two calls to bar below.

// Copied from above.
fn foo<'a>(y: &'a i32) -> &'a i32 {
	return y;
}

fn bar<T>(y: T) -> T {
	return y;
}

fn main() {
	if (...) {
		let x: i32 = 0;
		let y: u32 = 0;
		foo(&x); // foo instantiated and called with lifetime of x
		bar(y); // bar instantiated and called with type u32
	} else {
		let z: i32 = 1;
		let s: String = ...;
		foo(&z); // foo instantiated and called with lifetime of z
		bar(s); // bar instantiated and called with type String
	}
}

Subtyping

Consider the following function in Rust, which returns (a reference to) the longer of two strings.

fn longer(s1: &str, s2: &str) -> &str {
	if s1.len() >= s2.len() {
		s1
	} else {
		s2
	}
}

The above code does not compile in Rust, because the function declaration/signature does not (unambiguously) indicate what the lifetime of the returned reference is. Instead, we could write the following, which shouldn’t seem too controversial.

fn longer<'a>(s1: &'a str, s2: &'a str) -> &'a str {
	if s1.len() >= s2.len() {
		s1
	} else {
		s2
	}
}

However, suppose we want to call it like this.

let s1: String = "hello".to_owned();
if (...) {
	let s2: String = "world".to_owned();
	println!("{}.", longer(&s1, &s2)); // a &String coerces to a &str
}

Note that the scope/lifetime of s1 is (strictly) larger than the scope/lifetime of s2. Since they are different lifetimes, they are different types; the call to longer doesn’t match its signature.

An idea that is common in object-oriented programming, but more general than just OOP, is that if U is a subtype of T, then a value of type U can be used wherever a value of type T can be used. Intuitively, a subtype (despite being prefixed with “sub”) is “more” than the super type; a Cat does everything an Animal can, and more.
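
In C++, for example, this is the familiar rule that lets a derived class object be passed wherever a base class reference is expected. A minimal sketch (the Animal/Cat classes here are hypothetical):

#include <cstdio>

struct Animal {
	virtual ~Animal() = default;
	void eat() { std::printf("eating\n"); }
};

struct Cat : Animal {
	void meow() { std::printf("meow\n"); }
};

// Accepts any Animal (or any subtype of Animal).
void feed(Animal& a) {
	a.eat();
}

int main() {
	Cat cat;
	feed(cat); // fine: a Cat does everything an Animal can...
	cat.meow(); // ...and more
}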

The lifetime of s1 encompasses the lifetime of s2; it is “more”. So the compiler will implicitly choose the lifetime of s2 for 'a; the lifetime of s1 is then a subtype of s2’s (that is, of 'a), so &s1 can be used for the call.

And then, validation can happen as before; for example, the following is forbidden, because the return value has the lifetime of s2, which is shorter than the lifetime of s3.

let s1: String = "hello".to_owned();
let s3: &str;
if (...) {
	let s2: String = "world".to_owned();
	s3 = longer(&s1, &s2); // too short
} else {
	s3 = ...;
}
println!("{}.", s3);

Variance

One may have noticed some subtle handwaving. Specifically, a lifetime is a type, and may be a subtype of another lifetime. In general, a value of a subtype may be used wherever a value of a supertype can be used. However, the values being passed to functions like longer (above) are not lifetimes; they are references. We can think of the ampersand (&) as a “constructor” for a reference type, which consists of a lifetime type and a “usual”/data type. A type constructor creates/defines a new type based on existing types. In general, there may or may not be a subtyping relation between different types created from the same type constructor using different type parameters. If T is a type constructor that takes a type parameter U, then we have the following definitions: T is covariant in U if, whenever U1 is a subtype of U2, T<U1> is a subtype of T<U2>; T is contravariant in U if the relation is flipped, i.e., T<U2> is a subtype of T<U1>; and T is invariant in U if neither holds, i.e., T<U1> and T<U2> are unrelated types.

To see the difference between covariant and contravariant, let us consider function type constructors. It turns out that function signatures, which you already know how to reason about, are in fact just types (you may have also seen or worked with “function pointers”). Suppose we need a function (an actual, concrete function - i.e. a value (practically, a pointer value)) that returns an Animal. If we are given a function that happens to always return Cats, we can still call this function and we still get a subtype of Animal back. So by the previous definitions, function types are covariant in their return types.

On the other hand, suppose we need a function that takes an Animal as a parameter. We cannot substitute it with a function that takes a Cat as a parameter instead, because the function taking a Cat cannot be used in all places the function taking an Animal can. Intuitively, the function that accepts any Animal is more general than the function that only takes Cats: it can be used everywhere the Cat-only function can, so it is the subtype, even though Animal is the supertype of Cat. So function types are contravariant in their parameter types.
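
C++ function pointer types do not implicitly convert this way, but std::function will wrap any callable with a compatible signature, which lets us see both rules in action. A minimal sketch (the Animal/Cat classes and the helper functions are hypothetical):

#include <functional>

struct Animal {
	virtual ~Animal() = default;
};
struct Cat : Animal {};

Cat* make_cat() {
	static Cat the_cat;
	return &the_cat;
}

void feed_animal(Animal*) {
	// feeds any animal
}

int main() {
	// Covariance in the return type: a callable returning Cat* can stand in for
	// a function returning Animal*, since every Cat* converts to an Animal*.
	std::function<Animal*()> get_animal = make_cat;

	// Contravariance in the parameter type: a callable accepting Animal* can
	// stand in for a function accepting Cat*, since it can handle any Cat*.
	std::function<void(Cat*)> handle_cat = feed_animal;

	Animal* animal = get_animal();
	handle_cat(make_cat());
	(void)animal;
}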

To understand invariance, let us consider mutable pointer types. What a pointer type T* (in C++ syntax) says we can do with a pointer value is read and write values of type T through the pointer. Consider the following C++ code.

void takes_mammal(Mammal* m) {
	writes_cat(m);
	reads_dog(m);
}

A general pointer type T* cannot be contravariant in T; otherwise, Mammal* would be a subtype of both Cat* and Dog*. Then, it would be valid to pass a value of type Mammal* in place of a value of type Cat* or Dog*, and the above code could write a Cat value through the pointer but try to read a Dog value back.

Similarly, the following code demonstrates why mutable pointers cannot be covariant in the pointee type.

void takes_car(FourWheelDrive* c) {
	writes_front_wheel_drive(c);
	reads_back_wheel_drive(c);
}

Note that here, FourWheelDrive is assumed to be a subtype of both front and back wheel drives, since given a value that is a four-wheel drive, it can certainly do everything a front- or back-wheel drive can. (In the previous example, Mammal was a supertype of Cat and Dog.)

Therefore, pointer types are invariant in the pointee type.
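
C++ itself exhibits this rule one level up: a Cat* implicitly converts to an Animal* (reading is safe), but a Cat** does not convert to an Animal**, because a pointer-to-pointer is exactly a mutable slot that can be both read and written. A minimal sketch (class names hypothetical):

struct Animal {
	virtual ~Animal() = default;
};
struct Cat : Animal {};
struct Dog : Animal {};

// If Cat** were a subtype of Animal** (covariance), this function could be
// handed the address of a Cat* and silently store a Dog* into it.
void store_dog(Animal** slot) {
	static Dog dog;
	*slot = &dog;
}

int main() {
	Cat* cat_ptr = nullptr;
	Animal* animal_ptr = cat_ptr; // fine: reading a Cat* as an Animal* is safe
	// store_dog(&cat_ptr);       // error: Cat** does not convert to Animal**
	(void)animal_ptr;
}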

Traits: types for your types.

In C++, templates are, and can only be, validated when they are instantiated (used). For example, the following function definition is perfectly valid.

template<typename T> T add(T x, T y) {
	return x + y;
}

Then, nothing says that the following call is disallowed.

std::vector<int> v;
add(v, v);

It is only after the compiler tries creating the function add with T = std::vector<int> - doing the substitution - that it can realize that the resulting code tries to add two vectors (which is not defined).

Ultimately, this is the cause of ugly C++ compiler errors around templates; the C++ compiler can only detect the error after performing all the substitutions for a template - which may be deep inside nested template instantiations.

Rust avoids this by requiring trait bounds on its generics. The equivalent definition for add in Rust must look like the following.

use std::ops::Add;

// T must support +, and adding two Ts must produce another T.
fn add<T: Add<Output = T>>(x: T, y: T) -> T {
	x + y
}

Specifying Add on the generic parameter T accomplishes two things. Firstly, when add is called, the compiler can ensure that the input type T really satisfies the Add requirement, without needing to instantiate add’s body to discover that the requirement exists. Secondly, the function body of add can be validated once, assuming the values x and y (of whatever type T ends up being) really can be added (+) together.

In the same way that types generalize values by grouping together values that “behave the same”, traits generalize types by grouping together types that “behave the same”.

Metaprogramming with types.

Both C++ and Rust’s type systems and templates/generics are powerful enough to represent integers. Here is an implementation of unsigned integers using C++ templates. We can even use it to test the Collatz conjecture!

struct Bit0 {
	static unsigned int const VALUE = 0;
};
struct Bit1 {
	static unsigned int const VALUE = 1;
};

// "Car" represents the least significant bit.
// "Cdr" represents the rest of the bits/integer.
template<typename Car, typename Cdr> struct Integer {
	static unsigned int const VALUE = (Cdr::VALUE << 1) | Car::VALUE;
};

using Zero = Bit0;
using One = Integer<Bit1, Zero>;
using Two = Integer<Bit0, One>;
using Three = Integer<Bit1, One>;
using Four = Integer<Bit0, Two>;
using Five = Integer<Bit1, Two>;
using Six = Integer<Bit0, Three>;
using Seven = Integer<Bit1, Three>;
// and so on...

static_assert(Seven::VALUE == 7);

template<typename Lhs, typename Rhs> struct Add {};

template<typename Lhs, typename Rhs> using Sum = typename Add<Lhs, Rhs>::Output;

template<> struct Add<Zero, Zero> {
	using Output = Zero;
};

template<typename Rhs> struct Add<Zero, Rhs> {
	using Output = Rhs;
};

template<typename Lhs> struct Add<Lhs, Zero> {
	using Output = Lhs;
};

template<typename Lhs, typename Rhs> struct Add<Integer<Bit0, Lhs>, Integer<Bit0, Rhs>> {
	using Output = Integer<Bit0, Sum<Lhs, Rhs>>;
};

template<typename Lhs, typename Rhs> struct Add<Integer<Bit0, Lhs>, Integer<Bit1, Rhs>> {
	using Output = Integer<Bit1, Sum<Lhs, Rhs>>;
};

template<typename Lhs, typename Rhs> struct Add<Integer<Bit1, Lhs>, Integer<Bit0, Rhs>> {
	using Output = Integer<Bit1, Sum<Lhs, Rhs>>;
};

template<typename Lhs, typename Rhs> struct Add<Integer<Bit1, Lhs>, Integer<Bit1, Rhs>> {
	using Output = Integer<Bit0, Sum<Sum<Lhs, Rhs>, One>>;
};

static_assert(Add<Three, Four>::Output::VALUE == Add<Two, Five>::Output::VALUE);

template<typename Lhs, typename Rhs> struct Mul {};

template<typename Lhs, typename Rhs> using Product = typename Mul<Lhs, Rhs>::Output;

template<typename Rhs> struct Mul<Zero, Rhs> {
	using Output = Zero;
};

template<typename Lhs, typename Rhs> struct Mul<Integer<Bit0, Lhs>, Rhs> {
	using Output = Integer<Bit0, Product<Lhs, Rhs>>;
};

template<typename Lhs, typename Rhs> struct Mul<Integer<Bit1, Lhs>, Rhs> {
	using Output = Sum<Integer<Bit0, Product<Lhs, Rhs>>, Rhs>;
};

static_assert(Product<Two, Six>::VALUE == Product<Three, Four>::VALUE);

template<typename N> struct Collatz {};

template<> struct Collatz<One> {
	using Output = One;
};

template<typename M> struct Collatz<Integer<Bit0, M>> {
	using Output = typename Collatz<M>::Output;
};

template<typename M> struct Collatz<Integer<Bit1, M>> {
	using Output = typename Collatz<Sum<Product<Integer<Bit1, M>, Three>, One>>::Output;
};

using BigNum = Sum<One, Product<Two, Product<Three, Product<Four, Product<Five, Product<Six, Seven>>>>>>;

static_assert(Collatz<BigNum>::Output::VALUE == 1);

Note that, to my knowledge, the equivalent code cannot be implemented in Rust today. However, it can be debated exactly where the limitation lies (unfortunately I won’t be elaborating - I suspect I’ve already been confusing enough about the more important takeaways).


Last updated: 2022-12-23 09:56:36 -0500.