A Different Way To Introduce Pointers

By Clemens Tiedt

2021-08-21

Pointers seem to be a topic that many aspiring programmers have issues with. One of the reasons may be that they don't exist in higher-level language like Python or Java where you can just pass around objects and never have to think about whether you hold the object itself or a pointer to it¹.

In lectures, pointers are often introduced with a code example like this:

void main()
{
    int x = 41;
    int *y = &x;

    (*y)++;

    if (x == 42)
    {
        printf("Look, we changed x without touching it directly!\n");
    }
}

And this could give you a pretty good idea of which basic operations you can do with pointers:

You can create one by taking the address of another variable (referencing)
You can use the value at the address being pointed to (dereferencing)

However, if you aren't already familiar with pointers, you might wonder why this is necessary. So, I'll try to introduce pointers from a slightly different point of view by using linked lists.

Linked Lists, Pointers, and Rust

At this point, we're switching to Rust for our code examples. The main two reasons for this are that Rust supports sum types (you'll see what these are and why they help with linked lists in a moment) and that the Rust compiler actually gives us helpful errors.

So, let's build ourselves a linked list.

In case you're unfamiliar, linked lists are pretty simple: Each node contains a value and then a reference to the next node. Well, it's not quite that simple - we need some kind of marker for the last node in the list. As I hinted a few sentences ago, this is where sum types are useful to us. A sum type has a finite number of possible values and a variable of a sum type is always exactly one of these values. For our Node type, we want two possible options: One for "normal" nodes with a value and a successor and a second option to mark the end of our list. In Rust code, this might look like this:

enum Node {
    Value(u32, Node),
    End,
}

Let's write some code that just creates some lists:

fn main() {
    // A List with just one element
    let l1 = Node::Value(0, Node::End);

    // A List with multiple elements
    let l2 = Node::Value(0, Node::Value(1, Node::Value(2, Node::Value(3, Node::End))));

    // We can even create empty lists
    let l3 = Node::End;
}

Now, if you're familiar with Rust (or have seen a linked list implementation in C), you're probably getting skeptical at this point. So what happens if we try to compile this code?

error[E0072]: recursive type `Node` has infinite size
 --> bad_ll.rs:1:1
  |
1 | enum Node {
  | ^^^^^^^^^ recursive type has infinite size
2 |     Value(u32, Node),
  |                ---- recursive without indirection
  |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `Node` representable
  |
2 |     Value(u32, Box<Node>),
  |                ^^^^    ^

error[E0391]: cycle detected when computing drop-check constraints for `Node`
 --> bad_ll.rs:1:1
  |
1 | enum Node {
  | ^^^^^^^^^
  |
  = note: ...which again requires computing drop-check constraints for `Node`, completing the cycle
  = note: cycle used when computing dropck types for `Canonical { max_universe: U0, variables: [], value: ParamEnvAnd { param_env: ParamEnv { caller_bounds: [], reveal: UserFacing }, value: Node } }`

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0072, E0391.
For more information about an error, try `rustc --explain E0072`.

Huh, so rustc complains. Let's see what it tells us: recursive type `Node` has infinite size.

Data Type Sizes

Like C, Rust is a statically typed language. So, whenever you create a variable, Rust knows the type and reserves a certain amount of memory for it. Let's look at some examples:

let x: u16 = 0;
let y: u32 = 0;
let z [u32; 4] = [0, 1, 2, 3];

The variable x is of type u16. That means, it uses 16 bits (or two bytes) of memory. y on the other hand is a u32, meaning that it uses double that amount, four bytes. Lastly, z is an array of four u32 elements meaning that it will use 4 * 4 = 16 bytes. To check our (admittedly very simple) calculations, we can use std::mem::size_of_val:

dbg!(std::mem::size_of_val(&x)); // [src/main.rs:6] std::mem::size_of_val(&x) = 2
dbg!(std::mem::size_of_val(&y)); // [src/main.rs:7] std::mem::size_of_val(&y) = 4
dbg!(std::mem::size_of_val(&z)); // [src/main.rs:8] std::mem::size_of_val(&z) = 16

Calculating the size of a struct is a bit more difficult:

struct S {
    x: u16,
    y: u16,
}

struct T {
    x: u16,
    y: u32,
    z: [u32; 4],
}

By adding together the sizes of the individual fields, S should have a size of 4 bytes and T 22 bytes. However, std::mem::size_of (which is just like size_of_val, except it takes a generic type instead of an instance of a type) seems to disagree:

dbg!(std::mem::size_of::<S>()); //[src/main.rs:21] std::mem::size_of::<S>() = 4
dbg!(std::mem::size_of::<T>()); //[src/main.rs:22] std::mem::size_of::<T>() = 24

S is 4 bytes big as we guessed, but T somehow gained 2 bytes of size. The reason for this is that by default Rust will align structs. To understand this, let's look at what an even simpler struct U would look like in memory:

struct U {
    x: u8,
    y: u16,
}

Byte	0	1	2	3
Content	x	(empty)	y	y

This struct would have a size of 3 bytes, but that is not optimal for the CPU. The CPU can access data more quickly if the data starts at an address that is a multiple of the size of the data type. More specifically, we could fit an instance of U into 3 bytes, but then the 16 bit value y would start at an 8 bit offset from the start of the struct. So, the compiler inserts some empty space to align the data properly².

Okay, so now we know how to calculate the size of a struct - we're getting closer!

So, how do we calculate the size of an enum? Again, let's look at examples:

enum E {
    A,
    B,
    C,
}

enum F {
    A(u16),
    B(u16),
    C,
}

enum G {
    A(u16),
    B(u32),
}

The results for these are:

dbg!(std::mem::size_of::<E>()); // [src/main.rs:45] std::mem::size_of::<E>() = 1
dbg!(std::mem::size_of::<F>()); // [src/main.rs:46] std::mem::size_of::<F>() = 4
dbg!(std::mem::size_of::<G>()); // [src/main.rs:47] std::mem::size_of::<G>() = 8

As you can see, our simplest enum uses just one byte of memory - it works just like an enum in C where you are essentially just defining numeric constants. If we had an enum with more than 256 variants, Rust would just use one more byte to represent them³.

F uses 4 bytes of memory - with the knowledge about alignment, you should be able to guess why. A variant of this enum can contain up to two bytes of data and we need some way to represent the variant (as we saw before, one byte is sufficient here). So, with alignment we end up with a total size of four bytes.

G is more of the same - just that we need to reserve more memory, because a variant can store up to a u32. Add in alignment and we end up with a total of 8 bytes.

Weren't we talking about linked lists?

So, we have learned quite a bit about the sizes of data types and memory alignment. But as you may remember, we were wondering why the following code doesn't compile:

enum Node {
    Value(u32, Node),
    End,
}

With our new knowledge we should now be able to calculate the size of a Node. Since we have two variants, we need just one byte to represent them. Then, we can save a u32, so that puts us at 4 + 1 = 5 bytes which becomes 8 bytes with alignment. And then, we can also put another Node in there, so add another 8 bytes? Except that Node can contain another Node and so on.

That explains why rustc was unhappy. Node is a self-referential type that could potentially grow to an infinite size. Of course, it doesn't do that in our example code, but that's only because we wrote well-behaving code. However, rustc actually told us how to solve this:

help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `Node` representable

Yeah, this article was going to be about pointers. All of the suggestions that rustc makes here are some flavor of pointer. But how do they solve our issue with an infinite-sized type?

As I wrote at the beginning of this article, a pointer is just an address in memory. I'm using a 64 bit CPU, so my pointers are 64 bits (8 bytes) in size. So, if we replace the Node in the Node::Value variant with some sort of pointer to a Node, we're effectively just storing a 64 bit number. The next Node could be at a completely different position in memory, we just have to know its address.

This, however, leaves us with another question: What kind of pointer should we use here?

The Majestic Tale of a Madman with a `Box<T>`

This section is more specific to Rust. You can skip it if you're less interested in Rust's pointer types.

The first kind of pointer rustc suggests is a Box. This is Rust's version of a heap-allocated pointer. Using a Box, our solution looks like this:

enum Node {
    Value(u32, Box<Node>),
    End,
}

fn main() {
    // A List with just one element
    let l1 = Node::Value(0, Box::new(Node::End));

    // A List with multiple elements
    let l2 = Node::Value(0, Box::new(Node::Value(1, Box::new(Node::Value(2, Box::new(Node::Value(3, Box::new(Node::End))))))));

    // We can even create empty lists
    let l3 = Node::End;
}

As you can see, this is pretty verbose, but it does the job. Every time we create a Box, Rust allocates the memory we need for a Node on the heap. What this means is that a Node owns its successor. This gives us full control over what we can do with it.

The code for an Rc looks almost identical to that using a Box. Just like a Box, an Rc allocates memory on the heap, the difference being that an Rc allows shared ownership of the value it contains. This doesn't make a lot of sense unless you understand Rust's ownership concept, so we can safely ignore it at this point.

The last suggested pointer type is a &Node, a reference (or a borrow) as Rust calls it. The code for this version looks as follows:

enum Node<'a> {
    Value(u32, &'a Node<'a>),
    End,
}

fn main() {
    // A List with just one element
    let l1 = Node::Value(0, &Node::End);

    // A List with multiple elements
    let l2 = Node::Value(0, &Node::Value(1, &Node::Value(2, &Node::Value(3, &Node::End))));

    // We can even create empty lists
    let l3 = Node::End;
}

As you can see, we had to add a lifetime to our Node. This is because of Rust's safety rules. Storing a reference means that the specific value lives somewhere else (as all pointers do). As long as the value that this reference is pointing to is valid, everything is fine. However, if the value was freed while we still hold a reference, we would run into trouble. So, we need to say that the lifetime of the reference is at least as long as the lifetime of our Node. We do this by defining a lifetime 'a and assigning it to both. Since the Node type has a lifetime now, we also need to adjust the reference from &'a Node to &'a Node<'a>. What this means is "A Node::Value contains a reference to another Node that lives at least as long as the Node::Value containing it. Any inner Nodes also lives at least as long as the outermost one". Phew, that was quite a lot of words for such a simple idea.

Using a reference here can also lead to some other problems down the line. For example, this code is perfectly fine:

fn create_list<'a>() -> Node<'a> {
    Node::Value(0, &Node::Value(1, &Node::End))
}

But this code isn't:

fn create_list<'a>() -> Node<'a> {
    Node::Value(0, &Node::Value(1, &mut Node::End))
}

This seems to be the case because a list of immutable references can use the static lifetime (meaning the reference is valid for the entire runtime of the program), but static mutable values are considered unsafe. If the Node contains an owned pointer like a Box, this doesn't happen as the Box<Node> lives as long as the Node containing it.

Stack and Heap

In the last section, I mentioned the heap and concepts like "owned pointers", so I believe I owe you an explanation.

Conceptually, the memory owned by a process is divided into two areas: The stack and the heap. The stack grows down from the high addresses and the heap grows up from the low ones. We can actually see this for ourselves with a little example program:

#include <stdio.h>
#include <stdlib.h>

int main()
{
        int x = 42;
        int y = 67;

        int *p = malloc(sizeof(int));
        int *q = malloc(sizeof(int));

        printf("&x = %p\n", &x);
        printf("&y = %p\n", &y);
        printf(" p = %p\n", p);
        printf(" q = %p\n", q);

        free(p);
        free(q);

        return 0;
}

On my machine this produces the following output:

&x = 0x7fffcaeb926c
&y = 0x7fffcaeb9268
 p = 0x7fffc3d3a010
 q = 0x7fffc3d3a030

You can see how x and y both have higher addresses since they're located on the stack. Also, the difference between their addresses is four bytes - exactly the size of an int in C!

On the other hand, we used manual memory management to create p and q, so they live on the heap. As you can see, their addresses are in a lower range and the address of q is higher than that of p. The pointers we get are also 32 bytes apart. This seems to be about alignment again, but the documentation for malloc does not provide a clear answer about how it chooses where to reserve our memory.

You should also see how I had to manually free p and q. The stack is automatically managed for us - each function pushes its local variables onto it and when it exits, the stack pointer is just reverted to where it was before entering the function. This means, however, that a function can't return a pointer to something that lives on the stack. Since the stack is cleaned up after the function exits, that pointer would be invalid. So, if you have a function that creates some object and returns a pointer to it, you will need to put it on the heap. This way it can outlive the function it was created in⁴, but you are responsible for freeing it once it's no longer needed.

Rust does away with most of this manual memory management. You still need to think about where a value lives, but you can't forget to free a Box<T> since Rust automatically figures out when it can free values at compile time. When we talk about a type in Rust "owning" its memory, that usually means it's heap-allocated. This is the case with pretty much all collection types like String, Vec or HashMap. They need some form of memory management under the hood because they can have a dynamic size.

As you should be able to recognize now, this last sentence was not quite correct. Vec<T> has a fixed size, it just contains a pointer to a dynamic amount of heap memory. This is in contrast to the array type [T; N] that has the amount of memory it contains baked into the type signature.

With this new information in mind, we can safely say that for our simple linked list, Box<T> is the right choice. If we wanted to construct a linked list in some function and use it somewhere else, we would quickly run into trouble with references, so owning our data seems like a good idea. And since Rust manages memory for us, we can't even run into issues like forgetting to free our Box.

Wrapping up

So, we managed to build a working linked list and learn about how data types are represented in memory along the way. To summarize: Pointers are useful to build self-referential data types because they have a fixed size. Also, there are different kinds of pointers - Rust differentiates between owned and borrowed ones. An owned pointer can be moved and manipulated as we want, whereas there are more strict rules around borrowed values. However, borrowing is usually more memory-efficient as creating an owned pointer will always allocate memory.

There are of course many more things you can do with pointers. For example, arrays in C are really just pointers and there's the whole subject of pointer arithmetic that we skipped here. But I hope this gave you some idea of why pointers are useful and not something you need to fear. If you want to read some more about this topic, I recommend "Learn Rust With Entirely Too Many Linked Lists".

Java kind of uses pointers internally, but not in a way that you could use them in your program. That's a good thing, because Java also uses garbage collection to manage memory which could be quite the footgun if combined with pointers. ↩
If you need to save all the memory you can, you can add #[repr(packed)] to the struct, but this will lead to slower access times. For more information on data representation, read this page. ↩
But if you have an enum this size, you should probably rethink your design. ↩
If you're familiar with Rust's lifetimes you might recognize this sort of language - lifetimes do exist in C, Rust just makes them explicit! ↩

A Soil VM for the Linux Kernel

One of the Linux kernel features that have gained the most traction in the last few years is probably (e)BPF. Originally, the "Berkeley Packet Filter" was intended as a means of filtering network packets in kernel mode. However, BPF quickly developed into a fully-featured VM used for all kinds of purposes. The appeal of BPF is not hard to see: It allows you to load kernel mode code at system runtime (similar to kernel modules) while keeping some degree of sandboxing and fault tolerance afforded by the VM. It is much more difficult to break your kernel with a BPF program than with a regular kernel module. One of the most prominent current users of BPF is sched_ext, a framework for writing scheduler implementations in BPF. This lets you easily tinker with your scheduler and see results live and without the risk of breaking your kernel if your implementation crashes....

2024-09-05 #soil #c #linux #kernel

Share this article: A Different Way To Introduce Pointers