Don't panic! - Or should you?

By Clemens Tiedt

2023-03-12

If you came to Rust from a more conventional programming language, you probably had to adjust to its error handling. Many languages use exceptions for error handling, and a fallible operation looks something like this:

try:
    fallible_operation()
except Exception:
    # Handle the error
    # (Or often just ignore it)

Rust, on the other hand, takes a page from the functional programming book and makes its errors values. So, the Python example from above would look something like this:

match fallible_operation() {
    Ok(v) => {
        // Do something with the value `v`
    }
    Err(e) => {
        // Handle the error
    }
}
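
To make this a little more concrete, here is a minimal sketch with an actual fallible operation. The names parse_port and describe_port are made up for this illustration; the only real API involved is str::parse.

```rust
use std::num::ParseIntError;

/// Hypothetical fallible operation: parse a TCP port from text.
fn parse_port(input: &str) -> Result<u16, ParseIntError> {
    input.trim().parse::<u16>()
}

/// Handles both arms explicitly, just like the `match` above.
fn describe_port(input: &str) -> String {
    match parse_port(input) {
        Ok(port) => format!("listening on port {port}"),
        Err(e) => format!("invalid port {input:?}: {e}"),
    }
}
```

Calling describe_port("8080") takes the Ok arm, while describe_port("http") ends up in the Err arm without any exception machinery involved.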

But the point of this article is not to discuss Result and Option. There are already a number of sources you can refer to (including the Rust book). No, the errors we have discussed so far are the "nice" kind that can be handled. Rust has another class of errors that you were probably introduced to as unrecoverable: panics.

What is a panic?

A panic is a fatal error, one that you cannot catch and handle. Generally, a panic occurring means that something went so wrong that the most reasonable thing to do is to shut down your program. Most functions in the Rust standard library will only have recoverable errors, but you can easily turn a recoverable error into a panic: By unwrapping it. Here is the implementation of Option::unwrap.

pub const fn unwrap(self) -> T {
    match self {
        Some(val) => val,
        None => panic("called `Option::unwrap()` on a `None` value"),
    }
}

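As you can see, it will either return a T or panic. (Inside the standard library this goes through an internal panic function rather than the panic! macro you would use in your own code, but the effect is the same.) This is great if you know that your Option is Some, because then you don't have to handle a None case you know will not happen.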
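
As a small sketch of what "knowing that your Option is Some" can look like, here are two hypothetical helpers (the function names are invented for this example). The first one unwraps after a check that rules out None, the second one gets the same result with no panic path at all.

```rust
/// Returns the first character of `s`, if there is one.
fn first_char(s: &str) -> Option<char> {
    s.chars().next()
}

/// Safe to unwrap: the `is_empty` check above the call guarantees `Some`.
fn first_char_or_dash(s: &str) -> char {
    if s.is_empty() {
        return '-';
    }
    first_char(s).unwrap()
}

/// The same behaviour without any call that could panic.
fn first_char_or_dash_no_panic(s: &str) -> char {
    first_char(s).unwrap_or('-')
}
```

Which of the two you prefer is mostly a matter of taste. The second version has the advantage that nobody has to re-check the reasoning when the surrounding code changes.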

Something we have not talked about yet is that not all panics are equal. Rust supports two modes of panicking: unwinding and aborting. Aborting means exiting the process immediately and leaving the OS to clean up after you. It is in some ways the nuclear option, even considering you are already panicking. Unwinding, on the other hand, means that Rust walks back up the call stack and does its own cleanup by running the destructors of the values in each frame. And this is where it gets interesting because, as it turns out, we can catch an unwinding panic!
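
The cleanup part can be observed directly. The following sketch assumes the program is built with the unwinding mode and uses std::panic::catch_unwind (covered in the next section) only to keep the process alive. Guard and CLEANED_UP are names invented for this example.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to `true` once the guard's destructor has run.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

/// Hypothetical resource whose cleanup we want to observe.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Panics while a `Guard` is alive and reports whether its destructor ran.
/// With the aborting mode, the process would end inside the closure instead.
fn cleanup_ran_during_unwind() -> bool {
    CLEANED_UP.store(false, Ordering::SeqCst);
    let _ = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("unwinding past the guard");
    });
    CLEANED_UP.load(Ordering::SeqCst)
}
```

The destructor runs even though the closure never reaches its end, which is exactly the "own cleanup" that unwinding performs and aborting skips.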

To catch a panic

If you look in the std::panic module, you will find a number of interesting functions. For now, let's start with catch_unwind. If we have a panicking function, we can call it inside catch_unwind and turn the panic into a Result:

fn main() {
    let res = std::panic::catch_unwind(|| {
        panic!("Oh no!");
    });

    match res {
        Ok(()) => println!("Nothing happened"),
        Err(e) => println!("A panic occurred: {e:?}")
    }
}

If you run this code, you will see a panic message, but the process will still exit with a code of zero. The match expression at the end will inform you, rather unhelpfully, that the error was Any {..}. The payload is a Box<dyn Any + Send>, which is Rust's version of dynamic typing: if we know the specific type it is supposed to be, we can downcast it to get the information back. If we downcast the error to &str (by calling e.downcast::<&str>().unwrap()), it gets printed properly. Of course, this raises the question: Can we panic with other values? The answer is yes, using the panic_any function!

fn main() {
    let res = std::panic::catch_unwind(|| {
        std::panic::panic_any(42);
    });

    match res {
        Ok(()) => println!("Nothing happened"),
        Err(e) => {
            let e = e.downcast::<i32>().unwrap();
            println!("A panic occurred: {e:?}");
        }
    }
}

This version of our program will tell us that it panicked with an error of 42 - in fact, it doesn't have to be an error, just any 'static type that is Send (i.e. can be transferred between threads).
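
Since the payload can be almost anything, code that catches panics usually has to guess its type. Here is a rough sketch of a helper that tries the common cases: a &str (from panic! with a plain literal), a String (typically from a formatted panic! message) and, for the sake of our example, an i32. The helper names are invented for this illustration.

```rust
use std::any::Any;
use std::panic;

/// Best-effort conversion of a panic payload into text.
fn payload_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&'static str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else if let Some(n) = payload.downcast_ref::<i32>() {
        format!("i32 payload: {n}")
    } else {
        "unknown payload type".to_string()
    }
}

/// Runs `f` and returns the panic message if it panicked.
fn panic_message_of<F: FnOnce() + panic::UnwindSafe>(f: F) -> Option<String> {
    panic::catch_unwind(f)
        .err()
        .map(|payload| payload_message(&*payload))
}
```

Using downcast_ref instead of downcast means a wrong guess simply yields None, so we can try the next type instead of unwrapping and panicking again.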

The worst slice length function ever

To recap, we now know how to panic with an almost arbitrary value and how to catch this panic and its value. Can we build anything useful with this? Well, "useful" is a difficult word, but we could try to determine a slice's length by trying to index it until the operation panics. Let's start with a skeleton of our function:

fn inefficient_len<T>(data: &[T]) -> usize {
  let mut len = 0;
  loop {
    let _ = data[len];
    len += 1;
  }
  len
}

As you can see, inefficient_len takes a generic slice argument (since we don't care about the specific item type) and returns a usize - no surprises so far. We also have a counting variable and a loop in which we try to index into our slice as far as possible. Right now, this will just cause a panic without returning from our function. So, let's try to catch any panics during the indexing:

fn inefficient_len<T>(data: &[T]) -> usize {
  let mut len = 0;
  loop {
    let res = std::panic::catch_unwind(|| {
      let _ = data[len];
      len += 1;
    });
    if res.is_err() {break;}
  }
  len
}

We have moved our indexing and incrementing operation into a closure which we passed to catch_unwind and are checking the result - if it is an error, we break from the loop and return our count. This looks great, except that it doesn't compile.

error[E0277]: the type `&mut usize` may not be safely transferred across an unwind boundary
 --> src/lib.rs:4:40
  |
4 |       let res = std::panic::catch_unwind(|| {
  |                 ------------------------ ^-
  |                 |                        |
  |  _______________|________________________within this `[closure@src/lib.rs:4:40: 4:42]`
  | |               |
  | |               required by a bound introduced by this call
5 | |       let _ = data[len];
6 | |       len += 1;
7 | |     });
  | |_____^ `&mut usize` may not be safely transferred across an unwind boundary
  |
  = help: within `[closure@src/lib.rs:4:40: 4:42]`, the trait `UnwindSafe` is not implemented for `&mut usize`
  = note: `UnwindSafe` is implemented for `&usize`, but not for `&mut usize`
note: required because it's used within this closure
 --> src/lib.rs:4:40
  |
4 |     let res = std::panic::catch_unwind(|| {
  |                                        ^^
note: required by a bound in `catch_unwind`
 --> /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panic.rs:139:1

error[E0277]: the type `T` may contain interior mutability and a reference may not be safely transferrable across a catch_unwind boundary
 --> src/lib.rs:4:40
  |
4 |       let res = std::panic::catch_unwind(|| {
  |  _______________------------------------_^
  | |               |
  | |               required by a bound introduced by this call
5 | |       let _ = data[len];
6 | |       len += 1;
7 | |     });
  | |_____^ `T` may contain interior mutability and a reference may not be safely transferrable across a catch_unwind boundary
  |
  = note: required because it appears within the type `[T]`
  = note: required for `&[T]` to implement `UnwindSafe`
note: required because it's used within this closure
 --> src/lib.rs:4:40
  |
4 |     let res = std::panic::catch_unwind(|| {
  |                                        ^^
note: required by a bound in `catch_unwind`
 --> /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panic.rs:139:1
help: consider restricting type parameter `T`
  |
1 | fn inefficient_len<T: std::panic::RefUnwindSafe>(data: &[T]) -> usize {
  |                     +++++++++++++++++++++++++++

For more information about this error, try `rustc --explain E0277`.
error: could not compile `playground` due to 2 previous errors

This is once again a case of the Rust compiler having spectacular error messages. It is trying to protect us from accidentally violating invariants by transferring a value in an invalid state across a catch_unwind boundary, which can happen if we are holding a mutable reference or a type with interior mutability. It also tells us that we can solve the second problem by requiring our type T to be safe to transfer across such boundaries with the RefUnwindSafe trait.
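
To see what the compiler is worried about, consider this sketch. Totals is a made-up type with the invariant that sum always equals the total of items. The closure is wrapped in std::panic::AssertUnwindSafe, which tells the compiler to accept it anyway, and the caught panic then leaves a half-finished update behind.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical type with an invariant: `sum` equals the total of `items`.
struct Totals {
    items: Vec<u32>,
    sum: u32,
}

impl Totals {
    /// Updates in two steps and panics between them for odd values,
    /// which leaves `sum` out of date.
    fn push(&mut self, value: u32) {
        self.items.push(value);
        if value % 2 == 1 {
            panic!("odd values are not allowed");
        }
        self.sum += value;
    }

    fn invariant_holds(&self) -> bool {
        self.items.iter().sum::<u32>() == self.sum
    }
}

/// Returns whether the invariant still holds after a caught panic.
fn invariant_after_caught_panic() -> bool {
    let mut totals = Totals { items: Vec::new(), sum: 0 };
    totals.push(2);
    // The wrapper silences the compiler's objection to `&mut Totals`,
    // but it does not repair the interrupted update.
    let result = panic::catch_unwind(AssertUnwindSafe(|| totals.push(3)));
    assert!(result.is_err());
    totals.invariant_holds()
}
```

After the caught panic, items contains the 3 but sum does not, so any code that keeps using totals is working with inconsistent data. No memory safety is violated, which is why these traits are only a lint-like safeguard.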

However, we know more than the compiler in this case. The only operation that can panic is the read from our &[T] value, so we cannot leave our argument in an invalid state. The counter is fine too: the panic happens before len += 1 runs, so len still holds the number of elements we read successfully. In cases like this one, we can use the AssertUnwindSafe wrapper. By wrapping our closure in this type, we vouch that we are not carrying anything invalid across a catch_unwind boundary.

fn inefficient_len<T>(data: &[T]) -> usize {
  let mut len = 0;
  loop {
    let res = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
      let _ = data[len];
      len += 1;
    }));
    if res.is_err() {break;}
  }
  len
}

fn main() {
    let data = vec![1, 2, 3, 4];
    let len = inefficient_len(&data);
    println!("Length is {}", len);
}

For convenience, I added a short main function as well. If you run this code, you should see it correctly print that the length is 4. However, we also get the panic message which we might not want to see here. Well, we can get around this with a custom panic hook.

The documentation tells us that "the panic hook is invoked when a thread panics, but before the panic runtime is invoked". The default panic hook is what prints the panic message to the terminal. By replacing it, we can stop this behaviour:

fn inefficient_len<T>(data: &[T]) -> usize {
  std::panic::set_hook(Box::new(|_| {}));
  let mut len = 0;
  loop {
    let res = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
      let _ = data[len];
      len += 1;
    }));
    if res.is_err() {break;}
  }
  let _ = std::panic::take_hook();
  len
}

fn main() {
    let data = vec![1, 2, 3, 4];
    let len = inefficient_len(&data);
    println!("Length is {}", len);
}

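At the beginning of the function, we set an empty hook and at the end we call take_hook to reinstate the default hook. Keep in mind that the hook is global to the process and that this approach also throws away any custom hook that was installed before our function ran. Now, you should only see the message from our main function printed to the console.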
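
If you would rather leave the hook exactly as you found it, a slightly more careful variant could look like the sketch below. It relies on take_hook returning the currently installed hook, which we can pass back to set_hook afterwards. The function name is invented for this example.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Same idea as `inefficient_len`, but puts back whatever hook was installed before.
fn inefficient_len_restoring_hook<T>(data: &[T]) -> usize {
    // `take_hook` hands us the current hook, be it custom or the default one.
    let previous_hook = panic::take_hook();
    panic::set_hook(Box::new(|_| {}));

    let mut len = 0;
    loop {
        let res = panic::catch_unwind(AssertUnwindSafe(|| {
            let _ = &data[len]; // panics once `len` is out of bounds
            len += 1;
        }));
        if res.is_err() {
            break;
        }
    }

    // The hook is process-wide, so panics on other threads were silenced
    // as well while the empty hook was installed.
    panic::set_hook(previous_hook);
    len
}
```

This is still not something to use in real code, but at least it does not silently reset a hook that some other part of the program depends on.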

It works until it doesn't

There is one major issue with this code: It is absolutely not guaranteed to work, and it has no control over whether it will. As discussed earlier, there are two panic runtimes: the aborting and the unwinding one. Any code trying to catch a panic will only work with the unwinding runtime. If the program aborts on panic, there is nothing to catch. Furthermore, your code cannot choose which panic runtime is used, because that is determined at build time by whoever compiles the final binary. A crate can at most detect the setting at compile time with cfg(panic = "unwind") (stable since Rust 1.60). You can use the experimental always_abort function to guarantee that your program will abort on panic, but there is no way to force unwinding, so you can only hope that the unwinding runtime is used.

So, the important takeaway here is: Don't write code that relies on the assumption that it can catch a panic (if you can avoid it).
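
If you cannot avoid it, you can at least make the assumption visible. The sketch below assumes Rust 1.60 or newer, where the cfg(panic = "...") predicate exposes the panic strategy the crate is being compiled with. The helper names are invented for this example.

```rust
use std::panic::{self, UnwindSafe};

/// `true` if this crate was compiled with the unwinding panic strategy.
fn built_with_unwinding() -> bool {
    cfg!(panic = "unwind")
}

/// Runs `f` and returns `None` if it panicked and the panic could be caught.
/// With `panic = "abort"`, the process ends inside `f` instead.
fn try_run<R>(f: impl FnOnce() -> R + UnwindSafe) -> Option<R> {
    panic::catch_unwind(f).ok()
}
```

A library could check built_with_unwinding() before relying on try_run, or use #[cfg(not(panic = "unwind"))] together with compile_error! to refuse to build in the other configuration. This only detects the setting, it does not change it.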

You could, however, write your own panic runtime. Once again, you should not do this unless you need it, but it will give us an interesting view into the inner workings of the Rust runtime. To start, we'll need to go #![no_std] to disable the default panic runtime. The following examples rely on unstable features, so they need a nightly compiler.

#![no_std]
#![feature(start)]

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
    panic!("Oh no");
}

We also have to enable the start feature because we can't use the default main function anymore. If we try to compile this code, the Rust compiler will helpfully tell us that we are missing a panic handler - so let's build one! To do this, we will need to implement the panic_impl language item. We will also need the eh_personality language item which we can replace with an empty function.

#![no_std]
#![feature(
    lang_items,
    start,
    core_intrinsics,
    libc,
    rustc_private
)]

extern crate libc;

#[lang = "panic_impl"]
extern "C" fn rust_begin_panic(info: &core::panic::PanicInfo) -> ! {
    core::intrinsics::abort();
}

#[lang = "eh_personality"]
fn eh_personality() {}

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
    panic!("Oh no");
}

As you can see, our panic handler right now just uses the abort compiler intrinsic to kill the running process. We also needed to import the bundled libc crate because some function in core requires the libc memset function. It's a linker error, so figuring out what exactly went wrong here is not so easy. At this point, we can in fact panic our program, it just won't do anything interesting. How about we extract some information from that PanicInfo struct?

#![no_std]
#![feature(
    lang_items,
    start,
    core_intrinsics,
    libc,
    rustc_private,
    panic_info_message
)]

extern crate libc;

#[lang = "panic_impl"]
extern "C" fn rust_begin_panic(info: &core::panic::PanicInfo) -> ! {
    unsafe { libc::printf("Panicking now\n\0".as_ptr() as *const i8) };
    if let Some(location) = info.location() {
        // Rust strings are not NUL-terminated, so pass an explicit length via `%.*s`.
        unsafe {
            libc::printf(
                "Location: %.*s, %d:%d\n\0".as_ptr() as *const i8,
                location.file().len() as i32,
                location.file().as_ptr() as *const i8,
                location.line(),
                location.column(),
            )
        };
    }
    if let Some(&msg) = info.payload().downcast_ref::<&str>() {
        unsafe {
            libc::printf(
                "%.*s\n\0".as_ptr() as *const i8,
                msg.len() as i32,
                msg.as_ptr() as *const i8,
            )
        };
    } else if let Some(msg) = info.message() {
        // `as_str` only succeeds for messages without format arguments.
        let msg = msg.as_str().unwrap();
        unsafe {
            libc::printf(
                "%.*s\n\0".as_ptr() as *const i8,
                msg.len() as i32,
                msg.as_ptr() as *const i8,
            )
        };
    }
    core::intrinsics::abort();
}

#[lang = "eh_personality"]
fn eh_personality() {}

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
    panic!("Oh no");
}

Since we're operating in #![no_std] mode, we can't use the print! macro and have to resort to libc::printf instead. If you were doing this the proper way, you should be using CStr instead of how I am butchering my strings (in fact, it took me a moment to figure out I had to manually null-terminate them here), but this is the quickest way to get this working. With these changes, the program should panic and print the panic message as well as the location. This would also be the place to do any more panic handling like delegating to libunwind as the unwinding runtime does.

In practice, you will almost never need to do it this way. If you want to customize the panic message, you can set a hook as shown earlier. The only place I have actually needed to define my own panic handler this way was on an STM32 microcontroller.
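
For comparison, here is roughly what the same information extraction looks like with a regular hook on top of std. This is only a sketch: LAST_REPORT stands in for whatever you would really do with the report (a log file, a crash reporter), and both helper functions are invented for this example.

```rust
use std::panic;
use std::sync::Mutex;

/// Last report produced by the hook (a stand-in for a log or crash reporter).
static LAST_REPORT: Mutex<Option<String>> = Mutex::new(None);

/// Installs a hook that records message and location instead of printing them.
fn install_reporting_hook() {
    panic::set_hook(Box::new(|info| {
        let payload = info.payload();
        let message = if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "<non-string payload>".to_string()
        };
        let location = info
            .location()
            .map(|l| format!("{}:{}:{}", l.file(), l.line(), l.column()))
            .unwrap_or_else(|| "<unknown location>".to_string());
        *LAST_REPORT.lock().unwrap() = Some(format!("panic at {location}: {message}"));
    }));
}

/// Returns a copy of the most recent report, if any panic has been recorded.
fn last_report() -> Option<String> {
    LAST_REPORT.lock().unwrap().clone()
}
```

Compared to the #![no_std] version, there is no unsafe code, no manual string termination and no nightly compiler involved, which is why the hook is the better tool in almost every case.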

Don't panic

One of the issues you may have with panics is that they are "silent" - a function's signature doesn't tell you if it may panic. This is an issue if you have a scenario where your code should never panic, but there are ways around this. For example, you can use David Tolnay's crate no_panic. Let's walk through how to use it and how it works.

use no_panic::no_panic;

#[no_panic]
fn demo(s: &str) -> &str {
    &s[1..]
}

fn main() {
    println!("{}", demo("input string"));
}

This is the example from the crate's documentation. You just add the no_panic attribute to a function and the compiler will have to prove that it can't panic. Well, it's a bit more complicated. You can tell that the demo function could panic if given an empty string or one that contains multibyte characters in the wrong places. But since the only call to the function is made with a well-formed string, the compiler knows it's fine. If you replace the string with one that cannot be sliced at the first character, the program will fail to compile with a scary-looking linker error. How does the crate do this? Well, we can look at the output of cargo expand to see what the proc macro expands to!

#![feature(prelude_import)]
#![feature(type_name_of_val)]  
#[prelude_import]
use std::prelude::rust_2021::*;
#[macro_use]
extern crate std;
use no_panic::no_panic;        
#[inline]
fn demo(mut __arg0: &str) -> &str {
    struct __NoPanic;
    extern "C" {
        #[link_name = "\n\nERROR[no-panic]: detected panic in function `demo`\n"]
        fn trigger() -> !;
    }
    impl core::ops::Drop for __NoPanic {
        fn drop(&mut self) {
            unsafe {
                trigger();
            }
        }
    }
    let __guard = __NoPanic;
    let __result = (move || -> &str {
        let s = __arg0;
        &s[1..]
    })();
    core::mem::forget(__guard);
    __result
}
fn main() {
    {
        ::std::io::_print(format_args!("{0}\n", demo("input string")));
    };
}

As you can see, the macro made some major changes to the demo function: Our function itself now lives in a closure and a lot of code has appeared around it. First, the macro defines a new struct __NoPanic. This struct implements Drop and uses this implementation to call a C function trigger when it goes out of scope. The macro then creates an instance of this struct and calls our original function.

This is where it gets interesting. If the function runs through normally, the __guard instance will be forgotten (i.e. disposed of without calling its Drop implementation) and the result returned. But what would happen if the function panicked? Well, the panic runtime would be responsible for cleaning up the __guard value. And this is actually a trap: The trigger function doesn't exist. If our code goes down the "happy path", the Drop implementation will never be called and the nonexistent function doesn't matter. But if the compiler cannot rule out that the panic runtime may have to clean up this value, it will try to link the trigger function, not find it, and give a linker error.
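
The guard-and-forget pattern is useful beyond this trick, so here is a sketch of it at runtime. It is not what no_panic does: instead of referencing a missing extern function, the Drop implementation just counts, and PanicGuard, UNWOUND_CALLS and guarded are names invented for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how often the guard's destructor ran, i.e. how often a call unwound.
static UNWOUND_CALLS: AtomicUsize = AtomicUsize::new(0);

struct PanicGuard;

impl Drop for PanicGuard {
    fn drop(&mut self) {
        // `no_panic` calls its nonexistent `trigger` here; this sketch counts instead.
        UNWOUND_CALLS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Runs `f` under the guard. The guard is forgotten on the happy path,
/// so its destructor only runs while a panic unwinds through this frame.
fn guarded<R>(f: impl FnOnce() -> R) -> R {
    let guard = PanicGuard;
    let result = f();
    std::mem::forget(guard);
    result
}
```

On a normal return the counter stays untouched, and when f panics the unwinding runtime drops the guard and the counter goes up by one. The no_panic macro turns this runtime observation into a link-time one: if the optimizer can prove the drop is unreachable, the reference to trigger disappears, otherwise linking fails.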

This means that the error won't look very nice (linker errors never do), but it will give you a hint where a panic could happen. Pretty nifty!

Conclusions

I hope you learned something on this tour through the different ways to panic and deal with panics in Rust. In almost all cases, you will be fine treating a panic as an unrecoverable error, but if you need to recover from one, now you know how to do it! As an exercise, try thinking about how you might use the concepts introduced here to implement a testing framework that uses the assert! family of macros like the Rust builtin test harness. If you want to dive deeper, here are some more links that could be interesting:
