This Monday we sat down after work at my company to improve our debugging skills, as we have done a few times before. I suggested we have a look at rr. Among a few other examples, we produced one Rust program which crashes rather randomly:
    use chrono::prelude::*;
    use rand::*;

    fn crashy(should: bool) {
        if should && rand::random() {
            panic!("This is a crash!");
        }
    }

    fn main() {
        let mut counter = 0;

        loop {
            counter += 1;
            let timestamp = Local::now();
            crashy(timestamp.second() % 2 == 0);
        }

        println!("All done");
    }
Using rr, we can record a session of this program, and wait until it crashes.
    mikael@chronos:~/code/rr-playground/in-rust$ rr record ./target/debug/in-rust
    rr: Saving execution to trace directory `/home/mikael/.local/share/rr/in-rust-1'.
    thread 'main' panicked at 'This is a crash!', src/main.rs:6:9
    note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
Once there, we can replay what we recorded, and see what happens.
    mikael@chronos:~/code/rr-playground/in-rust$ rr replay
    GNU gdb (Ubuntu 8.2-0ubuntu1) 8.2
    Copyright (C) 2018 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
        <http://www.gnu.org/software/gdb/documentation/>.

    --Type <RET> for more, q to quit, c to continue without paging--
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /home/mikael/.local/share/rr/in-rust-1/mmap_hardlink_3_in-rust...done.
    warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
    of file /home/mikael/.local/share/rr/in-rust-1/mmap_hardlink_3_in-rust.
    Use `info auto-load python-scripts [REGEXP]' to list them.
    Really redefine built-in command "restart"? (y or n) [answered Y; input not from terminal]
    Remote debugging using 127.0.0.1:5145
    Reading symbols from /lib64/ld-linux-x86-64.so.2...
    Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.28.so...done.
    done.
    0x00007fd0cdf0b090 in _start () from /lib64/ld-linux-x86-64.so.2
    (rr)
From this debug prompt, we can type c (for continue), and wait for the crash.
    (rr) c
    Continuing.
    thread 'main' panicked at 'This is a crash!', src/main.rs:6:9
    note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

    Program received signal SIGKILL, Killed.
    0x0000000070000002 in ?? ()
Now here starts the magic; in a normal run, all we would be able to see is that the backtrace is empty, because the program has properly crashed.
    (rr) bt
    #0  0x0000000070000002 in ?? ()
    Backtrace stopped: Cannot access memory at address 0x681ffe10
But with rr, we can set a breakpoint and run the program in reverse.
    (rr) b 5
    Breakpoint 1 at 0x5592fed2559a: file src/main.rs, line 5.
    (rr) reverse-cont
    Continuing.

    Breakpoint 1, in_rust::crashy (should=true) at src/main.rs:5
    5           if should && rand::random() {
We've now run our program in reverse and stopped at a sensible point. This is cheating a little, because we already knew where the crash happened, but that's beside the point. From here, we can actually see a proper backtrace.
    (rr) bt
    #0  in_rust::crashy (should=true) at src/main.rs:5
    #1  0x00005592fed25668 in in_rust::main () at src/main.rs:16
We can now go to the main frame and inspect the values there. We could of course also use reverse-step and/or reverse-next to single-step backwards, instead of jumping directly to the frame.
    (rr) f 1
    #1  0x00005592fed25668 in in_rust::main () at src/main.rs:16
    16              crashy(timestamp.second() + counter % 2 == 0);
    (rr) info locals
    timestamp = [snipped]
    counter = 18902396
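As an aside, the condition the debugger prints for line 16 mixes three operators and is easy to misread. Here is a minimal std-only sketch of how Rust parses it. The function name should_crash is made up for illustration, and both operands are assumed to be u32 (the type chrono's second() returns):

```rust
/// Hypothetical helper mirroring the condition shown for src/main.rs:16:
/// `timestamp.second() + counter % 2 == 0`.
/// Assumption: both values are u32.
fn should_crash(second: u32, counter: u32) -> bool {
    // `%` binds tighter than `+`, and `+` binds tighter than `==`,
    // so this parses as `(second + (counter % 2)) == 0`.
    second + counter % 2 == 0
}

fn main() {
    // Second 0 with the even counter value from the replay: condition holds.
    println!("{}", should_crash(0, 18902396));
    // Second 0 with an odd counter: 0 + 1 != 0.
    println!("{}", should_crash(0, 3));
    // Any other second can never satisfy the condition.
    println!("{}", should_crash(30, 18902396));
}
```

Read this way, the argument passed to crashy is true only when the second is 0 and the counter is even, which fits the should=true and the even counter value seen in the replay.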
And there we have the value of counter, which will be the same every time we run this replay. This is an incredibly powerful technique: we get reproducible executions, can step forwards and backwards, and can look at whatever variables we want.
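To get a feel for why a replay always reaches the same state, here is a toy model of the record-and-replay idea in plain Rust. It is only a sketch and not how rr is implemented: the Source type, the run function and the "crash" rule (a value divisible by 1000) are all made up for illustration. The point is that once every nondeterministic input has been logged, feeding the log back in makes the run fully deterministic.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical source of nondeterministic values: either generated live
/// and logged (Record), or read back from an earlier log (Replay).
enum Source {
    Record { state: u64, log: Vec<u64> },
    Replay { log: Vec<u64>, pos: usize },
}

impl Source {
    fn next(&mut self) -> Option<u64> {
        match self {
            Source::Record { state, log } => {
                // One xorshift step stands in for "something unpredictable".
                *state ^= *state << 13;
                *state ^= *state >> 7;
                *state ^= *state << 17;
                log.push(*state);
                Some(*state)
            }
            Source::Replay { log, pos } => {
                // Replay hands back exactly what was logged, in order.
                let value = log.get(*pos).copied();
                *pos += 1;
                value
            }
        }
    }

    /// Consume the source and return its log.
    fn into_log(self) -> Vec<u64> {
        match self {
            Source::Record { log, .. } => log,
            Source::Replay { log, .. } => log,
        }
    }
}

/// Stand-in for the crashing program: counts iterations until an input
/// divisible by 1000 shows up, then returns the counter at the "crash".
fn run(source: &mut Source) -> u64 {
    let mut counter = 0u64;
    while let Some(value) = source.next() {
        counter += 1;
        if value % 1000 == 0 {
            break;
        }
    }
    counter
}

fn main() {
    // Seed from the clock so each recording differs; `| 1` keeps it nonzero.
    let seed = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos() as u64)
        .unwrap_or(1)
        | 1;

    let mut recording = Source::Record { state: seed, log: Vec::new() };
    let recorded = run(&mut recording);
    let log = recording.into_log();

    // Every replay of the same log stops at the same counter value.
    for _ in 0..3 {
        let mut replay = Source::Replay { log: log.clone(), pos: 0 };
        assert_eq!(run(&mut replay), recorded);
    }
    println!("counter at crash: {} (identical in every replay)", recorded);
}
```

The counter differs between recordings because the seed does, but it never differs between replays of the same recording, which mirrors what we saw with counter in the rr session.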