Investigate kernel error call stacks
BSD-2-CLAUSE License
This is a big release with some major new features (we still stay within a minor version update, although there may be some minor breaking changes). Most notable changes:
* Function argument capture (-A argument). Retsnoop can now capture all input arguments for all traced functions and print them in human-readable form. See README for more details.
* Injected probes (-J). In addition to traced functions specified with the -e and -a flags, it's now possible to also specify single-point injected probes (kprobes, kretprobes, tracepoints, and raw tracepoints). Note that for a kprobe it's possible to specify an extra offset (e.g., -J kprobe:bprm_execve+12), which allows tracing inlined functions and function internals (normally retsnoop only traces function entry and exit). See README for more details.
* -A and -J can be used together. For kprobes and kretprobes the register state is captured; for tracepoints and raw tracepoints their actual arguments are captured. See README for more details.
* Function call trace mode (-T) is now separate from the default call stack mode. The latter is now controlled with the -E flag. The important distinction, and a breaking change, is that in function call trace mode the --success-stacks/-S option is implied, which makes the most sense for function call tracing. When -E is specified, even together with -T, retsnoop keeps the original behavior of tracing and emitting only erroring call stacks (i.e., those that end up returning an error from the entry functions specified with -e arguments). So, in short:
  * retsnoop -T emits all function call traces, both successful and erroring;
  * retsnoop -E (or just retsnoop, as -E is the default mode) emits only erroring call stacks (no function call traces);
  * retsnoop -E -S emits call stacks only (no function call traces), both erroring and successful ones;
  * retsnoop -E -T emits both call stacks and function call traces, but only erroring ones;
  * retsnoop -E -T -S emits both call stacks and function call traces for both successful and erroring cases.
* -C flag. See retsnoop --config-help for the list of supported options and more details.
* --help output.
Full Changelog: https://github.com/anakryiko/retsnoop/compare/v0.9.8...v0.10
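The -E/-T/-S combinations summarized above can be written down as a small truth table. The following is only an illustrative Python sketch, not retsnoop code: the function name and the returned tuple are hypothetical, and it assumes that -T on its own does not also print call stacks, which the notes above do not state explicitly.

```python
def emitted_output(E=False, T=False, S=False):
    """Return (call_stacks, call_traces, include_successful) for a flag set.

    Hypothetical helper modelling the summary in the release notes;
    it does not reflect retsnoop's real option handling code.
    """
    # With neither -E nor -T given, -E is the default mode.
    if not E and not T:
        E = True
    call_stacks = E
    call_traces = T
    # -T alone implies -S; once -E is present, successful cases need explicit -S.
    include_successful = S or (T and not E)
    return call_stacks, call_traces, include_successful


if __name__ == "__main__":
    for flags in ({"T": True}, {"E": True}, {"E": True, "S": True},
                  {"E": True, "T": True}, {"E": True, "T": True, "S": True}):
        print(flags, emitted_output(**flags))
```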
Published by github-actions[bot] 9 months ago
Full Changelog: https://github.com/anakryiko/retsnoop/compare/v0.9.7...v0.9.8
Published by anakryiko about 1 year ago
Full Changelog: https://github.com/anakryiko/retsnoop/compare/v0.9.6...v0.9.7
Published by anakryiko over 1 year ago
Full Changelog: https://github.com/anakryiko/retsnoop/compare/v0.9.5...v0.9.6
Published by anakryiko over 1 year ago
Massive improvements in how retsnoop determines whether kprobes are attachable:
* --debug multi-kprobe mode to bisect failing multi-kprobe attachment; it quickly narrows down and logs which kprobes were attempted but failed to be attached.
Overall, these fixes and improvements make retsnoop's mass-attach behavior more reliable.
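The idea of bisecting a failing batch attachment can be sketched in a few lines of Python. This is not retsnoop's implementation; `find_failing` and the `try_attach` callback are hypothetical names, and the sketch assumes that a batch fails exactly when it contains at least one function that cannot be attached.

```python
def find_failing(names, try_attach):
    """Return the names that cannot be attached, using batch attempts only.

    try_attach(batch) is a hypothetical callback that returns True when the
    whole batch attaches successfully and False otherwise.
    """
    if not names or try_attach(names):
        return []
    if len(names) == 1:
        # A single-element batch that fails pinpoints the culprit.
        return list(names)
    mid = len(names) // 2
    # Recurse into both halves; halves that attach cleanly cost one attempt.
    return (find_failing(names[:mid], try_attach)
            + find_failing(names[mid:], try_attach))


if __name__ == "__main__":
    bad = {"func_c", "func_f"}  # made-up names for the demo
    candidates = ["func_%s" % ch for ch in "abcdefgh"]
    print(find_failing(candidates, lambda batch: not (set(batch) & bad)))
```

When only a few functions are problematic, this needs far fewer attach attempts than trying every function individually.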
Full Changelog: https://github.com/anakryiko/retsnoop/compare/v0.9.4...v0.9.5
Published by anakryiko over 1 year ago
Published by anakryiko almost 2 years ago
retsnoop now supports DWARF-based symbolization (i.e., source code file/line info and inline functions) on KASLR-enabled Linux kernels.
Published by anakryiko about 2 years ago
Full Changelog: https://github.com/anakryiko/retsnoop/compare/v0.9.1...v0.9.2
Published by anakryiko about 2 years ago
$ sudo ./retsnoop -Vv
retsnoop v0.9.1
Feature detection:
BPF ringbuf map supported: yes
bpf_get_func_ip() supported: yes
bpf_get_branch_snapshot() supported: yes
BPF cookie supported: yes
multi-attach kprobe supported: yes
Feature calibration:
kretprobe IP offset: 4
fexit sleep fix: yes
fentry re-entry protection: yes
All just nice quality of life improvements. Enjoy!
Published by anakryiko about 2 years ago
Add function call trace output, in addition to default stack trace and LBR output.
Example:
$ sudo ./retsnoop -e '*sys_bpf' -v -n simfail -a ':kernel/bpf/syscall.c' -a ':kernel/bpf/verifier.c' -T
...
Receiving data...
15:50:12.413878 -> 15:50:12.414193 TID/PID 1755152/1755152 (simfail/simfail):
FUNCTION CALLS TRACE RESULT DURATION
------------------------------------- -------------------- ---------
→ __x64_sys_bpf
→ __sys_bpf
↔ bpf_check_uarg_tail_zero [0] 0.341us
→ bpf_raw_tracepoint_open
↔ __bpf_prog_get [0xffffc9000c93d000] 0.255us
→ bpf_tracing_prog_attach
↔ bpf_link_prime [0] 2.530us
↔ bpf_link_cleanup [void] 3.435us
← bpf_tracing_prog_attach [-ENOTSUPP] 306.161us
← bpf_raw_tracepoint_open [-ENOTSUPP] 310.147us
← __sys_bpf [-ENOTSUPP] 314.846us
← __x64_sys_bpf [-ENOTSUPP] 315.515us
entry_SYSCALL_64_after_hwframe+0x44 (arch/x86/entry/entry_64.S:112:0)
do_syscall_64+0x2d (arch/x86/entry/common.c:46:12)
315us [-ENOTSUPP] __x64_sys_bpf+0x1c (kernel/bpf/syscall.c:4749:1)
314us [-ENOTSUPP] __sys_bpf+0x867 (kernel/bpf/syscall.c:4689:9)
310us [-ENOTSUPP] bpf_raw_tracepoint_open+0x9a (kernel/bpf/syscall.c:3063:6)
! 306us [-ENOTSUPP] bpf_tracing_prog_attach
As you can see from the above, function call trace mode lets you peer into the exact control flow inside the kernel, filtered according to allow/deny lists and taking into account all the other filters (process name, latency, etc). In addition to the call sequence, function results and durations are emitted.
Note that leaf function calls (e.g., bpf_link_prime above) are collapsed if they don't call any other functions. This makes the call trace more readable and compact. Such calls are marked with ↔, while otherwise function entry is marked with → and function exit is marked with ←.
This mode perfectly augments stack trace output for deeper inspection of kernel internals, and is also great for discovering how kernel internals work in general.
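The leaf-collapsing rule described above can be illustrated with a short Python sketch. It is not taken from retsnoop: the event format (a list of ("entry" | "exit", function name) tuples) and the `render_trace` name are assumptions made for this example, and results and durations are left out.

```python
def render_trace(events):
    """Render entry/exit events as indented lines, collapsing leaf calls.

    events is a list of ("entry" | "exit", function_name) tuples; this
    format is hypothetical and chosen only for the illustration.
    """
    lines = []
    depth = 0
    i = 0
    while i < len(events):
        kind, func = events[i]
        if kind == "entry":
            nxt = events[i + 1] if i + 1 < len(events) else None
            if nxt == ("exit", func):
                # Entry immediately followed by its own exit: a leaf call.
                lines.append("    " * depth + "↔ " + func)
                i += 2
                continue
            lines.append("    " * depth + "→ " + func)
            depth += 1
        else:
            depth -= 1
            lines.append("    " * depth + "← " + func)
        i += 1
    return lines


if __name__ == "__main__":
    demo = [("entry", "__sys_bpf"),
            ("entry", "bpf_check_uarg_tail_zero"),
            ("exit", "bpf_check_uarg_tail_zero"),
            ("exit", "__sys_bpf")]
    print("\n".join(render_trace(demo)))
```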
Published by anakryiko over 2 years ago
General glob format is now <name-glob> [<module-glob>], where the module glob is optional. This makes it possible to, e.g., attach to all kprobes within some module: '* [fuse]' will attach to all the functions within the fuse module. Note that the module glob is also a glob, so one can capture multiple modules within one glob, e.g. -a '* [kvm*]' will capture functions defined in the kvm and kvm_intel modules.
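To make the <name-glob> [<module-glob>] format concrete, here is a minimal Python sketch that parses such a spec and matches it against a function/module pair. It uses Python's fnmatch for globbing, which may differ from retsnoop's own matcher; `parse_spec` and `matches` are hypothetical names, and the sketch assumes that a spec without a module glob matches functions regardless of module.

```python
import re
from fnmatch import fnmatchcase

# "<name-glob>" optionally followed by whitespace and "[<module-glob>]".
_SPEC = re.compile(r"^\s*(?P<name>\S+)(?:\s+\[(?P<module>[^\]]+)\])?\s*$")


def parse_spec(spec):
    """Split a spec into (name_glob, module_glob); module_glob may be None."""
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError("unrecognized glob spec: %r" % spec)
    return m.group("name"), m.group("module")


def matches(spec, func, module=None):
    """Check whether func (defined in module, or None for vmlinux) matches."""
    name_glob, module_glob = parse_spec(spec)
    if not fnmatchcase(func, name_glob):
        return False
    if module_glob is None:
        # Assumption: no module glob means "any module or the core kernel".
        return True
    return module is not None and fnmatchcase(module, module_glob)


if __name__ == "__main__":
    print(matches("* [kvm*]", "vmx_vcpu_run", "kvm_intel"))
    print(matches("* [fuse]", "vmx_vcpu_run", "kvm_intel"))
```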
Published by anakryiko over 2 years ago
A few more usability improvements:
* any_return, instead of more detailed any;
Published by anakryiko over 2 years ago
Published by anakryiko over 2 years ago
* --lbr=any_return will capture only returns from functions, which lets you see further into an unknown sequence of kernel function calls. This is very useful when trying to discover what's going on without knowing the particular area of the kernel you are trying to debug. By default retsnoop is effectively using --lbr=any.
* --lbr-max-count N was added to limit the number of last useful LBR records. It's not always necessary to see all 32 of them; the last 5 or so might be more than enough.
With all the above changes, here's an example of one captured error with LBR stack traces included. Retsnoop is run as:
$ sudo ./retsnoop -e '*sys_bpf' -a ':kernel/bpf/syscall.c' -n simfail --lbr=any_return --lbr-max-count=5
Failure is simulated with simfail:
$ sudo ./simfail bpf-bad-map-lookup-value
And here's the result:
09:24:54.846 PID 336615 (simfail):
entry_SYSCALL_64_after_hwframe+0x44 (arch/x86/entry/entry_64.S:112:0)
do_syscall_64+0x2d (arch/x86/entry/common.c:46:12)
34us [-ENOENT] __x64_sys_bpf+0x1c (kernel/bpf/syscall.c:4749:1)
27us [-ENOENT] __sys_bpf+0x1a42 (kernel/bpf/syscall.c:4632:9)
. map_lookup_elem (kernel/bpf/syscall.c:1113:5)
! 7us [-ENOENT] bpf_map_copy_value
[#07] migrate_disable+0x3c (kernel/sched/core.c:1755:1) -> bpf_map_copy_value+0x31 (kernel/bpf/syscall.c:241:2)
[#07] . bpf_disable_instrumentation (include/linux/bpf.h:1453:2)
[#06] array_map_lookup_elem+0x24 (kernel/bpf/arraymap.c:168:1) -> bpf_map_copy_value+0x1ed (kernel/bpf/syscall.c:269:10)
[#05] rcu_read_unlock_strict+0x5 (kernel/rcu/tree_plugin.h:797:1) -> bpf_map_copy_value+0x18c (include/linux/rcupdate.h:724:2)
[#04] migrate_enable+0x59 (kernel/sched/core.c:1783:1) -> bpf_map_copy_value+0x9e (kernel/bpf/syscall.c:288:2)
[#04] . maybe_wait_bpf_programs (kernel/bpf/syscall.c:170:49)
[#03] bpf_map_copy_value+0xba (kernel/bpf/syscall.c:291:1) -> __kretprobe_trampoline+0x0
Published by anakryiko over 2 years ago
Two major features:
* Use -x ENOMEM to report stacks that return -ENOMEM. Use -X ENOMEM to skip stacks that return -ENOMEM. NULL is an error, so -x NULL and -X NULL are also supported. You can combine multiple -x and -X options. -X takes precedence (i.e., if some error is disabled with -X, enabling it with -x won't help).
Published by anakryiko almost 3 years ago
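The -x/-X precedence rule from the release notes above can be expressed as a tiny predicate. This Python sketch is illustrative only: `should_report` is a hypothetical name, errors are modelled as plain strings such as "ENOMEM" or "NULL", and it assumes that when no -x option is given every error not excluded by -X is reported.

```python
def should_report(err, allow=(), deny=()):
    """Decide whether a stack ending in err should be reported.

    allow models the errors passed with -x, deny those passed with -X.
    """
    # -X takes precedence: a denied error is never reported.
    if err in deny:
        return False
    # If any -x was given, only the listed errors are reported.
    if allow:
        return err in allow
    # Assumption: without -x, all remaining errors are reported.
    return True


if __name__ == "__main__":
    print(should_report("ENOMEM", allow={"ENOMEM"}))
    print(should_report("ENOMEM", allow={"ENOMEM"}, deny={"ENOMEM"}))
```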
Lots of quality of life improvements:
* -e, -a and -d: ':fs/btrfs/*.c'.
* -F argument.
* -ss. If the vmlinux image can't be located, retsnoop falls back to -s none (-sn), meaning no extra symbolization beyond using /proc/kallsyms.
* Dry-run mode (--dry-run), which will do everything but load and attach BPF programs. Very useful to figure out what retsnoop will try to trace without risking affecting the system.
* -V (--version) now prints retsnoop version.
Published by anakryiko almost 3 years ago
Fixes potential issues with LBR perf event by using hardware event. No other changes compared to v0.5.
Published by anakryiko about 3 years ago
When the kernel supports capturing LBR entries from a BPF kprobe/fexit function,
retsnoop will capture such LBR records and emit the relevant ones after the captured stack trace.
This allows tracing back inside the last failed/traced function, including logic inside
inlined functions, and so shows where exactly inside a potentially large function
the error happened. Use the --lbr flag to enable this feature. If the kernel doesn't support
this feature, retsnoop will report this with a warning, visible in verbose mode (-v).
Relevant kernel feature was added by Song Liu in
Linux kernel commit 856c02dbce4f ("bpf: Introduce helper bpf_get_branch_snapshot").
Published by anakryiko about 3 years ago
Force line-oriented output in stdout.
Published by anakryiko over 3 years ago