stressing your system, a chicken at a time
A chicken gun is a large-diameter, compressed-air cannon used to fire dead chickens at aircraft components in order to simulate high-speed bird strikes during the aircraft's flight. (source: Wikipedia)
Here you can find cg, a tool aimed at providing very targeted load at specific parts of a machine.
cpu
Exercises the CPU time spent on userspace code by creating n threads that each keep running a busy loop indefinitely.
# run four threads with busyloops in them.
cg cpu --threads 4
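A rough shell equivalent of the same idea - plain background busy loops instead of threads, purely illustrative:

```shell
# spawn four background busy loops that burn userspace CPU time,
# similar in spirit to `cg cpu --threads 4` (processes, not threads)
for i in 1 2 3 4; do
  while :; do :; done &
done

# ...and stop them all again once done measuring
kill $(jobs -p)
```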
Once the scenario runs, we can look at CPU utilization metrics to verify that we're really exercising the CPUs, but first, let's see where we can gather that info from:
cat /proc/stat
cpu 15336 204 1036 1949794 774 0 133 0 0 0 # -- aggregate over all cpus
cpu0 5135 42 370 649932 248 0 21 0 0 0
cpu1 5106 162 315 649920 275 0 102 0 0 0
# | | | | | | | | | |
# | | | | | | | | | *guest_nice
# | | | | | | | | *guest
# | | | | | | | *steal
# | | | | | | *softirq
# | | | | | *irq
# | | | | *iowait
# | | | *idle
# | | *system
# | *nice
# *user
Each number counts the jiffies (1/100 s on x86, where the tick rate is 100 HZ) that the CPU spent in that mode since the system booted.
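To see those counters moving, we can sample the aggregate cpu line twice and diff the two samples; a minimal sketch (field order follows the diagram above):

```shell
# read user, nice, system, idle and iowait from the aggregate `cpu` line,
# wait a second, read again - the deltas are jiffies spent in each mode
read -r _ u1 n1 s1 i1 w1 _ < /proc/stat
sleep 1
read -r _ u2 n2 s2 i2 w2 _ < /proc/stat

echo "busy: $(( (u2 - u1) + (n2 - n1) + (s2 - s1) )) jiffies"
echo "idle: $(( (i2 - i1) + (w2 - w1) )) jiffies"
```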
metric | description |
---|---|
user | normal processes executing in user mode |
nice | niced processes executing in user mode |
system | processes executing in kernel mode |
idle | idle |
iowait | time during which a particular CPU was idle and there was at least one outstanding disk I/O operation requested by a task scheduled on that CPU (at the time it generated that I/O request) |
context-switches
In this scenario, threads constantly get their execution swapped in and out across n cores.
As a result, we end up with the kernel's migration/* processes doing a lot of work. For instance, here are the results of sampling a mostly idle system that only has cg context-switches running, for 30s:
# take samples of the whole callgraph 99 times a second for every
# cpu in the machine while running the `sleep` command.
#
# -F,--freq Profile at this frequency.
#
# -a,--all-cpus System-wide collection from all CPUs
# (default if no target is specified).
#
# -g Enables call-graph (stack chain/backtrace) recording.
#
perf record --freq 99 -a -g sleep 30
# `perf-script` reads perf.data (created by perf record) and displays
# trace output.
#
# With the traces generated by `perf script`, `stackcollapse` then
# collapses that multiline output of samples into semicolon-separated single
# lines, appropriate for `flamegraph.pl` to consume.
#
# From those collapsed stack traces, `flamegraph.pl` generates the
# `svg` with the flamegraph visualization.
perf script | \
stackcollapse-perf.pl | \
flamegraph.pl --hash --width=1000 > \
context-switches-flamegraph.svg
Now, looking at the number of context switches as reported by procfs, we can see how aggressively we're context switching:
cd /proc/$(cat /tmp/cg.pid)/task
find . -name "status" | xargs -n1 grep 'ctxt'
voluntary_ctxt_switches: 4
nonvoluntary_ctxt_switches: 1
voluntary_ctxt_switches: 214
nonvoluntary_ctxt_switches: 1590249
voluntary_ctxt_switches: 232
nonvoluntary_ctxt_switches: 1590307
voluntary_ctxt_switches: 240
nonvoluntary_ctxt_switches: 1590386
voluntary_ctxt_switches: 242
nonvoluntary_ctxt_switches: 1590412
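Since each thread reports its own counters, a small awk one-liner can aggregate them in one go (assuming cg's PID is still in /tmp/cg.pid, as before):

```shell
# sum the voluntary and non-voluntary context-switch counters
# across every thread (task) of the cg process
awk '/ctxt/ { sum[$1] += $2 }
     END { for (k in sum) print k, sum[k] }' \
  /proc/$(cat /tmp/cg.pid)/task/*/status
```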
If we're even more curious and want to know which CPUs the threads were on when they ran, we can look at a tailored output of perf script:
# filtering the system-wide samples, look at only those
# for the `cg` command, then output the corresponding `cpu`
# where each `tid` ran.
perf script --fields comm,cpu,tid | awk '/cg/{print $2, $3}'
Something interesting that happens when exercising context switches is that we can't see their overhead by looking only at user and system CPU utilization, even though 1 - idle reveals that our CPUs are busy with that activity.
pids
Creates n different processes under the same process group as the parent cg process.
cg pids -n 5
# check the process group
pstree -p $(cat /tmp/cg.pid)
cg(2016)─┬─exe(2017)
         ├─exe(2018)
         ├─exe(2019)
         ├─exe(2020)
         └─exe(2021)
Under the hood, cg pids creates child processes from its own image (/proc/self/exe), specifying the hidden cg sleep subcommand - one that just sleeps forever - as their command.
This has the effect of having several processes (not just threads) under the same process group as cg.
Despite the fact that Linux does not provide us with a single file containing the exact number of processes created, we can rely on what getdents(2) on /proc returns:
ls /proc/ | awk '/^[0-9]+$/'
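An alternative sketch for counting just cg's processes rather than listing every PID on the system - assuming cg is its process group's leader, so its PID doubles as the process-group ID:

```shell
# count the members of cg's process group
pgrep --pgroup "$(cat /tmp/cg.pid)" | wc -l
```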
files-open
By creating n files under a particular directory and keeping them open, this scenario can be used to verify, for instance, that per-process limits are really enforced.
For example:
# check out what the current limit for the current process is
cat /proc/$$/limits
Limit Soft Limit Hard Limit Units
...
Max resident set unlimited unlimited bytes
Max open files 1024 1048576 files
Max locked memory 16777216 16777216 bytes
...
# configure the current process to have a limit
# of 20 open files
ulimit -n 20
# verify that we indeed changed the limit for the current process
cat /proc/$$/limits
Limit Soft Limit Hard Limit Units
...
Max resident set unlimited unlimited bytes
Max open files 20 20 files
Max locked memory 16777216 16777216 bytes
...
# see that we can't go past that limit:
cg files-open -d /tmp -n 30
thread 'main' panicked at 'failed to create /tmp/17: Too many open files (os error 24)', src/fs.rs:18:25
To check the current number of open files, we can inspect the process' /proc/$pid/fd:
# create and open a number of open files that we're allowed to handle
cg files-open -d /tmp -n 10
ls /proc/$(cat /tmp/cg.pid)/fd | wc -l
13 # 10 files + stdin, stdout, and stderr.
tcp-transmitter and tcp-receiver
Respectively, sends/receives bytes from/to files as quickly as possible, using as little userspace time as possible (leveraging splice heavily).
# in one terminal
cg tcp-receiver -a 127.0.0.1:1337
# in another terminal
cg tcp-transmitter -a 127.0.0.1:1337
# in yet another terminal
sar -n DEV 1
21:23:14 IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
21:23:15 enp0s3 4.00 4.00 0.23 0.40 0.00 0.00 0.00 0.00
21:23:15 lo 240386.00 240386.00 5263101.63 5263101.63 0.00 0.00 0.00 0.00
21:23:15 enp0s8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
21:23:15 docker0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
top
...
%Cpu(s): 1.5 us, 31.5 sy, 0.0 ni, 60.0 id, 0.0 wa, ...
*------* *-----*
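sar and top ultimately read kernel counters; on the network side, the same numbers come from /proc/net/dev, which we can sample directly. A minimal sketch for the loopback receive-bytes counter:

```shell
# diff the loopback interface's receive-bytes counter over one second;
# the first numeric column after the interface name is rx bytes
rx1=$(awk -F'[: ]+' '{ sub(/^ +/, "") } $1 == "lo" { print $2 }' /proc/net/dev)
sleep 1
rx2=$(awk -F'[: ]+' '{ sub(/^ +/, "") } $1 == "lo" { print $2 }' /proc/net/dev)

echo "lo rx: $(( rx2 - rx1 )) B/s"
```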
Just like in a regular bare-metal or virtual machine, cg
can run in containerized environments too.
A container image can be found on DockerHub: cirocosta/chicken-gun.
docker run cirocosta/chicken-gun cpu --threads 4
MIT - See ./LICENSE.