Write Your Own Fuzzer for the NetBSD Kernel
Note: This post was originally written in 2019 and has been lightly refreshed for clarity. The content and examples are unchanged — just the presentation got a coat of paint.
How Fuzzing Works
The easiest way to describe fuzzing is as unit testing with adversarial input. That input can be purely random, or generated to be something the program would never see during normal execution.
The simplest fuzzer is a few lines of shell: read N bytes from /dev/urandom,
pass them to the program, see what happens. If the program crashes, you found
something. This is dumb fuzzing — and it works, up to a point.
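The same idea fits in a few lines of C. This is a minimal sketch, not a real tool: `target_parse` is a hypothetical stand-in for the program under test, and `crashes` and `dumb_fuzz` are names made up for the example. Each input runs in a forked child so that a crash does not take the fuzzer down with it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical target: crashes when the input starts with "FUZ". */
static void
target_parse(const uint8_t *buf, size_t len)
{
	if (len >= 3 && buf[0] == 'F' && buf[1] == 'U' && buf[2] == 'Z')
		abort();
}

/*
 * Run the target on one input in a child process.
 * Returns 1 if the child died on a signal, 0 if it exited, -1 on error.
 */
static int
crashes(const uint8_t *buf, size_t len)
{
	int status = 0;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		target_parse(buf, len);
		_exit(0);
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFSIGNALED(status) ? 1 : 0;
}

/* Dumb fuzzing: throw `rounds` random inputs at the target, count crashes. */
static int
dumb_fuzz(unsigned int seed, int rounds)
{
	uint8_t buf[16];
	int found = 0;

	srand(seed);
	for (int i = 0; i < rounds; i++) {
		for (size_t j = 0; j < sizeof(buf); j++)
			buf[j] = (uint8_t)(rand() & 0xff);
		if (crashes(buf, sizeof(buf)) == 1)
			found++;
	}
	return found;
}
```

With a three-byte magic prefix, a random input has roughly a 1 in 16 million chance of reaching the crash, which is exactly the "up to a point" limit of dumb fuzzing.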
Coverage-Guided Fuzzing
Programs process different inputs differently: they take different code paths and run at different speeds. That difference is a signal. If you can observe which code paths a given input exercises, you can use that feedback to generate better inputs: ones that reach new paths rather than exercising the same ones again and again.
AFL (American Fuzzy Lop) made this practical. The program is compiled with branch-tracing instrumentation, so every conditional jump increments a counter. AFL builds a graph of execution paths from those counters and uses a genetic algorithm to mutate inputs toward unexplored paths. The important detail is that AFL does not need to understand the input format; it learns the format by observing which mutated inputs cause new branches to be taken.
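The feedback step can be sketched in a few lines. This is a simplified illustration, not AFL's actual code: `has_new_coverage` and `MAP_SIZE` are names chosen for the example, and real AFL also buckets hit counts rather than only checking for zero. The idea is that an input is worth keeping only if its coverage map touches a slot no earlier input touched.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 65536

/*
 * Compare one run's coverage map (`trace`) against everything seen so far
 * (`seen`). Returns 1 if the run hit at least one new slot, and records
 * the new slots in `seen` so the next run is judged against them.
 */
static int
has_new_coverage(uint8_t *seen, const uint8_t *trace)
{
	int is_new = 0;

	for (size_t i = 0; i < MAP_SIZE; i++) {
		if (trace[i] != 0 && seen[i] == 0) {
			seen[i] = 1;
			is_new = 1;
		}
	}
	return is_new;
}
```

A fuzzer loop built on this would mutate an input, run the target, and add the mutant to its queue only when the function returns 1.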
This technique does not need the source code at run time, but the target must be compiled with instrumentation. Kernel fuzzing adds another wrinkle: the coverage counters live inside the kernel, and something outside the kernel has to read them.
kcov(4)
NetBSD exposes kernel coverage data via kcov(4). When enabled, the kernel
records the program counter at every traced branch into a per-process buffer.
User space reads that buffer via mmap.
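Once the buffer is mapped, reading it is plain array access. The sketch below assumes the layout described in the kcov(4) manual page, where entry 0 holds the number of recorded PCs and the PCs follow it. The function name `kcov_count_in_range` and the address-range filter are hypothetical, added only to show how a consumer would walk the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count recorded PCs that fall inside [lo, hi).
 * Assumed layout: buf[0] = number of recorded PCs, buf[1..n] = the PCs.
 * `nentries` is the total number of entries in the mapping, used to clamp
 * the count in case buf[0] is larger than the buffer can hold.
 */
static size_t
kcov_count_in_range(const uint64_t *buf, size_t nentries,
    uint64_t lo, uint64_t hi)
{
	uint64_t n;
	size_t hits = 0;

	if (nentries == 0)
		return 0;
	n = buf[0];
	if (n > nentries - 1)
		n = nentries - 1;
	for (uint64_t i = 0; i < n; i++) {
		if (buf[i + 1] >= lo && buf[i + 1] < hi)
			hits++;
	}
	return hits;
}
```

Filtering by address range is one simple way to check whether a system call reached a particular subsystem, assuming you know where that subsystem's text lives.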
To see what compiler-injected instrumentation looks like, compile any C program with coverage tracing enabled:
$ gcc main.c -fsanitize-coverage=trace-pc
/usr/local/bin/ld: /tmp/ccIKK7Eo.o: in function `handler':
main.c:(.text+0xd): undefined reference to `__sanitizer_cov_trace_pc'
main.c:(.text+0x1b): undefined reference to `__sanitizer_cov_trace_pc'
The compiler inserted calls to __sanitizer_cov_trace_pc at every branch. The
linker complains because nothing provides that symbol. In the NetBSD kernel,
sys/kern/subr_kcov.c provides it — that is kcov(4).
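To make the link succeed in user space, something has to define that symbol. Here is a minimal sketch of such a stub; the buffer name, its size, and the choice to record the return address are assumptions for illustration, not what subr_kcov.c does. It should live in its own file compiled without the coverage flag, otherwise the callback would be instrumented too.

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_MAX 1024

/* Hypothetical in-process trace buffer. */
static uintptr_t trace_buf[TRACE_MAX];
static size_t trace_len;

/*
 * The symbol the linker was missing. The compiler-inserted call carries
 * no arguments, so the stub recovers the caller's location from its own
 * return address and appends it to the buffer until the buffer is full.
 */
void
__sanitizer_cov_trace_pc(void)
{
	if (trace_len < TRACE_MAX)
		trace_buf[trace_len++] =
		    (uintptr_t)__builtin_return_address(0);
}
```

Linking this object with the instrumented main.c should resolve the undefined reference, and after a run `trace_buf` holds one entry per traced call.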
Which Fuzzer?
AFL has become a de facto standard. Many major open-source projects run it continuously; the AFL website catalogs a long list of bugs found across browsers, image parsers, and network daemons. It is not the only option — Honggfuzz and Syzkaller both have active communities and find different classes of bugs — but it is a practical starting point for kernel fuzzing.
The fuzzing field is still empirical. There is no formal model for why one mutation strategy outperforms another on a given target. Most knowledge comes from comparative experiments, not proofs.
Making kcov Modular
One of the core problems with kernel coverage for fuzzing is that different
fuzzers want different things from the data. AFL wants a 64 KB shared memory
region organized as a hash map of (prev_PC, PC) pairs. Honggfuzz wants
something different. Syzkaller has its own requirements.
Keeping per-fuzzer logic inside kcov(4) does not scale — it leaves
fuzzer-specific code in a core kernel subsystem. Oracle tried to upstream AFL
support directly into the Linux kernel in 2016; the patches were rejected for
exactly this reason.
The NetBSD approach is to make kcov modular: the kernel provides raw coverage
data, and loadable modules implement the per-fuzzer data transformation. A module
registers an ops structure:
static struct kcov_ops kcov_mod_ops = {
	.open = kcov_afl_open,
	.free = kcov_afl_free,
	.setbufsize = kcov_afl_setbufsize,
	.enable = kcov_afl_enable,
	.disable = kcov_afl_disable,
	.mmap = kcov_afl_mmap,
	.cov_trace_pc = kcov_afl_cov_trace_pc,
	.cov_trace_cmp = kcov_afl_cov_trace_cmp,
};
At load time the module calls kcov_ops_set; at unload, kcov_ops_unset. While
the module is loaded, its ops replace the defaults. The current patch is on
GitHub.
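The swap itself is an ordinary function-pointer table. The following user-space model is only a sketch of that pattern: `struct cov_ops`, `cov_ops_set`, `cov_ops_unset`, and both trace hooks are hypothetical and reduced to a single callback, and the real kernel code also has to deal with locking and in-flight users, which this ignores.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical ops table: only the trace hook. */
struct cov_ops {
	void (*cov_trace_pc)(void *priv, intptr_t pc);
};

/* Default behaviour: remember the last PC seen. */
static void
default_trace_pc(void *priv, intptr_t pc)
{
	*(intptr_t *)priv = pc;
}

/* Behaviour supplied by a "module": count calls instead. */
static void
counting_trace_pc(void *priv, intptr_t pc)
{
	(void)pc;
	++*(intptr_t *)priv;
}

static const struct cov_ops default_ops = {
	.cov_trace_pc = default_trace_pc,
};
static const struct cov_ops counting_ops = {
	.cov_trace_pc = counting_trace_pc,
};

/* The table every trace call is dispatched through. */
static const struct cov_ops *active_ops = &default_ops;

/* Called at module load: the module's ops replace the defaults. */
static void
cov_ops_set(const struct cov_ops *ops)
{
	active_ops = ops;
}

/* Called at module unload: fall back to the defaults. */
static void
cov_ops_unset(void)
{
	active_ops = &default_ops;
}

/* Core entry point: it never needs to know which fuzzer is attached. */
static void
cov_trace(void *priv, intptr_t pc)
{
	active_ops->cov_trace_pc(priv, pc);
}
```

The point of the design is visible in `cov_trace`: the core dispatches blindly, so fuzzer-specific logic stays out of it.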
The AFL Module
The AFL fuzzer expects a 64 KB shared memory region where coverage is stored as
a hash map of transition counts. Each (prev_PC, curr_PC) pair maps to a byte
in that region; repeated transitions increment the byte.
The module tracks this state per thread:
typedef struct afl_ctx {
	uint8_t *afl_area;		/* 64 KB SHM region */
	struct uvm_object *afl_uobj;
	size_t afl_bsize;
	uint64_t afl_prev_loc;
	lwpid_t lid;			/* thread id: multiple threads, one fuzzer */
} kcov_afl_t;
The trace function translates each PC into the AFL map:
static void
kcov_afl_cov_trace_pc(void *priv, intptr_t pc)
{
	kcov_afl_t *afl = priv;

	++afl->afl_area[(afl->afl_prev_loc ^ pc) & (afl->afl_bsize - 1)];
	afl->afl_prev_loc = _long_hash64(pc, BITS_PER_LONG);
}
The ^ between prev_loc and pc encodes the transition, not just the
destination. The _long_hash64 on the previous location improves distribution
across the map — a technique from Quentin Casasnovas of Oracle.
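The map update is easy to try out in user space. In this sketch `mix64` is a generic multiplicative hash standing in for `_long_hash64`; it is not the kernel's function, and `struct afl_model`, `afl_index`, and `afl_trace_pc` are names invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define AFL_MAP_SIZE 65536u	/* 64 KB; must be a power of two */

struct afl_model {
	uint8_t area[AFL_MAP_SIZE];
	uint64_t prev_loc;
};

/* Stand-in for _long_hash64: a multiplicative hash, NOT the kernel's. */
static uint64_t
mix64(uint64_t x)
{
	return (x * 0x9E3779B97F4A7C15ULL) >> 16;
}

/* Slot for a transition: XOR the two locations, mask to the map size. */
static size_t
afl_index(uint64_t prev_loc, uint64_t pc)
{
	return (size_t)((prev_loc ^ pc) & (AFL_MAP_SIZE - 1));
}

/* Same two steps as the kernel trace function: bump the slot, hash the PC. */
static void
afl_trace_pc(struct afl_model *m, uint64_t pc)
{
	++m->area[afl_index(m->prev_loc, pc)];
	m->prev_loc = mix64(pc);
}
```

Transforming the previous location matters because XOR is symmetric: with raw PCs, the transition A to B and the transition B to A would always land in the same slot. Hashing one side makes those two transitions very likely to get distinct slots.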
The full implementation, including open, mmap, and enable, is in the
kcov_modules repository.
Debugging the Module
Kernel debugging with coverage enabled is awkward. The trace function is called
for every branch. Putting a printf inside kcov_afl_cov_trace_pc causes the
printf itself to trigger more traces, which call printf, which overflows the
stack.
The standard toolbox does not help much here. A remote kernel debugger works but hits thousands of breakpoints before reaching any condition of interest.
debugcon_printf by Kamil
Rytarowski sidesteps this: it writes to the x86 debug console port (0xe9)
rather than going through the kernel I/O path. On QEMU, that port can be
redirected to a file on the host:
-debugcon file:/tmp/qemu.debug.log -global isa-debugcon.iobase=0xe9
With that, you can trace every PC arriving at the module without causing a re-entrant trace:
static void
kcov_afl_cov_trace_pc(void *priv, intptr_t pc)
{
	kcov_afl_t *afl = priv;

	/* safe: bypasses kcov-traced I/O path; cast so the full PC is printed */
	debugcon_printf("#:%lx\n", (unsigned long)pc);
	++afl->afl_area[(afl->afl_prev_loc ^ pc) & (afl->afl_bsize - 1)];
	afl->afl_prev_loc = _long_hash64(pc, BITS_PER_LONG);
}
The result is a running log of every branch, written on the host machine outside the guest, with zero effect on the guest’s coverage counters. It is a narrow tool, but for this specific problem it is the right one.
What’s Next
The AFL module was built for the AFL FileSystems Fuzzing project. The next post covers the practical side: porting AFL to run against the NetBSD kernel, writing a mount wrapper, and what happens when you point this at FFS with a pre-seeded corpus.
Thanks to Kamil Rytarowski for the project idea and for the collaboration on NetBSD development.