Skip to content

Remove bpfilter daemon and consolidate every feature in libbpfilter#468

Closed
qdeslandes wants to merge 6 commits intomainfrom
daemon-less
Closed

Remove bpfilter daemon and consolidate every feature in libbpfilter#468
qdeslandes wants to merge 6 commits intomainfrom
daemon-less

Conversation

@qdeslandes
Copy link
Contributor

  • Eliminate the bpfilter daemon process; all filtering logic now lives in libbpfilter and is called directly by bfcli
  • Move the code generation engine (cgen/), BPF stubs (bpf/), context management (ctx.c), and ELF stub handling from src/bpfilter/ into src/libbpfilter/
  • Remove the IPC layer (request.c, response.c, io.c, ns.c) and the daemon entry point, systemd service, and socket
  • Rewrite bfcli to initialise the library context on startup, call libbpfilter APIs directly, and tear down on exit
  • Adapt all e2e tests to run without a daemon: helpers no longer start/stop a background process, and daemon-specific tests are replaced by equivalent persistence and namespace tests
  • Scrub every remaining "daemon" reference from comments, docs, and user-facing strings (external kernel headers excluded)

As it is now, this PR can be merged into main, but more polishing will be done before tagging a new release.

THIS PR IS A DUPLICATE TO TRY TO TRIGGER CLAUDE CODE REVIEW

The cgen/, bpf/, ctx, and xlate modules were compiled directly into the
bpfilter binary. Move them into libbpfilter so the filtering logic lives
entirely in the shared library and the daemon is a thin entry point.
Replace the daemon IPC path in libbpfilter/cli.c with direct calls to
the cgen and ctx APIs, allowing bfcli to load, attach, and manage chains
without a running daemon.

Move ctx and elfstub headers to the public libbpfilter API, pull BTF
lifetime into bf_ctx_setup/teardown, remove namespace switching and
transient mode, and extend bfcli's stage-1 parser with global flags
(--verbose, --bpffs-path, --with-bpf-token, deprecated compat options).
Replace daemon e2e tests with namespace and persistence tests that
exercise bpffs-based chain discovery directly, without starting or
stopping the daemon process.

Remove the daemon lifecycle (start_bpfilter/stop_bpfilter) from the
test harness and propagate --bpffs-path through the ${BFCLI} variable
so all test invocations use the sandbox-local bpffs mount.

Skip bf_ctx_setup in bfcli when --dry-run is used so matcher tests
no longer require bpffs access. Fix clang-analyzer BitwiseShift
warning in --verbose option parsing.

Drop namespace unit tests and request namespace accessors that are no
longer part of the public API.
The daemon has been removed in favour of direct library calls. Update
comments, documentation, and user-facing strings that still referred to
"the daemon" so the terminology is consistent with the new architecture.

External kernel headers (src/external/) are left unchanged.
Lock bpfilter directory in bpffs to prevent concurrent access to the
chains and BPF objects.
`make test` runs all the tests, including fuzzing the bfcli parser for
60 seconds (`make fuzzing` target). This operation takes more time than
all the other tests combined which providing little value (fuzzing for a
few hours could be more interesting).

Disable fuzzing in the `make test` target.
@meta-cla meta-cla bot added the cla signed label Mar 10, 2026
return bf_err_r(r, "failed to create new context");

_bf_global_ctx = TAKE_PTR(_ctx);
_bf_global_ctx = _ctx;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix — Double-free and dangling _bf_global_ctx on error paths in bf_ctx_setup.

_bf_global_ctx = _ctx is assigned here, before the fallible bf_ctx_get_pindir_fd(), flock(), and _bf_ctx_discover() calls added in commit 5. Two problems result:

  1. Early returns (pindir_fd failure at line ~397, flock failure at ~401): when the function returns, the _free_bf_ctx_ cleanup on _ctx fires and frees the context object, but _bf_global_ctx still points to the freed memory. Any subsequent access to the global context (e.g. via bf_ctx_teardown() or signal handlers) would be a use-after-free.

  2. _bf_ctx_discover failure (line ~405): _bf_ctx_free(&_bf_global_ctx) frees the object and nulls _bf_global_ctx, but _ctx still holds the (now-dangling) pointer. When _ctx goes out of scope, its cleanup handler sees a non-NULL *_ctx, accesses freed memory, and double-frees.

The original daemon code avoided this by doing _bf_global_ctx = TAKE_PTR(_ctx) as the last fallible step. Consider restructuring so _bf_global_ctx is only assigned after all fallible operations succeed:

Suggested change
_bf_global_ctx = _ctx;
_bf_global_ctx = TAKE_PTR(_ctx);

Then move TAKE_PTR(_ctx) at line ~411 to here, and use _bf_global_ctx directly in the remaining code (which you already do for _bf_global_ctx->lock_fd). On the discover failure path, set _bf_global_ctx = NULL after _bf_ctx_free.

#include "cgen/prog/link.h"
#include "cgen/prog/map.h"

static int copy_hookopts(struct bf_hookopts **dest,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix — Static functions copy_hookopts and copy_set (line ~377) should use the _bf_ prefix per the naming convention documented in doc/developers/style.rst: "Static functions and variables use leading underscore: _bf_ctx_free()".

Every other static function across src/libbpfilter/ follows this pattern (e.g. _bf_ctx_new, _bf_ctx_free, _bf_cgen_persist, _bf_cgen_get_chain_pindir_fd).

Suggested change
static int copy_hookopts(struct bf_hookopts **dest,
static int _bf_copy_hookopts(struct bf_hookopts **dest,

return r;

r = bf_send(fd, request, &response, NULL);
bf_rpack_node_t child;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix — Mid-block variable declaration. The codebase consistently declares all variables at the top of the function block (C89 style). Every other use of bf_rpack_node_t in the codebase (hook.c, cgen.c, link.c, pack.c) appears in the initial declarations block. Move this declaration to the top of _bf_copy_set, alongside wpack, rpack, data, data_len, and r.

r = bf_cgen_update(cgen, &chain_copy, 0);
if (r)
return bf_err_r(r, "bf_chain_attach: failed to send request");
return -EINVAL;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestionbf_cgen_update returns specific negative errno values (e.g. -EACCES, -ENOMEM, -E2BIG), but this line unconditionally replaces the error with -EINVAL, discarding diagnostic information.

The companion function bf_chain_update_set (line ~458) correctly propagates the actual error via bf_err_r(r, "failed to update chain with new set data"). Consider the same pattern here for consistency:

    if (r)
        return bf_err_r(r, "failed to update chain '%s'", chain->name);

/// BPF token file descriptor
int token_fd;

int lock_fd;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit — The new lock_fd field lacks a /// Doxygen comment, while every other field in this struct has one (token_fd, with_bpf_token, bpffs_path, verbose). Consider adding something like:

    /// Lock file descriptor holding LOCK_EX on the bpffs pin directory.
    int lock_fd;

@github-actions
Copy link

Claude Code Review — PR #468

PR: Remove bpfilter daemon and consolidate every feature in libbpfilter
HEAD: ed7cc064

Summary

Severity Count
must-fix 3
suggestion 3
nit 1

Inline comments posted

# File Severity Title
1 src/libbpfilter/ctx.c:393 must-fix Double-free / dangling _bf_global_ctx on error paths in bf_ctx_setup
2 src/libbpfilter/cli.c:27 must-fix Static functions copy_hookopts and copy_set missing _bf_ prefix
3 src/libbpfilter/cli.c:403 must-fix Mid-block variable declaration in copy_set
4 src/libbpfilter/cli.c:372 suggestion bf_chain_update masks error code from bf_cgen_update
5 src/libbpfilter/ctx.c:42 nit Missing Doxygen comment on lock_fd field

Additional findings (not inline)

suggestionsrc/libbpfilter/include/bpfilter/bpfilter.h: The public API functions still contain "Request payload format" and "Response payload format" Doxygen sections (24 occurrences across 12 functions) describing the now-deleted daemon IPC protocol. Since these functions are now direct library calls, the framing is misleading. Also, struct bf_response is forward-declared but response.h was removed in this PR.

suggestion — Commit 5 (31553a8): typo in headline — "bffps" should be "bpffs". The commit body uses the correct spelling.

suggestion — Commit 2 (af1b97b): component prefix daemon: does not match actual changes. 20 of 25 changed files are in src/libbpfilter/, 3 in src/bfcli/, and only 1 in src/bpfilter/. A prefix like lib,cli: would be more accurate per project guidelines.

Comment on lines +7 to +14
---------------------

``bfcli`` reads the following environment variables to configure the library before processing commands. These variables are ignored when ``--dry-run`` is used.

- ``BF_BPFFS_PATH``: path to the BPF filesystem directory. Defaults to ``/sys/fs/bpf``. bpfilter pins BPF objects under a ``bpfilter`` subdirectory at this path.
- ``BF_WITH_BPF_TOKEN``: if set (any value), associate a BPF token to every ``bpf()`` system call. Required when running in user namespaces. The token is created from the bpffs at ``BF_BPFFS_PATH``. Only supported for kernel v6.9+.
- ``BF_VERBOSE``: comma-separated list of verbose flags. Supported values:

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The documentation at lines 7-14 claims bfcli reads BF_BPFFS_PATH, BF_WITH_BPF_TOKEN, and BF_VERBOSE environment variables, but no getenv() calls exist anywhere in bfcli or libbpfilter. Users following this documentation would set environment variables that are silently ignored, with bfcli using default values instead. Either implement getenv() support or remove the environment variables section from the docs.

Extended reasoning...

Bug Description

The newly added "Environment variables" section in doc/usage/bfcli.rst (lines 7-14) documents three environment variables that bfcli supposedly reads:

  • BF_BPFFS_PATH: path to the BPF filesystem directory
  • BF_WITH_BPF_TOKEN: whether to associate a BPF token
  • BF_VERBOSE: comma-separated list of verbose flags

However, there are zero getenv() calls anywhere in src/bfcli/ or src/libbpfilter/. The only getenv calls in the entire project are in tools/perfrec and tools/benchmarks (for SUDO_UID/SUDO_GID), which are unrelated.

Code Path Analysis

bfcli parses its configuration exclusively via command-line flags using argp. In src/bfcli/opts.c, the flags --bpffs-path (key b), --with-bpf-token (key t), and --verbose (key v) are defined and parsed in _bfc_opts_parser(). The parsed values are stored in struct bfc_opts fields (bpffs_path, with_bpf_token, verbose). In src/bfcli/main.c, these opts are passed directly to bf_ctx_setup(opts.with_bpf_token, opts.bpffs_path, opts.verbose). At no point does any code call getenv() to read environment variables.

Why Tests Don’t Catch This

The e2e test helper (tests/e2e/e2e_test_util.sh) both exports the environment variable AND passes the CLI flag:

export BF_BPFFS_PATH=${WORKDIR}/bpf
BFCLI="bfcli --bpffs-path ${BF_BPFFS_PATH}"

Since ${BFCLI} always includes --bpffs-path, the CLI flag provides the correct value. The export is a no-op from bfcli’s perspective. This masks the fact that environment variable support is missing.

Step-by-Step Proof

  1. A user reads the documentation and runs: export BF_BPFFS_PATH=/my/custom/bpffs && bfcli ruleset get
  2. bfcli enters main() in src/bfcli/main.c, calls bfc_opts_parse() which initializes opts.bpffs_path = "/sys/fs/bpf" (the default from src/bfcli/opts.h line 107).
  3. Since no --bpffs-path flag was passed on the command line, opts.bpffs_path remains at the default /sys/fs/bpf.
  4. bf_ctx_setup() is called with bpffs_path = "/sys/fs/bpf", completely ignoring the user’s exported BF_BPFFS_PATH=/my/custom/bpffs.
  5. The user’s intended configuration is silently discarded.

Impact

Users following the official documentation will believe they can configure bfcli via environment variables. When they do, their settings are silently ignored, leading to misconfiguration. This is particularly dangerous for BF_BPFFS_PATH since using the wrong bpffs path could cause BPF programs to be pinned in the wrong location, and for BF_WITH_BPF_TOKEN which is required in user namespace scenarios.

Fix

Either:

  1. Add getenv() calls in bfcli (e.g., in bfc_opts_parse or main) to read these environment variables as fallback defaults when the corresponding CLI flags are not provided, or
  2. Remove the "Environment variables" section from doc/usage/bfcli.rst entirely until the feature is actually implemented.

Comment on lines +393 to +401
_bf_global_ctx = _ctx;

if (!bf_ctx_is_transient()) {
r = _bf_ctx_discover();
if (r) {
_bf_ctx_free(&_bf_global_ctx);
return bf_err_r(r, "failed to discover chains");
}
pindir_fd = bf_ctx_get_pindir_fd();
if (pindir_fd < 0)
return bf_err_r(pindir_fd, "failed to get pin directory FD");

r = flock(pindir_fd, LOCK_EX | LOCK_NB);
if (r)
return bf_err_r(-errno, "failed to lock pin directory");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: bf_ctx_setup() leaves _bf_global_ctx as a dangling pointer when bf_ctx_get_pindir_fd() (line 395) or flock() (line 399) fails. The _free_bf_ctx_ cleanup on _ctx frees the memory on early return, but _bf_global_ctx still holds the stale pointer, leading to use-after-free if any subsequent code dereferences it. The fix pattern is already demonstrated at line 405 for the _bf_ctx_discover() error path -- add _bf_ctx_free(&_bf_global_ctx) (or at minimum _bf_global_ctx = NULL) before the early returns at lines 397 and 401.

Extended reasoning...

What the bug is

In bf_ctx_setup(), on line 393, _bf_global_ctx is assigned the value of _ctx before two fallible operations: bf_ctx_get_pindir_fd() (line 395) and flock() (line 399). If either of these fails, the function returns early via return bf_err_r(...). At that point, the _free_bf_ctx_ cleanup attribute on _ctx fires, which calls _bf_ctx_free(&_ctx) -- this frees the heap memory and sets _ctx = NULL. However, _bf_global_ctx still points to the now-freed memory.

The specific code path

Here is the concrete sequence for a failure at line 395:

  1. _bf_ctx_new(&_ctx, ...) succeeds -- _ctx points to a valid bf_ctx on the heap.
  2. Line 393: _bf_global_ctx = _ctx -- both pointers now reference the same allocation.
  3. Line 395: bf_ctx_get_pindir_fd() fails (returns negative). Note this function itself dereferences _bf_global_ctx, which is why the assignment must happen before this call.
  4. Line 397: return bf_err_r(pindir_fd, ...) -- function begins returning.
  5. Scope cleanup: _free_bf_ctx_ fires on _ctx, calling _bf_ctx_free(&_ctx). This frees the memory and sets _ctx = NULL.
  6. _bf_global_ctx now holds a pointer to freed memory -- a classic dangling pointer.

The same sequence applies if flock() fails on line 399-401.

Why existing code does not prevent it

The TAKE_PTR(_ctx) on line 411 is designed to prevent the cleanup from freeing the context on the success path -- it sets _ctx = NULL so the cleanup is a no-op. But this line is only reached on success. On the error paths at lines 397 and 401, TAKE_PTR never runs, so the cleanup fires and frees the memory while _bf_global_ctx still references it.

Notably, the _bf_ctx_discover() error path at lines 404-407 already handles this correctly by explicitly calling _bf_ctx_free(&_bf_global_ctx) before returning. This sets _bf_global_ctx = NULL and frees the memory properly. The two earlier error paths are simply missing this same treatment.

Impact

After a failed bf_ctx_setup(), _bf_global_ctx is a dangling pointer. Any subsequent call to functions like bf_ctx_teardown(), bf_ctx_token(), bf_ctx_is_verbose(), or any other function that dereferences _bf_global_ctx will trigger use-after-free, which can cause crashes, memory corruption, or undefined behavior.

How to fix

Add _bf_ctx_free(&_bf_global_ctx) before the early returns at lines 397 and 401, mirroring the pattern already used at line 405. Alternatively, a single goto cleanup pattern could consolidate all error paths. The key constraint is that _bf_global_ctx = _ctx must remain before line 395 because bf_ctx_get_pindir_fd() reads _bf_global_ctx internally, so moving the assignment later is not an option.

Comment on lines +106 to 112
r = bf_btf_setup();
if (r)
return bf_err_r(r, "failed to load vmlinux BTF");

_ctx = calloc(1, sizeof(*_ctx));
if (!_ctx)
return -ENOMEM;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Resource leak: if calloc fails at line 111, bf_btf_setup() (called at line 106) has already allocated global BTF resources, but the _free_bf_ctx_ cleanup attribute on _ctx is a no-op since _ctx is still NULL, so bf_btf_teardown() never runs. Fix by moving bf_btf_setup() after the calloc, or adding an explicit bf_btf_teardown() before return -ENOMEM.

Extended reasoning...

Bug Analysis

The function _bf_ctx_new() calls bf_btf_setup() at line 106 before allocating the context struct with calloc at line 110. bf_btf_setup() internally calls btf__load_vmlinux_btf(), which allocates a struct btf object and stores it in the static global _bf_btf. This allocation succeeds and the global is now non-NULL.

Trigger Path

If calloc(1, sizeof(*_ctx)) at line 110 returns NULL (out-of-memory), the function immediately returns -ENOMEM at line 112. At this point, the local variable _ctx is still NULL (it was initialized to NULL at line 100 and never assigned). The _free_bf_ctx_ cleanup attribute calls _bf_ctx_free(&_ctx), which checks if (!*ctx) return; — since _ctx is NULL, the cleanup is a complete no-op. The bf_btf_teardown() call inside _bf_ctx_free() (line 174) is therefore never reached.

Step-by-step proof

  1. _bf_ctx_new() is called from bf_ctx_setup() (line 389).
  2. Line 100: _ctx is declared as NULL with _free_bf_ctx_ cleanup.
  3. Line 106: bf_btf_setup() succeeds — global _bf_btf now holds an allocated struct btf *.
  4. Line 110: calloc() fails, returns NULL — _ctx remains NULL.
  5. Line 112: return -ENOMEM triggers scope exit.
  6. Cleanup fires: _bf_ctx_free(&_ctx) called, but *ctx == NULL → immediate return.
  7. bf_btf_teardown() is never called → the global _bf_btf leaks.

Impact

The leaked struct btf object persists for the lifetime of the process. In practice, calloc for a small struct rarely fails, so the window is narrow. However, this is a real correctness defect: the cleanup-attribute pattern that this codebase relies on has a gap here because the resource (bf_btf_setup) is acquired before the object that owns its teardown (_ctx) is allocated.

Suggested Fix

The simplest fix is to move the bf_btf_setup() call to after the calloc succeeds, so that if any subsequent step fails, the cleanup attribute on the now-non-NULL _ctx will properly call _bf_ctx_free() which calls bf_btf_teardown(). Alternatively, add an explicit bf_btf_teardown() call before return -ENOMEM at line 112.

Comment on lines 369 to 371

r = bf_send(fd, request, &response, NULL);
r = bf_cgen_update(cgen, &chain_copy, 0);
if (r)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Two minor correctness issues in cli.c: (1) bf_chain_prog_fd() (line 269) and bf_chain_logs_fd() (line 286) return dup() directly — on failure dup() returns -1, which equals -EPERM in the project's negative-errno convention, masking the real error. Should check for -1 and return -errno (see ctx.c:329 for the correct pattern). (2) bf_chain_update() (line 372) replaces the actual error from bf_cgen_update() with -EINVAL, discarding diagnostic information. Should return r directly or use bf_err_r(r, ...) as done in bf_chain_update_set() in the same file.

Extended reasoning...

Unchecked dup() returns in bf_chain_prog_fd() and bf_chain_logs_fd()

The dup() system call returns -1 on failure and sets errno to indicate the error (e.g., EMFILE for too many open file descriptors, EBADF for a bad fd). Both bf_chain_prog_fd() at line 269 and bf_chain_logs_fd() at line 286 return the result of dup() directly to the caller:

return dup(cgen->handle->prog_fd);  // line 269
return dup(cgen->handle->lmap->fd); // line 286

The bpfilter project uses a negative-errno convention throughout: functions return negative errno values on failure (e.g., -ENOMEM, -ENOENT). A raw -1 return from dup() would be interpreted as -EPERM (Operation not permitted), which is incorrect and misleading for callers trying to diagnose why the fd wasn't obtained.

Step-by-step proof for dup() bug

  1. Caller invokes bf_chain_prog_fd("my_chain").
  2. The chain exists and cgen->handle->prog_fd is valid, so we reach line 269.
  3. However, the process has hit its RLIMIT_NOFILE limit, so dup() fails and returns -1, setting errno = EMFILE.
  4. bf_chain_prog_fd() returns -1 to the caller.
  5. The caller interprets -1 as -EPERM (since -EPERM == -1), logging "Operation not permitted" instead of "Too many open files".

The correct pattern exists in ctx.c at line 329-331, where dup() failure is properly handled by checking the return value and converting via -errno.

Error code replacement in bf_chain_update()

At lines 370-372 in bf_chain_update():

r = bf_cgen_update(cgen, &chain_copy, 0);
if (r)
    return -EINVAL;

The actual error code r from bf_cgen_update() is discarded and replaced with a generic -EINVAL. This loses the specific error information that bf_cgen_update() returned, making debugging harder. For comparison, bf_chain_update_set() in the same file (lines 456-459) handles the same function correctly:

r = bf_cgen_update(cgen, &new_chain, BF_FLAG(BF_CGEN_UPDATE_PRESERVE_COUNTERS));
if (r)
    return bf_err_r(r, "failed to update chain with new set data");

Step-by-step proof for error code replacement

  1. Caller invokes bf_chain_update(chain) to update an existing chain.
  2. The chain is found and copied successfully, reaching line 370.
  3. bf_cgen_update() fails with -ENOMEM (out of memory during BPF program regeneration).
  4. r is -ENOMEM, the condition if (r) is true.
  5. The function returns -EINVAL instead of -ENOMEM, telling the caller the input was invalid when the real problem was memory exhaustion.

Impact and fix

Both issues are minor correctness bugs that primarily affect error diagnostics. The dup() issue could theoretically cause incorrect error handling in callers that branch on specific errno values. The fix for dup() is to capture the return value, check for -1, and return bf_err_r(-errno, ...). The fix for bf_chain_update() is to replace return -EINVAL with return bf_err_r(r, "failed to update chain") for consistency with the rest of the file.

Comment on lines +19 to 22
* @brief Get the current ruleset.
*
* **Request payload format**
* The request doesn't contain data.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Stale documentation remains after daemon removal: (1) bf_cgen_attach docstring in cgen.h (lines 112-115) still describes the removed ns namespace parameter, and (2) bpfilter.h retains "Request payload format" / "Response payload format" Doxygen sections throughout that describe the deleted daemon IPC JSON wire protocol. These functions are now direct C library calls, so the IPC payload documentation is misleading.

Extended reasoning...

Stale ns parameter description in bf_cgen_attach

In src/libbpfilter/cgen/cgen.h, the docstring for bf_cgen_attach (lines 109-121) contains a paragraph that reads:

ns is the namespace the codegen should switch to before attaching the program. This is required to ensure the interface index the program is attached to (for XDP and TC programs) is correct, and the interface index the program filters on (e.g. meta.iface, for all hooks) is correct too.

However, the ns parameter was removed as part of this PR — the function signature is now int bf_cgen_attach(struct bf_cgen *cgen, struct bf_hookopts **hookopts); with no namespace parameter. The @param ns Doxygen tag was correctly removed, but this descriptive paragraph was left behind. It references a parameter that no longer exists in the function signature.

Stale IPC payload format sections in bpfilter.h

In src/libbpfilter/include/bpfilter/bpfilter.h, every function docstring contains Request payload format and Response payload format sections with @code{.json} blocks describing JSON wire formats. These sections documented the serialization protocol used when the functions communicated with the bpfilter daemon over IPC.

Since this PR removes the daemon entirely, these functions now call directly into libbpfilter — there is no IPC layer and no JSON serialization. The PR correctly updated the first-line descriptions (e.g., "Get the ruleset from the daemon" → "Get the current ruleset", "The daemon will flush" → "The library will flush"), but left all the payload format sections intact throughout the file.

Impact

Both issues are documentation-only with no functional impact. However, the stale documentation is misleading for anyone reading the API docs — it describes an IPC protocol that no longer exists and references a parameter that was removed. The payload format sections in bpfilter.h are particularly prominent, appearing in every function docstring.

Suggested fix

For cgen.h: Remove the paragraph about ns namespace switching (lines 112-115).

For bpfilter.h: Remove or rewrite all "Request payload format" and "Response payload format" sections. If the JSON format descriptions are still useful as documentation of the C struct layouts, they could be rephrased to remove references to "request" and "response" payloads. Otherwise, they should be removed entirely since they describe a deleted protocol layer.

Comment on lines 167 to 169
closep(&(*ctx)->token_fd);
closep(&(*ctx)->lock_fd);
bf_list_clean(&(*ctx)->cgens);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 In _bf_ctx_free(), closep(&(*ctx)->lock_fd) on line 168 releases the flock before bf_list_clean(&(*ctx)->cgens) on line 169, which calls bf_cgen_free() that performs bpffs filesystem operations (bf_ctx_get_pindir_fd(), bf_rmdir_at()). Moving the closep(&(*ctx)->lock_fd) call after bf_list_clean() would maintain mutual exclusion for the entire cleanup sequence.

Extended reasoning...

Bug Description

The _bf_ctx_free() function releases the pin-directory flock (lock_fd) before cleaning up the codegen list. The cleanup order on lines 167-169 is:

closep(&(*ctx)->token_fd);     // line 167
closep(&(*ctx)->lock_fd);      // line 168 - releases flock
bf_list_clean(&(*ctx)->cgens); // line 169 - does bpffs operations

Code Path

When bf_list_clean() iterates over the cgens list, it calls bf_cgen_free() for each codegen. In src/libbpfilter/cgen/cgen.c (lines 237-239), bf_cgen_free() performs filesystem operations on the bpffs pin directory:

pin_fd = bf_ctx_get_pindir_fd();
if (pin_fd >= 0)
    bf_rmdir_at(pin_fd, (*cgen)->chain->name, false);

These operations happen after the flock has already been released, creating a window where another bfcli process could acquire the lock and begin its own setup (including _bf_ctx_discover() which reads from the same pin directory) while this process is still removing chain subdirectories.

Impact

The practical impact is limited. The bf_rmdir_at() call only removes empty directories (non-recursive), and the race window is narrow — it requires another bfcli instance to start up and acquire the lock during the brief cleanup period. However, the intent of the flock (introduced by this PR in bf_ctx_setup) is clearly to provide mutual exclusion over the bpffs pin directory, and releasing it before the cleanup completes violates that invariant.

Step-by-step proof

  1. Process A calls bf_ctx_teardown()_bf_ctx_free().
  2. At line 168, closep(&(*ctx)->lock_fd) closes the fd, releasing the LOCK_EX flock.
  3. Process B starts, calls bf_ctx_setup()flock(pindir_fd, LOCK_EX | LOCK_NB) succeeds (lock is free).
  4. Process B calls _bf_ctx_discover(), begins iterating chain subdirectories under bpffs/bpfilter/.
  5. Process A reaches line 169, bf_list_clean() calls bf_cgen_free() which calls bf_rmdir_at() removing a chain directory.
  6. Process Bs readdir() or openat() may now encounter a directory that was removed mid-iteration.

Fix

Swap lines 168 and 169 so the lock is held during the entire cleanup:

closep(&(*ctx)->token_fd);
bf_list_clean(&(*ctx)->cgens);  // cleanup while lock is held
closep(&(*ctx)->lock_fd);       // release lock after cleanup

@@ -31,7 +28,7 @@ If you use a different distribution, you can still build and use **bpfilter** if
Example usage
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Documentation bug: doc/usage/index.rst line 31 incorrectly lists the local build path as $BUILD_DIR/output/bin/bfcli when it should be $BUILD_DIR/output/sbin/bfcli. The bfcli binary is installed to CMAKE_INSTALL_SBINDIR (i.e., sbin/) per src/bfcli/CMakeLists.txt, and README.md already correctly uses sbin. This is a minor doc inconsistency introduced in this PR.

Extended reasoning...

What the bug is

In doc/usage/index.rst, the PR changed the local-build path instruction from the old $BUILD_DIR/output/sbin/bpfilter (which was correct for the daemon) to $BUILD_DIR/output/bin/bfcli. The directory component was changed from sbin to bin, but bfcli is still installed to sbin/, not bin/.

The specific code path

The install destination for bfcli is defined in src/bfcli/CMakeLists.txt at line 81, which uses DESTINATION ${CMAKE_INSTALL_SBINDIR}. The CMake GNUInstallDirs module sets CMAKE_INSTALL_SBINDIR to sbin by default. This means the actual install path is $BUILD_DIR/output/sbin/bfcli, not $BUILD_DIR/output/bin/bfcli.

Step-by-step proof

  1. src/bfcli/CMakeLists.txt installs bfcli with DESTINATION ${CMAKE_INSTALL_SBINDIR}, which resolves to sbin/.
  2. README.md (line 65, also modified in this PR) correctly references $BUILD_DIR/output/sbin/bfcli.
  3. doc/usage/index.rst (line 31, new in this PR) incorrectly references $BUILD_DIR/output/bin/bfcli.
  4. The two documentation files are inconsistent, and the CMakeLists.txt confirms sbin is correct.

Why existing code does not prevent this

This is a documentation-only issue. There is no automated check that verifies documentation paths match actual CMake install destinations.

Impact

Users who follow the usage guide to run a locally-built bfcli will get a "command not found" error because they will look in bin/ instead of sbin/. The impact is low since README.md has the correct path, and users can easily discover the right location.

Fix

Change $BUILD_DIR/output/bin/bfcli to $BUILD_DIR/output/sbin/bfcli on line 31 of doc/usage/index.rst.

Comment on lines +11 to +12
${FROM_NS} ${BFCLI} chain get --name xdp | awk '/ip4.proto eq icmp/{getline; print $2}' | grep -q "^1$" && exit 0 || exit 1
${FROM_NS} ${BFCLI} ruleset flush No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Line 11 uses && exit 0 || exit 1 which unconditionally terminates the script, making line 12 (ruleset flush) dead code that never executes. The sibling file netns_to_host.sh correctly omits the explicit exit, allowing its ruleset flush to run. Remove && exit 0 || exit 1 from line 11 to match the sibling pattern.

Extended reasoning...

In tests/e2e/namespace/host_to_netns.sh, line 11 ends with && exit 0 || exit 1. This pattern means: if grep -q succeeds, run exit 0; if grep -q fails, run exit 1. In both cases, the script terminates immediately, so line 12 (ruleset flush) is unreachable dead code.

Step-by-step proof: (1) The pipeline on line 11 runs chain get, pipes through awk, then into grep -q. (2) Case A: grep -q finds a match, exit status 0, && exit 0 executes, script terminates with code 0. (3) Case B: grep -q does not find a match, exit status non-zero, && short-circuits, || exit 1 executes, script terminates with code 1. (4) There is no execution path where control reaches line 12.

The sibling test netns_to_host.sh uses the correct pattern on its line 11 - no && exit 0 || exit 1 - so grep -q exit status propagates naturally, and ruleset flush on line 12 executes normally.

The practical impact is low since the cleanup trap registered by make_sandbox handles sandbox teardown regardless. However, the ruleset flush on line 12 was clearly intended to run as explicit cleanup (matching the sibling file), and the dead code is misleading to anyone reading or maintaining these tests.

The fix is to remove && exit 0 || exit 1 from line 11 to match the pattern in netns_to_host.sh. This lets grep -q control the script exit status naturally while allowing ruleset flush to execute.

Comment on lines 22 to +24
│ ├── xlate.c # Rule translation
│ └── bpf/ # eBPF stub programs
│ └── *.c # Implementation (chain, matcher, rule, hook, set, bpf, btf, ctx...)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 CLAUDE.md directory tree has two issues: xlate.c on line 22 no longer exists (it was part of the deleted src/bpfilter/ daemon code) and should be removed, and line 23 (bpf/) uses └── but should use ├── since *.c follows at the same level. Both are minor documentation inaccuracies introduced by this PR.

Extended reasoning...

Bug Description

The CLAUDE.md directory tree listing for src/libbpfilter/ contains two documentation errors introduced by this PR's changes.

Issue 1: Stale xlate.c reference (line 22)

Line 22 of CLAUDE.md lists xlate.c under src/libbpfilter/:

│   ├── xlate.c           # Rule translation

However, xlate.c was part of src/bpfilter/ (the daemon directory that this PR removes). The rule translation functionality was rewritten into src/libbpfilter/cli.c as part of this PR. Since xlate.c does not exist under src/libbpfilter/, this line should be removed entirely.

Issue 2: Duplicate last-child tree markers (lines 23-24)

Lines 23-24 both use the └── (last-child) marker at the same indentation level:

│   └── bpf/              # eBPF stub programs
│   └── *.c               # Implementation...

In standard ASCII tree notation, └── means "last child at this level." Since bpf/ is not the last entry (it is followed by *.c), line 23 should use ├── instead. The correct tree after removing xlate.c would be:

│   ├── bpf/              # eBPF stub programs
│   └── *.c               # Implementation...

Step-by-step proof

  1. The PR diff shows CLAUDE.md was modified to update the directory tree after removing the daemon.
  2. Reading CLAUDE.md line 22: │ ├── xlate.c # Rule translation — this file does not exist under src/libbpfilter/. It was in src/bpfilter/ which was deleted.
  3. Reading line 23: │ └── bpf/ — uses └── but *.c follows on line 24 at the same level, so ├── is correct.
  4. Reading line 24: │ └── *.c — correctly uses └── as the actual last child.

Impact

These are documentation-only issues with no functional impact. The directory tree in CLAUDE.md will be slightly misleading to developers trying to understand the project structure, but it does not affect builds, tests, or runtime behavior.

Fix

Remove line 22 (xlate.c) and change line 23 from └── to ├── for the bpf/ entry.

Comment on lines 116 to +143

r = bf_wpack_new(&pack);
if (r)
return r;
bf_ctx_flush();

bf_wpack_open_array(pack, "ruleset");
while (chain_node && hookopts_node) {
_free_bf_cgen_ struct bf_cgen *cgen = NULL;
_free_bf_chain_ struct bf_chain *chain_copy = NULL;
_free_bf_hookopts_ struct bf_hookopts *hookopts_copy = NULL;
struct bf_chain *chain = bf_list_node_get_data(chain_node);
struct bf_hookopts *hookopts = bf_list_node_get_data(hookopts_node);
struct bf_hookopts *node_hookopts =
bf_list_node_get_data(hookopts_node);

r = bf_chain_new_from_copy(&chain_copy, chain);
if (r)
goto err_load;

if (node_hookopts) {
r = copy_hookopts(&hookopts_copy, node_hookopts);
if (r)
goto err_load;
}

bf_wpack_open_object(pack, NULL);
r = bf_cgen_new(&cgen, &chain_copy);
if (r)
goto err_load;

bf_wpack_open_object(pack, "chain");
bf_chain_pack(chain, pack);
bf_wpack_close_object(pack);
r = bf_cgen_set(cgen, hookopts_copy ? &hookopts_copy : NULL);
if (r) {
bf_err_r(r, "failed to set chain '%s'", cgen->chain->name);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 Non-atomic chain replacement in bf_chain_set() (line 199) deletes the old chain via bf_ctx_delete_cgen() before calling bf_cgen_set() for the new one — if the new chain fails to load/attach, the hook is left unfiltered. Similarly, bf_ruleset_set() (line 117) calls bf_ctx_flush() upfront, destroying all existing chains before loading replacements, and on error flushes even the partial progress. This is a pre-existing pattern carried over from the old daemon code in xlate.c, and the documentation already notes "Replacement is not atomic," but it remains a security concern for a packet filter since failures leave gaps in filtering.

Extended reasoning...

What the bug is

Two functions in cli.c implement non-atomic replace/set semantics that can leave the system with no active BPF filtering programs:

  1. bf_chain_set() (lines 173–214): At line 199, the old cgen is unconditionally deleted with (void)bf_ctx_delete_cgen(old_cgen, true), which unloads the BPF program and detaches it from its hook. Then at line 201, bf_cgen_set() is called to generate, load, and attach the replacement. If any of those steps fail (e.g., BPF verifier rejects the program, memory allocation fails, hook attach fails), the function returns an error — but the old chain is already gone. The hook point has no filtering program attached.

  2. bf_ruleset_set() (lines 108–164): At line 117, bf_ctx_flush() destroys all existing chains and their BPF programs before the loop even begins loading replacements. If any chain in the loop fails at bf_cgen_set() (line 141), the err_load handler at line 162 calls bf_ctx_flush() again, destroying even the partially-loaded chains. The end result on any single failure is zero filtering rules system-wide.

Concrete failure scenario

Consider bf_chain_set() replacing an active chain:

  1. bf_ctx_get_cgen() finds the existing chain (line 197)
  2. bf_ctx_delete_cgen(old_cgen, true) unloads old BPF program and detaches from hook (line 199)
  3. bf_cgen_set(new_cgen, ...) calls bf_program_generate()bf_program_load()bf_handle_attach() (in cgen.c lines 305–350)
  4. If bf_program_load() fails (e.g., BPF verifier rejection due to a rule that produces too many instructions), the function returns an error
  5. The old program was already detached at step 2 — the hook point is now unfiltered

For bf_ruleset_set(), simply having one malformed chain in a batch of 10 would destroy all 10 existing chains and leave zero filtering active.

Why this is pre-existing

The same delete-then-set pattern existed in the old daemon code (src/bpfilter/xlate.c), which this PR replaces. The PR consolidates the daemon into libbpfilter but preserves the same operational semantics. The project documentation at doc/usage/bfcli.rst explicitly states "Replacement is not atomic," indicating this is a known design choice rather than an oversight.

Addressing the intentional-design argument

One verifier noted this behavior is documented and that the error-path cleanup (flushing partial state) is arguably better than leaving the system in an inconsistent half-loaded state. This is a reasonable engineering trade-off — a clean failure state is easier to reason about than partial filtering. However, for a packet filter, "no filtering" is itself a security-relevant state. The codebase already demonstrates that atomic program replacement is feasible: bf_cgen_update() in cgen.c uses bf_link_update() to atomically swap BPF programs on an existing link without any gap. This pattern could theoretically be extended to bf_chain_set() and bf_ruleset_set() as well.

Impact and recommendation

Since this is a pre-existing pattern that predates this PR and is documented behavior, it should not block merging. However, it represents a real security gap for production use of bpfilter — any failure during chain replacement leaves hooks unfiltered. A future improvement could adopt the atomic swap pattern already used in bf_cgen_update() to avoid the delete-before-replace window.

@qdeslandes
Copy link
Contributor Author

Closing in favor of #467

@qdeslandes qdeslandes closed this Mar 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant