Merged
3 changes: 3 additions & 0 deletions docker-compose.yml
@@ -19,6 +19,9 @@ services:
- CHOWN # Required for root-entrypoint to chown /data + /tmp before dropping privileges
- SETUID # Required for root-entrypoint to switch to non-root user
- SETGID # Required for root-entrypoint to switch to non-root group
sysctls: # ARP flux mitigation for host networking accuracy
net.ipv4.conf.all.arp_ignore: 1
net.ipv4.conf.all.arp_announce: 2
volumes:

- type: volume # Persistent Docker-managed Named Volume for storage
3 changes: 3 additions & 0 deletions docs/DOCKER_COMPOSE.md
@@ -30,6 +30,9 @@ services:
- CHOWN # Required for root-entrypoint to chown /data + /tmp before dropping privileges
- SETUID # Required for root-entrypoint to switch to non-root user
- SETGID # Required for root-entrypoint to switch to non-root group
sysctls: # ARP flux mitigation (reduces duplicate/ambiguous ARP behavior on host networking)
net.ipv4.conf.all.arp_ignore: 1
net.ipv4.conf.all.arp_announce: 2

volumes:
- type: volume # Persistent Docker-managed named volume for config + database
51 changes: 51 additions & 0 deletions docs/docker-troubleshooting/arp-flux-sysctls.md
@@ -0,0 +1,51 @@
# ARP Flux Sysctls Not Set

## Issue Description

NetAlertX detected that ARP flux protection sysctls are not set as expected:

- `net.ipv4.conf.all.arp_ignore=1`
- `net.ipv4.conf.all.arp_announce=2`

## Security Ramifications

This is not a direct container breakout risk, but detection quality can degrade:

- Incorrect IP/MAC associations
- Device state flapping
- Unreliable topology or presence data

## Why You're Seeing This Issue

The running environment does not provide the expected kernel sysctl values. This is common in Docker setups where sysctls were not explicitly configured.

## How to Correct the Issue

Set these sysctls at container runtime.

- In `docker-compose.yml` (preferred):
```yaml
services:
  netalertx:
    sysctls:
      net.ipv4.conf.all.arp_ignore: 1
      net.ipv4.conf.all.arp_announce: 2
```

- For `docker run`:
```bash
docker run \
--sysctl net.ipv4.conf.all.arp_ignore=1 \
--sysctl net.ipv4.conf.all.arp_announce=2 \
ghcr.io/netalertx/netalertx:latest
```

> **Note:** Setting `net.ipv4.conf.all.arp_ignore` and `net.ipv4.conf.all.arp_announce` may fail with "operation not permitted" unless the container is run with elevated privileges. To resolve this, you can:
> - Use `--privileged` with `docker run`.
> - Use the more restrictive `--cap-add=NET_ADMIN` (or `cap_add: [NET_ADMIN]` in `docker-compose` service definitions) to allow the sysctls to be applied at runtime.

## Additional Resources

For broader Docker Compose guidance, see:

- [DOCKER_COMPOSE.md](https://docs.netalertx.com/DOCKER_COMPOSE)
3 changes: 3 additions & 0 deletions install/docker/docker-compose.dev.yml
@@ -13,6 +13,9 @@ services:
- CHOWN
- SETUID
- SETGID
sysctls:
net.ipv4.conf.all.arp_ignore: 1
net.ipv4.conf.all.arp_announce: 2
volumes:
- type: volume
source: netalertx_data
3 changes: 3 additions & 0 deletions install/docker/docker-compose.yml
@@ -13,6 +13,9 @@ services:
- CHOWN
- SETUID
- SETGID
sysctls:
net.ipv4.conf.all.arp_ignore: 1
net.ipv4.conf.all.arp_announce: 2
volumes:
- type: volume
source: netalertx_data
31 changes: 10 additions & 21 deletions install/production-filesystem/entrypoint.d/36-override-individual-settings.sh
100644 → 100755
Collaborator

Why was this changed? Was there an issue with the original implementation? It was written to accommodate future expansion.

Member Author

Because it was a pointless function to set a single variable. It can be reintroduced if there is more than one variable to be set. Right now it makes it harder to read.

@@ -9,28 +9,17 @@ if [ ! -f "${NETALERTX_CONFIG}/app.conf" ]; then
exit 0
fi

# Helper: set or append config key safely
set_config_value() {
_key="$1"
_value="$2"

# Remove newlines just in case
_value=$(printf '%s' "$_value" | tr -d '\n\r')

# Escape sed-sensitive chars
_escaped=$(printf '%s\n' "$_value" | sed 's/[\/&]/\\&/g')
if [ -n "${LOADED_PLUGINS:-}" ]; then
echo "[ENV] Applying LOADED_PLUGINS override"
value=$(printf '%s' "$LOADED_PLUGINS" | tr -d '\n\r')
# declare delimiter for sed and escape it along with / and &
delim='|'
# inside double quotes, \\\\ reaches sed as \\ (a literal backslash before the matched char)
escaped=$(printf '%s\n' "$value" | sed "s/[\/${delim}&]/\\\\&/g")

if grep -q "^${_key}=" "${NETALERTX_CONFIG}/app.conf"; then
sed -i "s|^${_key}=.*|${_key}=${_escaped}|" "${NETALERTX_CONFIG}/app.conf"
if grep -q '^LOADED_PLUGINS=' "${NETALERTX_CONFIG}/app.conf"; then
# use same delimiter when substituting
sed -i "s${delim}^LOADED_PLUGINS=.*${delim}LOADED_PLUGINS=${escaped}${delim}" "${NETALERTX_CONFIG}/app.conf"
else
echo "${_key}=${_value}" >> "${NETALERTX_CONFIG}/app.conf"
echo "LOADED_PLUGINS=${value}" >> "${NETALERTX_CONFIG}/app.conf"
fi
}

# ------------------------------------------------------------
# LOADED_PLUGINS override
# ------------------------------------------------------------
if [ -n "${LOADED_PLUGINS:-}" ]; then
echo "[ENV] Applying LOADED_PLUGINS override"
set_config_value "LOADED_PLUGINS" "$LOADED_PLUGINS"
fi
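
The escaping step in the hunk above can be exercised in isolation. The following is a minimal sketch, not the PR's code: `escape_sed_replacement` is a hypothetical helper name, and it assumes the substitution that consumes its output uses `|` as the delimiter, as the inlined block does. It also escapes backslashes, which the hunk does not.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): escape a value so it can be used
# as the replacement text of a sed 's' command whose delimiter is '|'.
escape_sed_replacement() {
    # Prefix \, /, | and & with a backslash so sed treats them literally.
    # '#' is used as the delimiter of this inner command to avoid clashing
    # with the characters being escaped.
    printf '%s\n' "$1" | sed 's#[\\/|&]#\\&#g'
}

# Usage (illustrative only):
#   escaped=$(escape_sed_replacement "$LOADED_PLUGINS")
#   sed "s|^LOADED_PLUGINS=.*|LOADED_PLUGINS=${escaped}|" app.conf
```

Keeping the sed program in single quotes sidesteps the extra layer of backslash processing that double quotes introduce, which is the easiest place for this kind of escaping to go wrong.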
84 changes: 11 additions & 73 deletions install/production-filesystem/entrypoint.d/37-host-optimization.sh
100644 → 100755
@@ -1,92 +1,30 @@
#!/bin/sh

# 37-host-optimization.sh: Apply and validate network optimizations (ARP flux fix)
# 37-host-optimization.sh: Detect ARP flux sysctl configuration.
#
# This script improves detection accuracy by ensuring proper ARP behavior.
# It attempts to apply sysctl settings and warns if not possible.
# This script does not change host/kernel settings.

# --- Color Codes ---
RED=$(printf '\033[1;31m')
YELLOW=$(printf '\033[1;33m')
RESET=$(printf '\033[0m')

# --- Skip flag ---
if [ -n "${SKIP_OPTIMIZATIONS:-}" ]; then
exit 0
fi

# --- Helpers ---

get_sysctl() {
sysctl -n "$1" 2>/dev/null || echo "unknown"
}

set_sysctl_if_needed() {
key="$1"
expected="$2"

current="$(get_sysctl "$key")"

# Already correct
if [ "$current" = "$expected" ]; then
return 0
fi

# Try to apply
if sysctl -w "$key=$expected" >/dev/null 2>&1; then
return 0
fi

# Failed
return 1
}

# --- Apply Settings (best effort) ---

failed=0

set_sysctl_if_needed net.ipv4.conf.all.arp_ignore 1 || failed=1
set_sysctl_if_needed net.ipv4.conf.all.arp_announce 2 || failed=1
set_sysctl_if_needed net.ipv4.conf.default.arp_ignore 1 || failed=1
set_sysctl_if_needed net.ipv4.conf.default.arp_announce 2 || failed=1
[ "$(sysctl -n net.ipv4.conf.all.arp_ignore 2>/dev/null || echo unknown)" = "1" ] || failed=1
[ "$(sysctl -n net.ipv4.conf.all.arp_announce 2>/dev/null || echo unknown)" = "2" ] || failed=1

# --- Validate final state ---

all_ignore="$(get_sysctl net.ipv4.conf.all.arp_ignore)"
all_announce="$(get_sysctl net.ipv4.conf.all.arp_announce)"

# --- Warning Output ---

if [ "$all_ignore" != "1" ] || [ "$all_announce" != "2" ]; then
if [ "$failed" -eq 1 ]; then
>&2 printf "%s" "${YELLOW}"
>&2 cat <<EOF
>&2 cat <<'EOF'
══════════════════════════════════════════════════════════════════════════════
⚠️ ATTENTION: ARP flux protection not enabled.

NetAlertX relies on ARP for device detection. Your system currently allows
ARP replies from incorrect interfaces (ARP flux), which may result in:

• False devices being detected
• IP/MAC mismatches
• Flapping device states
• Incorrect network topology

This is common when running in Docker or multi-interface environments.

──────────────────────────────────────────────────────────────────────────
Recommended fix (Docker Compose):

sysctls:
net.ipv4.conf.all.arp_ignore: 1
net.ipv4.conf.all.arp_announce: 2

──────────────────────────────────────────────────────────────────────────
Alternatively, apply on the host:
⚠️ WARNING: ARP flux sysctls are not set.

Expected values:
net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=2

Detection accuracy may be reduced until this is configured.
Detection accuracy may be reduced until configured.

See: https://docs.netalertx.com/docker-troubleshooting/arp-flux-sysctls/
══════════════════════════════════════════════════════════════════════════════
EOF
>&2 printf "%s" "${RESET}"
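
The detect-only check in this script can be sketched as a small function. This is an illustration under assumptions, not the script's code: `arp_flux_configured` is a hypothetical name, and it reads the values from files under `/proc/sys` instead of calling `sysctl -n`, so that the root directory can be overridden and the logic tried against a fake tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): succeed only when both ARP flux
# sysctls hold the expected values. It never writes any setting.
# $1 optionally overrides the sysctl root (default: /proc/sys).
arp_flux_configured() {
    root="${1:-/proc/sys}"
    # A missing or unreadable file is reported as "unknown", which fails the check.
    ignore=$(cat "${root}/net/ipv4/conf/all/arp_ignore" 2>/dev/null || echo unknown)
    announce=$(cat "${root}/net/ipv4/conf/all/arp_announce" 2>/dev/null || echo unknown)
    [ "$ignore" = "1" ] && [ "$announce" = "2" ]
}

# Usage (illustrative only):
#   arp_flux_configured || echo "WARNING: ARP flux sysctls are not set." >&2
```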
9 changes: 5 additions & 4 deletions install/production-filesystem/entrypoint.sh
@@ -86,10 +86,11 @@ for script in "${ENTRYPOINT_CHECKS}"/*; do
fi
script_name=$(basename "$script" | sed 's/^[0-9]*-//;s/\.(sh|py)$//;s/-/ /g')
echo "--> ${script_name} "
if [ -n "${SKIP_STARTUP_CHECKS:-}" ] && echo "${SKIP_STARTUP_CHECKS}" | grep -q "\b${script_name}\b"; then
printf "%sskip%s\n" "${GREY}" "${RESET}"
continue
fi
if [ -n "${SKIP_STARTUP_CHECKS:-}" ] &&
printf '%s' "${SKIP_STARTUP_CHECKS}" | grep -wFq -- "${script_name}"; then
printf "%sskip%s\n" "${GREY}" "${RESET}"
continue
fi

"$script"
NETALERTX_DOCKER_ERROR_CHECK=$?
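
The switch from `grep -q "\b...\b"` to `grep -wFq --` in the hunk above can be illustrated with a small sketch. `is_skipped` is a hypothetical helper name used only for this example. `-F` treats the check name as a fixed string instead of a regular expression, `-w` requires a whole-word match, and `--` stops a name that begins with `-` from being parsed as an option.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): succeed when check name $2 appears
# in skip list $1 as a whole word, matched literally rather than as a regex.
is_skipped() {
    [ -n "$1" ] && printf '%s' "$1" | grep -wFq -- "$2"
}

# Usage (illustrative only):
#   is_skipped "${SKIP_STARTUP_CHECKS:-}" "$script_name" && continue
```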
12 changes: 7 additions & 5 deletions install/production-filesystem/services/healthcheck.sh
@@ -48,11 +48,13 @@ else
log_error "python /app/server is not running"
fi

# 5. Check port 20211 is open and contains "netalertx"
if curl -sf --max-time 10 "http://localhost:${PORT:-20211}" | grep -i "netalertx" > /dev/null; then
log_success "Port ${PORT:-20211} is responding and contains 'netalertx'"
# 5. Check port 20211 is open
CHECK_ADDR="${LISTEN_ADDR:-127.0.0.1}"
[ "${CHECK_ADDR}" = "0.0.0.0" ] && CHECK_ADDR="127.0.0.1"
if timeout 10 bash -c "</dev/tcp/${CHECK_ADDR}/${PORT:-20211}" 2>/dev/null; then
log_success "Port ${PORT:-20211} is responding"
else
log_error "Port ${PORT:-20211} is not responding or doesn't contain 'netalertx'"
log_error "Port ${PORT:-20211} is not responding"
fi

# NOTE: GRAPHQL_PORT might not be set and is initialized as a setting with a default value in the container. It can also be initialized via APP_CONF_OVERRIDE
@@ -71,4 +73,4 @@ else
echo "[HEALTHCHECK] ❌ One or more health checks failed"
fi

exit $EXIT_CODE
exit $EXIT_CODE
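
The address selection used by the port check in this file can be sketched on its own. `probe_addr` is a hypothetical helper name for illustration. The idea is that a wildcard bind address such as `0.0.0.0` tells the server where to listen but is not a useful connect target, so the probe falls back to loopback.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): choose the address a local health
# probe should connect to. An empty value or the wildcard 0.0.0.0 maps to loopback.
probe_addr() {
    addr="${1:-127.0.0.1}"
    # Use '=' (POSIX) rather than '==' so the test also works outside bash.
    [ "$addr" = "0.0.0.0" ] && addr="127.0.0.1"
    printf '%s\n' "$addr"
}

# Usage (illustrative only):
#   CHECK_ADDR=$(probe_addr "${LISTEN_ADDR:-}")
```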
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -20,6 +20,7 @@ nav:
- Docker Updates: UPDATES.md
- Docker Maintenance: DOCKER_MAINTENANCE.md
- Docker Startup Troubleshooting:
- ARP flux sysctls: docker-troubleshooting/arp-flux-sysctls.md
- Aufs capabilities: docker-troubleshooting/aufs-capabilities.md
- Excessive capabilities: docker-troubleshooting/excessive-capabilities.md
- File permissions: docker-troubleshooting/file-permissions.md
97 changes: 55 additions & 42 deletions test/api_endpoints/test_devices_endpoints.py
@@ -43,6 +43,10 @@ def create_dummy(client, api_token, test_mac):
client.post(f"/device/{test_mac}", json=payload, headers=auth_headers(api_token))


def delete_dummy(client, api_token, test_mac):
client.delete("/devices", json={"macs": [test_mac]}, headers=auth_headers(api_token))


def test_get_all_devices(client, api_token, test_mac):
# Ensure there is at least one device
create_dummy(client, api_token, test_mac)
@@ -149,53 +153,62 @@ def test_export_import_cycle_base64(client, api_token, test_mac):


def test_devices_totals(client, api_token, test_mac):
# 1. Create a dummy device
create_dummy(client, api_token, test_mac)

# 2. Call the totals endpoint
resp = client.get("/devices/totals", headers=auth_headers(api_token))
assert resp.status_code == 200

# 3. Ensure the response is a JSON list
data = resp.json
assert isinstance(data, list)

# 4. Dynamically get expected length
conditions = get_device_conditions()
expected_length = len(conditions)
assert len(data) == expected_length

# 5. Check that at least 1 device exists
assert data[0] >= 1 # 'devices' count includes the dummy device
try:
# 1. Call the totals endpoint
resp = client.get("/devices/totals", headers=auth_headers(api_token))
assert resp.status_code == 200

# 2. Ensure the response is a JSON list
data = resp.json
assert isinstance(data, list)

# 3. Dynamically get expected length
conditions = get_device_conditions()
expected_length = len(conditions)
assert len(data) == expected_length

# 4. Check that at least 1 device exists when there are any conditions
if expected_length > 0:
assert data[0] >= 1 # 'devices' count includes the dummy device
else:
# no conditions defined; data should be an empty list
assert data == []
finally:
delete_dummy(client, api_token, test_mac)


def test_devices_by_status(client, api_token, test_mac):
# 1. Create a dummy device
create_dummy(client, api_token, test_mac)

# 2. Request devices by a valid status
resp = client.get("/devices/by-status?status=my", headers=auth_headers(api_token))
assert resp.status_code == 200
data = resp.json
assert isinstance(data, list)
assert any(d["id"] == test_mac for d in data)

# 3. Request devices with an invalid/unknown status
resp_invalid = client.get("/devices/by-status?status=invalid_status", headers=auth_headers(api_token))
# Strict validation now returns 422 for invalid status enum values
assert resp_invalid.status_code == 422

# 4. Check favorite formatting if devFavorite = 1
# Update dummy device to favorite
client.post(
f"/device/{test_mac}",
json={"devFavorite": 1},
headers=auth_headers(api_token)
)
resp_fav = client.get("/devices/by-status?status=my", headers=auth_headers(api_token))
fav_data = next((d for d in resp_fav.json if d["id"] == test_mac), None)
assert fav_data is not None
assert "&#9733" in fav_data["title"]
try:
# 1. Request devices by a valid status
resp = client.get("/devices/by-status?status=my", headers=auth_headers(api_token))
assert resp.status_code == 200
data = resp.json
assert isinstance(data, list)
assert any(d["id"] == test_mac for d in data)

# 2. Request devices with an invalid/unknown status
resp_invalid = client.get("/devices/by-status?status=invalid_status", headers=auth_headers(api_token))
# Strict validation now returns 422 for invalid status enum values
assert resp_invalid.status_code == 422

# 3. Check favorite formatting if devFavorite = 1
# Update dummy device to favorite
update_resp = client.post(
f"/device/{test_mac}",
json={"devFavorite": 1},
headers=auth_headers(api_token)
)
assert update_resp.status_code == 200
assert update_resp.json.get("success") is True

resp_fav = client.get("/devices/by-status?status=my", headers=auth_headers(api_token))
fav_data = next((d for d in resp_fav.json if d["id"] == test_mac), None)
assert fav_data is not None
assert "&#9733" in fav_data["title"]
finally:
delete_dummy(client, api_token, test_mac)


def test_delete_test_devices(client, api_token):