Fix race condition in ensure_forward_stateful: add threading.Lock

Concurrent callers (health monitor + startup) could both pass the delete-all loop and then each insert a copy of the rule, producing duplicate ESTABLISHED,RELATED rules. The lock serialises all calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 10:12:18 -04:00
parent 1b61e9e290
commit b8e57b6e51
1 changed file with 17 additions and 13 deletions
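The race described in the commit message is a delete-then-insert sequence that is only correct when it runs as one unit. The following is a minimal sketch of that pattern, not the project's code: `FakeForwardChain`, `ensure_rule`, and `run_concurrently` are hypothetical names, and an in-memory list stands in for the real FORWARD chain so the example runs without iptables or root.

```python
import threading

# Hypothetical rule text, used only as a list entry in this sketch.
RULE = "-m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"


class FakeForwardChain:
    """In-memory stand-in for the FORWARD chain (assumption, for illustration)."""

    def __init__(self):
        # One unrelated rule so we can check the stateful rule lands first.
        self.rules = ["-i wg0 -j DROP"]

    def delete_all(self, rule):
        # Mirrors the "delete-all loop": remove every existing instance.
        while rule in self.rules:
            self.rules.remove(rule)

    def insert_first(self, rule):
        # Mirrors re-anchoring the rule at position 1 of the chain.
        self.rules.insert(0, rule)


_lock = threading.Lock()


def ensure_rule(chain, rule=RULE):
    # Holding the lock makes delete-all + insert a single critical section,
    # so a second caller cannot finish its delete loop before the first
    # caller has inserted, which is what produced the duplicates.
    with _lock:
        chain.delete_all(rule)
        chain.insert_first(rule)


def run_concurrently(n_threads=8):
    chain = FakeForwardChain()
    threads = [
        threading.Thread(target=ensure_rule, args=(chain,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return chain.rules


rules = run_concurrently()
```

With the lock held across both steps, every caller sees either zero copies or one copy of the rule when it starts, so the chain ends with exactly one copy at the top regardless of how many threads call in. A module-level lock like this only serialises callers within one process; it does not coordinate separate processes that touch the same chain.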
@@ -8,10 +8,13 @@ import os
 import subprocess
 import logging
 import re
+import threading
 from typing import Dict, List, Any, Optional

 logger = logging.getLogger(__name__)

+_forward_stateful_lock = threading.Lock()
+
 # Virtual IPs assigned to Caddy per service — must match Caddyfile listeners.
 # Populated at import time from the default subnet; call update_service_ips()
 # whenever ip_range changes so all downstream callers see the new values.
@@ -459,6 +462,7 @@ def ensure_forward_stateful() -> bool:
    which pushes this rule down every time wg0 restarts — causing ICMP to hit the
    per-peer DROP rule before reaching the stateful ACCEPT.
    """
+    with _forward_stateful_lock:
        try:
            # Remove all existing instances so we can re-anchor at position 1.
            # PostUp -I FORWARD rules drift this rule down on every wg0 restart.