feat: fix cross-cell service access — DNS DNAT, service DNAT, Caddy routing

DNS A records now return the WireGuard server IP (10.0.0.1) instead of
Docker bridge VIPs so cross-cell peers resolve service names correctly
regardless of their bridge subnet. DNAT rules (wg0:53→cell-dns:53 and
wg0:80→cell-caddy:80) are applied at startup. Caddy routes by Host header,
eliminating the Docker bridge subnet conflict. Firewall cell rules allow
DNS and service (Caddy) traffic from linked cell subnets. Split-tunnel
AllowedIPs now dynamically includes connected-cell VPN subnets and drops
the 172.20.0.0/16 range. Peers with route_via set now receive full-tunnel
config (0.0.0.0/0) so all their traffic exits via the remote cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-02 03:12:09 -04:00
parent f2f15eb17e
commit 9a800e3b6b
11 changed files with 325 additions and 146 deletions
+4 -1
View File
@@ -305,6 +305,8 @@ def _apply_startup_enforcement():
firewall_manager.apply_all_peer_rules(peers)
firewall_manager.apply_all_cell_rules(cell_links)
firewall_manager.ensure_cell_api_dnat()
firewall_manager.ensure_dns_dnat()
firewall_manager.ensure_service_dnat()
# Restore any cell link WireGuard peers that were lost from wg0.conf
# (happens if the container was rebuilt, wg0.conf was reset, etc.)
_restore_cell_wg_peers(cell_links)
@@ -337,7 +339,8 @@ def _bootstrap_dns():
cell_name = identity.get('cell_name', os.environ.get('CELL_NAME', 'mycell'))
domain = identity.get('domain', os.environ.get('CELL_DOMAIN', 'cell'))
ip_range = identity.get('ip_range', os.environ.get('CELL_IP_RANGE', '172.20.0.0/16'))
network_manager.bootstrap_dns_records(cell_name, domain, ip_range)
# Bootstrap on first start; then always regenerate to ensure A records use WG server IP.
network_manager.apply_ip_range(ip_range, cell_name, domain)
except Exception as e:
logger.warning(f"DNS bootstrap failed (non-fatal): {e}")