fix: DNS first-install — split-horizon zone creation + CoreDNS inode bind-mount

After first setup, VPN clients got dns_probe_finished_bad_config and could
not resolve any domain, for two reasons:

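1. complete_setup() never wrote the split-horizon DNS zone for non-LAN modes.
   SetupManager now accepts network_manager as an optional 3rd constructor
   param, and complete_setup() calls
   self.network_manager.update_split_horizon_zone(effective_domain, wg_ip,
   primary_domain=primary_domain) whenever domain_mode is not 'lan' (e.g.
   pic_ngo, cell_to_cell) and the effective domain differs from the primary
   domain.  A failure here is logged as a warning and does not abort setup.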

2. generate_corefile() used a tmp-file + os.replace pattern.  The Corefile is
   a Docker FILE bind-mount, which pins the inode that existed when the
   container started; os.replace put a new inode at the host path, so
   CoreDNS kept reading the old file and never saw config updates.  Fixed by
   truncating and rewriting in place (open with 'w', seek(0), truncate()),
   preserving the inode CoreDNS holds.
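
The inode behaviour behind item 2 can be reproduced outside Docker. The
sketch below is illustrative only: replace_via_tmp and rewrite_in_place are
hypothetical stand-ins, not the project's generate_corefile(), and the file
contents are placeholders. It assumes a POSIX filesystem.

```python
import os
import tempfile


def replace_via_tmp(path: str, text: str) -> None:
    """Old pattern: write a sibling temp file, then rename it over `path`."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(text)
    os.replace(tmp, path)  # `path` now names the temp file's inode


def rewrite_in_place(path: str, text: str) -> None:
    """New pattern: truncate and rewrite the existing file, keeping its inode."""
    with open(path, "w") as fh:  # mode 'w' truncates the existing file
        fh.write(text)


def inode_of(path: str) -> int:
    return os.stat(path).st_ino


workdir = tempfile.mkdtemp()
corefile = os.path.join(workdir, "Corefile")  # placeholder file, not a real config
with open(corefile, "w") as fh:
    fh.write("initial contents\n")

before = inode_of(corefile)

rewrite_in_place(corefile, "updated in place\n")
after_in_place = inode_of(corefile)   # same inode as `before`

replace_via_tmp(corefile, "updated via rename\n")
after_replace = inode_of(corefile)    # a different inode
```

A file bind-mount behaves like a reference to the first inode, so only the
in-place rewrite is visible through it.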

api/managers.py passes network_manager into SetupManager.
Tests: new mock_network_manager fixture, 2 setup-zone tests, 1 inode
regression test in test_firewall_manager.py.
Verified live on pic1.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:48:37 -04:00
parent a9c7235347
commit 1daace48eb
5 changed files with 80 additions and 8 deletions
+24 -1
@@ -94,9 +94,10 @@ def _build_ddns_config(domain_mode: str, cloudflare_api_token: str = '',
 class SetupManager:
     """Manages the first-run setup wizard state and completion."""

-    def __init__(self, config_manager, auth_manager):
+    def __init__(self, config_manager, auth_manager, network_manager=None):
         self.config_manager = config_manager
         self.auth_manager = auth_manager
+        self.network_manager = network_manager

     # ── state helpers ─────────────────────────────────────────────────────
@@ -270,6 +271,28 @@ class SetupManager:
'HTTPS will activate once registration succeeds.'
)
+        # ── write the split-horizon DNS zone for non-LAN modes ─────────
+        # VPN clients use the cell's CoreDNS (DNS=<wg ip>) and must resolve
+        # the effective domain to the internal Caddy IP so traffic reaches
+        # Caddy through the tunnel. _bootstrap_dns runs at container start
+        # BEFORE setup completes (domain_mode still 'lan'), so it takes the
+        # LAN branch and never writes this zone, leaving CoreDNS pointing
+        # at a missing zone file and VPN lookups returning nothing
+        # (dns_probe_finished_bad_config). Write it here now that the mode
+        # and effective domain are known.
+        if domain_mode != 'lan' and self.network_manager is not None:
+            try:
+                effective_domain = self.config_manager.get_effective_domain()
+                primary_domain = self.config_manager.get_identity().get('domain', 'cell')
+                if effective_domain and effective_domain != primary_domain:
+                    caddy_ip = self.network_manager._get_wg_server_ip()
+                    self.network_manager.update_split_horizon_zone(
+                        effective_domain, caddy_ip, primary_domain=primary_domain)
+                    logger.info(
+                        f'Split-horizon zone written for {effective_domain} -> {caddy_ip}')
+            except Exception as exc:
+                logger.warning(f'Split-horizon zone setup failed (non-fatal): {exc}')
+
         # ── mark setup complete (must be last) ─────────────────────────
         self.config_manager.set_identity_field('setup_complete', True)
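
The guard added in this hunk can be exercised in isolation with mocks. The
sketch below is not the project's test code: write_zone_if_needed and
make_mocks are hypothetical helpers that mirror the conditions in the diff,
the domain and IP values are made up, and the try/except wrapper from the
real code is left out for brevity.

```python
from unittest.mock import MagicMock


def write_zone_if_needed(domain_mode, config_manager, network_manager):
    """Hypothetical stand-in for the guard added to complete_setup()."""
    if domain_mode == "lan" or network_manager is None:
        return False
    effective = config_manager.get_effective_domain()
    primary = config_manager.get_identity().get("domain", "cell")
    if not effective or effective == primary:
        return False
    wg_ip = network_manager._get_wg_server_ip()
    network_manager.update_split_horizon_zone(
        effective, wg_ip, primary_domain=primary)
    return True


def make_mocks(effective="alpha.example.org", primary="cell"):
    """Build mock managers; the default values are illustrative only."""
    config = MagicMock()
    config.get_effective_domain.return_value = effective
    config.get_identity.return_value = {"domain": primary}
    network = MagicMock()
    network._get_wg_server_ip.return_value = "10.0.0.1"
    return config, network


config, network = make_mocks()

# LAN mode: the zone must not be written.
wrote_lan = write_zone_if_needed("lan", config, network)
lan_calls = network.update_split_horizon_zone.call_count

# Non-LAN mode with a distinct effective domain: the zone is written once.
wrote_vpn = write_zone_if_needed("pic_ngo", config, network)

# No network manager injected: the guard is skipped entirely.
wrote_without_manager = write_zone_if_needed("pic_ngo", config, None)
```

Because network_manager defaults to None in the constructor, callers that
never pass one keep the previous behaviour.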