Three related cell-link/peer-config fixes (the peer and cell endpoints were
showing the raw external IP, which confused public-vs-internal addressing):
1. Peer WireGuard configs now embed the cell's effective domain (DDNS/ACME
modes) instead of the detected external IP, via the new
WireGuardManager.get_advertised_endpoint(). A name that resolves to the
public IP survives IP changes and lets the datacenter forward each cell's
WG port to the right host. LAN mode still falls back to the IP; an admin
wireguard_endpoint override still wins.
2. Cell invites advertise <effective-domain>:<this cell's WG port> (previously
the external IP plus a default, possibly wrong, port), so a remote cell pairs
to the right host and port over the public path.
3. Cross-cell peer-sync no longer targets http://<ip>:3000 (the API binds
127.0.0.1 and is unreachable across cells). It now targets the remote's Caddy
on HTTPS/443, which the WireGuard server already DNATs over the tunnel. The
initial pre-tunnel invite push goes to https://<endpoint-host>/..., and legacy
http://<ip>:3000 link URLs migrate to https on load.
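
The endpoint precedence from item 1 and the link-URL migration from item 3 can
be sketched roughly as follows. This is an illustration only:
pick_advertised_endpoint, migrate_link_url, the mode strings, and the example
addresses are hypothetical and are not taken from WireGuardManager. The sketch
also assumes hostnames or IPv4 addresses (no bare IPv6 literals).

```python
from typing import Optional
from urllib.parse import urlsplit


def pick_advertised_endpoint(mode: str, domain: Optional[str],
                             external_ip: str, wg_port: int,
                             override: Optional[str] = None) -> str:
    """Return the host:port string to embed in peer configs and invites."""
    if override:
        # Admin override wins. Assumption: it may or may not carry a port.
        return override if ":" in override else f"{override}:{wg_port}"
    if mode in ("ddns", "acme") and domain:
        # A name survives public-IP changes; the port is this cell's own.
        return f"{domain}:{wg_port}"
    # LAN mode (or no domain configured): fall back to the detected IP.
    return f"{external_ip}:{wg_port}"


def migrate_link_url(url: str) -> str:
    """Rewrite a legacy http://<ip>:3000 link URL to https on the same host."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.port == 3000:
        return f"https://{parts.hostname}{parts.path}"
    return url
```

Keeping the precedence in one function means peer configs and invites cannot
disagree about which host and port a remote side should dial.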
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Three independent bugs surfaced during pic1 clean-install testing:
1. Tor _exit_status hardcoded configured=True regardless of whether Tor was
actually installed. Status now flows through the same store-installed /
container-running bridge used by every other optional service, so Tor only
reports installed when the container is present and running.
2. check_port_open compared the port from wg0.conf against the kernel-reported
listening port, causing false "port closed" results whenever the conf and the
running container were momentarily out of sync. The function is now an honest
liveness check: any wg0 interface that is up and has a "listening port:" line
in `wg show` is considered open. The check-port API endpoint now also returns
the actual kernel listening_port and a port_mismatch flag so the UI can inform
the user when a container recreate is needed. (The recreate machinery already
exists via the port-change pending-restart path; this fix makes the mismatch
visible rather than silently lying about reachability.)
3. upload_backup only handled .zip archives; encrypted .age blobs were rejected
with a generic error. The endpoint now calls backup_crypto.is_encrypted() to
detect Age-encrypted blobs and stores them verbatim as <id>.tar.gz.age with
mode 0600 so they can be uploaded and then restored with a passphrase. The
plaintext zip path is unchanged.
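
A minimal sketch of the liveness check described in item 2, assuming the caller
already holds the text output of `wg show wg0` and the port read from wg0.conf.
The function name port_status and the "open" key are hypothetical;
listening_port and port_mismatch follow the field names mentioned above.

```python
import re
from typing import Optional

# `wg show` prints an indented "listening port: <n>" line for a live interface.
_LISTEN_RE = re.compile(r"^\s*listening port:\s*(\d+)\s*$", re.MULTILINE)


def port_status(wg_show_output: str, configured_port: Optional[int]) -> dict:
    """Report liveness from the kernel's view, plus any conf/kernel mismatch."""
    match = _LISTEN_RE.search(wg_show_output)
    listening_port = int(match.group(1)) if match else None
    return {
        # Liveness no longer depends on the conf agreeing with the kernel.
        "open": listening_port is not None,
        "listening_port": listening_port,
        # The mismatch is surfaced separately so the UI can suggest a recreate.
        "port_mismatch": (
            listening_port is not None
            and configured_port is not None
            and listening_port != configured_port
        ),
    }
```

Separating "is it up" from "does it match the conf" is the point of the fix:
a stale conf produces a warning flag rather than a false "port closed".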
Tests added/updated: test_connectivity_manager.py (Tor status bridge),
test_wireguard_manager.py + test_wireguard_endpoints.py (port-check liveness
and mismatch flag), test_config_backup_restore_http.py (encrypted upload
round-trip).
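
For the encrypted upload path, a rough sketch of detection and verbatim
storage, assuming a POSIX filesystem. looks_age_encrypted is a hypothetical
stand-in for backup_crypto.is_encrypted(), whose real implementation is not
shown here; it only checks the standard age header (binary or ASCII-armored).
store_encrypted_backup is likewise hypothetical.

```python
import os
from pathlib import Path

# age files start with this version line; armored files use a PEM-style header.
AGE_BINARY_MAGIC = b"age-encryption.org/v1"
AGE_ARMOR_MAGIC = b"-----BEGIN AGE ENCRYPTED FILE-----"


def looks_age_encrypted(blob: bytes) -> bool:
    """Return True if the blob begins with an age header."""
    return blob.startswith(AGE_BINARY_MAGIC) or blob.startswith(AGE_ARMOR_MAGIC)


def store_encrypted_backup(blob: bytes, backup_dir: Path, backup_id: str) -> Path:
    """Write the blob unchanged as <id>.tar.gz.age, readable by the owner only."""
    target = backup_dir / f"{backup_id}.tar.gz.age"
    # Create with 0600 up front so the file is never briefly world-readable.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(blob)
    # Re-apply the mode in case the file already existed with wider permissions.
    os.chmod(target, 0o600)
    return target
```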
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- DNS (critical): add _configured_dns_params() that returns (primary_domain,
split_horizon_zones) from config_manager so all apply_all_dns_rules() callers
pass the correct primary zone (e.g. 'pic.ngo') and split-horizon list
(e.g. ['pic1.pic.ngo']) instead of the FQDN as the primary — fixes
DNS_PROBE_FINISHED_BAD_CONFIG for all external domains when on VPN
- firewall_manager: add split_horizon_zones param to apply_all_dns_rules()
and forward it to generate_corefile()
- Peers: filter service_access list to installed services only; peers.py
derives valid services from config_manager.get_installed_services() with
the email→mail ID mapping; Peers.jsx fetches from /api/store/installed
and filters the checkboxes and defaults accordingly
- Health check: fix file_manager→'files' ID mapping so files service health
is checked when installed (was silently skipped due to 'file' vs 'files')
- Verbosity persistence: move log_levels.json from non-mounted
/app/api/config/ to CONFIG_DIR (/app/config/) which maps to config/api/
on the host; both load (managers.py) and save (routes/services.py) updated
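
The installed-services filter from the Peers bullet might look roughly like
this. It is a sketch under assumptions: the alias table direction (store ID
"email" exposed to peers as "mail") is inferred from the wording above, and
valid_service_ids and filter_service_access are hypothetical names. The
installed list stands in for whatever config_manager.get_installed_services()
returns.

```python
from typing import Iterable, List

# Assumed alias table: store ID -> service_access ID.
STORE_TO_ACCESS_ID = {"email": "mail"}


def valid_service_ids(installed: Iterable[str]) -> List[str]:
    """Map installed store IDs to the IDs used in a peer's service_access."""
    return sorted({STORE_TO_ACCESS_ID.get(sid, sid) for sid in installed})


def filter_service_access(requested: Iterable[str],
                          installed: Iterable[str]) -> List[str]:
    """Keep only requested services that are actually installed, in order."""
    allowed = set(valid_service_ids(installed))
    return [sid for sid in requested if sid in allowed]
```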
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Endpoint override:
- Add PUT /api/wireguard/endpoint to set endpoint_override in identity
config; GET returns detected, override, and effective endpoints
- _effective_endpoint() helper applies override in peer config generation
(wireguard.py and peer_dashboard.py); detected IP still shown in UI
- Add Endpoint Override input in WireGuard page — solves the common case
where auto-detected IP is a gateway/VPS but peers connect via LAN IP
Docker cell-network fix:
- Declare cell-network external in docker-compose.yml; Docker Compose v5
enforces label ownership and rejects networks created by older versions
- Makefile start/update pre-create cell-network idempotently
- reinstall/uninstall(full) explicitly delete and recreate the network
- Fix uninstall loop path: data/api/services/ (not data/services/)
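
A small sketch of the three values the GET endpoint is described as returning.
The function name endpoint_status and the treatment of a blank override as
"no override" are assumptions, not the actual _effective_endpoint() code.

```python
from typing import Dict, Optional


def endpoint_status(detected: Optional[str],
                    override: Optional[str]) -> Dict[str, Optional[str]]:
    """Return detected, override, and effective endpoints for display."""
    # Assumption: an empty or whitespace-only override counts as unset.
    cleaned = (override or "").strip() or None
    return {
        "detected": detected,   # still shown in the UI
        "override": cleaned,
        "effective": cleaned or detected,  # what peer configs would embed
    }
```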
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three related fixes for split-tunnel peers that need to reach connected cells:
1. apply_peer_rules/apply_all_peer_rules now accept wg_subnet (actual local VPN
subnet) and cell_subnets (connected cells' vpn_subnets) parameters instead of
hardcoding 10.0.0.0/24. All callers (startup, add_peer, update_peer,
apply-enforcement endpoint) pass the real values.
2. Explicit ACCEPT rules are inserted in FORWARD for each connected cell's
subnet so split-tunnel peers (internet_access=False) can still reach
connected cells via the wg0→wg0 path.
3. apply_ip_range in network_manager now loads cell_links.json and passes it
to generate_corefile(), fixing a race where the bootstrap DNS thread could
overwrite the Corefile and wipe cross-cell DNS forwarding zones on startup.
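
The explicit FORWARD rules from item 2 could be generated along these lines.
This is a hedged sketch: cell_forward_rules is a hypothetical helper that only
builds iptables argument lists (it does not run them), and the exact rule
shape used by apply_peer_rules may differ.

```python
import ipaddress
from typing import Iterable, List


def cell_forward_rules(wg_subnet: str, cell_subnets: Iterable[str],
                       iface: str = "wg0") -> List[List[str]]:
    """Build one ACCEPT rule per connected cell for the wg0 -> wg0 path."""
    # Normalise inputs so "10.0.0.1/24" and "10.0.0.0/24" compare equal.
    src = str(ipaddress.ip_network(wg_subnet, strict=False))
    rules: List[List[str]] = []
    for subnet in cell_subnets:
        dst = str(ipaddress.ip_network(subnet, strict=False))
        if dst == src:
            continue  # a cell sharing the local subnet needs no extra rule
        rules.append([
            "iptables", "-I", "FORWARD",
            "-i", iface, "-o", iface,
            "-s", src, "-d", dst,
            "-j", "ACCEPT",
        ])
    return rules
```

Passing wg_subnet in, rather than assuming 10.0.0.0/24, is what lets the same
code work on cells that use a different local VPN range.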
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DNS A records now return the WireGuard server IP (10.0.0.1) instead of
Docker bridge VIPs so cross-cell peers resolve service names correctly
regardless of their bridge subnet. DNAT rules (wg0:53→cell-dns:53 and
wg0:80→cell-caddy:80) are applied at startup. Caddy routes by Host header,
eliminating the Docker bridge subnet conflict. Firewall cell rules allow
DNS and service (Caddy) traffic from linked cell subnets. Split-tunnel
AllowedIPs now dynamically includes connected-cell VPN subnets and drops
the 172.20.0.0/16 range. Peers with route_via set now receive full-tunnel
config (0.0.0.0/0) so all their traffic exits via the remote cell.
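
The AllowedIPs selection described above can be sketched as follows. The
function name allowed_ips and its parameters are hypothetical, and treating
internet_access=True as full-tunnel is an assumption; only the route_via and
split-tunnel behaviour is stated in this message.

```python
from typing import Iterable, Optional


def allowed_ips(wg_subnet: str, cell_subnets: Iterable[str],
                internet_access: bool,
                route_via: Optional[str] = None) -> str:
    """Return the AllowedIPs value for a peer's client config."""
    if route_via or internet_access:
        # Full tunnel: all traffic enters the VPN (and exits via the
        # remote cell when route_via is set).
        return "0.0.0.0/0"
    # Split tunnel: local VPN subnet plus each connected cell's VPN subnet.
    # No Docker bridge range is included.
    ordered = []
    for subnet in [wg_subnet, *cell_subnets]:
        if subnet not in ordered:
            ordered.append(subnet)
    return ", ".join(ordered)
```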
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>