Commit Graph

7 Commits

Author SHA1 Message Date
roof 1bb8a5eb59 fix: advertise WireGuard endpoint by domain, and reach linked cells over HTTPS
Unit Tests / test (push) Successful in 9m50s
Three related cell-link/peer-config fixes (the peer and cell endpoints were
showing the raw external IP, which confused public-vs-internal addressing):

1. Peer WireGuard configs now embed the cell's effective domain (DDNS/ACME
   modes) instead of the detected external IP, via the new
   WireGuardManager.get_advertised_endpoint(). A name that resolves to the
   public IP survives IP changes and lets the datacenter forward each cell's
   WG port to the right host. LAN mode still falls back to the IP; an admin
   wireguard_endpoint override still wins.

2. Cell invites advertise <effective-domain>:<this cell's WG port> (was the
   external IP + a default/possibly-wrong port), so a remote cell pairs to the
   right host and port over the public path.

3. Cross-cell peer-sync no longer targets http://<ip>:3000 (the API binds
   127.0.0.1 and is unreachable across cells). It targets the remote's Caddy on
   HTTPS/443 — which the WireGuard server already DNATs over the tunnel — and the
   initial pre-tunnel invite push goes to https://<endpoint-host>/... ; legacy
   http://<ip>:3000 link URLs migrate to https on load.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-16 04:21:16 -04:00
roof 8d904b1b8f fix: clean-install bugs — Tor false-installed, WG port-check honesty, encrypted backup upload
Unit Tests / test (push) Successful in 13m7s
Three independent bugs surfaced during pic1 clean-install testing:

1. Tor _exit_status hardcoded configured=True regardless of whether Tor was
   actually installed.  Status now flows through the same store-installed /
   container-running bridge used by every other optional service, so Tor only
   reports installed when the container is present and running.

2. check_port_open compared the port from wg0.conf against the kernel-reported
   listening port, causing false "port closed" results whenever the conf and the
   running container were momentarily out of sync.  The function is now an honest
   liveness check: any wg0 interface that is up and has a "listening port:" line
   in `wg show` is considered open.  The check-port API endpoint now also returns
   the actual kernel listening_port and a port_mismatch flag so the UI can inform
   the user when a container recreate is needed.  (The recreate machinery already
   exists via the port-change pending-restart path; this fix makes the mismatch
   visible rather than silently lying about reachability.)

3. upload_backup only handled .zip archives; encrypted .age blobs were rejected
   with a generic error.  The endpoint now calls backup_crypto.is_encrypted() to
   detect Age-encrypted blobs and stores them verbatim as <id>.tar.gz.age with
   mode 0600 so they can be uploaded and then restored with a passphrase.  The
   plaintext zip path is unchanged.

Tests added/updated: test_connectivity_manager.py (Tor status bridge),
test_wireguard_manager.py + test_wireguard_endpoints.py (port-check liveness
and mismatch flag), test_config_backup_restore_http.py (encrypted upload
round-trip).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 01:52:26 -04:00
roof c696ca9ef6 fix: DNS split-horizon in DDNS mode, service access filter, health check, verbosity persistence
Unit Tests / test (push) Successful in 7m32s
- DNS (critical): add _configured_dns_params() that returns (primary_domain,
  split_horizon_zones) from config_manager so all apply_all_dns_rules() callers
  pass the correct primary zone (e.g. 'pic.ngo') and split-horizon list
  (e.g. ['pic1.pic.ngo']) instead of the FQDN as the primary — fixes
  DNS_PROBE_FINISHED_BAD_CONFIG for all external domains when on VPN

- firewall_manager: add split_horizon_zones param to apply_all_dns_rules()
  and forward it to generate_corefile()

- Peers: filter service_access list to installed services only; peers.py
  derives valid services from config_manager.get_installed_services() with
  the email→mail ID mapping; Peers.jsx fetches from /api/store/installed
  and filters the checkboxes and defaults accordingly

- Health check: fix file_manager→'files' ID mapping so files service health
  is checked when installed (was silently skipped due to 'file' vs 'files')

- Verbosity persistence: move log_levels.json from non-mounted
  /app/api/config/ to CONFIG_DIR (/app/config/) which maps to config/api/
  on the host; both load (managers.py) and save (routes/services.py) updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 13:05:58 -04:00
roof 39c59fd3ef feat: WireGuard endpoint override + fix Docker network label issue
Unit Tests / test (push) Successful in 11m14s
Endpoint override:
- Add PUT /api/wireguard/endpoint to set endpoint_override in identity
  config; GET returns detected, override, and effective endpoints
- _effective_endpoint() helper applies override in peer config generation
  (wireguard.py and peer_dashboard.py); detected IP still shown in UI
- Add Endpoint Override input in WireGuard page — solves the common case
  where auto-detected IP is a gateway/VPS but peers connect via LAN IP

Docker cell-network fix:
- Declare cell-network external in docker-compose.yml; Docker Compose v5
  enforces label ownership and rejects networks created by older versions
- Makefile start/update pre-create cell-network idempotently
- reinstall/uninstall(full) explicitly delete and recreate the network
- Fix uninstall loop path: data/api/services/ (not data/services/)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 04:51:38 -04:00
roof c2d215ee2e fix: cross-cell routing for split-tunnel peers
Three related fixes for split-tunnel peers that need to reach connected cells:

1. apply_peer_rules/apply_all_peer_rules now accept wg_subnet (actual local VPN
   subnet) and cell_subnets (connected cells' vpn_subnets) parameters instead of
   hardcoding 10.0.0.0/24. All callers (startup, add_peer, update_peer,
   apply-enforcement endpoint) pass the real values.

2. Explicit ACCEPT rules are inserted in FORWARD for each connected cell's
   subnet so split-tunnel peers (internet_access=False) can still reach
   connected cells via the wg0→wg0 path.

3. apply_ip_range in network_manager now loads cell_links.json and passes it
   to generate_corefile(), fixing a race where the bootstrap DNS thread could
   overwrite the Corefile and wipe cross-cell DNS forwarding zones on startup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 14:36:28 -04:00
roof 9a800e3b6b feat: fix cross-cell service access — DNS DNAT, service DNAT, Caddy routing
DNS A records now return the WireGuard server IP (10.0.0.1) instead of
Docker bridge VIPs so cross-cell peers resolve service names correctly
regardless of their bridge subnet. DNAT rules (wg0:53→cell-dns:53 and
wg0:80→cell-caddy:80) are applied at startup. Caddy routes by Host header,
eliminating the Docker bridge subnet conflict. Firewall cell rules allow
DNS and service (Caddy) traffic from linked cell subnets. Split-tunnel
AllowedIPs now dynamically includes connected-cell VPN subnets and drops
the 172.20.0.0/16 range. Peers with route_via set now receive full-tunnel
config (0.0.0.0/0) so all their traffic exits via the remote cell.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 03:12:09 -04:00
roof 09138fbc18 A5: Extract all route groups into Flask blueprints (app.py -1735 lines)
Extract 9 route groups out of app.py into routes/ blueprints:
- routes/network.py  — DNS, DHCP, NTP, network info/test (10 routes)
- routes/wireguard.py — WireGuard keys, peers, config, enforcement (18 routes)
- routes/cells.py    — cell-to-cell connections (5 routes)
- routes/peers.py    — peer CRUD + IP update + _next_peer_ip helper (10 routes)
- routes/routing.py  — NAT, peer routes, firewall, iptables (17 routes)
- routes/vault.py    — certs, trust, secrets (19 routes)
- routes/containers.py — containers, images, volumes (14 routes)
- routes/services.py — service bus, logs, services status/connectivity (18 routes)
- routes/peer_dashboard.py — peer-scoped dashboard/services (2 routes)

All blueprints use lazy `from app import X` inside route bodies to preserve
test patch compatibility (patch('app.email_manager', mock) still works).

Also included in this commit:
- A1 fix: backup/restore now includes email/calendar user files
- A2 fix: apply_config sets applying=True flag via helper container
- A3 fix: add_peer rolls back firewall on DNS failure

app.py reduced: 3011 → 1294 lines. 1021 tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-01 06:11:21 -04:00