Commit Graph

12 Commits

Author SHA1 Message Date
roof e2c50c381a Fix cross-cell domain access: scope DNAT rules, add Docker→wg0 routing
- firewall_manager: add _get_wg_server_ip() helper; scope ensure_cell_api_dnat(),
  ensure_dns_dnat(), ensure_service_dnat() DNAT rules with -d server_ip; add
  ensure_wg_masquerade() (Docker→wg0 MASQUERADE+FORWARD) and
  ensure_cell_subnet_routes() (host routes via docker run busybox)
- wireguard_manager: scope PostUp DNAT rules with -d server_ip in generate_config()
  and ensure_postup_dnat(); add Docker→wg0 MASQUERADE+FORWARD rules
- app.py: call ensure_wg_masquerade() and ensure_cell_subnet_routes() in
  _apply_startup_enforcement()
- tests/test_firewall_manager.py: mock _get_wg_server_ip, add
  test_dnat_is_scoped_to_server_ip and test_returns_false_when_wg_server_ip_not_found
- tests/e2e/wg/test_cell_to_cell_routing.py: rewrite to use dynamic config
  (no hardcoded IPs/ports), add latency and domain access tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 12:37:02 -04:00
roof 1e1bda4679 Fix cross-cell ICMP routing: state-based cell DROP + e2e test
The cell catch-all DROP rule blocked all traffic from a connected cell's
subnet, including ESTABLISHED/RELATED packets (ICMP replies, TCP ACKs) for
connections initiated by local VPN peers. This broke ping to the remote
cell's WireGuard IP even when the cell-to-cell tunnel was healthy.

Change the DROP to match only NEW,INVALID connections so established reply
traffic passes through to the stateful ACCEPT rule.

Also adds tests/e2e/wg/test_cell_to_cell_routing.py — an end-to-end test
that brings up a real WireGuard tunnel from the test runner to pic1 and
verifies full cross-cell routing including ICMP ping, API /health, and Caddy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 10:59:11 -04:00
roof 1a611e0474 fix: UI always accessible; fix exit-relay AllowedIPs not updating
**PIC UI always accessible (service_access=[])**
Remove the per-peer Caddy:80 ACCEPT/DROP rule from apply_peer_rules.
Service access was enforced at two layers (iptables DROP + CoreDNS ACL),
but the iptables layer also blocked the PIC web UI served through Caddy.
CoreDNS ACL alone is sufficient — DNS blocks service hostnames; the UI
path through Caddy remains reachable regardless of service_access value.

**Exit-relay internet routing (route_via another cell)**
update_peer_ip validated new_ip as a single ip_network, rejecting the
comma-separated '10.0.1.0/24, 0.0.0.0/0' string passed by
update_cell_peer_allowed_ips(add_default_route=True). The AllowedIPs
in wg0.conf was never updated, so WireGuard never routed internet traffic
through the exit cell's tunnel. Fix: validate each CIDR individually and
apply the change live via wg set without a container restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 05:41:22 -04:00
roof c521fab1cb fix: merge CoreDNS ACL per-service and add reload plugin; add peer/cell e2e tests
- _build_acl_block: put all blocked IPs for a service in ONE acl block instead
  of one block per peer — the first block's allow-all was silently granting
  access to every peer after the first blocked one (first-match semantics)
- generate_corefile: add 'reload' plugin so SIGUSR1 triggers Corefile reload
  in newer CoreDNS builds (without it the signal was a no-op)
- tests/test_firewall_manager.py: new tests for single merged ACL block and
  the reload directive
- tests/e2e/api/test_peer_access_update.py: e2e tests for service_access,
  internet_access, and peer_access updates persisting live to iptables/CoreDNS
- tests/e2e/api/test_cell_to_cell.py: e2e tests for cell-to-cell connection
  management, permissions API, and cross-cell service access restrictions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 04:57:37 -04:00
roof a43f9fbf0d fix: full security audit remediation — P0/P1/P2/P3 fixes + 1020 passing tests
P0 — Broken functionality:
- Fix 12+ endpoints with wrong manager method signatures (email/calendar/file/routing)
- Fix email_manager.delete_email_user() missing domain arg
- Fix cell-link DNS forwarding wiped on every peer change (generate_corefile now
  accepts cell_links param; add/remove_cell_dns_forward no longer clobber the file)
- Fix Flask SECRET_KEY regenerating on every restart (persisted to DATA_DIR)
- Fix _next_peer_ip exhaustion returning 500 instead of 409
- Fix ConfigManager Caddyfile path (/app/config-caddy/)
- Fix UI double-add and wrong-key peer bugs in Peers.jsx / WireGuard.jsx
- Remove hardcoded credentials from Dashboard.jsx

P1 — Security:
- CSRF token validation on all POST/PUT/DELETE/PATCH to /api/* (double-submit pattern)
- enforce_auth: 503 only when users file readable but empty; never bypass on IOError
- WireGuard add_cell_peer: validate pubkey, name, endpoint against strict regexes
- DNS add_cell_dns_forward: validate IP and domain; reject injection chars
- DNS zone write: realpath containment + record content validation
- iptables comment /32 suffix prevents substring match deleting wrong peer rules
- is_local_request() trusts only loopback + 172.16.0.0/12 (Docker bridge)
- POST /api/containers: volume allow-list prevents arbitrary host mounts
- file_manager: bcrypt ($2b→$2y) for WebDAV; realpath containment in delete_user
- email/calendar: stop persisting plaintext passwords in user records
- routing_manager: validate IPs, networks, and interface names
- peer_registry: write peers.json at mode 0o600
- vault_manager: Fernet key file at mode 0o600
- CORS: lock down to explicit origin list
- domain/cell_name validation: reject newline, brace, semicolon injection chars

P2 — Architecture:
- Peer add: rollback registry entry if firewall rules fail post-add
- restart_service(): base class now calls _restart_container(); email and calendar
  managers call cell-mail / cell-radicale respectively
- email/calendar managers sync user list (no passwords) to cell_config.json
- Pending-restart flag cleared only after helper subprocess exits with code 0
- docker-compose.yml: add config-caddy volume to API container

P3 — Tests (854 → 1020):
- Fill test_email_endpoints.py, test_calendar_endpoints.py,
  test_network_endpoints.py, test_routing_endpoints.py
- New: test_peer_management_update.py, test_peer_management_edge_cases.py,
  test_input_validation.py, test_enforce_auth_configured.py,
  test_cell_link_dns.py, test_logs_endpoints.py, test_cells_endpoints.py,
  test_is_local_request_per_endpoint.py, test_caddy_routing.py
- E2E conftest: skip WireGuard suite when wg-quick absent
- Update existing tests to match fixed signatures and comment formats

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 11:30:21 -04:00
roof 0c12e3fc97 fix: change domain from dev to lan to avoid browser HSTS preload blocking HTTP
The .dev TLD has been HSTS preloaded in Chrome/Firefox/Safari/Edge since 2019.
Browsers silently redirect http://anything.dev to https://anything.dev before
making any network request. Since Caddy has auto_https off, all browser-based
access to .dev domains fails with a connection error even though DNS, routing,
and HTTP all work correctly (curl works; browsers don't).

- cell_config.json: domain "dev" -> "lan"
- Caddyfile: all http://*.dev blocks -> http://*.lan
- Corefile: dev zone -> lan zone (file /data/lan.zone)
- data/dns/lan.zone: new zone file (dev.zone removed live)
- test_wg_domain_access.py: remove hardcoded DOMAIN_IPS / .dev references;
  read domain from /api/config at runtime so tests work with any configured TLD

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 01:54:33 -04:00
roof 32272420cb test: add E2E coverage for peer dashboard/services, DNS records, and WG domain access
- test_peer_dashboard_services.py (63 tests): unit tests for all API fixes
  * peer_dashboard field names (name/transfer_rx/transfer_tx vs old stale names)
  * peer_dashboard service_urls dict with correct domain-keyed URLs
  * peer_services email structure (nested smtp/imap, address not username)
  * peer_services files key (not webdav), caldav URL (calendar.dev not radicale.dev:5232)
  * peer_services wireguard DNS (not 10.0.0.1), config text with DNS line
  * DNS zone records (api/webui → Caddy, VIPs for calendar/files/mail/webdav)
  * Caddyfile generation (all service blocks including webui.dev)
  * Access control (401 anon, 403 admin on peer-only routes, 404 missing peer)
- e2e/api/test_peer_endpoints.py: fix stale field assertions, add structure checks
- e2e/wg/test_wg_domain_access.py: E2E WG tests for DNS resolution via VPN tunnel
  * All *.dev domains resolve to correct IPs via CoreDNS
  * api.dev/webui.dev must resolve to Caddy, not container direct IPs
  * CoreDNS reachability through VPN tunnel
  * Peer config DNS field correctness
- e2e/ui/test_peer_dashboard.py: UI checks for service icon links, CalDAV URL, email

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 17:41:21 -04:00
roof 9677755b4f fix: e2e/integration test infrastructure and Makefile test targets
- Fix make test: was pointing to non-existent api/tests/, now runs unit tests
  correctly with --ignore=e2e --ignore=integration
- Remove dead phase test targets (test-phase1..4, test-all-phases) that all
  referenced cd api && pytest tests/ (non-existent path)
- Add .test_admin_pass file: reset_admin_password.py now writes a persistent
  test password file alongside .admin_initial_password; the API never deletes
  it (unlike .admin_initial_password which is consumed on first startup)
- Update both integration/conftest.py and e2e/helpers/admin_password.py to
  read .test_admin_pass before .admin_initial_password — so tests work after
  make restart without needing PIC_ADMIN_PASS env var
- Add AI collaboration rules to CLAUDE.md (auto-loaded every session)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 08:27:27 -04:00
roof 420dced9ff fix: WireGuard peer sync, privileged mode, E2E and integration test correctness
- api/app.py: sync WireGuard server config on peer add/remove (non-fatal)
- docker-compose.yml: add privileged:true to wireguard service
- E2E tests: fix logout selector, DNS IP lookup, wg config DNS line, VIP skip guards,
  badge text selectors, heading .first, async logout wait
- Integration tests: fix 4 tests that sent unauthenticated requests expecting 400
  (now use authenticated session helpers); accept 401 as valid in webui proxy test;
  add password field to service_access validation test
- Remove stale tracked config templates (config/api/api/*, config/api/cell.env, etc.)
  that no longer exist on disk after config layout was reorganised

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 06:04:40 -04:00
roof 31a7951ffd fix: 4 issues — admin password sudo, peer modal, WireGuard fetch creds, port check
1. make reset/show-admin-password: use sudo so data/api/ owned-by-root
   files are writable without explicit sudo prefix

2. Peers.jsx: remove one-time password modal on peer creation — admin
   already knows the password they typed; replace with a success toast
   showing peer name and provisioned accounts

3. WireGuard.jsx + Peers.jsx: add credentials:'include' to every raw
   fetch() call (7 calls across two files, plus fix one hardcoded
   localhost:3000 URL); the port check and peer status calls were
   returning 401 because they didn't send the session cookie

4. test_admin_wireguard.py: update test to match new toast flow (no modal),
   add Scenario 10 test that verifies the port check badge renders on the
   WireGuard page after the credentials fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 03:33:11 -04:00
roof 7d2979b8af fix: integration and E2E test correctness after auth enforcement
config_manager: make per-file copy errors non-fatal during restore
  (resolves test failures when /app/config/* is not writable by test runner)
test_live_api.py: fix NameError (_req.Session not requests.Session)
test_negative_scenarios.py: replace raw requests.* with authenticated _S.*
  (all endpoints now require auth; unauthenticated calls return 401)
wg/conftest.py: fix wg_server_info — public key is at /api/wireguard/keys
test_admin_navigation.py, test_peer_acl.py: add .first to ambiguous locators
  to avoid Playwright strict-mode errors when desktop+mobile nav both mount

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 18:14:38 -04:00
roof 0d32038150 feat: add comprehensive E2E test suite (Playwright + WireGuard + API)
Adds tests/e2e/ with three layers of E2E coverage:
- API layer (tests/e2e/api/): unauthenticated access, admin endpoints,
  peer endpoints, access control enforcement — 24 tests
- Playwright UI (tests/e2e/ui/): login flows, admin navigation, peer
  dashboard/services, role-based ACL, password change — 60+ tests
- WireGuard connectivity (tests/e2e/wg/): tunnel up/down, DNS resolution
  through VPN, service ACL enforcement via iptables, full-tunnel routing
Shared helpers: PicAPIClient, WGInterface, playwright_login, cleanup.
Makefile targets: test-e2e-api, test-e2e-ui, test-e2e-wg, test-e2e.
Adds scripts/reset_admin_password.py for test bootstrap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 16:41:13 -04:00