151 Commits

Author SHA1 Message Date
roof 4b3d695805 docs: consolidate all manuals into the Gitea wiki — repo keeps README only
Unit Tests / test (push) Successful in 10m29s
QUICKSTART, the monolithic project-wiki file, the API documentation, the
service developer guide, and the webui README had drifted badly out of
date (localhost-only auth, DHCP, v1 connectivity fwmarks, unsupported
DDNS providers, "HTTP dispatch not implemented") while the four-persona
Gitea wiki is current and maintained. Their still-accurate content now
lives in the wiki (incl. the new Dev-Service-Manifest-Reference page),
so the repo keeps a single README pointing there. README refreshed:
Connectivity v2 named instances, signed store images, audit log, backup
encryption, real provider list, current UI pages, dead LICENSE link
removed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:26:48 -04:00
roof 2ab3d2d5ac feat: secure build phase 2 — enforce image verification by default
All store images are now digest-pinned and cosign-signed by the publish
pipeline, so the warn-by-default training-wheels period is over: an
unsigned or undigested image must not install unless the admin
explicitly opts out. The service_composer fallback used when the config
manager is unavailable or corrupt also flips to enforce — config
corruption must fail closed rather than silently weaken verification.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:12:58 -04:00
roof c430a392b8 chore: fix .gitignore CLAUDE.md entry (was appended without newline)
Unit Tests / test (push) Failing after 25s
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:57:44 -04:00
roof fa00c90328 chore: remove CLAUDE.md from the repo (Claude Code context, not product docs)
Unit Tests / test (push) Failing after 57s
CLAUDE.md is Claude Code tooling context, not product documentation — the
canonical dev/admin/user docs live in README, QUICKSTART, the service-developer
guide, and the Gitea wiki. Keep it local + gitignored so it stays out of the
repository while remaining available to the dev tooling on pic0.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:56:16 -04:00
roof 238db60702 feat: secure build phase 1 — cosign cell-side image verification (warn default) + Dockerfile validation
Unit Tests / test (push) Successful in 13m28s
- config/cosign/cosign.pub: public verification key committed to repo (safe);
  cosign private key lives in /home/roof/.pic-secrets/ and is NEVER committed
- api/config_manager.py: image_verification config block (modes: off|warn|enforce,
  default: warn) so existing deployments are unaffected until images are signed
- api/service_composer.py: cosign verify before pull/up; enforce aborts the
  operation, warn logs and proceeds, off skips entirely; also fixes the prior
  unsafe proceed-on-pull-failure path
- api/service_store_manager.py: store-image digest requirement (warn default,
  reject under enforce)
- api/Dockerfile: cosign binary copied from the official cosign image
- docker-compose.yml: config/cosign/ bind-mounted into cell-api container
- install.sh: ensure/verify bundled cosign pubkey on new cell installs
- api/manifest_validator.py: validate_build_context() — Dockerfile lint
- tests: full coverage for config modes, composer verify paths, store digest
  guard, and validate_build_context

Verification defaults to warn so nothing breaks in production until images are
signed (phase 2). Private key stored outside git at /home/roof/.pic-secrets/.
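
The three modes listed above could be sketched roughly as follows. This is a minimal illustration, not the code in api/service_composer.py: the function name `should_proceed` and the boolean `signature_ok` are hypothetical, and the real cosign invocation is left out.

```python
import logging

logger = logging.getLogger("image_verification_sketch")

# Mode names taken from the commit message; everything else is illustrative.
VALID_MODES = ("off", "warn", "enforce")


def should_proceed(mode: str, signature_ok: bool) -> bool:
    """Return True if the pull/up operation may continue."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown image_verification mode: {mode!r}")
    if mode == "off":
        return True  # verification is skipped entirely
    if signature_ok:
        return True
    if mode == "warn":
        logger.warning("image signature not verified; proceeding (warn mode)")
        return True
    return False  # enforce: abort the operation
```

Under this sketch only `enforce` combined with a failed verification blocks the operation.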

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 03:53:47 -04:00
roof 8d904b1b8f fix: clean-install bugs — Tor false-installed, WG port-check honesty, encrypted backup upload
Unit Tests / test (push) Successful in 13m7s
Three independent bugs surfaced during pic1 clean-install testing:

1. Tor _exit_status hardcoded configured=True regardless of whether Tor was
   actually installed.  Status now flows through the same store-installed /
   container-running bridge used by every other optional service, so Tor only
   reports installed when the container is present and running.

2. check_port_open compared the port from wg0.conf against the kernel-reported
   listening port, causing false "port closed" results whenever the conf and the
   running container were momentarily out of sync.  The function is now an honest
   liveness check: any wg0 interface that is up and has a "listening port:" line
   in `wg show` is considered open.  The check-port API endpoint now also returns
   the actual kernel listening_port and a port_mismatch flag so the UI can inform
   the user when a container recreate is needed.  (The recreate machinery already
   exists via the port-change pending-restart path; this fix makes the mismatch
   visible rather than silently lying about reachability.)

3. upload_backup only handled .zip archives; encrypted .age blobs were rejected
   with a generic error.  The endpoint now calls backup_crypto.is_encrypted() to
   detect Age-encrypted blobs and stores them verbatim as <id>.tar.gz.age with
   mode 0600 so they can be uploaded and then restored with a passphrase.  The
   plaintext zip path is unchanged.
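
The liveness check in item 2 might look roughly like the sketch below. It assumes the text output of `wg show` is already available as a string; the helper names and the `open` key are hypothetical, while `listening_port` and `port_mismatch` follow the field names mentioned above.

```python
import re
from typing import Optional


def parse_listening_port(wg_show_output: str) -> Optional[int]:
    """Extract the kernel listening port from `wg show` text, if present."""
    match = re.search(r"^\s*listening port:\s*(\d+)\s*$",
                      wg_show_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def port_check(wg_show_output: str, configured_port: int) -> dict:
    """Liveness first; a conf/kernel disagreement is reported, not hidden."""
    listening = parse_listening_port(wg_show_output)
    return {
        "open": listening is not None,  # hypothetical key name
        "listening_port": listening,
        "port_mismatch": listening is not None and listening != configured_port,
    }
```

With this shape a momentary disagreement between wg0.conf and the running interface yields `open: True` plus `port_mismatch: True` instead of a false "port closed".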

Tests added/updated: test_connectivity_manager.py (Tor status bridge),
test_wireguard_manager.py + test_wireguard_endpoints.py (port-check liveness
and mismatch flag), test_config_backup_restore_http.py (encrypted upload
round-trip).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 01:52:26 -04:00
roof 743b026b01 feat: connectivity redesign phase 7 — cell-relay as a connection type
Unit Tests / test (push) Successful in 13m22s
Cell exits now surface as cell_relay connections via reconcile and are bridged
onto the existing cell route_via mechanism. Health is derived from the
handshake, loops are detected, and the connections are assignable in the
unified UI.

- CELL_RELAY_TYPE constant; not manually creatable
- reconcile_cell_relays() derives connections from cell links offering an
  exit (name "Cell: <cellname>", mark+table only, no iface/port/container)
- apply_routes bridges cell_relay to existing route_via path via
  apply_peer_route_via + cell firewall rules + set_exit_relay_active;
  keeps peer.route_via in sync
- _probe_cell_relay health from cell handshake + offer state
- _cell_relay_loops loop detection at assign and apply time
- FAILOPEN_DEFAULTS cell_relay=False
- set_peer_exit clears stale route_via on reassignment
- reconcile hooked into PUT /exit-offer and peer-sync/permissions handlers
- cell_link_manager + wireguard_manager wired into connectivity_manager
- UI: cell_relay in TYPE_META/GROUP_TYPES/GROUP_LABELS (Cells optgroup),
  removed "coming soon" placeholder
- 18 new tests in tests/test_connectivity_cell_relay.py

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 23:58:19 -04:00
roof 391d8ede48 merge: connectivity phase 6 UI (subpages, assignment matrix, Cell Network merge)
Unit Tests / test (push) Successful in 13m15s
2026-06-10 23:11:44 -04:00
roof 603225694c feat: connectivity redesign phase 5 — one container per connection instance
Unit Tests / test (push) Successful in 13m5s
Adds instanceable rendering, per-instance up/down on create/delete, a
store-service-installed gate, and per-instance health.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 22:56:31 -04:00
roof aba2b0d33f feat: connectivity redesign phase 6 — subpages UI, assignment matrix, Cell Network merge
Replace the monolithic Connectivity page with Services-style subpages:
overview dashboard (aggregated status), per-type connection lists (tunnels/
proxies/ssh/tor) with add/edit forms + lifecycle/health badges + empty states,
a peer+service assignment matrix with per-peer fail-open toggle, and Cell
Network moved under /connectivity/cells. Sidebar gains Connectivity children,
hidden when a type has no instances and its store service isn't installed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 22:53:46 -04:00
roof d39c091cec feat: connectivity redesign phase 3+4 — per-connection health, per-peer fallback, connection CRUD API
Unit Tests / test (push) Successful in 13m15s
Health probes (probe_health/refresh_health) are type-aware: WireGuard
checks the last WG handshake timestamp, OpenVPN checks the tun/tap
interface, Tor checks the control-port GETINFO, and sshuttle/proxy
types do a TCP reachability probe to the remote endpoint. Results are
persisted via set_connection_status and wired into the health_monitor_loop
so the UI always has a current health snapshot without polling.

Per-peer fail-open semantics: VPN, SSH, and proxy connections default to
fail-closed (kill-switch stays active even when the tunnel is down).
Tor defaults to fail-open. The default can be overridden per-peer via
set_peer_failopen/effective_failopen. apply_routes skips the fwmark and
kill-switch rules for any fail-open peer whose connection health is not
"working", letting traffic fall back to direct routing transparently.

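The fail-open rule above can be illustrated with a small sketch. The connection type keys and the `should_apply_rules` helper are assumptions made for illustration; only the defaults (Tor fail-open, everything else fail-closed), the per-peer override, and the "working" health value come from the description.

```python
from typing import Optional

# Assumed type keys; the defaults follow the paragraph above.
FAILOPEN_DEFAULTS = {
    "wireguard": False,
    "openvpn": False,
    "sshuttle": False,
    "proxy": False,
    "tor": True,
}


def effective_failopen(conn_type: str, peer_override: Optional[bool] = None) -> bool:
    """A per-peer override wins; otherwise fall back to the type default."""
    if peer_override is not None:
        return peer_override
    return FAILOPEN_DEFAULTS.get(conn_type, False)  # unknown types fail closed


def should_apply_rules(conn_type: str, health: str,
                       peer_override: Optional[bool] = None) -> bool:
    """True when the fwmark and kill-switch rules should be installed."""
    if effective_failopen(conn_type, peer_override) and health != "working":
        return False  # fail-open peer with an unhealthy tunnel: route directly
    return True
```

A fail-closed peer keeps its kill-switch even when the tunnel is down, which is why `should_apply_rules` returns True for it regardless of health.
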
New generic admin-only connection CRUD endpoints (GET/POST/PUT/DELETE
/api/connectivity/connections, GET /<id>/health, PUT
/api/connectivity/peers/<peer>/failopen) are guarded by the existing
admin role check. connection.create, connection.update, connection.delete,
and peer.failopen are all registered in ROUTE_ACTION_MAP for the audit
hook so every change is recorded in the owner-visible change log.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 21:50:45 -04:00
roof 8b50fb1036 feat: audit/change log — owner-visible record of who changed what
Unit Tests / test (push) Successful in 12m47s
Add AuditManager (api/audit_manager.py): JSONL append-only log at
data/api/audit/audit.log with SHA-256 hash chain for tamper detection,
verify endpoint, size-based rotation, and automatic redaction of secret
fields before any entry is written. Supports structured query (actor,
action, date range) and CSV export.
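
As a rough illustration of the hash-chain idea (not the actual AuditManager code), each JSONL record can carry the previous record's hash so that any edit to an earlier line breaks verification. The record layout (`prev`, `payload`, `hash`) and the all-zero genesis value are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first record


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(lines: list, payload: dict) -> None:
    """Append one JSONL record chained to the last one."""
    prev = json.loads(lines[-1])["hash"] if lines else GENESIS
    record = {"prev": prev, "payload": payload,
              "hash": entry_hash(prev, payload)}
    lines.append(json.dumps(record, sort_keys=True))


def verify_chain(lines: list) -> bool:
    """Recompute every hash; any mismatch means the log was altered."""
    prev = GENESIS
    for line in lines:
        record = json.loads(line)
        if record["prev"] != prev:
            return False
        if record["hash"] != entry_hash(prev, record["payload"]):
            return False
        prev = record["hash"]
    return True
```

Here `lines` stands in for the lines of the log file; a real implementation would append to disk and handle rotation.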

Wire an @app.after_request hook in app.py that fires on every mutating
/api/* request: captures actor, role, remote IP, and maps the route +
method to a human-readable action via ROUTE_ACTION_MAP. Explicit audit
entries for password_change and password_reset are added in
auth_routes.py so those events record the actor without logging secret
values.

Expose an admin-only blueprint (api/routes/audit.py):
  GET /api/audit          — paginated query
  GET /api/audit/export   — CSV download
  GET /api/audit/verify   — hash-chain integrity check

Register AuditManager in managers.py and add api/audit to
config_manager.py critical_data_paths so it is included in backups and
restored with other persistent state.

Add Activity page (webui/src/pages/Activity.jsx, admin-only) reachable
from the nav in App.jsx. New auditAPI helper in api.js covers all three
endpoints.

Tests: test_audit_manager.py (unit: hash chain, redaction, rotation,
query, csv, verify) and test_audit_hook_routes.py (integration: hook
fires on mutating routes, skips safe methods, records actor/ip/action,
backup-inclusion assertion).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 20:19:38 -04:00
roof 13074f56cb fix: logging verbosity now actually applies + per-service log levels
Unit Tests / test (push) Successful in 12m34s
Root causes fixed:
- Dead LOG_LEVEL globals() lookup pinned root logger at INFO regardless of
  PIC_LOG_LEVEL env or config; replaced with _resolve_root_log_level() +
  apply_root_log_level() which sets both root logger and all attached handlers
  at startup and on runtime re-apply.
- set_service_level() only set the named 'pic.<service>' logger; bare module
  loggers (e.g. 'caddy_manager') were never reached, so per-service log files
  stayed 0 bytes. Fixed via _SERVICE_MODULE_LOGGERS map covering all managers.
- Log viewer GET /api/logs had no level filter; added ?level= query param.
- Per-service log levels lived in an out-of-band config/api/log_levels.json
  side-file with no validation; migrated into ConfigManager under a new
  'logging' section ({python:{root,services}, containers:{caddy,coredns,
  wireguard,mailserver,api}}) with get/set helpers, invalid-level rejection,
  and one-time migration from the old file on first load.
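
The second root cause can be sketched as below: setting a level on 'pic.<service>' alone never reaches a bare module logger, so the map fans the level out to both. The map contents here are illustrative assumptions, not the real _SERVICE_MODULE_LOGGERS.

```python
import logging

# Illustrative subset; the real map is said to cover all managers.
_SERVICE_MODULE_LOGGERS = {
    "caddy": ["caddy_manager"],
    "wireguard": ["wireguard_manager"],
}

_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def set_service_level(service: str, level_name: str) -> list:
    """Apply a level to 'pic.<service>' and to its bare module loggers."""
    try:
        level = _LEVELS[level_name.upper()]
    except KeyError:
        raise ValueError(f"invalid log level: {level_name!r}") from None
    names = [f"pic.{service}"] + _SERVICE_MODULE_LOGGERS.get(service, [])
    for name in names:
        logging.getLogger(name).setLevel(level)
    return names
```

Returning the affected logger names is a convenience of this sketch and makes the fan-out easy to assert in tests.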

New capabilities:
- Container log levels: Caddy (injects global log { level X } + hot reload),
  CoreDNS (DEBUG enables log plugin, else errors-only), WireGuard/mailserver
  via pending_restart path.
- PUT /api/logs/verbosity accepts {python, containers} dict; returns per-entry
  applied:hot|pending_restart status.
- Webui Logs page gains two-section Verbosity tab (Python services + Container
  services) with needs-restart badges.
- managers.py wires per-service loggers before manager instantiation and
  re-applies persisted levels from ConfigManager; legacy log_levels.json read
  removed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 19:14:01 -04:00
roof 89aed4efe0 feat: connectivity redesign phase 2 — instance-aware routing + reference connections by id
Unit Tests / test (push) Successful in 12m6s
apply_routes now iterates over connection instances rather than types:
each instance gets its own fwmark, routing table, interface, and
redirect_port via _routing_connections / _resolve_peer_connection /
_apply_connection_for_src; kill-switch is enforced per iface-instance.
Old per-type MARKS/TABLES constants are kept only as migration scaffolding.

peer_registry: exit_via is now stored as a connection id (or 'default');
_migrate_exit_via_to_connection_id runs on _load_peers to upgrade legacy
type-string values; set_peer_exit_via validates against known connection
ids; VALID_EXIT_VIA removed; config_manager wired in from managers.py.

egress_manager: egress_overrides keyed by service_id → connection_id;
local MARKS/TABLES/EXIT_TYPES/_REDIRECT_PORTS/_add_tor_redirect removed;
(mark, table, redirect_port) resolved at apply-time via
connectivity_manager.get_connection; manifest egress.allowed still
enforced by connection type.

api/app.py + api.js: PUT peer/service exit endpoints accept {connection_id};
back-compat shim resolves a legacy type string to its single active instance.

Tests extended: two same-type instances produce distinct marks/tables/ports;
peer exit_via and egress override id migrations round-trip correctly;
single-instance behaviour is equivalent to the old type-keyed path.
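
A legacy-value migration of this kind might look like the following sketch. It is not the real _migrate_exit_via_to_connection_id; the dict shapes and the choice to fall back to 'default' when a type has zero or several instances are assumptions.

```python
def migrate_exit_via(peers: list, connections: list) -> list:
    """Upgrade legacy type-string exit_via values to connection ids in place."""
    known_ids = {conn["id"] for conn in connections}
    ids_by_type = {}
    for conn in connections:
        ids_by_type.setdefault(conn["type"], []).append(conn["id"])

    for peer in peers:
        value = peer.get("exit_via", "default")
        if value == "default" or value in known_ids:
            continue  # already migrated, so the function is idempotent
        matches = ids_by_type.get(value, [])
        # Assumption: only an unambiguous single instance is adopted.
        peer["exit_via"] = matches[0] if len(matches) == 1 else "default"
    return peers
```

Running it twice leaves the data unchanged, which matches the round-trip expectation in the tests described above.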

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 17:35:28 -04:00
roof 5b9d20eeac feat: connectivity redesign phase 1 — multi-instance connection data model
Unit Tests / test (push) Successful in 12m51s
Migrate from the single-exit-per-type model (one wireguard_exit, one
tor_exit, etc.) to N named connection instances, each carrying its own
resource allocations and vault-backed secret refs.

config_manager.py:
- Connectivity v2 schema: top-level `connections` list, each entry has
  id, name, type, enabled, status, config, secret_ref, and allocated
  resources (mark, table, iface, redirect_port).
- Helpers: get_connectivity / list_connections / get_connection /
  add_connection / update_connection / delete_connection /
  set_connection_status.
- v1→v2 migration: promotes legacy wireguard_exit / tor fields into
  the new list on first load; idempotent on v2 configs.

connectivity_manager.py:
- Resource allocator: per-instance fwmark range 0x1000–0x1FFF, routing
  table range 1000+, interface names, and redirect ports 9100–9199;
  all tracked in config to survive restarts.
- Connection CRUD: create / update / delete / list / get with vault
  secret refs for WireGuard private keys and Tor credentials.
- Single-Tor enforcement: rejects a second tor/tor_bridge instance at
  creation time.
- Per-instance config validation for each connection type.
- apply_routes, peer wiring, and egress hookups are intentionally left
  unchanged in this phase; they land in later phases alongside UI.
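
The resource allocator described above could be approximated as in this sketch, which picks the lowest unused value in each range. The ranges come from the commit message; the function names, the dict keys, and the lowest-free-value policy are assumptions.

```python
import itertools
from typing import Iterable, Iterator

FWMARK_RANGE = range(0x1000, 0x2000)     # 0x1000-0x1FFF
REDIRECT_PORT_RANGE = range(9100, 9200)  # 9100-9199
TABLE_START = 1000                       # "1000+", open-ended


def first_free(candidates: Iterable[int], used: Iterable[int]) -> int:
    """Return the first candidate not already taken."""
    taken = set(used)
    for value in candidates:
        if value not in taken:
            return value
    raise RuntimeError("no free value left in range")


def allocate(connections: list) -> dict:
    """Allocate mark, table and redirect_port for one new connection."""
    tables: Iterator[int] = itertools.count(TABLE_START)
    return {
        "mark": first_free(FWMARK_RANGE, (c["mark"] for c in connections)),
        "table": first_free(tables, (c["table"] for c in connections)),
        "redirect_port": first_free(
            REDIRECT_PORT_RANGE, (c["redirect_port"] for c in connections)),
    }
```

Because the allocations are derived from the stored connection list, they survive a restart as long as that list is persisted in config.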

tests/test_connectivity_connections.py (new, 473 lines):
- Allocator uniqueness, v1→v2 migration round-trip, CRUD lifecycle,
  single-Tor enforcement, and status transitions.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 16:34:56 -04:00
roof 8a9f4f50c6 docs: bring all docs current with this session's changes
Unit Tests / test (push) Successful in 12m12s
Update README, QUICKSTART, wiki, service-developer-guide, and CLAUDE.md for:
optional store services (email/calendar/files), sshuttle+proxy egress exits,
provider-aware Network Services/DNS overview, DHCP/dnsmasq removal, split-horizon
VPN DNS, container hardening (slim images, unprivileged WireGuard, webui port 8080,
pinned ntp/coredns), installer changes (host NTP, PIC_DEBUG, clean output, systemd),
and the backup overhaul (full secrets coverage + optional passphrase encryption).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:56:03 -04:00
roof 82a0c0e9bd fix: overhaul backup/restore — full secrets coverage, ordered reapply, optional passphrase encryption
Unit Tests / test (push) Successful in 12m25s
P0: backups previously omitted the peers, keys, vault (CA + Fernet), auth, cell-links, DDNS, and
connectivity configs (so a restore lost everything, including the admin login and CA) and included
logs and trash; restore only copied files and did not reapply runtime state.

Changes:
- api/config_manager.py: backup_config now includes auth_users.json, .flask_secret_key,
  peers.json, peer_service_credentials.json, WireGuard keys + wg_confs + api/wireguard/keys,
  vault/** (incl fernet.key), api/services + service configs, cell_links.json, ddns_token,
  caddy/**; new _is_excluded() drops logs/config_backups/.test_admin_pass/.gitkeep/*.tmp/
  *.partial/__pycache__; restore_config reordered (vault/fernet → config → wg keys/peers →
  cell_links → caddy/dns → service configs → auth/ddns → volumes) + new _reapply_runtime_state()
  (regenerate Caddyfile/Corefile, reapply services, connectivity apply_routes, replay cell pushes)
- api/backup_crypto.py (new): optional passphrase encryption via scrypt-derived key + Fernet;
  encrypted archives written 0600
- api/routes/config.py: backup/restore accept optional {passphrase}; wrong/missing passphrase
  returns 400; backup response warns it contains secrets
- Makefile: backup target applies same excludes + chmod 0600 + secrets warning
- webui/src/services/api.js + webui/src/pages/Settings.jsx: passphrase field on create backup,
  restore prompt, "contains secrets" banner
- tests/test_config_backup_overhaul.py (new, 18 tests) + tests/test_config_backup_restore_http.py
  (2 assertions updated)
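
The key-derivation half of the passphrase encryption can be sketched with the standard library alone. The scrypt cost parameters and the 16-byte salt are assumptions, not values taken from api/backup_crypto.py, and the Fernet step itself is omitted because it needs the third-party cryptography package.

```python
import base64
import hashlib
import os


def new_salt() -> bytes:
    """Random per-archive salt (16 bytes is an assumed size)."""
    return os.urandom(16)


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key with scrypt and encode it url-safe base64,
    which is the key format Fernet expects."""
    raw = hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**14,  # assumed cost parameters
        r=8,
        p=1,
        dklen=32,
    )
    return base64.urlsafe_b64encode(raw)
```

The salt has to be stored alongside the encrypted archive so the same key can be derived again at restore time.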

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 15:41:10 -04:00
roof c3ba82251a fix: update WG tests to assert rp_filter is absent from PostUp/PostDown
Unit Tests / test (push) Successful in 11m46s
The pic1 commit (c65beb2) correctly removed rp_filter sysctl from
WireGuard PostUp/PostDown because writing /proc/sys fails in the
unprivileged (NET_ADMIN-only) container and crashed wg-quick. Two
tests that asserted rp_filter was present were left stale. Replace
them with a single test asserting rp_filter is NOT in the generated
config, restoring green main.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:53:58 -04:00
roof c65beb27a6 fix: remove sysctl rp_filter from WireGuard PostUp/PostDown
Unit Tests / test (push) Failing after 11m57s
sysctl writes to /proc/sys/net/ are blocked in unprivileged containers
(NET_ADMIN only, no SYS_ADMIN). The rp_filter=0 call at the end of
PostUp caused wg-quick to tear down wg0 immediately on every start,
putting cell-wireguard into a crash loop.

Remove the sysctl lines from both the seed (setup_cell.py) and the
API-regenerated (wireguard_manager.py) wg0.conf. Reverse-path filtering
is an optimisation, not required for VPN functionality; the iptables
FORWARD/MASQUERADE/DNAT rules all still work correctly without it.

Found during clean-install hardening verification on pic1 (f4b8d5c).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-10 14:33:05 -04:00
roof f4b8d5c4f7 harden containers: drop WG privileged, slim images, digest pins; fix WG path + empty chrony.conf
Unit Tests / test (push) Successful in 12m16s
Security — WireGuard:
- Replace linuxserver/wireguard (privileged + SYS_MODULE + /lib/modules) with a
  bespoke alpine image (wireguard/Dockerfile + entrypoint.sh): CAP_NET_ADMIN only,
  119 MB → 14.7 MB. Modern kernels (≥5.6) have WireGuard built in; no module
  loading required. Kernel-fallback comment left in compose for rare old kernels.

Security — supply-chain digest pins:
- CoreDNS image pinned by SHA-256 digest in docker-compose.yml.
- api/Dockerfile: python:3.11-slim and docker:27-cli pinned by digest.
- webui/Dockerfile: node:20-alpine and nginxinc/nginx-unprivileged:alpine pinned.
- ntp/Dockerfile: alpine:3.20 pinned by digest.
- wireguard/Dockerfile: alpine:3.20 pinned by digest.

Security — webui non-root:
- Switch from nginx:alpine (root, port 80) to nginxinc/nginx-unprivileged:alpine
  (port 8080, runs as nginx uid 101). Compose port mapping and all Caddy upstream
  references updated: cell-webui:80 → cell-webui:8080 everywhere.

API layer reduction (561 MB → 245 MB):
- Multi-stage api/Dockerfile: docker CLI copied from docker:27-cli stage instead
  of being installed via apt from Docker's external repo (removes GPG key fetch,
  lsb-release, gnupg, two apt-get update rounds). --no-install-recommends on
  remaining apt install. mkdir folded into the same RUN layer.

Bug fix — WireGuard config path mismatch:
- setup_cell.py wrote wg0.conf to config/wireguard/wg0.conf but wireguard_manager
  and the new entrypoint expect config/wireguard/wg_confs/wg0.conf (the standard
  wg-quick sub-directory). Fixed by creating the wg_confs/ sub-dir and writing
  there; REQUIRED_DIRS updated to pre-create it.

Bug fix — empty chrony.conf:
- config/ntp/chrony.conf was 0 bytes (pre-existing gap); added a real config
  (pool.ntp.org + Cloudflare, allow 172.20/10.0, local stratum 10, driftfile,
  makestep, rtcsync). NTP compose service now builds from ./ntp instead of
  pulling alpine:latest and running apk at every container start.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:07:54 -04:00
roof fb257c50b3 test: cover startup Caddyfile regeneration to prevent restart-loop regression
Unit Tests / test (push) Successful in 11m56s
Adds TestStartupCaddyRegen::test_startup_regenerates_caddyfile_first,
asserting that _apply_startup_enforcement() calls
caddy_manager.regenerate_with_installed([]) before any peer/iptables work.
This pins the fix that ensures a stale on-disk Caddyfile (e.g. missing
`admin 0.0.0.0:2019`) is overwritten at startup and cannot cause the health
monitor to restart Caddy every few minutes.

Also restores two displaced lines in test_health_history_maxlen_evicts_old_entries.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 13:18:42 -04:00
roof 5cb8ebe430 fix: quiet installer output for non-technical users; Makefile/compose cleanup
Unit Tests / test (push) Successful in 12m18s
The installer dumped ~200 lines of docker layer spam, a leaked apt error,
and obsolete version warnings, which alarmed non-technical users.

install.sh:
- Clean, progress-only default output; full log to /var/log/pic-install.log
- Admin password still surfaced on stdout at the end
- PIC_DEBUG=1 / --debug flag restores verbose output
- On error, prints the last 20 lines from the log file

Makefile:
- start / update / start-core compose invocations get @ prefix to suppress
  command echo, plus --quiet-pull to kill layer-download spam

docker-compose.yml + docker-compose.services.yml:
- Removed obsolete `version: '3.3'` top-level key (triggers deprecation
  warning with current Docker Compose)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 13:01:48 -04:00
roof 1daace48eb fix: DNS first-install — split-horizon zone creation + CoreDNS inode bind-mount
VPN clients got dns_probe_finished_bad_config / couldn't resolve any domain
after first setup because:

1. complete_setup() never wrote the split-horizon DNS zone for non-LAN modes;
   SetupManager now accepts network_manager as an optional 3rd constructor
   param, and complete_setup() calls
   self.network_manager.update_split_horizon_zone(effective_domain, wg_ip,
   primary_domain) for pic_ngo/cell_to_cell modes.

2. generate_corefile() used a tmp-file + os.replace pattern; the Corefile is
   a Docker FILE bind-mount, so os.replace orphaned the inode and CoreDNS
   never saw config updates.  Fixed by truncating and rewriting in place
   (open with 'w', seek(0), truncate()), preserving the inode CoreDNS holds.
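
The inode behaviour in item 2 can be reproduced with a small standalone sketch. The function names are hypothetical and this is not the generate_corefile() code; it only shows that rewriting in place keeps the inode while the tmp-file + os.replace pattern swaps it.

```python
import os
import tempfile


def rewrite_in_place(path: str, content: str) -> None:
    # Opening with "w" truncates the existing file; the inode is unchanged,
    # so a file bind-mount keeps pointing at the updated content.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(content)


def replace_via_tmp(path: str, content: str) -> None:
    # Atomic on the host, but the path now refers to a new inode.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.write(content)
    os.replace(tmp_path, path)


def demo() -> tuple:
    """Return (inode kept after in-place rewrite, inode kept after replace)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "Corefile")
        rewrite_in_place(path, "v1\n")  # creates the file
        original = os.stat(path).st_ino
        rewrite_in_place(path, "v2\n")
        kept_in_place = os.stat(path).st_ino == original
        replace_via_tmp(path, "v3\n")
        kept_after_replace = os.stat(path).st_ino == original
        return kept_in_place, kept_after_replace
```

The trade-off is that the in-place write is not atomic, so a reader could briefly observe a partially written file.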

api/managers.py passes network_manager into SetupManager.
Tests: new mock_network_manager fixture, 2 setup-zone tests, 1 inode
regression test in test_firewall_manager.py.
Verified live on pic1.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:48:37 -04:00
roof a9c7235347 fix: install chrony for host NTP and enable pic.service on cold install
Unit Tests / test (push) Successful in 12m0s
Root-cause fix for ACME failures caused by clock drift breaking TOTP
during DDNS registration: install and start chrony (all supported package
managers) before the setup wizard runs, so the host clock is accurate from
day one.

Also enables and starts the pic systemd unit at the end of a cold install —
previously the unit file was written but never activated, so the stack would
not survive a reboot without a manual `systemctl enable --now pic`.

Makefile uninstall hardened: `disable --now` instead of bare `disable` so the
running unit is stopped before the unit file is removed; daemon-reload called
afterwards to flush the stale unit; and all lingering cell-* containers
(tor/sshuttle/redsocks/store services) are now force-removed so subsequent
reinstalls start from a clean Docker state.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 09:38:03 -04:00
roof aa1e5c41ec test: raise coverage 68.7% -> ~80.4%; add ~250 tests for new egress/DDNS/network paths
Unit Tests / test (push) Successful in 12m6s
Coverage was below acceptable levels and several newly-added code paths
(sshuttle egress, proxy egress, DDNS provider stubs, DNS overview route,
peer-registry provisioning) had zero test coverage.

~250 new unit tests are added across 16 new test files. Existing test files
are updated to match refactored interfaces (DHCP removed, constants
introduced, network_manager restructured). .coveragerc is added to pin the
source mapping and the 70% floor so regressions are caught at commit time.

tests/test_enhanced_api.py was previously living in api/ (wrong location)
and is moved to tests/ where it belongs.

Integration test files are updated to remove references to DHCP endpoints
and add coverage for the new DNS overview and DDNS sync endpoints.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 09:03:39 -04:00
roof c41cadafb4 refactor: Network Services rebuilt, DHCP decommissioned, infra cleanup
Network Services page is rebuilt around real API data: GET /api/dns/overview
returns provider-aware records; per-service Cloudflare sync is exposed via
POST /api/ddns/sync; effective domain is displayed so operators can verify
what external name resolves to the cell; NTP status reflects the actual
systemd-timesyncd state rather than a hardcoded boolean.

DHCP is fully decommissioned: the cell-dhcp container is removed from
docker-compose.yml, DHCP methods are stripped from network_manager, the
setup_cell script no longer seeds DHCP config, and the Settings DHCP field
is gone. DHCP was never a PIC responsibility and the container was consuming
resources for no benefit.

Dead code removed: api/config.py (superseded by config_manager), the
standalone Email/Calendar/Files pages (these are now optional store services
and do not need dedicated pages). api/constants.py is introduced to hold
RESERVED_SUBDOMAINS in one place rather than scattered literals.

Docker resource limits (mem_limit, cpus, pids_limit) are added to all
compose services so a runaway process cannot starve the host.

Makefile gains a warning before the backup target so operators are not
surprised by the archive path. A Settings save/accept state fix ensures
the Cell Identity section correctly shows the accept/discard banner and
does not flash a false-positive change indicator on first load.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 08:50:00 -04:00
roof 6232ef23a9 feat: connectivity — registry-driven peer table, sshuttle/proxy egress, egress UI
The peer table was empty because it was not consulting the peer registry;
now peers are driven by PeerRegistry so the Connectivity page reflects actual
connected cells.

Exit-key handling is unified: all code paths now use the same key derivation
so a store-service exit bridge and a manual WireGuard peer both produce
consistent routing state.

Two new egress exit types are added (sshuttle via SSH tunnel and proxy via
redsocks SOCKS5), wiring through connectivity_manager, egress_manager, and
app.py routes. This lets a cell route its traffic through an SSH host or a
SOCKS5 proxy as an alternative to WireGuard exit nodes.

ServiceStoreManager and ServiceBus updated so the egress lifecycle (install /
uninstall) is cleanly signalled between components.

Connectivity.jsx gains the Service Egress section, letting operators assign
and reassign egress methods from the UI without touching config files.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 08:36:15 -04:00
roof cc7a223fdf fix: P0/P1 audit fixes — DDNS correctness, peer provisioning gates, honest stubs
CloudflareDDNS.update() was calling the wrong endpoint; fix to use the
correct zone-records API so DDNS updates actually land.

NoIP and FreeDNS providers now return explicit "not implemented" errors
instead of silently claiming success, preventing false-positive health state.

PicNgoDNS ACME dns-challenge now sends the token in the request body (was
missing), so cert issuance no longer silently fails.

add_peer gates builtin-service provisioning on the installed-services list
so a freshly-provisioned peer does not attempt to configure services that
aren't present, eliminating the startup error loop.

Startup Caddyfile regeneration added to routes/config.py so that a stale
on-disk Caddyfile no longer triggers the health-monitor restart loop after
a config change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 08:23:00 -04:00
roof 649378b59b fix: resolve all Cell Identity banner and cert issues
Unit Tests / test (push) Successful in 7m17s
Four bugs fixed:

1. Banner delay (up to 5 s): DraftConfigContext now exposes isDirty as
   reactive useState so App.jsx re-renders immediately when any section
   marks itself dirty, instead of waiting for the next checkPending() poll.

2. Banner re-triggers after Apply (race): For non-'*' container restarts
   (e.g., cell_name → DNS restart) the background thread took ~300 ms to
   clear _pending_restart. A concurrent checkPending() poll could see
   needs_restart=True and overwrite the frontend's optimistic clear.
   Fix: set needs_restart=False and applying=True synchronously before
   spawning the thread.

3. Apply showed banner during applyPending() when hasDirty()==false:
   setApplyStatus('saving') was skipped for the auto-save-then-apply
   path, leaving applyStatus=null while applyPending() ran and the
   banner stayed visible. Always set 'saving' before applyPending().

4. Cert status always 'unknown' in pic_ngo mode: _check_cert_via_ssl
   connected to cell-caddy:443 but sent SNI='cell-caddy', so Caddy found
   no matching cert and returned nothing. Fix: pass the effective public
   domain (e.g. pic1.pic.ngo) as SNI so Caddy returns the right cert.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-10 04:17:56 -04:00
roof ec8995d41e fix: Cell Identity changes now show Configuration changes pending banner
Unit Tests / test (push) Successful in 7m26s
DraftConfig dirty state (set when any Cell Identity field changes) was
tracked in refs but never checked by the banner, which only looked at
backend pending state. Cell name changes in pic_ngo mode intentionally
block auto-save (to prevent premature DDNS re-registration), so the
backend never marked pending and the banner never appeared.

Fix: show the banner when hasDirty() is true in addition to backend
pending. Add clearAllDirty() to DraftConfigContext so Cancel immediately
clears frontend dirty state without waiting for the next 5-second poll.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 16:17:51 -04:00
roof 2085f77733 Fix Settings: restore Accept/Discard flow for Cell Identity
Unit Tests / test (push) Successful in 7m26s
The previous commit incorrectly added a standalone Save button to the
Cell Identity section. The Settings page already has a global
Accept/Discard flow (DraftConfig) where all section changes accumulate
in state and are only committed when the user presses Accept. The Save
button bypassed that pattern entirely.

Fix: remove the Save button. Cell Identity changes now follow the same
flow as every other section — edit → dirty state → Accept to commit,
Discard to revert. The pic_ngo cell-name auto-save block from the prior
commit is kept: the change accumulates until Accept, at which point the
DraftConfig flusher calls saveIdentity() and the DDNS re-registration
happens.

Update the regression tests to reflect the correct pattern: they now
verify that dirty state is set (triggering the Accept/Discard banner),
that auto-save is blocked for pic_ngo cell name changes, that auto-save
fires for ip_range changes, and that the flusher path (Accept) saves.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 15:50:48 -04:00
roof 36bc32543d Remove unused advanced zone field; add explicit Identity Save button
Unit Tests / test (push) Successful in 7m25s
Two changes:

1. Remove 'Internal zone name (advanced)' from Settings. The field
   edited _identity.domain (the internal .cell TLD) which no user
   should ever change post-install — changing it breaks all internal
   service DNS. Removed the Advanced collapse section and the
   showAdvancedZone state. The LAN-mode 'Local Domain' field is kept
   since that mode genuinely needs a user-editable domain value.

2. Add an explicit Save button to the Cell Identity section. The
   previous auto-save fix (no auto-save for pic_ngo cell name changes)
   accidentally removed the only way to save those changes. The Save
   button appears whenever the section is dirty and is disabled when:
   - there are validation errors, or
   - domainMode is pic_ngo, cell name changed, and the availability
     check hasn't confirmed the name is free yet.

Adds 8 Vitest regression tests covering Save button visibility,
disabled states, that auto-save is blocked for pic_ngo cell name
changes, and that it still fires for ip_range-only changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 15:32:30 -04:00
roof 348fd8faad Fix Settings: stop auto-registering DDNS on cell name change
Unit Tests / test (push) Successful in 7m37s
Two bugs in the pic_ngo availability + auto-save flow:

1. Availability check fired on page load even when cell_name matched
   the currently-registered name — sending unnecessary check requests
   to the DDNS server and showing 'taken' for the user's own name.
   Fix: skip the check when identity.cell_name === loadedCellName.

2. Auto-save triggered DDNS re-registration (release old subdomain +
   register new one) as soon as picAvail became 'available' — without
   the user pressing Accept. This happened because picAvail was in
   the auto-save effect's dependency array, so it re-ran whenever the
   availability check completed.
   Fix: block auto-save entirely for pic_ngo cell name changes; the
   user must press Accept explicitly since re-registration is
   irreversible.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 15:09:53 -04:00
roof 9ad9fac8dd Fix Settings crash: temporal dead zone on checkDdnsStatus
Unit Tests / test (push) Successful in 7m37s
checkDdnsStatus was declared via useCallback at line ~526 but referenced
in a useEffect dependency array at line 419 — before its declaration.
JavaScript const/let bindings are hoisted but stay uninitialized until
their declaration executes; accessing them earlier throws a
ReferenceError (temporal dead zone). In the production build
this surfaced as:

  ReferenceError: Cannot access 'Pn' before initialization

and caused the Settings page to crash blank on load.

Moved the checkDdnsStatus useCallback definition to immediately before
the useEffect that lists it as a dependency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 12:42:16 -04:00
roof c1e93f2058 Fix stale DNS zone after wizard completes (#8)
Unit Tests / test (push) Successful in 7m29s
_bootstrap_dns runs at container start before the wizard, writing the
default cell name ('mycell') into cell.zone.  When the wizard completed
it fired IDENTITY_CHANGED for Caddy but never updated the DNS zone, so
DNS records kept showing 'mycell.cell' even after naming the cell.

After successful wizard completion, call network_manager.apply_cell_name
to rename the hostname record in the primary zone file, then reload
CoreDNS.  The empty old_name triggers auto-detection so it works even
when the zone was written with the env-var default.

Adds test_setup_route.py covering: apply_cell_name called on success,
not called on failure, 410 on repeat completion, and IDENTITY_CHANGED
publication.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 05:14:22 -04:00
roof 3d750ed1e8 Fix DDNS security and reliability gaps (#2, #3, #5, #6, #7)
Unit Tests / test (push) Successful in 7m23s
- Fix #2: Move DDNS bearer token from cell_config.json to data/api/ddns_token.
  Token is now in the secrets store (data/) rather than the config store (config/).
  Auto-migrates existing installs on first access. ConfigManager.get/set_ddns_token()
  added. set_ddns_config() now strips 'token' key to prevent it leaking back.

- Fix #3: Set Caddyfile permissions to 0o600 after write so the token embedded
  in the Caddyfile is not world-readable on the host filesystem.

- Fix #5: Heartbeat now fires IDENTITY_CHANGED after re-registration so Caddy
  regenerates its config with the new token automatically — users no longer need
  to click Re-register in Settings after a wizard registration failure.
  Also: heartbeat skips the 401-cycle when no token exists and goes straight to
  registration instead. DDNSManager now accepts service_bus= and is wired up.

- Fix #6: Settings page starts polling GET /api/caddy/cert-status every 15s
  after a successful DDNS re-registration and shows "Acquiring certificate…"
  feedback until Let's Encrypt issues the cert (up to 5 minutes).

- Fix #7: regenerate_with_installed() is debounced (5 s window) so two rapid
  IDENTITY_CHANGED events (e.g. wizard + heartbeat) can't start simultaneous
  ACME orders that interfere with each other.
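
The debounce in Fix #7 could look roughly like the sketch below. It assumes a leading-edge debounce (the first event runs, later events inside the window are dropped); the commit does not say which variant regenerate_with_installed() uses, and the `Debouncer` class and its injectable `clock` are hypothetical.

```python
import threading
import time


class Debouncer:
    """Accepts a call only if `window` seconds have passed since the last accepted one."""

    def __init__(self, window=5.0, clock=time.monotonic):
        self._window = window
        self._clock = clock
        self._last_run = None
        self._lock = threading.Lock()  # events may arrive from different threads

    def should_run(self):
        with self._lock:
            now = self._clock()
            if self._last_run is not None and now - self._last_run < self._window:
                return False  # still inside the window: drop this event
            self._last_run = now
            return True


# Simulated timeline: wizard event at t=100, heartbeat at t=102, later event at t=105.5.
fake_now = [100.0]
debouncer = Debouncer(window=5.0, clock=lambda: fake_now[0])
decisions = []
for t in (100.0, 102.0, 105.5):
    fake_now[0] = t
    decisions.append(debouncer.should_run())
```

With this shape the heartbeat event two seconds after the wizard event is dropped, so only one ACME order is started per window.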

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 03:37:48 -04:00
roof 40f9d90fad feat: improve setup wizard and DDNS UX
Unit Tests / test (push) Successful in 7m29s
Setup wizard (Issue 1 — UI):
- pic.ngo subdomain input now uses the same split-field style as DuckDNS:
  input + static '.pic.ngo' suffix in a flex row, availability status below

Setup wizard (Issue 2 — Caddy not regenerating after completion):
- complete_setup route now fires IDENTITY_CHANGED after a successful wizard
  submission so CaddyManager regenerates the Caddyfile immediately; users
  no longer need to press 'Renew Certificate' to start ACME

Settings — DDNS status (Issue 2 — domain status missing):
- New GET /api/ddns/status endpoint: returns registered flag, domain_name,
  public_ip (ipify with 30s cache), last_ip from heartbeat
- Settings DDNS section for pic_ngo now shows a live status row with
  color-coded dot (green=registered+current, yellow=registered+stale,
  gray=not registered), current public IP, and a Check button
- Status auto-refreshes on mount and after each successful re-registration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 00:36:47 -04:00
roof fb0326dae7 fix: remove auto-DDNS registration from installer; default to lan mode
Unit Tests / test (push) Successful in 7m27s
install.sh → make setup was registering 'mycell.pic.ngo' with DDNS at
install time (before the user ever opened the setup wizard). On a fresh
install the user would then open the wizard, choose 'pic1', and get a
401 OTP error because 'mycell' was already registered and the TOTP window
had moved on.

- Remove the register_with_ddns() call from setup_cell.py main(); DDNS
  registration now only happens through the setup wizard
- Change default DOMAIN_MODE from pic_ngo to lan so a bare 'make setup'
  no longer generates an ACME Caddyfile or pre-seeds a pic.ngo identity;
  the wizard collects the real cell name and domain mode from the user

make ddns-register still works for manual / scripted deployments.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 16:42:44 -04:00
roof e9077b2633 fix: Caddy health check must hit /config/ not /
Unit Tests / test (push) Successful in 7m35s
GET http://cell-caddy:2019/ returns 404 because Caddy's admin API has no
root handler.  The health monitor treated that 404 as a failure on every
check, restarted Caddy every 3 minutes, and prevented ACME from ever completing.

/config/ returns 200 + the running config JSON whenever Caddy is up and
serving — that is the correct liveness indicator.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 15:57:32 -04:00
roof da302b5d54 fix: renew_cert regenerates Caddyfile before reload
Unit Tests / test (push) Successful in 7m32s
A stale or empty-token Caddyfile on disk caused Caddy to reject the
/load request, so the Renew button appeared to do nothing. Now
renew_cert() calls regenerate_with_installed([]) first, which writes a
fresh Caddyfile from current identity/config before reloading Caddy.
This ensures a broken on-disk file never blocks ACME renewal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 14:38:30 -04:00
roof 6bd5f02b03 fix: surface DDNS registration failure during setup wizard
Unit Tests / test (push) Successful in 7m34s
Two problems on fresh install with pic_ngo mode:

1. Caddy crashed at startup because ddns.token was empty (registration
   hadn't completed yet), producing a bare `token` keyword in the
   Caddyfile that Caddy rejects with "wrong argument count".
   Fix: fall back to lan mode in _caddyfile_pic_ngo when the token is
   empty so Caddy always starts cleanly. The Caddyfile is regenerated
   once registration completes and the token is persisted.

2. DDNS registration failures were silently swallowed — the wizard
   showed "Setup complete!" with no indication that HTTPS wouldn't work.
   This made it look like everything was fine when the subdomain was
   never registered (e.g. name already taken from a previous install,
   or transient network error).
   Fix: capture the exception, classify it (name_taken vs transient),
   and return it as a `warnings` list in the setup response. The wizard
   done screen now shows amber warning cards with actionable text instead
   of auto-redirecting, giving the user a "Continue to login" button and
   a clear explanation of what went wrong.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 13:52:00 -04:00
roof 7ef294fd65 fix: fall back to lan mode in pic_ngo Caddyfile when token is empty
Unit Tests / test (push) Successful in 7m42s
On a fresh install before DDNS registration completes, ddns.token is
empty. Writing `token ` (bare keyword, no value) causes Caddy to reject
the Caddyfile at startup with "wrong argument count or unexpected line
ending after 'token'".

Guard added: if the token is empty, generate a LAN-mode Caddyfile so
Caddy starts cleanly. The Caddyfile is regenerated automatically once
registration completes and the token is persisted to cell_config.json.
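
The guard can be sketched as a small mode-selection helper. This is an illustration only: `choose_caddyfile_mode` is a hypothetical name, and the real check lives inside _caddyfile_pic_ngo according to the related commit.

```python
def choose_caddyfile_mode(domain_mode, ddns_token):
    """Return the mode to render the Caddyfile in.

    Falls back to "lan" when pic_ngo mode is configured but no token is
    available yet, so a bare `token` directive is never written.
    """
    token = (ddns_token or "").strip()
    if domain_mode == "pic_ngo" and not token:
        return "lan"
    return domain_mode


before_registration = choose_caddyfile_mode("pic_ngo", "")
after_registration = choose_caddyfile_mode("pic_ngo", "example-token")
```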

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 13:38:51 -04:00
roof 33d255f089 feat: TLS certificate management in Vault page
Unit Tests / test (push) Successful in 7m26s
Adds live cert status, one-click ACME renewal, and custom cert upload
directly to the Vault page so users never need to touch Caddy config.

Backend:
- CaddyManager.get_cert_status() now returns domain, domain_mode, and
  cert_type so the UI can render the right controls without a separate
  identity fetch
- CaddyManager.renew_cert() reloads Caddy and invalidates the status
  cache; the frontend polls until the cert turns valid
- CaddyManager.upload_custom_cert() validates PEM, writes cert+key to
  the shared config/caddy/certs/ volume, updates identity (cert_type=custom),
  and regenerates the Caddyfile so Caddy references the new paths
- LAN-mode Caddyfile switches from /etc/caddy/internal/ to the shared
  certs dir automatically when cert_type=custom is set
- ddns_api default no longer includes /api/v1 — the plugin appends it;
  legacy /api/v1 suffix is stripped at write time to keep the Caddyfile clean
- POST /api/caddy/cert-renew and POST /api/caddy/custom-cert routes added

Frontend:
- TLSPanel component at the top of Vault.jsx shows status badge
  (valid/expiring-soon/expired/pending/internal) with domain and expiry
- Renew button visible only for ACME modes; spins during the API call
  then polls GET /api/caddy/cert-status every 10 s until valid
- Upload Custom Cert opens a modal with PEM text areas; works for all modes
- caddyAPI.renewCert() and uploadCustomCert() added to api.js

Tests: 22 new tests across 5 classes covering enriched status,
renew_cert guards, upload_custom_cert validation/writes/persistence,
custom-cert Caddyfile path selection, and ddns_api suffix stripping.
All 2093 existing tests continue to pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 12:53:42 -04:00
roof 85d265187d fix: Caddy TLS cert acquisition — two DNS-01 blockers
Unit Tests / test (push) Successful in 7m32s
1. caddy_manager: embed ddns.token (registration bearer token) in
   Caddyfile, not DDNS_TOTP_SECRET. The pic_ngo plugin sends the token
   to POST /api/v1/dns-challenge; using the TOTP secret caused 401 on
   every attempt.

2. firewall_manager: add _acme-challenge.<zone> forwarding block before
   each split-horizon zone in the Corefile. Without this, CoreDNS was
   authoritative for the challenge name and returned NODATA for TXT
   queries (wildcard A record matches but wrong type), blocking Caddy's
   internal DNS pre-verification step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 10:45:15 -04:00
roof 76bbc2b67a fix: EmailManager route calls get_email_users not get_users
Unit Tests / test (push) Successful in 7m27s
The method is named get_email_users in EmailManager; the route was
calling the non-existent get_users, causing an AttributeError on every
GET /api/email/users request.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 10:12:24 -04:00
roof bd71466a87 fix: split-horizon DNS zone uses WireGuard IP, not Docker bridge IP
Unit Tests / test (push) Successful in 7m31s
VPN peers can reach Caddy via the host's WireGuard interface (10.0.0.1),
not via the Docker bridge IP (172.20.0.2) which is unreachable outside
the container network. _bootstrap_dns now calls _get_wg_server_ip()
instead of ip_utils.get_service_ips() so the internal zone returns a
routable address for service subdomains.

Also log config save failures instead of silently swallowing them —
the silent PermissionError/OSError was masking write failures and
making it impossible to diagnose why installed services disappeared
after container restarts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 02:11:01 -04:00
roof e4c80149f4 fix: start-core missing cell-network creation breaks fresh install
Unit Tests / test (push) Successful in 7m34s
make start-core (called by install.sh step 6) used $(DCF) which includes
docker-compose.services.yml — that file declares cell-network as external:true.
On a fresh machine the network doesn't exist yet, so compose up failed with
"network cell-network declared as external, but could not be found".

Fix: add the same network-create idempotency guard that start and update
already have. Also add 26 regression tests (test_install_process.py) that
verify install.sh structure and that all start-* targets using DCF create
the network before running compose up.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 01:07:00 -04:00
roof 69862331e7 fix: DDNS update token in body, webdav gating, regression tests
Unit Tests / test (push) Successful in 7m25s
- PicNgoDDNS.update(): send token in request body instead of Authorization
  header; DDNS server validates it from body (was returning HTTP 422 on
  every heartbeat, leaving IP record stale after fresh install)
- peers.py / Peers.jsx: webdav service_access only valid when 'files' store
  service is installed; was always shown even with no services, confusing
  users into thinking WebDAV was pre-installed
- 10 new regression tests: DDNS update body contract, Caddy always
  regenerates on startup with no services, peer role allowed on
  /api/services/active, webdav gating by installed services

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 16:56:12 -04:00
roof 962d137093 fix: lockout countdown shows NaN minutes
Unit Tests / test (push) Successful in 7m31s
The API returns locked_until already ending in 'Z' (UTC ISO format).
Appending another 'Z' produces an invalid date string, so Date arithmetic
yielded NaN. Remove the redundant suffix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 16:28:14 -04:00
roof 1607a2e86f fix: peer access to /api/services/active and unconditional Caddy startup regen
Unit Tests / test (push) Successful in 7m23s
- Add _PEER_READABLE_PATHS allowlist in enforce_auth so peer-role sessions
  can read /api/services/active; fixes My Services showing 'not installed'
  for cell members when services are installed
- Move Caddy regeneration before the early-return in reapply_on_startup so
  the Caddyfile is always rebuilt from current identity on startup, even when
  no store services are installed; fixes ERR_SSL_PROTOCOL_ERROR after a cell
  rename (Caddyfile retained old wildcard domain)
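
A simplified sketch of the allowlist check from the first item. Only the `_PEER_READABLE_PATHS` name and the /api/services/active path come from the commit; the role names, the GET-only rule, and the `peer_may_access` helper are assumptions, and the real enforce_auth presumably handles more cases than this.

```python
_PEER_READABLE_PATHS = frozenset({"/api/services/active"})


def peer_may_access(role, method, path):
    """Decide whether a session may call an otherwise admin-only route."""
    if role == "admin":
        return True
    if role == "peer":
        # Assumption: peers get read-only access, and only to allowlisted paths.
        return method == "GET" and path in _PEER_READABLE_PATHS
    return False


peer_read = peer_may_access("peer", "GET", "/api/services/active")
peer_write = peer_may_access("peer", "POST", "/api/services/active")
```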

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 15:58:27 -04:00
roof 9bdda6aaf8 fix: service credential provisioning and install reliability
Unit Tests / test (push) Successful in 7m21s
- calendar: create_calendar_user() now writes bcrypt htpasswd entry to
  data/services/calendar/config/users (the path Radicale reads at
  /etc/radicale/users); delete_calendar_user() removes the entry

- email: create_email_user() calls `docker exec cell-mail setup email add`
  to register the account in docker-mailserver's Dovecot/Postfix store;
  delete_email_user() calls the matching `setup email del` — both are
  non-fatal if the container isn't running

- service_composer.install(): pull image separately before up so slow
  registry pulls don't race with container startup; retry up once on
  failure so a transient registry hiccup on first install doesn't
  require the user to manually retry
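
The pull-then-up-with-one-retry flow in the last item might be sketched like this. The `install_service` function and the injectable `run` callback are hypothetical; the actual service_composer.install() may invoke Docker differently.

```python
def install_service(run, compose_file):
    """Pull first, then start; retry `up` once on failure.

    `run(argv)` is an injected callable that executes a command and
    returns True on success (for example a thin subprocess wrapper).
    """
    base = ["docker", "compose", "-f", compose_file]
    # Pull separately so a slow registry does not race with container startup.
    if not run(base + ["pull"]):
        return False
    for _attempt in range(2):  # first try plus exactly one retry
        if run(base + ["up", "-d"]):
            return True
    return False


# Simulated run: pull succeeds, first `up` fails, the retry succeeds.
calls = []
outcomes = iter([True, False, True])


def fake_run(argv):
    calls.append(argv[4:])
    return next(outcomes)


installed = install_service(fake_run, "service.yml")
```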

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 13:41:41 -04:00
roof c696ca9ef6 fix: DNS split-horizon in DDNS mode, service access filter, health check, verbosity persistence
Unit Tests / test (push) Successful in 7m32s
- DNS (critical): add _configured_dns_params() that returns (primary_domain,
  split_horizon_zones) from config_manager so all apply_all_dns_rules() callers
  pass the correct primary zone (e.g. 'pic.ngo') and split-horizon list
  (e.g. ['pic1.pic.ngo']) instead of the FQDN as the primary — fixes
  DNS_PROBE_FINISHED_BAD_CONFIG for all external domains when on VPN

- firewall_manager: add split_horizon_zones param to apply_all_dns_rules()
  and forward it to generate_corefile()

- Peers: filter service_access list to installed services only; peers.py
  derives valid services from config_manager.get_installed_services() with
  the email→mail ID mapping; Peers.jsx fetches from /api/store/installed
  and filters the checkboxes and defaults accordingly

- Health check: fix file_manager→'files' ID mapping so files service health
  is checked when installed (was silently skipped due to 'file' vs 'files')

- Verbosity persistence: move log_levels.json from non-mounted
  /app/api/config/ to CONFIG_DIR (/app/config/) which maps to config/api/
  on the host; both load (managers.py) and save (routes/services.py) updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 13:05:58 -04:00
roof 4ebcb1d077 fix: don't overwrite split-horizon Corefile from _bootstrap_dns
Unit Tests / test (push) Successful in 7m29s
The apply_all_dns_rules() call at the end of _bootstrap_dns() was
added to force reload 30s into the Corefile on startup. Now that
reload 30s is removed (it broke CoreDNS zone serving), the call is
unnecessary in LAN mode and actively harmful in DDNS mode:
update_split_horizon_zone() already writes the correct Corefile
with the split-horizon block; the subsequent apply_all_dns_rules()
call would overwrite it without the split-horizon zones, causing
all service subdomain lookups to return NXDOMAIN.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 04:56:41 -04:00
roof 0507445d86 fix: remove file reload 30s from CoreDNS zone blocks
Unit Tests / test (push) Successful in 7m29s
CoreDNS 1.14.3 returns REFUSED for all zones that use
'file /data/zone reload 30s': the reload timer defers the initial
zone load, so the plugin answers REFUSED while it waits, and in
practice the zones were still REFUSED after the timer fired.

Zone updates are already triggered by SIGUSR1 sent from
_reload_dns_service() after every zone file write, which
causes CoreDNS to reinitialise all plugins and re-read zone
files. No periodic zone polling is needed.

Also update config/dns/Corefile to remove the stale reload 30s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 04:33:19 -04:00
roof 9b5c2e1994 fix: ensure DNS zone changes take effect immediately on startup
Unit Tests / test (push) Successful in 7m35s
Three related issues prevented CoreDNS from serving updated zone records:

1. The `file` plugin blocks in generate_corefile() lacked a `reload`
   option, so CoreDNS never re-read zone files after they were written.
   Added `reload 30s` so zone file changes are picked up within 30s.

2. _reload_dns_service() sent SIGHUP via `docker exec ... kill -HUP 1`,
   which doesn't trigger zone reloads. Changed to SIGUSR1 via
   `docker kill --signal=SIGUSR1` (same as firewall_manager.reload_coredns).

3. _bootstrap_dns() wrote the zone file but never regenerated the
   Corefile. CoreDNS's reload plugin only fires when the Corefile
   changes, so zone records from startup were invisible until the next
   peer modification triggered apply_all_dns_rules(). Now _bootstrap_dns()
   always calls apply_all_dns_rules() after the zone write.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 03:41:19 -04:00
roof 08f46332b0 fix: add built-in service subdomains to DNS zone on startup
Unit Tests / test (push) Successful in 7m45s
_build_dns_records() only hardcoded 'api' and 'webui', relying on the
optional service registry for the rest. Built-in services (calendar,
files, mail, webdav) were never registered, so they were absent from
the zone file and tests querying webdav.<domain> via CoreDNS got
NXDOMAIN.

Add _BUILTIN_SERVICE_SUBDOMAINS constant and include those names in
every zone build. Also update _stale and apply_cell_name exclusion
sets so DDNS mode correctly removes them from the parent zone.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 03:14:34 -04:00
roof e8b8e47aa4 fix: use sudo for nft list tables — /usr/sbin not in roof user PATH
Unit Tests / test (push) Successful in 7m26s
nft lives in /usr/sbin which is absent from the non-root PATH on Debian.
The delete call already used sudo; add it to the list call too so the
session-scoped cleanup fixture doesn't crash before any test runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 15:46:09 -04:00
roof adce219a46 fix: clean up stale wg-quick nftables tables in e2e test teardown
Unit Tests / test (push) Successful in 7m29s
wg-quick creates an nftables 'preraw' table per interface that drops
decrypted ICMP replies arriving on any other interface. If a test run
crashes before bring_down(), the table persists and silently kills pings
on subsequent runs (handshake succeeds, replies are decrypted, but the
stale table drops them before the ping process sees them).

Extend cleanup_stale_e2e_interfaces() to also delete any orphaned
wg-quick-pic-e2e-* nftables tables found on the host.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 15:35:19 -04:00
roof 65d6d07c8d fix: get_status returns actual configured WG address instead of hardcoded default
Unit Tests / test (push) Successful in 7m41s
The address field in get_status() was hardcoded to SERVER_ADDRESS
('10.0.0.1/24') regardless of what wg0.conf contains, so instances
with a non-default subnet (e.g. pic1 at 10.0.1.1/24) always reported
the wrong server IP to callers such as the e2e WG conftest fixture.
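
Reading the address from the config instead of a constant could look like the sketch below. `read_interface_address` is a hypothetical helper, and the sample wg0.conf content is made up for illustration; only the 10.0.0.1/24 default and the 10.0.1.1/24 example come from the commit.

```python
def read_interface_address(conf_text, default="10.0.0.1/24"):
    """Return the Address value from the [Interface] section of a wg0.conf."""
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "interface" and "=" in line:
            key, value = line.split("=", 1)
            if key.strip().lower() == "address":
                return value.strip()
    return default  # only used when the config has no Address line


sample_conf = (
    "[Interface]\n"
    "Address = 10.0.1.1/24\n"
    "ListenPort = 51821\n"
    "\n"
    "[Peer]\n"
    "AllowedIPs = 10.0.1.2/32\n"
)
configured = read_interface_address(sample_conf)
fallback = read_interface_address("")
```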

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 14:48:49 -04:00
roof ab6d6230dd Fix: read WG server IP and subnet from live API instead of hardcoding 10.0.0.x
Unit Tests / test (push) Successful in 7m30s
test_wg_connect_and_ping_server and the connected_peer fixture hardcoded
10.0.0.1 / 10.0.0.0/24 as the server VPN address. This breaks when the
server uses a different subnet (e.g. pic1 uses 10.0.1.1/24). Now both
read 'address' from /api/wireguard/status at session start and pass the
live server_ip / server_network through wg_server_info and connected_peer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 14:09:48 -04:00
roof e2e9c50786 Test: skip peer-sync push test when WG tunnel between cells is not active
Unit Tests / test (push) Successful in 7m27s
The test_remote_permissions_pushed_to_cell2 test verifies that permission
changes on cell1 are pushed to cell2 via the WireGuard tunnel. When both
cells use a public endpoint (DDNS VPS) instead of LAN IPs, no tunnel is
established and the push silently fails. The test now probes cell2's API
at its WG DNS IP before asserting the push succeeded — skips gracefully
if the tunnel is down rather than failing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 12:52:03 -04:00
roof 568e4f9783 Fix: prevent wg0.conf truncation when remove_peer splits blocks
Unit Tests / test (push) Successful in 7m46s
_write_config() was stripping trailing newlines, causing the next
add_cell_peer() to create a single-newline separator between [Interface]
and [Peer] blocks instead of the required blank line. On the following
remove_peer() call, split('\n\n') treated both sections as one block,
matched the PublicKey filter, and wrote an empty string — destroying the
[Interface] section and reverting to the hardcoded SERVER_ADDRESS fallback.

Two-part fix:
1. _write_config() always ends content with a newline
2. remove_peer() normalises single-newline [Peer] headers to blank-line
   separators before splitting, and refuses to write if [Interface] would
   be lost
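
A sketch of the second part of the fix, under the assumption that the config is handled as plain text. `remove_peer_block` and `_is_target_peer` are hypothetical names; the normalise-then-split approach and the refusal to drop [Interface] follow the description above.

```python
import re


def _is_target_peer(block, public_key):
    lines = [ln.strip() for ln in block.splitlines()]
    if not lines or lines[0] != "[Peer]":
        return False
    for ln in lines[1:]:
        key, sep, value = ln.partition("=")
        if sep and key.strip() == "PublicKey" and value.strip() == public_key:
            return True
    return False


def remove_peer_block(conf_text, public_key):
    # Force a blank line before every [Peer] header so split("\n\n")
    # can never glue [Interface] and a [Peer] into one block.
    normalised = re.sub(r"\n+(?=\[Peer\])", "\n\n", conf_text)
    blocks = [b for b in normalised.split("\n\n") if b.strip()]
    kept = [b for b in blocks if not _is_target_peer(b, public_key)]
    if not any(b.lstrip().startswith("[Interface]") for b in kept):
        raise ValueError("refusing to write: [Interface] section would be lost")
    # Always end with a newline (part 1 of the fix).
    return "\n\n".join(b.strip("\n") for b in kept) + "\n"


# Config with the single-newline separator that triggered the bug.
broken = (
    "[Interface]\nAddress = 10.0.1.1/24\nListenPort = 51821\n"
    "[Peer]\nPublicKey = PEERKEY=\nAllowedIPs = 10.0.1.2/32\n"
)
cleaned = remove_peer_block(broken, "PEERKEY=")
```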

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 12:31:05 -04:00
roof 26576e1124 Fix: use domain_name (FQDN) in cell invite and conflict checks
Unit Tests / test (push) Successful in 7m39s
The GET /api/cells/invite endpoint was returning domain='pic.ngo' instead
of the full FQDN 'test5.pic.ngo' because it read _identity.domain rather
than _identity.domain_name.

Apply the same domain_name preference (domain_name || domain) to:
- routes/cells.py get_cell_invite() — the invite shown to connecting cells
- routes/cells.py update_cell_permissions() — Corefile DNS regeneration
- cell_link_manager.py _check_invite_conflicts() — incoming domain collision check
- cell_link_manager.py exchange_invites() — own invite construction
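
The `domain_name || domain` preference translates to Python roughly as follows. `effective_domain` is a hypothetical helper and the identity dicts are illustrative; only the two field names and the example values come from the commit.

```python
def effective_domain(identity):
    """Prefer the full FQDN (domain_name); fall back to the bare domain."""
    return identity.get("domain_name") or identity.get("domain") or ""


ddns_identity = {"domain": "pic.ngo", "domain_name": "test5.pic.ngo"}
lan_identity = {"domain": "cell", "domain_name": ""}

invite_domain = effective_domain(ddns_identity)
lan_domain = effective_domain(lan_identity)
```

Using `or` rather than a key lookup with a default means an empty `domain_name` also falls through to `domain`.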

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 11:56:42 -04:00
roof 31f76c54fa Fix: use domain_name as service URL base and harden WG e2e tests
Unit Tests / test (push) Successful in 11m15s
API:
- _configured_domain() now prefers _identity.domain_name (full FQDN
  e.g. 'test5.pic.ngo') over domain ('pic.ngo'). Service URLs in
  /api/peer/services and /api/peer/dashboard now correctly return
  'calendar.test5.pic.ngo' instead of 'calendar.pic.ngo'.

WG e2e tests:
- test_api_domain_returns_json_not_webui: accept 3xx redirect as
  valid routing (Caddy redirects HTTP→HTTPS in pic_ngo mode).
- test_catchall_api_path_returns_json and test_catchall_root_serves_webui:
  skip when Caddy is in HTTPS-redirect mode — catch-all :80 block only
  exists in HTTP-mode cells (lan/local domain).
- test_http_api_domain_reaches_api: replace --dns-servers (requires
  c-ares) with dig + curl --host pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 08:40:59 -04:00
roof b6af71acb5 Fix: accept both VIP and Caddy IP in DNS resolution test
Unit Tests / test (push) Successful in 11m9s
Cells with a wildcard zone (e.g. * -> 172.20.0.2) and cells with per-service
VIP DNS records are both valid. Accept either in the assertion so the test
passes regardless of the zone file style.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 08:29:05 -04:00
roof 352bb6bb9e Fix: use api_base fixture instead of hardcoded pic0 IP in WG domain access tests
test_peer_services_* functions hardcoded 'http://192.168.31.51:3000' as the
fallback for PIC_API_BASE, causing failures when tests run on any other host
(including pic1 itself). Use the api_base fixture, which reads PIC_HOST and
PIC_API_PORT from the environment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 08:06:29 -04:00
roof 463db029e1 Fix: expose listen_port in WG status API and add HTTPS DNAT to PostUp/PreDown
Unit Tests / test (push) Successful in 11m6s
Adds listen_port to /api/wireguard/status response so e2e test conftest
picks up the actual port (51821) instead of defaulting to 51820.

Extends PostUp/PreDown in generate_config to also DNAT and forward port
443 (HTTPS) through to cell-caddy — mirrors the ensure_service_dnat fix
so HTTPS works even after a WireGuard container restart without an API
restart. Updates _is_dnat_rule to recognize 443 rules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 07:42:49 -04:00
roof 8da711e366 fix: DNAT and forward port 443 (HTTPS) to Caddy from WireGuard peers
Unit Tests / test (push) Successful in 11m9s
ensure_service_dnat() only wired port 80 → cell-caddy, so HTTPS was
silently dropped: no DNAT rule redirected 443 to the Caddy container,
and the FORWARD chain had no ACCEPT for dport 443. Refactored the
function to loop over both 80 and 443 so both are DNAT'd and forwarded.
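
The port loop could be sketched as a rule builder. The exact rules ensure_service_dnat() installs are not shown in the commit, so the iptables arguments, the `wg0` interface name, and the `service_dnat_rules` helper below are assumptions; the 172.20.0.2 address is the Caddy bridge IP mentioned elsewhere in this log.

```python
def service_dnat_rules(caddy_ip, wg_interface="wg0", ports=(80, 443)):
    """Build one DNAT rule and one FORWARD ACCEPT rule per port."""
    rules = []
    for port in ports:
        rules.append([
            "iptables", "-t", "nat", "-A", "PREROUTING",
            "-i", wg_interface, "-p", "tcp", "--dport", str(port),
            "-j", "DNAT", "--to-destination", f"{caddy_ip}:{port}",
        ])
        rules.append([
            "iptables", "-A", "FORWARD",
            "-i", wg_interface, "-d", caddy_ip,
            "-p", "tcp", "--dport", str(port), "-j", "ACCEPT",
        ])
    return rules


rules = service_dnat_rules("172.20.0.2")
dnat_targets = [r[-1] for r in rules if "DNAT" in r]
```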

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 07:14:55 -04:00
roof 3e26186f85 fix: correct fake WireGuard key length and guard cell2_client teardown
Unit Tests / test (push) Successful in 11m14s
The synthetic cell fixture used a 46-char base64 key where the validator
expects exactly 43 chars before '='. The key failed format validation so
add_cell_peer returned False, making the cell connection store nothing and
all TestCellPermissionsApi tests hit 404.
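
For reference, a WireGuard key is 32 bytes, which base64-encodes to 43 characters plus one '=' pad. A format check along those lines might look like this; `is_valid_wg_key` is a hypothetical helper, not the project's validator.

```python
import base64
import re

# 43 base64 characters followed by exactly one '=' (44 characters total).
_WG_KEY_RE = re.compile(r"[A-Za-z0-9+/]{43}=")


def is_valid_wg_key(key):
    return _WG_KEY_RE.fullmatch(key) is not None


# A well-formed synthetic key for fixtures: encode any 32 bytes.
good_key = base64.b64encode(bytes(range(32))).decode("ascii")
bad_key = "A" * 46 + "="  # too long, like the old fixture value
```

Generating fixture keys by encoding 32 bytes avoids hand-typed strings of the wrong length.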

The TestCellServiceAccessRestrictions and TestLiveCellConnection teardown
fixtures called _remove_connection(cell2_client, ...) without checking if
cell2_client is None (expected when no second cell is configured), causing
AttributeError on teardown.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 06:20:52 -04:00
roof f84f16fcd6 fix: add /api/network/dns/corefile endpoint and per-line iptables check
Unit Tests / test (push) Successful in 11m13s
The e2e tests were reading a stale Corefile at a hardcoded fallback path
(/home/roof/pic/config/dns/Corefile) instead of the live one written by
the API (/opt/pic/config/dns/Corefile on pic1). Adding a proper API
endpoint eliminates the path ambiguity.

The iptables test was checking whether peer_ip, DROP, and dpt:80 appeared
anywhere in the full multi-line output rather than on the same rule line,
producing false positives. Now checks per line.
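
A sketch of the per-line check, assuming the test inspects iptables listing
output in which each rule occupies one line. The function name has_drop_rule
is hypothetical.

```python
def has_drop_rule(iptables_output: str, peer_ip: str, dport: int = 80) -> bool:
    """True only if a single rule line carries the peer IP, DROP, and the port."""
    port_token = f"dpt:{dport}"
    for line in iptables_output.splitlines():
        # Compare whole whitespace-separated fields so that 10.0.0.2 does not
        # match 10.0.0.20 and dpt:80 does not match dpt:8080.
        fields = line.split()
        if "DROP" in fields and peer_ip in fields and port_token in fields:
            return True
    return False
```

Matching on whole fields rather than substrings also avoids a second class of
false positives from overlapping addresses and ports.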

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 05:54:17 -04:00
roof eee0e800aa feat: add GET /api/peers/<peer_name> endpoint
Unit Tests / test (push) Successful in 11m19s
Allows fetching a single peer by name. E2E tests need this to verify
persisted peer state after PUT operations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 05:19:10 -04:00
roof 2b29938a64 fix: set CSRF token in PicAPIClient after login
Unit Tests / test (push) Successful in 11m22s
POST requests from PicAPIClient were failing with 403 (CSRF token missing)
because the login response csrf_token was not being applied to subsequent
request headers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 05:05:08 -04:00
roof 39c59fd3ef feat: WireGuard endpoint override + fix Docker network label issue
Unit Tests / test (push) Successful in 11m14s
Endpoint override:
- Add PUT /api/wireguard/endpoint to set endpoint_override in identity
  config; GET returns detected, override, and effective endpoints
- _effective_endpoint() helper applies override in peer config generation
  (wireguard.py and peer_dashboard.py); detected IP still shown in UI
- Add Endpoint Override input in WireGuard page — solves the common case
  where auto-detected IP is a gateway/VPS but peers connect via LAN IP

Docker cell-network fix:
- Declare cell-network external in docker-compose.yml; Docker Compose v5
  enforces label ownership and rejects networks created by older versions
- Makefile start/update pre-create cell-network idempotently
- reinstall/uninstall(full) explicitly delete and recreate the network
- Fix uninstall loop path: data/api/services/ (not data/services/)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 04:51:38 -04:00
roof 1b44a18062 fix: declare cell-network external; pre-create in Makefile start/update
Unit Tests / test (push) Successful in 11m16s
Docker Compose v5 enforces label ownership on networks it creates. On
systems where cell-network was created by an older compose version (no
labels), Caddy and other services fail to start with "incorrect label"
error.

Declaring the network external in docker-compose.yml skips label
validation. The Makefile start/update targets now create the network if
it doesn't exist (idempotent). The reinstall and uninstall (full) paths
explicitly delete the network so fresh recreations are clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 03:13:01 -04:00
roof f3737acfa4 fix: fall back to cell effective domain when email service domain not configured
Unit Tests / test (push) Successful in 11m10s
When the email store service is installed but no explicit domain has been
set in its config, _provision_email now falls back to
config_manager.get_effective_domain() so peer account creation works
immediately without requiring a separate config step.

Also threads config_manager into AccountManager.__init__ (optional kwarg,
no existing callers break) so the fallback is available without a global
import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 17:06:51 -04:00
roof 64dd8b8488 fix: uninstall stops optional service containers before core teardown
Unit Tests / test (push) Successful in 11m11s
Iterates data/services/*/docker-compose.yml and runs `docker compose down`
for each before stopping core containers, so stale optional-service
containers (email, calendar, files, etc.) don't leave cell-network occupied
and block a subsequent fresh install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 15:52:49 -04:00
roof 0267dce73d feat: HTTPS cert status, IDENTITY_CHANGED wiring, remove stale ip_utils Caddyfile writes
Unit Tests / test (push) Successful in 11m18s
- CaddyManager: add refresh_cert_status() and get_cert_status_fresh() that
  open a live TLS connection to cell-caddy:443 to read cert expiry; avoids
  needing a volume mount into the API container
- CaddyManager: periodic cert refresh in health_monitor_loop (every 60 cycles)
- config.py PUT /api/ddns: publish IDENTITY_CHANGED so CaddyManager regenerates
  the Caddyfile immediately after any domain/cell_name change — previously the
  event was never fired from this route
- config.py: remove all ip_utils.write_caddyfile() calls; CaddyManager is now
  the sole authority for Caddyfile generation
- app.py: add GET /api/caddy/cert-status route
- app.py: add GET /api/egress/status and PUT /api/egress/services/<id>/exit routes
- Settings.jsx: display cert status badge (valid/expired/internal/unknown) with
  expiry date and days-remaining in the domain section
- Tests: TestRefreshCertStatus (8 tests), TestDdnsConfigUpdatesFiresIdentityChanged,
  TestCaddyCertStatusRoute added; fix expired-cert helper to set not_valid_before
  relative to expiry so it's always earlier

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 11:39:36 -04:00
roof 41d09c598b wire: AccountManager HTTP dispatch + EgressManager startup + egress API routes
Unit Tests / test (push) Successful in 11m15s
- add_peer() now calls account_manager.provision() for any installed store
  service whose manifest declares accounts.manager == 'http', enabling
  per-peer credential provisioning to third-party HTTP services
- reapply_on_startup() calls egress_manager.apply_all() so fwmark rules
  survive container restarts without manual intervention
- add GET /api/egress/status and PUT /api/egress/services/<id>/exit routes
  so the UI can read and override per-service egress policy
- tests: HTTP provision wiring (happy path + non-fatal failure), egress
  apply_all at startup (wired/unwired/failure cases)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 10:30:41 -04:00
roof a906c26b5d fix: resolve Caddy env vars at write time to prevent parse errors
Unit Tests / test (push) Successful in 11m25s
acme_ca and the pic_ngo DNS credentials ({$PIC_NGO_DDNS_TOKEN},
{$PIC_NGO_DDNS_API}) were written as Caddy env-var placeholders, but the
Caddy container does not inherit the API container's environment, so the
substitutions always failed — Caddy saw bare directive names with no
arguments and rejected the Caddyfile.

- _global_acme_block: only emit the acme_ca directive when ACME_CA_URL is
  actually set; omitting it makes Caddy default to Let's Encrypt production.
- _caddyfile_pic_ngo: embed the DDNS_TOTP_SECRET and DDNS_URL values directly
  into the Caddyfile at write time rather than relying on Caddy env expansion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 15:01:15 -04:00
roof e87022dc55 fix: cell-network name, install error surfacing, health history cleanup
Unit Tests / test (push) Successful in 11m22s
- docker-compose.services.yml: change external network name from
  pic_cell-network to cell-network so store-service compose files can find
  it.  The project-prefixed name was overriding the explicit name: cell-network
  fix in docker-compose.yml when both files were merged by make start.

- service_store.py: normalize docker compose stderr into the error key in
  the 400 response so the Store page shows the actual failure reason instead
  of the generic fallback message.

- app.py: skip health checks for email/calendar/files managers when those
  optional store services are not installed — prevents false Down alerts and
  unnecessary noise in health history.

- Logs.jsx: remove Email/Calendar/Files columns from the health history table;
  they are optional store services, not core builtins that should always appear.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 14:28:46 -04:00
roof 7d5c5421f1 Implement connectivity store services (wireguard-ext, openvpn-client, tor)
Unit Tests / test (push) Successful in 11m31s
- ConnectivityManager: move config dirs to data_dir/services/<id>/config so
  Docker can bind-mount them into store-service containers (Docker resolves
  bind-mount paths on the host, not inside the API container).  Add
  _migrate_legacy_configs to copy existing files from the old config_dir
  location on first boot.

- manifest_validator: add allow_host_network parameter to
  validate_rendered_compose.  When True, waives the external-network
  requirement, permits network_mode: host, and allows devices: — all needed
  by VPN/Tor containers that must share the host network namespace to create
  tun/wg interfaces.  Non-host services are unaffected.

- service_composer: read requires_host_network from the manifest and pass
  allow_host_network=True to validate_rendered_compose for connectivity
  services.

- Tests: update file-path assertions to new data_dir layout; add
  TestMigrateLegacyConfigs, TestValidateRenderedComposeHostNetwork, and
  two TestWriteCompose cases for the host-network path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 10:06:48 -04:00
roof 60601eb4af fix: give cell-network an explicit name to avoid compose project prefix
Unit Tests / test (push) Successful in 11m21s
Without name: cell-network, Docker Compose creates the network as
pic_cell-network (prefixed with the project name). Store service compose
templates declare cell-network as external: true and can't find it.
Adding name: cell-network makes the network name predictable regardless
of the Compose project name.

Existing installs need: make stop && make start to recreate the network.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 09:14:31 -04:00
roof 5ed75677c3 test: add e2e tests for service store install/uninstall flow
Unit Tests / test (push) Successful in 11m13s
Tests verify:
- /services page loads and lists all available services
- Admin can install calendar, files, email, and webmail via the store UI
- Install order respects dependencies (email before webmail)
- Uninstall flow shows confirmation dialog before removing
- Dashboard shows service links after install

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 04:51:10 -04:00
roof f7bb2cc962 fix: allow first-party store service subdomains and registry images
Unit Tests / test (push) Successful in 11m25s
Two manifest validation bugs blocked all store service installs:

1. service_store_manager.RESERVED_SUBDOMAINS included 'mail', which
   prevented the email service from using its required subdomain.
   Removed mail/calendar/files/webmail — they belong to official PIC
   store services and must be claimable by them.

2. manifest_validator required @sha256 digest pins on ALL images,
   including first-party git.pic.ngo/roof/* images that the PIC team
   builds and controls. service_store_manager._validate_manifest already
   only warned for first-party images; the secondary validator was
   stricter than intended, causing a hard reject on :latest tags.
   Aligned to warn-not-reject for first-party; malformed digests (when
   provided) are still a hard error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 03:09:41 -04:00
roof c493630bb5 fix: Dashboard blank page — move state declarations before use
Unit Tests / test (push) Successful in 11m36s
SERVICES was computed on line 33 using activeServiceIds, which was not
declared until line 36. A const binding cannot be read before its declaration
runs (temporal dead zone), so this threw a ReferenceError on mount, crashing
the component and showing a blank page.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 02:44:41 -04:00
roof 0ed8669aec fix: dashboard only shows email/calendar/files if installed
Unit Tests / test (push) Successful in 11m25s
Fetches /api/services/active on load; service status cards and quick-
access links for email, calendar, files, and webmail are suppressed
until the service is installed via the Store. Core services (WireGuard,
Routing, Network) always show. Fixes #setup_complete gate on dev stack.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 01:38:16 -04:00
roof 03a67ad922 feat: add EgressManager — per-service egress enforcement via host iptables
Unit Tests / test (push) Successful in 11m20s
Routes outbound traffic from installed service containers through
alternate exits (wireguard_ext, openvpn, tor) using host-side
iptables fwmark policy-routing in a dedicated PIC_EGRESS chain.
Marks 0x110/0x120/0x130 are distinct from ConnectivityManager's
0x10/0x20/0x30. Container IPs discovered at runtime via docker
inspect. Wired into ServiceStoreManager install/remove lifecycle
and managers.py singleton. 22 new tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 00:58:47 -04:00
roof 5cbbfb41d9 feat: add HTTP dispatch to AccountManager for generic store services
Services with accounts.manager='http' now use POST/DELETE to the
service container's /service-api/accounts endpoint instead of
requiring a named Python manager. _resolve_service allows 'http'
without a registered Python object; _provision_http and
_deprovision_http handle the HTTP calls with 404-as-success on
delete. 9 new tests.
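
A sketch of the 404-as-success rule on delete. The per-peer URL shape
(/service-api/accounts/<peer_name>), the timeout, and the injectable opener
parameter are assumptions for illustration; only the /service-api/accounts
endpoint and the DELETE method come from the message above.

```python
import urllib.error
import urllib.parse
import urllib.request


def deprovision_http(base_url: str, peer_name: str,
                     opener=urllib.request.urlopen) -> bool:
    """DELETE the peer's account; a 404 means the account is already gone."""
    peer = urllib.parse.quote(peer_name, safe="")
    url = f"{base_url.rstrip('/')}/service-api/accounts/{peer}"  # assumed shape
    request = urllib.request.Request(url, method="DELETE")
    try:
        with opener(request, timeout=10) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as exc:
        # Deleting something that does not exist is treated as success,
        # which makes deprovisioning safe to repeat.
        return exc.code == 404
```

Treating 404 as success makes the delete idempotent, so a retried or repeated
peer removal does not surface a spurious failure.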

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 00:46:54 -04:00
roof 1f2f9d9f6e feat: add manifest_validator.py — security chokepoint for compose and manifest validation
Unit Tests / test (push) Successful in 11m18s
Rejects privileged compose configs (network_mode:host, pid:host, ipc:host,
userns_mode:host, cap_add:ALL, string commands, missing cell-network,
reserved container names). Validates manifest schema_version=3, image
digest pinning (sha256 required, :tag-only rejected), and provision hook
format. Wired into ServiceComposer.write_compose() and
ServiceStoreManager.install() as a single enforcement point.
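
A simplified sketch of the compose checks listed above, operating on an
already-parsed compose mapping. The function name compose_violations and the
message strings are hypothetical; the real validator may differ in structure
and coverage (reserved container names are not modelled here).

```python
HOST_NAMESPACE_KEYS = ("network_mode", "pid", "ipc", "userns_mode")


def compose_violations(compose: dict) -> list:
    """Return a list of human-readable problems; empty means acceptable."""
    problems = []
    for name, svc in (compose.get("services") or {}).items():
        for key in HOST_NAMESPACE_KEYS:
            if svc.get(key) == "host":
                problems.append(f"{name}: {key}=host")
        if "ALL" in (svc.get("cap_add") or []):
            problems.append(f"{name}: cap_add ALL")
        for key in ("command", "entrypoint"):
            # A string form is run through a shell; only argv lists pass.
            if isinstance(svc.get(key), str):
                problems.append(f"{name}: string-form {key}")
        # 'networks' may be a list or a mapping; 'in' works for both.
        if "cell-network" not in (svc.get("networks") or []):
            problems.append(f"{name}: missing cell-network")
    return problems
```

Collecting every problem instead of stopping at the first lets an install
error report the full set of reasons in one pass.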

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 18:45:45 -04:00
roof 62b31b072b feat: remove optional services step from setup wizard
Services are now installed post-setup from the Store page, so the
wizard step that let users pre-select email/calendar/files is removed.
Reduces wizard from 5 steps to 4 (Step4Services deleted, Step5Review
renamed to Step4Review). Backend drops services_enabled validation,
background install thread, and service_store_manager dependency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 18:33:43 -04:00
roof 3d594025d2 fix: remove legacy service dirs from setup_cell, update sanity_check for optional services
Unit Tests / test (push) Successful in 11m24s
setup_cell.py no longer creates mail/radicale/webdav config and data dirs —
those are managed by ServiceComposer when services are installed. Added
data/services/ for ServiceComposer. sanity_check.py now uses stdlib urllib
and discovers installed services via /api/services/active before checking
their status routes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 17:22:42 -04:00
roof 10ac15d9fe docs: Phase 7 — update docs to reflect optional services migration
Email, calendar, and files are now optional store services, not always-on
builtins. Updated README, QUICKSTART, Wiki, and service-developer-guide to
reflect: dynamic nav, optional service install flow, correct egress
identifiers (wireguard_ext/default vs wireguard/cell_internet), removed
builtin/store distinction from manifest reference, 7 core containers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 17:10:48 -04:00
roof 44d7e96f29 feat: Phase 6 — require_active_service decorator + wizard install wiring
Email/calendar/files routes now return 404 when the service is not
installed, using a require_active_service decorator that checks
ServiceRegistry. Status endpoints are exempt so health checks always work.
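
A framework-free sketch of such a decorator. The is_active callable stands in
for the ServiceRegistry lookup, and the (body, status) return shape is an
assumption rather than the project's actual response format.

```python
import functools


def require_active_service(service_id, is_active):
    """Wrap a route handler so it answers 404 unless the service is installed."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(*args, **kwargs):
            if not is_active(service_id):
                return {"error": f"service '{service_id}' is not installed"}, 404
            return view(*args, **kwargs)
        return wrapper
    return decorator
```

Status endpoints would simply be left undecorated, which keeps health checks
reachable whether or not the service is installed.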

SetupManager.complete_setup() now accepts a service_store_manager and
installs any wizard-selected services in a background daemon thread after
setup completes. Failures are logged but do not fail the wizard.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 16:58:57 -04:00
roof a69ca1e402 feat: Phase 5 — remove legacy service blocks, one-shot container cleanup
Unit Tests / test (push) Successful in 11m20s
Email, calendar, files, webmail (rainloop), and the file manager (filegator)
are removed from the main docker-compose stack. They install as independent
per-service compose projects via ServiceComposer.

On startup, _cleanup_legacy_builtin_containers() stops and removes any of the
5 legacy containers still running from the old main stack (guarded by a
one-shot sentinel in _meta.legacy_builtins_cleaned so it never runs twice).
Per-service installs (com.docker.compose.project != 'pic') are left untouched.
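
A sketch of the one-shot sentinel pattern described above. The container
records, the remove callable, and the set of legacy names are all hypothetical
stand-ins; only the legacy_builtins_cleaned sentinel and the 'pic' project
check come from the message.

```python
def cleanup_legacy_once(meta: dict, containers: list, remove, legacy_names: set) -> list:
    """Remove legacy main-stack containers once; later calls are no-ops."""
    if meta.get("legacy_builtins_cleaned"):
        return []
    removed = []
    for container in containers:
        if container["name"] not in legacy_names:
            continue
        # Containers from per-service compose projects are left untouched.
        if container.get("project") != "pic":
            continue
        try:
            remove(container["name"])
            removed.append(container["name"])
        except Exception:
            # A failed removal must not block API startup.
            pass
    meta["legacy_builtins_cleaned"] = True
    return removed
```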

Changes:
- docker-compose.yml: remove mail, radicale, webdav, rainloop, filegator blocks;
  fix dhcp + ntp to profiles: ["core","full"] so they start with --profile core
- Makefile: replace all --profile full with --profile core (6 occurrences);
  remove mailserver.env conditional from update: target
- api/legacy_cleanup.py: new module with cleanup_legacy_builtin_containers()
- api/app.py: import and call cleanup at startup before reapply_on_startup()
- tests/test_legacy_cleanup.py: 7 tests covering sentinel, absent containers,
  per-service project skip, main-stack removal, exception safety

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 15:57:45 -04:00
roof a10fe11136 feat: Phase 4 — dynamic nav + service visibility based on installed services
Unit Tests / test (push) Successful in 11m24s
Email, calendar, and files no longer appear in the nav or as usable pages
unless they are installed. The nav refreshes whenever a service is installed
or removed via the new pic-services-changed CustomEvent.

Changes:
- routes/services.py: add GET /api/services/active endpoint
- api.js: add servicesAPI.listActive()
- App.jsx: replace hardcoded coreServiceChildren with dynamic state fetched
  from /api/services/active; SERVICE_META maps ids to nav entry shapes
- ServiceNotInstalledBanner.jsx: new component — admin gets catalog link,
  peer gets "contact admin" message
- EmailPage/CalendarPage/FilesPage: show banner when service not installed
- ServicesIndex.jsx: remove CoreServiceCard + CORE_SERVICES "Built-in"
  section; rename Remove → Uninstall; dispatch pic-services-changed on
  install/uninstall success
- MyServices.jsx: conditionally render service cards based on active list;
  placeholder card when absent; page-level notice when nothing is installed
- tests/test_services_active_endpoint.py: 4 new endpoint tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:15:02 -04:00
roof 87c321c1c9 feat: Phase 3 — ServiceComposer deps + store install via per-service compose
Unit Tests / test (push) Successful in 11m21s
ServiceStoreManager.install() now delegates container lifecycle to
ServiceComposer (per-service docker-compose.yml) instead of appending to a
shared compose override. This eliminates IP pool allocation, compose override
rendering, and the single-stack docker exec approach.

Changes:
- service_composer.py: add _resolve_requires(), _resolve_dependents(),
  reapply_active_services() — dependency graph and startup reapply
- service_store_manager.py: rewrite install() and remove() to use
  ServiceComposer; add _fetch_template(); delete _allocate_service_ip(),
  _render_compose_override(), _write_compose_override(); remove() now guards
  against removing services that others depend on
- managers.py: pass service_composer= to ServiceStoreManager
- Tests: 13 new composer dep tests; TestInstall/TestRemove rewritten for
  the new composer-driven path; test_optional_services_feature.py updated
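
A sketch of the two dependency questions the composer has to answer: in what
order to install, and who blocks a removal. The function names and the plain
dict of requirements are hypothetical simplifications of _resolve_requires()
and _resolve_dependents().

```python
def install_order(service_id: str, requires: dict) -> list:
    """Return service_id and its dependencies, dependencies first."""
    order, visiting = [], set()

    def visit(node):
        if node in order:
            return
        if node in visiting:
            raise ValueError(f"dependency cycle at {node}")
        visiting.add(node)
        for dep in requires.get(node, []):
            visit(dep)
        visiting.discard(node)
        order.append(node)

    visit(service_id)
    return order


def dependents_of(service_id: str, requires: dict, installed: set) -> list:
    """Installed services that directly require service_id."""
    return sorted(s for s in installed if service_id in requires.get(s, []))
```

A removal guard would refuse to proceed whenever dependents_of() returns a
non-empty list.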

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 09:33:02 -04:00
roof 0bfe95320b feat: Phase 2 — remove builtins layer, ServiceRegistry is installed-only
Unit Tests / test (push) Successful in 11m31s
Builtins (email/calendar/files) are no longer baked into the API image.
ServiceRegistry now only knows about installed store services. When nothing
is installed, Caddy and DNS get no service routes — no hardcoded fallback.

Changes:
- service_registry.py: remove _BUILTINS_DIR, _builtin_ids, _builtin_manifest,
  _load_manifest; get() and list_all() now delegate entirely to installed services
- caddy_manager.py: remove _build_core_service_routes(); remove hardcoded
  fallback pairs from _http01_service_pairs(); empty registry → api block only
- network_manager.py: _get_service_subdomains() returns [] when no registry
- api/services/builtins/: deleted (email, calendar, files manifests)
- Tests updated throughout: removed builtin-dependent assertions, added
  installed-service fixtures, updated fallback expectations to api-only

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 08:53:44 -04:00
roof 18b50d08c1 fix: post-Phase-0 corrections — data-dir bind mounts, reserved subdomains, list_active()
Unit Tests / test (push) Successful in 11m31s
Three related fixes discovered during review of Phase 0 and Phase 1 manifests:

1. validate_rendered_compose(): add allowed_data_dir param. After ${PIC_DATA_DIR}
   substitution, compose templates produce absolute paths; without this the
   validator would reject every service install.  ServiceComposer.write_compose()
   now passes its resolved data_dir so only the designated data directory is
   exempt — /etc, /proc, docker.sock etc. still blocked.

2. _RESERVED_SUBDOMAINS: remove service-level subdomains (mail, calendar, files,
   webdav, webmail). The reserved list should protect PIC infrastructure endpoints
   (api, webui, admin) — not service subdomains that official store services
   (calendar, files, webmail) must be allowed to claim.  Aligns with the
   existing _RESERVED_SUBS in service_registry.py.

3. ServiceRegistry.list_active(): new method returning only installed store
   services (no builtins). This is the forward-looking API that Phase 2 will
   make the primary read path once builtins are deleted. Adding it now unblocks
   the QA agent's test_optional_services_feature.py which was already testing
   the expected Phase 2 behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 07:35:43 -04:00
roof c40919d374 feat: Phase 0 — manifest_validator, compose YAML safety check, cap_add allowlist, backend denylist, provision hook enforcement, size cap
Introduces api/manifest_validator.py as a single security chokepoint
imported by both ServiceComposer and ServiceStoreManager:

- validate_manifest(): rejects kind=builtin, reserved container names,
  reserved subdomains, backend denylist (localhost, cell-api, etc.),
  cap_add outside allowlist / in denylist, shell-string provision hooks,
  and env values with shell-special characters
- validate_rendered_compose(): walks the rendered YAML and rejects
  privileged:true, host network/pid/ipc/userns, absolute bind mounts,
  denied capabilities, devices key, apparmor/seccomp unconfined, and
  string-form command/entrypoint (shell-injection vector)
- validate_provision_hook(): requires argv list form, lowercase binary,
  rejects NUL bytes
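
A sketch of the three provision-hook rules named in the last bullet (argv list
form, lowercase binary, no NUL bytes). The function name and error strings are
hypothetical, and the real validate_provision_hook() may apply further checks.

```python
def provision_hook_error(hook):
    """Return an error string for a bad hook, or None if it looks acceptable."""
    if not isinstance(hook, list) or not hook:
        return "hook must be a non-empty argv list"
    if not all(isinstance(arg, str) for arg in hook):
        return "every argv element must be a string"
    if any("\x00" in arg for arg in hook):
        return "NUL byte in argv"
    if hook[0] != hook[0].lower():
        return "binary name must be lowercase"
    return None
```

Requiring a list means the hook can be executed without a shell, so
metacharacters in arguments are passed through as literal text.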

ServiceStoreManager changes:
- _validate_manifest() delegates to validate_manifest() after existing checks
- _fetch_manifest() and fetch_index() now stream with a 256 KB size cap
  (prevents memory exhaustion from a malicious or compromised index)
- Digest-pin warning for images missing @sha256 (hard error for unknown
  registries, warning for git.pic.ngo/roof/* and TRUSTED_IMAGES_NO_DIGEST)
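
A sketch of a size-capped streaming read over any file-like object. The
function name, chunk size, and exception type are assumptions; only the
256 KB cap comes from the message above.

```python
MAX_BYTES = 256 * 1024  # 256 KB cap, as described above


def read_capped(stream, limit: int = MAX_BYTES, chunk_size: int = 8192) -> bytes:
    """Read the stream in chunks and fail as soon as it exceeds the limit."""
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buf)
        buf.extend(chunk)
        if len(buf) > limit:
            # Stop early: at most one chunk past the limit is ever held.
            raise ValueError(f"response larger than {limit} bytes")
```

Because the check runs per chunk, memory use stays bounded by the limit plus
one chunk regardless of how large the remote body claims to be.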

ServiceComposer changes:
- write_compose() calls validate_rendered_compose() before any disk write
  so no partial file is left if validation fails
- render_template() substitutes ${PIC_DATA_DIR} with the resolved data_dir path

102 new tests in tests/test_manifest_validator.py covering all five P0
security issues.  Existing test mocks updated to use streaming response
pattern (stream=True + raw.read) and valid compose YAML templates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 07:23:08 -04:00
roof 5e438aa991 fix: remove stray </div> in Email/Calendar/Files pages that broke vite build
Unit Tests / test (push) Successful in 11m27s
Stray closing div was left in the ternary falsy branch after AdminConfigSection
was moved outside the ternary. esbuild interpreted it as an unterminated regex.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 05:10:52 -04:00
roof c20906d6cc feat: PIC Services Architecture Phase 1 — registry-driven services ecosystem
Unit Tests / test (push) Successful in 11m30s
Implements the full Phase 1 services architecture:
- ServiceRegistry: merges built-in + installed + runtime config; drives Caddy and CoreDNS instead of hardcoded service names
- ServiceComposer: docker-compose lifecycle for third-party services
- AccountManager: per-service credential provisioning and deprovisioning per peer
- Built-in manifests (email, calendar, files) with subdomain, backup, and account hooks
- Admin UI: Accounts tab on Email, Calendar, Files pages
- Developer guide v1: manifest reference, compose variables, backup/egress integration
- 158 new tests; 1762 total passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 05:02:26 -04:00
roof 2f5370bd98 feat: add Steps 1-4 implementation files (AccountManager, ServiceComposer, builtins, tests)
Unit Tests / test (push) Successful in 11m24s
These files were created during Steps 1-4 of the services architecture but were
never staged: AccountManager (per-service credential provisioning), ServiceComposer
(docker-compose lifecycle), built-in service manifests for email/calendar/files,
and their test suites (158 tests). Also un-tracks .coverage binaries that were
accidentally committed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 04:39:19 -04:00
roof dc7b316cbd docs: correct Step 7 developer guide to match Steps 3-6 implementation
Unit Tests / test (push) Failing after 11s
Steps 3-6 were implemented since this doc was last written. Several
technical details had drifted from the actual code:

- Provision response shape was shown as echoing the password; corrected
  to {provisioned: true} to match the security model (passwords are
  never returned after creation)
- Restore command flag corrected from -C / to -C <path>; archives use
  relative paths so the extraction target must be explicit
- Added ServiceRegistry validation chokepoint note: subdomain and
  backend are validated at registration time, before Caddyfile
  generation, not at request time
- Added Admin UI note: Accounts tab appears on service pages
- Added -- separator security note for backup command construction

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 03:10:43 -04:00
roof ad5731073d feat: Admin UI — Accounts tab on service pages (Step 6)
Unit Tests / test (push) Failing after 11s
Admins previously had no UI path to provision per-peer accounts for
email, calendar, and files: they had to hit the AccountManager API
routes directly.  This change wires those routes to a dedicated Accounts
tab on each service page so any peer can be granted or revoked service
access in two clicks.

- webui/src/services/api.js: add accountsAPI with list/provision/
  deprovision/getCredentials, pointing to
  /api/services/catalog/{serviceId}/accounts
- webui/src/components/ServiceAccountsPanel.jsx: new reusable panel;
  handles credential reveal, removal confirmation, load-error state,
  and humanized credential labels
- EmailPage, CalendarPage, FilesPage: Overview/Accounts tab nav (admin
  only); Accounts tab renders ServiceAccountsPanel; AdminConfigSection
  is hidden while on the Accounts tab

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 20:29:57 -04:00
roof 16fb362df7 feat: replace hardcoded service names with ServiceRegistry-driven Caddy and CoreDNS config
Unit Tests / test (push) Failing after 11s
Previously, CaddyManager and NetworkManager contained hardcoded lists of
service names (calendar, files, mail, webdav, etc.), meaning every new
service required a code change to appear in Caddy routes and DNS records.
Now both managers accept a service_registry parameter and derive their
service lists dynamically from the registry at runtime.

- CaddyManager: new _build_registry_service_routes() and
  _http01_service_pairs() methods pull routes from the registry
- NetworkManager: new _get_service_subdomains() method returns registry
  subdomains with a hardcoded fallback when no registry is wired in;
  _build_dns_records, stale-record detection, and service name sets all
  use the registry
- managers.py: service_registry constructed before network_manager so it
  can be injected into both CaddyManager and NetworkManager
- service_registry.py: validation chokepoint in get_caddy_routes() rejects
  invalid subdomain/backend values and reserved service names
- service_store_manager.py: _validate_manifest now validates top-level
  subdomain, backend, extra_subdomains, and extra_backends fields
- tests: 24 new tests covering registry-driven routing and DNS subdomain
  generation (test_caddy_registry_integration.py)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 18:27:52 -04:00
roof 63c0dfb9d9 docs: document Services UI refactor in wiki
Unit Tests / test (push) Successful in 11m29s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 06:58:24 -04:00
roof 0afdee32da feat: Services UI — nested nav, per-service pages, settings migration
Rename Store → Services: ServicesIndex.jsx shows built-in core services
(Email, Calendar, Files) with Manage links, plus the existing add-on
store below.

New service sub-pages at /services/email|calendar|files serve both
admin and peer roles. Admins see connection info, service status, users
list, and an inline config form (port/data-dir). Peers see connection
info and their personal credentials fetched from peerAPI.

Navigation restructured: a Services parent item expands to show the
three sub-pages via a collapsible sidebar group (ChevronDown toggle).
Both admin and peer navigation include the Services group. Sidebar
extracted NavItem/NavList components to eliminate the duplicate mobile/
desktop rendering.

Settings.jsx drops EmailForm, CalendarForm, FilesForm and their
SERVICE_DEFS entries. Port conflict detection and per-service validation
logic extracted to utils/serviceConfig.js, shared by Settings and the
new service pages. Service form flushers are registered without cleanup
so the Apply banner saves dirty config even when the user navigates away
from a service page before clicking Apply.

Legacy routes /email, /calendar, /files, /store redirect to their new
canonical paths.

GET /api/config now includes installed_services so the nav can derive
which add-ons are installed without a separate store fetch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 06:46:17 -04:00
roof b16189d00f Fix three DNS corruption bugs in DDNS/non-LAN mode
Unit Tests / test (push) Successful in 11m30s
apply_cell_name() now skips multi-label zone files (split-horizon DDNS
zones like pic2.pic.ngo.zone) and excludes '*' and '@' from hostname
candidate detection, preventing the wildcard record from being renamed
to the old cell name during a cell rename.
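
A sketch of the two filters described above, assuming zone files are named
<zone>.zone as in the example. Both function names are hypothetical.

```python
def is_multi_label_zone(filename: str) -> bool:
    """True for split-horizon zone files such as pic2.pic.ngo.zone."""
    zone = filename[:-len(".zone")] if filename.endswith(".zone") else filename
    return "." in zone.strip(".")


def hostname_candidates(record_names: list) -> list:
    """Drop the wildcard and apex markers before looking for the cell hostname."""
    return [name for name in record_names if name not in ("*", "@")]
```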

update_split_horizon_zone() now deletes stale zone files from previous
cell names sharing the same TLD (e.g. pic3.pic.ngo.zone when renaming
to pic2.pic.ngo), eliminating orphaned DNS entries.

_bootstrap_dns() now detects non-LAN domain modes and calls
update_split_horizon_zone() instead of apply_ip_range(), preventing
service records (api, calendar, files…) from being re-injected into
the DDNS parent zone on every container restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 05:56:00 -04:00
roof 66500bb128 fix: use effective_domain for service links and clean up stale DNS records
Unit Tests / test (push) Successful in 11m32s
Dashboard, Email, Calendar, and Files pages were building service URLs
with the internal LAN zone name (e.g. 'cell') instead of the public
effective domain (e.g. 'pic2.pic.ngo'), and always using http:// even
in DDNS mode where HTTPS is available.

Changes:
- Dashboard/Email/Calendar/Files: read effective_domain + domain_mode
  from ConfigContext; use effective_domain in non-LAN mode and https://
  for all DDNS domain modes.
- Calendar: show port 443 instead of 80 in DDNS mode.
- network_manager.update_split_horizon_zone: when the primary internal
  zone name is a parent of the effective DDNS domain (e.g. pic.ngo is a
  parent of pic2.pic.ngo), remove stale bootstrap service records (api,
  calendar, files, mail, webmail, webdav) that pollute the DNS display
  and would shadow public DNS responses.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 05:06:52 -04:00
roof d7dbd596ab feat: route PIC services as subdomains of the cell's effective domain
Unit Tests / test (push) Successful in 11m33s
In DDNS modes (pic_ngo, cloudflare, duckdns, http01), all built-in
services are now reachable as subdomains of the cell domain, e.g.
calendar.pic1.pic.ngo instead of pic1.pic.ngo/calendar.

Key changes:
- CaddyManager._build_core_service_routes(): new helper generates
  Caddy named-matcher host blocks for calendar, mail/webmail, files,
  webdav, and api subdomains within the wildcard TLS server block.
- All ACME modes (pic_ngo, cloudflare, duckdns) use the new
  subdomain matchers; http01 emits a dedicated server block per service.
- http01: installed store-plugin services whose name clashes with a
  core service are skipped to prevent duplicate server blocks.
- routes/config.py: ip_utils.write_caddyfile() is skipped in non-LAN
  modes so LAN Caddy config never overwrites the ACME config.
- firewall_manager.generate_corefile(): new split_horizon_zones param
  adds local authoritative file zones so LAN clients resolve
  *.pic1.pic.ngo to the internal Caddy IP without hairpin NAT.
- NetworkManager.update_split_horizon_zone(): writes the wildcard zone
  file and regenerates the Corefile with the split-horizon block;
  called automatically after every identity change in non-LAN mode.
- Added @ to allowed record-name chars in update_dns_zone validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 04:31:57 -04:00
roof 1f016de855 feat: make DDNS domain_name the effective domain across all services
Unit Tests / test (push) Successful in 11m35s
- ConfigManager.get_effective_domain(): returns domain_name when DDNS is
  active (pic_ngo/cloudflare/duckdns), domain otherwise. Used by all
  public-facing services so they use the real registered FQDN.
- ConfigManager.get_internal_domain(): always returns _identity.domain
  (CoreDNS zone name, dnsmasq, cell-link invites — stays internal).
- Silent migration: if domain_mode != lan and domain is generic "cell",
  auto-set to {cell_name}.local for unique CoreDNS zone naming.
- caddy_manager: fix custom_domain bug — cloudflare/http01 modes were
  reading identity.get('custom_domain') which never exists; now reads
  domain_name correctly.
- routes/config, app: expose effective_domain in GET /api/config and
  /api/status responses.
- email_manager, routes/email: use get_effective_domain() for
  OVERRIDE_HOSTNAME, POSTMASTER_ADDRESS, and new-user email defaults.
- ServiceBus.IDENTITY_CHANGED event: emitted from PUT /api/config and
  POST /api/ddns/register after identity writes; caddy_manager and
  email_manager subscribe to regenerate config automatically.
- Settings.jsx: hide Local Domain input in non-LAN modes; show
  read-only effective_domain with "managed by DDNS" badge and an
  Advanced toggle for the internal CoreDNS zone name.
- 11 new test classes covering all new helpers, event subscriptions,
  caddy/email handlers, and the custom_domain fix.
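
A minimal sketch of the domain-selection rules listed above, assuming the
identity is a plain dict with cell_name, domain, domain_mode and
domain_name keys. The function bodies, the DDNS_MODES constant, the
migrate_internal_domain name, and the fallback to the internal domain
when domain_name is empty are illustrative assumptions, not the actual
ConfigManager code:

```python
# Hypothetical constant: the modes the commit message lists as "DDNS active".
DDNS_MODES = {"pic_ngo", "cloudflare", "duckdns"}


def get_internal_domain(identity: dict) -> str:
    """Always the internal CoreDNS zone name."""
    return identity.get("domain", "cell")


def get_effective_domain(identity: dict) -> str:
    """Public FQDN when DDNS is active, internal domain otherwise."""
    domain_name = identity.get("domain_name")
    # Assumption: fall back to the internal domain if domain_name is unset.
    if identity.get("domain_mode", "lan") in DDNS_MODES and domain_name:
        return domain_name
    return get_internal_domain(identity)


def migrate_internal_domain(identity: dict) -> dict:
    """Silent migration: replace the generic 'cell' zone in non-LAN modes."""
    if identity.get("domain_mode", "lan") != "lan" and identity.get("domain") == "cell":
        return {**identity, "domain": f"{identity['cell_name']}.local"}
    return identity
```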

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 02:48:47 -04:00
roof 393d56d4ca fix: block auto-save when DDNS availability check is unreachable
Unit Tests / test (push) Successful in 11m34s
'unreachable' should not be a terminal state that triggers auto-save —
it was causing a 503 when the availability check failed and auto-save
fired the backend registration attempt. Only 'available' allows
auto-save when the cell name has changed from the loaded value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 14:29:10 -04:00
roof 01027c171e fix: clarify Re-register button purpose with inline hint
Unit Tests / test (push) Successful in 15m24s
Add a short label explaining the button is for DDNS recovery (when the
DDNS server lost your record), not routine IP updates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 14:08:49 -04:00
roof 742e4209ee fix: don't register pic.ngo subdomain until availability check completes
Auto-save was firing with picAvail === null (the moment the user typed a
new cell name, before the 900ms availability debounce even started), which
caused the backend to immediately register the subdomain on DDNS.

Track the last saved/loaded cell name in loadedCellName. When domainMode
is pic_ngo and the typed name differs from the loaded name, block
auto-save until picAvail reaches a terminal state (available or
unreachable). Also update loadedCellName on successful save so subsequent
edits to the same name are not blocked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 13:56:52 -04:00
roof ad2eaca273 feat: release old pic.ngo subdomain when cell name changes
Unit Tests / test (push) Successful in 15m45s
Adds DELETE /api/v1/registration to the DDNS server (token-authenticated,
owner-only) and PicNgoDDNS.release() on the client. DDNSManager.register()
now automatically releases the old subdomain before claiming the new one,
so stale names are freed for others to use. Release failures are logged as
warnings and do not block the new registration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:07:13 -04:00
roof de43f4a9a0 fix: DDNS register() always sends public IP and saves token to correct location
Unit Tests / test (push) Successful in 15m27s
Two bugs that prevented registration from working after wizard completion:
1. register(name, '') sent empty IP; server stored blank A record. Now calls
   _get_public_ip() when ip is empty so the A record is always set correctly.
2. Token was saved to _identity.domain.ddns.token (TypeError when domain is a
   string) instead of the top-level ddns config where update_ip() reads it.
   The subdomain is now also written correctly to _identity.domain_name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 16:05:55 -04:00
roof 0b31d02f10 feat: DDNS self-healing heartbeat + manual re-register endpoint
Unit Tests / test (push) Successful in 15m26s
- DDNSTokenExpired exception triggers auto re-register in update_ip()
  so cells recover silently after a DDNS DB reset
- POST /api/ddns/register lets the user force re-registration from Settings
- Re-register button in Settings → External Domain & DDNS (pic_ngo only)
- 3 new tests covering register endpoint: wrong provider, missing name, success

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 15:05:27 -04:00
roof cde177966d fix: DDNS URL env var takes priority; switch default to HTTPS
- ddns_manager: DDNS_URL env var overrides stored api_base_url so
  existing cells pick up the new HTTPS endpoint without re-registering
- docker-compose.yml: default DDNS_URL now points to https://ddns.pic.ngo
- setup_manager.py: add rstrip('/') before replacing /api/v1 to handle
  URLs with or without trailing slash
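
A small sketch of why the rstrip('/') matters. The helper name and the
choice of replacing /api/v1 with an empty string are assumptions made for
illustration; the commit does not show what setup_manager.py substitutes:

```python
def service_root(api_base_url: str) -> str:
    """Derive the service root from an API base URL (hypothetical helper)."""
    # Strip the trailing slash first so both URL spellings normalize to the
    # same root; without it, "…/api/v1/" would leave a dangling "/".
    return api_base_url.rstrip("/").replace("/api/v1", "")
```

Without the rstrip call, the trailing-slash form would come out as
"https://ddns.pic.ngo/" and the two spellings would no longer compare
equal.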

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 14:50:28 -04:00
roof 61e8631c7d feat: DDNS settings integration — check availability, update credentials
- GET /api/config now returns domain_mode, domain_name, ddns.{provider,subdomain,has_token}
- GET /api/ddns/check/<name> proxies availability check to DDNS service
- PUT /api/ddns validates and saves cloudflare/duckdns credentials post-setup
- When cell_name changes for pic_ngo provider, auto-registers the new subdomain
- Settings: Cell Name shows availability badge for pic_ngo; auto-save blocks on taken
- Settings: new External Domain & DDNS section — pic_ngo info, cloudflare/duckdns edit
- 11 new tests for the two new endpoints (all pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 14:35:37 -04:00
roof 81dcced0ca fix: bake DDNS_TOTP_SECRET and correct URL into defaults
Unit Tests / test (push) Successful in 15m42s
docker-compose.yml DDNS_TOTP_SECRET defaulted to empty string —
containers on fresh installs had no OTP, so every /register call
was rejected with 401 and no domain was ever registered.

setup_cell.py still pointed to https://ddns.pic.ngo/api/v1 (no nginx
on VPS, so HTTPS fails). Both now default to the correct values; both
are still overridable via env var for custom DDNS deployments.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 13:49:43 -04:00
roof 777ffa4fb2 fix: use DDNS_URL env var for availability check; default to port 8080
Unit Tests / test (push) Successful in 15m23s
_check_pic_ngo_available was hardcoding https://ddns.pic.ngo, ignoring
DDNS_URL. Now imports DDNS_API_BASE from setup_manager so both the
availability check and DDNS registration use the same configured URL.

API container now receives DDNS_URL and DDNS_TOTP_SECRET from env.
Default DDNS_URL points to http://ddns.pic.ngo:8080/api/v1 (the
FastAPI service runs on port 8080 without TLS termination in front).

Also returns 503 (not 500) when the DDNS service is unreachable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 13:06:44 -04:00
roof 55d36eb410 wizard: block Next if external service cannot be verified
Unit Tests / test (push) Successful in 15m44s
For pic_ngo: name must be confirmed available (not just format-valid).
For cloudflare/duckdns: the token is auto-verified on Next if that has not
already happened; an invalid token or an unreachable service blocks
proceeding. Only lan and http01 (no external dependency) allow Next
without a live check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 08:09:06 -04:00
roof 99dcb1332a wizard: check pic.ngo availability on Next, not just on blur
The availability check was only triggered onBlur, so clicking Next
without blurring the field skipped the DDNS request entirely. Now
handleNext awaits the check and blocks with an error if the name is
taken. Unknown/unreachable DDNS is treated as available to avoid
blocking the wizard.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 07:56:59 -04:00
roof 900781032a wizard: 5-step redesign — password, domain, timezone, services, review
Unit Tests / test (push) Successful in 15m22s
Domain name is now the cell identity (no separate cell name step).
All 5 providers (pic_ngo, cloudflare, duckdns, http01, lan) are
first-class options in a single Domain step. pic.ngo availability
is checked live via backend proxy to ddns.pic.ngo. Cloudflare and
DuckDNS tokens are verified via backend before proceeding.
cell_name is derived automatically from the chosen domain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 07:09:57 -04:00
roof 1c62c47475 fix: 500 on setup complete + wizard shows all 7 steps
Unit Tests / test (push) Successful in 15m41s
Two bugs:

1. AttributeError: AuthManager.update_password does not exist — the
   fallback when create_user fails should call set_password_admin().
   This caused a 500 on every setup submit when an admin user already
   existed (e.g. from a previous install attempt).

2. Wizard was jumping to step 2 and skipping domain steps 3-4 when
   preconfigured data existed in cell_config.json. Since the installer
   no longer sets that data, and the wizard must always show all steps,
   the installerConfigured state and all step-skipping navigation are
   removed. Values are still pre-filled if found in config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 16:41:33 -04:00
roof 4a42ff5dcc wizard: move all config to /setup; install.sh is infrastructure-only
Unit Tests / test (push) Successful in 15m41s
install.sh no longer prompts for anything. It installs packages (with sudo),
creates the system user, clones the repo, and runs 'make install' — all as
the invoking user. Only package installs and system-level ops use sudo.
All folder creation happens under the user's own account, no chown needed.

/setup wizard gains the missing validation that was previously in install.sh:
- Step 1: checks pic.ngo name availability via backend (non-blocking)
- Step 4: 'Verify token' button for Cloudflare and DuckDNS tokens,
  validated server-side through new /api/setup/validate steps

API changes (routes/setup.py):
- validate step 'pic_ngo_available': proxy check to ddns.pic.ngo
- validate step 'cloudflare_token': verify via Cloudflare tokens API
- validate step 'duckdns_token': verify via DuckDNS update endpoint

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 16:07:56 -04:00
roof 2d842abe5b installer: restore cell identity prompts and domain setup
Unit Tests / test (push) Successful in 15m39s
Reverts 8d1ef39. The installer must collect cell name, domain mode, and
provider tokens before 'make install' so that DDNS registration,
availability checks, and Caddy TLS can be configured at first boot.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 15:01:32 -04:00
roof 8d1ef39ca5 installer: remove cell identity prompts — wizard handles all config
Unit Tests / test (push) Successful in 15m44s
The /setup wizard now collects cell name, domain mode, credentials,
password, services, and timezone.  The bash installer's job is just
infrastructure: packages, user, repo clone, make install, start.

Removes: prompt/prompt_secret helpers, verify_cf_token, verify_duckdns,
check_pic_ngo_available, and the entire Step 5 identity block.
TOTAL_STEPS 8 → 7.  Step numbers renumbered accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 14:41:46 -04:00
roof 9566f7dd1b wizard: skip cell-name and domain steps when installer pre-configured them
Unit Tests / test (push) Successful in 15m44s
When the bash installer collects cell name and domain mode, the first-run
wizard's /setup should only ask for a password, service selection, and
timezone.  Previously the wizard pre-filled those fields but still showed
all 7 steps.

- useEffect fetches /api/setup/status on mount; if preconfigured.cell_name
  and preconfigured.domain_mode are both set, sets installerConfigured=true
  and jumps to step 2 (password)
- handleStep2Next → step 5 when installerConfigured (skips domain steps 3+4)
- handleStep2Back → step 1 when installerConfigured (review cell name)
- handleStep5Back returns to step 2 when installerConfigured

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 14:03:56 -04:00
roof f03a5f08c6 Makefile: explicitly pass all identity env vars to setup_cell.py
Unit Tests / test (push) Successful in 15m41s
DOMAIN_MODE, CELL_DOMAIN_NAME, CLOUDFLARE_API_TOKEN, DUCKDNS_TOKEN,
DUCKDNS_SUBDOMAIN are now explicit in the setup target so they are
visible and documented, not silently inherited from the environment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 13:27:53 -04:00
roof f550f04ce2 Fix DDNS registration and wizard pre-fill after installer run
Unit Tests / test (push) Successful in 15m29s
DDNS registration (setup_cell.py):
- Replace pyotp dependency with stdlib TOTP (HMAC-SHA1, RFC 6238)
  pyotp is only available inside the Docker container, not on the host
  where setup_cell.py runs — registration was silently skipped every time
- OTP header still sent if generation succeeds; omitted gracefully if not
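
For reference, a stdlib-only TOTP along the lines described above can be
sketched as follows. This is an illustration of RFC 6238 with HMAC-SHA1,
not the code in setup_cell.py; the function name and parameters are
assumptions:

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, for_time: Optional[float] = None,
         digits: int = 6, step: int = 30) -> str:
    """Return an RFC 6238 TOTP code using only the standard library."""
    # Base32 secrets are often stored without padding; restore it first.
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)
    now = time.time() if for_time is None else for_time
    counter = int(now // step)
    # HOTP (RFC 4226): HMAC-SHA1 over the big-endian 64-bit counter.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

The for_time parameter exists only to make the function testable against
the published RFC 6238 vectors; production callers would omit it.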

Wizard pre-fill (setup_manager + Setup.jsx):
- GET /api/setup/status now returns 'preconfigured' dict with cell_name,
  domain_mode, domain_name, and provider tokens from installer-written config
- Setup.jsx fetches status on mount and pre-fills all form state so the
  user only needs to set password, services, and timezone — not re-enter
  the identity they already configured in the bash installer
- Fails silently so wizard still works on fresh installs with no config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 12:22:53 -04:00
roof 579f49ba13 Installer: interactive cell identity prompts with live token validation
Unit Tests / test (push) Successful in 15m24s
install.sh now guides the user through the full identity setup before
running make install:
- Cell name prompt with format validation and pic.ngo availability check
- Domain mode selection: pic.ngo / Cloudflare / DuckDNS / HTTP-01 / LAN
- Cloudflare API token: collected and verified against CF tokens/verify API
- DuckDNS: subdomain + token verified against duckdns.org/update
- HTTP-01: domain name collected, port-80 warning shown
- All collected values passed as env vars to make install
- After two failed token attempts the user can continue (the token is re-verified at boot)
- Final banner shows configured cell name and domain

setup_cell.py: updated to handle all domain modes
- Reads DOMAIN_MODE / CELL_DOMAIN_NAME / CLOUDFLARE_API_TOKEN /
  DUCKDNS_TOKEN / DUCKDNS_SUBDOMAIN from env
- write_cell_config() now writes domain_mode + domain_name to _identity
  and builds the ddns section for each provider (not hardcoded to pic_ngo)
- register_with_ddns() only called when DOMAIN_MODE == 'pic_ngo'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 11:34:22 -04:00
roof 925ab1f696 Overhaul setup wizard: domain config, password strength, field alignment
Unit Tests / test (push) Successful in 8m48s
Password:
- Add lowercase to strength scoring; "Good" now requires all API criteria
  (12 chars, upper, lower, digit) — no more submitting passwords the API rejects
- isReady gates the Next button on meeting API requirements, not just length
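
The API criteria named above (12 chars, upper, lower, digit) amount to a
check like the following. It is written in Python to mirror the backend
rule; the function name is hypothetical and the wizard's own isReady
logic lives in the React code:

```python
def meets_api_requirements(password: str) -> bool:
    """True when the password satisfies all four stated criteria."""
    return (
        len(password) >= 12
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
    )
```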

Domain steps 3 + 4:
- Step 3: choose pic_ngo / custom / lan (sends valid API domain_modes)
- Step 4 (pic.ngo): shows derived [cellName].pic.ngo domain preview
- Step 4 (custom): domain name field + TLS method selector
  (Cloudflare DNS-01 + API token, DuckDNS + token, HTTP-01 + port-80 warning)
- Step 4 skipped entirely for LAN-only
- Review step shows actual domain string and TLS method instead of opaque codes

Cell name:
- Description and preview hint make clear it becomes the pic.ngo subdomain
- Step 1 shows live "name.pic.ngo" preview as you type

Backend:
- setup_manager now accepts and stores domain_name, cloudflare_api_token,
  duckdns_token for Phase 3 DDNS registration use

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 07:27:59 -04:00
roof 439886624e Fix config/data ownership — chown to invoking user after make install
Unit Tests / test (push) Successful in 8m46s
make install runs as root so all generated files (config/, data/) land
as root:root. Added a chown pass in install.sh after make install
completes, re-applying REPO_OWNER ownership. Also fixed the make setup
chown to use SUDO_USER when invoked via sudo rather than always id -u
(which is 0 when running as root).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 06:46:12 -04:00
roof 24877df976 Fix setup wizard and installer for fresh-install flow
Unit Tests / test (push) Successful in 8m53s
- setup_manager: fall back to update_password if admin already exists
  (installer bootstrap creates admin; wizard now updates rather than fails)
- install.sh: chown repo to SUDO_USER instead of pic user so the
  invoking operator can run make update without git safe.directory errors
- test: update mock to also stub update_password when testing total auth failure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 06:08:55 -04:00
roof bfa0d99dd1 Fix git safe.directory error for non-root users after install
Unit Tests / test (push) Successful in 8m55s
The installer runs as root and chowns /opt/pic to the pic user.
Any other user (roof, operator) running make update then hits
"detected dubious ownership". Fix: add /opt/pic to system-wide
git safe.directory after clone, and add same guard in make update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 05:46:40 -04:00
roof 1e2cf5580f Fix setup wizard: align field names with API (domain_type→domain_mode, services→services_enabled)
Unit Tests / test (push) Successful in 8m52s
The wizard was sending domain_type and services but the API expected
domain_mode and services_enabled, causing a validation error on submit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 05:36:18 -04:00
roof 1989dfa0a3 Fix: exempt /api/setup/* from enforce_auth so setup wizard works on fresh install
Unit Tests / test (push) Successful in 8m49s
The setup wizard runs before any account exists, but the installer's
setup_cell.py creates auth_users.json with an admin account first.
This meant enforce_auth was active by the time the browser hit /setup,
blocking all /api/setup/* calls with 401. The CSRF hook already exempted
/api/setup/* — auth enforcement now matches.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 05:03:44 -04:00
roof 5dab6377bc Restore https:// now that git.pic.ngo has a TLS certificate
Unit Tests / test (push) Failing after 15m59s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 04:33:51 -04:00
roof 0a24d20bbc Update QUICKSTART: use http for install.pic.ngo and git.pic.ngo (no HTTPS yet)
Unit Tests / test (push) Successful in 8m50s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 02:58:48 -04:00
roof 46599bd37e Fix installer: use http://git.pic.ngo without port (nginx forwards)
Unit Tests / test (push) Successful in 8m55s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 02:57:13 -04:00
roof dde4d9a53f Rewrite CLAUDE.md following article best practices
Unit Tests / test (push) Successful in 8m54s
Adds: tech stack, coding conventions, file placement rules, safety rules,
infrastructure topology table, and expands architecture with key-file table
and before-request hook documentation. Removes vague guidance, replaces
with actionable rules Claude can follow automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 07:25:53 -04:00
roof 674a66f7a0 Revert registry port: git.pic.ngo uses standard port (DNS fix pending)
Unit Tests / test (push) Successful in 8m55s
2026-05-10 06:59:13 -04:00
roof 9df3bf6a17 Fix release workflow: registry is git.pic.ngo:3000 not port 80
Unit Tests / test (push) Successful in 8m55s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 06:52:42 -04:00
roof 0773179962 Gitignore .coverage files
Unit Tests / test (push) Successful in 8m55s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 06:28:40 -04:00
roof 3a35cf72d3 Fix CI failures on root — mock OSError instead of relying on filesystem
Tests assumed write to /nonexistent/... fails, but CI runs as root where
Linux allows creating any path. Use unittest.mock.patch on builtins.open
with OSError side_effect so the test is environment-independent.
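
A minimal sketch of the pattern, assuming a hypothetical save_config()
function under test (the real tests and the code they exercise are not
shown in this log):

```python
from unittest import mock


def save_config(path: str, payload: str) -> bool:
    """Hypothetical function under test: report failure instead of raising."""
    try:
        with open(path, "w") as fh:
            fh.write(payload)
        return True
    except OSError:
        return False


def check_write_failure_is_reported() -> bool:
    # Patch builtins.open so the failure does not depend on filesystem
    # permissions, which differ when the test runner is root.
    with mock.patch("builtins.open", side_effect=OSError("simulated failure")):
        return save_config("/nonexistent/dir/config.json", "{}")
```

Because open() itself raises, the outcome is the same for root and
unprivileged users.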

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 06:19:24 -04:00
roof 515f3d5075 Update QUICKSTART: lead with curl installer, document all domain modes
Unit Tests / test (push) Failing after 8m43s
Option A is now the one-line curl installer (install.pic.ngo); Option B
is the manual git clone path. Wizard section covers all five domain modes
(pic_ngo, cloudflare, duckdns, http01, lan) and current password rules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 05:05:08 -04:00
roof 35993bc79d Update all documentation to reflect current architecture
Unit Tests / test (push) Failing after 8m47s
README, QUICKSTART, and Wiki were pre-wizard, pre-auth, pre-DDNS, and
pre-service-store.  Full rewrite covering:
- First-run wizard replaces manual make setup + .env identity config
- Session-based auth (admin/peer roles, CSRF protection)
- DDNS: pic.ngo registration with TOTP, provider abstraction
- Service store: install/remove optional services from manifest index
- Cell-to-cell networking and peer-sync protocol
- Extended connectivity: WG external, OpenVPN, Tor exit routing
- Caddy HTTPS: Let's Encrypt (DNS-01/HTTP-01) or internal CA
- Current container list, port bindings, and security model
- Accurate make targets (ddns-update, reset-admin-password, etc.)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 04:35:37 -04:00
roof f1b48208fc Fix CI unit test failures and DDNS config wiring
Unit Tests / test (push) Failing after 8m58s
- auth_manager._ensure_file(): stop creating the empty auth_users.json on
  init — the constructor now only creates the parent directory.  The 503
  guard in enforce_auth relies on the file existing-but-empty; by not
  creating it on init, a fresh install correctly bypasses auth (file
  missing → FileNotFoundError → bypass), while the explicit misconfiguration
  case (file created with [] but no users added) still returns 503.
- test_enforce_auth_configured.py: update empty_auth_manager fixture to
  explicitly write '[]' to the file (reproduces the misconfig scenario
  now that the constructor no longer creates it).
- ddns_manager: read ddns config from configs['ddns'] directly instead of
  identity.domain.ddns — _identity.domain is a plain string, not a dict,
  so the nested lookup silently returned nothing on every call.
- setup_cell.py: write top-level 'ddns' block into cell_config.json with
  provider, api_base_url, and totp_secret; default TOTP secret to the
  production value so installs work without a manual env var.
- test_ddns_manager.py: update _make_config_manager to populate cm.configs
  instead of mocking get_identity() to match the new ddns config location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 04:20:19 -04:00
roof ffe1dbeed6 Integrate DDNS registration and IP update into installer
Unit Tests / test (push) Failing after 8m57s
setup_cell.py: register_with_ddns() called at end of setup — detects
public IP via api.ipify.org, generates TOTP code from DDNS_TOTP_SECRET,
POSTs to DDNS /register, saves token to data/api/.ddns_token (mode 600).
Idempotent: skips if token file already exists. Fails gracefully if
DDNS_TOTP_SECRET is unset or network is unreachable.

scripts/ddns_update.py: standalone script for periodic IP updates.
Reads token from data/api/.ddns_token, fetches current public IP,
compares to cached last IP (data/api/.ddns_last_ip) and calls /update
only when the IP has actually changed.
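
The change-detection step can be sketched as below. The function name,
its parameters, and the injected send_update callable are illustrative
assumptions; the real script reads the token file and calls the DDNS
/update endpoint itself:

```python
from pathlib import Path
from typing import Callable


def update_if_changed(current_ip: str, cache_file: Path,
                      send_update: Callable[[str], None]) -> bool:
    """Call send_update only when current_ip differs from the cached IP."""
    last_ip = cache_file.read_text().strip() if cache_file.exists() else ""
    if current_ip == last_ip:
        return False  # nothing changed, skip the network call
    send_update(current_ip)
    # Cache only after a successful update so a failed call is retried.
    cache_file.write_text(current_ip)
    return True
```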

Makefile: add ddns-update (run update script) and ddns-register (force
re-registration by removing old token then calling register_with_ddns).
Usage: DDNS_TOTP_SECRET=<secret> make ddns-register

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 02:28:02 -04:00
roof 15376b67c7 Add runtime-generated config paths to .gitignore
Unit Tests / test (push) Failing after 9m0s
config/api/dns/, config/api/network.json, config/api/webdav/ are
created at API startup and should never be tracked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 13:26:03 -04:00
192 changed files with 38439 additions and 9015 deletions
BIN
Binary file not shown.
+7
@@ -0,0 +1,7 @@
[run]
omit =
api/test_enhanced_api.py
[report]
omit =
api/test_enhanced_api.py
+10 -2
@@ -21,8 +21,10 @@ config/api/caddy/Caddyfile
config/api/calendar.json
config/api/cell_config.json
config/api/wireguard.json
config/api/webdav/webdav.conf
config/api/webdav/
config/api/dhcp/
config/api/dns/
config/api/network.json
config/caddy/Caddyfile
config/dhcp/dnsmasq.conf
config/dns/Corefile
@@ -84,4 +86,10 @@ backups/
# Temporary files
*.tmp
*.temp
*.temp
# Coverage data
.coverage
htmlcov/
CLAUDE.md
-87
@@ -1,87 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What This Project Is
**Personal Internet Cell (PIC)** — a self-hosted digital infrastructure platform. It manages DNS, DHCP, NTP, WireGuard VPN, email, calendar/contacts (CalDAV), file storage (WebDAV), reverse proxy (Caddy), a certificate authority, and container orchestration, all from a single API + React UI.
## Common Commands
```bash
# Full stack
make start # docker-compose up -d
make stop # docker-compose down
make restart # docker-compose restart
make status # docker status + API health
make logs # docker-compose logs -f
make build # rebuild api image
# Tests
make test # pytest tests/ api/tests/
make test-coverage # pytest with coverage HTML report
make test-api # pytest tests/test_api_endpoints.py
pytest tests/test_<module>.py # single test file
# Local dev (no Docker)
pip install -r api/requirements.txt
python api/app.py # Flask API on :3000
cd webui && npm install && npm run dev # React UI on :5173 (proxies API to :3000)
# WireGuard
make show-routes
make add-peer PEER_NAME=foo PEER_IP=10.0.0.5 PEER_KEY=<pubkey>
make list-peers
```
## Architecture
### Backend (`api/`)
All service managers inherit `BaseServiceManager` (`api/base_service_manager.py`). This enforces a consistent interface: `get_status()`, `get_config()`, `update_config()`, `validate_config()`, `test_connectivity()`, `get_logs()`, `restart_service()`. When adding or modifying a service manager, follow this pattern.
The `ServiceBus` (`api/service_bus.py`) is a pub/sub event system used for inter-service communication. Services publish events (e.g., `SERVICE_STARTED`, `CONFIG_CHANGED`, `PEER_CONNECTED`) and subscribe to events from dependencies. Dependency graph is declared in the bus — e.g., `wireguard` depends on `network`; `email` depends on `network` and `vault`.
`ConfigManager` (`api/config_manager.py`) is the single source of truth. Config lives in `/app/config/cell_config.json` (mapped from `config/api/`). All managers read/write through ConfigManager, which validates against per-service schemas and maintains automatic backups.
`LogManager` (`api/log_manager.py`) provides structured JSON logging with rotation (5 MB / 5 backups per service). Use it instead of `print()` or raw `logging`.
`app.py` (2000+ lines) contains all Flask REST endpoints, organized by service. It runs a background health-monitoring thread.
Service managers:
- `network_manager.py` — DNS (CoreDNS), DHCP (dnsmasq), NTP (chrony)
- `wireguard_manager.py` — VPN peer lifecycle, QR codes
- `peer_registry.py` — peer registration/lookup
- `routing_manager.py` — NAT, firewall rules, VPN gateway
- `vault_manager.py` — internal certificate authority
- `email_manager.py` — Postfix + Dovecot
- `calendar_manager.py` — Radicale CalDAV/CardDAV
- `file_manager.py` — WebDAV storage
- `container_manager.py` — Docker SDK wrappers
- `cell_manager.py` — top-level orchestration
### Frontend (`webui/`)
React 18 + Vite + Tailwind CSS. All API calls go through `src/services/api.js` (Axios). Vite dev server proxies `/api` to `localhost:3000`. Pages in `src/pages/`, shared components in `src/components/`.
### Infrastructure
`docker-compose.yml` defines 13 services on a custom bridge network `cell-network` (172.20.0.0/16). Cell IPs default to 10.0.0.0/24. Key ports: 53 (DNS), 80/443 (Caddy), 3000 (API), 5173/8081 (WebUI), 51820/udp (WireGuard), 25/587/993 (mail), 5232 (CalDAV), 8080 (WebDAV).
Config files for each service live under `config/<service>/`. Persistent data is under `data/` (git-ignored). WireGuard configs are also git-ignored.
## Testing
Tests live in `tests/` (28 files). Use mocking (`pytest-mock`) for external system calls. Integration tests in `test_integration.py` require Docker services running.
## AI Collaboration Rules (Claude Code)
These rules apply to every Claude Code session in this repo:
- **Read memory first** — load `/home/roof/.claude/projects/-home-roof/memory/MEMORY.md` and referenced files at session start.
- **Dev machine context** — you are already on pic0 (192.168.31.51), the dev machine. Execute commands here directly; do not ask the user to run them.
- **Use all available agents** — spawn specialized sub-agents (pic-remote, pic-qa, pic-architect, etc.) for tasks that match their description.
- **make is the only interface** — never call docker/docker-compose directly. All container lifecycle operations go through `make start`, `make stop`, `make build`, `make logs`, etc.
- **Test every new feature** — after implementing any change, run `make test` before considering the task done.
- **Test before commit** — the pre-commit hook enforces this, but run `make test` manually first and fix all failures before staging files.
+63 -16
@@ -12,7 +12,8 @@
test-e2e-deps test-e2e-api test-e2e-ui test-e2e-wg test-e2e \
reset-test-admin-pass \
show-admin-password reset-admin-password \
show-routes add-peer list-peers
show-routes add-peer list-peers \
ddns-update ddns-register
# Detect docker compose command (v2 plugin preferred, fallback to v1 standalone)
DC := $(shell docker compose version >/dev/null 2>&1 && echo "docker compose" || echo "docker-compose")
@@ -78,9 +79,14 @@ check-deps:
setup: check-deps
@echo "Setting up Personal Internet Cell..."
@sudo chown -R $$(id -u):$$(id -g) config/ data/ 2>/dev/null || true
@sudo chown -R $${SUDO_USER:-$$(id -un)}:$${SUDO_USER:-$$(id -un)} config/ data/ 2>/dev/null || true
CELL_NAME=$(or $(CELL_NAME),mycell) \
CELL_DOMAIN=$(or $(CELL_DOMAIN),cell) \
DOMAIN_MODE=$(or $(DOMAIN_MODE),lan) \
CELL_DOMAIN_NAME=$(or $(CELL_DOMAIN_NAME),) \
CLOUDFLARE_API_TOKEN=$(or $(CLOUDFLARE_API_TOKEN),) \
DUCKDNS_TOKEN=$(or $(DUCKDNS_TOKEN),) \
DUCKDNS_SUBDOMAIN=$(or $(DUCKDNS_SUBDOMAIN),) \
VPN_ADDRESS=$(or $(VPN_ADDRESS),10.0.0.1/24) \
WG_PORT=$(or $(WG_PORT),51820) \
WG_PRIVATE_KEY="$(WG_PRIVATE_KEY)" \
@@ -96,12 +102,14 @@ init-peers:
start:
@echo "Starting Personal Internet Cell..."
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile full up -d --build
@docker network inspect cell-network >/dev/null 2>&1 || \
docker network create --driver bridge --subnet "$${CELL_NETWORK:-172.20.0.0/16}" cell-network
@PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core up -d --build --quiet-pull
@echo "Services started. Check status with 'make status'"
stop:
@echo "Stopping Personal Internet Cell..."
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile full down
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core down
@echo "Services stopped."
restart:
@@ -130,20 +138,20 @@ shell-%:
update:
@echo "Pulling latest code..."
@git config --global --add safe.directory $$(pwd) 2>/dev/null || true
@git stash --include-untracked --quiet 2>/dev/null || true
git pull
@git stash pop --quiet 2>/dev/null || true
@if [ ! -f config/mail/mailserver.env ]; then \
echo "Config missing — running setup first..."; \
$(MAKE) setup; \
fi
@echo "Rebuilding and restarting services..."
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile full up -d --build
@docker network inspect cell-network >/dev/null 2>&1 || \
docker network create --driver bridge --subnet "$${CELL_NETWORK:-172.20.0.0/16}" cell-network
@PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core up -d --build --quiet-pull
@echo "Update complete. Run 'make status' to verify."
reinstall:
@echo "Reinstalling Personal Internet Cell from scratch..."
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile full down -v 2>/dev/null || true
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core down 2>/dev/null || true
docker network rm cell-network 2>/dev/null || true
@sudo rm -rf config/ data/
@$(MAKE) setup
@$(MAKE) start
@@ -172,22 +180,30 @@ uninstall:
case "$$ans" in \
y|Y) \
echo "Stopping containers and removing images..."; \
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile full down -v --rmi all 2>/dev/null || true; \
for f in data/api/services/*/docker-compose.yml; do [ -f "$$f" ] && PUID=$$(id -u) PGID=$$(id -g) docker compose -f "$$f" down 2>/dev/null || true; done; \
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core down --rmi all 2>/dev/null || true; \
docker ps -aq --filter "name=cell-" | xargs -r docker rm -f 2>/dev/null || true; \
docker network rm cell-network 2>/dev/null || true; \
echo "Deleting config/ and data/..."; \
sudo rm -rf config/ data/; \
echo "Uninstall complete. Git repo and scripts remain."; \
;; \
n|N|"") \
echo "Stopping and removing containers (keeping images and data)..."; \
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile full down 2>/dev/null || true; \
for f in data/api/services/*/docker-compose.yml; do [ -f "$$f" ] && PUID=$$(id -u) PGID=$$(id -g) docker compose -f "$$f" down 2>/dev/null || true; done; \
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core down 2>/dev/null || true; \
docker ps -aq --filter "name=cell-" | xargs -r docker rm -f 2>/dev/null || true; \
echo "Done. Images, config/ and data/ are untouched. Run 'make start' to bring it back up."; \
;; \
*) \
echo "Cancelled."; \
;; \
esac
@-sudo systemctl disable pic 2>/dev/null || true
@-sudo rm -f /etc/systemd/system/pic.service
@if command -v systemctl >/dev/null 2>&1; then \
sudo systemctl disable --now pic 2>/dev/null || true; \
sudo rm -f /etc/systemd/system/pic.service; \
sudo systemctl daemon-reload 2>/dev/null || true; \
fi
@-sudo rm -f /opt/pic/.installed
@echo "Note: Data volumes were not deleted. To remove all data, manually delete config/ and data/."
@@ -211,7 +227,9 @@ build-webui:
start-core:
@echo "Starting core services (caddy, dns, wireguard, api, webui)..."
PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core up -d --build
@docker network inspect cell-network >/dev/null 2>&1 || \
docker network create --driver bridge --subnet "$${CELL_NETWORK:-172.20.0.0/16}" cell-network
@PUID=$$(id -u) PGID=$$(id -g) $(DCF) --profile core up -d --build --quiet-pull
@echo "Core services started. Run 'make start' to also bring up optional services."
start-dns:
@@ -238,9 +256,23 @@ backup:
@echo "Creating backup..."
@mkdir -p backups
@sudo tar -czf backups/cell-backup-$(shell date +%Y%m%d-%H%M%S).tar.gz \
--exclude='data/logs' \
--exclude='data/api/config_backups' \
--exclude='data/api/.test_admin_pass' \
--exclude='data/api/.gitkeep' \
--exclude='*.tmp' \
--exclude='*.partial' \
--exclude='__pycache__' \
config/ data/ docker-compose.yml Makefile README.md
@sudo chown $$(id -u):$$(id -g) backups/cell-backup-*.tar.gz
@echo "Backup created in backups/."
@chmod 600 backups/cell-backup-*.tar.gz
@echo "Backup created in backups/ (mode 0600 — contains secrets/keys)."
@echo ""
@echo "WARNING: this archive contains secrets and key material (WireGuard"
@echo "keys, internal CA, vault fernet.key, admin credentials). Store it"
@echo "securely. Data volumes of installed store services (email, calendar,"
@echo "files, ...) are NOT included here — they are captured by API-driven"
@echo "backups (POST /api/config/backup) via _backup_service_volumes."
restore:
@echo "Available backups:"
@@ -335,6 +367,21 @@ add-peer:
echo "Usage: make add-peer PEER_NAME=name PEER_IP=10.0.0.x PEER_KEY=<pubkey>"; \
fi
# ── DDNS ─────────────────────────────────────────────────────────────────────
ddns-update:
@python3 scripts/ddns_update.py
ddns-register:
@DDNS_TOTP_SECRET="$(DDNS_TOTP_SECRET)" python3 -c "\
import os, sys; sys.path.insert(0, 'scripts'); \
from setup_cell import register_with_ddns, _read_existing_ip_range; \
import json; \
cfg = json.load(open('config/api/cell_config.json')) if os.path.exists('config/api/cell_config.json') else {}; \
name = cfg.get('_identity', {}).get('cell_name', os.environ.get('CELL_NAME', 'mycell')); \
import os; os.remove('data/api/.ddns_token') if os.path.exists('data/api/.ddns_token') else None; \
register_with_ddns(name)"
# ── Dev ───────────────────────────────────────────────────────────────────────
dev:
-535
@@ -1,535 +0,0 @@
# Personal Internet Cell – Project Wiki
## 🌟 Overview
Personal Internet Cell is a **production-grade, self-hosted, decentralized digital infrastructure** solution designed to provide individuals with full control over their digital services and data. The project has evolved from a phase-based implementation to a **unified, enterprise-ready system** with modern architecture, comprehensive testing, and production-grade features.
## 📋 Table of Contents
1. [Project Goals](#project-goals)
2. [Architecture & Components](#architecture--components)
3. [Service Manager Architecture](#service-manager-architecture)
4. [Core Services](#core-services)
5. [API Reference](#api-reference)
6. [Enhanced CLI](#enhanced-cli)
7. [Security Model](#security-model)
8. [Testing & Quality Assurance](#testing--quality-assurance)
9. [Usage Examples](#usage-examples)
10. [Development & Deployment](#development--deployment)
11. [Future Enhancements](#future-enhancements)
12. [Project Status](#project-status)
## 🎯 Project Goals
- **Self-Hosted**: Run your own digital services (email, calendar, files, VPN, etc.) on your hardware
- **Decentralized**: Peer-to-peer networking and trust, no central authority
- **Production-Grade**: Enterprise-ready architecture with comprehensive monitoring
- **Secure**: Modern cryptography, certificate management, and encrypted storage
- **User-Friendly**: Professional CLI and API for easy management
- **Extensible**: Modular architecture for future services and integrations
- **Event-Driven**: Real-time service communication and orchestration
## 🏗️ Architecture & Components
### **Modern Architecture Stack**
- **Backend**: Python (Flask) with production-grade service managers
- **Service Architecture**: BaseServiceManager pattern with unified interfaces
- **Event System**: Service bus for real-time communication and orchestration
- **Configuration**: Centralized configuration management with validation
- **Logging**: Structured JSON logging with rotation and search
- **Containerization**: Docker-based deployment and service isolation
- **API**: RESTful endpoints with comprehensive documentation
### **Core Architecture Components**
```
┌─────────────────────────────────────────────────────────────┐
│ Personal Internet Cell │
├─────────────────────────────────────────────────────────────┤
│ Enhanced CLI │ Web UI │ REST API │ Service Bus │ Logging │
├─────────────────────────────────────────────────────────────┤
│ Service Managers │
│ Network │ WireGuard │ Email │ Calendar │ Files │ Routing │
│ Vault │ Container │ Cell │ Peer │ │ │
├─────────────────────────────────────────────────────────────┤
│ Core Infrastructure │
│ DNS │ DHCP │ NTP │ VPN │ CA │ Encryption │ Trust │ Storage │
└─────────────────────────────────────────────────────────────┘
```
## 🔧 Service Manager Architecture
### **BaseServiceManager Pattern**
All services inherit from `BaseServiceManager`, providing:
```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseServiceManager(ABC):
    def __init__(self, service_name: str, data_dir: str, config_dir: str): ...

    @abstractmethod
    def get_status(self) -> Dict[str, Any]: ...

    @abstractmethod
    def test_connectivity(self) -> Dict[str, Any]: ...

    # Common methods (signatures only; bodies elided)
    def get_logs(self, lines: int = 50) -> List[str]: ...
    def restart_service(self) -> bool: ...
    def get_config(self) -> Dict[str, Any]: ...
    def update_config(self, config: Dict[str, Any]) -> bool: ...
    def health_check(self) -> Dict[str, Any]: ...
    def handle_error(self, error: Exception, context: str) -> Dict[str, Any]: ...
```
### **Service Bus Integration**
```python
# Event-driven service communication
service_bus.register_service('network', network_manager)
service_bus.register_service('wireguard', wireguard_manager)
service_bus.publish_event(EventType.SERVICE_STARTED, 'network', data)
# Service dependencies
service_dependencies = {
'wireguard': ['network'],
'email': ['network', 'vault'],
'calendar': ['network', 'vault'],
'files': ['network', 'vault'],
'routing': ['network', 'wireguard'],
'vault': ['network']
}
```
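One use of a dependency map like this is deriving a safe start order. The sketch below reuses the map from the snippet above and sorts it with the standard-library `graphlib` module (Python 3.9+). `start_order` is a hypothetical helper; whether the real service bus orders startup this way is an assumption.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Same map as above: each service lists the services it depends on.
service_dependencies: Dict[str, List[str]] = {
    "wireguard": ["network"],
    "email": ["network", "vault"],
    "calendar": ["network", "vault"],
    "files": ["network", "vault"],
    "routing": ["network", "wireguard"],
    "vault": ["network"],
}


def start_order(deps: Dict[str, List[str]]) -> List[str]:
    """Return services so that every dependency precedes its dependents.

    Raises graphlib.CycleError if the map contains a dependency cycle.
    """
    return list(TopologicalSorter(deps).static_order())


order = start_order(service_dependencies)
```

The order among services with no mutual dependency (for example `email` and `calendar`) is not specified, so callers should rely only on the "dependencies first" guarantee.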
## 🔧 Core Services
### **Network Services**
- **NetworkManager**: DNS, DHCP, NTP with dynamic management
- Dynamic zone file generation
- DHCP lease monitoring
- Network connectivity testing
- Service health monitoring
### **VPN & Mesh Networking**
- **WireGuardManager**: WireGuard VPN configuration and peer management
- Key generation and management
- Peer configuration
- Connectivity testing
- Dynamic IP updates
- **PeerRegistry**: Peer registration and trust management
- Peer lifecycle management
- Trust relationship tracking
- Data integrity validation
- Peer statistics
### **Digital Services**
- **EmailManager**: SMTP/IMAP email services
- User account management
- Mailbox configuration
- Service connectivity testing
- Email delivery monitoring
- **CalendarManager**: CalDAV/CardDAV calendar and contacts
- User and calendar management
- Event synchronization
- Service health monitoring
- Connectivity testing
- **FileManager**: WebDAV file storage
- User directory management
- Storage quota monitoring
- File system access testing
- Backup and restore capabilities
### **Infrastructure Services**
- **RoutingManager**: Advanced routing and NAT
- NAT rule management
- Firewall configuration
- Exit node routing
- Bridge and split routing
- Connectivity testing
- **VaultManager**: Security and trust management
- Self-hosted Certificate Authority
- Certificate lifecycle management
- Age/Fernet encryption
- Trust relationship management
- Cryptographic verification
- **ContainerManager**: Docker orchestration
- Container lifecycle management
- Image and volume management
- Docker daemon connectivity
- Service isolation
- **CellManager**: Overall cell orchestration
- Service coordination
- Health monitoring
- Configuration management
- Peer management
## 📡 API Reference
### **Core API Endpoints**
```bash
# Service Status and Health
GET /api/services/status # All services status
GET /api/services/connectivity # Service connectivity tests
GET /health # API health check
# Configuration Management
GET /api/config # Get configuration
PUT /api/config # Update configuration
POST /api/config/backup # Create backup
GET /api/config/backups # List backups
POST /api/config/restore/<id> # Restore backup
GET /api/config/export # Export configuration
POST /api/config/import # Import configuration
# Service Bus
GET /api/services/bus/status # Service bus status
GET /api/services/bus/events # Event history
POST /api/services/bus/services/<service>/start
POST /api/services/bus/services/<service>/stop
POST /api/services/bus/services/<service>/restart
# Logging
GET /api/logs/services/<service> # Service logs
POST /api/logs/search # Log search
POST /api/logs/export # Log export
GET /api/logs/statistics # Log statistics
POST /api/logs/rotate # Log rotation
```
### **Service-Specific Endpoints**
```bash
# Network Services
GET /api/dns/records # DNS records
POST /api/dns/records # Add DNS record
DELETE /api/dns/records # Remove DNS record
GET /api/dhcp/leases # DHCP leases
POST /api/dhcp/reservations # Add DHCP reservation
GET /api/ntp/status # NTP status
GET /api/network/info # Network information
POST /api/network/test # Network connectivity test
# WireGuard & Peers
GET /api/wireguard/keys # WireGuard keys
POST /api/wireguard/keys/peer # Generate peer keys
GET /api/wireguard/config # WireGuard configuration
GET /api/wireguard/peers # List peers
POST /api/wireguard/peers # Add peer
DELETE /api/wireguard/peers # Remove peer
GET /api/wireguard/status # WireGuard status
POST /api/wireguard/connectivity # Connectivity test
PUT /api/wireguard/peers/ip # Update peer IP
# Digital Services
GET /api/email/users # Email users
POST /api/email/users # Add email user
DELETE /api/email/users/<user> # Remove email user
GET /api/email/status # Email service status
GET /api/email/connectivity # Email connectivity
POST /api/email/send # Send email
GET /api/email/mailbox/<user> # User mailbox
GET /api/calendar/users # Calendar users
POST /api/calendar/users # Add calendar user
DELETE /api/calendar/users/<user> # Remove calendar user
POST /api/calendar/calendars # Create calendar
POST /api/calendar/events # Add event
GET /api/calendar/events/<user>/<calendar> # List events
GET /api/calendar/status # Calendar service status
GET /api/calendar/connectivity # Calendar connectivity
GET /api/files/users # File users
POST /api/files/users # Add file user
DELETE /api/files/users/<user> # Remove file user
POST /api/files/folders # Create folder
DELETE /api/files/folders/<user>/<path> # Remove folder
POST /api/files/upload/<user> # Upload file
GET /api/files/download/<user>/<path> # Download file
DELETE /api/files/delete/<user>/<path> # Delete file
GET /api/files/list/<user> # List files
GET /api/files/status # File service status
GET /api/files/connectivity # File connectivity
# Routing & Security
GET /api/routing/status # Routing status
POST /api/routing/nat # Add NAT rule
DELETE /api/routing/nat/<id> # Remove NAT rule
POST /api/routing/peers # Add peer route
DELETE /api/routing/peers/<peer> # Remove peer route
POST /api/routing/exit-nodes # Add exit node
POST /api/routing/bridge # Add bridge route
POST /api/routing/split # Add split route
POST /api/routing/firewall # Add firewall rule
POST /api/routing/connectivity # Routing connectivity test
GET /api/routing/logs # Routing logs
GET /api/routing/nat # List NAT rules
GET /api/routing/peers # List peer routes
GET /api/routing/firewall # List firewall rules
GET /api/vault/status # Vault status
GET /api/vault/certificates # List certificates
POST /api/vault/certificates # Generate certificate
DELETE /api/vault/certificates/<name> # Revoke certificate
GET /api/vault/ca/certificate # CA certificate
GET /api/vault/age/public-key # Age public key
GET /api/vault/trust/keys # Trusted keys
POST /api/vault/trust/keys # Add trusted key
DELETE /api/vault/trust/keys/<name> # Remove trusted key
POST /api/vault/trust/verify # Verify trust
GET /api/vault/trust/chains # Trust chains
```
## 💻 Enhanced CLI
### **CLI Features**
```bash
# Interactive mode with tab completion
python api/enhanced_cli.py --interactive
# Batch operations
python api/enhanced_cli.py --batch "status" "services" "health"
# Configuration management
python api/enhanced_cli.py --export-config json
python api/enhanced_cli.py --import-config config.json
# Service wizards
python api/enhanced_cli.py --wizard network
python api/enhanced_cli.py --wizard email
# Health monitoring
python api/enhanced_cli.py --health
python api/enhanced_cli.py --logs network
# Service status
python api/enhanced_cli.py --status
python api/enhanced_cli.py --services
python api/enhanced_cli.py --peers
```
### **CLI Capabilities**
- **Interactive Mode**: Tab completion, command history, help system
- **Batch Operations**: Execute multiple commands in sequence
- **Configuration Wizards**: Guided setup for complex services
- **Real-time Monitoring**: Live status updates and health checks
- **Log Management**: View, search, and export service logs
- **Service Management**: Start, stop, restart, and configure services
## 🔒 Security Model
### **Certificate Management**
- **Self-hosted CA**: Issue and manage TLS certificates for all services
- **Certificate Lifecycle**: Generate, renew, revoke, and monitor certificates
- **Trust Management**: Direct, indirect, and verified trust relationships
- **Age Encryption**: Modern encryption for sensitive data and keys
### **Network Security**
- **WireGuard VPN**: Secure peer-to-peer communication with key rotation
- **Firewall & NAT**: Granular control over network access and routing
- **Service Isolation**: Docker containers for each service
- **Input Validation**: All API endpoints validate and sanitize input
### **Data Protection**
- **Encrypted Storage**: Sensitive data encrypted at rest using Age/Fernet
- **Secure Communication**: TLS for all API endpoints and service communication
- **Access Control**: Role-based access for services and API endpoints
- **Audit Logging**: Comprehensive security event logging and monitoring
## 🧪 Testing & Quality Assurance
### **Test Coverage**
- **BaseServiceManager**: 100% coverage
- **ConfigManager**: 95%+ coverage
- **ServiceBus**: 95%+ coverage
- **LogManager**: 95%+ coverage
- **All Service Managers**: 77%+ overall coverage
- **API Endpoints**: 100% endpoint coverage
### **Test Types**
- **Unit Tests**: Individual component testing
- **Integration Tests**: Service interaction testing
- **API Tests**: Endpoint functionality testing
- **Error Handling**: Exception and edge case testing
- **Performance Tests**: Load and stress testing
### **Testing Commands**
```bash
# Run all tests
python api/test_enhanced_api.py
# Run specific test suites
python -m pytest api/tests/test_network_manager.py
python -m pytest api/tests/test_service_bus.py
# Generate coverage report
coverage run -m pytest api/tests/
coverage html
```
## 📝 Usage Examples
### **Add DNS Record**
```bash
curl -X POST http://localhost:3000/api/dns/records \
-H "Content-Type: application/json" \
-d '{
"name": "www",
"type": "A",
"value": "192.168.1.100",
"ttl": 300
}'
```
### **Register Peer**
```bash
curl -X POST http://localhost:3000/api/wireguard/peers \
-H "Content-Type: application/json" \
-d '{
"name": "bob",
"ip": "203.0.113.22",
"public_key": "peer_public_key_here",
"allowed_networks": ["10.0.0.0/24"]
}'
```
### **Generate Certificate**
```bash
curl -X POST http://localhost:3000/api/vault/certificates \
-H "Content-Type: application/json" \
-d '{
"common_name": "myapp.example.com",
"domains": ["myapp.example.com", "www.myapp.example.com"],
"days": 365
}'
```
### **Configure NAT Rule**
```bash
curl -X POST http://localhost:3000/api/routing/nat \
-H "Content-Type: application/json" \
-d '{
"source_network": "10.0.0.0/24",
"target_interface": "eth0",
"nat_type": "MASQUERADE",
"protocol": "ALL"
}'
```
## 🛠️ Development & Deployment
### **Development Setup**
```bash
# Install dependencies
pip install -r api/requirements.txt
# Start development server
python api/app.py
# Run tests
python api/test_enhanced_api.py
# Start frontend (if available)
cd webui && bun install && npm run dev
```
### **Production Deployment**
```bash
# Docker deployment
docker-compose up --build -d
# Health check
curl http://localhost:3000/health
# Service status
curl http://localhost:3000/api/services/status
```
### **Service Development**
```python
from datetime import datetime
from typing import Any, Dict

from base_service_manager import BaseServiceManager


class MyServiceManager(BaseServiceManager):
    def __init__(self, data_dir='/app/data', config_dir='/app/config'):
        super().__init__('myservice', data_dir, config_dir)

    def get_status(self) -> Dict[str, Any]:
        # Implement service status
        return {
            'running': True,
            'status': 'online',
            'timestamp': datetime.utcnow().isoformat()
        }

    def test_connectivity(self) -> Dict[str, Any]:
        # Implement connectivity test
        return {
            'success': True,
            'message': 'Service connectivity working',
            'timestamp': datetime.utcnow().isoformat()
        }
```
## 🚀 Future Enhancements
### **Planned Features**
- **Certificate Auto-renewal**: Automatic certificate renewal and monitoring
- **Web of Trust Models**: Advanced trust relationship management
- **Certificate Transparency**: CT log integration and monitoring
- **Hardware Security Module (HSM)**: HSM integration for key management
- **WebSocket Updates**: Real-time service status updates
- **Advanced Monitoring**: Metrics collection and alerting systems
- **Mobile App**: Mobile application for remote management
- **Plugin System**: Extensible architecture for custom services
### **Architecture Improvements**
- **Service Discovery**: Dynamic service registration and discovery
- **Load Balancing**: Multi-instance service deployment
- **Advanced Caching**: Redis-based caching for performance
- **Message Queues**: RabbitMQ/Kafka for reliable messaging
- **Distributed Tracing**: OpenTelemetry integration
- **Configuration Management**: GitOps-style configuration management
## 📊 Project Status
### **✅ Completed Features**
- **Production-Grade Architecture**: BaseServiceManager pattern implemented
- **Event-Driven Communication**: Service bus with real-time events
- **Centralized Configuration**: Type-safe configuration with validation
- **Comprehensive Logging**: Structured logging with search and export
- **Enhanced CLI**: Interactive CLI with batch operations
- **Health Monitoring**: Real-time health checks across all services
- **Security Framework**: Self-hosted CA, encryption, and trust management
- **Complete API**: RESTful API with comprehensive documentation
- **Testing Framework**: Comprehensive test suite with high coverage
### **🎯 Current Status**
- **All Services**: 10 service managers fully implemented and integrated
- **API Server**: Running on port 3000 with all endpoints functional
- **CLI Tool**: Enhanced CLI with all features working
- **Test Coverage**: 77%+ overall coverage with comprehensive testing
- **Documentation**: Complete documentation for all components
- **Production Ready**: Suitable for personal and small business deployment
### **🌟 Key Achievements**
- **Unified Architecture**: All services follow the same patterns and interfaces
- **Event-Driven Design**: Services communicate and orchestrate automatically
- **Configuration Management**: Centralized, validated configuration system
- **Comprehensive Logging**: Production-grade logging with advanced features
- **Enhanced CLI**: Professional command-line interface for management
- **Health Monitoring**: Real-time monitoring and alerting capabilities
- **Security Framework**: Enterprise-grade security with modern cryptography
- **Complete Testing**: Comprehensive test suite ensuring reliability
---
**The Personal Internet Cell empowers users with full control over their digital infrastructure, combining privacy, security, and usability in a single, production-ready, self-hosted platform.** 🌟
-239
@@ -1,239 +0,0 @@
# Quick Start
This guide walks through a first-time PIC installation from a clean Linux host.
---
## Prerequisites
- Linux host with the WireGuard kernel module (`modprobe wireguard` to verify)
- Docker Engine and Docker Compose installed
- Python 3.10+ (needed for `make setup` only)
- 2 GB+ RAM, 10 GB+ disk
---
## 1. Clone the repository
```bash
git clone <repo-url> pic
cd pic
```
---
## 2. Configure the environment
Copy the example environment file and edit it:
```bash
cp .env.example .env
```
Open `.env` and set at minimum:
```
WEBDAV_PASS=changeme
```
`WEBDAV_PASS` must be set before starting — the WebDAV container will fail to start without it.
All other variables have working defaults. See the Configuration section in [README.md](README.md) for the full list.
---
## 3. Run setup
`make setup` installs system dependencies, generates WireGuard keys, and writes all required config files under `config/`:
```bash
make check-deps # installs docker, python3-cryptography, etc. via apt
make setup # generates keys and writes configs
```
To customise the cell identity at setup time, pass overrides on the command line:
```bash
CELL_NAME=myhome CELL_DOMAIN=cell VPN_ADDRESS=10.0.0.1/24 WG_PORT=51820 make setup
```
`VPN_ADDRESS` must be an RFC-1918 address (e.g. `10.0.0.1/24`).
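To check a candidate `VPN_ADDRESS` before running setup, something like the following could be used. It is a sketch under the assumption that "RFC-1918" means exactly the three private IPv4 blocks; `is_rfc1918` is a hypothetical helper and is not part of `make setup` or `scripts/setup_cell.py`.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(vpn_address: str) -> bool:
    """Return True if an address such as '10.0.0.1/24' lies in an RFC 1918 block."""
    try:
        iface = ipaddress.ip_interface(vpn_address)
    except ValueError:
        return False  # not a parseable address/prefix
    if iface.version != 4:
        return False  # RFC 1918 only covers IPv4
    return any(iface.ip in net for net in RFC1918_NETWORKS)
```

The explicit network list is used instead of `is_private` because that attribute also accepts loopback and link-local ranges, which are not RFC 1918 addresses.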
---
## 4. Start the stack
```bash
make start
```
This builds the `cell-api` and `cell-webui` images and starts all 13 containers. The first run takes a few minutes while images are pulled and built.
Check that everything came up:
```bash
make status
```
You should see all containers in the `Up` state and the API responding at `http://localhost:3000/health`.
---
## 5. Open the web UI
Open a browser and go to:
```
http://<host-ip>:8081
```
If you are running locally:
```
http://localhost:8081
```
The sidebar contains: Dashboard, Peers, Network Services, WireGuard, Email, Calendar, Files, Routing, Vault, Containers, Cell Network, Logs, Settings.
---
## 6. Set cell identity
Go to **Settings** in the sidebar.
Set your:
- **Cell name** — a short identifier, e.g. `myhome`
- **Domain** — the TLD your cell will use internally, e.g. `cell`
- **VPN IP range** — the CIDR for WireGuard peers, e.g. `10.0.0.0/24`
After saving, the UI will show a banner asking you to apply the changes. Click **Apply Now**. The containers will restart briefly to pick up the new configuration.
---
## 7. Add a WireGuard peer
Go to **WireGuard** in the sidebar.
1. Click **Add Peer**.
2. Enter a name for the peer (e.g. `laptop`).
3. The API generates a key pair and assigns the next available VPN IP automatically.
4. Click the QR code icon to display the peer config as a QR code.
5. Scan the QR code with a WireGuard client (Android, iOS, or the WireGuard desktop app).
The peer config sets your cell as the DNS server. Once connected, `*.cell` names resolve through the cell's CoreDNS.
To manage peers from the command line:
```bash
make list-peers
make add-peer PEER_NAME=phone PEER_IP=10.0.0.3 PEER_KEY=<base64-pubkey>
```
---
## 8. Day-to-day operations
```bash
# Follow logs from all services
make logs
# Follow logs from a single service
make logs-api
make logs-wireguard
make logs-caddy
# Check container status and API health
make status
# Open a shell inside a container
make shell-api
make shell-dns
```
---
## 9. Backup
Before making significant changes, create a backup:
```bash
make backup
```
This archives `config/` and `data/` into `backups/cell-backup-<timestamp>.tar.gz`.
To list available backups:
```bash
make restore
```
To restore manually:
```bash
tar -xzf backups/cell-backup-YYYYMMDD-HHMMSS.tar.gz
make start
```
Backup and restore is also available in the UI under **Settings**.
---
## 10. Updating PIC
```bash
make update
```
This runs `git pull`, then rebuilds and restarts all containers. If `config/` is missing (e.g. after a fresh clone), it runs `make setup` automatically.
---
## Troubleshooting
**Containers not starting**
```bash
make logs
make logs-api
```
Look for errors related to missing config files or port conflicts.
**Port 53 already in use**
On Ubuntu/Debian, `systemd-resolved` listens on port 53. Disable it:
```bash
sudo systemctl disable --now systemd-resolved
sudo rm /etc/resolv.conf
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf
```
Then run `make start` again.
**WebDAV container exits immediately**
`WEBDAV_PASS` is not set in `.env`. Set it and run `make start` again.
**WireGuard container fails to load kernel module**
Ensure the WireGuard kernel module is available:
```bash
sudo modprobe wireguard
```
On some minimal installs you may need to install `wireguard-tools` and the kernel headers for your running kernel.
**API returns 503 or UI shows "Backend Unavailable"**
The Flask API may still be starting. Wait 10–15 seconds after `make start` and refresh. If it persists:
```bash
make logs-api
```
**Config changes not taking effect**
After changing identity or service settings in the UI, a yellow banner appears at the top of the page. Click **Apply Now** to restart the affected containers.
+105 -72
@@ -1,6 +1,6 @@
# Personal Internet Cell (PIC)
PIC is a self-hosted digital infrastructure platform. It manages DNS, DHCP, NTP, WireGuard VPN, email, calendar/contacts (CalDAV), file storage (WebDAV), a reverse proxy, and a certificate authority — all controlled from a single REST API and React web UI. No manual config file editing is required for normal operations.
PIC is a self-hosted digital infrastructure platform. It packages DNS, NTP, WireGuard VPN, a reverse proxy, a certificate authority, and optional third-party services (email, calendar/contacts, file storage, and more) — all managed through a single REST API and a React web UI. No manual config file editing is required for normal operations.
---
@@ -8,98 +8,132 @@ PIC is a self-hosted digital infrastructure platform. It manages DNS, DHCP, NTP,
```
Browser
└── React SPA (cell-webui :8081)
└── React SPA (cell-webui :8081, container port 8080)
└── Flask REST API (cell-api :3000, bound to 127.0.0.1)
└── Docker SDK / config files
├── cell-caddy :80/:443 reverse proxy
├── cell-dns :53 CoreDNS
├── cell-dhcp :67/udp dnsmasq
├── cell-ntp :123/udp chrony
├── cell-wireguard :51820/udp WireGuard VPN
├── cell-mail :25/:587/:993 Postfix + Dovecot
├── cell-radicale 127.0.0.1:5232 CalDAV/CardDAV
├── cell-webdav 127.0.0.1:8080 WebDAV
├── cell-rainloop :8888 webmail (RainLoop)
├── cell-filegator :8082 file manager UI
└── cell-webui :8081 React UI (Nginx)
└── Service managers + Docker SDK
├── cell-caddy :80/:443 Caddy reverse proxy (HTTPS/TLS)
├── cell-dns :53 CoreDNS
├── cell-ntp :123/udp chrony
├── cell-wireguard :51820/udp WireGuard VPN (NET_ADMIN only, not privileged)
└── cell-webui :8081→8080 React UI (Nginx)
(+ per-service containers, started when a service is installed)
```
All containers run on a custom Docker bridge network (`cell-network`, default `172.20.0.0/16`). Static IPs per container are set in `docker-compose.yml` and overridden via `.env`.
Six core containers run on a Docker bridge network (`cell-network`, default subnet `172.20.0.0/16`). Static IPs per container are set in `docker-compose.yml` and can be overridden via `.env`. Installed service containers join the same network with their own compose projects managed by `ServiceComposer`.
The Flask API (`api/app.py`, ~2800 lines) contains all REST endpoints, runs a background health-monitoring thread, and manages the entire lifecycle of generated config artefacts: `Caddyfile`, `Corefile`, `wg0.conf`, and `cell_config.json` (the single source of truth at `config/api/cell_config.json`).
The Flask API (`api/app.py`) contains REST endpoints and a background health-monitoring thread. Service managers are instantiated as singletons in `api/managers.py`. The single source of truth for runtime configuration is `config/api/cell_config.json`, managed by `ConfigManager`.
The React frontend (`webui/`) is built with Vite + Tailwind CSS. All API calls go through `src/services/api.js` (Axios). Pages: Dashboard, Peers, Network Services, WireGuard, Email, Calendar, Files, Routing, Vault, Containers, Cell Network, Logs, Settings.
The React frontend (`webui/`) is built with Vite + Tailwind CSS. All API calls go through `src/services/api.js` (Axios).
**Web UI pages:** Dashboard, Peers, Network Services, WireGuard, Connectivity (tunnels, proxies, SSH, Tor, cells, assignments), Services (store catalog + per-service pages), Routing, Vault, Containers, Activity, Logs, Settings — plus peer-facing My Services and Account pages.
---
## Features
- **First-run wizard** — browser-based setup at `/setup`. On first start, all API requests redirect to `/setup` (HTTP 428) until the wizard is completed. Sets cell name, domain mode, timezone, admin password, and initial services. No manual `.env` editing required for identity.
- **Session-based auth** — admin and peer roles. All `/api/*` endpoints require an authenticated session after setup. CSRF protection on all state-changing requests.
- **WireGuard VPN** — peer lifecycle management, automatic key generation, QR code config export, per-peer routing policy.
- **Caddy HTTPS** — automatic TLS via Let's Encrypt (DNS-01 or HTTP-01) or an internal CA, depending on domain mode.
- **DDNS (pic.ngo)** — registers a `<cell-name>.pic.ngo` subdomain. Supported providers: `pic_ngo`, `cloudflare`, `duckdns`. A background thread re-publishes the public IP every 5 minutes.
- **Service store** — install/remove optional third-party services from the `pic-services` index at `git.pic.ngo`. Manifests declare container images, Caddy routes, and iptables rules. Store images are digest-pinned and cosign-signed by the build pipeline; the cell verifies signatures before starting a container (enforced by default).
- **Extended connectivity** — named connection instances per exit type: WireGuard external, OpenVPN, Tor, sshuttle (SSH tunnel), or proxy (HTTP/SOCKS5 via redsocks), plus cell-relay through another cell. Peers are assigned per-peer to a connection with configurable fail-open/fail-closed; per-connection health is tracked. Per-service egress policy is also supported. Routing uses per-instance fwmarks and `ip rule` in the WireGuard container.
- **Cell-to-cell networking** — WireGuard-based site-to-site links between PIC cells with service-level access control (calendar, files, mail, WebDAV) and a peer-sync protocol.
- **Certificate authority** — `vault_manager` issues and revokes TLS certificates for internal services.
- **Network services** — CoreDNS (`.cell` TLD and split-horizon DNS for the cell domain), chrony NTP.
- **Split-horizon DNS** — from outside the VPN, the cell domain resolves to the public IP. Inside the VPN, CoreDNS resolves it to the WireGuard IP so traffic stays in the tunnel. Caddy serves on both interfaces.
- **Email** _(optional, install via Service Store)_ — Postfix + Dovecot via `docker-mailserver`.
- **Calendar/contacts** _(optional, install via Service Store)_ — Radicale CalDAV/CardDAV.
- **File storage** _(optional, install via Service Store)_ — WebDAV with per-user accounts; Filegator for browser-based file management.
- **Container manager** — start/stop/inspect containers, pull images, manage volumes via the Docker SDK.
- **Firewall manager** — iptables rule management (`firewall_manager.py`).
- **Structured logging** — JSON logs with rotation (5 MB / 5 backups per service), log search, and per-service verbosity control.
- **Audit log** — append-only, hash-chained change log of all admin actions, with CSV export and an Activity page in the UI.
- **Backup / restore** — full backup of config, secrets, key material, and live service data volumes, with optional passphrase encryption; ordered restore with automatic runtime reapply.
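The split-horizon behaviour listed above can be illustrated with a small sketch. This is not PIC's CoreDNS configuration: the function name, the subnet, and both addresses are hypothetical, and the only assumption is that VPN clients are recognised by a source address inside the WireGuard subnet.

```python
import ipaddress

# Hypothetical values for illustration only; a real cell derives these from its own config.
VPN_SUBNET = ipaddress.ip_network("10.0.0.0/24")
PUBLIC_IP = "203.0.113.10"   # documentation-range address, stands in for the public IP
WIREGUARD_IP = "10.0.0.1"    # stands in for the cell's address inside the tunnel


def resolve_cell_domain(client_ip: str) -> str:
    """Return the address a split-horizon resolver would hand to this client."""
    if ipaddress.ip_address(client_ip) in VPN_SUBNET:
        # Client is inside the VPN: answer with the tunnel address.
        return WIREGUARD_IP
    # Client is outside the VPN: answer with the public address.
    return PUBLIC_IP


print(resolve_cell_domain("10.0.0.7"))
print(resolve_cell_domain("198.51.100.4"))
```

The same name yields two answers depending on where the query comes from, which is why the reverse proxy has to serve on both interfaces.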
---
## Requirements
- Linux host with the WireGuard kernel module loaded
- Linux host with the WireGuard kernel module loaded (`modprobe wireguard` to verify; required — userspace WireGuard is not supported)
- Docker Engine and Docker Compose (v2 plugin or v1 standalone)
- Python 3.10+ (for `make setup` and local dev only; not needed at runtime)
- Python 3.10+ (for `make setup` and local development; not needed at runtime)
- 2 GB+ RAM, 10 GB+ disk
- Ports available: 53, 67/udp, 80, 443, 51820/udp, 25, 587, 993
- Ports available: 53, 80, 443, 51820/udp (plus 25, 587, 993 only when the email service is installed)
---
## Documentation
Full documentation lives in the [project wiki](https://git.pic.ngo/roof/pic/wiki) — installation walkthrough, admin guide (setup, domains/TLS, services, connectivity, peers, backup, logging/audit, troubleshooting), user guide, and developer documentation (architecture, API reference, building store services, testing).
---
## Quick Start
See [QUICKSTART.md](QUICKSTART.md) for step-by-step setup.
See the wiki's [Setup and First Run](https://git.pic.ngo/roof/pic/wiki/Admin-Setup) for step-by-step instructions.
The short version — one-line installer (recommended):
```bash
curl -fsSL https://install.pic.ngo | sudo bash
# open http://<host-ip>:8081/setup — the setup wizard appears automatically
```
Or clone manually for development:
```bash
git clone https://git.pic.ngo/roof/pic.git pic
cd pic
make start
# open http://<host-ip>:8081 — the setup wizard appears automatically
```
---
## Configuration
Runtime configuration is controlled by `.env` in the project root. Copy `.env.example` to `.env` before first run.
Port assignments and container IPs are configured in `.env` in the project root. A `.env` file is not required for first start — all variables have defaults. Create one only if you need to change ports or container IPs.
| Variable | Default | Description |
|---|---|---|
| `CELL_NETWORK` | `172.20.0.0/16` | Docker bridge subnet for all containers |
| `CADDY_IP` through `FILEGATOR_IP` | `172.20.0.2`–`.13` | Static IP for each container |
| `DNS_PORT` | `53` | DNS (UDP+TCP) |
| `DHCP_PORT` | `67` | DHCP (UDP) |
| `CELL_NETWORK` | `172.20.0.0/16` | Docker bridge subnet |
| `CADDY_IP` through `WG_IP` | `172.20.0.2`–`.11` | Static IP per core container |
| `DNS_PORT` | `53` | DNS (UDP + TCP) |
| `NTP_PORT` | `123` | NTP (UDP) |
| `WG_PORT` | `51820` | WireGuard listen port (UDP) |
| `API_PORT` | `3000` | Flask API (bound to `127.0.0.1`) |
| `WEBUI_PORT` | `8081` | React UI |
| `MAIL_SMTP_PORT` | `25` | SMTP |
| `MAIL_SUBMISSION_PORT` | `587` | SMTP submission |
| `MAIL_IMAP_PORT` | `993` | IMAP |
| `RADICALE_PORT` | `5232` | CalDAV (bound to `127.0.0.1`) |
| `WEBDAV_PORT` | `8080` | WebDAV (bound to `127.0.0.1`) |
| `RAINLOOP_PORT` | `8888` | Webmail |
| `FILEGATOR_PORT` | `8082` | File manager UI |
| `WEBDAV_USER` | `admin` | WebDAV basic-auth username |
| `WEBDAV_PASS` | _(required)_ | WebDAV basic-auth password — must be set before `make start` |
| `FLASK_DEBUG` | _(unset)_ | Set to `1` to enable Flask debug mode; do not use in production |
| `API_PORT` | `3000` | Flask API (127.0.0.1 only) |
| `WEBUI_PORT` | `8081` | Host port mapped to container port 8080 |
| `FLASK_DEBUG` | _(unset)_ | Set to `1` for Flask debug mode; do not use in production |
| `PUID` / `PGID` | current user | UID/GID passed to the WireGuard container |
Cell identity (cell name, domain, VPN IP range) is configured via `make setup` or the Settings → Identity page in the UI after startup. The VPN IP range must be an RFC-1918 CIDR (`10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`); the API and UI both enforce this.
Cell identity (cell name, domain mode, timezone) is set through the first-run wizard on first start, or later through the Settings page in the UI.
---
## Security Notes
## Security
**Ports exposed to the network:**
**Ports exposed on all interfaces by default:**
- `80` / `443` — Caddy (HTTP/HTTPS reverse proxy)
- `51820/udp` — WireGuard
- `25` / `587` / `993` — Mail (SMTP, submission, IMAP)
- `53` — DNS (UDP + TCP)
- `67/udp` — DHCP
- `53` — DNS
- `8081` — Web UI
- `8888` — Webmail (RainLoop)
- `8082` — File manager (Filegator)
- `25` / `587` / `993` — mail _(only when the email service is installed)_
**Ports bound to `127.0.0.1` only** (not directly reachable from the network):
**Ports bound to `127.0.0.1` only:**
- `3000` — Flask API
- `5232` — Radicale (CalDAV)
- `8080` — WebDAV
The API has no authentication layer. It relies on `is_local_request()` to restrict sensitive endpoints (containers, vault) to requests originating from loopback or the cell's Docker network. The Docker socket is mounted into `cell-api`; treat access to port 3000 as equivalent to root access on the host.
The API uses session-based authentication (admin and peer roles). The Docker socket is mounted into `cell-api`; treat access to port 3000 as equivalent to root access on the host.
For internet-facing deployments, place the host behind a firewall or VPN and restrict access to the API and UI ports.
Before setup is complete, all `/api/*` requests except `/api/setup/*` and `/health` return HTTP 428 and a redirect to `/setup`.
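The pre-setup gate can be sketched as a pure function. This is an illustration of the stated rule rather than the code in `api/app.py`; `setup_gate` and its return shape are hypothetical, and non-API paths such as `/health` are assumed to pass through untouched.

```python
from typing import Optional, Tuple


def setup_gate(path: str, setup_complete: bool) -> Optional[Tuple[dict, int]]:
    """Return a (body, status) pair when the request must be blocked, else None."""
    if setup_complete:
        return None
    if not path.startswith("/api/"):
        # Non-API paths, including /health, are assumed to be outside the gate.
        return None
    if path.startswith("/api/setup/"):
        # The wizard's own endpoints must stay reachable before setup completes.
        return None
    return {"error": "Setup required", "redirect": "/setup"}, 428


print(setup_gate("/api/peers", setup_complete=False))
print(setup_gate("/api/setup/status", setup_complete=False))
```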
CSRF protection (double-submit token in `X-CSRF-Token` header) applies to all `POST`, `PUT`, `DELETE`, and `PATCH` requests on `/api/*` once a user session exists, except `/api/auth/*` and `/api/setup/*`.
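As a rough sketch of the rule above, the two helpers below decide whether the CSRF check applies to a request and compare the submitted token in constant time. The helper names are hypothetical, and where the expected token is stored (session or cookie) is an assumption, not something this README specifies.

```python
import hmac
from typing import Optional

STATE_CHANGING_METHODS = {"POST", "PUT", "DELETE", "PATCH"}
EXEMPT_PREFIXES = ("/api/auth/", "/api/setup/")


def csrf_check_applies(method: str, path: str, has_session: bool) -> bool:
    """True when the request falls under the CSRF rule described above."""
    if not has_session or method.upper() not in STATE_CHANGING_METHODS:
        return False
    if not path.startswith("/api/"):
        return False
    return not path.startswith(EXEMPT_PREFIXES)


def csrf_token_matches(expected: str, header_value: Optional[str]) -> bool:
    """Compare the expected token with the X-CSRF-Token header value."""
    if not header_value:
        return False
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode(), header_value.encode())


print(csrf_check_applies("POST", "/api/peers", has_session=True))
print(csrf_token_matches("token-a", "token-a"))
```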
Cell-to-cell peer-sync endpoints (`/api/cells/peer-sync/*`) authenticate via source IP and WireGuard public key, not session cookies.
For internet-facing deployments, place the host behind a firewall and restrict access to the API and UI ports.
---
@@ -123,7 +157,7 @@ cd webui && npm install && npm run dev
# Follow all container logs
make logs
# Follow logs for one service (e.g. api, dns, caddy, wireguard, mail)
# Follow logs for one service
make logs-api
# Open a shell inside a container
@@ -135,41 +169,38 @@ make shell-api
## Testing
```bash
make test # run the full pytest suite
make test # run all unit tests (pytest, excludes e2e and integration)
make test-coverage # run with coverage; HTML report in htmlcov/
make test-api # run API endpoint tests only
```
Tests live in `tests/` (34 files, 642 test functions). Coverage includes:
- All service managers (network, WireGuard, email, calendar, file, routing, vault, container)
- API endpoint tests for each service area
- Config manager (CRUD, validation, backup/restore)
- IP utilities and Caddyfile generation
- Peer registry and WireGuard peer lifecycle
- Service bus pub/sub
- Firewall manager
- Pending-restart logic
Integration tests (`tests/integration/`) require a running PIC stack:
Tests live in `tests/`. Integration tests require a running stack:
```bash
make test-integration # full suite (creates peers)
make test-integration # full suite (creates peers, modifies state)
make test-integration-readonly # read-only checks, safe to run anytime
```
End-to-end tests use Playwright:
```bash
make test-e2e-deps # install Playwright and dependencies (run once)
make test-e2e-api # API-level e2e tests
make test-e2e-ui # UI-level e2e tests
```
---
## Management Commands
```bash
make setup # generate WireGuard keys, write configs, create data dirs
make start # docker compose up -d --build
make start # docker compose up -d --build (full profile)
make stop # docker compose down
make restart # docker compose restart
make status # container status + API health check
make logs # follow all service logs
make logs-<svc> # follow logs for one service
make shell-<svc> # shell inside a container
make logs-<svc> # follow logs for one service (e.g. make logs-api)
make shell-<svc> # shell inside a container (e.g. make shell-api)
make update # git pull + rebuild + restart
make reinstall # full wipe of config/ and data/, then setup + start
@@ -180,11 +211,13 @@ make restore # list available backups
make list-peers # show WireGuard peers via API
make show-routes # wg show inside the wireguard container
make add-peer PEER_NAME=foo PEER_IP=10.0.0.5 PEER_KEY=<pubkey>
make show-admin-password # print current admin password
make reset-admin-password # generate and set a new random admin password
```
---
## License
MIT — see [LICENSE](LICENSE).
MIT.
BIN
Binary file not shown.
File diff suppressed because it is too large
+24 -24
@@ -1,35 +1,35 @@
FROM python:3.11-slim
FROM docker:27-cli@sha256:851f91d241214e7c6db86513b270d58776379aacc5eb9c4a87e5b47115e3065c AS dockercli
FROM gcr.io/projectsigstore/cosign:v2.4.1@sha256:b03690aa52bfe94054187142fba24dc54137650682810633901767d8a3e15b31 AS cosign
FROM python:3.11-slim@sha256:a3ab0b966bc4e91546a033e22093cb840908979487a9fc0e6e38295747e49ac0
WORKDIR /app/api
# Install system dependencies
RUN apt-get update && apt-get install -y \
wireguard-tools \
iptables \
iproute2 \
util-linux \
curl \
ca-certificates \
gnupg \
lsb-release \
&& curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null \
&& apt-get update \
&& apt-get install -y docker-ce-cli \
&& rm -rf /var/lib/apt/lists/*
# The API runs as root by design: it drives iptables, the docker socket, and
# docker-execs into sibling containers. Non-root is not feasible here.
COPY --from=dockercli /usr/local/bin/docker /usr/local/bin/docker
# cosign verifies store-service image signatures against the bundled public key
# (config/cosign/cosign.pub) before ServiceComposer starts a container.
COPY --from=cosign /ko-app/cosign /usr/local/bin/cosign
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
wireguard-tools \
iptables \
iproute2 \
util-linux \
curl \
ca-certificates \
&& rm -rf /var/lib/apt/lists/* \
&& mkdir -p /app/data /app/config
# Copy requirements first for better caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy all application code into /app/api
COPY . .
# Create necessary directories
RUN mkdir -p /app/data /app/config
# Expose port
EXPOSE 3000
# Run the application
CMD ["python", "app.py"]
CMD ["python", "app.py"]
+298
@@ -0,0 +1,298 @@
"""
AccountManager — per-service credential provisioning for PIC peers.

Responsibilities:
  - Dispatch account creation/deletion to each service's underlying manager
  - Store per-peer per-service credentials securely (0o600 file)
  - Provide credential retrieval for peer_config_template filling
  - Bulk-deprovision a peer from all services on peer deletion

Credentials file format (data/peer_service_credentials.json):
  {
    "<service_id>": {
      "<peer_username>": {"password": "..."}
    }
  }

Design note — plaintext passwords:
  Credentials are stored in plaintext so the peer endpoint can return them to
  the peer's device for one-time client configuration. The file is created with
  0o600 so it is only readable by the process owner (same pattern used for
  WireGuard keys and service_secrets.json).
"""

import json
import logging
import os
import secrets as _secrets_mod
import threading
from pathlib import Path
from typing import Dict, List, Optional

try:
    import requests as _requests
except ImportError:
    _requests = None

logger = logging.getLogger('picell')

_DISPATCH_PROVISION = {
    'email_manager': '_provision_email',
    'calendar_manager': '_provision_calendar',
    'file_manager': '_provision_files',
}

_DISPATCH_DEPROVISION = {
    'email_manager': '_deprovision_email',
    'calendar_manager': '_deprovision_calendar',
    'file_manager': '_deprovision_files',
}

_HTTP_TIMEOUT = 10


class AccountManager:
    def __init__(self, service_registry, data_dir: str, config_manager=None, **managers):
        """
        service_registry — ServiceRegistry instance
        data_dir — host data directory (data/peer_service_credentials.json lives here)
        config_manager — ConfigManager instance (used to resolve fallback email domain)
        **managers — named manager instances: email_manager=..., calendar_manager=...,
                     file_manager=...
        """
        self._registry = service_registry
        self._creds_path = Path(data_dir) / 'peer_service_credentials.json'
        self._config_manager = config_manager
        self._managers = managers
        self._lock = threading.Lock()

    # ── Credential storage (0o600) ────────────────────────────────────────
    def _load_creds(self) -> Dict:
        if not self._creds_path.exists():
            return {}
        try:
            with open(self._creds_path) as f:
                return json.load(f)
        except (OSError, json.JSONDecodeError) as e:
            logger.warning('AccountManager: failed to load credentials: %s', e)
            return {}

    def _save_creds(self, creds: Dict) -> None:
        tmp = str(self._creds_path) + '.tmp'
        with open(tmp, 'w', opener=lambda path, flags: os.open(path, flags, 0o600)) as f:
            json.dump(creds, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, str(self._creds_path))

    # ── Per-manager provision / deprovision ───────────────────────────────
    def _provision_email(self, manager, svc: Dict, peer_username: str, password: str) -> bool:
        domain = (svc.get('config') or {}).get('domain', '')
        if not domain and self._config_manager is not None:
            domain = self._config_manager.get_effective_domain() or ''
        if not domain:
            raise ValueError("Email service has no 'domain' configured")
        return manager.create_email_user(peer_username, domain, password)

    def _deprovision_email(self, manager, svc: Dict, peer_username: str) -> bool:
        domain = (svc.get('config') or {}).get('domain', '')
        return manager.delete_email_user(peer_username, domain)

    @staticmethod
    def _provision_calendar(manager, _svc: Dict, peer_username: str, password: str) -> bool:
        return manager.create_calendar_user(peer_username, password)

    @staticmethod
    def _deprovision_calendar(manager, _svc: Dict, peer_username: str) -> bool:
        return manager.delete_calendar_user(peer_username)

    @staticmethod
    def _provision_files(manager, _svc: Dict, peer_username: str, password: str) -> bool:
        return manager.create_user(peer_username, password)

    @staticmethod
    def _deprovision_files(manager, _svc: Dict, peer_username: str) -> bool:
        return manager.delete_user(peer_username)

    # ── HTTP dispatch (manager == "http") ────────────────────────────────
    @staticmethod
    def _http_base_url(svc: Dict) -> str:
        """Return the base URL for the service's /service-api endpoint."""
        backend = svc.get('backend', '')
        if not backend:
            raise ValueError(f"Service {svc.get('id')!r} has no 'backend' configured")
        return f'http://{backend}'

    def _provision_http(self, svc: Dict, peer_username: str, password: str) -> bool:
        if _requests is None:
            raise RuntimeError('requests library is required for HTTP account dispatch')
        url = self._http_base_url(svc) + '/service-api/accounts'
        try:
            resp = _requests.post(
                url,
                json={'username': peer_username, 'password': password},
                timeout=_HTTP_TIMEOUT,
            )
            if resp.status_code in (200, 201):
                return True
            logger.warning('HTTP provision %s on %s returned %s: %s',
                           peer_username, svc.get('id'), resp.status_code, resp.text[:200])
            return False
        except Exception as exc:
            raise RuntimeError(f'HTTP provision request failed: {exc}') from exc

    def _deprovision_http(self, svc: Dict, peer_username: str) -> bool:
        if _requests is None:
            raise RuntimeError('requests library is required for HTTP account dispatch')
        url = self._http_base_url(svc) + f'/service-api/accounts/{peer_username}'
        try:
            resp = _requests.delete(url, timeout=_HTTP_TIMEOUT)
            if resp.status_code in (200, 204, 404):
                return True
            logger.warning('HTTP deprovision %s on %s returned %s: %s',
                           peer_username, svc.get('id'), resp.status_code, resp.text[:200])
            return False
        except Exception as exc:
            raise RuntimeError(f'HTTP deprovision request failed: {exc}') from exc

    # ── Service validation helper ─────────────────────────────────────────
    def _resolve_service(self, service_id: str):
        """Return (svc, manager_name, manager) or raise ValueError.

        manager is None when manager_name == 'http' — callers must check.
        """
        svc = self._registry.get(service_id)
        if svc is None:
            raise ValueError(f'Unknown service: {service_id!r}')
        accounts_cfg = svc.get('accounts') or {}
        manager_name = accounts_cfg.get('manager')
        if not manager_name:
            raise ValueError(f'Service {service_id!r} does not support accounts')
        if manager_name == 'http':
            return svc, 'http', None
        manager = self._managers.get(manager_name)
        if manager is None:
            raise ValueError(f'Manager {manager_name!r} is not registered with AccountManager')
        return svc, manager_name, manager

    # ── Public API ────────────────────────────────────────────────────────
    def provision(self, service_id: str, peer_username: str,
                  password: str = None) -> Dict:
        """Create an account on the service for the peer; store and return credentials.

        Raises ValueError if the service doesn't support accounts.
        Raises RuntimeError if the underlying manager fails.
        """
        svc, manager_name, manager = self._resolve_service(service_id)
        if password is None:
            password = _secrets_mod.token_urlsafe(16)
        if manager_name == 'http':
            ok = self._provision_http(svc, peer_username, password)
        else:
            dispatch = _DISPATCH_PROVISION.get(manager_name)
            if dispatch is None:
                raise ValueError(f'No provision dispatch for manager: {manager_name!r}')
            ok = getattr(self, dispatch)(manager, svc, peer_username, password)
        if not ok:
            raise RuntimeError(
                f'Provision of {peer_username!r} on {service_id!r} returned False — '
                'check underlying service manager logs'
            )
        cred = {'password': password}
        with self._lock:
            all_creds = self._load_creds()
            all_creds.setdefault(service_id, {})[peer_username] = cred
            self._save_creds(all_creds)
        logger.info('AccountManager: provisioned %s on %s', peer_username, service_id)
        return cred

    def deprovision(self, service_id: str, peer_username: str) -> bool:
        """Delete the peer's account on the service and clear stored credentials."""
        svc, manager_name, manager = self._resolve_service(service_id)
        if manager_name == 'http':
            ok = self._deprovision_http(svc, peer_username)
        else:
            dispatch = _DISPATCH_DEPROVISION.get(manager_name)
            if dispatch is None:
                raise ValueError(f'No deprovision dispatch for manager: {manager_name!r}')
            ok = getattr(self, dispatch)(manager, svc, peer_username)
        with self._lock:
            all_creds = self._load_creds()
            svc_creds = all_creds.get(service_id, {})
            if peer_username in svc_creds:
                del svc_creds[peer_username]
                if not svc_creds:
                    del all_creds[service_id]
                self._save_creds(all_creds)
        logger.info('AccountManager: deprovisioned %s from %s', peer_username, service_id)
        return bool(ok)

    def get_credentials(self, service_id: str, peer_username: str) -> Optional[Dict]:
        """Return stored credentials for peer+service, or None if not provisioned."""
        with self._lock:
            return self._load_creds().get(service_id, {}).get(peer_username)

    def list_accounts(self, service_id: str) -> List[str]:
        """Return peer usernames provisioned on a service."""
        with self._lock:
            return list(self._load_creds().get(service_id, {}).keys())

    def list_peer_services(self, peer_username: str) -> List[str]:
        """Return service IDs where this peer has a provisioned account."""
        with self._lock:
            creds = self._load_creds()
        return [svc_id for svc_id, peers in creds.items() if peer_username in peers]

    def is_provisioned(self, service_id: str, peer_username: str) -> bool:
        return self.get_credentials(service_id, peer_username) is not None

    def deprovision_peer(self, peer_username: str) -> Dict[str, bool]:
        """Remove a peer from every service they are provisioned on.

        Called on peer deletion. Continues even if individual services fail.
        Returns {service_id: success} for each service attempted.
        """
        results: Dict[str, bool] = {}
        for service_id in self.list_peer_services(peer_username):
            try:
                results[service_id] = self.deprovision(service_id, peer_username)
            except Exception as e:
                logger.warning('AccountManager: deprovision %s from %s failed: %s',
                               peer_username, service_id, e)
                results[service_id] = False
        return results

    def get_all_credentials(self, peer_username: str) -> Dict[str, Dict]:
        """Return {service_id: {field: value}} for all services the peer is provisioned on."""
        with self._lock:
            creds = self._load_creds()
        return {
            svc_id: peers[peer_username]
            for svc_id, peers in creds.items()
            if peer_username in peers
        }

    def store_credentials(self, service_id: str, peer_username: str,
                          cred: Dict) -> None:
        """Directly store credentials without calling the underlying manager.

        Used when a peer was provisioned through the legacy peers-POST route
        so that their credentials become retrievable via AccountManager.
        """
        with self._lock:
            all_creds = self._load_creds()
            all_creds.setdefault(service_id, {})[peer_username] = cred
            self._save_creds(all_creds)
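The write pattern used by `_save_creds` (a temporary file opened with mode `0o600`, `fsync`, then `os.replace`) can be tried in isolation. The sketch below is a standalone restatement that assumes a POSIX filesystem; `write_private_json` is a hypothetical helper name and the sample payload is made up.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_private_json(path: Path, payload: dict) -> None:
    """Write payload to path via an owner-only temp file and an atomic rename."""
    tmp = str(path) + ".tmp"

    def _opener(p, flags):
        # Mode 0o600 is applied when the temp file is created.
        return os.open(p, flags, 0o600)

    with open(tmp, "w", opener=_opener) as f:
        json.dump(payload, f, indent=2)
        f.flush()
        os.fsync(f.fileno())   # push the bytes to disk before the rename
    os.replace(tmp, str(path))  # readers see either the old or the new file


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "peer_service_credentials.json"
    write_private_json(target, {"calendar": {"alice": {"password": "example"}}})
    print(oct(stat.S_IMODE(target.stat().st_mode)))
    print(json.loads(target.read_text()))
```

One caveat worth keeping in mind: the mode argument only takes effect when the temporary file is newly created, so a stale `.tmp` file left behind with wider permissions would keep them.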
+568 -20
@@ -44,6 +44,10 @@ from managers import (
caddy_manager,
ddns_manager, service_store_manager,
connectivity_manager,
service_registry,
service_composer,
account_manager,
audit_manager,
firewall_manager, EventType,
)
# Re-exports: tests do `from app import CellManager` and `from app import _resolve_peer_dns`
@@ -51,12 +55,23 @@ from cell_manager import CellManager
from wireguard_manager import _resolve_peer_dns
from port_registry import PORT_FIELDS, detect_conflicts
import auth_routes
from legacy_cleanup import cleanup_legacy_builtin_containers
# Context variable for request info
request_context = contextvars.ContextVar('request_context', default={})
# Set default log level and log file if not already defined
LOG_LEVEL = globals().get('LOG_LEVEL', 'INFO')
def _resolve_root_log_level():
    """Resolve the root python log level from PIC_LOG_LEVEL env, then the
    ConfigManager logging.python.root setting, defaulting to INFO."""
    env_level = os.environ.get('PIC_LOG_LEVEL', '').strip().upper()
    if env_level in ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'):
        return env_level
    try:
        return config_manager.get_logging_config()['python']['root']
    except Exception:
        return 'INFO'
LOG_LEVEL = _resolve_root_log_level()
LOG_FILE = globals().get('LOG_FILE', 'picell.log')
class ContextFilter(logging.Filter):
@@ -107,6 +122,23 @@ logging.basicConfig(
)
logger = logging.getLogger('picell')
def apply_root_log_level(level=None):
    """(Re)apply the root python log level at runtime.

    Sets the ROOT logger level and every root handler level so that bare-module
    loggers (e.g. firewall_manager, network_manager) — which log via
    logging.getLogger(__name__) and propagate to root — are governed. When
    ``level`` is None the level is re-resolved from env/ConfigManager.
    """
    resolved = (level or _resolve_root_log_level()).upper()
    numeric = getattr(logging, resolved, logging.INFO)
    root = logging.getLogger()
    root.setLevel(numeric)
    for h in root.handlers:
        h.setLevel(numeric)
    return resolved
# Flask app setup
app = Flask(__name__)
CORS(app,
@@ -183,6 +215,13 @@ def enforce_setup():
return jsonify({'error': 'Setup required', 'redirect': '/setup'}), 428
# Read-only endpoints accessible to peer-role sessions (not just admin).
# Add paths here when peers need to read shared cell state.
_PEER_READABLE_PATHS = frozenset({
'/api/services/active',
})
@app.before_request
def enforce_auth():
"""Enforce session-based authentication and role-based access control.
@@ -199,8 +238,8 @@ def enforce_auth():
backward-compatibility with pre-auth test suites.
"""
path = request.path
# Always allow non-API paths and auth namespace
if not path.startswith('/api/') or path.startswith('/api/auth/'):
# Always allow non-API paths, auth namespace, and setup namespace
if not path.startswith('/api/') or path.startswith('/api/auth/') or path.startswith('/api/setup/'):
return None
# Cell peer-sync endpoints authenticate via source IP + WG pubkey — not session
if path.startswith('/api/cells/peer-sync/'):
@@ -216,10 +255,6 @@ def enforce_auth():
return None
users = auth_manager.list_users()
if not users:
# Only fail closed when the auth file is readable but empty —
# that's an explicit misconfiguration. If the file is missing or
# unreadable (test env, wrong host path, permission denied), bypass
# so pre-auth test suites continue to work.
users_file = getattr(auth_manager, '_users_file', None)
if users_file:
try:
@@ -238,6 +273,8 @@ def enforce_auth():
if path.startswith('/api/peer/'):
if role != 'peer':
return jsonify({'error': 'Forbidden'}), 403
elif path in _PEER_READABLE_PATHS:
pass # both admin and peer may read these endpoints
else:
if role != 'admin':
return jsonify({'error': 'Forbidden'}), 403
@@ -282,6 +319,214 @@ def log_request(response):
logger.info(f"{ctx.get('method')} {ctx.get('path')} {ctx.get('status')}")
return response
# ── Audit trail ─────────────────────────────────────────────────────────────
# Mutating endpoints that must NOT be audited: read-shaped POSTs (searches,
# exports, port checks, history clears) and namespaces handled elsewhere.
_NO_AUDIT_ENDPOINTS = frozenset({
# Read-shaped POSTs / diagnostics — not state changes worth auditing.
'services.search_logs',
'services.export_logs',
'services.rotate_logs',
'wireguard.check_wireguard_port',
'wireguard.test_wireguard_connectivity',
'wireguard.get_peer_config',
'wireguard.get_peer_status',
'wireguard.refresh_external_ip',
'network.test_network',
'routing.test_routing_connectivity',
'clear_health_history',
'peers.ip_update',
})
# Map (METHOD, endpoint) -> (action, target_type, target_id_view_arg).
# target_id_view_arg names a view_arg used as the target id, or None for a
# resource-level action. Endpoint is request.url_rule.endpoint
# ('<blueprint>.<func>' for blueprint routes, '<func>' for app routes).
ROUTE_ACTION_MAP = {
# config
('PUT', 'config.update_config'): ('config.update', 'config', None),
('POST', 'config.apply_pending_config'): ('config.apply', 'config', None),
('DELETE', 'config.cancel_pending_config'): ('config.cancel_pending', 'config', None),
('POST', 'config.import_config'): ('config.import', 'config', None),
('POST', 'config.create_config_backup'): ('backup.create', 'backup', None),
('POST', 'config.restore_config'): ('backup.restore', 'backup', 'backup_id'),
('POST', 'config.upload_backup'): ('backup.upload', 'backup', None),
('DELETE', 'config.delete_config_backup'): ('backup.delete', 'backup', 'backup_id'),
# ddns
('PUT', 'config.update_ddns_config'): ('ddns.update', 'ddns', None),
('POST', 'config.ddns_register'): ('ddns.register', 'ddns', None),
('POST', 'config.ddns_sync_records'): ('ddns.sync', 'ddns', None),
# peers
('POST', 'peers.add_peer'): ('peer.create', 'peer', None),
('PUT', 'peers.update_peer'): ('peer.update', 'peer', 'peer_name'),
('PUT', 'peers.set_peer_route_via'): ('peer.route_via', 'peer', 'peer_name'),
('DELETE', 'peers.remove_peer'): ('peer.delete', 'peer', 'peer_name'),
('POST', 'peers.register_peer'): ('peer.register', 'peer', None),
('DELETE', 'peers.unregister_peer'): ('peer.unregister', 'peer', 'peer_name'),
('PUT', 'peers.update_peer_ip_registry'): ('peer.update_ip', 'peer', 'peer_name'),
('POST', 'peers.clear_peer_reinstall'): ('peer.clear_reinstall', 'peer', 'peer_name'),
# wireguard
('POST', 'wireguard.generate_peer_keys'): ('wireguard.peer_keys', 'wireguard', None),
('POST', 'wireguard.add_wireguard_peer'): ('wireguard.peer_add', 'wireguard', None),
('DELETE', 'wireguard.remove_wireguard_peer'): ('wireguard.peer_remove', 'wireguard', None),
('PUT', 'wireguard.update_peer_ip'): ('wireguard.peer_ip', 'wireguard', None),
('POST', 'wireguard.setup_network'): ('wireguard.network_setup', 'wireguard', None),
('PUT', 'wireguard.set_wireguard_endpoint'): ('wireguard.endpoint', 'wireguard', None),
('POST', 'wireguard.apply_wireguard_enforcement'): ('wireguard.apply_enforcement', 'wireguard', None),
# services (catalog + bus)
('POST', 'services.restart_service_containers'): ('service.restart', 'service', 'service_id'),
('POST', 'services.reconfigure_service'): ('service.reconfigure', 'service', 'service_id'),
('POST', 'services.provision_service_account'): ('account.create', 'account', 'service_id'),
('DELETE', 'services.deprovision_service_account'): ('account.delete', 'account', 'service_id'),
('POST', 'services.start_service'): ('service.start', 'service', 'service_name'),
('POST', 'services.stop_service'): ('service.stop', 'service', 'service_name'),
('POST', 'services.restart_service'): ('service.restart', 'service', 'service_name'),
# service store
('POST', 'service_store.install_service'): ('service.install', 'service', 'service_id'),
('DELETE', 'service_store.remove_service'): ('service.remove', 'service', 'service_id'),
('POST', 'service_store.refresh_index'): ('service.store_refresh', 'service', None),
# built-in service accounts (email / calendar / files)
('POST', 'email.create_email_user'): ('account.create', 'account', None),
('DELETE', 'email.delete_email_user'): ('account.delete', 'account', 'username'),
('POST', 'calendar.create_calendar_user'): ('account.create', 'account', None),
('DELETE', 'calendar.delete_calendar_user'): ('account.delete', 'account', 'username'),
('POST', 'files.create_file_user'): ('account.create', 'account', None),
('DELETE', 'files.delete_file_user'): ('account.delete', 'account', 'username'),
# vault / certs / secrets / trust
('POST', 'vault.generate_certificate'): ('vault.cert_issue', 'certificate', None),
('DELETE', 'vault.revoke_certificate'): ('vault.cert_revoke', 'certificate', 'common_name'),
('POST', 'vault.store_secret'): ('vault.secret_store', 'secret', None),
('DELETE', 'vault.delete_secret'): ('vault.secret_delete', 'secret', 'name'),
('POST', 'vault.add_trusted_key'): ('vault.trust_key_add', 'trust', None),
('DELETE', 'vault.remove_trusted_key'): ('vault.trust_key_remove', 'trust', 'name'),
# caddy
('POST', 'caddy_cert_renew'): ('caddy.cert_renew', 'caddy', None),
('POST', 'caddy_upload_custom_cert'): ('caddy.custom_cert', 'caddy', None),
# connectivity
('POST', 'connectivity_upload_wireguard'): ('connection.exit_wireguard', 'connection', None),
('POST', 'connectivity_upload_openvpn'): ('connection.exit_openvpn', 'connection', None),
('POST', 'connectivity_configure_sshuttle'): ('connection.exit_sshuttle', 'connection', None),
('POST', 'connectivity_configure_proxy'): ('connection.exit_proxy', 'connection', None),
('PUT', 'connectivity_set_peer_exit'): ('connection.peer_exit_set', 'peer', 'peer_name'),
('POST', 'connectivity_create_connection'): ('connection.create', 'connection', None),
('PUT', 'connectivity_update_connection'): ('connection.update', 'connection', 'conn_id'),
('DELETE', 'connectivity_delete_connection'): ('connection.delete', 'connection', 'conn_id'),
('PUT', 'connectivity_set_peer_failopen'): ('peer.failopen', 'peer', 'peer_name'),
# egress
('PUT', 'egress_set_service_exit'): ('egress.service_exit_set', 'service', 'service_id'),
# cells
('POST', 'cells.add_cell_connection'): ('cell.create', 'cell', None),
('DELETE', 'cells.remove_cell_connection'): ('cell.delete', 'cell', 'cell_name'),
('PUT', 'cells.update_cell_permissions'): ('cell.permissions_set', 'cell', 'cell_name'),
('PUT', 'cells.set_exit_offer'): ('cell.exit_offer', 'cell', 'cell_name'),
# network / dns
('POST', 'network.add_dns_record'): ('network.dns_record_add', 'dns', None),
('DELETE', 'network.remove_dns_record'): ('network.dns_record_remove', 'dns', None),
# routing
('POST', 'routing.setup_routing'): ('network.routing_setup', 'routing', None),
('POST', 'routing.add_nat_rule'): ('network.nat_add', 'routing', None),
('DELETE', 'routing.remove_nat_rule'): ('network.nat_remove', 'routing', 'rule_id'),
('POST', 'routing.add_peer_route'): ('network.peer_route_add', 'routing', None),
('DELETE', 'routing.remove_peer_route'): ('network.peer_route_remove', 'routing', 'peer_name'),
('POST', 'routing.add_firewall_rule'): ('network.firewall_add', 'routing', None),
('DELETE', 'routing.remove_firewall_rule'): ('network.firewall_remove', 'routing', 'rule_id'),
('POST', 'routing.add_exit_node'): ('network.exit_node_add', 'routing', None),
('POST', 'routing.add_bridge_route'): ('network.bridge_add', 'routing', None),
('POST', 'routing.add_split_route'): ('network.split_add', 'routing', None),
# containers
('POST', 'containers.create_container'): ('container.create', 'container', None),
('DELETE', 'containers.remove_container'): ('container.remove', 'container', 'name'),
('POST', 'containers.restart_container'): ('container.restart', 'container', 'name'),
('POST', 'containers.start_container'): ('container.start', 'container', 'name'),
('POST', 'containers.stop_container'): ('container.stop', 'container', 'name'),
}
def _audit_actor_ip():
"""Derive (actor, role, ip) for the current request, mirroring is_local_request's
trust model: the last X-Forwarded-For entry (appended by Caddy) over remote_addr."""
actor = session.get('username', 'anonymous')
role = session.get('role', 'system')
ip = request.remote_addr or ''
xff = request.headers.get('X-Forwarded-For', '')
if xff:
last = xff.split(',')[-1].strip()
if last:
ip = last
return actor, role, ip
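The address selection in `_audit_actor_ip` can be tried in isolation. The following is a minimal sketch, assuming the reverse proxy appends the address it saw as the final `X-Forwarded-For` entry; `pick_client_ip` is a hypothetical helper name, not a function in this codebase.

```python
def pick_client_ip(remote_addr, xff_header):
    """Return the last non-empty X-Forwarded-For entry, else remote_addr."""
    ip = remote_addr or ''
    if xff_header:
        # Only the final entry is considered; earlier entries are whatever
        # the client chose to send and are ignored.
        last = xff_header.split(',')[-1].strip()
        if last:
            ip = last
    return ip
```

A header with a trailing comma yields an empty last entry, so the function falls back to `remote_addr` in that case.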
def _audit_map_action(method, endpoint, view_args, path):
"""Resolve (action, target_type, target_id) for a mutating request."""
spec = ROUTE_ACTION_MAP.get((method, endpoint))
view_args = view_args or {}
if spec:
action, target_type, id_arg = spec
target_id = str(view_args.get(id_arg, '')) if id_arg else ''
return action, target_type, target_id
# Unmapped: emit a generic action so nothing is invisible.
return f"{method.lower()}.{path}", 'unknown', ''
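As an illustration of how the lookup above resolves, here is a self-contained sketch that uses a two-entry excerpt of the map. `DEMO_ROUTE_ACTION_MAP` and `map_action` are hypothetical names introduced only for this example; the logic mirrors `_audit_map_action`.

```python
# Hypothetical two-entry excerpt of ROUTE_ACTION_MAP, for illustration only.
DEMO_ROUTE_ACTION_MAP = {
    ('PUT', 'peers.update_peer'): ('peer.update', 'peer', 'peer_name'),
    ('POST', 'peers.add_peer'): ('peer.create', 'peer', None),
}


def map_action(method, endpoint, view_args, path, route_map=DEMO_ROUTE_ACTION_MAP):
    """Resolve (action, target_type, target_id) the same way the hook does."""
    spec = route_map.get((method, endpoint))
    view_args = view_args or {}
    if spec:
        action, target_type, id_arg = spec
        # id_arg names the view argument holding the target id, if any.
        target_id = str(view_args.get(id_arg, '')) if id_arg else ''
        return action, target_type, target_id
    # Unmapped routes still produce an entry, keyed by method and path.
    return f"{method.lower()}.{path}", 'unknown', ''
```

For example, `map_action('PUT', 'peers.update_peer', {'peer_name': 'laptop'}, '/api/peers/laptop')` resolves to the `peer.update` action with `laptop` as the target id, while an unmapped `PATCH` falls through to the generic `patch.<path>` form.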
def _audit_summary(action):
"""Build a redacted summary for the current request.
For config.update only, list the changed config KEY NAMES (never values).
Request bodies are never recorded.
"""
if action != 'config.update':
return ''
try:
from audit_manager import AuditManager
body = request.get_json(silent=True)
if not isinstance(body, dict):
return ''
keys = []
for section, val in body.items():
if isinstance(val, dict):
keys.extend(f"{section}.{k}" for k in val.keys())
else:
keys.append(str(section))
return AuditManager.summarize_keys(keys)
except Exception:
return ''
@app.after_request
def audit_request(response):
"""Append an audit entry for mutating /api/* requests. Never raises."""
try:
method = request.method
if method not in ('POST', 'PUT', 'DELETE', 'PATCH'):
return response
path = request.path
if not path.startswith('/api/'):
return response
if (path.startswith('/api/auth/') or path.startswith('/api/setup/')
or path.startswith('/api/cells/peer-sync/')):
return response
rule = request.url_rule
endpoint = rule.endpoint if rule is not None else ''
if endpoint in _NO_AUDIT_ENDPOINTS:
return response
actor, role, ip = _audit_actor_ip()
action, target_type, target_id = _audit_map_action(
method, endpoint, request.view_args, path)
status = response.status_code
ctx = request_context.get({})
summary = _audit_summary(action)
audit_manager.record(
actor=actor, role=role, ip=ip, action=action,
target_type=target_type, target_id=target_id, summary=summary,
result='success' if status < 400 else 'failure',
status=status, method=method, path=path,
request_id=ctx.get('request_id', ''),
)
except Exception as e:
logger.warning(f"audit_request hook failed: {e}")
return response
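The filtering steps at the top of `audit_request` amount to a single predicate. The sketch below restates them as a pure function so they can be checked without a Flask request; `should_audit` and its `no_audit` parameter are hypothetical names for this example, and the prefixes are copied from the hook above.

```python
_MUTATING = ('POST', 'PUT', 'DELETE', 'PATCH')
_SKIPPED_PREFIXES = ('/api/auth/', '/api/setup/', '/api/cells/peer-sync/')


def should_audit(method, path, endpoint, no_audit=frozenset()):
    """Return True when a request would reach audit_manager.record()."""
    if method not in _MUTATING:
        return False
    if not path.startswith('/api/'):
        return False
    # str.startswith accepts a tuple of prefixes.
    if path.startswith(_SKIPPED_PREFIXES):
        return False
    return endpoint not in no_audit
```

Note that a request with no matched URL rule has an empty endpoint, which is not in the skip set, so it is still audited under the generic action name.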
@app.teardown_request
def clear_log_context(exc):
request_context.set({})
@@ -292,7 +537,23 @@ auth_routes.auth_manager = auth_manager
# Apply firewall + DNS rules from stored peer settings (survives API restarts)
def _configured_domain() -> str:
return config_manager.configs.get('_identity', {}).get('domain', 'cell')
identity = config_manager.configs.get('_identity', {})
# domain_name is the full FQDN (e.g. 'test5.pic.ngo'); fall back to domain
# (e.g. 'lan', 'dev') for cells that don't have a subdomain prefix.
return identity.get('domain_name') or identity.get('domain', 'cell')
def _configured_dns_params():
"""Return (primary_domain, split_horizon_zones) for Corefile generation.
In DDNS mode the primary CoreDNS zone is the parent domain (e.g. 'pic.ngo')
and the cell's FQDN (e.g. 'pic1.pic.ngo') is a separate split-horizon block
so LAN clients resolve *.pic1.pic.ngo to the internal Caddy IP.
In LAN mode both values are the same so split_horizon_zones is empty.
"""
primary = config_manager.get_internal_domain()
effective = config_manager.get_effective_domain()
return primary, ([effective] if effective != primary else [])
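The two modes described in the docstring can be shown with plain strings. This sketch assumes the two `config_manager` getters return the values named in the docstring; `dns_params` is a hypothetical stand-in that takes them as arguments.

```python
def dns_params(primary, effective):
    """Return (primary_domain, split_horizon_zones) from the two domain values."""
    # A split-horizon zone is only needed when the cell's effective FQDN
    # differs from the primary CoreDNS zone.
    return primary, ([effective] if effective != primary else [])
```

With `dns_params('pic.ngo', 'pic1.pic.ngo')` the cell FQDN becomes a separate zone; with `dns_params('lan', 'lan')` the zone list is empty.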
def _restore_cell_wg_peers(cell_links):
@@ -330,6 +591,15 @@ def _restore_cell_wg_peers(cell_links):
def _apply_startup_enforcement():
try:
# Regenerate the Caddyfile from current config before anything else so a
# stale on-disk file (e.g. one written by an older image, missing the
# `admin 0.0.0.0:2019` directive) can't permanently wedge the health
# monitor into restarting Caddy every few minutes. Done first so the
# later service_store/identity regenerations don't debounce it away.
try:
caddy_manager.regenerate_with_installed([])
except Exception as _cre:
logger.warning(f"startup Caddyfile regeneration failed (non-fatal): {_cre}")
peers = peer_registry.list_peers()
cell_links = cell_link_manager.list_connections()
firewall_manager.reconcile_stale_peer_rules(peers)
@@ -356,8 +626,10 @@ def _apply_startup_enforcement():
# (happens if the container was rebuilt, wg0.conf was reset, etc.)
_restore_cell_wg_peers(cell_links)
wireguard_manager.sync_cell_routes()
firewall_manager.apply_all_dns_rules(peers, COREFILE_PATH, _configured_domain(),
cell_links=cell_links)
_dns_primary, _dns_szones = _configured_dns_params()
firewall_manager.apply_all_dns_rules(peers, COREFILE_PATH, _dns_primary,
cell_links=cell_links,
split_horizon_zones=_dns_szones)
logger.info(f"Applied enforcement rules for {len(peers)} peers, {len(cell_links)} cells on startup")
# Phase 3: reapply policy routing rules for peers whose internet traffic is
# routed through an exit cell (ip rule entries don't survive container restart)
@@ -375,6 +647,11 @@ def _apply_startup_enforcement():
sync_summary = cell_link_manager.replay_pending_pushes()
if sync_summary.get('attempted'):
logger.info(f"Startup permission sync: {sync_summary}")
# Remove legacy builtin containers from old main stack (one-shot, idempotent)
try:
cleanup_legacy_builtin_containers(config_manager)
except Exception as _cle:
logger.warning(f'legacy cleanup failed (non-fatal): {_cle}')
# Service store: re-apply firewall/caddy rules for installed services
try:
service_store_manager.reapply_on_startup()
@@ -394,8 +671,25 @@ def _bootstrap_dns():
cell_name = identity.get('cell_name', os.environ.get('CELL_NAME', 'mycell'))
domain = identity.get('domain', os.environ.get('CELL_DOMAIN', 'cell'))
ip_range = identity.get('ip_range', os.environ.get('CELL_IP_RANGE', '172.20.0.0/16'))
# Bootstrap on first start; then always regenerate to ensure A records use WG server IP.
network_manager.apply_ip_range(ip_range, cell_name, domain)
domain_mode = identity.get('domain_mode', 'lan')
if domain_mode == 'lan':
# LAN mode: write full service records into the primary local zone.
network_manager.apply_ip_range(ip_range, cell_name, domain)
else:
# Non-LAN mode (DDNS/ACME): ensure the split-horizon zone is present so
# LAN clients resolve service subdomains to the internal Caddy IP.
# Never call apply_ip_range here — it would pollute the DDNS parent zone.
effective_domain = config_manager.get_effective_domain()
if effective_domain and effective_domain != domain:
# Use the WireGuard server IP so VPN peers can reach Caddy via the tunnel.
# The Docker bridge IP (172.20.x.x) is only reachable inside the Docker
# network; WireGuard peers need the host's WG interface IP (e.g. 10.0.0.1).
caddy_ip = network_manager._get_wg_server_ip()
# update_split_horizon_zone writes both the zone file and the Corefile
# (with the split-horizon block included). No separate apply_all_dns_rules
# call needed — that would overwrite the Corefile and drop the split-horizon block.
network_manager.update_split_horizon_zone(
effective_domain, caddy_ip, primary_domain=domain)
except Exception as e:
logger.warning(f"DNS bootstrap failed (non-fatal): {e}")
@@ -479,6 +773,9 @@ app.register_blueprint(_config_bp)
from routes.service_store import store_bp
app.register_blueprint(store_bp)
from routes.audit import bp as _audit_bp
app.register_blueprint(_audit_bp)
# Re-export config helpers so existing test imports/patches keep working
from routes.config import (
_set_pending_restart, _clear_pending_restart,
@@ -502,9 +799,18 @@ def perform_health_check():
'timestamp': datetime.utcnow().isoformat(),
'alerts': []
}
# email/calendar/files are optional store services — only check them when installed
_installed_store_ids = set(config_manager.get_installed_services())
_OPTIONAL_STORE_MANAGERS = frozenset({'email_manager', 'calendar_manager', 'file_manager'})
_MANAGER_TO_STORE_ID = {'email_manager': 'email', 'calendar_manager': 'calendar', 'file_manager': 'files'}
# Get health from each service
for service_name in service_bus.list_services():
if service_name in _OPTIONAL_STORE_MANAGERS:
store_id = _MANAGER_TO_STORE_ID[service_name]
if store_id not in _installed_store_ids:
continue
try:
service = service_bus.get_service(service_name)
if hasattr(service, 'health_check'):
@@ -514,7 +820,7 @@ def perform_health_check():
result[service_name] = health
except Exception as e:
result[service_name] = {'error': str(e), 'status': 'offline'}
# Health alerting logic — alert only when a service container is not running
global service_alert_counters
for service_name in service_bus.list_services():
@@ -564,6 +870,8 @@ def perform_health_check():
return {'error': str(e), 'timestamp': datetime.utcnow().isoformat()}
def health_monitor_loop():
_cert_check_cycle = 0
_conn_health_cycle = 0
while health_monitor_running:
with app.app_context():
health_result = perform_health_check()
@@ -587,6 +895,23 @@ def health_monitor_loop():
caddy_manager.reset_health_failures()
except Exception as _caddy_err:
logger.error("Caddy health monitor error: %s", _caddy_err)
# Refresh cert status every 60 cycles (≈ 1 hour with a 60 s loop).
_cert_check_cycle += 1
if _cert_check_cycle >= 60:
_cert_check_cycle = 0
try:
caddy_manager.refresh_cert_status()
except Exception as _cert_err:
logger.warning("Cert status refresh failed (non-fatal): %s", _cert_err)
# Refresh connection health every 2 cycles (≈ every 2 min) so the
# connections list and per-peer fallback decisions stay current.
_conn_health_cycle += 1
if _conn_health_cycle >= 2:
_conn_health_cycle = 0
try:
connectivity_manager.refresh_health()
except Exception as _ch_err:
logger.warning("Connection health refresh failed (non-fatal): %s", _ch_err)
time.sleep(60) # Check every 60 seconds
# Start health monitor thread
@@ -708,6 +1033,7 @@ def get_cell_status():
return jsonify({
"cell_name": identity.get('cell_name', os.environ.get('CELL_NAME', 'mycell')),
"domain": identity.get('domain', os.environ.get('CELL_DOMAIN', 'cell')),
"effective_domain": config_manager.get_effective_domain(),
"uptime": uptime_seconds,
"peers_count": len(peers),
"services": services_status,
@@ -789,6 +1115,34 @@ def connectivity_upload_openvpn():
return jsonify({'error': str(e)}), 500
@app.route('/api/connectivity/exits/sshuttle', methods=['POST'])
def connectivity_configure_sshuttle():
"""Configure the sshuttle (SSH tunnel) exit. Secrets are never echoed back."""
try:
data = request.get_json(silent=True) or {}
result = connectivity_manager.configure_sshuttle(data)
if result.get('ok'):
return jsonify({'ok': True})
return jsonify({'ok': False, 'error': result.get('error', 'invalid config')}), 400
except Exception as e:
logger.error(f"connectivity_configure_sshuttle: {e}")
return jsonify({'error': 'internal error'}), 500
@app.route('/api/connectivity/exits/proxy', methods=['POST'])
def connectivity_configure_proxy():
"""Configure the upstream proxy (redsocks) exit. Secrets are never echoed back."""
try:
data = request.get_json(silent=True) or {}
result = connectivity_manager.configure_proxy(data)
if result.get('ok'):
return jsonify({'ok': True})
return jsonify({'ok': False, 'error': result.get('error', 'invalid config')}), 400
except Exception as e:
logger.error(f"connectivity_configure_proxy: {e}")
return jsonify({'error': 'internal error'}), 500
@app.route('/api/connectivity/exits/apply', methods=['POST'])
def connectivity_apply_routes():
"""Idempotently re-apply all connectivity policy routing rules."""
@@ -802,13 +1156,18 @@ def connectivity_apply_routes():
@app.route('/api/connectivity/peers/<peer_name>/exit', methods=['PUT'])
def connectivity_set_peer_exit(peer_name: str):
"""Assign a peer to an egress exit type."""
"""Assign a peer to a connection by id (or 'default' to clear).
Body: {"connection_id": "<id>|default"}. The legacy {"exit_via": "<type>"}
field is still accepted as a one-release back-compat shim and resolved to
the single connection instance of that type.
"""
try:
data = request.get_json(silent=True) or {}
exit_via = data.get('exit_via')
if not isinstance(exit_via, str):
return jsonify({'ok': False, 'error': 'exit_via is required'}), 400
result = connectivity_manager.set_peer_exit(peer_name, exit_via)
connection_id = data.get('connection_id', data.get('exit_via'))
if not isinstance(connection_id, str) or not connection_id:
return jsonify({'ok': False, 'error': 'connection_id is required'}), 400
result = connectivity_manager.set_peer_exit(peer_name, connection_id)
if result.get('ok'):
return jsonify(result)
return jsonify(result), 400
@@ -827,6 +1186,195 @@ def connectivity_get_peer_exits():
return jsonify({'error': str(e)}), 500
# Connectivity v2 — generic connection CRUD (going-forward API; admin-only via
# enforce_auth which restricts all non-peer /api/* routes to the admin role).
@app.route('/api/connectivity/connections', methods=['GET'])
def connectivity_list_connections():
"""List all connection instances (with status; never any secret value)."""
try:
return jsonify({'connections': connectivity_manager.list_connections()})
except Exception as e:
logger.error(f"connectivity_list_connections: {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/connectivity/connections', methods=['POST'])
def connectivity_create_connection():
"""Create a connection instance. Secrets are stored in the vault, never echoed."""
try:
data = request.get_json(silent=True) or {}
conn_type = data.get('type')
name = data.get('name')
config = data.get('config') or {}
conn_secrets = data.get('secrets') or {}
if not isinstance(conn_type, str) or not conn_type:
return jsonify({'ok': False, 'error': 'type is required'}), 400
if not isinstance(name, str) or not name.strip():
return jsonify({'ok': False, 'error': 'name is required'}), 400
result = connectivity_manager.create_connection(
conn_type, name, config=config, secrets=conn_secrets)
if result.get('ok'):
return jsonify(result), 201
return jsonify(result), 400
except Exception as e:
logger.error(f"connectivity_create_connection: {e}")
return jsonify({'error': 'internal error'}), 500
@app.route('/api/connectivity/connections/<conn_id>', methods=['PUT'])
def connectivity_update_connection(conn_id: str):
"""Update a connection's name, config and/or secrets. Secrets never echoed."""
try:
data = request.get_json(silent=True) or {}
result = connectivity_manager.update_connection(
conn_id,
name=data.get('name'),
config=data.get('config'),
secrets=data.get('secrets'),
)
if result.get('ok'):
return jsonify(result)
status = 404 if 'not found' in result.get('error', '') else 400
return jsonify(result), status
except Exception as e:
logger.error(f"connectivity_update_connection({conn_id}): {e}")
return jsonify({'error': 'internal error'}), 500
@app.route('/api/connectivity/connections/<conn_id>', methods=['DELETE'])
def connectivity_delete_connection(conn_id: str):
"""Delete a connection. Blocked with 409 when a peer/egress references it."""
try:
result = connectivity_manager.delete_connection(conn_id)
if result.get('ok'):
return jsonify(result)
error = result.get('error', '')
if 'not found' in error:
return jsonify(result), 404
if 'in use by' in error:
return jsonify(result), 409
return jsonify(result), 400
except Exception as e:
logger.error(f"connectivity_delete_connection({conn_id}): {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/connectivity/connections/<conn_id>/health', methods=['GET'])
def connectivity_connection_health(conn_id: str):
"""On-demand probe of one connection's health (admin)."""
try:
conn = connectivity_manager.get_connection(conn_id)
if conn is None:
return jsonify({'error': f'connection {conn_id!r} not found'}), 404
health, detail = connectivity_manager.probe_health(conn)
return jsonify({'id': conn_id, 'health': health, 'detail': detail})
except Exception as e:
logger.error(f"connectivity_connection_health({conn_id}): {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/connectivity/peers/<peer_name>/failopen', methods=['PUT'])
def connectivity_set_peer_failopen(peer_name: str):
"""Set or clear a peer's fail-open override. Body: {"failopen": bool|null}."""
try:
data = request.get_json(silent=True) or {}
failopen = data.get('failopen')
if failopen is not None and not isinstance(failopen, bool):
return jsonify({'ok': False, 'error': 'failopen must be a boolean or null'}), 400
result = connectivity_manager.set_peer_failopen(peer_name, failopen)
if result.get('ok'):
return jsonify(result)
status = 404 if 'not found' in result.get('error', '') else 400
return jsonify(result), status
except Exception as e:
logger.error(f"connectivity_set_peer_failopen({peer_name}): {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/caddy/cert-status', methods=['GET'])
def caddy_cert_status():
"""Return TLS certificate status (expiry, days remaining, domain, mode).
Refreshes from Caddy if the cached value is older than 5 minutes.
For LAN mode returns {'status': 'internal'}; for ACME modes returns
expiry info read via SSL handshake with the Caddy container.
"""
try:
return jsonify(caddy_manager.get_cert_status_fresh(max_age_seconds=300))
except Exception as e:
logger.error(f"caddy_cert_status: {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/caddy/cert-renew', methods=['POST'])
def caddy_cert_renew():
"""Trigger ACME certificate renewal by reloading Caddy.
Returns immediately with status='pending'; poll GET /api/caddy/cert-status
to track progress (Caddy typically acquires the cert within 30-60 s).
"""
try:
result = caddy_manager.renew_cert()
return jsonify(result), (200 if result.get('ok') else 400)
except Exception as e:
logger.error(f"caddy_cert_renew: {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/caddy/custom-cert', methods=['POST'])
def caddy_upload_custom_cert():
"""Install a custom TLS certificate (PEM format).
Body: { "cert_pem": "<PEM>", "key_pem": "<PEM>" }
Validates the cert/key pair, writes to the shared certs directory,
and reloads Caddy with the updated Caddyfile.
"""
try:
data = request.get_json(silent=True) or {}
cert_pem = (data.get('cert_pem') or '').strip()
key_pem = (data.get('key_pem') or '').strip()
if not cert_pem or not key_pem:
return jsonify({'ok': False, 'error': 'cert_pem and key_pem are required'}), 400
result = caddy_manager.upload_custom_cert(cert_pem, key_pem)
return jsonify(result), (200 if result.get('ok') else 422)
except Exception as e:
logger.error(f"caddy_upload_custom_cert: {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/egress/status', methods=['GET'])
def egress_status():
"""Return egress status for all installed services that have an egress config."""
try:
return jsonify(egress_manager.get_status())
except Exception as e:
logger.error(f"egress_status: {e}")
return jsonify({'error': str(e)}), 500
@app.route('/api/egress/services/<service_id>/exit', methods=['PUT'])
def egress_set_service_exit(service_id: str):
"""Persist and immediately apply a per-service egress override.
Body: {"connection_id": "<id>|default"}. The legacy {"exit_type": "<type>"}
field is still accepted as a one-release back-compat shim and resolved to
the single connection instance of that type.
"""
try:
data = request.get_json(silent=True) or {}
connection_id = data.get('connection_id', data.get('exit_type'))
if not isinstance(connection_id, str) or not connection_id:
return jsonify({'ok': False, 'error': 'connection_id is required'}), 400
result = egress_manager.set_service_exit(service_id, connection_id)
if result.get('ok'):
return jsonify(result)
return jsonify(result), 400
except Exception as e:
logger.error(f"egress_set_service_exit({service_id}): {e}")
return jsonify({'error': str(e)}), 500
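The back-compat handling used by both exit-assignment endpoints (`exit_via` for peers, `exit_type` for services) reduces to one lookup. A minimal sketch, with `resolve_connection_id` as a hypothetical helper name:

```python
def resolve_connection_id(body, legacy_field):
    """Prefer 'connection_id'; fall back to the legacy field.

    Returns None when the resolved value is missing, empty, or not a string,
    which corresponds to the 400 response in the handlers.
    """
    value = body.get('connection_id', body.get(legacy_field))
    if not isinstance(value, str) or not value:
        return None
    return value
```

One consequence of using `dict.get` with a default: the legacy field is consulted only when `connection_id` is absent. A body that sends `"connection_id": null` alongside a legacy value is rejected rather than falling back.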
if __name__ == '__main__':
debug = os.environ.get('FLASK_DEBUG', '0') == '1'
app.run(host='0.0.0.0', port=3000, debug=debug)
@@ -0,0 +1,330 @@
#!/usr/bin/env python3
"""
Audit Manager for Personal Internet Cell.
Owner-visible, append-only audit trail of WHO (actor + role + ip) did WHAT
(action) to WHICH target, WHEN, with a redacted summary. Storage is a JSONL
file with a per-entry SHA-256 hash chain so tampering is detectable. Request
bodies and secret values are never written; summaries only ever list changed
config KEY NAMES, never their values.
"""
import os
import io
import re
import csv
import json
import hashlib
import logging
import threading
from datetime import datetime
from typing import Dict, List, Optional, Any
from base_service_manager import BaseServiceManager
logger = logging.getLogger(__name__)
def _utcnow_iso() -> str:
return datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%SZ')
# Keys whose values must never be recorded — name-only in summaries.
_SECRET_KEY_RE = re.compile(r'(pass|secret|key|token|private|cred|otp|psk)', re.IGNORECASE)
# Final scrub of anything that looks like base64 key material / encoded blobs.
_BASE64_BLOCK_RE = re.compile(r'[A-Za-z0-9+/]{40,}={0,2}')
# bcrypt hashes, age keys (secret keys and age1 recipients), and PEM BEGIN lines.
_SECRET_PREFIX_RE = re.compile(
r'(\$2[aby]\$[^\s]+|AGE-SECRET-KEY-[^\s]+|age1[^\s]+|-----BEGIN[^\n]+)'
)
_VALID_RESULTS = ('success', 'failure')
class AuditManager(BaseServiceManager):
"""Append-only, hash-chained audit trail."""
MAX_FILE_SIZE = 10 * 1024 * 1024 # 10 MB before rotation
BACKUP_COUNT = 10 # audit.log.1 .. audit.log.10
def __init__(self, data_dir: str = '/app/data', config_dir: str = '/app/config',
tamper_chain: bool = True):
super().__init__('audit', data_dir=data_dir, config_dir=config_dir)
self.tamper_chain = tamper_chain
self._lock = threading.RLock()
self._audit_dir = os.path.join(self.data_dir, 'api', 'audit')
self._audit_file = os.path.join(self._audit_dir, 'audit.log')
self._seq = 0
self._prev_hash = ''
self.safe_makedirs(self._audit_dir)
self._load_chain_state()
# ── chain bootstrap ─────────────────────────────────────────────────────
def _load_chain_state(self) -> None:
"""Recover seq + prev_hash from the last line of the live file."""
try:
if not os.path.exists(self._audit_file):
return
last = None
with open(self._audit_file, 'r', encoding='utf-8', errors='ignore') as f:
for line in f:
line = line.strip()
if line:
last = line
if last:
entry = json.loads(last)
self._seq = int(entry.get('seq', 0))
self._prev_hash = entry.get('hash', '') or ''
except Exception as e:
logger.warning(f"audit: could not load chain state: {e}")
# ── redaction ───────────────────────────────────────────────────────────
@staticmethod
def _scrub(text: str) -> str:
"""Strip anything resembling a secret value from a summary string."""
if not text:
return ''
text = _SECRET_PREFIX_RE.sub('[REDACTED]', text)
text = _BASE64_BLOCK_RE.sub('[REDACTED]', text)
return text
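To see what the two scrub passes catch, the patterns can be run standalone. The sketch below copies the two regular expressions from this file into a free function; `scrub` is a hypothetical name for the standalone copy.

```python
import re

_BASE64_BLOCK_RE = re.compile(r'[A-Za-z0-9+/]{40,}={0,2}')
_SECRET_PREFIX_RE = re.compile(
    r'(\$2[aby]\$[^\s]+|AGE-SECRET-KEY-[^\s]+|age1[^\s]+|-----BEGIN[^\n]+)'
)


def scrub(text):
    """Replace secret-looking substrings with a fixed marker."""
    if not text:
        return ''
    # Prefix pass first (bcrypt, age, PEM), then long base64-like runs.
    text = _SECRET_PREFIX_RE.sub('[REDACTED]', text)
    text = _BASE64_BLOCK_RE.sub('[REDACTED]', text)
    return text
```

Key names such as `wireguard.private_key` pass through unchanged because they are short and contain dots. Any unbroken run of 40 or more base64 characters is replaced, which also covers a 64-character hex digest, so long opaque identifiers in a path or target id would be redacted as well.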
@classmethod
def _redact(cls, entry: Dict[str, Any]) -> Dict[str, Any]:
"""Enforce the redaction rules on a built entry before write.
- summary is scrubbed of base64/secret-prefixed blobs.
- target_id, action and path are scrubbed too (defence in depth).
Request bodies are never present — the caller passes only a summary.
"""
for field in ('summary', 'target_id', 'action', 'path'):
val = entry.get(field)
if isinstance(val, str):
entry[field] = cls._scrub(val)
return entry
@classmethod
def summarize_keys(cls, keys: List[str]) -> str:
"""Build a redacted summary listing changed config KEY NAMES only.
Secret-looking key names are kept (they are names, not values) but the
whole string is still scrubbed of any accidental value material.
"""
names = [str(k) for k in keys if k is not None]
return cls._scrub('changed: ' + ', '.join(names)) if names else 'no changes'
# ── hashing ─────────────────────────────────────────────────────────────
@staticmethod
def _canonical(entry: Dict[str, Any]) -> str:
return json.dumps(entry, sort_keys=True, separators=(',', ':'), ensure_ascii=False)
def _hash_entry(self, entry_without_hash: Dict[str, Any]) -> str:
return hashlib.sha256(self._canonical(entry_without_hash).encode('utf-8')).hexdigest()
# ── recording ───────────────────────────────────────────────────────────
def record(self, actor: str, role: str, ip: str, action: str,
target_type: str = '', target_id: str = '', summary: str = '',
result: str = 'success', status: int = 200, method: str = '',
path: str = '', request_id: str = '') -> Optional[Dict[str, Any]]:
"""Append one redacted, hash-chained JSON line. Never raises."""
try:
with self._lock:
self._maybe_rotate()
self._seq += 1
if result not in _VALID_RESULTS:
result = 'success' if int(status or 200) < 400 else 'failure'
entry: Dict[str, Any] = {
'ts': _utcnow_iso(),
'actor': actor or 'anonymous',
'role': role or 'system',
'ip': ip or '',
'action': action or '',
'target_type': target_type or '',
'target_id': target_id or '',
'summary': summary or '',
'result': result,
'status': int(status or 0),
'method': method or '',
'path': path or '',
'request_id': request_id or '',
'seq': self._seq,
'prev_hash': self._prev_hash if self.tamper_chain else '',
}
entry = self._redact(entry)
if self.tamper_chain:
entry['hash'] = self._hash_entry(entry)
else:
entry['hash'] = ''
self._append_line(json.dumps(entry, ensure_ascii=False))
self._prev_hash = entry['hash']
return entry
except Exception as e:
logger.warning(f"audit.record failed: {e}")
return None
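Given the entry layout written by `record`, a reader can check the chain by recomputing each hash. The following is a sketch under stated assumptions, not the project's verifier: `verify_chain` is a hypothetical function, it assumes `tamper_chain` was enabled, and it assumes the lines are supplied oldest first (the order in which they were appended).

```python
import hashlib
import json


def canonical(entry):
    # Same canonical form as AuditManager._canonical above.
    return json.dumps(entry, sort_keys=True, separators=(',', ':'), ensure_ascii=False)


def verify_chain(lines, prev_hash=''):
    """Return the seq of the first entry that breaks the chain, or None if intact.

    `lines` are JSONL strings in write order. `prev_hash` is the hash of the
    entry preceding the first line ('' for the very first entry ever written).
    """
    for line in lines:
        entry = json.loads(line)
        stored = entry.pop('hash', '')
        # Each entry must point at the hash of the one before it.
        if entry.get('prev_hash', '') != prev_hash:
            return entry.get('seq')
        # The stored hash must match the entry contents minus the hash field.
        if hashlib.sha256(canonical(entry).encode('utf-8')).hexdigest() != stored:
            return entry.get('seq')
        prev_hash = stored
    return None
```

When verifying a rotated segment on its own, the starting `prev_hash` has to come from the last entry of the next-older segment; if that segment has already been rotated away, the first link cannot be checked.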
def _append_line(self, line: str) -> None:
self.safe_makedirs(self._audit_dir)
fd = os.open(self._audit_file, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
try:
os.write(fd, (line + '\n').encode('utf-8'))
finally:
os.close(fd)
try:
os.chmod(self._audit_file, 0o600)
except OSError:
pass
# ── rotation ────────────────────────────────────────────────────────────
def _maybe_rotate(self) -> None:
try:
if not os.path.exists(self._audit_file):
return
if os.path.getsize(self._audit_file) < self.MAX_FILE_SIZE:
return
except OSError:
return
# audit.log.(N-1) -> audit.log.N, ... audit.log -> audit.log.1
for i in range(self.BACKUP_COUNT - 1, 0, -1):
src = f"{self._audit_file}.{i}"
dst = f"{self._audit_file}.{i + 1}"
if os.path.exists(src):
try:
os.replace(src, dst)
except OSError as e:
logger.warning(f"audit rotate {src}->{dst}: {e}")
try:
os.replace(self._audit_file, f"{self._audit_file}.1")
except OSError as e:
logger.warning(f"audit rotate live->.1: {e}")
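The rotation above shifts each numbered segment up by one and then moves the live file into the `.1` slot. A minimal standalone sketch of the same shift follows; `rotate` and `demo` are hypothetical names used only for illustration, and the error handling of the original is omitted.

```python
import os
import tempfile


def rotate(path: str, backup_count: int) -> None:
    """Shift path.(N-1) -> path.N, ..., then path -> path.1."""
    for i in range(backup_count - 1, 0, -1):
        src, dst = f"{path}.{i}", f"{path}.{i + 1}"
        if os.path.exists(src):
            # os.replace overwrites dst, so the oldest segment is dropped.
            os.replace(src, dst)
    os.replace(path, f"{path}.1")


def demo() -> dict:
    """Write three generations, rotating after each; return name -> content."""
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "audit.log")
        for generation in ("first", "second", "third"):
            with open(live, "w", encoding="utf-8") as f:
                f.write(generation)
            rotate(live, backup_count=2)
        out = {}
        for name in sorted(os.listdir(d)):
            with open(os.path.join(d, name), encoding="utf-8") as f:
                out[name] = f.read()
        return out
```

With `backup_count=2` only two rotated segments survive, so the oldest generation is overwritten on the third rotation.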
def _segment_files(self) -> List[str]:
"""Live file first (newest), then rotated segments .1 .. .N (older)."""
files = []
if os.path.exists(self._audit_file):
files.append(self._audit_file)
for i in range(1, self.BACKUP_COUNT + 1):
seg = f"{self._audit_file}.{i}"
if os.path.exists(seg):
files.append(seg)
return files
# ── reading / filtering ─────────────────────────────────────────────────
@staticmethod
def _matches(entry: Dict[str, Any], filters: Dict[str, Any]) -> bool:
for field in ('actor', 'action', 'target_type', 'target_id', 'result'):
want = filters.get(field)
if want and str(entry.get(field, '')) != str(want):
return False
since = filters.get('since')
until = filters.get('until')
ts = entry.get('ts', '')
if since and ts < since:
return False
if until and ts > until:
return False
return True
def _read_all(self, filters: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Return matching entries, newest-first across all segments."""
results: List[Dict[str, Any]] = []
with self._lock:
for seg in self._segment_files():
try:
with open(seg, 'r', encoding='utf-8', errors='ignore') as f:
lines = f.readlines()
except OSError:
continue
for line in reversed(lines):
line = line.strip()
if not line:
continue
try:
entry = json.loads(line)
except json.JSONDecodeError:
continue
if self._matches(entry, filters):
results.append(entry)
return results
def query(self, filters: Optional[Dict[str, Any]] = None,
limit: int = 100, offset: int = 0) -> Dict[str, Any]:
filters = filters or {}
try:
limit = max(1, min(int(limit), 1000))
except (TypeError, ValueError):
limit = 100
try:
offset = max(0, int(offset))
except (TypeError, ValueError):
offset = 0
entries = self._read_all(filters)
total = len(entries)
page = entries[offset:offset + limit]
next_offset = offset + limit if offset + limit < total else None
return {'entries': page, 'total': total, 'next_offset': next_offset}
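The clamping and `next_offset` arithmetic in `query` can be exercised in isolation. The sketch below copies that logic into a hypothetical free function, `paginate`, that works on any list; it is an illustration, not part of the commit.

```python
from typing import Any, Dict, List, Optional


def paginate(entries: List[Any], limit: Any = 100, offset: Any = 0) -> Dict[str, Any]:
    """Clamp limit to 1..1000 and offset to >= 0, then slice one page."""
    try:
        limit = max(1, min(int(limit), 1000))
    except (TypeError, ValueError):
        limit = 100
    try:
        offset = max(0, int(offset))
    except (TypeError, ValueError):
        offset = 0
    total = len(entries)
    page = entries[offset:offset + limit]
    # next_offset is None once the current page reaches the end.
    next_offset: Optional[int] = offset + limit if offset + limit < total else None
    return {'entries': page, 'total': total, 'next_offset': next_offset}
```

A caller would keep requesting pages, passing the returned `next_offset` back in, until it is `None`.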
def export_csv(self, filters: Optional[Dict[str, Any]] = None) -> str:
filters = filters or {}
entries = self._read_all(filters)
fields = ['ts', 'actor', 'role', 'ip', 'action', 'target_type',
'target_id', 'summary', 'result', 'status', 'method', 'path',
'request_id', 'seq']
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(fields)
for e in entries:
writer.writerow([e.get(f, '') for f in fields])
return buf.getvalue()
# ── integrity ───────────────────────────────────────────────────────────
def verify_chain(self) -> Dict[str, Any]:
"""Walk all segments oldest-first; verify each entry's hash + link."""
if not self.tamper_chain:
return {'ok': True, 'broken_at_seq': None, 'disabled': True}
with self._lock:
segs = list(reversed(self._segment_files())) # oldest -> newest
prev_hash = ''
first = True # oldest available record: its predecessor may be pruned
for seg in segs:
try:
with open(seg, 'r', encoding='utf-8', errors='ignore') as f:
lines = f.readlines()
except OSError:
continue
for line in lines:
line = line.strip()
if not line:
continue
try:
entry = json.loads(line)
except json.JSONDecodeError:
return {'ok': False, 'broken_at_seq': None}
stored_hash = entry.get('hash', '')
# Don't fail the prev_hash link on the very first available
# record — older segments may have rotated off the end.
if not first and entry.get('prev_hash', '') != prev_hash:
return {'ok': False, 'broken_at_seq': entry.get('seq')}
recomputed = self._hash_entry({k: v for k, v in entry.items() if k != 'hash'})
if recomputed != stored_hash:
return {'ok': False, 'broken_at_seq': entry.get('seq')}
prev_hash = stored_hash
first = False
return {'ok': True, 'broken_at_seq': None}
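The verification walk can be sketched end to end on an in-memory chain. The body of `_hash_entry` is not shown in this diff, so the `hash_entry` function below is an assumption (SHA-256 over canonical JSON); `append` and `first_broken_seq` are likewise hypothetical helpers that mirror the link and hash checks above.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


def hash_entry(entry: Dict[str, Any]) -> str:
    """Assumed stand-in for _hash_entry: SHA-256 of canonical JSON."""
    payload = json.dumps(entry, sort_keys=True, separators=(',', ':'),
                         ensure_ascii=False)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


def append(chain: List[Dict[str, Any]], fields: Dict[str, Any]) -> None:
    """Link a new entry to the previous entry's hash, then hash it."""
    prev_hash = chain[-1]['hash'] if chain else ''
    entry = dict(fields, seq=len(chain) + 1, prev_hash=prev_hash)
    entry['hash'] = hash_entry(entry)
    chain.append(entry)


def first_broken_seq(chain: List[Dict[str, Any]]) -> Optional[int]:
    """Return the seq of the first bad entry, or None if the chain holds."""
    prev_hash = ''
    first = True  # the oldest available record may have lost its predecessor
    for entry in chain:
        if not first and entry.get('prev_hash', '') != prev_hash:
            return entry.get('seq')
        body = {k: v for k, v in entry.items() if k != 'hash'}
        if hash_entry(body) != entry.get('hash', ''):
            return entry.get('seq')
        prev_hash = entry['hash']
        first = False
    return None
```

Skipping the link check on the first record is what lets a chain whose oldest segments have rotated away still verify, while any edit to a surviving record changes its recomputed hash.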
# ── BaseServiceManager interface ────────────────────────────────────────
def get_status(self) -> Dict[str, Any]:
size = 0
try:
if os.path.exists(self._audit_file):
size = os.path.getsize(self._audit_file)
except OSError:
pass
return {
'running': True,
'tamper_chain': self.tamper_chain,
'seq': self._seq,
'file': self._audit_file,
'file_size': size,
}
def test_connectivity(self) -> Dict[str, Any]:
return {'success': True}
-10
@@ -47,16 +47,6 @@ class AuthManager(BaseServiceManager):
os.makedirs(os.path.dirname(self._users_file), exist_ok=True)
except Exception:
pass
if not os.path.exists(self._users_file):
try:
with open(self._users_file, 'w') as f:
f.write('[]')
try:
os.chmod(self._users_file, 0o600)
except Exception:
pass
except Exception as e:
self.logger.error(f'Could not create users file: {e}')
def _load_users(self) -> List[Dict[str, Any]]:
with self._lock:
+32
@@ -20,6 +20,30 @@ auth_manager = None # type: ignore
auth_bp = Blueprint('auth', __name__, url_prefix='/api/auth')
def _audit(action, target_type, target_id, summary, result, status):
"""Record an explicit audit entry for auth actions the generic hook skips.
Never raises and never includes any password value.
"""
try:
from app import audit_manager
ip = request.remote_addr or ''
xff = request.headers.get('X-Forwarded-For', '')
if xff:
last = xff.split(',')[-1].strip()
if last:
ip = last
audit_manager.record(
actor=session.get('username', 'anonymous'),
role=session.get('role', 'system'),
ip=ip, action=action, target_type=target_type, target_id=target_id,
summary=summary, result=result, status=status,
method=request.method, path=request.path,
)
except Exception:
pass
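The client-IP selection in `_audit` prefers the last `X-Forwarded-For` hop over the socket address. A minimal sketch of that rule as a pure function follows; `client_ip` is a hypothetical name, and whether the last hop can be trusted depends on the reverse proxy in front of the API, which this diff does not show.

```python
def client_ip(remote_addr: str, xff_header: str) -> str:
    """Prefer the last X-Forwarded-For hop; fall back to the socket address."""
    ip = remote_addr or ''
    if xff_header:
        # The last entry is the one appended by the nearest proxy.
        last = xff_header.split(',')[-1].strip()
        if last:
            ip = last
    return ip
```

Taking the last entry rather than the first means a client cannot choose the recorded address just by sending its own `X-Forwarded-For` header, provided the nearest proxy always appends the address it saw.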
def require_auth(role=None):
"""Decorator that enforces session authentication and an optional role."""
def deco(fn):
@@ -124,7 +148,11 @@ def change_password():
username = session.get('username')
ok = auth_manager.change_password(username, old_pw, new_pw)
if not ok:
_audit('user.password_change', 'user', username or '',
'password change failed', 'failure', 400)
return jsonify({'error': 'Password change failed'}), 400
_audit('user.password_change', 'user', username or '',
'password changed', 'success', 200)
return jsonify({'ok': True})
@@ -142,7 +170,11 @@ def admin_reset_password():
return jsonify({'error': 'new_password must be at least 10 characters'}), 400
ok = auth_manager.set_password_admin(username, new_pw)
if not ok:
_audit('user.password_reset', 'user', username,
f'admin password reset failed for peer {username}', 'failure', 400)
return jsonify({'error': 'Reset failed (user not found?)'}), 400
_audit('user.password_reset', 'user', username,
f'admin reset password for peer {username}', 'success', 200)
return jsonify({'ok': True})
+71
@@ -0,0 +1,71 @@
#!/usr/bin/env python3
"""Passphrase-based encryption for PIC backup archives.
A backup archive contains key material (WireGuard keys, the vault Fernet key,
the internal CA, admin credentials). When the operator supplies a passphrase we
encrypt the archive at rest.
The repo's only available crypto primitive is `cryptography` (Fernet, scrypt) —
PyNaCl / the age binary are not installed in the API image. We therefore derive
a Fernet key from the passphrase with scrypt and wrap the archive bytes. The
encrypted file keeps the `.age` extension expected by the UI/restore detection;
the embedded MAGIC distinguishes our format from a real age file.
"""
import os
import struct
from cryptography.fernet import Fernet, InvalidToken
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt
import base64
# File layout: MAGIC | salt(16) | n(4) | r(4) | p(4) | fernet_token
MAGIC = b'PICBKP1\n'
_SALT_LEN = 16
# scrypt cost parameters (interactive-strong; ~tens of ms)
_N = 2 ** 15
_R = 8
_P = 1
class BackupDecryptError(Exception):
"""Raised when an encrypted backup cannot be decrypted (wrong passphrase or corrupt archive)."""
def _derive_key(passphrase: str, salt: bytes, n: int, r: int, p: int) -> bytes:
kdf = Scrypt(salt=salt, length=32, n=n, r=r, p=p)
raw = kdf.derive(passphrase.encode('utf-8'))
return base64.urlsafe_b64encode(raw)
def encrypt_bytes(plaintext: bytes, passphrase: str) -> bytes:
"""Encrypt archive bytes with a passphrase. Returns the on-disk blob."""
if not passphrase:
raise ValueError('passphrase required for encryption')
salt = os.urandom(_SALT_LEN)
key = _derive_key(passphrase, salt, _N, _R, _P)
token = Fernet(key).encrypt(plaintext)
header = MAGIC + salt + struct.pack('>III', _N, _R, _P)
return header + token
def is_encrypted(blob: bytes) -> bool:
return blob[:len(MAGIC)] == MAGIC
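The file layout noted above (`MAGIC | salt(16) | n(4) | r(4) | p(4) | fernet_token`) gives a fixed 36-byte header. The sketch below packs and parses just that header with the standard library; `build_header` and `parse_header` are hypothetical helpers, and no key derivation or encryption is performed here.

```python
import struct
from typing import Tuple

MAGIC = b'PICBKP1\n'
SALT_LEN = 16
HEADER_LEN = len(MAGIC) + SALT_LEN + 12  # 8 + 16 + three 4-byte integers


def build_header(salt: bytes, n: int, r: int, p: int) -> bytes:
    """MAGIC, then the salt, then n, r, p as big-endian unsigned 32-bit ints."""
    if len(salt) != SALT_LEN:
        raise ValueError('salt must be 16 bytes')
    return MAGIC + salt + struct.pack('>III', n, r, p)


def parse_header(blob: bytes) -> Tuple[bytes, int, int, int, bytes]:
    """Split a blob into (salt, n, r, p, token); reject short or foreign blobs."""
    if blob[:len(MAGIC)] != MAGIC or len(blob) < HEADER_LEN:
        raise ValueError('not a PIC backup header')
    off = len(MAGIC)
    salt = blob[off:off + SALT_LEN]
    n, r, p = struct.unpack('>III', blob[off + SALT_LEN:HEADER_LEN])
    return salt, n, r, p, blob[HEADER_LEN:]
```

Storing the scrypt parameters in the header means the cost settings can be raised later without breaking the ability to read older archives.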
def decrypt_bytes(blob: bytes, passphrase: str) -> bytes:
    """Decrypt a blob produced by encrypt_bytes. Raises BackupDecryptError."""
    if not is_encrypted(blob):
        raise BackupDecryptError('not a PIC encrypted backup')
    if not passphrase:
        raise BackupDecryptError('passphrase required')
    off = len(MAGIC)
    salt = blob[off:off + _SALT_LEN]
    off += _SALT_LEN
    try:
        # A truncated header makes struct.unpack raise struct.error, and
        # corrupt scrypt parameters make Scrypt raise ValueError; both are
        # reported as BackupDecryptError, as the docstring promises.
        n, r, p = struct.unpack('>III', blob[off:off + 12])
        off += 12
        token = blob[off:]
        key = _derive_key(passphrase, salt, n, r, p)
        return Fernet(key).decrypt(token)
    except (InvalidToken, ValueError, struct.error) as e:
        raise BackupDecryptError('invalid passphrase or corrupt archive') from e
+478 -38
@@ -23,8 +23,13 @@ in the main server block (or, for ``http01``, written as their own per-host
blocks).
"""
import datetime as _dt
import logging
import os
import socket as _socket
import ssl as _ssl
import threading
import time as _time
from typing import Any, Dict, List, Optional
import requests
@@ -45,20 +50,43 @@ LIVE_CADDYFILE = os.environ.get('CADDYFILE_PATH', '/app/config-caddy/Caddyfile')
# localhost to match the dev/test wiring.
CADDY_ADMIN_URL = os.environ.get('CADDY_ADMIN_URL', 'http://cell-caddy:2019')
# Directory where the API writes custom TLS cert/key files.
# The Caddy container mounts ./config/caddy → /config/caddy, so files written
# here appear inside the container as /config/caddy/certs/<file>.
CADDY_CERTS_DIR = os.environ.get('CADDY_CERTS_DIR', '/app/config-caddy/certs')
# Paths as seen by the Caddy process (inside the container).
_CADDY_CUSTOM_CERT = '/config/caddy/certs/cert.pem'
_CADDY_CUSTOM_KEY = '/config/caddy/certs/key.pem'
_CADDY_INTERNAL_CERT = '/etc/caddy/internal/cert.pem'
_CADDY_INTERNAL_KEY = '/etc/caddy/internal/key.pem'
class CaddyManager(BaseServiceManager):
"""Manages Caddy reverse-proxy configuration and runtime health."""
def __init__(self, config_manager=None,
data_dir: str = '/app/data',
config_dir: str = '/app/config'):
config_dir: str = '/app/config',
service_bus=None,
service_registry=None):
super().__init__('caddy', data_dir, config_dir)
self.config_manager = config_manager
self.container_name = 'cell-caddy'
self.caddyfile_path = LIVE_CADDYFILE
self._service_registry = service_registry
# Consecutive health-check failure counter (reset on success or when
# the caller restarts the container).
self._health_failures = 0
# Monotonic timestamp of the last successful cert status refresh.
self._cert_refreshed_at: Optional[float] = None
# Debounce: prevent two rapid Caddyfile reloads (e.g. IDENTITY_CHANGED
# fires from wizard AND heartbeat re-registration within seconds of each other).
self._last_regenerate_at: float = 0.0
self._regenerate_lock = threading.Lock()
if service_bus is not None:
from service_bus import EventType
service_bus.subscribe_to_event(EventType.IDENTITY_CHANGED, self._on_identity_changed)
# ── BaseServiceManager required ───────────────────────────────────────
@@ -83,6 +111,30 @@ class CaddyManager(BaseServiceManager):
# ── Caddyfile generation ──────────────────────────────────────────────
# Python logging level → Caddy log level. Caddy only knows
# DEBUG/INFO/WARN/ERROR (no CRITICAL).
_CADDY_LEVEL_MAP = {
'DEBUG': 'DEBUG', 'INFO': 'INFO', 'WARNING': 'WARN',
'ERROR': 'ERROR', 'CRITICAL': 'ERROR',
}
def _resolve_caddy_level(self) -> str:
"""Read the configured caddy container log level (Python level name)."""
if self.config_manager is not None:
try:
return self.config_manager.get_logging_config()['containers'].get('caddy', 'INFO')
except Exception:
pass
return 'INFO'
def _global_log_block(self) -> str:
"""Return the global-options `log { level <X> }` line(s), or '' for the
Caddy default (INFO). Injected inside the global `{ ... }` block."""
level = self._CADDY_LEVEL_MAP.get(self._resolve_caddy_level(), 'INFO')
if level == 'INFO':
return ''
return f" log {{\n level {level}\n }}"
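The level translation can be checked on its own. The sketch below restates the mapping as a hypothetical module-level function, `global_log_block`; the indentation inside the emitted block is an assumption, since whitespace in this diff view is collapsed.

```python
CADDY_LEVEL_MAP = {
    'DEBUG': 'DEBUG', 'INFO': 'INFO', 'WARNING': 'WARN',
    'ERROR': 'ERROR', 'CRITICAL': 'ERROR',
}


def global_log_block(python_level: str) -> str:
    """Return a Caddy `log { level X }` block, or '' for the INFO default."""
    # Unknown level names fall back to INFO, which emits nothing.
    level = CADDY_LEVEL_MAP.get(python_level, 'INFO')
    if level == 'INFO':
        return ''
    return f"    log {{\n        level {level}\n    }}"
```

Returning an empty string for INFO keeps the generated Caddyfile identical to the previous output unless a non-default level is configured.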
def generate_caddyfile(self, identity: Dict[str, Any],
installed_services: List[Dict[str, Any]]) -> str:
"""Generate a complete Caddyfile based on identity and services.
@@ -117,23 +169,25 @@ class CaddyManager(BaseServiceManager):
" reverse_proxy cell-api:3000\n"
" }\n"
" handle {\n"
" reverse_proxy cell-webui:80\n"
" reverse_proxy cell-webui:8080\n"
" }"
)
if domain_mode == 'lan':
return self._caddyfile_lan(cell_name, service_routes, core_routes)
cert_path, key_path = self._tls_cert_pair()
return self._caddyfile_lan(cell_name, service_routes, core_routes,
cert_path, key_path)
if domain_mode == 'pic_ngo':
return self._caddyfile_pic_ngo(cell_name, service_routes, core_routes)
if domain_mode == 'cloudflare':
custom_domain = identity.get('custom_domain', f'{cell_name}.local')
custom_domain = identity.get('domain_name', identity.get('domain', f'{cell_name}.local'))
return self._caddyfile_cloudflare(
custom_domain, service_routes, core_routes
)
if domain_mode == 'duckdns':
return self._caddyfile_duckdns(cell_name, service_routes, core_routes)
if domain_mode == 'http01':
host = identity.get('custom_domain', f'{cell_name}.noip.me')
host = identity.get('domain_name', identity.get('domain', f'{cell_name}.noip.me'))
return self._caddyfile_http01(host, installed_services, core_routes)
# Fallback to lan so we always emit a valid Caddyfile.
@@ -142,20 +196,86 @@ class CaddyManager(BaseServiceManager):
# ── per-mode generators ───────────────────────────────────────────────
@staticmethod
def _global_acme_block(email: Optional[str]) -> str:
def _global_acme_block(self, email: Optional[str]) -> str:
"""Return the ``{ ... }`` global block for an ACME-enabled mode."""
lines = ["{"]
# Bind admin API on all interfaces so cell-api can reach cell-caddy
# across the Docker bridge (default 127.0.0.1 is unreachable cross-container).
lines.append(" admin 0.0.0.0:2019")
log_block = self._global_log_block()
if log_block:
lines.append(log_block)
if email:
lines.append(f" email {email}")
# Always allow tests to override the ACME directory via env var.
lines.append(" acme_ca {$ACME_CA_URL}")
# Only write acme_ca when a URL is configured — an empty ACME_CA_URL
# causes Caddy to reject the Caddyfile with "wrong argument count".
# When absent, Caddy defaults to Let's Encrypt production.
acme_ca_url = os.environ.get('ACME_CA_URL', '').strip()
if acme_ca_url:
lines.append(f" acme_ca {acme_ca_url}")
lines.append("}")
return "\n".join(lines)
def _build_registry_service_routes(self, domain: str) -> str:
"""Build named-matcher + handle blocks from the service registry.
When no registry is wired or the registry returns nothing, only the
api block is emitted (api is always infrastructure, not delegated to
the registry).
"""
routes: List[Dict] = []
if self._service_registry is not None:
try:
routes = self._service_registry.get_caddy_routes()
except Exception as exc:
logger.warning('_build_registry_service_routes: registry error: %s', exc)
# Pre-seed with reserved names so no registry entry can squat them.
seen_matchers: set = {'api', 'webui'}
blocks: List[str] = []
for route in routes:
primary_sub = route['subdomain']
backend = route['backend']
extra_subs: List[str] = route.get('extra_subdomains') or []
extra_backends: Dict[str, str] = route.get('extra_backends') or {}
if primary_sub in seen_matchers:
logger.warning('Caddy: skipping duplicate/reserved matcher %r', primary_sub)
continue
seen_matchers.add(primary_sub)
# Subdomains that share the primary backend go in one matcher block.
shared = [primary_sub] + [s for s in extra_subs if s not in extra_backends]
host_list = ' '.join(f'{s}.{domain}' for s in shared)
blocks.append(
f' @{primary_sub} host {host_list}\n'
f' handle @{primary_sub} {{\n'
f' reverse_proxy {backend}\n'
f' }}'
)
# Extra subdomains with their own backends each get their own block.
for sub, sub_backend in extra_backends.items():
if sub in seen_matchers:
logger.warning('Caddy: skipping duplicate/reserved matcher %r', sub)
continue
seen_matchers.add(sub)
blocks.append(
f' @{sub} host {sub}.{domain}\n'
f' handle @{sub} {{\n'
f' reverse_proxy {sub_backend}\n'
f' }}'
)
# The api subdomain is always infrastructure — not delegated to the registry.
blocks.append(
f' @api host api.{domain}\n'
f' handle @api {{\n'
f' reverse_proxy cell-api:3000\n'
f' }}'
)
return '\n'.join(blocks)
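To see what the registry routes turn into, the sketch below reproduces the matcher logic as a hypothetical free function, `route_blocks`, without the trailing `api` block or logging. The service names, backends, and domain in the usage are invented for illustration.

```python
from typing import Dict, List


def route_blocks(routes: List[Dict], domain: str) -> str:
    """Emit one named matcher + handle block per distinct subdomain group."""
    seen = {'api', 'webui'}  # reserved names a registry entry must not claim
    blocks: List[str] = []

    def emit(name: str, hosts: str, backend: str) -> None:
        blocks.append(
            f'@{name} host {hosts}\n'
            f'handle @{name} {{\n'
            f'    reverse_proxy {backend}\n'
            f'}}'
        )

    for route in routes:
        primary = route['subdomain']
        if primary in seen:
            continue
        seen.add(primary)
        extra_backends: Dict[str, str] = route.get('extra_backends') or {}
        extra_subs: List[str] = route.get('extra_subdomains') or []
        # Extra subdomains without their own backend share the primary block.
        shared = [primary] + [s for s in extra_subs if s not in extra_backends]
        emit(primary, ' '.join(f'{s}.{domain}' for s in shared), route['backend'])
        for sub, backend in extra_backends.items():
            if sub in seen:
                continue
            seen.add(sub)
            emit(sub, f'{sub}.{domain}', backend)
    return '\n'.join(blocks)
```

Pre-seeding `seen` with the reserved names is what stops a registry entry from shadowing the infrastructure routes.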
@staticmethod
def _indent_routes(routes: str, spaces: int = 4) -> str:
"""Indent a multi-line route block by ``spaces`` columns."""
@@ -175,22 +295,38 @@ class CaddyManager(BaseServiceManager):
chunks.append(route.strip("\n"))
return "\n".join(chunks)
def _tls_cert_pair(self) -> tuple:
"""Return (cert_path, key_path) as seen inside the Caddy container.
Uses the custom-uploaded cert when one is installed, otherwise falls
back to the internal-CA cert that the VaultManager writes.
"""
ident = (self.config_manager.get_identity() if self.config_manager else {}) or {}
if ident.get('tls', {}).get('cert_type') == 'custom':
return _CADDY_CUSTOM_CERT, _CADDY_CUSTOM_KEY
return _CADDY_INTERNAL_CERT, _CADDY_INTERNAL_KEY
def _caddyfile_lan(self, cell_name: str,
service_routes: str, core_routes: str) -> str:
service_routes: str, core_routes: str,
cert_path: str = _CADDY_INTERNAL_CERT,
key_path: str = _CADDY_INTERNAL_KEY) -> str:
"""LAN mode: HTTP only + internal-CA TLS, no ACME."""
body = []
if service_routes:
body.append(self._indent_routes(service_routes))
body.append(core_routes)
inner = "\n".join(body)
log_block = self._global_log_block()
log_line = (log_block + "\n") if log_block else ""
return (
"{\n"
" admin 0.0.0.0:2019\n"
f"{log_line}"
" auto_https off\n"
"}\n"
"\n"
f"http://{cell_name}.cell, http://172.20.0.2:80 {{\n"
" tls /etc/caddy/internal/cert.pem /etc/caddy/internal/key.pem\n"
f" tls {cert_path} {key_path}\n"
f"{inner}\n"
"}\n"
)
@@ -198,20 +334,49 @@ class CaddyManager(BaseServiceManager):
def _caddyfile_pic_ngo(self, cell_name: str,
service_routes: str, core_routes: str) -> str:
"""pic_ngo mode: wildcard DNS-01 via the pic_ngo plugin."""
body = []
domain = f"{cell_name}.pic.ngo"
body = [self._build_registry_service_routes(domain)]
if service_routes:
body.append(self._indent_routes(service_routes))
body.append(core_routes)
inner = "\n".join(body)
email = f"admin@{cell_name}.pic.ngo"
email = f"admin@{domain}"
# Resolve credentials at write time — Caddy runs in its own container
# and does not inherit the API's environment variables, so we embed the
# actual values instead of {$VAR} placeholders.
# Token is read from data/api/ddns_token (not cell_config.json).
ddns_cfg = self.config_manager.configs.get('ddns', {})
if hasattr(self.config_manager, 'get_ddns_token'):
ddns_token = self.config_manager.get_ddns_token() or ''
else:
ddns_token = (ddns_cfg.get('token') or '').strip()
if not ddns_token:
ddns_token = os.environ.get('DDNS_TOKEN', '').strip()
_raw_api = (os.environ.get('DDNS_URL') or ddns_cfg.get('url') or 'https://ddns.pic.ngo').strip()
# Strip legacy /api/v1 suffix — the pic_ngo plugin appends /api/v1 itself.
ddns_api = _raw_api.rstrip('/').removesuffix('/api/v1')
# No token yet (fresh install, pre-registration) — Caddy would reject a
# bare `token` keyword with no value. Fall back to LAN mode so Caddy
# starts cleanly; the Caddyfile is regenerated once registration completes.
if not ddns_token:
logger.warning(
'pic_ngo mode configured but no DDNS token available; '
'falling back to lan mode until registration completes'
)
cert_path, key_path = self._tls_cert_pair()
return self._caddyfile_lan(cell_name, service_routes, core_routes,
cert_path, key_path)
return (
f"{self._global_acme_block(email)}\n"
"\n"
f"*.{cell_name}.pic.ngo, {cell_name}.pic.ngo {{\n"
f"*.{domain}, {domain} {{\n"
" tls {\n"
" dns pic_ngo {\n"
" token {$PIC_NGO_DDNS_TOKEN}\n"
" api_base_url {$PIC_NGO_DDNS_API}\n"
f" token {ddns_token}\n"
f" api_base_url {ddns_api}\n"
" }\n"
" }\n"
f"{inner}\n"
@@ -221,7 +386,7 @@ class CaddyManager(BaseServiceManager):
def _caddyfile_cloudflare(self, custom_domain: str,
service_routes: str, core_routes: str) -> str:
"""cloudflare mode: wildcard DNS-01 via the cloudflare plugin."""
body = []
body = [self._build_registry_service_routes(custom_domain)]
if service_routes:
body.append(self._indent_routes(service_routes))
body.append(core_routes)
@@ -240,7 +405,8 @@ class CaddyManager(BaseServiceManager):
def _caddyfile_duckdns(self, cell_name: str,
service_routes: str, core_routes: str) -> str:
"""duckdns mode: DNS-01 via the duckdns plugin."""
body = []
domain = f"{cell_name}.duckdns.org"
body = [self._build_registry_service_routes(domain)]
if service_routes:
body.append(self._indent_routes(service_routes))
body.append(core_routes)
@@ -248,7 +414,7 @@ class CaddyManager(BaseServiceManager):
return (
f"{self._global_acme_block(None)}\n"
"\n"
f"*.{cell_name}.duckdns.org {{\n"
f"*.{domain} {{\n"
" tls {\n"
" dns duckdns {$DUCKDNS_TOKEN}\n"
" }\n"
@@ -260,23 +426,29 @@ class CaddyManager(BaseServiceManager):
installed_services: List[Dict[str, Any]],
core_routes: str) -> str:
"""http01 mode: no wildcard. Each service gets its own block."""
# Main host block — only the core routes (api + webui). Service
# routes that could otherwise be served as path-prefixes are NOT
# placed here because in http01 mode each service is intended to
# live on its own subdomain (otherwise it could also use a path
# prefix here, but the spec calls for separate blocks).
# Main host block — only the core routes (api + webui).
out = [self._global_acme_block('{$ACME_EMAIL}'), ""]
out.append(f"{host} {{")
out.append(core_routes)
out.append("}")
# One block per installed service that has a caddy_route.
# Build (subdomain, backend) pairs from registry when available.
_core_services = self._http01_service_pairs()
for subdomain, backend in _core_services:
out.append("")
out.append(f"{subdomain}.{host} {{")
out.append(f" reverse_proxy {backend}")
out.append("}")
# One block per installed (store plugin) service that has a caddy_route,
# skipping any name that conflicts with a core service.
_core_names = {s for s, _ in _core_services}
for svc in installed_services or []:
if not svc:
continue
route = svc.get('caddy_route')
name = svc.get('name') or svc.get('subdomain')
if not route or not name:
if not route or not name or name in _core_names:
continue
out.append("")
out.append(f"{name}.{host} {{")
@@ -284,6 +456,24 @@ class CaddyManager(BaseServiceManager):
out.append("}")
return "\n".join(out) + "\n"
def _http01_service_pairs(self) -> List[tuple]:
"""Return (subdomain, backend) pairs for http01 per-host blocks."""
pairs: List[tuple] = []
if self._service_registry is not None:
try:
for route in self._service_registry.get_caddy_routes():
pairs.append((route['subdomain'], route['backend']))
extra_subs: List[str] = route.get('extra_subdomains') or []
extra_backends: Dict[str, str] = route.get('extra_backends') or {}
for sub in extra_subs:
backend = extra_backends.get(sub, route['backend'])
pairs.append((sub, backend))
except Exception as exc:
logger.warning('_http01_service_pairs: registry error: %s', exc)
pairs = []
pairs.append(('api', 'cell-api:3000'))
return pairs
# ── filesystem + admin-API operations ─────────────────────────────────
def write_caddyfile(self, caddyfile_content: str) -> bool:
@@ -306,6 +496,10 @@ class CaddyManager(BaseServiceManager):
os.fsync(f.fileno())
except OSError:
pass
try:
os.chmod(self.caddyfile_path, 0o600)
except OSError:
pass
logger.info("Wrote Caddyfile to %s (%d bytes)",
self.caddyfile_path, len(caddyfile_content))
except Exception as e:
@@ -348,9 +542,14 @@ class CaddyManager(BaseServiceManager):
return False
def check_caddy_health(self) -> bool:
"""GET the Caddy admin API root. Returns True on HTTP 200."""
"""GET Caddy's config endpoint. Returns True on HTTP 200.
Caddy's admin API has no root handler — GET / returns 404 even when
fully healthy. GET /config/ returns 200 + the running config JSON
whenever Caddy is up and serving.
"""
try:
resp = requests.get(CADDY_ADMIN_URL + "/", timeout=5)
resp = requests.get(CADDY_ADMIN_URL + "/config/", timeout=5)
except requests.RequestException as e:
logger.debug("Caddy health check error: %s", e)
return False
@@ -373,25 +572,266 @@ class CaddyManager(BaseServiceManager):
# ── certificate status ────────────────────────────────────────────────
_REGENERATE_DEBOUNCE = 5.0 # seconds
def regenerate_with_installed(self, installed_services: list) -> bool:
"""Regenerate Caddyfile with installed services and reload."""
"""Regenerate Caddyfile with installed services and reload.
Debounced: skips if called again within _REGENERATE_DEBOUNCE seconds.
This prevents two simultaneous ACME orders when IDENTITY_CHANGED fires
from multiple sources (e.g. wizard completion + heartbeat re-registration)
within a short window.
"""
now = _time.monotonic()
with self._regenerate_lock:
if now - self._last_regenerate_at < self._REGENERATE_DEBOUNCE:
logger.debug("caddy regenerate_with_installed: skipped (debounce)")
return True
self._last_regenerate_at = now
identity = self.config_manager.get_identity()
content = self.generate_caddyfile(identity, installed_services)
return self.write_caddyfile(content)
def get_cert_status(self) -> Dict[str, Any]:
"""Return TLS cert status from identity['tls'] if present."""
default = {'status': 'unknown', 'expiry': None, 'days_remaining': None}
if not self.config_manager:
return default
def _on_identity_changed(self, event) -> None:
"""Regenerate and reload the Caddyfile when cell identity changes."""
try:
ident = self.config_manager.get_identity() or {}
except Exception as e:
logger.error("get_cert_status: failed to read identity: %s", e)
return default
self.regenerate_with_installed([])
except Exception as exc:
self.logger.warning('caddy_manager identity_changed handler failed: %s', exc)
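The debounce used by `regenerate_with_installed` can be sketched as a small reusable class. `Debouncer` and its injectable `clock` are hypothetical, added here so the behavior can be tested without sleeping. The sketch starts from `None` rather than `0.0` so it does not depend on where the monotonic clock happens to start.

```python
import threading
import time
from typing import Callable, Optional


class Debouncer:
    """Allow an action at most once per `interval` seconds."""

    def __init__(self, interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = interval
        self._clock = clock
        self._last: Optional[float] = None
        self._lock = threading.Lock()

    def should_run(self) -> bool:
        now = self._clock()
        with self._lock:
            if self._last is not None and now - self._last < self._interval:
                return False  # called again inside the window: skip
            self._last = now
            return True
```

Only the check and the timestamp update sit under the lock, so a slow regeneration does not block other callers from being told to skip.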
# ── Certificate status ────────────────────────────────────────────────
def get_cert_status(self) -> Dict[str, Any]:
"""Return TLS cert status enriched with identity context (cached)."""
ident: Dict[str, Any] = {}
if self.config_manager:
try:
ident = self.config_manager.get_identity() or {}
except Exception as e:
logger.error("get_cert_status: failed to read identity: %s", e)
domain_mode = ident.get('domain_mode', 'lan')
tls = ident.get('tls') or {}
cert_type = tls.get('cert_type') or ('internal' if domain_mode == 'lan' else 'acme')
return {
'status': tls.get('status', 'unknown'),
'expiry': tls.get('expiry'),
'days_remaining': tls.get('days_remaining'),
'domain': self._domain_label(ident),
'domain_mode': domain_mode,
'cert_type': cert_type,
}
@staticmethod
def _domain_label(ident: Dict[str, Any]) -> Optional[str]:
"""Return a human-readable domain string for display in the UI."""
mode = ident.get('domain_mode', 'lan')
cell = ident.get('cell_name', '')
if mode == 'pic_ngo':
return f'*.{cell}.pic.ngo' if cell else None
if mode == 'cloudflare':
d = ident.get('domain_name') or ident.get('domain', '')
return f'*.{d}' if d else None
if mode == 'duckdns':
return f'*.{cell}.duckdns.org' if cell else None
if mode == 'http01':
return ident.get('domain_name') or ident.get('domain')
return None # lan
def get_cert_status_fresh(self, max_age_seconds: int = 300) -> Dict[str, Any]:
"""Return cert status, refreshing if the cached value is older than max_age_seconds."""
now = _time.monotonic()
if self._cert_refreshed_at is None or (now - self._cert_refreshed_at) > max_age_seconds:
self.refresh_cert_status()
return self.get_cert_status()
def refresh_cert_status(self) -> Dict[str, Any]:
"""Check TLS cert expiry via SSL and persist to identity['tls'].
For LAN mode (no ACME): immediately returns {'status': 'internal'}.
For ACME modes: opens an SSL connection to Caddy on port 443 and
reads the cert expiry from the TLS handshake. On any error (cert
not yet issued, network unreachable): returns {'status': 'unknown'}.
"""
identity = self.config_manager.get_identity() if self.config_manager else {}
domain_mode = (identity or {}).get('domain_mode', 'lan')
if domain_mode == 'lan':
status: Dict[str, Any] = {'status': 'internal', 'expiry': None, 'days_remaining': None}
else:
caddy_host = os.environ.get('CADDY_CERT_HOST', 'cell-caddy')
caddy_port = int(os.environ.get('CADDY_HTTPS_PORT', '443'))
# Use the effective domain as TLS SNI so Caddy serves the right
# certificate. Without this, Caddy receives SNI='cell-caddy' which
# matches no cert and the handshake returns nothing.
sni = None
if self.config_manager:
try:
sni = self.config_manager.get_effective_domain()
except Exception:
pass
result = self._check_cert_via_ssl(caddy_host, caddy_port, sni=sni)
status = result if result is not None else {
'status': 'unknown', 'expiry': None, 'days_remaining': None
}
if self.config_manager:
try:
self.config_manager.set_identity_field('tls', status)
except Exception as exc:
logger.warning('refresh_cert_status: failed to persist tls status: %s', exc)
self._cert_refreshed_at = _time.monotonic()
return status
@staticmethod
def _check_cert_via_ssl(hostname: str, port: int = 443, sni: str = None) -> Optional[Dict[str, Any]]:
"""Open an SSL connection and return cert expiry info, or None on failure.
Connect to hostname:port but present sni (if given) as the TLS server
name so Caddy returns the right certificate for the public domain.
"""
ctx = _ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = _ssl.CERT_NONE
try:
with _socket.create_connection((hostname, port), timeout=5) as raw:
with ctx.wrap_socket(raw, server_hostname=sni or hostname) as tls:
der = tls.getpeercert(binary_form=True)
if not der:
return None
from cryptography import x509
from cryptography.hazmat.backends import default_backend
cert = x509.load_der_x509_certificate(der, default_backend())
# Use not_valid_after_utc (cryptography ≥42) with fallback for older builds.
try:
expiry = cert.not_valid_after_utc
except AttributeError:
expiry = cert.not_valid_after.replace(tzinfo=_dt.timezone.utc) # type: ignore[attr-defined]
now = _dt.datetime.now(_dt.timezone.utc)
days = (expiry - now).days
return {
'status': 'valid' if days > 0 else 'expired',
'expiry': expiry.isoformat(),
'days_remaining': days,
}
except Exception:
return None
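The expiry arithmetic is worth checking at the boundaries. The sketch below isolates it in a hypothetical function, `expiry_status`, that takes `now` as a parameter so results are deterministic. Note that `timedelta.days` floors, so under the `days > 0` rule a certificate with less than 24 hours left yields 0 and is reported as `expired`.

```python
import datetime as dt
from typing import Any, Dict, Optional


def expiry_status(expiry: dt.datetime,
                  now: Optional[dt.datetime] = None) -> Dict[str, Any]:
    """Classify a timezone-aware expiry using the same rule as above."""
    now = now or dt.datetime.now(dt.timezone.utc)
    days = (expiry - now).days  # floors toward negative infinity
    return {
        'status': 'valid' if days > 0 else 'expired',
        'expiry': expiry.isoformat(),
        'days_remaining': days,
    }
```

A stricter alternative would compare `expiry > now` directly for the status and keep `days_remaining` purely for display; whether that is preferable depends on how the UI interprets the final day.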
# ── Active cert management ────────────────────────────────────────────
def renew_cert(self) -> Dict[str, Any]:
"""Regenerate the Caddyfile, reload Caddy, and trigger ACME cert renewal.
Regenerates first so a stale or broken on-disk Caddyfile never blocks
the reload. Returns immediately with status='pending'; the caller
polls GET /api/caddy/cert-status to track progress. Not applicable
to LAN mode — callers should use upload_custom_cert() instead.
"""
ident = (self.config_manager.get_identity() if self.config_manager else {}) or {}
domain_mode = ident.get('domain_mode', 'lan')
if domain_mode == 'lan':
return {
'ok': False,
'error': 'ACME renewal is not available in LAN mode. '
'Upload a custom certificate instead.',
}
# Regenerate → write → reload in one shot so the Caddyfile is always fresh.
if self.config_manager:
try:
ok = self.regenerate_with_installed([])
except Exception as exc:
logger.error('renew_cert: regenerate_with_installed failed: %s', exc)
ok = False
else:
ok = self.reload_caddy()
if not ok:
return {'ok': False, 'error': 'Caddy reload failed — check Caddy logs.'}
# Invalidate the cached status so the next poll triggers a fresh SSL check.
self._cert_refreshed_at = None
return {
'ok': True,
'status': 'pending',
'message': 'Renewal triggered. Certificate status will update within 60 s.',
}
def upload_custom_cert(self, cert_pem: str, key_pem: str) -> Dict[str, Any]:
"""Validate and install a custom TLS certificate.
Writes cert+key to the shared certs directory (visible to Caddy),
regenerates the Caddyfile to reference the new paths, and reloads.
Works for all domain modes — use this when you have a certificate
issued by your own CA or a commercial provider.
"""
cert_info = self._parse_pem_cert(cert_pem)
if cert_info is None:
return {'ok': False, 'error': 'Invalid certificate: could not parse PEM.'}
if not self._validate_key_pem(key_pem):
return {'ok': False, 'error': 'Invalid private key: expected PEM with PRIVATE KEY header.'}
try:
os.makedirs(CADDY_CERTS_DIR, exist_ok=True)
with open(os.path.join(CADDY_CERTS_DIR, 'cert.pem'), 'w') as fh:
fh.write(cert_pem)
with open(os.path.join(CADDY_CERTS_DIR, 'key.pem'), 'w') as fh:
fh.write(key_pem)
except OSError as exc:
logger.error('upload_custom_cert: write failed: %s', exc)
return {'ok': False, 'error': f'Failed to write cert files: {exc}'}
days = cert_info.get('days_remaining', 0)
tls_info: Dict[str, Any] = {
'status': 'valid' if days > 0 else 'expired',
'expiry': cert_info.get('expiry'),
'days_remaining': days,
'cert_type': 'custom',
}
if self.config_manager:
try:
self.config_manager.set_identity_field('tls', tls_info)
except Exception as exc:
logger.warning('upload_custom_cert: could not persist tls info: %s', exc)
# Regenerate Caddyfile so the tls directive references the new cert.
if self.config_manager:
try:
self.regenerate_with_installed([])
except Exception as exc:
logger.warning('upload_custom_cert: Caddyfile regeneration failed: %s', exc)
return {'ok': True, **tls_info}
@staticmethod
def _parse_pem_cert(cert_pem: str) -> Optional[Dict[str, Any]]:
"""Parse a PEM certificate and return expiry metadata, or None on error."""
try:
from cryptography import x509
cert_bytes = cert_pem.encode() if isinstance(cert_pem, str) else cert_pem
cert = x509.load_pem_x509_certificate(cert_bytes)
try:
expiry = cert.not_valid_after_utc
except AttributeError:
expiry = cert.not_valid_after.replace(tzinfo=_dt.timezone.utc) # type: ignore[attr-defined]
now = _dt.datetime.now(_dt.timezone.utc)
days = (expiry - now).days
return {
'expiry': expiry.isoformat(),
'days_remaining': days,
'subject': cert.subject.rfc4514_string(),
}
except Exception as exc:
logger.debug('_parse_pem_cert failed: %s', exc)
return None
@staticmethod
def _validate_key_pem(key_pem: str) -> bool:
"""Return True if key_pem contains a PEM-encoded private key block."""
return ('-----BEGIN' in key_pem
and 'PRIVATE KEY' in key_pem
and '-----END' in key_pem)
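The `'valid' if days > 0 else 'expired'` rule used in the certificate code above can be looked at in isolation. The following is a minimal sketch, not the project's implementation: `cert_status` and its explicit `now` argument are hypothetical names introduced here so the arithmetic is reproducible. It illustrates that `timedelta.days` is floored, so a certificate with less than 24 hours left reports `days_remaining == 0` and is classified as expired.

```python
import datetime as dt
from typing import Any, Dict


def cert_status(expiry: dt.datetime, now: dt.datetime) -> Dict[str, Any]:
    """Hypothetical helper mirroring the 'valid if days > 0' rule."""
    days = (expiry - now).days  # whole days only; partial days are floored
    return {
        'status': 'valid' if days > 0 else 'expired',
        'expiry': expiry.isoformat(),
        'days_remaining': days,
    }


# Fixed reference time so the results do not depend on the wall clock.
now = dt.datetime(2026, 6, 11, 12, 0, tzinfo=dt.timezone.utc)
soon = cert_status(now + dt.timedelta(hours=12), now)
later = cert_status(now + dt.timedelta(days=30, hours=1), now)
```

Here `soon` is reported as expired although twelve hours remain, while `later` is valid with 30 days remaining. Whether that last-day behaviour is intended is a design choice for the callers.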
+41
@@ -10,6 +10,7 @@ import subprocess
import logging
from datetime import datetime
from typing import Dict, List, Optional, Any
import bcrypt
from base_service_manager import BaseServiceManager
logger = logging.getLogger(__name__)
@@ -280,12 +281,51 @@ class CalendarManager(BaseServiceManager):
user_dir = os.path.join(self.calendar_data_dir, 'users', username)
self.safe_makedirs(user_dir)
# Write bcrypt entry to Radicale htpasswd (non-fatal if service not installed)
self._write_radicale_htpasswd(username, password)
logger.info(f"Created calendar user: {username}")
return True
except Exception as e:
logger.error(f"Failed to create calendar user {username}: {e}")
return False
def _radicale_htpasswd_path(self) -> str:
return os.path.join(self.data_dir, 'services', 'calendar', 'config', 'users')
def _write_radicale_htpasswd(self, username: str, password: str) -> None:
htpasswd = self._radicale_htpasswd_path()
config_dir = os.path.dirname(htpasswd)
if not os.path.isdir(config_dir):
return
try:
raw = bcrypt.hashpw(password.encode('utf-8'), bcrypt.gensalt()).decode('utf-8')
if raw.startswith('$2b$'):
raw = '$2y$' + raw[4:]
lines = []
if os.path.exists(htpasswd):
with open(htpasswd) as f:
lines = f.readlines()
lines = [l for l in lines if not l.startswith(f'{username}:')]
lines.append(f'{username}:{raw}\n')
with open(htpasswd, 'w') as f:
f.writelines(lines)
except Exception as e:
logger.warning('Failed to write Radicale htpasswd for %s: %s', username, e)
def _remove_radicale_htpasswd(self, username: str) -> None:
htpasswd = self._radicale_htpasswd_path()
if not os.path.exists(htpasswd):
return
try:
with open(htpasswd) as f:
lines = f.readlines()
lines = [l for l in lines if not l.startswith(f'{username}:')]
with open(htpasswd, 'w') as f:
f.writelines(lines)
except Exception as e:
logger.warning('Failed to remove Radicale htpasswd for %s: %s', username, e)
def delete_calendar_user(self, username: str) -> bool:
"""Delete a calendar user"""
try:
@@ -306,6 +346,7 @@ class CalendarManager(BaseServiceManager):
import shutil
shutil.rmtree(user_dir)
self._remove_radicale_htpasswd(username)
logger.info(f"Deleted calendar user: {username}")
return True
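The htpasswd handling added in this diff boils down to two list operations: drop any line that starts with `username:`, then optionally append a fresh entry. The sketch below isolates that logic so it can be checked without touching the filesystem or importing `bcrypt`. The function names `to_htpasswd_hash`, `upsert_entry` and `remove_entry` are hypothetical, and the hash strings are placeholders rather than real bcrypt output.

```python
from typing import List


def to_htpasswd_hash(bcrypt_hash: str) -> str:
    """Swap the $2b$ prefix for $2y$, as the diff above does."""
    if bcrypt_hash.startswith('$2b$'):
        return '$2y$' + bcrypt_hash[4:]
    return bcrypt_hash


def remove_entry(lines: List[str], username: str) -> List[str]:
    """Keep every line that does not belong to username."""
    prefix = f'{username}:'  # the colon prevents 'alice' matching 'alicia'
    return [line for line in lines if not line.startswith(prefix)]


def upsert_entry(lines: List[str], username: str, hashed: str) -> List[str]:
    """Replace (or add) the entry for username; the new line goes last."""
    kept = remove_entry(lines, username)
    kept.append(f'{username}:{hashed}\n')
    return kept


existing = ['alice:$2y$12$old\n', 'alicia:$2y$12$other\n']
updated = upsert_entry(existing, 'alice', to_htpasswd_hash('$2b$12$new'))
```

Matching on the `username:` prefix rather than the bare name is what keeps `alicia` intact when `alice` is rewritten.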
+2 -2
@@ -426,7 +426,7 @@ class CellLinkManager:
try:
from app import config_manager
identity = config_manager.configs.get('_identity', {})
own_domain = identity.get('domain', os.environ.get('CELL_DOMAIN', ''))
own_domain = identity.get('domain_name') or identity.get('domain', os.environ.get('CELL_DOMAIN', ''))
if own_domain and remote_domain == own_domain:
raise ValueError(
f"Domain {remote_domain!r} is the same as this cell's own domain — "
@@ -466,7 +466,7 @@ class CellLinkManager:
identity = self._local_identity()
from app import config_manager
id_cfg = config_manager.configs.get('_identity', {})
own_domain = id_cfg.get('domain', os.environ.get('CELL_DOMAIN', 'cell'))
own_domain = id_cfg.get('domain_name') or id_cfg.get('domain', os.environ.get('CELL_DOMAIN', 'cell'))
own_invite = self.generate_invite(identity['cell_name'], own_domain)
except Exception as e:
return {'ok': False, 'error': f'could not build own invite: {e}'}
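Both hunks above switch to the same lookup order: `domain_name` first, then the legacy `domain` key, then the `CELL_DOMAIN` environment variable. A minimal sketch of that order follows, assuming plain mappings in place of `config_manager.configs` and `os.environ`; the helper name `own_domain` is hypothetical.

```python
from typing import Any, Mapping


def own_domain(identity: Mapping[str, Any],
               env: Mapping[str, str],
               default: str = '') -> str:
    """Resolve this cell's domain using the fallback order from the diff."""
    # 'or' means an empty or missing domain_name falls through to 'domain'.
    return identity.get('domain_name') or identity.get(
        'domain', env.get('CELL_DOMAIN', default))


preferred = own_domain({'domain_name': 'a.pic.ngo', 'domain': 'old.example'}, {})
legacy = own_domain({'domain_name': '', 'domain': 'old.example'}, {})
from_env = own_domain({}, {'CELL_DOMAIN': 'env.example'})
fallback = own_domain({}, {}, default='cell')
```

Note that the environment variable is only consulted when the `domain` key is absent, not when it is present but empty.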
-83
@@ -1,83 +0,0 @@
#!/usr/bin/env python3
"""
Configuration for Personal Internet Cell
"""
# Development mode - set to True for development, False for production
DEVELOPMENT_MODE = True
# Service configuration
SERVICES = {
'network': {
'enabled': True,
'development_status': {
'dns_running': True,
'dhcp_running': True,
'ntp_running': True,
'running': True,
'status': 'online'
}
},
'wireguard': {
'enabled': True,
'development_status': {
'running': True,
'status': 'online',
'interface': 'wg0',
'peers_count': 1,
'total_traffic': {'bytes_sent': 1024, 'bytes_received': 2048}
}
},
'email': {
'enabled': True,
'development_status': {
'running': True,
'status': 'online',
'smtp_running': True,
'imap_running': True,
'users_count': 0,
'domain': 'cell.local'
}
},
'calendar': {
'enabled': True,
'development_status': {
'running': True,
'status': 'online',
'users_count': 0,
'calendars_count': 0,
'events_count': 0
}
},
'files': {
'enabled': True,
'development_status': {
'running': True,
'status': 'online',
'webdav_status': {'running': True, 'port': 8080},
'users_count': 0,
'total_storage_used': {'bytes': 0, 'human_readable': '0 B'}
}
},
'routing': {
'enabled': True,
'development_status': {
'running': True,
'status': 'online',
'nat_rules_count': 1,
'peer_routes_count': 0,
'firewall_rules_count': 0,
'exit_nodes_count': 0
}
},
'vault': {
'enabled': True,
'development_status': {
'running': True,
'status': 'online',
'certificates_count': 1,
'secrets_count': 0,
'trusted_keys_count': 0
}
}
}
+844 -50
File diff suppressed because it is too large
+13
@@ -0,0 +1,13 @@
"""
constants: shared project-wide constants.
Single source of truth for values that multiple managers must agree on.
"""
# Core PIC infrastructure subdomains — never allow store services to hijack these.
# 'mail', 'calendar', 'files', 'webdav', 'webmail' are intentionally absent:
# they belong to official PIC store services and must be claimable by them.
RESERVED_SUBDOMAINS = frozenset({
'api', 'webui', 'admin', 'www', 'ns1', 'ns2',
'git', 'registry', 'install',
})
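A caller that validates a store service's requested subdomain against this set might look like the sketch below. The set is copied from the file above; the `is_reserved` helper and its case and whitespace normalisation are assumptions for illustration, not something this diff shows.

```python
RESERVED_SUBDOMAINS = frozenset({
    'api', 'webui', 'admin', 'www', 'ns1', 'ns2',
    'git', 'registry', 'install',
})


def is_reserved(subdomain: str) -> bool:
    """Hypothetical check: True when a service may not claim this subdomain."""
    return subdomain.strip().lower() in RESERVED_SUBDOMAINS


blocked = is_reserved('API')      # core infrastructure name
allowed = not is_reserved('mail')  # intentionally claimable by store services
```

Using a `frozenset` keeps the membership test constant-time and prevents accidental mutation by importing modules.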
+306 -101
@@ -7,16 +7,18 @@ cell's public IP registered under its chosen domain.
Supported providers:
pic_ngo pic.ngo DDNS service (primary / Phase 3 wiring)
cloudflare Cloudflare API v4 (stub; full impl in Phase 3b)
duckdns DuckDNS (stub; no DNS-01 support)
noip No-IP (stub)
freedns FreeDNS (stub)
cloudflare Cloudflare API v4
duckdns DuckDNS (no DNS-01 support)
'noip' and 'freedns' are NOT yet supported; get_provider() rejects them
with a DDNSError so misconfiguration fails loudly instead of at update time.
The manager runs a background heartbeat thread that re-publishes the public IP
every 5 minutes, skipping the call when the IP has not changed.
"""
import logging
import os
import threading
import time
from typing import Any, Dict, Optional
@@ -36,6 +38,10 @@ class DDNSError(Exception):
"""Raised when a DDNS provider returns an error response."""
class DDNSTokenExpired(DDNSError):
"""Raised when the DDNS service rejects the token (401) — usually after a DB reset."""
# ---------------------------------------------------------------------------
# Provider base class
# ---------------------------------------------------------------------------
@@ -68,13 +74,25 @@ class PicNgoDDNS(DDNSProvider):
DEFAULT_API_BASE = 'https://ddns.pic.ngo'
TIMEOUT = 10
def __init__(self, api_base_url: Optional[str] = None):
def __init__(self, api_base_url: Optional[str] = None, totp_secret: Optional[str] = None):
self.api_base_url = (api_base_url or self.DEFAULT_API_BASE).rstrip('/')
self._totp_secret = totp_secret or ''
# ------------------------------------------------------------------
# Internal helpers
# ------------------------------------------------------------------
def _otp_header(self) -> Dict[str, str]:
"""Generate a fresh TOTP header for /register calls."""
if not self._totp_secret:
return {}
try:
import pyotp
return {'X-Register-OTP': pyotp.TOTP(self._totp_secret).now()}
except ImportError:
logger.warning("pyotp not installed — X-Register-OTP header omitted")
return {}
def _headers(self, token: Optional[str] = None) -> Dict[str, str]:
h: Dict[str, str] = {'Content-Type': 'application/json'}
if token:
@@ -83,6 +101,10 @@ class PicNgoDDNS(DDNSProvider):
def _raise_for_status(self, response: requests.Response, action: str):
if not response.ok:
if response.status_code == 401:
raise DDNSTokenExpired(
f"PicNgoDDNS {action} rejected token: HTTP 401 — {response.text}"
)
raise DDNSError(
f"PicNgoDDNS {action} failed: HTTP {response.status_code} - {response.text}"
)
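The point of making `DDNSTokenExpired` a subclass of `DDNSError` is that callers can treat a 401 specially while every other failure still lands in the generic handler. The sketch below shows that pattern with a plain status code in place of a `requests.Response`. The function names `raise_for_status` and `outcome` are hypothetical, and `status_code < 400` stands in for `Response.ok`.

```python
class DDNSError(Exception):
    """Raised when a DDNS provider returns an error response."""


class DDNSTokenExpired(DDNSError):
    """Raised when the DDNS service rejects the token (401)."""


def raise_for_status(status_code: int, body: str, action: str) -> None:
    """Map an HTTP status to the exception hierarchy from the diff."""
    if status_code < 400:  # same threshold requests uses for Response.ok
        return
    if status_code == 401:
        raise DDNSTokenExpired(f'{action} rejected token: HTTP 401: {body}')
    raise DDNSError(f'{action} failed: HTTP {status_code}: {body}')


def outcome(status_code: int) -> str:
    """Show the handler order: the subclass must be caught first."""
    try:
        raise_for_status(status_code, 'body', 'update')
        return 'ok'
    except DDNSTokenExpired:
        return 'reregister'
    except DDNSError:
        return 'error'
```

If the two `except` clauses were swapped, the generic handler would swallow the 401 and the re-registration path would never run.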
@@ -91,27 +113,38 @@ class PicNgoDDNS(DDNSProvider):
# Public interface
# ------------------------------------------------------------------
def release(self, token: str) -> bool:
"""DELETE /api/v1/registration — release the subdomain owned by token."""
url = f'{self.api_base_url}/api/v1/registration'
resp = requests.delete(url, json={'token': token},
headers=self._headers(), timeout=self.TIMEOUT)
self._raise_for_status(resp, 'release')
return True
def register(self, name: str, ip: str) -> dict:
"""POST /api/v1/register — register subdomain, returns token + subdomain."""
url = f'{self.api_base_url}/api/v1/register'
payload = {'name': name, 'ip': ip}
resp = requests.post(url, json=payload, headers=self._headers(), timeout=self.TIMEOUT)
headers = {**self._headers(), **self._otp_header()}
resp = requests.post(url, json=payload, headers=headers, timeout=self.TIMEOUT)
self._raise_for_status(resp, 'register')
return resp.json()
def update(self, token: str, ip: str) -> bool:
"""PUT /api/v1/update — update A record."""
url = f'{self.api_base_url}/api/v1/update'
payload = {'ip': ip}
# DDNS server validates token from request body, not Authorization header
payload = {'ip': ip, 'token': token}
resp = requests.put(url, json=payload,
headers=self._headers(token), timeout=self.TIMEOUT)
headers=self._headers(), timeout=self.TIMEOUT)
self._raise_for_status(resp, 'update')
return True
def dns_challenge_create(self, token: str, fqdn: str, value: str) -> bool:
"""POST /api/v1/dns-challenge — create DNS-01 TXT record."""
url = f'{self.api_base_url}/api/v1/dns-challenge'
payload = {'fqdn': fqdn, 'value': value}
# DDNS server authenticates the token from the request body, not the header
payload = {'fqdn': fqdn, 'value': value, 'token': token}
resp = requests.post(url, json=payload,
headers=self._headers(token), timeout=self.TIMEOUT)
self._raise_for_status(resp, 'dns_challenge_create')
@@ -120,7 +153,8 @@ class PicNgoDDNS(DDNSProvider):
def dns_challenge_delete(self, token: str, fqdn: str) -> bool:
"""DELETE /api/v1/dns-challenge — remove DNS-01 TXT record."""
url = f'{self.api_base_url}/api/v1/dns-challenge'
payload = {'fqdn': fqdn}
# DDNS server authenticates the token from the request body, not the header
payload = {'fqdn': fqdn, 'token': token}
resp = requests.delete(url, json=payload,
headers=self._headers(token), timeout=self.TIMEOUT)
self._raise_for_status(resp, 'dns_challenge_delete')
@@ -128,18 +162,19 @@ class PicNgoDDNS(DDNSProvider):
# ---------------------------------------------------------------------------
# Cloudflare provider (stub)
# Cloudflare provider
# ---------------------------------------------------------------------------
class CloudflareDDNS(DDNSProvider):
"""DDNS via Cloudflare API v4. Stub — full impl in Phase 3b."""
"""DDNS via Cloudflare API v4."""
API_BASE = 'https://api.cloudflare.com/client/v4'
TIMEOUT = 10
def __init__(self, api_token: str, zone_id: str):
def __init__(self, api_token: str, zone_id: str, domain: str = ''):
self.api_token = api_token
self.zone_id = zone_id
self.domain = domain
def _headers(self) -> Dict[str, str]:
return {
@@ -147,16 +182,92 @@ class CloudflareDDNS(DDNSProvider):
'Content-Type': 'application/json',
}
def _find_record_ids(self, record_type: str, name: str) -> list:
"""Return the ids of DNS records matching type+name, or [] when none exist."""
url = f'{self.API_BASE}/zones/{self.zone_id}/dns_records'
resp = requests.get(url, params={'type': record_type, 'name': name},
headers=self._headers(), timeout=self.TIMEOUT)
if not resp.ok:
raise DDNSError(
f"CloudflareDDNS record lookup failed: HTTP {resp.status_code} - {resp.text}"
)
records = (resp.json() or {}).get('result') or []
return [r['id'] for r in records if r.get('id')]
def register(self, name: str, ip: str) -> dict:
# Cloudflare doesn't have a registration step — return stub data.
return {'token': self.api_token, 'subdomain': name}
def update(self, token: str, ip: str) -> bool:
"""PATCH /zones/{zone_id}/dns_records — update A record."""
url = f'{self.API_BASE}/zones/{self.zone_id}/dns_records'
resp = requests.patch(url, json={'ip': ip}, headers=self._headers(),
"""Update the A record: look up its record id, then PATCH that record."""
if not self.domain:
logger.error("CloudflareDDNS.update: no domain configured")
return False
try:
record_ids = self._find_record_ids('A', self.domain)
except DDNSError as exc:
logger.error("CloudflareDDNS.update: %s", exc)
return False
if not record_ids:
logger.error("CloudflareDDNS.update: no A record found for %s in zone %s",
self.domain, self.zone_id)
return False
url = f'{self.API_BASE}/zones/{self.zone_id}/dns_records/{record_ids[0]}'
payload = {'type': 'A', 'name': self.domain, 'content': ip}
resp = requests.patch(url, json=payload, headers=self._headers(),
timeout=self.TIMEOUT)
return resp.ok
if not resp.ok:
logger.error("CloudflareDDNS.update: PATCH failed: HTTP %s - %s",
resp.status_code, resp.text)
return False
return True
def _ensure_a_record(self, name: str, ip: str) -> bool:
"""Ensure a single A record name → ip exists: POST when missing, PATCH when present."""
try:
record_ids = self._find_record_ids('A', name)
except DDNSError as exc:
logger.error("CloudflareDDNS.sync_service_records: lookup failed for %s: %s", name, exc)
return False
if record_ids:
url = f'{self.API_BASE}/zones/{self.zone_id}/dns_records/{record_ids[0]}'
payload = {'type': 'A', 'name': name, 'content': ip}
resp = requests.patch(url, json=payload, headers=self._headers(),
timeout=self.TIMEOUT)
else:
url = f'{self.API_BASE}/zones/{self.zone_id}/dns_records'
payload = {'type': 'A', 'name': name, 'content': ip, 'ttl': 120}
resp = requests.post(url, json=payload, headers=self._headers(),
timeout=self.TIMEOUT)
if not resp.ok:
logger.error("CloudflareDDNS.sync_service_records: write failed for %s: HTTP %s - %s",
name, resp.status_code, resp.text)
return False
return True
def sync_service_records(self, subdomains, ip: str) -> dict:
"""Ensure the apex A record and one A record per service subdomain exist
and point at ip. Creates missing records (POST) and updates existing ones
(PATCH). Returns {'success': bool, 'synced': [...], 'failed': [...]}.
subdomains is an iterable of fully-qualified record names (e.g.
'mail.cell.example.com'). The apex (self.domain) is always synced.
"""
if not self.domain:
logger.error("CloudflareDDNS.sync_service_records: no domain configured")
return {'success': False, 'synced': [], 'failed': []}
names = [self.domain]
for sub in subdomains or []:
if sub and sub not in names:
names.append(sub)
synced = []
failed = []
for name in names:
if self._ensure_a_record(name, ip):
synced.append(name)
else:
failed.append(name)
return {'success': not failed, 'synced': synced, 'failed': failed}
def dns_challenge_create(self, token: str, fqdn: str, value: str) -> bool:
"""POST TXT record for DNS-01 challenge."""
@@ -167,9 +278,24 @@ class CloudflareDDNS(DDNSProvider):
return resp.ok
def dns_challenge_delete(self, token: str, fqdn: str) -> bool:
"""DELETE TXT record for DNS-01 challenge."""
# A real impl would look up the record ID first; stub returns True.
return True
"""Delete the DNS-01 TXT record(s): look up their ids, then DELETE each."""
try:
record_ids = self._find_record_ids('TXT', fqdn)
except DDNSError as exc:
logger.error("CloudflareDDNS.dns_challenge_delete: %s", exc)
return False
if not record_ids:
logger.warning("CloudflareDDNS.dns_challenge_delete: no TXT record found for %s", fqdn)
return False
all_ok = True
for record_id in record_ids:
url = f'{self.API_BASE}/zones/{self.zone_id}/dns_records/{record_id}'
resp = requests.delete(url, headers=self._headers(), timeout=self.TIMEOUT)
if not resp.ok:
logger.error("CloudflareDDNS.dns_challenge_delete: DELETE %s failed: HTTP %s - %s",
record_id, resp.status_code, resp.text)
all_ok = False
return all_ok
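The name handling in `CloudflareDDNS.sync_service_records` (apex first, blanks skipped, duplicates dropped, per-name success collected) can be exercised without any network access. In the sketch below the Cloudflare call is replaced by an injected `ensure` callable, which is an assumption made purely for testability; the function name `sync_names` is hypothetical as well.

```python
from typing import Callable, Dict, Iterable, List, Optional, Union

Result = Dict[str, Union[bool, List[str]]]


def sync_names(apex: str,
               subdomains: Optional[Iterable[str]],
               ensure: Callable[[str], bool]) -> Result:
    """Deduplicate record names and collect per-name outcomes."""
    names = [apex]  # the apex record is always synced first
    for sub in subdomains or []:
        if sub and sub not in names:
            names.append(sub)
    synced: List[str] = []
    failed: List[str] = []
    for name in names:
        (synced if ensure(name) else failed).append(name)
    return {'success': not failed, 'synced': synced, 'failed': failed}


result = sync_names(
    'cell.example.com',
    ['mail.cell.example.com', '', 'mail.cell.example.com', 'bad.cell.example.com'],
    ensure=lambda name: not name.startswith('bad.'),  # stand-in for the API call
)
```

A single failing name flips `success` to `False` but does not stop the remaining names from being attempted, which matches the loop in the diff.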
# ---------------------------------------------------------------------------
@@ -201,46 +327,6 @@ class DuckDNSDDNS(DDNSProvider):
raise NotImplementedError("DuckDNS does not support programmatic TXT record deletion")
# ---------------------------------------------------------------------------
# No-IP provider (stub)
# ---------------------------------------------------------------------------
class NoIPDDNS(DDNSProvider):
"""DDNS via No-IP. Stub — DNS-01 not supported."""
def register(self, name: str, ip: str) -> dict:
raise NotImplementedError
def update(self, token: str, ip: str) -> bool:
raise NotImplementedError
def dns_challenge_create(self, token: str, fqdn: str, value: str) -> bool:
raise NotImplementedError
def dns_challenge_delete(self, token: str, fqdn: str) -> bool:
raise NotImplementedError
# ---------------------------------------------------------------------------
# FreeDNS provider (stub)
# ---------------------------------------------------------------------------
class FreeDNSDDNS(DDNSProvider):
"""DDNS via FreeDNS. Stub — DNS-01 not supported."""
def register(self, name: str, ip: str) -> dict:
raise NotImplementedError
def update(self, token: str, ip: str) -> bool:
raise NotImplementedError
def dns_challenge_create(self, token: str, fqdn: str, value: str) -> bool:
raise NotImplementedError
def dns_challenge_delete(self, token: str, fqdn: str) -> bool:
raise NotImplementedError
# ---------------------------------------------------------------------------
# Public IP helper
# ---------------------------------------------------------------------------
@@ -268,9 +354,13 @@ class DDNSManager(BaseServiceManager):
def __init__(self, config_manager=None,
data_dir: str = '/app/data',
config_dir: str = '/app/config'):
config_dir: str = '/app/config',
service_bus=None,
service_registry=None):
super().__init__('ddns', data_dir, config_dir)
self.config_manager = config_manager
self._service_bus = service_bus
self._service_registry = service_registry
self._last_ip: Optional[str] = None
self._stop_event = threading.Event()
self._heartbeat_thread: Optional[threading.Thread] = None
@@ -280,11 +370,9 @@ class DDNSManager(BaseServiceManager):
# ------------------------------------------------------------------
def get_status(self) -> Dict[str, Any]:
identity = self._identity()
domain_cfg = identity.get('domain', {})
return {
'service': 'ddns',
'provider': domain_cfg.get('ddns', {}).get('provider') if domain_cfg else None,
'provider': self._ddns_cfg().get('provider'),
'last_ip': self._last_ip,
'heartbeat_running': (
self._heartbeat_thread is not None and
@@ -293,7 +381,10 @@ class DDNSManager(BaseServiceManager):
}
def test_connectivity(self) -> Dict[str, Any]:
provider = self.get_provider()
try:
provider = self.get_provider()
except DDNSError as exc:
return {'success': False, 'reason': str(exc)}
if provider is None:
return {'success': False, 'reason': 'No DDNS provider configured'}
ip = _get_public_ip()
@@ -310,17 +401,45 @@ class DDNSManager(BaseServiceManager):
return {}
return self.config_manager.get_identity() or {}
def _ddns_cfg(self) -> Dict[str, Any]:
if self.config_manager is None:
return {}
return self.config_manager.configs.get('ddns', {}) or {}
def _get_token(self) -> str:
"""Return the DDNS bearer token from the secure token store."""
if self.config_manager is None:
return ''
if hasattr(self.config_manager, 'get_ddns_token'):
return self.config_manager.get_ddns_token() or ''
return self.config_manager.configs.get('ddns', {}).get('token', '')
def _fire_identity_changed(self, source: str) -> None:
"""Publish IDENTITY_CHANGED so CaddyManager regenerates its config."""
if self._service_bus is None:
return
try:
from service_bus import EventType
cell_name = self._identity().get('cell_name', '')
self._service_bus.publish_event(EventType.IDENTITY_CHANGED, source, {
'cell_name': cell_name,
})
except Exception as exc:
logger.warning('DDNSManager._fire_identity_changed: %s', exc)
# ------------------------------------------------------------------
# Provider factory
# ------------------------------------------------------------------
def get_provider(self) -> Optional[DDNSProvider]:
"""Instantiate and return the configured DDNS provider, or None."""
identity = self._identity()
domain_cfg = identity.get('domain', {})
if not domain_cfg:
"""Instantiate and return the configured DDNS provider, or None.
Raises DDNSError when the configured provider is recognised but not
yet supported ('noip', 'freedns').
"""
if self.config_manager is None:
return None
ddns_cfg = domain_cfg.get('ddns', {})
ddns_cfg = self.config_manager.configs.get('ddns', {})
if not ddns_cfg:
return None
@@ -329,13 +448,17 @@ class DDNSManager(BaseServiceManager):
return None
if provider_name == 'pic_ngo':
api_base = ddns_cfg.get('api_base_url')
return PicNgoDDNS(api_base_url=api_base)
# Env var takes priority so deployments can switch URLs without re-registering
_env_url = os.environ.get('DDNS_URL', '').replace('/api/v1', '').rstrip('/')
api_base = _env_url or ddns_cfg.get('api_base_url')
totp_secret = ddns_cfg.get('totp_secret') or os.environ.get('DDNS_TOTP_SECRET', '')
return PicNgoDDNS(api_base_url=api_base, totp_secret=totp_secret)
if provider_name == 'cloudflare':
return CloudflareDDNS(
api_token=ddns_cfg.get('api_token', ''),
zone_id=ddns_cfg.get('zone_id', ''),
domain=ddns_cfg.get('domain') or self._identity().get('domain_name', ''),
)
if provider_name == 'duckdns':
@@ -344,11 +467,11 @@ class DDNSManager(BaseServiceManager):
domain=ddns_cfg.get('domain', ''),
)
if provider_name == 'noip':
return NoIPDDNS()
if provider_name == 'freedns':
return FreeDNSDDNS()
if provider_name in ('noip', 'freedns'):
raise DDNSError(
f"DDNS provider {provider_name!r} is not yet supported — "
"use 'pic_ngo', 'cloudflare' or 'duckdns'"
)
logger.warning("Unknown DDNS provider: %s", provider_name)
return None
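Two details of the factory above are easy to miss: the `DDNS_URL` environment value is normalised by stripping `/api/v1` and any trailing slash before it overrides the stored `api_base_url`, and the recognised but unsupported providers raise instead of returning `None`. The sketch below reproduces both under simplifying assumptions: plain mappings replace `os.environ` and the config manager, and `normalise_api_base` and `check_provider` are hypothetical names.

```python
from typing import Mapping, Optional

SUPPORTED = ('pic_ngo', 'cloudflare', 'duckdns')
KNOWN_UNSUPPORTED = ('noip', 'freedns')


class DDNSError(Exception):
    """Raised for provider errors and unsupported providers."""


def normalise_api_base(env: Mapping[str, str],
                       cfg: Mapping[str, str]) -> Optional[str]:
    """Environment value wins over config; '/api/v1' and trailing '/' are dropped."""
    env_url = env.get('DDNS_URL', '').replace('/api/v1', '').rstrip('/')
    return env_url or cfg.get('api_base_url')


def check_provider(name: str) -> Optional[str]:
    """Return the name if usable, None if unknown, raise if known-unsupported."""
    if name in SUPPORTED:
        return name
    if name in KNOWN_UNSUPPORTED:
        raise DDNSError(f'DDNS provider {name!r} is not yet supported')
    return None  # unknown names are logged and ignored in the source
```

Raising for `noip` and `freedns` surfaces the misconfiguration at lookup time, whereas a silent `None` would look the same as "no provider configured".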
@@ -360,27 +483,44 @@ class DDNSManager(BaseServiceManager):
def register(self, name: str, ip: str) -> dict:
"""Register the cell's subdomain with the configured provider.
Stores the returned token in the identity config under
identity['domain']['ddns']['token'] and records the subdomain.
Fetches the public IP via ipify when ip is empty.
Stores the returned token in the top-level ddns config (where
update_ip reads it) and updates _identity.domain_name.
Returns the dict from provider.register().
"""
provider = self.get_provider()
if provider is None:
raise DDNSError("No DDNS provider configured")
if not ip:
ip = _get_public_ip() or ''
# Release the old subdomain if the name is changing and we hold a token
if self.config_manager is not None and hasattr(provider, 'release'):
old_token = self._get_token()
old_domain = self._identity().get('domain_name', '')
old_name = old_domain.replace('.pic.ngo', '') if old_domain else ''
if old_token and old_name and old_name != name:
try:
provider.release(old_token)
logger.info("DDNS released old subdomain %r before registering %r", old_name, name)
except Exception as exc:
logger.warning("DDNS could not release old subdomain %r: %s", old_name, exc)
result = provider.register(name, ip)
# Persist token + subdomain back into identity
identity = self._identity()
domain_cfg = dict(identity.get('domain', {}))
ddns_cfg = dict(domain_cfg.get('ddns', {}))
if 'token' in result:
ddns_cfg['token'] = result['token']
if 'subdomain' in result:
ddns_cfg['subdomain'] = result['subdomain']
domain_cfg['ddns'] = ddns_cfg
if self.config_manager is not None:
self.config_manager.set_identity_field('domain', domain_cfg)
# Token stored in data/api/ddns_token (not cell_config.json)
if 'token' in result:
if hasattr(self.config_manager, 'set_ddns_token'):
self.config_manager.set_ddns_token(result['token'])
else:
ddns_cfg = dict(self.config_manager.configs.get('ddns', {}))
ddns_cfg['token'] = result['token']
self.config_manager.set_ddns_config(ddns_cfg)
# Keep domain_name in identity up to date
if 'subdomain' in result:
self.config_manager.set_identity_field('domain_name', result['subdomain'])
self._last_ip = ip
return result
@@ -405,10 +545,26 @@ class DDNSManager(BaseServiceManager):
logger.debug("DDNS update_ip: IP unchanged (%s), skipping", current_ip)
return
identity = self._identity()
domain_cfg = identity.get('domain', {})
ddns_cfg = domain_cfg.get('ddns', {}) if domain_cfg else {}
token = ddns_cfg.get('token', '')
token = self._get_token()
# No token means we never successfully registered (e.g. wizard failed).
# Attempt registration immediately rather than waiting for the 401 cycle.
if not token:
provider_name = self._ddns_cfg().get('provider', '')
if provider_name == 'pic_ngo':
logger.info("DDNS update_ip: no token — attempting initial registration")
try:
cell_name = self._identity().get('cell_name', '')
if cell_name:
self.register(cell_name, current_ip)
logger.info("DDNS registered (no-token retry): cell_name=%r", cell_name)
self._last_ip = current_ip
self._fire_identity_changed('ddns_heartbeat')
else:
logger.error("DDNS update_ip: cannot register — cell_name not in identity")
except Exception as exc:
logger.error("DDNS update_ip: initial registration failed: %s", exc)
return
try:
success = provider.update(token, current_ip)
@@ -417,9 +573,64 @@ class DDNSManager(BaseServiceManager):
self._last_ip = current_ip
else:
logger.warning("DDNS update_ip: provider.update() returned False")
except DDNSTokenExpired:
logger.warning("DDNS update_ip: token rejected (401) — attempting re-registration")
try:
cell_name = self._identity().get('cell_name', '')
if cell_name:
self.register(cell_name, current_ip)
logger.info("DDNS re-registered after token expiry: cell_name=%r", cell_name)
self._last_ip = current_ip
self._fire_identity_changed('ddns_heartbeat')
else:
logger.error("DDNS update_ip: cannot re-register — cell_name not in identity")
except Exception as exc2:
logger.error("DDNS update_ip: re-registration failed: %s", exc2)
except DDNSError as exc:
logger.error("DDNS update_ip: provider error: %s", exc)
def sync_service_records(self) -> dict:
"""Sync per-service A records for providers that need explicit records
(currently Cloudflare). Builds the subdomain list from the service
registry via the effective domain and delegates to the provider.
"""
provider = self.get_provider()
if provider is None:
raise DDNSError("No DDNS provider configured")
if not hasattr(provider, 'sync_service_records'):
raise DDNSError(
f"Provider {self._ddns_cfg().get('provider')!r} does not support "
"per-service record sync"
)
ip = _get_public_ip()
if ip is None:
raise DDNSError("Could not determine public IP")
subdomains = self._service_record_names()
result = provider.sync_service_records(subdomains, ip)
if result.get('success'):
self._last_ip = ip
return result
def _service_record_names(self) -> list:
"""Return fully-qualified A record names for each installed service subdomain."""
if self.config_manager is None:
return []
try:
effective_domain = self.config_manager.get_effective_domain()
except Exception:
return []
registry = getattr(self, '_service_registry', None)
names = []
if registry is not None:
try:
for route in registry.get_caddy_routes():
subs = [route['subdomain']] + list(route.get('extra_subdomains') or [])
for sub in subs:
names.append(f'{sub}.{effective_domain}')
except Exception as exc:
logger.warning('_service_record_names: registry error: %s', exc)
return names
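The record-name expansion in `_service_record_names` can be illustrated with literal route dictionaries. The route shape (`subdomain` plus optional `extra_subdomains`) is taken from the code above; the standalone function name and the sample routes and domain are made up for the example.

```python
from typing import Any, Dict, Iterable, List


def service_record_names(routes: Iterable[Dict[str, Any]],
                         effective_domain: str) -> List[str]:
    """Expand each route into fully-qualified A record names."""
    names: List[str] = []
    for route in routes:
        # extra_subdomains may be missing or None; treat both as empty.
        subs = [route['subdomain']] + list(route.get('extra_subdomains') or [])
        for sub in subs:
            names.append(f'{sub}.{effective_domain}')
    return names


routes = [
    {'subdomain': 'mail', 'extra_subdomains': ['webmail']},
    {'subdomain': 'files', 'extra_subdomains': None},
]
names = service_record_names(routes, 'cell.example.com')
```

The result preserves route order and does not deduplicate; in the source, deduplication happens later in the provider's `sync_service_records`.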
# ------------------------------------------------------------------
# Heartbeat
# ------------------------------------------------------------------
@@ -468,10 +679,7 @@ class DDNSManager(BaseServiceManager):
provider = self.get_provider()
if provider is None:
raise DDNSError("No DDNS provider configured")
identity = self._identity()
domain_cfg = identity.get('domain', {})
ddns_cfg = domain_cfg.get('ddns', {}) if domain_cfg else {}
token = ddns_cfg.get('token', '')
token = self._get_token()
return provider.dns_challenge_create(token, fqdn, value)
def dns_challenge_delete(self, fqdn: str) -> bool:
@@ -479,8 +687,5 @@ class DDNSManager(BaseServiceManager):
provider = self.get_provider()
if provider is None:
raise DDNSError("No DDNS provider configured")
identity = self._identity()
domain_cfg = identity.get('domain', {})
ddns_cfg = domain_cfg.get('ddns', {}) if domain_cfg else {}
token = ddns_cfg.get('token', '')
token = self._get_token()
return provider.dns_challenge_delete(token, fqdn)
+429
@@ -0,0 +1,429 @@
#!/usr/bin/env python3
"""
EgressManager: per-service egress enforcement.
Routes outbound traffic from installed service containers through
alternate exits (wireguard_ext, openvpn, tor) using host-side
iptables fwmark policy-routing. Integrates with ServiceStoreManager
for install/remove lifecycle hooks.
Rules live on the HOST in PIC_EGRESS chains in the mangle and nat
tables. Container IPs are discovered via docker inspect using the
container_name from the service manifest.
Connectivity v2: a service routes through a *connection instance* (by id),
sharing the same fwmark / routing table / redirect port as any peer that
egresses through the same connection. The (mark, table, redirect_port) for a
service are resolved from ConnectivityManager.get_connection(id); EgressManager
no longer owns its own per-type MARKS/TABLES tables.
"""
import logging
import subprocess
import time
from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
EGRESS_CHAIN = "PIC_EGRESS"
class EgressManager:
"""Per-service egress enforcement via host iptables fwmark policy-routing."""
def __init__(self, config_manager, service_store_manager=None,
connectivity_manager=None,
data_dir: str = "/app/data", config_dir: str = "/app/config"):
self.config_manager = config_manager
self.service_store_manager = service_store_manager
self.connectivity_manager = connectivity_manager
self._data_dir = data_dir
self._config_dir = config_dir
# ── Public API ─────────────────────────────────────────────────────────
def apply_service(self, service_id: str) -> Dict[str, Any]:
"""Idempotently apply egress rules for one installed service.
Steps:
1. Look up the service manifest.
2. clear_service first (ensures idempotency).
3. If the manifest has no egress block, skip silently.
4. Discover the container IP.
5. Resolve the connection id (override > manifest default > 'default').
6. If 'default', return early with no rules.
7. Otherwise resolve the connection's (mark, table, redirect_port),
create chains, ensure ip rules, add mark/redirect rules.
"""
manifest = self._get_manifest(service_id)
if manifest is None:
return {'ok': False, 'error': f'manifest not found for {service_id}'}
# Always clear first for idempotency
self.clear_service(service_id)
if not self._has_egress(manifest):
return {'ok': True, 'skipped': True}
container_name = manifest.get('container_name', '')
container_ip = self._discover_container_ip(container_name)
if not container_ip:
return {'ok': False, 'error': 'container IP not discoverable'}
connection_id = self._resolve_exit(service_id, manifest)
if connection_id == 'default':
return {'ok': True, 'exit_via': 'default'}
conn = self._get_connection(connection_id)
if conn is None:
return {
'ok': False,
'error': f'unknown connection {connection_id!r}',
}
mark = conn.get('mark')
table = conn.get('table')
if not isinstance(mark, int) or not isinstance(table, int):
return {
'ok': False,
'error': f'connection {connection_id!r} has no routing resources',
}
try:
self._ensure_chains()
self._ensure_host_ip_rule(mark, table)
self._add_mark_rule(container_ip, mark, service_id)
redirect_port = conn.get('redirect_port')
if isinstance(redirect_port, int):
self._add_redirect(container_ip, redirect_port, service_id)
except Exception as exc:
logger.error('apply_service(%s): %s', service_id, exc)
return {'ok': False, 'error': str(exc)}
return {'ok': True, 'exit_via': connection_id,
'container_ip': container_ip}
def clear_service(self, service_id: str) -> Dict[str, Any]:
"""Remove all PIC_EGRESS rules tagged for this service."""
try:
self._clear_egress_rules(service_id)
return {'ok': True}
except Exception as exc:
logger.error('clear_service(%s): %s', service_id, exc)
return {'ok': False, 'error': str(exc)}
def apply_all(self) -> Dict[str, Any]:
"""Apply egress rules for every installed service that has a manifest."""
installed = self.config_manager.get_installed_services()
results: Dict[str, Any] = {}
for svc_id, record in installed.items():
if not isinstance(record, dict) or not record.get('manifest'):
continue
results[svc_id] = self.apply_service(svc_id)
return {'ok': True, 'services': results}
def set_service_exit(self, service_id: str, connection_id: str) -> Dict[str, Any]:
"""Persist a per-service egress override (by connection id) and reapply.
`connection_id` is a real connection id or 'default'. A legacy exit
*type* string is accepted as a one-release back-compat shim and resolved
to the single connection instance of that type. The resolved
connection's type must be in the manifest's egress.allowed list.
"""
manifest = self._get_manifest(service_id)
if manifest is None:
return {'ok': False, 'error': f'service {service_id!r} not installed'}
if not self._has_egress(manifest):
return {'ok': False, 'error': f'service {service_id!r} has no egress configuration'}
if connection_id == 'default':
overrides = self._get_egress_overrides()
overrides[service_id] = 'default'
self._set_egress_overrides(overrides)
return self.apply_service(service_id)
resolved = self._resolve_connection_id(connection_id)
if resolved is None:
return {
'ok': False,
'error': f"unknown connection {connection_id!r}; "
f"must be a connection id or 'default'",
}
conn = self._get_connection(resolved)
egress = manifest.get('egress', {})
allowed = egress.get('allowed')
if isinstance(allowed, list) and conn is not None:
if conn.get('type') not in allowed:
return {
'ok': False,
'error': (
f"connection type {conn.get('type')!r} is not in the "
f'allowed list for {service_id}: {allowed}'
),
}
# Persist the override so it survives restarts
overrides = self._get_egress_overrides()
overrides[service_id] = resolved
self._set_egress_overrides(overrides)
return self.apply_service(service_id)
def _connections(self) -> List[dict]:
"""Return the v2 connection records, or [] when unavailable."""
if self.connectivity_manager is not None:
try:
conns = self.connectivity_manager.list_connections()
return conns if isinstance(conns, list) else []
except Exception as exc:
logger.warning('egress: list_connections failed: %s', exc)
return []
if self.config_manager is not None:
try:
conns = self.config_manager.list_connections()
return conns if isinstance(conns, list) else []
except Exception as exc:
logger.warning('egress: list_connections failed: %s', exc)
return []
def _get_connection(self, connection_id: str) -> Optional[dict]:
"""Resolve a connection record (with mark/table/redirect_port) by id."""
if self.connectivity_manager is not None:
try:
return self.connectivity_manager.get_connection(connection_id)
except Exception as exc:
logger.warning('egress: get_connection failed: %s', exc)
return None
if self.config_manager is not None:
try:
return self.config_manager.get_connection(connection_id)
except Exception as exc:
logger.warning('egress: get_connection failed: %s', exc)
return None
_LEGACY_EXIT_TYPES = ('wireguard_ext', 'openvpn', 'tor', 'sshuttle', 'proxy')
def _resolve_connection_id(self, value: str) -> Optional[str]:
"""Resolve a value to a valid connection id.
Accepts a real connection id or, as a back-compat shim, a legacy type
string, which is resolved to the single instance of that type. Returns
None when nothing matches.
"""
conns = self._connections()
for c in conns:
if c.get('id') == value:
return value
if value in self._LEGACY_EXIT_TYPES:
matches = [c for c in conns if c.get('type') == value]
if len(matches) == 1:
return matches[0].get('id')
return None
def get_status(self) -> Dict[str, Any]:
"""Return egress status for every installed service that has egress config."""
installed = self.config_manager.get_installed_services()
statuses: Dict[str, Any] = {}
for svc_id, record in installed.items():
if not isinstance(record, dict):
continue
manifest = record.get('manifest')
if not manifest or not self._has_egress(manifest):
continue
container_name = manifest.get('container_name', '')
container_ip = self._discover_container_ip(container_name, retries=1)
exit_via = self._resolve_exit(svc_id, manifest)
statuses[svc_id] = {
'exit_via': exit_via,
'container_ip': container_ip,
'has_egress': True,
}
return {'ok': True, 'services': statuses}
# ── Internals ──────────────────────────────────────────────────────────
def _get_manifest(self, service_id: str) -> Optional[dict]:
"""Retrieve the manifest for an installed service, if available."""
installed = self.config_manager.get_installed_services()
record = installed.get(service_id)
if not record:
return None
return record.get('manifest')
def _has_egress(self, manifest: dict) -> bool:
"""Return True only when the manifest sets has_egress and declares a non-empty egress block."""
return bool(manifest.get('has_egress', False) and manifest.get('egress'))
def _resolve_exit(self, service_id: str, manifest: dict) -> str:
"""Determine the effective connection id for a service.
Priority: persisted override > manifest egress.default > 'default'.
Legacy type strings (from old overrides or a manifest default) are
resolved to the single connection instance of that type; if that can't
be resolved the service falls back to 'default'.
"""
overrides = self._get_egress_overrides()
if service_id in overrides:
value = overrides[service_id]
else:
egress = manifest.get('egress') or {}
value = egress.get('default', 'default')
if value == 'default':
return 'default'
if value in self._LEGACY_EXIT_TYPES:
resolved = self._resolve_connection_id(value)
return resolved if resolved is not None else 'default'
return value
def _discover_container_ip(self, container_name: str,
retries: int = 5, delay: float = 0.2) -> Optional[str]:
"""Return the container's cell-network IP, retrying on transient failure."""
if not container_name:
return None
for attempt in range(retries):
result = subprocess.run(
[
'docker', 'inspect',
# Go templates reject hyphens in a field chain, so index the map by key.
'-f', '{{(index .NetworkSettings.Networks "cell-network").IPAddress}}',
container_name,
],
capture_output=True, text=True, timeout=10,
)
ip = result.stdout.strip()
if ip and result.returncode == 0:
return ip
if attempt < retries - 1:
time.sleep(delay)
return None
def _ensure_chains(self) -> None:
"""Idempotently create PIC_EGRESS chains in mangle and nat on the host."""
for table in ('mangle', 'nat'):
# Create the chain if it does not yet exist
check = self._iptables(['-t', table, '-L', EGRESS_CHAIN, '-n'])
if check.returncode != 0:
create = self._iptables(['-t', table, '-N', EGRESS_CHAIN])
if create.returncode != 0 and 'exists' not in (create.stderr or ''):
logger.warning(
'_ensure_chains: cannot create %s/%s: %s',
table, EGRESS_CHAIN, (create.stderr or '').strip(),
)
# Insert jump from PREROUTING at position 1 (idempotent via -C check)
jump_check = self._iptables(
['-t', table, '-C', 'PREROUTING', '-j', EGRESS_CHAIN]
)
if jump_check.returncode != 0:
self._iptables(
['-t', table, '-I', 'PREROUTING', '1', '-j', EGRESS_CHAIN]
)
def _ensure_host_ip_rule(self, mark: int, table: int) -> None:
"""Ensure a single `ip rule fwmark <mark> lookup <table>` exists.
Idempotent: drains any duplicate rules first, then adds exactly one.
The mark/table belong to the connection instance the service routes
through, so a peer and a service on the same connection share the rule.
"""
for _ in range(8):
r = self._ip_rule(['del', 'fwmark', hex(mark), 'lookup', str(table)])
if r.returncode != 0:
break
self._ip_rule(['add', 'fwmark', hex(mark), 'lookup', str(table)])
def _add_mark_rule(self, service_ip: str, mark: int, service_id: str) -> None:
"""Mark outbound packets from the service container with fwmark."""
self._iptables([
'-t', 'mangle', '-A', EGRESS_CHAIN,
'-s', service_ip,
'-j', 'MARK', '--set-mark', hex(mark),
'-m', 'comment', '--comment', self._tag(service_id),
])
def _add_redirect(self, service_ip: str, port: int, service_id: str) -> None:
"""Redirect the container's TCP traffic to a local transparent-proxy port."""
self._iptables([
'-t', 'nat', '-A', EGRESS_CHAIN,
'-s', service_ip, '-p', 'tcp',
'-j', 'REDIRECT', '--to-ports', str(port),
'-m', 'comment', '--comment', self._tag(service_id),
])
def _clear_egress_rules(self, service_id: str) -> None:
"""Remove all rules tagged pic-egr-<service_id> from mangle and nat."""
import re as _re
tag = self._tag(service_id)
comment_re = _re.compile(
rf'--comment\s+["\']?{_re.escape(tag)}["\']?(\s|$)'
)
for table in ('mangle', 'nat'):
try:
save = subprocess.run(
['iptables-save', '-t', table],
capture_output=True, text=True, timeout=10,
)
if save.returncode != 0:
continue
lines = save.stdout.splitlines()
filtered = [ln for ln in lines if not comment_re.search(ln)]
if len(filtered) == len(lines):
continue # nothing to remove
restore_input = '\n'.join(filtered) + '\n'
restore = subprocess.run(
['iptables-restore', '-T', table],
input=restore_input,
capture_output=True, text=True, timeout=10,
)
if restore.returncode != 0:
logger.warning(
'_clear_egress_rules(%s): iptables-restore for %s failed: %s',
service_id, table, (restore.stderr or '').strip(),
)
except Exception as exc:
logger.error('_clear_egress_rules(%s, %s): %s', service_id, table, exc)
@staticmethod
def _tag(service_id: str) -> str:
"""iptables comment tag used to identify rules belonging to a service."""
return f'pic-egr-{service_id}'
def _iptables(self, args: List[str], check: bool = False) -> subprocess.CompletedProcess:
"""Run iptables on the host with the given arguments."""
cmd = ['iptables'] + args
try:
return subprocess.run(cmd, capture_output=True, text=True, timeout=10)
except Exception as exc:
logger.error('_iptables %s: %s', args, exc)
raise
def _ip_rule(self, args: List[str]) -> subprocess.CompletedProcess:
"""Run `ip rule` on the host with the given arguments."""
cmd = ['ip', 'rule'] + args
try:
return subprocess.run(cmd, capture_output=True, text=True, timeout=10)
except Exception as exc:
logger.error('_ip_rule %s: %s', args, exc)
raise
# ── Config persistence helpers ─────────────────────────────────────────
def _get_egress_overrides(self) -> Dict[str, str]:
"""Return the persisted egress override map {service_id: connection_id or legacy exit type}."""
try:
overrides = self.config_manager.configs.get('egress_overrides')
if isinstance(overrides, dict):
return dict(overrides)
except Exception:
pass
return {}
def _set_egress_overrides(self, overrides: Dict[str, str]) -> None:
"""Persist the egress override map to config."""
try:
self.config_manager.configs['egress_overrides'] = overrides
self.config_manager._save_all_configs()
except Exception as exc:
logger.error('_set_egress_overrides: %s', exc)
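The resolution order described in `_resolve_exit` and `_resolve_connection_id` can be sketched as two pure functions. This is a minimal illustration, not the module itself: the connection ids (`conn-tor-1`, `conn-vpn-a`, `conn-vpn-b`) and the free-function signatures are assumptions made so the example runs without a ConfigManager or ConnectivityManager.

```python
from typing import Dict, List, Optional

LEGACY_EXIT_TYPES = ('wireguard_ext', 'openvpn', 'tor', 'sshuttle', 'proxy')


def resolve_connection_id(value: str, connections: List[dict]) -> Optional[str]:
    """Return a connection id for `value`, or None when nothing matches."""
    for conn in connections:
        if conn.get('id') == value:
            return value
    if value in LEGACY_EXIT_TYPES:
        matches = [c for c in connections if c.get('type') == value]
        # A legacy type only resolves when exactly one instance has that type.
        if len(matches) == 1:
            return matches[0].get('id')
    return None


def resolve_exit(service_id: str, manifest: dict,
                 overrides: Dict[str, str], connections: List[dict]) -> str:
    """Priority: persisted override > manifest egress.default > 'default'."""
    if service_id in overrides:
        value = overrides[service_id]
    else:
        value = (manifest.get('egress') or {}).get('default', 'default')
    if value == 'default':
        return 'default'
    if value in LEGACY_EXIT_TYPES:
        resolved = resolve_connection_id(value, connections)
        return resolved if resolved is not None else 'default'
    return value


# Hypothetical connection records: one tor instance, two openvpn instances.
connections = [
    {'id': 'conn-tor-1', 'type': 'tor'},
    {'id': 'conn-vpn-a', 'type': 'openvpn'},
    {'id': 'conn-vpn-b', 'type': 'openvpn'},
]
manifest = {'has_egress': True, 'egress': {'default': 'tor'}}

# Manifest default 'tor' is a legacy type with a single instance.
from_manifest = resolve_exit('svc', manifest, {}, connections)
# A persisted override by id wins over the manifest default.
from_override = resolve_exit('svc', manifest, {'svc': 'conn-vpn-a'}, connections)
# 'openvpn' is ambiguous (two instances), so the service falls back to 'default'.
ambiguous = resolve_exit('svc', manifest, {'svc': 'openvpn'}, connections)
```

Note that an ambiguous legacy type silently degrades to the default route rather than raising, which matches the fallback described in the `_resolve_exit` docstring.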
+43 -1
@@ -19,7 +19,8 @@ logger = logging.getLogger(__name__)
class EmailManager(BaseServiceManager):
"""Manages email service configuration and users"""
- def __init__(self, data_dir: str = '/app/data', config_dir: str = '/app/config'):
+ def __init__(self, data_dir: str = '/app/data', config_dir: str = '/app/config',
+ service_bus=None):
super().__init__('email', data_dir, config_dir)
self.email_data_dir = os.path.join(data_dir, 'email')
self.email_dir = self.email_data_dir # alias used by tests
@@ -33,6 +34,10 @@ class EmailManager(BaseServiceManager):
self.safe_makedirs(self.dovecot_dir)
self.safe_makedirs(os.path.dirname(self.domain_config_file))
if service_bus is not None:
from service_bus import EventType
service_bus.subscribe_to_event(EventType.IDENTITY_CHANGED, self._on_identity_changed)
def _get_service_config(self) -> Dict[str, Any]:
"""Read configured ports/domain from service config file."""
cfg = self.get_config()
@@ -252,6 +257,15 @@ class EmailManager(BaseServiceManager):
return {'restarted': restarted, 'warnings': warnings}
def _on_identity_changed(self, event) -> None:
"""Regenerate email config when cell identity changes."""
try:
effective = event.data.get('effective_domain')
if effective:
self.apply_config({'domain': effective})
except Exception as exc:
self.logger.warning('email_manager identity_changed handler failed: %s', exc)
def get_email_status(self) -> Dict[str, Any]:
"""Get detailed email service status including postfix/dovecot state."""
try:
@@ -326,12 +340,39 @@ class EmailManager(BaseServiceManager):
mailbox_dir = os.path.join(self.email_data_dir, 'mailboxes', f'{username}@{domain}')
self.safe_makedirs(mailbox_dir)
# Provision account in docker-mailserver (non-fatal if container not running)
self._dms_add_account(username, domain, password)
logger.info(f"Created email user: {username}@{domain}")
return True
except Exception as e:
logger.error(f"Failed to create email user {username}@{domain}: {e}")
return False
def _dms_add_account(self, username: str, domain: str, password: str) -> None:
try:
r = subprocess.run(
['docker', 'exec', 'cell-mail', 'setup', 'email', 'add',
f'{username}@{domain}', password],
capture_output=True, text=True, timeout=30, check=False,
)
if r.returncode != 0:
logger.warning('dms add account %s@%s: %s', username, domain, r.stderr.strip())
except Exception as e:
logger.warning('dms add account %s@%s failed (non-fatal): %s', username, domain, e)
def _dms_del_account(self, username: str, domain: str) -> None:
try:
r = subprocess.run(
['docker', 'exec', 'cell-mail', 'setup', 'email', 'del',
f'{username}@{domain}'],
capture_output=True, text=True, timeout=30, check=False,
)
if r.returncode != 0:
logger.warning('dms del account %s@%s: %s', username, domain, r.stderr.strip())
except Exception as e:
logger.warning('dms del account %s@%s failed (non-fatal): %s', username, domain, e)
def delete_email_user(self, username: str, domain: str) -> bool:
"""Delete an email user"""
try:
@@ -352,6 +393,7 @@ class EmailManager(BaseServiceManager):
import shutil
shutil.rmtree(mailbox_dir)
self._dms_del_account(username, domain)
logger.info(f"Deleted email user: {username}@{domain}")
return True
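The optional `service_bus` wiring added to `EmailManager` follows a plain subscribe-and-react pattern. The sketch below shows the shape of it under stated assumptions: `MiniBus`, `Event`, and `MailConfig` are hypothetical stand-ins written for this example, and only the `subscribe_to_event` call, the `IDENTITY_CHANGED` name, and the `effective_domain` key come from the diff.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, List


class EventType(Enum):
    IDENTITY_CHANGED = 'identity_changed'


@dataclass
class Event:
    type: EventType
    data: Dict[str, Any] = field(default_factory=dict)


class MiniBus:
    """Hypothetical stand-in for the service bus, not the project's class."""

    def __init__(self) -> None:
        self._handlers: Dict[EventType, List[Callable[[Event], None]]] = {}

    def subscribe_to_event(self, event_type: EventType,
                           handler: Callable[[Event], None]) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event: Event) -> None:
        for handler in self._handlers.get(event.type, []):
            handler(event)


class MailConfig:
    """Hypothetical consumer that tracks the effective domain."""

    def __init__(self, bus: MiniBus = None) -> None:
        self.domain = 'cell'
        # The bus is optional so the class still works when built without one.
        if bus is not None:
            bus.subscribe_to_event(EventType.IDENTITY_CHANGED,
                                   self._on_identity_changed)

    def _on_identity_changed(self, event: Event) -> None:
        effective = event.data.get('effective_domain')
        if effective:  # ignore events that carry no usable domain
            self.domain = effective


bus = MiniBus()
mail = MailConfig(bus)
bus.publish(Event(EventType.IDENTITY_CHANGED, {'effective_domain': 'pic1.pic.ngo'}))
```

Keeping the bus argument optional means existing call sites and tests that construct the manager without a bus keep working unchanged.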
+92 -26
@@ -41,6 +41,18 @@ CADDY_CONTAINER = 'cell-caddy'
COREFILE_PATH = '/app/config/dns/Corefile'
ZONE_DATA_DIR = '/data' # inside CoreDNS container; mounted from ./data/dns
# Optional callable wired by managers.py that returns the persisted CoreDNS log
# level (Python level name). Lets generate_corefile keep the configured level
# sticky across regenerations triggered for unrelated reasons (peer changes,
# IP-range edits) without threading config_manager through every call site.
_coredns_level_resolver = None
def set_coredns_level_resolver(resolver) -> None:
"""Wire the persisted-CoreDNS-level resolver (called once at startup)."""
global _coredns_level_resolver
_coredns_level_resolver = resolver
def _run(cmd: List[str], check: bool = True) -> subprocess.CompletedProcess:
"""Run a shell command and return the result."""
@@ -569,7 +581,7 @@ def ensure_dns_dnat() -> bool:
def ensure_service_dnat() -> bool:
- """DNAT wg0:80 (scoped to WG server IP) → cell-caddy:80.
+ """DNAT wg0:80 and wg0:443 (scoped to WG server IP) → cell-caddy.
Service DNS names resolve to the WG server IP. DNAT is scoped with -d {server_ip}
so that cross-cell HTTP traffic destined for another cell passes through unmodified.
@@ -583,21 +595,22 @@ def ensure_service_dnat() -> bool:
if not caddy_ip:
logger.warning('ensure_service_dnat: cell-caddy not found')
return False
- dnat_check = ['-t', 'nat', '-C', 'PREROUTING', '-i', 'wg0', '-d', server_ip,
- '-p', 'tcp', '--dport', '80',
- '-j', 'DNAT', '--to-destination', f'{caddy_ip}:80']
- dnat_add = ['-t', 'nat', '-A', 'PREROUTING', '-i', 'wg0', '-d', server_ip,
- '-p', 'tcp', '--dport', '80',
- '-j', 'DNAT', '--to-destination', f'{caddy_ip}:80']
- if _wg_exec(['iptables'] + dnat_check).returncode != 0:
- _wg_exec(['iptables'] + dnat_add)
- fwd_check = ['-C', 'FORWARD', '-i', 'wg0', '-o', 'eth0',
- '-p', 'tcp', '--dport', '80', '-j', 'ACCEPT']
- fwd_add = ['-I', 'FORWARD', '-i', 'wg0', '-o', 'eth0',
- '-p', 'tcp', '--dport', '80', '-j', 'ACCEPT']
- if _wg_exec(['iptables'] + fwd_check).returncode != 0:
- _wg_exec(['iptables'] + fwd_add)
- logger.info(f'ensure_service_dnat: wg0:{server_ip}:80 → {caddy_ip}:80')
+ for port in ('80', '443'):
+ dnat_check = ['-t', 'nat', '-C', 'PREROUTING', '-i', 'wg0', '-d', server_ip,
+ '-p', 'tcp', '--dport', port,
+ '-j', 'DNAT', '--to-destination', f'{caddy_ip}:{port}']
+ dnat_add = ['-t', 'nat', '-A', 'PREROUTING', '-i', 'wg0', '-d', server_ip,
+ '-p', 'tcp', '--dport', port,
+ '-j', 'DNAT', '--to-destination', f'{caddy_ip}:{port}']
+ if _wg_exec(['iptables'] + dnat_check).returncode != 0:
+ _wg_exec(['iptables'] + dnat_add)
+ fwd_check = ['-C', 'FORWARD', '-i', 'wg0', '-o', 'eth0',
+ '-p', 'tcp', '--dport', port, '-j', 'ACCEPT']
+ fwd_add = ['-I', 'FORWARD', '-i', 'wg0', '-o', 'eth0',
+ '-p', 'tcp', '--dport', port, '-j', 'ACCEPT']
+ if _wg_exec(['iptables'] + fwd_check).returncode != 0:
+ _wg_exec(['iptables'] + fwd_add)
+ logger.info(f'ensure_service_dnat: wg0:{server_ip}:80+443 → {caddy_ip}')
return True
except Exception as e:
logger.error(f'ensure_service_dnat: {e}')
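The check-then-add idiom in `ensure_service_dnat` (`-C` first, then `-A` or `-I` only when the check fails) is what makes repeated calls safe. A minimal sketch of that idiom follows, with the rule construction separated from execution so it can run without iptables. The names `build_service_dnat_rules`, `ensure_rules`, and `FakeIptables`, and the example IP addresses, are assumptions for illustration; the rule arguments mirror the ones in the diff.

```python
from typing import Callable, List, Sequence, Tuple

Rule = Tuple[List[str], List[str]]  # (check_args, add_args)


def build_service_dnat_rules(server_ip: str, caddy_ip: str,
                             ports: Sequence[str] = ('80', '443')) -> List[Rule]:
    """Build (check, add) argument pairs for the DNAT and FORWARD rules."""
    rules: List[Rule] = []
    for port in ports:
        dnat = ['PREROUTING', '-i', 'wg0', '-d', server_ip,
                '-p', 'tcp', '--dport', port,
                '-j', 'DNAT', '--to-destination', f'{caddy_ip}:{port}']
        rules.append((['-t', 'nat', '-C'] + dnat, ['-t', 'nat', '-A'] + dnat))
        fwd = ['FORWARD', '-i', 'wg0', '-o', 'eth0',
               '-p', 'tcp', '--dport', port, '-j', 'ACCEPT']
        rules.append((['-C'] + fwd, ['-I'] + fwd))
    return rules


def ensure_rules(rules: List[Rule], run: Callable[[List[str]], int]) -> int:
    """Add each rule only when its -C check fails; return how many were added."""
    added = 0
    for check_args, add_args in rules:
        if run(check_args) != 0:
            run(add_args)
            added += 1
    return added


class FakeIptables:
    """In-memory stand-in: -C succeeds only for rules that were added earlier."""

    def __init__(self) -> None:
        self.installed = set()

    def __call__(self, args: List[str]) -> int:
        key = tuple(a for a in args if a not in ('-C', '-A', '-I'))
        if '-C' in args:
            return 0 if key in self.installed else 1
        self.installed.add(key)
        return 0


fake = FakeIptables()
rules = build_service_dnat_rules('10.0.0.1', '172.20.0.5')
first_pass = ensure_rules(rules, fake)   # every rule is new
second_pass = ensure_rules(rules, fake)  # nothing left to add
```

Building the check and add lists from one shared tail avoids the two drifting apart, which is easy to get wrong when the lists are written out twice.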
@@ -708,9 +721,21 @@ def _build_acl_block(blocked_peers_by_service: Dict[str, List[str]],
return '\n'.join(lines)
def _coredns_log_directive(level: str) -> str:
"""Return the per-block logging directive line for CoreDNS.
DEBUG maps to the verbose `log` query-logging plugin. Any higher level maps
to `errors` only (CoreDNS has no INFO/WARN query-log granularity), keeping
the per-cell DNS logs quiet by default.
"""
return 'log' if (level or 'INFO').upper() == 'DEBUG' else 'errors'
def generate_corefile(peers: List[Dict[str, Any]], corefile_path: str = COREFILE_PATH,
domain: str = 'cell',
- cell_links: Optional[List[Dict[str, Any]]] = None) -> bool:
+ cell_links: Optional[List[Dict[str, Any]]] = None,
+ split_horizon_zones: Optional[List[str]] = None,
+ coredns_level: Optional[str] = None) -> bool:
"""
Rewrite the CoreDNS Corefile with per-peer ACL rules and reload plugin.
The file is written to corefile_path (API-side path mapped into CoreDNS container).
@@ -718,6 +743,10 @@ def generate_corefile(peers: List[Dict[str, Any]], corefile_path: str = COREFILE
cell_links: optional list of cell-to-cell DNS forwarding entries, each a dict with
'domain' and 'dns_ip' keys (same shape as CellLinkManager.list_connections()).
When non-empty, a forwarding stanza is appended for each entry.
split_horizon_zones: optional list of FQDNs (e.g. ['pic1.pic.ngo']) for which a
local authoritative zone block is added so LAN clients resolve
service subdomains to the internal Caddy IP without hairpin NAT.
Each zone must have a corresponding zone file under /data/<fqdn>.zone.
"""
try:
# Collect which peers block which services
@@ -733,7 +762,14 @@ def generate_corefile(peers: List[Dict[str, Any]], corefile_path: str = COREFILE
acl_block = _build_acl_block(blocked, domain)
- primary_zone_block = f'{domain} {{\n file /data/{domain}.zone\n log\n'
+ if coredns_level is None and _coredns_level_resolver is not None:
+ try:
+ coredns_level = _coredns_level_resolver()
+ except Exception:
+ coredns_level = 'INFO'
+ log_directive = _coredns_log_directive(coredns_level)
+ primary_zone_block = f'{domain} {{\n file /data/{domain}.zone\n {log_directive}\n'
if acl_block:
primary_zone_block += acl_block + '\n'
primary_zone_block += '}\n'
@@ -741,13 +777,36 @@ def generate_corefile(peers: List[Dict[str, Any]], corefile_path: str = COREFILE
corefile = f""". {{
forward . 8.8.8.8 1.1.1.1
cache
- log
+ {log_directive}
health
reload
}}
{primary_zone_block}"""
# Split-horizon zones for DDNS/public domains — LAN clients resolve
# *.pic1.pic.ngo to the internal Caddy IP without hairpin NAT.
if split_horizon_zones:
for sz in split_horizon_zones:
# More-specific block for ACME DNS-01 challenge records: forward
# to public DNS so Caddy can verify TXT records it creates on the
# DDNS server. Without this, the wildcard A record in the zone
# file causes CoreDNS to return NODATA for TXT queries, blocking
# Caddy's internal pre-verification step.
corefile += (
f'\n_acme-challenge.{sz} {{\n'
f' forward . 8.8.8.8 1.1.1.1\n'
f' cache\n'
f' {log_directive}\n'
f'}}\n'
)
corefile += (
f'\n{sz} {{\n'
f' file /data/{sz}.zone\n'
f' {log_directive}\n'
f'}}\n'
)
# Append cell-to-cell DNS forwarding stanzas if provided
if cell_links:
for link in cell_links:
@@ -759,21 +818,27 @@ def generate_corefile(peers: List[Dict[str, Any]], corefile_path: str = COREFILE
f'\n{link_domain} {{\n'
f' forward . {link_dns_ip}\n'
f' cache\n'
- f' log\n'
+ f' {log_directive}\n'
f'}}\n'
)
- else:
+ elif not split_horizon_zones:
corefile += '\n'
# local.{domain} block intentionally omitted: /data/local.zone does not exist
# and CoreDNS logs errors on every reload for a missing zone file.
os.makedirs(os.path.dirname(corefile_path), exist_ok=True)
- tmp_path = corefile_path + '.tmp'
- with open(tmp_path, 'w') as f:
+ # Write in place (truncate + rewrite the SAME inode) rather than
+ # writing a temp file and os.replace()-ing it in. The Corefile is a
+ # Docker FILE bind-mount (./config/dns/Corefile:/etc/coredns/Corefile);
+ # os.replace creates a NEW inode, but the container stays bound to the
+ # original inode and never sees the update, so CoreDNS silently runs
+ # stale config until the container restarts. CoreDNS only re-reads on
+ # the SIGUSR1 we send right after this completes, so a non-atomic
+ # write is safe here.
+ with open(corefile_path, 'w') as f:
f.write(corefile)
f.flush()
os.fsync(f.fileno())
- os.replace(tmp_path, corefile_path)
logger.info(f"Wrote Corefile to {corefile_path}")
return True
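The inode argument in the comment above can be checked directly with `os.stat`. The sketch below is a standalone demonstration, not project code: the helper names `write_in_place` and `write_via_replace` are made up for the example, and it uses a temporary directory instead of the real Corefile path.

```python
import os
import tempfile


def write_in_place(path: str, content: str) -> None:
    """Truncate and rewrite the existing file, keeping its inode."""
    with open(path, 'w') as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())


def write_via_replace(path: str, content: str) -> None:
    """Write a temp file, then rename it over the target (new inode)."""
    tmp_path = path + '.tmp'
    with open(tmp_path, 'w') as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'Corefile')
    write_in_place(path, 'v1\n')
    inode_before = os.stat(path).st_ino

    write_in_place(path, 'v2\n')
    same_after_in_place = os.stat(path).st_ino == inode_before

    write_via_replace(path, 'v3\n')
    same_after_replace = os.stat(path).st_ino == inode_before

    with open(path) as f:
        final_content = f.read()
```

The trade-off is the usual one: the rename approach is atomic for readers on the host, while the in-place write keeps a single-file bind mount pointing at the file that actually changes.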
@@ -798,9 +863,10 @@ def reload_coredns() -> bool:
def apply_all_dns_rules(peers: List[Dict[str, Any]], corefile_path: str = COREFILE_PATH,
domain: str = 'cell',
- cell_links: Optional[List[Dict[str, Any]]] = None) -> bool:
+ cell_links: Optional[List[Dict[str, Any]]] = None,
+ split_horizon_zones: Optional[List[str]] = None) -> bool:
"""Regenerate Corefile (including any cell-to-cell forwarding stanzas) and reload CoreDNS."""
- ok = generate_corefile(peers, corefile_path, domain, cell_links)
+ ok = generate_corefile(peers, corefile_path, domain, cell_links, split_horizon_zones)
if ok:
reload_coredns()
return ok
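For reference, the stanzas that the split-horizon change emits per zone can be reproduced with a small pure function. This is a sketch under assumptions: the function names and the four-space indentation inside each block are chosen for the example (the indentation in this view of the diff is collapsed), while the directives, the forwarders, and the zone file path follow the diff.

```python
from typing import List, Optional


def coredns_log_directive(level: Optional[str]) -> str:
    """DEBUG selects the verbose `log` plugin; anything else selects `errors`."""
    return 'log' if (level or 'INFO').upper() == 'DEBUG' else 'errors'


def split_horizon_stanzas(zones: List[str], level: Optional[str] = None) -> str:
    """Render one ACME forwarding block and one zone-file block per zone."""
    directive = coredns_log_directive(level)
    out = ''
    for zone in zones:
        # The more specific _acme-challenge block forwards TXT lookups to
        # public DNS instead of answering from the local zone file.
        out += (
            f'\n_acme-challenge.{zone} {{\n'
            f'    forward . 8.8.8.8 1.1.1.1\n'
            f'    cache\n'
            f'    {directive}\n'
            f'}}\n'
        )
        out += (
            f'\n{zone} {{\n'
            f'    file /data/{zone}.zone\n'
            f'    {directive}\n'
            f'}}\n'
        )
    return out


rendered = split_horizon_stanzas(['pic1.pic.ngo'], 'debug')
print(rendered)
```

Because the level is reduced to a single directive up front, every block in the generated file stays consistent with the configured verbosity.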
+3 -3
@@ -164,7 +164,7 @@ http://{cell_name}.{domain}, http://{caddy_ip}:80 {{
reverse_proxy cell-rainloop:8888
}}
handle {{
- reverse_proxy cell-webui:80
+ reverse_proxy cell-webui:8080
}}
}}
@@ -190,7 +190,7 @@ http://api.{domain} {{
}}
http://webui.{domain} {{
- reverse_proxy cell-webui:80
+ reverse_proxy cell-webui:8080
}}
# Catch-all for direct IP / localhost
@@ -199,7 +199,7 @@ http://webui.{domain} {{
reverse_proxy cell-api:3000
}}
handle {{
- reverse_proxy cell-webui:80
+ reverse_proxy cell-webui:8080
}}
}}
"""
+53
@@ -0,0 +1,53 @@
"""One-shot cleanup of legacy builtin containers from the old main compose stack."""
import logging
import subprocess
logger = logging.getLogger('picell')
_LEGACY_BUILTIN_CONTAINERS = [
'cell-mail', 'cell-rainloop', 'cell-radicale', 'cell-webdav', 'cell-filegator',
]
def cleanup_legacy_builtin_containers(config_manager) -> None:
"""Remove legacy containers whose compose project is 'pic' (main stack).
Idempotent: guarded by _meta.legacy_builtins_cleaned in cell_config.json.
Containers from per-service installs (project != 'pic') are left untouched.
"""
try:
already_done = config_manager.configs.get('_meta', {}).get('legacy_builtins_cleaned', False)
if already_done:
return
except Exception:
return
removed = []
for cname in _LEGACY_BUILTIN_CONTAINERS:
try:
inspect = subprocess.run(
['docker', 'inspect', cname,
'--format', '{{index .Config.Labels "com.docker.compose.project"}}'],
capture_output=True, text=True, timeout=10,
)
if inspect.returncode != 0:
continue
project = inspect.stdout.strip()
if project != 'pic':
continue
subprocess.run(['docker', 'stop', cname], capture_output=True, timeout=30)
subprocess.run(['docker', 'rm', cname], capture_output=True, timeout=30)
removed.append(cname)
except Exception as exc:
logger.warning('cleanup_legacy_builtin_containers: %s: %s', cname, exc)
try:
meta = dict(config_manager.configs.get('_meta', {}))
meta['legacy_builtins_cleaned'] = True
config_manager.configs['_meta'] = meta
config_manager._save_all_configs()
except Exception as exc:
logger.warning('cleanup_legacy_builtin_containers: failed to set sentinel: %s', exc)
if removed:
logger.info('Removed legacy builtin containers: %s', ', '.join(removed))
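The sentinel guard used by the cleanup generalises to any one-shot migration. Below is a minimal sketch of the pattern with the Docker calls replaced by an injected task so it runs anywhere. `run_once` and its parameters are hypothetical names for this example; only the `_meta` / `legacy_builtins_cleaned` layout is taken from the code above.

```python
from typing import Callable, List


def run_once(configs: dict, sentinel: str,
             task: Callable[[], List[str]],
             save: Callable[[], None]) -> List[str]:
    """Run `task` unless configs['_meta'][sentinel] is already set."""
    if configs.get('_meta', {}).get(sentinel, False):
        return []
    result = task()
    # Copy before mutating so a shared '_meta' dict is not edited in place.
    meta = dict(configs.get('_meta', {}))
    meta[sentinel] = True
    configs['_meta'] = meta
    save()
    return result


configs: dict = {}
task_calls: List[int] = []
save_calls: List[int] = []


def fake_cleanup() -> List[str]:
    task_calls.append(1)
    return ['cell-mail']


first = run_once(configs, 'legacy_builtins_cleaned', fake_cleanup,
                 lambda: save_calls.append(1))
second = run_once(configs, 'legacy_builtins_cleaned', fake_cleanup,
                  lambda: save_calls.append(1))
```

One difference to keep in mind: in this sketch an exception raised by the task leaves the sentinel unset, so the task is retried on the next start, whereas the function above catches per-container errors and still sets the sentinel.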
+23 -1
@@ -21,6 +21,20 @@ from enum import Enum
logger = logging.getLogger(__name__)
# Maps a verbosity-panel service name to the bare module logger(s) used by the
# corresponding manager (logging.getLogger(__name__)). Managers log under BOTH
# 'picell.<service>' (self.logger) and their module name, so a verbosity change
# must reach both for per-service log files to capture everything.
_SERVICE_MODULE_LOGGERS = {
'network': ['network_manager'],
'wireguard': ['wireguard_manager'],
'email': ['email_manager'],
'calendar': ['calendar_manager'],
'files': ['file_manager'],
'routing': ['routing_manager', 'firewall_manager'],
'vault': ['vault_manager'],
}
class LogLevel(Enum):
"""Log levels"""
DEBUG = "DEBUG"
@@ -499,7 +513,13 @@ class LogManager:
return {'error': str(e)}
def set_service_level(self, service: str, level: str):
- """Change log level for a service at runtime."""
+ """Change log level for a service at runtime.
+ Sets BOTH the 'picell.<service>' logger (self.logger in managers) AND the
+ bare module logger(s) the manager uses via logging.getLogger(__name__),
+ so the change reaches every record a service emits, not just the half
+ that goes through self.logger.
+ """
try:
log_level = getattr(logging, level.upper(), logging.INFO)
if service in self.service_loggers:
@@ -509,6 +529,8 @@ class LogManager:
logger.info(f"Set log level for {service} to {level}")
else:
logger.warning(f"Service logger not found: {service}")
for module_name in _SERVICE_MODULE_LOGGERS.get(service, []):
logging.getLogger(module_name).setLevel(log_level)
except Exception as e:
logger.error(f"Error setting log level for {service}: {e}")
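The effect of the extra loop is easy to verify with the standard `logging` module alone. The following sketch is a simplified stand-in for `set_service_level`: it always sets the `picell.<service>` logger (the real method first checks `self.service_loggers`), and the one-entry mapping is a trimmed copy of `_SERVICE_MODULE_LOGGERS` used only for the example.

```python
import logging

# Trimmed copy of the service-to-module-logger map, for illustration only.
SERVICE_MODULE_LOGGERS = {
    'routing': ['routing_manager', 'firewall_manager'],
}


def set_service_level(service: str, level: str) -> int:
    """Set the level on the service logger and on its bare module loggers."""
    # Unknown level names fall back to INFO, as in the original method.
    log_level = getattr(logging, level.upper(), logging.INFO)
    logging.getLogger(f'picell.{service}').setLevel(log_level)
    for module_name in SERVICE_MODULE_LOGGERS.get(service, []):
        logging.getLogger(module_name).setLevel(log_level)
    return log_level


applied = set_service_level('routing', 'debug')
service_level = logging.getLogger('picell.routing').level
module_levels = [logging.getLogger(name).level
                 for name in SERVICE_MODULE_LOGGERS['routing']]
```

Without the loop, records emitted through `logging.getLogger(__name__)` in a manager module would keep whatever level that module logger inherited, regardless of the level chosen in the verbosity panel.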
+100 -45
@@ -31,6 +31,10 @@ from setup_manager import SetupManager
from caddy_manager import CaddyManager
from ddns_manager import DDNSManager
from connectivity_manager import ConnectivityManager
from service_registry import ServiceRegistry
from service_composer import ServiceComposer
from account_manager import AccountManager
from audit_manager import AuditManager
DATA_DIR = os.environ.get('DATA_DIR', '/app/data')
CONFIG_DIR = os.environ.get('CONFIG_DIR', '/app/config')
@@ -42,41 +46,9 @@ config_manager = ConfigManager(
service_bus = ServiceBus()
log_manager = LogManager(log_dir='./data/logs')
- network_manager = NetworkManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- wireguard_manager = WireGuardManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- peer_registry = PeerRegistry(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- email_manager = EmailManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- calendar_manager = CalendarManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- file_manager = FileManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- routing_manager = RoutingManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- vault_manager = VaultManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- container_manager = ContainerManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- cell_link_manager = CellLinkManager(
- data_dir=DATA_DIR, config_dir=CONFIG_DIR,
- wireguard_manager=wireguard_manager,
- network_manager=network_manager,
- )
- auth_manager = AuthManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- setup_manager = SetupManager(config_manager=config_manager, auth_manager=auth_manager)
- caddy_manager = CaddyManager(config_manager=config_manager, data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- ddns_manager = DDNSManager(config_manager=config_manager, data_dir=DATA_DIR, config_dir=CONFIG_DIR)
- connectivity_manager = ConnectivityManager(
- config_manager=config_manager,
- peer_registry=peer_registry,
- data_dir=DATA_DIR,
- config_dir=CONFIG_DIR,
- )
- from service_store_manager import ServiceStoreManager
- service_store_manager = ServiceStoreManager(
- config_manager=config_manager,
- caddy_manager=caddy_manager,
- container_manager=container_manager,
- data_dir=DATA_DIR,
- config_dir=CONFIG_DIR,
- )
- # Service logger configuration
+ # Attach per-service file loggers BEFORE any manager is instantiated. Managers
+ # log during __init__ via self.logger ('picell.<svc>'); without the handlers in
+ # place first, those early records would be lost from the per-service log files.
_service_log_configs = {
'network': {'level': 'INFO', 'formatter': 'json', 'console': False},
'wireguard': {'level': 'INFO', 'formatter': 'json', 'console': False},
@@ -90,16 +62,97 @@ _service_log_configs = {
for _svc, _cfg in _service_log_configs.items():
log_manager.add_service_logger(_svc, _cfg)
- # Apply any persisted log level overrides
- import json as _json
- _levels_file = os.path.join(os.path.dirname(__file__), 'config', 'log_levels.json')
- if os.path.exists(_levels_file):
- try:
- with open(_levels_file) as _lf:
- for _s, _l in _json.load(_lf).items():
- log_manager.set_service_level(_s, _l)
- except Exception:
- pass
+ # ServiceRegistry depends only on config_manager; create it early so
+ # NetworkManager and CaddyManager can derive subdomains from manifests
+ # instead of hardcoding service names.
+ service_registry = ServiceRegistry(config_manager=config_manager)
+ network_manager = NetworkManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR,
+ service_registry=service_registry)
+ wireguard_manager = WireGuardManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ peer_registry = PeerRegistry(data_dir=DATA_DIR, config_dir=CONFIG_DIR,
+ config_manager=config_manager)
+ email_manager = EmailManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR, service_bus=service_bus)
+ calendar_manager = CalendarManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ file_manager = FileManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ routing_manager = RoutingManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ vault_manager = VaultManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ container_manager = ContainerManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ cell_link_manager = CellLinkManager(
+ data_dir=DATA_DIR, config_dir=CONFIG_DIR,
+ wireguard_manager=wireguard_manager,
+ network_manager=network_manager,
+ )
+ auth_manager = AuthManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ caddy_manager = CaddyManager(config_manager=config_manager, data_dir=DATA_DIR, config_dir=CONFIG_DIR,
+ service_bus=service_bus, service_registry=service_registry)
+ ddns_manager = DDNSManager(config_manager=config_manager, data_dir=DATA_DIR, config_dir=CONFIG_DIR,
+ service_bus=service_bus, service_registry=service_registry)
+ connectivity_manager = ConnectivityManager(
+ config_manager=config_manager,
+ peer_registry=peer_registry,
+ vault_manager=vault_manager,
+ data_dir=DATA_DIR,
+ config_dir=CONFIG_DIR,
+ )
+ service_composer = ServiceComposer(config_manager=config_manager, data_dir=DATA_DIR)
+ # Connectivity brings one container up per connection instance via the composer;
+ # wire it now that the composer exists (composer is built after connectivity).
+ connectivity_manager.service_composer = service_composer
+ # cell_relay connections are derived from cell links and route through the cell
+ # WG tunnel; wire the managers that drive that path + handshake-based health.
+ connectivity_manager.cell_link_manager = cell_link_manager
+ connectivity_manager.wireguard_manager = wireguard_manager
+ account_manager = AccountManager(
+ service_registry=service_registry,
+ data_dir=DATA_DIR,
+ config_manager=config_manager,
+ email_manager=email_manager,
+ calendar_manager=calendar_manager,
+ file_manager=file_manager,
+ )
+ from service_store_manager import ServiceStoreManager
+ service_store_manager = ServiceStoreManager(
+ config_manager=config_manager,
+ caddy_manager=caddy_manager,
+ container_manager=container_manager,
+ data_dir=DATA_DIR,
+ config_dir=CONFIG_DIR,
+ service_composer=service_composer,
+ )
+ from egress_manager import EgressManager
+ egress_manager = EgressManager(
+ config_manager=config_manager,
+ service_store_manager=service_store_manager,
+ connectivity_manager=connectivity_manager,
+ data_dir=DATA_DIR,
+ config_dir=CONFIG_DIR,
+ )
+ service_store_manager.egress_manager = egress_manager
+ audit_manager = AuditManager(data_dir=DATA_DIR, config_dir=CONFIG_DIR)
+ setup_manager = SetupManager(config_manager=config_manager, auth_manager=auth_manager,
+ network_manager=network_manager)
# Apply persisted per-service log levels from ConfigManager (single source of
# truth — the logging section of cell_config). This runs AFTER managers are
# instantiated so it overrides their default INFO and reaches the module loggers.
try:
_logging_cfg = config_manager.get_logging_config()
for _svc, _lvl in _logging_cfg['python']['services'].items():
log_manager.set_service_level(_svc, _lvl)
except Exception:
pass
# Let generate_corefile keep the configured CoreDNS log level sticky across all
# regenerations, not just verbosity-triggered ones.
firewall_manager.set_coredns_level_resolver(
lambda: config_manager.get_logging_config()['containers'].get('coredns', 'INFO')
)
service_bus.start()
@@ -110,6 +163,8 @@ __all__ = [
'routing_manager', 'vault_manager', 'container_manager',
'cell_link_manager', 'auth_manager', 'setup_manager', 'caddy_manager',
'ddns_manager', 'service_store_manager', 'connectivity_manager',
'service_registry', 'service_composer', 'account_manager',
'egress_manager', 'audit_manager',
'firewall_manager', 'EventType',
'DATA_DIR', 'CONFIG_DIR',
]
+550
@@ -0,0 +1,550 @@
"""
manifest_validator: single chokepoint for all manifest and compose YAML security checks.
Both ServiceComposer and ServiceStoreManager import from here so validation logic
lives in exactly one place and cannot be bypassed by taking either code path.
"""
import logging
import re
import yaml
from constants import RESERVED_SUBDOMAINS
logger = logging.getLogger('picell')
_SUBDOMAIN_RE = re.compile(r'^[a-z][a-z0-9-]{0,30}$')
_BACKEND_RE = re.compile(r'^[A-Za-z0-9._-]+:\d{1,5}$')
_CAP_ALLOWLIST = frozenset({
'NET_ADMIN', 'NET_RAW', 'NET_BIND_SERVICE', 'CHOWN', 'DAC_OVERRIDE',
'SETUID', 'SETGID', 'KILL', 'SYS_NICE',
})
_CAP_DENYLIST = frozenset({
'ALL', 'SYS_ADMIN', 'SYS_MODULE', 'SYS_PTRACE', 'SYS_RAWIO',
'SYS_BOOT', 'MAC_ADMIN', 'MAC_OVERRIDE', 'SYS_TIME', 'SYS_TTY_CONFIG',
})
_BACKEND_DENYLIST = frozenset({
'cell-api', 'cell-caddy', 'cell-coredns', 'cell-dnsmasq',
'cell-wireguard', 'cell-vault', 'localhost', '127.0.0.1',
'0.0.0.0', 'host.docker.internal',
})
_RESERVED_CONTAINER_NAMES = frozenset({
'cell-api', 'cell-caddy', 'cell-webui', 'cell-coredns',
'cell-dnsmasq', 'cell-wireguard', 'cell-chrony',
})
_CONTAINER_NAME_RE = re.compile(r'^cell-[a-z0-9][a-z0-9-]{0,30}$')
# Instanceable services template their container name with the connection's
# short id, e.g. "cell-wgext-${INSTANCE_ID}". The literal prefix is validated;
# ${INSTANCE_ID} is substituted at up-time with a hex token that itself matches
# the per-instance naming rules.
_INSTANCEABLE_CONTAINER_NAME_RE = re.compile(
r'^cell-[a-z0-9][a-z0-9-]{0,22}-\$\{INSTANCE_ID\}$'
)
_ENV_VALUE_RE = re.compile(r'^[A-Za-z0-9._@:/+\-]{0,256}$')
_HOOK_BINARY_RE = re.compile(r'^[a-z][a-z0-9_-]{0,31}$')
_CAP_NAME_RE = re.compile(r'^[A-Z_]+$')
_ID_RE = re.compile(r'^[a-z][a-z0-9_-]{0,30}$')
_IMAGE_DIGEST_RE = re.compile(
r'^git\.pic\.ngo/roof/[a-zA-Z0-9._/-]+@sha256:[0-9a-f]{64}$'
)
# ── Build-context (Dockerfile) lint ───────────────────────────────────────
#
# These checks are *defense-in-depth*, not a guarantee. A Dockerfile is
# Turing-ish: a determined author can still fetch code at build time via a
# permitted base image's package manager, multi-stage tricks, or build args.
# The real trust boundary is the isolated builder + cosign signature applied
# by the trusted publish stage (P2). This static lint exists to catch the
# obvious-and-cheap mistakes (un-pinned bases, remote ADD, secret-named args)
# before an image is ever built, and to keep the published corpus uniform.
# Base images a community Dockerfile may build FROM. Each MUST be digest
# pinned so the build is reproducible and the base cannot be swapped under us.
# Keep this curated and small; extend deliberately as P2/P3 add languages.
BUILD_BASE_IMAGE_ALLOWLIST = frozenset({
'docker.io/library/alpine',
'docker.io/library/debian',
'docker.io/library/python',
'docker.io/library/golang',
'docker.io/library/node',
'alpine',
'debian',
'python',
'golang',
'node',
'gcr.io/distroless/static',
'gcr.io/distroless/base',
})
# FROM scratch is only allowed for these (otherwise rejected). Empty by
# default — community images should start from a pinned, scannable base.
BUILD_SCRATCH_ALLOWLIST = frozenset()
_DOCKERFILE_SECRET_NAME_RE = re.compile(r'(TOKEN|KEY|PASSWORD|SECRET)', re.IGNORECASE)
_FROM_RE = re.compile(r'^FROM\s+(.+?)(?:\s+AS\s+\S+)?$', re.IGNORECASE)
_ADD_RE = re.compile(r'^ADD\s+(.+)$', re.IGNORECASE)
_ARG_RE = re.compile(r'^ARG\s+([A-Za-z_][A-Za-z0-9_]*)', re.IGNORECASE)
_ENV_RE = re.compile(r'^ENV\s+(.+)$', re.IGNORECASE)
# Context size / file-count caps — a community build context should be small
# (a Dockerfile + a handful of config/entrypoint files), never a whole tree.
BUILD_CONTEXT_MAX_BYTES = 5 * 1024 * 1024 # 5 MiB
BUILD_CONTEXT_MAX_FILES = 200
def validate_manifest(manifest: dict) -> tuple:
"""
Validate security-relevant fields of a store manifest.
Returns (True, []) when all checks pass; (False, [error_strings]) otherwise.
Does not replace the existing _validate_manifest in ServiceStoreManager;
it supplements it as a second layer focused on security-critical fields.
"""
errors = []
# schema_version, when present, must be 3
schema_version = manifest.get('schema_version')
if schema_version is not None and schema_version != 3:
errors.append(
f'schema_version must be 3, got: {schema_version!r}'
)
# kind must be "store" if present — reject builtins coming in over the wire
kind = manifest.get('kind')
if kind is not None and kind != 'store':
errors.append(f'manifest kind must be "store", got: {kind!r}')
# id format check
manifest_id = manifest.get('id')
if manifest_id is not None:
if not isinstance(manifest_id, str) or not _ID_RE.match(manifest_id):
errors.append(
f'id must match ^[a-z][a-z0-9_-]{{0,30}}$, got: {manifest_id!r}'
)
# image must come from git.pic.ngo/roof/*; if a digest IS provided it must be
# valid; first-party images without a digest pin are allowed with a warning.
image = manifest.get('image')
if image is not None:
if not isinstance(image, str):
errors.append(f'image must be a string, got: {image!r}')
elif not image.startswith('git.pic.ngo/roof/'):
errors.append(
f'image must be from git.pic.ngo/roof/*, got: {image!r}'
)
elif '@sha256:' in image:
if not _IMAGE_DIGEST_RE.match(image):
errors.append(
f'image digest must match @sha256:<64-hex>, got: {image!r}'
)
else:
logger.warning('manifest image %s has no digest pin', image)
# container_name structural check
cname = manifest.get('container_name')
if cname is not None:
instanceable = bool(manifest.get('instanceable'))
if instanceable:
if not _INSTANCEABLE_CONTAINER_NAME_RE.match(cname):
errors.append(
'instanceable container_name must match '
"^cell-[a-z0-9][a-z0-9-]{0,22}-${INSTANCE_ID}$, "
f'got: {cname!r}'
)
elif not _CONTAINER_NAME_RE.match(cname):
errors.append(
f'container_name must match ^cell-[a-z0-9][a-z0-9-]{{0,30}}$, got: {cname!r}'
)
elif cname in _RESERVED_CONTAINER_NAMES:
errors.append(f'container_name is reserved: {cname!r}')
# subdomain
subdomain = manifest.get('subdomain')
if subdomain is not None:
_check_subdomain(subdomain, 'subdomain', errors)
# extra_subdomains
for sub in manifest.get('extra_subdomains') or []:
_check_subdomain(sub, 'extra_subdomains entry', errors)
# backend
backend = manifest.get('backend')
if backend is not None:
_check_backend(backend, 'backend', errors)
# extra_backends
for sub_key, bknd_val in (manifest.get('extra_backends') or {}).items():
_check_backend(bknd_val, f'extra_backends[{sub_key!r}]', errors)
# cap_add
cap_add = manifest.get('cap_add')
if cap_add is not None:
if not isinstance(cap_add, list):
errors.append('cap_add must be a list')
else:
for cap in cap_add:
if not isinstance(cap, str):
errors.append(f'cap_add entry must be a string, got: {cap!r}')
continue
if not _CAP_NAME_RE.match(cap):
errors.append(f'cap_add entry must match ^[A-Z_]+$, got: {cap!r}')
continue
if cap in _CAP_DENYLIST:
errors.append(f'cap_add entry is explicitly denied: {cap}')
elif cap not in _CAP_ALLOWLIST:
errors.append(f'cap_add entry not in allowlist: {cap}')
# env values
for env_entry in manifest.get('env') or []:
val = str(env_entry.get('value', ''))
if not _ENV_VALUE_RE.match(val):
errors.append(
f'env[].value contains disallowed characters: {val!r}'
)
# provision_hook
hook = (manifest.get('accounts') or {}).get('provision_hook')
if hook is not None:
ok, msg = validate_provision_hook(hook)
if not ok:
errors.append(msg)
return (len(errors) == 0, errors)
def validate_rendered_compose(yaml_text: str, allowed_data_dir: str = None,
allow_host_network: bool = False) -> tuple:
"""
Parse and security-validate a rendered docker-compose YAML string.
Returns (True, []) when safe; (False, [error_strings]) otherwise.
Rejects constructs that would give a store service elevated access to the host.
allowed_data_dir: when set, absolute bind mounts under this prefix are
permitted; they come from ${PIC_DATA_DIR} substitution and land in the
designated service data directory.
allow_host_network: when True, the compose file is permitted to use
`network_mode: host` and `devices:`; this is required for connectivity services
(wireguard-ext, openvpn-client, tor, sshuttle [cell-sshuttle],
proxy [cell-redsocks]) that must share the host network namespace to
create tun/wg interfaces or expose local transparent-proxy listeners.
The external-network requirement is also waived since host-network
containers reach the cell network directly.
"""
errors = []
try:
doc = yaml.safe_load(yaml_text)
except yaml.YAMLError as exc:
return (False, [f'YAML parse error: {exc}'])
if not isinstance(doc, dict):
return (False, ['compose file must be a YAML mapping'])
# Regular (bridged) services must join the cell-network so Caddy and CoreDNS
# can reach them. Host-network services share the host namespace directly,
# so the external network declaration would be wrong and is omitted.
if not allow_host_network:
networks = doc.get('networks') or {}
has_external = any(
isinstance(v, dict) and v.get('external')
for v in networks.values()
)
if not has_external:
errors.append(
'compose file must declare at least one network with external: true'
)
for svc_name, svc in (doc.get('services') or {}).items():
if not isinstance(svc, dict):
continue
prefix = f'service {svc_name!r}'
cname = svc.get('container_name')
if cname is not None and cname in _RESERVED_CONTAINER_NAMES:
errors.append(f'{prefix}: container_name {cname!r} is reserved')
if svc.get('privileged') is True:
errors.append(f'{prefix}: privileged: true is not allowed')
net_mode = svc.get('network_mode')
if allow_host_network:
if net_mode is not None and net_mode not in ('host',):
errors.append(
f'{prefix}: network_mode {net_mode!r} is not allowed '
'(connectivity services must use host)'
)
else:
if net_mode not in (None, 'bridge'):
errors.append(
f'{prefix}: network_mode {net_mode!r} is not allowed (only bridge)'
)
if svc.get('pid') == 'host':
errors.append(f'{prefix}: pid: host is not allowed')
if svc.get('ipc') == 'host':
errors.append(f'{prefix}: ipc: host is not allowed')
if svc.get('userns_mode') == 'host':
errors.append(f'{prefix}: userns_mode: host is not allowed')
# cap_add
for cap in svc.get('cap_add') or []:
cap_str = str(cap)
if cap_str in _CAP_DENYLIST:
errors.append(f'{prefix}: cap_add {cap_str!r} is explicitly denied')
elif cap_str not in _CAP_ALLOWLIST:
errors.append(f'{prefix}: cap_add {cap_str!r} not in allowlist')
# volumes — reject absolute host-side bind mounts unless they're under
# the sanctioned data directory (injected by ServiceComposer via PIC_DATA_DIR)
for vol in svc.get('volumes') or []:
vol_str = str(vol)
src = vol_str.split(':')[0] if ':' in vol_str else vol_str
if src.startswith('/'):
if allowed_data_dir and src.startswith(allowed_data_dir):
continue
errors.append(
f'{prefix}: absolute host bind mount not allowed: {vol_str!r}'
)
if 'devices' in svc and not allow_host_network:
errors.append(f'{prefix}: devices key is not allowed')
for opt in svc.get('security_opt') or []:
opt_str = str(opt)
if opt_str in ('apparmor=unconfined', 'seccomp=unconfined'):
errors.append(
f'{prefix}: security_opt {opt_str!r} is not allowed'
)
# command must be a list — string form passes through the shell
cmd = svc.get('command')
if cmd is not None and isinstance(cmd, str):
errors.append(
f'{prefix}: command must be a list, not a shell string'
)
# entrypoint must also be a list for the same reason
ep = svc.get('entrypoint')
if ep is not None and isinstance(ep, str):
errors.append(
f'{prefix}: entrypoint must be a list, not a shell string'
)
return (len(errors) == 0, errors)
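The bind-mount check in validate_rendered_compose compares the host path to allowed_data_dir with a plain string prefix test. As a hedged illustration only, a component-wise containment check could look like the sketch below. The helper name is_under_data_dir is hypothetical and not part of this commit; the sketch assumes POSIX paths and does not resolve symlinks.

```python
import posixpath


def is_under_data_dir(src: str, allowed_data_dir: str) -> bool:
    """Hypothetical stricter check: src must be allowed_data_dir or inside it.

    Paths are normalised first so '..' segments cannot climb out of the
    data directory, then compared component by component.
    """
    base = posixpath.normpath(allowed_data_dir)
    candidate = posixpath.normpath(src)
    try:
        return posixpath.commonpath([base, candidate]) == base
    except ValueError:
        # Raised when absolute and relative paths are mixed; treat as "not inside".
        return False


# Both of these pass a plain str.startswith('/app/data') test.
SIBLING = '/app/data-evil/x'
TRAVERSAL = '/app/data/../etc'
```

With a plain prefix test, both SIBLING and TRAVERSAL start with '/app/data'; the component-wise comparison rejects them while still accepting '/app/data/svc'.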
def _stage_aliases(dockerfile_text: str) -> set:
"""Collect multi-stage build aliases (FROM x AS alias) so later FROM <alias>
references resolve to a same-file stage rather than an external base."""
aliases = set()
for raw in dockerfile_text.splitlines():
line = raw.strip()
m = re.match(r'^FROM\s+\S+\s+AS\s+(\S+)\s*$', line, re.IGNORECASE)
if m:
aliases.add(m.group(1).lower())
return aliases
def _base_is_allowed(base_ref: str) -> tuple:
"""Return (ok, error_or_None) for a single FROM base image reference.
Requires an @sha256: digest pin and that the repository part (sans tag/
digest) is in BUILD_BASE_IMAGE_ALLOWLIST. 'scratch' is handled separately.
"""
if '@sha256:' not in base_ref:
return (False, f'FROM base image must be digest-pinned (@sha256:): {base_ref!r}')
repo = base_ref.split('@', 1)[0].split(':', 1)[0]
if repo not in BUILD_BASE_IMAGE_ALLOWLIST:
return (False, f'FROM base image not in allowlist: {repo!r}')
return (True, None)
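_base_is_allowed derives the repository by dropping the digest and then the tag, and it only checks that the substring @sha256: is present. The sketch below reproduces that split and adds an assumed stricter digest check (64 lowercase hex characters at the end of the reference). The names split_repo, base_is_allowed_strict and the reduced allowlist are illustrative and not the module's API.

```python
import re

# Reduced copy of BUILD_BASE_IMAGE_ALLOWLIST, for illustration only.
ALLOWLIST_SUBSET = frozenset({
    'alpine', 'docker.io/library/alpine', 'gcr.io/distroless/static',
})

# Assumption: a pinned reference should end in a full sha256 digest.
_DIGEST_SUFFIX_RE = re.compile(r'@sha256:[0-9a-f]{64}\Z')


def split_repo(base_ref: str) -> str:
    """Same split as _base_is_allowed: strip the digest, then the tag."""
    return base_ref.split('@', 1)[0].split(':', 1)[0]


def base_is_allowed_strict(base_ref: str) -> tuple:
    """Hypothetical variant that also validates the digest shape."""
    if not _DIGEST_SUFFIX_RE.search(base_ref):
        return (False, f'FROM base image must end in @sha256:<64-hex>: {base_ref!r}')
    repo = split_repo(base_ref)
    if repo not in ALLOWLIST_SUBSET:
        return (False, f'FROM base image not in allowlist: {repo!r}')
    return (True, None)
```

Note that a registry with a port (for example registry.example:5000/alpine) is cut at the first colon, so its repository part becomes just the host name and it fails the allowlist lookup.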
def validate_build_context(dockerfile_text: str, context_files=None) -> tuple:
"""
Static lint of a community Dockerfile and its build context.
Returns (True, []) when the Dockerfile passes; (False, [errors]) otherwise.
Enforced (defense-in-depth; see module note above, this is NOT a sandbox):
- every external FROM base must be in BUILD_BASE_IMAGE_ALLOWLIST and
digest-pinned (@sha256:)
- FROM scratch only when allowlisted in BUILD_SCRATCH_ALLOWLIST
- no `ADD http(s)://...` (fetches arbitrary remote content at build time)
- no ARG/ENV whose name matches /(TOKEN|KEY|PASSWORD|SECRET)/i (baking a
secret into a layer / build cache)
- context size and file-count caps when context_files metadata is given
context_files: optional iterable of (path, size_bytes) tuples describing the
build context. Pass None to skip the size/count checks (e.g. when only the
Dockerfile text is available, as in CI lint of the manifest repo).
"""
errors = []
if not isinstance(dockerfile_text, str) or not dockerfile_text.strip():
return (False, ['Dockerfile is empty'])
aliases = _stage_aliases(dockerfile_text)
# Join backslash-continued lines so a multi-line instruction is one logical line.
logical_lines = []
buf = ''
for raw in dockerfile_text.splitlines():
stripped = raw.rstrip()
if stripped.endswith('\\'):
buf += stripped[:-1] + ' '
continue
buf += stripped
logical_lines.append(buf)
buf = ''
if buf:
logical_lines.append(buf)
saw_from = False
for line in logical_lines:
line = line.strip()
if not line or line.startswith('#'):
continue
m_from = _FROM_RE.match(line)
if m_from:
saw_from = True
base = m_from.group(1).strip().split()[0]
base_l = base.lower()
if base_l in aliases:
continue # references an earlier build stage, not an external base
if base_l == 'scratch':
if 'scratch' not in BUILD_SCRATCH_ALLOWLIST:
errors.append('FROM scratch is not allowed')
continue
ok, err = _base_is_allowed(base)
if not ok:
errors.append(err)
continue
m_add = _ADD_RE.match(line)
if m_add:
if re.search(r'https?://', m_add.group(1), re.IGNORECASE):
errors.append(f'ADD from a remote URL is not allowed: {line!r}')
continue
m_arg = _ARG_RE.match(line)
if m_arg and _DOCKERFILE_SECRET_NAME_RE.search(m_arg.group(1)):
errors.append(
f'ARG name looks secret-bearing (matches TOKEN|KEY|PASSWORD|SECRET): {m_arg.group(1)!r}'
)
continue
m_env = _ENV_RE.match(line)
if m_env:
# ENV NAME value | ENV NAME=value [NAME2=value2 ...]
body = m_env.group(1).strip()
names = []
if '=' in body:
for tok in body.split():
if '=' in tok:
names.append(tok.split('=', 1)[0])
else:
names.append(body.split()[0] if body.split() else '')
for name in names:
if name and _DOCKERFILE_SECRET_NAME_RE.search(name):
errors.append(
f'ENV name looks secret-bearing (matches TOKEN|KEY|PASSWORD|SECRET): {name!r}'
)
if not saw_from:
errors.append('Dockerfile has no FROM instruction')
if context_files is not None:
total_bytes = 0
count = 0
for entry in context_files:
try:
_path, size = entry
except (TypeError, ValueError):
_path, size = entry, 0
count += 1
total_bytes += int(size or 0)
if count > BUILD_CONTEXT_MAX_FILES:
errors.append(
f'build context has too many files: {count} > {BUILD_CONTEXT_MAX_FILES}'
)
if total_bytes > BUILD_CONTEXT_MAX_BYTES:
errors.append(
f'build context too large: {total_bytes} bytes > {BUILD_CONTEXT_MAX_BYTES}'
)
return (len(errors) == 0, errors)
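The continuation-joining and ENV-name extraction steps in validate_build_context can be exercised in isolation. The sketch below copies that logic into standalone helpers. The function names join_continuations, env_names and flag_secret_env_names, as well as the sample Dockerfile text, are made up for illustration.

```python
import re

_SECRET_NAME_RE = re.compile(r'(TOKEN|KEY|PASSWORD|SECRET)', re.IGNORECASE)
_ENV_RE = re.compile(r'^ENV\s+(.+)$', re.IGNORECASE)


def join_continuations(dockerfile_text: str) -> list:
    """Merge backslash-continued physical lines into logical lines."""
    logical, buf = [], ''
    for raw in dockerfile_text.splitlines():
        stripped = raw.rstrip()
        if stripped.endswith('\\'):
            buf += stripped[:-1] + ' '
            continue
        buf += stripped
        logical.append(buf)
        buf = ''
    if buf:
        logical.append(buf)
    return logical


def env_names(line: str) -> list:
    """Names set by one ENV instruction (both 'NAME value' and 'NAME=value ...')."""
    m = _ENV_RE.match(line.strip())
    if not m:
        return []
    body = m.group(1).strip()
    if '=' in body:
        return [tok.split('=', 1)[0] for tok in body.split() if '=' in tok]
    return body.split()[:1]


def flag_secret_env_names(dockerfile_text: str) -> list:
    """ENV names that the secret-name pattern would flag."""
    return [name
            for line in join_continuations(dockerfile_text)
            for name in env_names(line)
            if _SECRET_NAME_RE.search(name)]


SAMPLE = (
    'FROM alpine\n'
    'ENV LOG_LEVEL=info \\\n'
    '    API_TOKEN=changeme\n'
    'ENV MONKEY banana\n'
)
```

Because the pattern is a substring search, the sample flags MONKEY (it contains KEY) as well as API_TOKEN. That is consistent with the lint being a cheap heuristic rather than a precise secret detector.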
def validate_provision_hook(hook) -> tuple:
"""
Validate a provision_hook value from accounts.provision_hook.
Acceptable: None/absent, or a dict {"argv": ["binary", "arg1", ...]}.
Rejected: any plain string (shell injection risk), empty argv, uppercase binary,
NUL bytes in any element.
Returns (True, "") on success; (False, error_string) on failure.
"""
if hook is None:
return (True, '')
if isinstance(hook, str):
return (
False,
'provision_hook must be an argv list dict {"argv": [...]}, not a shell string',
)
if not isinstance(hook, dict):
return (False, 'provision_hook must be a dict with argv list')
argv = hook.get('argv')
if not isinstance(argv, list) or len(argv) == 0:
return (False, 'provision_hook.argv must be a non-empty list')
# NUL-byte check must precede regex check so the error message is unambiguous.
for elem in argv:
if isinstance(elem, str) and '\x00' in elem:
return (False, 'provision_hook.argv element contains NUL byte')
binary = argv[0]
if not isinstance(binary, str) or not _HOOK_BINARY_RE.match(binary):
return (
False,
f'provision_hook.argv[0] must match ^[a-z][a-z0-9_-]{{0,31}}$, got: {binary!r}',
)
return (True, '')
# ---------------------------------------------------------------------------
# Internal helpers
# ---------------------------------------------------------------------------
def _check_subdomain(value, field_name: str, errors: list) -> None:
if not isinstance(value, str):
errors.append(f'{field_name} must be a string')
return
if value in RESERVED_SUBDOMAINS:
errors.append(f'{field_name} is reserved: {value!r}')
elif not _SUBDOMAIN_RE.match(value):
errors.append(
f'{field_name} must match ^[a-z][a-z0-9-]{{0,30}}$, got: {value!r}'
)
def _check_backend(value, field_name: str, errors: list) -> None:
if not isinstance(value, str):
errors.append(f'{field_name} must be a string')
return
if not _BACKEND_RE.match(value):
errors.append(
f'{field_name} must be host:port (e.g. cell-foo:8080), got: {value!r}'
)
return
host = value.split(':')[0]
if host in _BACKEND_DENYLIST:
errors.append(f'{field_name} host {host!r} is in the backend denylist')
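One detail of the anchored patterns used throughout this module: in Python, `$` also matches just before a trailing newline, so `re.match` accepts a value such as 'cell-foo:8080\n'. The sketch below shows a possible stricter variant using `fullmatch` plus a port-range check. The helper backend_is_well_formed is hypothetical and not part of this commit.

```python
import re

# Copied from the module above.
_BACKEND_RE = re.compile(r'^[A-Za-z0-9._-]+:\d{1,5}$')


def backend_is_well_formed(value) -> bool:
    """Hypothetical stricter check: whole-string match and a valid TCP port."""
    if not isinstance(value, str) or _BACKEND_RE.fullmatch(value) is None:
        return False
    port = int(value.rsplit(':', 1)[1])
    return 1 <= port <= 65535


# re.match with a '$' anchor tolerates one trailing newline; fullmatch does not.
TRAILING_NEWLINE_ACCEPTED = _BACKEND_RE.match('cell-foo:8080\n') is not None
```

Whether a trailing newline or a port such as 99999 can reach this check in practice depends on how manifests are parsed upstream; the sketch only illustrates the difference in matching behaviour.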
+288 -249
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
"""
Network Manager for Personal Internet Cell
Handles DNS, DHCP, and NTP functionality
Handles DNS and NTP functionality
"""
import os
@@ -11,21 +11,24 @@ import subprocess
import logging
from datetime import datetime
from typing import Dict, List, Optional, Tuple, Any
import requests
from base_service_manager import BaseServiceManager
logger = logging.getLogger(__name__)
class NetworkManager(BaseServiceManager):
"""Manages network services (DNS, DHCP, NTP)"""
def __init__(self, data_dir: str = '/app/data', config_dir: str = '/app/config'):
"""Manages network services (DNS, NTP)"""
def __init__(self, data_dir: str = '/app/data', config_dir: str = '/app/config',
service_registry=None):
super().__init__('network', data_dir, config_dir)
self.dns_zones_dir = os.path.join(data_dir, 'dns')
self.dhcp_leases_file = os.path.join(data_dir, 'dhcp', 'leases')
self._service_registry = service_registry
# Ensure directories exist
self.safe_makedirs(self.dns_zones_dir)
self.safe_makedirs(os.path.dirname(self.dhcp_leases_file))
def update_dns_zone(self, zone_name: str, records: List[Dict]) -> bool:
"""Update DNS zone file with new records"""
@@ -45,7 +48,7 @@ class NetworkManager(BaseServiceManager):
for rec in records:
rname = rec.get('name', '')
rvalue = rec.get('value', '')
if rname and not re.match(r'^[a-zA-Z0-9_.*-]{1,253}$', str(rname)):
if rname and not re.match(r'^[a-zA-Z0-9_@.*-]{1,253}$', str(rname)):
logger.error(f"update_dns_zone: invalid record name {rname!r}")
return False
if rvalue and not re.match(r'^[a-zA-Z0-9._: -]{1,512}$', str(rvalue)):
@@ -165,6 +168,61 @@ class NetworkManager(BaseServiceManager):
self.update_dns_zone(domain, records)
logger.info(f"Created {len(records)} default DNS records for zone '{domain}'")
def update_split_horizon_zone(self, effective_domain: str, caddy_ip: str,
primary_domain: str = 'cell',
peers: Optional[List[Dict]] = None,
cell_links: Optional[List[Dict]] = None) -> bool:
"""Write a local authoritative zone for effective_domain pointing all
hosts (wildcard) to caddy_ip so LAN clients resolve service subdomains
without hairpin NAT. Regenerates the Corefile and reloads CoreDNS."""
import firewall_manager as _fm
# SOA/NS are generated by _generate_zone_content; just pass the A records.
records = [
{'name': '@', 'type': 'A', 'value': caddy_ip},
{'name': '*', 'type': 'A', 'value': caddy_ip},
]
ok = self.update_dns_zone(effective_domain, records)
if not ok:
logger.warning('update_split_horizon_zone: zone file write failed for %s', effective_domain)
# Delete split-horizon zone files for prior cell names sharing the same TLD.
# E.g. when renaming from pic3.pic.ngo → pic2.pic.ngo, remove pic3.pic.ngo.zone.
eff_parts = effective_domain.split('.')
if len(eff_parts) >= 2:
tld_suffix = '.' + '.'.join(eff_parts[1:])
for fname in os.listdir(self.dns_zones_dir):
if fname.endswith('.zone'):
z = fname[:-5]
if z.endswith(tld_suffix) and z != effective_domain:
try:
os.remove(os.path.join(self.dns_zones_dir, fname))
logger.info('Deleted stale split-horizon zone: %s', fname)
except OSError as _e:
logger.warning('Failed to delete stale zone %s: %s', fname, _e)
# If the internal zone name happens to be a parent of the effective DDNS
# domain (e.g. primary_domain='pic.ngo', effective_domain='pic2.pic.ngo'),
# bootstrap service records like 'api', 'calendar' etc. would pollute the
# zone display and shadow the public domain. Remove them.
_stale = {'api', 'webui'} | set(self._BUILTIN_SERVICE_SUBDOMAINS) | set(self._get_service_subdomains())
if effective_domain.endswith('.' + primary_domain):
existing = self._load_dns_records(primary_domain)
cleaned = [r for r in existing if r.get('name', '') not in _stale]
if len(cleaned) < len(existing):
self.update_dns_zone(primary_domain, cleaned)
logger.info('Removed stale service records from zone %s', primary_domain)
corefile = os.path.join(self.config_dir, 'dns', 'Corefile')
peers_data = peers or []
ok_cf = _fm.generate_corefile(
peers_data, corefile, primary_domain,
cell_links=cell_links,
split_horizon_zones=[effective_domain],
)
if ok_cf:
_fm.reload_coredns()
return ok and ok_cf
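The stale-zone cleanup in update_split_horizon_zone selects files by suffix. The following standalone sketch mirrors that selection so its behaviour can be checked without touching the filesystem. The function name stale_zone_files and the sample file names are illustrative only.

```python
def stale_zone_files(effective_domain: str, filenames: list) -> list:
    """Return the *.zone files the cleanup loop would delete.

    A zone is considered stale when it shares everything after the first
    label with effective_domain but is not effective_domain itself.
    """
    parts = effective_domain.split('.')
    if len(parts) < 2:
        return []
    suffix = '.' + '.'.join(parts[1:])
    stale = []
    for fname in filenames:
        if not fname.endswith('.zone'):
            continue
        zone = fname[:-5]  # strip the '.zone' extension
        if zone.endswith(suffix) and zone != effective_domain:
            stale.append(fname)
    return stale
```

For a three-label domain such as pic2.pic.ngo the suffix is '.pic.ngo', so only sibling cell names match. For a two-label domain such as example.com the suffix is just '.com', so any other zone under the same TLD would also be selected.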
def apply_ip_range(self, ip_range: str, cell_name: str, domain: str) -> Dict[str, Any]:
"""Rewrite the primary DNS zone file with IPs derived from the new subnet."""
restarted: List[str] = []
@@ -194,6 +252,30 @@ class NetworkManager(BaseServiceManager):
pass
return '10.0.0.1'
_SUBDOMAIN_RE = re.compile(r'^[a-z][a-z0-9-]{0,30}$')
def _get_service_subdomains(self) -> List[str]:
"""Return all valid service subdomains from the registry, or an empty list
when no registry is configured or the lookup fails."""
registry = getattr(self, "_service_registry", None)
if registry is not None:
try:
subs: List[str] = []
for route in registry.get_caddy_routes():
for sub in [route['subdomain']] + list(route.get('extra_subdomains') or []):
if self._SUBDOMAIN_RE.match(sub):
subs.append(sub)
else:
logger.warning('_get_service_subdomains: skipping invalid subdomain %r', sub)
return subs
except Exception as exc:
logger.warning('_get_service_subdomains: registry error: %s', exc)
return []
# Built-in service subdomains that are always present on a PIC instance.
# These must stay in sync with firewall_manager.SERVICE_IPS keys and the
# Caddy routes for each built-in service.
_BUILTIN_SERVICE_SUBDOMAINS = ('calendar', 'files', 'mail', 'webdav')
def _build_dns_records(self, cell_name: str, ip_range: str) -> List[Dict]:
"""Build the standard set of DNS A records.
@@ -203,16 +285,16 @@ class NetworkManager(BaseServiceManager):
routes requests to the correct backend by Host header.
"""
wg_ip = self._get_wg_server_ip()
return [
{'name': cell_name, 'type': 'A', 'value': wg_ip},
{'name': 'api', 'type': 'A', 'value': wg_ip},
{'name': 'webui', 'type': 'A', 'value': wg_ip},
{'name': 'calendar', 'type': 'A', 'value': wg_ip},
{'name': 'files', 'type': 'A', 'value': wg_ip},
{'name': 'mail', 'type': 'A', 'value': wg_ip},
{'name': 'webmail', 'type': 'A', 'value': wg_ip},
{'name': 'webdav', 'type': 'A', 'value': wg_ip},
records = [
{'name': cell_name, 'type': 'A', 'value': wg_ip},
{'name': 'api', 'type': 'A', 'value': wg_ip},
{'name': 'webui', 'type': 'A', 'value': wg_ip},
]
for sub in self._BUILTIN_SERVICE_SUBDOMAINS:
records.append({'name': sub, 'type': 'A', 'value': wg_ip})
for sub in self._get_service_subdomains():
records.append({'name': sub, 'type': 'A', 'value': wg_ip})
return records
def get_dns_records(self, zone: str = 'cell') -> List[Dict]:
"""Get all DNS records across all zones"""
@@ -228,13 +310,137 @@ class NetworkManager(BaseServiceManager):
logger.error(f"Failed to list DNS records: {e}")
return all_records
def _service_subdomain_routes(self) -> List[Dict[str, str]]:
"""Return validated service subdomain → backend pairs from the registry."""
registry = getattr(self, '_service_registry', None)
if registry is None:
return []
try:
routes: List[Dict[str, str]] = []
for route in registry.get_caddy_routes():
pairs = [(route['subdomain'], route.get('backend', ''))]
extra_backends = route.get('extra_backends') or {}
for sub in route.get('extra_subdomains') or []:
pairs.append((sub, extra_backends.get(sub, route.get('backend', ''))))
for sub, backend in pairs:
if self._SUBDOMAIN_RE.match(sub):
routes.append({'subdomain': sub, 'backend': backend})
else:
logger.warning('_service_subdomain_routes: skipping invalid subdomain %r', sub)
return routes
except Exception as exc:
logger.warning('_service_subdomain_routes: registry error: %s', exc)
return []
def get_dns_overview(self, config_manager, ddns_manager=None,
public_ip: Optional[str] = None) -> Dict[str, Any]:
"""Compose a provider-aware DNS overview from the existing managers.
Does NOT write DNS; it only reads from config_manager (identity/effective
domain), the service registry (subdomains), the internal zone files, and the
DDNS manager (registration status). public_ip may be supplied by the caller
(cached); otherwise it is fetched on demand.
"""
identity = config_manager.get_identity() or {}
mode = identity.get('domain_mode', 'lan')
effective_domain = config_manager.get_effective_domain()
internal_domain = config_manager.get_internal_domain()
ddns_cfg = config_manager.configs.get('ddns', {}) or {}
provider = ddns_cfg.get('provider', '') or ''
if public_ip is None and mode != 'lan':
public_ip = self._fetch_public_ip()
service_subdomains = []
for route in self._service_subdomain_routes():
sub = route['subdomain']
service_subdomains.append({
'subdomain': sub,
'fqdn': f'{sub}.{effective_domain}',
'backend': route['backend'],
})
registration_status: Dict[str, Any] = {}
registered = False
if ddns_manager is not None:
try:
registration_status = ddns_manager.get_status() or {}
except Exception as exc:
logger.warning('get_dns_overview: ddns_manager.get_status failed: %s', exc)
try:
registered = bool(config_manager.get_ddns_token())
except Exception:
registered = False
registration_status.setdefault('registered', registered)
public_records = self._build_public_records(
mode, effective_domain, public_ip, service_subdomains, registered)
return {
'mode': mode,
'provider': provider,
'effective_domain': effective_domain,
'internal_domain': internal_domain,
'public_ip': public_ip,
'public_records': public_records,
'internal_records': self.get_dns_records(),
'service_subdomains': service_subdomains,
'registration_status': registration_status,
}
def _build_public_records(self, mode: str, effective_domain: str,
public_ip: Optional[str],
service_subdomains: List[Dict[str, str]],
registered: bool) -> List[Dict[str, str]]:
"""Derive the public A records the cell publishes (or should publish) per mode."""
ip = public_ip or ''
status = 'registered' if registered else 'unregistered'
records: List[Dict[str, str]] = []
if mode == 'lan':
return records
if mode == 'pic_ngo':
records.append({'name': effective_domain, 'type': 'A',
'value': ip, 'status': status})
records.append({'name': f'*.{effective_domain}', 'type': 'A',
'value': ip, 'status': status})
return records
if mode in ('cloudflare', 'custom'):
records.append({'name': effective_domain, 'type': 'A',
'value': ip, 'status': status})
for svc in service_subdomains:
records.append({'name': svc['fqdn'], 'type': 'A',
'value': ip, 'status': status})
return records
if mode == 'duckdns':
records.append({'name': effective_domain, 'type': 'A',
'value': ip, 'status': status})
records.append({'name': f'*.{effective_domain}', 'type': 'A',
'value': ip, 'status': status})
return records
return records
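The per-mode branches of _build_public_records reduce to two shapes: apex plus wildcard, or apex plus one record per service. A condensed sketch of that mapping, returning only the record names, might look like this. The function public_record_names and the example domains are hypothetical.

```python
def public_record_names(mode: str, effective_domain: str, service_fqdns: list) -> list:
    """Names of the public A records each domain mode implies."""
    if mode in ('pic_ngo', 'duckdns'):
        # Apex plus a wildcard covering every service subdomain.
        return [effective_domain, f'*.{effective_domain}']
    if mode in ('cloudflare', 'custom'):
        # Apex plus one explicit record per service subdomain.
        return [effective_domain, *service_fqdns]
    # 'lan' and unknown modes publish nothing.
    return []
```

In the original method each record additionally carries the public IP and a registered/unregistered status; those fields are omitted here to keep the mode-to-shape mapping visible.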
def _fetch_public_ip(self) -> Optional[str]:
"""Return the current public IPv4 address using ipify, or None on failure."""
try:
resp = requests.get('https://api.ipify.org', timeout=5)
if resp.ok:
return resp.text.strip()
except Exception as exc:
logger.warning('get_dns_overview: could not determine public IP: %s', exc)
return None
def _load_dns_records(self, zone: str) -> List[Dict]:
"""Load DNS records from zone file"""
zone_file = os.path.join(self.dns_zones_dir, f'{zone}.zone')
if not os.path.exists(zone_file):
return []
records = []
try:
with open(zone_file, 'r') as f:
@@ -263,80 +469,6 @@ class NetworkManager(BaseServiceManager):
return records
def get_dhcp_leases(self) -> List[Dict]:
"""Get current DHCP leases"""
leases = []
try:
if os.path.exists(self.dhcp_leases_file):
with open(self.dhcp_leases_file, 'r') as f:
for line in f:
line = line.strip()
if line and not line.startswith('#'):
parts = line.split()
if len(parts) >= 4:
leases.append({
'mac': parts[1],
'ip': parts[2],
'hostname': parts[3] if len(parts) > 3 else '',
'timestamp': parts[0]
})
except Exception as e:
logger.error(f"Failed to load DHCP leases: {e}")
return leases
def add_dhcp_reservation(self, mac: str, ip: str, hostname: str = '') -> bool:
"""Add a DHCP reservation"""
try:
reservation_file = os.path.join(self.config_dir, 'dhcp', 'reservations.conf')
# Ensure directory exists
self.safe_makedirs(os.path.dirname(reservation_file))
# Add reservation
with open(reservation_file, 'a') as f:
f.write(f"dhcp-host={mac},{ip},{hostname}\n")
# Reload DHCP service
self._reload_dhcp_service()
logger.info(f"Added DHCP reservation: {mac} -> {ip}")
return True
except Exception as e:
logger.error(f"Failed to add DHCP reservation: {e}")
return False
def remove_dhcp_reservation(self, mac: str) -> bool:
"""Remove a DHCP reservation"""
try:
reservation_file = os.path.join(self.config_dir, 'dhcp', 'reservations.conf')
if not os.path.exists(reservation_file):
return True
# Read existing reservations
with open(reservation_file, 'r') as f:
lines = f.readlines()
# Remove matching reservation
lines = [line for line in lines if not line.startswith(f"dhcp-host={mac},")]
# Write back
with open(reservation_file, 'w') as f:
f.writelines(lines)
# Reload DHCP service
self._reload_dhcp_service()
logger.info(f"Removed DHCP reservation: {mac}")
return True
except Exception as e:
logger.error(f"Failed to remove DHCP reservation: {e}")
return False
def get_ntp_status(self) -> Dict:
"""Get NTP service status"""
try:
@@ -372,43 +504,17 @@ class NetworkManager(BaseServiceManager):
return {'running': False, 'stats': {}}
def _reload_dns_service(self):
"""Reload DNS service"""
"""Send SIGUSR1 to CoreDNS so the reload plugin picks up zone file changes."""
try:
subprocess.run(['docker', 'exec', 'cell-dns', 'kill', '-HUP', '1'],
capture_output=True, timeout=10)
subprocess.run(['docker', 'kill', '--signal=SIGUSR1', 'cell-dns'],
capture_output=True, timeout=10)
except Exception as e:
logger.error(f"Failed to reload DNS service: {e}")
def _reload_dhcp_service(self):
"""Reload DHCP service"""
try:
subprocess.run(['docker', 'exec', 'cell-dhcp', 'kill', '-HUP', '1'],
capture_output=True, timeout=10)
except Exception as e:
logger.error(f"Failed to reload DHCP service: {e}")
def apply_config(self, config: Dict[str, Any]) -> Dict[str, Any]:
"""Write config to real service files and reload/restart affected containers."""
restarted = []
warnings = []
dnsmasq_changed = False
# DHCP range
if 'dhcp_range' in config:
try:
dhcp_conf = os.path.join(self.config_dir, 'dhcp', 'dnsmasq.conf')
if os.path.exists(dhcp_conf):
with open(dhcp_conf) as f:
lines = f.readlines()
lines = [
f"dhcp-range={config['dhcp_range']}\n" if l.startswith('dhcp-range=') else l
for l in lines
]
with open(dhcp_conf, 'w') as f:
f.writelines(lines)
dnsmasq_changed = True
except Exception as e:
warnings.append(f"dhcp_range write failed: {e}")
# NTP servers
if 'ntp_servers' in config and config['ntp_servers']:
@@ -428,39 +534,17 @@ class NetworkManager(BaseServiceManager):
except Exception as e:
warnings.append(f"ntp_servers write failed: {e}")
if dnsmasq_changed:
self._reload_dhcp_service()
restarted.append('cell-dhcp (reloaded)')
return {'restarted': restarted, 'warnings': warnings}
def apply_domain(self, domain: str, reload: bool = True) -> Dict[str, Any]:
"""Update domain across dnsmasq, Corefile, and zone file; reload DNS + DHCP.
"""Update domain across the Corefile and zone file; reload DNS.
reload=False writes config files only; use when deferring container restart.
"""
restarted = []
warnings = []
# 1. Update dnsmasq.conf domain= line
try:
dhcp_conf = os.path.join(self.config_dir, 'dhcp', 'dnsmasq.conf')
if os.path.exists(dhcp_conf):
with open(dhcp_conf) as f:
lines = f.readlines()
lines = [
f"domain={domain}\n" if l.startswith('domain=') else l
for l in lines
]
with open(dhcp_conf, 'w') as f:
f.writelines(lines)
if reload:
self._reload_dhcp_service()
restarted.append('cell-dhcp (reloaded)')
except Exception as e:
warnings.append(f"dnsmasq domain update failed: {e}")
# 2. Regenerate Corefile — include cell-to-cell forwarding stanzas so a
# 1. Regenerate Corefile — include cell-to-cell forwarding stanzas so a
# domain/ip_range change doesn't wipe cross-cell DNS forwarding zones.
try:
import firewall_manager as _fm
@@ -481,7 +565,7 @@ class NetworkManager(BaseServiceManager):
except Exception as e:
warnings.append(f"Corefile domain update failed: {e}")
# 3. Update zone file: rename and rewrite $ORIGIN / SOA, remove stale zones
# 2. Update zone file: rename and rewrite $ORIGIN / SOA, remove stale zones
try:
dns_data = os.path.join(self.data_dir, 'dns')
if os.path.isdir(dns_data):
@@ -518,7 +602,7 @@ class NetworkManager(BaseServiceManager):
except Exception as e:
warnings.append(f"zone file domain update failed: {e}")
# 4. Reload CoreDNS (only when not deferring to Apply)
# 3. Reload CoreDNS (only when not deferring to Apply)
if reload:
try:
self._reload_dns_service()
@@ -539,42 +623,53 @@ class NetworkManager(BaseServiceManager):
warnings = []
if not new_name:
return {'restarted': restarted, 'warnings': warnings}
_service_names = {'api', 'webui', 'calendar', 'files', 'mail', 'webmail', 'webdav'}
# Exclude service names, wildcard, and apex from cell-hostname detection.
_service_names = {'api', 'webui'} | set(self._BUILTIN_SERVICE_SUBDOMAINS) | set(self._get_service_subdomains())
_reserved = _service_names | {'@', '*'}
changed = False
try:
dns_data = os.path.join(self.data_dir, 'dns')
if os.path.isdir(dns_data):
for fname in os.listdir(dns_data):
if fname.endswith('.zone') and 'local' not in fname:
zone_file = os.path.join(dns_data, fname)
with open(zone_file) as f:
content = f.read()
# Determine which name to replace: prefer old_name if present,
# otherwise detect from zone (non-service A record not in _service_names)
actual_old = old_name if (
old_name and re.search(
rf'^{re.escape(old_name)}\s', content, re.MULTILINE)
) else None
if actual_old is None:
for m in re.finditer(
r'^(\S+)\s+(?:\d+\s+)?IN\s+A\s+\S+', content, re.MULTILINE
):
candidate = m.group(1)
if candidate not in _service_names and candidate != '@':
actual_old = candidate
break
if actual_old is None or actual_old == new_name:
break
new_content = re.sub(
rf'^{re.escape(actual_old)}(\s+(?:\d+\s+)?IN\s+A\s+)',
f'{new_name}\\1',
content, flags=re.MULTILINE
)
if new_content != content:
with open(zone_file, 'w') as f:
f.write(new_content)
changed = True
break
if not fname.endswith('.zone'):
continue
zone_name = fname[:-5]
# Skip split-horizon DDNS zones (multi-label, e.g. 'pic2.pic.ngo.zone')
# and any zone with 'local' in its name. The cell hostname only lives
# in the primary single-label zone (e.g. 'cell.zone').
if 'local' in zone_name or '.' in zone_name:
continue
zone_file = os.path.join(dns_data, fname)
with open(zone_file) as f:
content = f.read()
# Determine which name to replace: prefer old_name if present,
# otherwise detect from zone (non-service A record not in _reserved)
actual_old = old_name if (
old_name and re.search(
rf'^{re.escape(old_name)}\s', content, re.MULTILINE)
) else None
if actual_old is None:
for m in re.finditer(
r'^(\S+)\s+(?:\d+\s+)?IN\s+A\s+\S+', content, re.MULTILINE
):
candidate = m.group(1)
if candidate not in _reserved:
actual_old = candidate
break
if actual_old is None:
continue # no hostname in this zone; try next
if actual_old == new_name:
break # already correct
new_content = re.sub(
rf'^{re.escape(actual_old)}(\s+(?:\d+\s+)?IN\s+A\s+)',
f'{new_name}\\1',
content, flags=re.MULTILINE
)
if new_content != content:
with open(zone_file, 'w') as f:
f.write(new_content)
changed = True
break
if changed and reload:
self._reload_dns_service()
restarted.append('cell-dns (reloaded)')
@@ -666,29 +761,6 @@ class NetworkManager(BaseServiceManager):
except Exception as e:
return {'success': False, 'output': '', 'error': str(e)}
def test_dhcp_functionality(self) -> Dict:
"""Test DHCP functionality"""
try:
# Check if DHCP service is running
result = subprocess.run(['docker', 'ps', '--filter', 'name=cell-dhcp', '--format', '{{.Names}}'],
capture_output=True, text=True)
is_running = len(result.stdout.strip()) > 0
# Get DHCP leases
leases = self.get_dhcp_leases()
return {
'success': is_running,
'running': is_running,
'leases_count': len(leases),
'leases': leases
}
except Exception as e:
logger.error(f"Failed to test DHCP functionality: {e}")
return {'success': False, 'running': False, 'leases_count': 0, 'leases': []}
def test_ntp_functionality(self) -> Dict:
"""Test NTP functionality"""
try:
@@ -787,19 +859,16 @@ class NetworkManager(BaseServiceManager):
if is_docker:
# Check if network containers are actually running
dns_running = self._check_dns_container_status()
dhcp_running = self._check_dhcp_container_status()
ntp_running = self._check_ntp_container_status()
all_running = dns_running and dhcp_running and ntp_running
all_running = dns_running and ntp_running
status = {
'dns_running': dns_running,
'dhcp_running': dhcp_running,
'ntp_running': ntp_running,
'running': all_running,
'status': 'online' if all_running else 'offline',
'network': {
'dns_running': dns_running,
'dhcp_running': dhcp_running,
'ntp_running': ntp_running,
'running': all_running,
'status': 'online' if all_running else 'offline'
@@ -809,25 +878,22 @@ class NetworkManager(BaseServiceManager):
else:
# Check actual service status in production
dns_running = self._check_dns_status()
dhcp_running = self._check_dhcp_status()
ntp_running = self._check_ntp_status()
status = {
'dns_running': dns_running,
'dhcp_running': dhcp_running,
'ntp_running': ntp_running,
'running': dns_running and dhcp_running and ntp_running,
'status': 'online' if (dns_running and dhcp_running and ntp_running) else 'offline',
'running': dns_running and ntp_running,
'status': 'online' if (dns_running and ntp_running) else 'offline',
'network': {
'dns_running': dns_running,
'dhcp_running': dhcp_running,
'ntp_running': ntp_running,
'running': dns_running and dhcp_running and ntp_running,
'status': 'online' if (dns_running and dhcp_running and ntp_running) else 'offline'
'running': dns_running and ntp_running,
'status': 'online' if (dns_running and ntp_running) else 'offline'
},
'timestamp': datetime.utcnow().isoformat()
}
return status
except Exception as e:
return self.handle_error(e, "get_status")
@@ -842,16 +908,6 @@ class NetworkManager(BaseServiceManager):
except Exception:
return False
def _check_dhcp_container_status(self) -> bool:
"""Check if DHCP Docker container is running"""
try:
import docker
client = docker.from_env()
containers = client.containers.list(filters={'name': 'cell-dhcp'})
return len(containers) > 0
except Exception:
return False
def _check_ntp_container_status(self) -> bool:
"""Check if NTP Docker container is running"""
try:
@@ -866,31 +922,28 @@ class NetworkManager(BaseServiceManager):
"""Test network service connectivity"""
try:
dns_test = self.test_dns_resolution('google.com')
dhcp_test = self.test_dhcp_functionality()
ntp_test = self.test_ntp_functionality()
results = {
'dns_test': dns_test,
'dhcp_test': dhcp_test,
'ntp_test': ntp_test,
'timestamp': datetime.utcnow().isoformat()
}
# Determine overall success
success = all(
result.get('success', False)
for result in [dns_test, dhcp_test, ntp_test]
result.get('success', False)
for result in [dns_test, ntp_test]
)
results['success'] = success
# Add network key for compatibility
results['network'] = {
'dns_test': dns_test,
'dhcp_test': dhcp_test,
'ntp_test': ntp_test,
'success': success
}
return results
except Exception as e:
return self.handle_error(e, "test_connectivity")
@@ -909,20 +962,6 @@ class NetworkManager(BaseServiceManager):
except Exception:
return False
def _check_dhcp_status(self) -> bool:
"""Check if DHCP service is running"""
try:
result = subprocess.run(['systemctl', 'is-active', 'dnsmasq'],
capture_output=True, text=True, timeout=5)
return result.returncode == 0 and result.stdout.strip() == 'active'
except Exception:
# Fallback: check if port 67 is listening
try:
result = subprocess.run(['netstat', '-tuln'], capture_output=True, text=True)
return ':67 ' in result.stdout
except Exception:
return False
def _check_ntp_status(self) -> bool:
"""Check if NTP service is running"""
try:
+96 -9
@@ -17,11 +17,17 @@ logger = logging.getLogger(__name__)
class PeerRegistry(BaseServiceManager):
"""Manages peer registration and management"""
def __init__(self, data_dir: str = '/app/data', config_dir: str = '/app/config'):
def __init__(self, data_dir: str = '/app/data', config_dir: str = '/app/config',
config_manager=None):
super().__init__('peer_registry', data_dir, config_dir)
self.lock = RLock()
self.peers = []
self.peers_file = os.path.join(data_dir, 'peers.json')
# config_manager is used to resolve/validate connection ids for the
# per-peer exit (exit_via). It may be wired after construction (the
# singletons in managers.py are built in dependency order), so the
# exit_via→connection-id migration also runs lazily, idempotently.
self.config_manager = config_manager
self._load_peers()
def get_status(self) -> Dict[str, Any]:
@@ -205,6 +211,11 @@ class PeerRegistry(BaseServiceManager):
changed = True
if changed:
self._save_peers()
# Phase 2 (connectivity v2): exit_via is now a connection id (or
# 'default'). Rewrite any legacy per-type exit_via to the id of
# the single migrated connection instance of that type. Runs
# lazily if config_manager is not yet wired.
self._migrate_exit_via_to_connection_id()
else:
self.peers = []
self.logger.info("No peers file found, starting with empty registry")
@@ -350,25 +361,101 @@ class PeerRegistry(BaseServiceManager):
return dict(peer)
raise ValueError(f"Peer '{peer_name}' not found")
# Phase 5: extended connectivity per-peer egress exit
VALID_EXIT_VIA = ('default', 'wireguard_ext', 'openvpn', 'tor')
# Connectivity v2: legacy per-type exit values. A peer's exit_via is now a
# connection id (or 'default'); these strings are accepted only as a
# one-release back-compat shim — resolved to the single migrated instance
# of that type via config_manager.list_connections().
_LEGACY_EXIT_TYPES = ('wireguard_ext', 'openvpn', 'tor', 'sshuttle', 'proxy')
def _connections(self) -> List[Dict[str, Any]]:
"""Return the v2 connection records, or [] when unavailable."""
if self.config_manager is None:
return []
try:
conns = self.config_manager.list_connections()
except Exception as e:
self.logger.warning(f"peer_registry: list_connections failed: {e}")
return []
return conns if isinstance(conns, list) else []
def _resolve_exit_via(self, value: str) -> Optional[str]:
"""Resolve an exit_via value to a valid connection id or 'default'.
Accepts 'default', a real connection id, or, as a back-compat shim,
a legacy type string (resolved to the single instance of that type).
Returns None when the value cannot be resolved to anything valid.
"""
if value == 'default':
return 'default'
conns = self._connections()
for c in conns:
if c.get('id') == value:
return value
if value in self._LEGACY_EXIT_TYPES:
matches = [c for c in conns if c.get('type') == value]
if len(matches) == 1:
return matches[0].get('id')
return None
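The resolution rules in `_resolve_exit_via` can be sketched as a standalone function. This is a minimal illustration, not the registry's actual code path: the free function `resolve_exit_via`, the explicit `conns` argument, and the sample connection ids (`conn-a`, `conn-b`, `conn-c`) are assumptions made so the example runs without a `config_manager`.

```python
from typing import Any, Dict, List, Optional

# Legacy per-type values accepted only as a back-compat shim (copied from the diff).
LEGACY_EXIT_TYPES = ('wireguard_ext', 'openvpn', 'tor', 'sshuttle', 'proxy')


def resolve_exit_via(value: str, conns: List[Dict[str, Any]]) -> Optional[str]:
    """Return 'default', a known connection id, or None if unresolvable."""
    if value == 'default':
        return 'default'
    # A real connection id always wins.
    if any(c.get('id') == value for c in conns):
        return value
    # A legacy type string resolves only when exactly one instance has that type.
    if value in LEGACY_EXIT_TYPES:
        matches = [c for c in conns if c.get('type') == value]
        if len(matches) == 1:
            return matches[0].get('id')
    return None


# Hypothetical connection records: one tor instance, two openvpn instances.
conns = [
    {'id': 'conn-a', 'type': 'tor'},
    {'id': 'conn-b', 'type': 'openvpn'},
    {'id': 'conn-c', 'type': 'openvpn'},
]
```

With these records, `'tor'` resolves to `'conn-a'`, while `'openvpn'` is ambiguous (two instances) and resolves to `None`, which is the case `set_peer_exit_via` rejects with a warning.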
def _migrate_exit_via_to_connection_id(self) -> bool:
"""Rewrite legacy per-type exit_via values to migrated connection ids.
Idempotent: ids and 'default' are left untouched. Legacy type strings
are mapped to the single instance of that type; if no instance exists
the peer falls back to 'default'. Returns True if anything changed.
Runs only when config_manager (and its v2 connections) are available.
"""
if self.config_manager is None:
return False
conns = self._connections()
valid_ids = {c.get('id') for c in conns}
by_type: Dict[str, List[str]] = {}
for c in conns:
by_type.setdefault(c.get('type'), []).append(c.get('id'))
changed = False
with self.lock:
for peer in self.peers:
exit_via = peer.get('exit_via', 'default')
if exit_via == 'default' or exit_via in valid_ids:
continue
new_value = 'default'
if exit_via in self._LEGACY_EXIT_TYPES:
ids = by_type.get(exit_via, [])
if len(ids) == 1:
new_value = ids[0]
peer['exit_via'] = new_value
changed = True
self.logger.info(
f"peer_registry: migrated exit_via {exit_via!r} -> "
f"{new_value!r} for {peer.get('peer')!r}"
)
if changed:
self._save_peers()
return changed
def set_peer_exit_via(self, peer_name: str, exit_type: str) -> bool:
"""Set the per-peer egress exit type. Returns True if updated, False
if the peer is not found (logged as warning, no exception)."""
if exit_type not in self.VALID_EXIT_VIA:
"""Set the per-peer egress connection id. Returns True if updated, False
if the peer is not found or the id is invalid (logged, no exception).
`exit_type` must be a real connection id or 'default'. A legacy type
string is accepted as a back-compat shim and resolved to the single
instance of that type.
"""
resolved = self._resolve_exit_via(exit_type)
if resolved is None:
self.logger.warning(
f"set_peer_exit_via: invalid exit_type {exit_type!r}"
f"set_peer_exit_via: invalid connection id {exit_type!r}"
)
return False
with self.lock:
for peer in self.peers:
if peer.get('peer') == peer_name:
peer['exit_via'] = exit_type
peer['exit_via'] = resolved
peer['updated_at'] = datetime.utcnow().isoformat()
self._save_peers()
self.logger.info(
f"Set exit_via for {peer_name}: {exit_type!r}"
f"Set exit_via for {peer_name}: {resolved!r}"
)
return True
self.logger.warning(
+1
@@ -1,6 +1,7 @@
flask>=3.0.3
flask-cors>=4.0.1
requests>=2.32.3
pyotp>=2.9.0
cryptography>=42.0.5
pyyaml==6.0.1
icalendar==5.0.7
+19
@@ -0,0 +1,19 @@
from functools import wraps
from flask import jsonify
def require_active_service(service_id: str):
"""Decorator: return 404 if the named service is not installed.
Apply to all email/calendar/files routes except /status endpoints,
so the UI can always check installation state without being blocked.
"""
def decorator(fn):
@wraps(fn)
def wrapper(*args, **kwargs):
from app import service_registry
if service_registry.get(service_id) is None:
return jsonify({'error': f'Service {service_id!r} is not installed'}), 404
return fn(*args, **kwargs)
return wrapper
return decorator
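A framework-free sketch of how this decorator gates a handler. The `FakeRegistry` class and the plain `(body, status)` tuples are assumptions standing in for `app.service_registry` and Flask's `jsonify` response, so the example runs without Flask installed; the handler names are illustrative only.

```python
from functools import wraps


class FakeRegistry:
    """Hypothetical stand-in for app.service_registry."""

    def __init__(self, installed):
        self._installed = dict(installed)

    def get(self, service_id):
        # Returns None when the service is not installed, like the real lookup
        # is assumed to do.
        return self._installed.get(service_id)


service_registry = FakeRegistry({'calendar': {'version': '1.0'}})


def require_active_service(service_id):
    """Return a 404-style tuple if the named service is not installed."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if service_registry.get(service_id) is None:
                return {'error': f'Service {service_id!r} is not installed'}, 404
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@require_active_service('calendar')
def get_calendar_users():
    return {'users': []}, 200


@require_active_service('mail')
def get_mail_users():
    return {'users': []}, 200
```

Because the check happens at call time rather than at decoration time, installing or removing a service takes effect on the next request without re-registering routes.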
+69
@@ -0,0 +1,69 @@
"""Audit trail API (admin-only).
Not added to app._PEER_READABLE_PATHS, so enforce_auth blocks peer-role
sessions with 403. Routes are thin; all logic lives in AuditManager.
"""
import logging
from flask import Blueprint, request, jsonify, Response
logger = logging.getLogger('picell')
bp = Blueprint('audit', __name__)
def _filters_from_args():
args = request.args
filters = {}
for field in ('actor', 'action', 'target_type', 'target_id', 'result', 'since', 'until'):
val = args.get(field)
if val:
filters[field] = val
return filters
@bp.route('/api/audit', methods=['GET'])
def list_audit():
try:
from app import audit_manager
try:
limit = int(request.args.get('limit', 100))
except (TypeError, ValueError):
limit = 100
try:
offset = int(request.args.get('offset', 0))
except (TypeError, ValueError):
offset = 0
result = audit_manager.query(_filters_from_args(), limit=limit, offset=offset)
return jsonify(result)
except Exception as e:
logger.error(f"list_audit: {e}")
return jsonify({'error': str(e)}), 500
@bp.route('/api/audit/export', methods=['GET'])
def export_audit():
try:
from app import audit_manager
fmt = request.args.get('format', 'csv')
if fmt != 'csv':
return jsonify({'error': 'only csv format is supported'}), 400
csv_text = audit_manager.export_csv(_filters_from_args())
return Response(
csv_text,
mimetype='text/csv',
headers={'Content-Disposition': 'attachment; filename="audit.csv"'},
)
except Exception as e:
logger.error(f"export_audit: {e}")
return jsonify({'error': str(e)}), 500
@bp.route('/api/audit/verify', methods=['GET'])
def verify_audit():
try:
from app import audit_manager
return jsonify(audit_manager.verify_chain())
except Exception as e:
logger.error(f"verify_audit: {e}")
return jsonify({'error': str(e)}), 500
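The query-string handling in `list_audit` and `_filters_from_args` can be exercised without Flask. The sketch below assumes a plain dict in place of `request.args`; the helper names `int_arg` and `filters_from` are hypothetical and do not appear in the commit.

```python
AUDIT_FILTER_FIELDS = (
    'actor', 'action', 'target_type', 'target_id', 'result', 'since', 'until',
)


def int_arg(args, name, default):
    """Parse args[name] as an int, falling back to default on bad input."""
    try:
        return int(args.get(name, default))
    except (TypeError, ValueError):
        return default


def filters_from(args):
    """Keep only the recognised filter fields that have a non-empty value."""
    return {f: args[f] for f in AUDIT_FILTER_FIELDS if args.get(f)}


# Example: a malformed limit falls back to 100, unknown and empty keys are dropped.
query = {'actor': 'admin', 'result': '', 'bogus': 'x', 'limit': 'abc'}
limit = int_arg(query, 'limit', 100)
offset = int_arg(query, 'offset', 0)
filters = filters_from(query)
```

A malformed `limit` or `offset` silently falls back to its default rather than returning a 400, which matches the behavior shown in the route.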
+9
@@ -1,9 +1,12 @@
import logging
from flask import Blueprint, request, jsonify
from routes import require_active_service
logger = logging.getLogger('picell')
bp = Blueprint('calendar', __name__)
@bp.route('/api/calendar/users', methods=['GET'])
@require_active_service('calendar')
def get_calendar_users():
"""Get calendar users."""
try:
@@ -15,6 +18,7 @@ def get_calendar_users():
return jsonify({"error": str(e)}), 500
@bp.route('/api/calendar/users', methods=['POST'])
@require_active_service('calendar')
def create_calendar_user():
"""Create calendar user."""
try:
@@ -33,6 +37,7 @@ def create_calendar_user():
return jsonify({"error": str(e)}), 500
@bp.route('/api/calendar/users/<username>', methods=['DELETE'])
@require_active_service('calendar')
def delete_calendar_user(username):
"""Delete calendar user."""
try:
@@ -44,6 +49,7 @@ def delete_calendar_user(username):
return jsonify({"error": str(e)}), 500
@bp.route('/api/calendar/calendars', methods=['POST'])
@require_active_service('calendar')
def create_calendar():
"""Create calendar."""
try:
@@ -67,6 +73,7 @@ def create_calendar():
return jsonify({"error": str(e)}), 500
@bp.route('/api/calendar/events', methods=['POST'])
@require_active_service('calendar')
def add_calendar_event():
try:
from app import calendar_manager
@@ -85,6 +92,7 @@ def add_calendar_event():
return jsonify({"error": str(e)}), 500
@bp.route('/api/calendar/events/<username>/<calendar_name>', methods=['GET'])
@require_active_service('calendar')
def get_calendar_events(username, calendar_name):
"""Get calendar events."""
try:
@@ -108,6 +116,7 @@ def get_calendar_status():
return jsonify({"error": str(e)}), 500
@bp.route('/api/calendar/connectivity', methods=['GET'])
@require_active_service('calendar')
def test_calendar_connectivity():
"""Test calendar connectivity."""
try:
+16 -5
@@ -47,7 +47,7 @@ def get_cell_invite():
from app import cell_link_manager, config_manager
identity = config_manager.configs.get('_identity', {})
cell_name = identity.get('cell_name', os.environ.get('CELL_NAME', 'mycell'))
domain = identity.get('domain', os.environ.get('CELL_DOMAIN', 'cell'))
domain = identity.get('domain_name') or identity.get('domain', os.environ.get('CELL_DOMAIN', 'cell'))
return jsonify(cell_link_manager.generate_invite(cell_name, domain))
except Exception as e:
logger.error(f"Error generating cell invite: {e}")
@@ -145,12 +145,13 @@ def update_cell_permissions(cell_name):
# Regenerate Corefile so outbound DNS changes take effect
try:
from app import config_manager
domain = config_manager.configs.get('_identity', {}).get('domain', 'cell')
from app import _configured_dns_params
peers = peer_registry.list_peers()
cell_links = cell_link_manager.list_connections()
firewall_manager.apply_all_dns_rules(peers, COREFILE_PATH, domain,
cell_links=cell_links)
_dns_primary, _dns_szones = _configured_dns_params()
firewall_manager.apply_all_dns_rules(peers, COREFILE_PATH, _dns_primary,
cell_links=cell_links,
split_horizon_zones=_dns_szones)
except Exception as e:
logger.warning(f"DNS regen after permission update failed (non-fatal): {e}")
@@ -175,6 +176,11 @@ def set_exit_offer(cell_name):
if 'exit_offered' not in data:
return jsonify({'error': 'exit_offered field required'}), 400
link = cell_link_manager.set_exit_offered(cell_name, bool(data['exit_offered']))
try:
from app import connectivity_manager
connectivity_manager.reconcile_cell_relays()
except Exception as _re:
logger.warning(f"cell_relay reconcile after exit-offer failed (non-fatal): {_re}")
return jsonify({'message': f"Exit offer for '{cell_name}' updated", 'link': link})
except ValueError as e:
return jsonify({'error': str(e)}), 404
@@ -261,6 +267,11 @@ def peer_sync_permissions():
cell_link_manager.apply_remote_permissions(sender_pubkey, perms,
exit_offered=exit_offered,
use_as_exit_relay=use_as_exit_relay)
try:
from app import connectivity_manager
connectivity_manager.reconcile_cell_relays()
except Exception as _re:
logger.warning(f"cell_relay reconcile after peer-sync failed (non-fatal): {_re}")
return jsonify({'ok': True, 'applied_at': datetime.utcnow().isoformat()})
except ValueError as e:
return jsonify({'ok': False, 'error': str(e)}), 404
+317 -32
@@ -118,6 +118,21 @@ def get_config():
'vip_webdav': _ips['vip_webdav'],
}
config['service_configs'] = service_configs
config['installed_services'] = config_manager.get_installed_services()
config['domain_mode'] = identity.get('domain_mode', 'lan')
config['domain_name'] = identity.get('domain_name', '')
config['effective_domain'] = config_manager.get_effective_domain()
ddns_section = config_manager.configs.get('ddns', {})
_provider = ddns_section.get('provider', '')
_has_token = bool(
(config_manager.get_ddns_token() if _provider == 'pic_ngo' else '') or
ddns_section.get('api_token') or ddns_section.get('token')
)
config['ddns'] = {
'provider': _provider,
'subdomain': ddns_section.get('subdomain', ''),
'has_token': _has_token,
}
return jsonify(config)
except Exception as e:
logger.error(f"Error getting config: {e}")
@@ -306,12 +321,6 @@ def update_config():
domain = identity_updates['domain']
net_result = network_manager.apply_domain(domain, reload=False)
all_warnings.extend(net_result.get('warnings', []))
_cur_id = config_manager.configs.get('_identity', {})
ip_utils.write_caddyfile(
_cur_id.get('ip_range', os.environ.get('CELL_IP_RANGE', '172.20.0.0/16')),
_cur_id.get('cell_name', os.environ.get('CELL_NAME', 'mycell')),
domain, '/app/config-caddy/Caddyfile'
)
_set_pending_restart(
[f'domain changed to {domain}'],
['dns', 'caddy'],
@@ -324,18 +333,23 @@ def update_config():
if old_name != new_name:
cn_result = network_manager.apply_cell_name(old_name, new_name, reload=False)
all_warnings.extend(cn_result.get('warnings', []))
_cur_id2 = config_manager.configs.get('_identity', {})
ip_utils.write_caddyfile(
_cur_id2.get('ip_range', os.environ.get('CELL_IP_RANGE', '172.20.0.0/16')),
new_name,
identity_updates.get('domain') or _cur_id2.get('domain', os.environ.get('CELL_DOMAIN', 'cell')),
'/app/config-caddy/Caddyfile'
)
_set_pending_restart(
[f'cell_name changed to {new_name}'],
['dns'],
pre_change_snapshot=_pre_change_snapshot,
)
_ddns_cfg = config_manager.configs.get('ddns', {})
if _ddns_cfg.get('provider') == 'pic_ngo':
try:
from ddns_manager import DDNSManager as _DDNSManager
_ddns_mgr = _DDNSManager(config_manager)
_result = _ddns_mgr.register(new_name, '')
_new_sub = _result.get('subdomain', f'{new_name}.pic.ngo')
config_manager.set_identity_field('domain_name', _new_sub)
logger.info('DDNS re-registered: cell_name=%r subdomain=%r', new_name, _new_sub)
except Exception as _exc:
logger.warning('DDNS re-registration failed for %r: %s', new_name, _exc)
all_warnings.append(f'DDNS name update failed — {_exc}')
if identity_updates.get('ip_range') and identity_updates['ip_range'] != old_identity.get('ip_range', ''):
new_range = identity_updates['ip_range']
@@ -349,13 +363,34 @@ def update_config():
firewall_manager.ensure_caddy_virtual_ips()
env_file = os.environ.get('COMPOSE_ENV_FILE', '/app/.env.compose')
ip_utils.write_env_file(new_range, env_file, _collect_service_ports(config_manager.configs))
ip_utils.write_caddyfile(new_range, cur_cell_name, cur_domain, '/app/config-caddy/Caddyfile')
_set_pending_restart(
[f'ip_range changed to {new_range} — network will be recreated'],
['*'], network_recreate=True,
pre_change_snapshot=_pre_change_snapshot,
)
if identity_updates:
_cur_identity = config_manager.configs.get('_identity', {})
_eff_domain = config_manager.get_effective_domain()
service_bus.publish_event(EventType.IDENTITY_CHANGED, 'config', {
'cell_name': _cur_identity.get('cell_name'),
'domain': _cur_identity.get('domain'),
'domain_name': _cur_identity.get('domain_name'),
'domain_mode': _cur_identity.get('domain_mode'),
'effective_domain': _eff_domain,
})
if _cur_identity.get('domain_mode', 'lan') != 'lan' and _eff_domain:
try:
import ip_utils as _ip_sh
_ip_range = _cur_identity.get('ip_range', os.environ.get('CELL_IP_RANGE', '172.20.0.0/16'))
_caddy_ip = _ip_sh.get_service_ips(_ip_range).get('caddy', '172.20.0.2')
_primary_domain = _cur_identity.get('domain', os.environ.get('CELL_DOMAIN', 'cell'))
network_manager.update_split_horizon_zone(
_eff_domain, _caddy_ip, primary_domain=_primary_domain
)
except Exception as _sh_exc:
logger.warning('split-horizon zone update failed: %s', _sh_exc)
_PORT_CHANGE_MAP = {
('network', 'dns_port'): ('dns_port', ['dns']),
('wireguard','port'): ('wg_port', ['wireguard']),
@@ -442,6 +477,205 @@ def update_config():
return jsonify({"error": str(e)}), 500
@bp.route('/api/ddns/check/<name>', methods=['GET'])
def ddns_check_name(name):
import urllib.request as _ureq
import urllib.error as _uerr
import json as _json_
from setup_manager import DDNS_API_BASE as _DDNS_BASE
try:
url = f'{_DDNS_BASE}/api/v1/check/{name}'
with _ureq.urlopen(url, timeout=8) as resp:
body = _json_.loads(resp.read())
return jsonify({'available': bool(body.get('available'))})
except Exception as exc:
logger.warning('DDNS check failed for %r: %s', name, exc)
return jsonify({'available': None, 'error': 'DDNS service unreachable'}), 503
@bp.route('/api/ddns', methods=['PUT'])
def update_ddns_config():
import urllib.request as _ureq
import urllib.error as _uerr
import json as _json_
try:
from app import config_manager
from setup_manager import _build_ddns_config, DDNS_API_BASE as _DDNS_BASE
data = request.get_json(silent=True) or {}
domain_mode = data.get('domain_mode', '').strip()
domain_name = data.get('domain_name', '').strip()
cf_token = data.get('cloudflare_api_token', '').strip()
duck_token = data.get('duckdns_token', '').strip()
from setup_manager import VALID_DOMAIN_MODES
if domain_mode not in VALID_DOMAIN_MODES:
return jsonify({'error': f'domain_mode must be one of: {", ".join(sorted(VALID_DOMAIN_MODES))}'}), 400
if domain_mode == 'cloudflare':
if not domain_name:
return jsonify({'error': 'domain_name is required for cloudflare'}), 400
if not cf_token:
existing = config_manager.configs.get('ddns', {}).get('api_token', '')
if not existing:
return jsonify({'error': 'cloudflare_api_token is required'}), 400
cf_token = existing
try:
req = _ureq.Request(
'https://api.cloudflare.com/client/v4/user/tokens/verify',
headers={'Authorization': f'Bearer {cf_token}'},
)
with _ureq.urlopen(req, timeout=8) as resp:
body = _json_.loads(resp.read())
if not body.get('success'):
return jsonify({'error': 'Cloudflare token is invalid'}), 422
except _uerr.HTTPError:
return jsonify({'error': 'Cloudflare token is invalid'}), 422
except Exception as exc:
return jsonify({'error': f'Could not reach Cloudflare: {exc}'}), 503
if domain_mode == 'duckdns':
if not domain_name:
return jsonify({'error': 'domain_name is required for duckdns'}), 400
if not duck_token:
existing = config_manager.configs.get('ddns', {}).get('token', '')
if not existing:
return jsonify({'error': 'duckdns_token is required'}), 400
duck_token = existing
subdomain = domain_name.replace('.duckdns.org', '')
try:
url = f'https://www.duckdns.org/update?domains={subdomain}&token={duck_token}&ip='
with _ureq.urlopen(url, timeout=8) as resp:
if resp.read().strip() != b'OK':
return jsonify({'error': 'DuckDNS token or subdomain is invalid'}), 422
except Exception as exc:
return jsonify({'error': f'Could not reach DuckDNS: {exc}'}), 503
duck_sub = domain_name.replace('.duckdns.org', '') if domain_mode == 'duckdns' else ''
ddns_cfg = _build_ddns_config(
domain_mode,
cloudflare_api_token=cf_token,
duckdns_token=duck_token,
duckdns_subdomain=duck_sub,
)
config_manager.set_ddns_config(ddns_cfg)
config_manager.set_identity_field('domain_mode', domain_mode)
if domain_name:
config_manager.set_identity_field('domain_name', domain_name)
if domain_mode == 'cloudflare' and cf_token:
config_manager.set_identity_field('cloudflare_api_token', cf_token)
if domain_mode == 'duckdns':
if duck_token:
config_manager.set_identity_field('duckdns_token', duck_token)
config_manager.set_identity_field('duckdns_subdomain', duck_sub)
# Fire IDENTITY_CHANGED so CaddyManager regenerates the Caddyfile
# for the new domain mode without requiring a container restart.
try:
from app import service_bus as _sbus, EventType as _ET
_cur = config_manager.configs.get('_identity', {})
_sbus.publish_event(_ET.IDENTITY_CHANGED, 'config', {
'cell_name': _cur.get('cell_name'),
'domain': _cur.get('domain'),
'domain_name': _cur.get('domain_name'),
'domain_mode': _cur.get('domain_mode'),
'effective_domain': config_manager.get_effective_domain(),
})
except Exception as _ev_err:
logger.warning('update_ddns_config: failed to fire IDENTITY_CHANGED: %s', _ev_err)
logger.info('DDNS config updated: domain_mode=%r domain_name=%r', domain_mode, domain_name)
return jsonify({'updated': True})
except Exception as e:
logger.error(f'Error updating DDNS config: {e}')
return jsonify({'error': str(e)}), 500
_ddns_public_ip_cache: dict = {'ip': None, 'at': 0}
@bp.route('/api/ddns/status', methods=['GET'])
def ddns_status():
import time as _time
from app import config_manager
ddns_cfg = config_manager.configs.get('ddns', {})
identity = config_manager.configs.get('_identity', {})
now = _time.time()
if now - _ddns_public_ip_cache['at'] > 30 or not _ddns_public_ip_cache['ip']:
try:
import requests as _req
resp = _req.get('https://api.ipify.org', timeout=5)
if resp.ok:
_ddns_public_ip_cache['ip'] = resp.text.strip()
_ddns_public_ip_cache['at'] = now
except Exception:
pass
last_ip = None
try:
from app import ddns_manager as _ddns_mgr_singleton
last_ip = _ddns_mgr_singleton._last_ip
except Exception:
pass
registered = bool(config_manager.get_ddns_token())
return jsonify({
'registered': registered,
'domain_name': identity.get('domain_name', ''),
'public_ip': _ddns_public_ip_cache['ip'],
'last_ip': last_ip,
})
@bp.route('/api/ddns/register', methods=['POST'])
def ddns_register():
"""Trigger (re-)registration with the configured DDNS provider."""
try:
from app import config_manager
ddns_cfg = config_manager.configs.get('ddns', {})
if ddns_cfg.get('provider') != 'pic_ngo':
return jsonify({'error': 'Re-registration only supported for pic_ngo provider'}), 400
identity = config_manager.configs.get('_identity', {})
cell_name = identity.get('cell_name', os.environ.get('CELL_NAME', ''))
if not cell_name:
return jsonify({'error': 'cell_name not configured'}), 400
from ddns_manager import DDNSManager as _DDNSManager
_mgr = _DDNSManager(config_manager)
result = _mgr.register(cell_name, '')
new_sub = result.get('subdomain', f'{cell_name}.pic.ngo')
config_manager.set_identity_field('domain_name', new_sub)
logger.info('DDNS registered via /api/ddns/register: cell_name=%r subdomain=%r', cell_name, new_sub)
from app import service_bus, EventType
_reg_identity = config_manager.configs.get('_identity', {})
service_bus.publish_event(EventType.IDENTITY_CHANGED, 'ddns_register', {
'cell_name': _reg_identity.get('cell_name'),
'domain': _reg_identity.get('domain'),
'domain_name': new_sub,
'domain_mode': _reg_identity.get('domain_mode'),
'effective_domain': config_manager.get_effective_domain(),
})
return jsonify({'registered': True, 'subdomain': new_sub})
except Exception as e:
logger.error('Error in /api/ddns/register: %s', e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/ddns/sync', methods=['POST'])
def ddns_sync_records():
"""Sync per-service public DNS records (Cloudflare provider)."""
try:
from app import ddns_manager
from ddns_manager import DDNSError
try:
result = ddns_manager.sync_service_records()
except DDNSError as exc:
return jsonify({'error': str(exc)}), 400
return jsonify(result)
except Exception as e:
logger.error('Error in /api/ddns/sync: %s', e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/config/pending', methods=['GET'])
def get_pending_config():
from app import config_manager
@@ -481,11 +715,12 @@ def cancel_pending_config():
if cur_cell_name and old_cell_name and cur_cell_name != old_cell_name:
network_manager.apply_cell_name(cur_cell_name, old_cell_name, reload=False)
_ip_revert.write_caddyfile(
_id.get('ip_range', os.environ.get('CELL_IP_RANGE', '172.20.0.0/16')),
_id.get('cell_name', os.environ.get('CELL_NAME', 'mycell')),
_dom, '/app/config-caddy/Caddyfile'
)
# Regenerate Caddyfile for the reverted identity (all domain modes)
try:
from app import caddy_manager as _cm
_cm.regenerate_with_installed([])
except Exception as _cm_err:
logger.warning('cancel_pending_config: caddy regenerate failed (non-fatal): %s', _cm_err)
_clear_pending_restart()
return jsonify({'message': 'Pending changes discarded'})
@@ -573,6 +808,12 @@ def apply_pending_config():
+ (' (network_recreate)' if needs_network_recreate else '')
)
else:
# Clear needs_restart immediately so frontend polls don't see stale
# state while the container restart runs in the background thread.
config_manager.configs['_pending_restart']['needs_restart'] = False
config_manager.configs['_pending_restart']['applying'] = True
config_manager._save_all_configs()
def _do_apply():
import time as _time
import subprocess as _subprocess
@@ -589,7 +830,7 @@ def apply_pending_config():
logger.error(f"docker compose up failed: {result.stderr.strip()}")
else:
logger.info(f'docker compose up completed for: {containers}')
_clear_pending_restart()
_clear_pending_restart()
threading.Thread(target=_do_apply, daemon=False).start()
return jsonify({
@@ -604,13 +845,22 @@ def apply_pending_config():
@bp.route('/api/config/backup', methods=['POST'])
def create_config_backup():
try:
from app import config_manager, service_bus, EventType
backup_id = config_manager.backup_config()
from app import config_manager, service_bus, service_registry, EventType
data = request.get_json(silent=True) or {}
passphrase = data.get('passphrase') or None
backup_id = config_manager.backup_config(
service_registry=service_registry, passphrase=passphrase)
service_bus.publish_event(EventType.BACKUP_CREATED, 'api', {
'backup_id': backup_id,
'timestamp': datetime.utcnow().isoformat()
})
return jsonify({"backup_id": backup_id})
return jsonify({
"backup_id": backup_id,
"encrypted": bool(passphrase),
"warning": "This backup contains secrets and key material "
"(WireGuard keys, internal CA, admin credentials). "
"Store it securely.",
})
except Exception as e:
logger.error(f"Error creating backup: {e}")
return jsonify({"error": str(e)}), 500
@@ -629,9 +879,19 @@ def list_config_backups():
@bp.route('/api/config/restore/<backup_id>', methods=['POST'])
def restore_config(backup_id):
try:
from app import config_manager, service_bus, EventType
from app import config_manager, service_bus, service_registry, EventType
data = request.get_json(silent=True) or {}
success = config_manager.restore_config(backup_id, services=data.get('services'))
services = data.get('services')
passphrase = data.get('passphrase') or None
try:
success = config_manager.restore_config(
backup_id,
services=services,
service_registry=service_registry if services is None else None,
passphrase=passphrase,
)
except PermissionError:
return jsonify({"error": "Invalid or missing passphrase for encrypted backup"}), 400
if success:
service_bus.publish_event(EventType.RESTORE_COMPLETED, 'api', {
'backup_id': backup_id,
@@ -679,6 +939,10 @@ def download_backup(backup_id):
backup_path = config_manager.backup_dir / backup_id
if not backup_path.exists():
return jsonify({'error': f'Backup {backup_id} not found'}), 404
if backup_path.is_file():
# Encrypted archive — serve as-is.
return send_file(str(backup_path), mimetype='application/octet-stream',
as_attachment=True, download_name=backup_id)
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
for f in backup_path.rglob('*'):
@@ -697,27 +961,48 @@ def download_backup(backup_id):
def upload_backup():
try:
from app import config_manager
import backup_crypto
if 'file' not in request.files:
return jsonify({'error': 'No file provided'}), 400
f = request.files['file']
filename = f.filename or ''
if filename.endswith('.zip'):
backup_id = filename[:-4]
else:
raw = f.read()
# Derive a clean backup id from the filename, stripping known suffixes.
stem = filename
for suffix in ('.tar.gz.age', '.age', '.zip'):
if stem.endswith(suffix):
stem = stem[:-len(suffix)]
break
backup_id = ''.join(c for c in stem if c.isalnum() or c == '_')
if not backup_id:
backup_id = f"backup_{datetime.utcnow().strftime('%Y%m%d_%H%M%S')}"
backup_id = ''.join(c for c in backup_id if c.isalnum() or c == '_')
# Encrypted backups are opaque blobs: store them verbatim as
# <id>.tar.gz.age so restore_config()/_resolve_backup_dir() can decrypt
# them with the passphrase supplied at restore time.
if backup_crypto.is_encrypted(raw):
config_manager.backup_dir.mkdir(parents=True, exist_ok=True)
archive_path = config_manager.backup_dir / f'{backup_id}.tar.gz.age'
archive_path.write_bytes(raw)
try:
os.chmod(archive_path, 0o600)
except OSError as e:
logger.warning(f"Could not chmod uploaded backup: {e}")
return jsonify({'backup_id': backup_id, 'encrypted': True})
backup_path = config_manager.backup_dir / backup_id
backup_path.mkdir(parents=True, exist_ok=True)
try:
with zipfile.ZipFile(io.BytesIO(f.read())) as zf:
with zipfile.ZipFile(io.BytesIO(raw)) as zf:
zf.extractall(backup_path)
except zipfile.BadZipFile:
shutil.rmtree(backup_path, ignore_errors=True)
return jsonify({'error': 'Invalid zip file'}), 400
return jsonify({'error': 'Invalid backup file'}), 400
if not (backup_path / 'manifest.json').exists():
shutil.rmtree(backup_path, ignore_errors=True)
return jsonify({'error': 'Invalid backup: missing manifest.json'}), 400
return jsonify({'backup_id': backup_id})
return jsonify({'backup_id': backup_id, 'encrypted': False})
except Exception as e:
logger.error(f"Error uploading backup: {e}")
return jsonify({'error': str(e)}), 500
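The new `upload_backup` derives the backup id from the uploaded filename by stripping one known suffix and then dropping every character that is not alphanumeric or an underscore, which also removes path separators and dots. The sketch below isolates that step. `derive_backup_id` and `KNOWN_SUFFIXES` are hypothetical names, and the fallback timestamp uses a timezone-aware `datetime.now(timezone.utc)` where the route calls `datetime.utcnow()`.

```python
from datetime import datetime, timezone

# Suffix order matters: the longest suffix must be tried first, otherwise
# '.age' would match 'x.tar.gz.age' and leave '.tar.gz' in the stem.
KNOWN_SUFFIXES = ('.tar.gz.age', '.age', '.zip')


def derive_backup_id(filename: str) -> str:
    """Turn an uploaded filename into a filesystem-safe backup id."""
    stem = filename
    for suffix in KNOWN_SUFFIXES:
        if stem.endswith(suffix):
            stem = stem[:-len(suffix)]
            break
    # Keep only alphanumerics and underscores; this discards '/', '.', '-' etc.
    backup_id = ''.join(c for c in stem if c.isalnum() or c == '_')
    if not backup_id:
        # Nothing usable in the filename: fall back to a timestamped id.
        stamp = datetime.now(timezone.utc).strftime('%Y%m%d_%H%M%S')
        backup_id = f'backup_{stamp}'
    return backup_id
```

Note that the sanitizer is lossy: `my-backup.zip` and `mybackup.zip` map to the same id, so two differently named uploads can collide.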
+13 -5
@@ -1,29 +1,33 @@
import logging
from flask import Blueprint, request, jsonify
from routes import require_active_service
logger = logging.getLogger('picell')
bp = Blueprint('email', __name__)
@bp.route('/api/email/users', methods=['GET'])
@require_active_service('email')
def get_email_users():
"""Get email users."""
try:
from app import email_manager
users = email_manager.get_users()
users = email_manager.get_email_users()
return jsonify(users)
except Exception as e:
logger.error(f"Error getting email users: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/email/users', methods=['POST'])
@require_active_service('email')
def create_email_user():
"""Create email user."""
try:
from app import email_manager, _configured_domain
from app import email_manager, config_manager
data = request.get_json(silent=True)
if data is None:
return jsonify({"error": "No data provided"}), 400
username = data.get('username')
domain = data.get('domain') or _configured_domain()
domain = data.get('domain') or config_manager.get_effective_domain()
password = data.get('password')
if not username or not password:
return jsonify({"error": "Missing required fields: username, password"}), 400
@@ -34,11 +38,12 @@ def create_email_user():
return jsonify({"error": str(e)}), 500
@bp.route('/api/email/users/<username>', methods=['DELETE'])
@require_active_service('email')
def delete_email_user(username):
"""Delete email user."""
try:
from app import email_manager, _configured_domain
domain = request.args.get('domain') or _configured_domain()
from app import email_manager, config_manager
domain = request.args.get('domain') or config_manager.get_effective_domain()
result = email_manager.delete_email_user(username, domain)
return jsonify({"deleted": result})
except Exception as e:
@@ -57,6 +62,7 @@ def get_email_status():
return jsonify({"error": str(e)}), 500
@bp.route('/api/email/connectivity', methods=['GET'])
@require_active_service('email')
def test_email_connectivity():
"""Test email connectivity."""
try:
@@ -68,6 +74,7 @@ def test_email_connectivity():
return jsonify({"error": str(e)}), 500
@bp.route('/api/email/send', methods=['POST'])
@require_active_service('email')
def send_email():
try:
from app import email_manager
@@ -81,6 +88,7 @@ def send_email():
return jsonify({"error": str(e)}), 500
@bp.route('/api/email/mailbox/<username>', methods=['GET'])
@require_active_service('email')
def get_mailbox_info(username):
"""Get mailbox information."""
try:
+12
@@ -1,9 +1,12 @@
import logging
from flask import Blueprint, request, jsonify
from routes import require_active_service
logger = logging.getLogger('picell')
bp = Blueprint('files', __name__)
@bp.route('/api/files/users', methods=['GET'])
@require_active_service('files')
def get_file_users():
"""Get file storage users."""
try:
@@ -15,6 +18,7 @@ def get_file_users():
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/users', methods=['POST'])
@require_active_service('files')
def create_file_user():
"""Create file storage user."""
try:
@@ -33,6 +37,7 @@ def create_file_user():
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/users/<username>', methods=['DELETE'])
@require_active_service('files')
def delete_file_user(username):
"""Delete file storage user."""
try:
@@ -44,6 +49,7 @@ def delete_file_user(username):
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/folders', methods=['POST'])
@require_active_service('files')
def create_folder():
"""Create folder."""
try:
@@ -64,6 +70,7 @@ def create_folder():
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/folders/<username>/<path:folder_path>', methods=['DELETE'])
@require_active_service('files')
def delete_folder(username, folder_path):
"""Delete folder."""
try:
@@ -77,6 +84,7 @@ def delete_folder(username, folder_path):
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/upload/<username>', methods=['POST'])
@require_active_service('files')
def upload_file(username):
"""Upload file."""
try:
@@ -97,6 +105,7 @@ def upload_file(username):
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/download/<username>/<path:file_path>', methods=['GET'])
@require_active_service('files')
def download_file(username, file_path):
"""Download file."""
try:
@@ -110,6 +119,7 @@ def download_file(username, file_path):
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/delete/<username>/<path:file_path>', methods=['DELETE'])
@require_active_service('files')
def delete_file(username, file_path):
"""Delete file."""
try:
@@ -123,6 +133,7 @@ def delete_file(username, file_path):
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/list/<username>', methods=['GET'])
@require_active_service('files')
def list_files(username):
"""List files."""
try:
@@ -148,6 +159,7 @@ def get_file_status():
return jsonify({"error": str(e)}), 500
@bp.route('/api/files/connectivity', methods=['GET'])
@require_active_service('files')
def test_file_connectivity():
"""Test file service connectivity."""
try:
+22 -35
@@ -1,5 +1,6 @@
import logging
from flask import Blueprint, request, jsonify
import os
from flask import Blueprint, request, jsonify, Response
logger = logging.getLogger('picell')
bp = Blueprint('network', __name__)
@@ -34,42 +35,14 @@ def remove_dns_record():
logger.error(f"Error removing DNS record: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/dhcp/leases', methods=['GET'])
def get_dhcp_leases():
@bp.route('/api/dns/overview', methods=['GET'])
def get_dns_overview():
try:
from app import network_manager
return jsonify(network_manager.get_dhcp_leases())
from app import network_manager, config_manager, ddns_manager
overview = network_manager.get_dns_overview(config_manager, ddns_manager)
return jsonify(overview)
except Exception as e:
logger.error(f"Error getting DHCP leases: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/dhcp/reservations', methods=['POST'])
def add_dhcp_reservation():
try:
from app import network_manager
data = request.get_json(silent=True)
if not data:
return jsonify({"error": "No data provided"}), 400
for field in ('mac', 'ip'):
if field not in data:
return jsonify({"error": f"Missing required field: {field}"}), 400
result = network_manager.add_dhcp_reservation(data['mac'], data['ip'], data.get('hostname', ''))
return jsonify({"success": result})
except Exception as e:
logger.error(f"Error adding DHCP reservation: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/dhcp/reservations', methods=['DELETE'])
def remove_dhcp_reservation():
try:
from app import network_manager
data = request.get_json(silent=True)
if not data or 'mac' not in data:
return jsonify({"error": "Missing required field: mac"}), 400
result = network_manager.remove_dhcp_reservation(data['mac'])
return jsonify({"success": result})
except Exception as e:
logger.error(f"Error removing DHCP reservation: {e}")
logger.error(f"Error getting DNS overview: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/ntp/status', methods=['GET'])
@@ -99,6 +72,20 @@ def get_dns_status():
logger.error(f"Error getting DNS status: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/network/dns/corefile', methods=['GET'])
def get_corefile():
try:
from app import COREFILE_PATH
with open(COREFILE_PATH, 'r') as f:
content = f.read()
return Response(content, mimetype='text/plain')
except FileNotFoundError:
return Response('', mimetype='text/plain'), 404
except Exception as e:
logger.error(f"Error reading Corefile: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/network/test', methods=['POST'])
def test_network():
try:
+3 -2
@@ -65,10 +65,11 @@ def peer_services():
wg_port = 51820
server_endpoint = ''
try:
from routes.wireguard import _effective_endpoint
from app import config_manager
server_public_key = wireguard_manager.get_keys().get('public_key', '')
wg_port = config_manager.configs.get('_identity', {}).get('wireguard_port', 51820)
srv = wireguard_manager.get_server_config()
server_endpoint = srv.get('endpoint') or '<SERVER_IP>'
server_endpoint = _effective_endpoint(wireguard_manager, config_manager)
except Exception:
pass
+100 -10
@@ -37,7 +37,8 @@ def add_peer():
try:
from app import (peer_registry, wireguard_manager, firewall_manager,
email_manager, calendar_manager, file_manager, auth_manager,
cell_link_manager, _configured_domain, COREFILE_PATH)
cell_link_manager, _configured_domain, _configured_dns_params,
config_manager as _app_cfg, COREFILE_PATH)
try:
_wg_addr = wireguard_manager._get_configured_address()
_wg_subnet = str(ipaddress.ip_network(_wg_addr, strict=False)) if _wg_addr else '10.0.0.0/24'
@@ -64,7 +65,13 @@ def add_peer():
except ValueError as e:
return jsonify({'error': str(e)}), 409
_valid_services = {'calendar', 'files', 'mail', 'webdav'}
# 'webdav' is part of the 'files' store service (same container set);
# expose it only when 'files' is installed.
_STORE_ID_TO_ACCESS = {'email': 'mail', 'calendar': 'calendar', 'files': 'files'}
_installed = set(_app_cfg.get_installed_services() or {})
_valid_services = {_STORE_ID_TO_ACCESS[sid] for sid in _installed if sid in _STORE_ID_TO_ACCESS}
if 'files' in _installed:
_valid_services.add('webdav')
service_access = data.get('service_access', list(_valid_services))
if not isinstance(service_access, list) or not all(s in _valid_services for s in service_access):
return jsonify({"error": f"service_access must be a list of: {sorted(_valid_services)}"}), 400
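The hunk above replaces the fixed `_valid_services` set with one computed from the installed store services: store ids are mapped to access names, and `webdav` is offered only when `files` is installed. A minimal standalone sketch of that rule follows. `valid_service_access` and `validate_service_access` are hypothetical helper names, and raising `ValueError` stands in for the route's 400 response.

```python
# Store service id -> access name, as in _STORE_ID_TO_ACCESS above.
STORE_ID_TO_ACCESS = {'email': 'mail', 'calendar': 'calendar', 'files': 'files'}


def valid_service_access(installed) -> set:
    """Compute the access names a peer may be granted.

    `installed` is any iterable of store service ids (a dict of
    id -> info works too, since iterating a dict yields its keys).
    """
    installed_ids = set(installed or ())
    valid = {STORE_ID_TO_ACCESS[sid] for sid in installed_ids
             if sid in STORE_ID_TO_ACCESS}
    # 'webdav' ships with the 'files' store service, so it follows 'files'.
    if 'files' in installed_ids:
        valid.add('webdav')
    return valid


def validate_service_access(requested, installed) -> list:
    """Default to everything valid; reject unknown or uninstalled names."""
    valid = valid_service_access(installed)
    if requested is None:
        return sorted(valid)
    if not isinstance(requested, list) or not all(s in valid for s in requested):
        raise ValueError(f'service_access must be a list of: {sorted(valid)}')
    return requested
```

One consequence of this design is that a request naming a service that is not installed now fails validation instead of being silently accepted.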
@@ -76,11 +83,16 @@ def add_peer():
provisioned = ['auth']
domain = _configured_domain()
# Only provision accounts on services that are actually installed —
# email/calendar/files are optional store services.
for step_name, step_fn in [
('email', lambda: email_manager.create_email_user(peer_name, domain, password)),
('calendar', lambda: calendar_manager.create_calendar_user(peer_name, password)),
('files', lambda: file_manager.create_user(peer_name, password)),
]:
if step_name not in _installed:
logger.debug(f"Peer {peer_name}: {step_name} not installed — skipping account provisioning")
continue
try:
if step_fn():
provisioned.append(step_name)
@@ -89,6 +101,20 @@ def add_peer():
except Exception as e:
logger.warning(f"Peer {peer_name}: {step_name} account creation failed (non-fatal): {e}")
# Provision accounts for installed HTTP-backed store services (non-fatal)
try:
from app import account_manager as _am, config_manager as _cfg, service_registry as _sreg
for _svc_id in (_cfg.get_installed_services() or {}):
_svc_info = _sreg.get(_svc_id)
if _svc_info and (_svc_info.get('accounts') or {}).get('manager') == 'http':
try:
_am.provision(_svc_id, peer_name)
except Exception as _he:
logger.warning('Peer %s: HTTP account provision for %s failed (non-fatal): %s',
peer_name, _svc_id, _he)
except Exception as _am_err:
logger.warning('Peer %s: HTTP store provisioning failed (non-fatal): %s', peer_name, _am_err)
peer_info = {
'peer': peer_name,
'ip': assigned_ip,
@@ -125,6 +151,17 @@ def add_peer():
return jsonify({"error": f"Peer {peer_name} already exists"}), 400
peer_added_to_registry = True
# Store credentials only after the peer is committed — avoids orphaned
# credential entries if peer_registry.add_peer rejects a duplicate name.
try:
from app import account_manager
_svc_names = {'email', 'calendar', 'files'}
for svc in provisioned:
if svc in _svc_names:
account_manager.store_credentials(svc, peer_name, {'password': password})
except Exception as _am_err:
logger.warning(f"Peer {peer_name}: credential storage failed (non-fatal): {_am_err}")
firewall_manager.apply_peer_rules(peer_info['ip'], peer_info,
wg_subnet=_wg_subnet, cell_subnets=_cell_subnets)
firewall_applied = True
@@ -135,8 +172,10 @@ def add_peer():
except Exception as wg_err:
logger.warning(f"Peer {peer_name}: WireGuard server config update failed (non-fatal): {wg_err}")
firewall_manager.apply_all_dns_rules(peer_registry.list_peers(), COREFILE_PATH, _configured_domain(),
cell_links=cell_link_manager.list_connections())
_dns_primary, _dns_szones = _configured_dns_params()
firewall_manager.apply_all_dns_rules(peer_registry.list_peers(), COREFILE_PATH, _dns_primary,
cell_links=cell_link_manager.list_connections(),
split_horizon_zones=_dns_szones)
return jsonify({"message": f"Peer {peer_name} added successfully", "ip": assigned_ip}), 201
except Exception as e:
@@ -158,11 +197,24 @@ def add_peer():
return jsonify({"error": str(e)}), 500
@bp.route('/api/peers/<peer_name>', methods=['GET'])
def get_peer(peer_name):
try:
from app import peer_registry
peer = peer_registry.get_peer(peer_name)
if peer is None:
return jsonify({'error': 'Peer not found'}), 404
return jsonify(peer)
except Exception as e:
logger.error(f"Error getting peer {peer_name}: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/peers/<peer_name>', methods=['PUT'])
def update_peer(peer_name):
try:
from app import (peer_registry, wireguard_manager, firewall_manager,
cell_link_manager, _configured_domain, COREFILE_PATH)
cell_link_manager, _configured_dns_params, COREFILE_PATH)
try:
_wg_addr = wireguard_manager._get_configured_address()
_wg_subnet = str(ipaddress.ip_network(_wg_addr, strict=False)) if _wg_addr else '10.0.0.0/24'
@@ -191,8 +243,10 @@ def update_peer(peer_name):
if updated_peer:
firewall_manager.apply_peer_rules(updated_peer['ip'], updated_peer,
wg_subnet=_wg_subnet, cell_subnets=_cell_subnets)
firewall_manager.apply_all_dns_rules(peer_registry.list_peers(), COREFILE_PATH, _configured_domain(),
cell_links=cell_link_manager.list_connections())
_dns_primary, _dns_szones = _configured_dns_params()
firewall_manager.apply_all_dns_rules(peer_registry.list_peers(), COREFILE_PATH, _dns_primary,
cell_links=cell_link_manager.list_connections(),
split_horizon_zones=_dns_szones)
return jsonify({"message": f"Peer {peer_name} updated", "config_changed": config_changed})
return jsonify({"error": "Update failed"}), 500
except Exception as e:
@@ -293,7 +347,7 @@ def remove_peer(peer_name):
try:
from app import (peer_registry, wireguard_manager, firewall_manager,
email_manager, calendar_manager, file_manager, auth_manager,
cell_link_manager, _configured_domain, COREFILE_PATH)
cell_link_manager, _configured_domain, _configured_dns_params, COREFILE_PATH)
peer = peer_registry.get_peer(peer_name)
if not peer:
return jsonify({"message": f"Peer {peer_name} not found or already removed"})
@@ -303,8 +357,10 @@ def remove_peer(peer_name):
if success:
if peer_ip:
firewall_manager.clear_peer_rules(peer_ip)
firewall_manager.apply_all_dns_rules(peer_registry.list_peers(), COREFILE_PATH, _configured_domain(),
cell_links=cell_link_manager.list_connections())
_dns_primary, _dns_szones = _configured_dns_params()
firewall_manager.apply_all_dns_rules(peer_registry.list_peers(), COREFILE_PATH, _dns_primary,
cell_links=cell_link_manager.list_connections(),
split_horizon_zones=_dns_szones)
if peer_pubkey:
try:
wireguard_manager.remove_peer(peer_pubkey)
@@ -320,12 +376,46 @@ def remove_peer(peer_name):
_cleanup()
except Exception:
pass
try:
from app import account_manager
account_manager.deprovision_peer(peer_name)
except Exception as _am_err:
logger.warning(f"Peer {peer_name}: account_manager cleanup failed (non-fatal): {_am_err}")
return jsonify({"message": f"Peer {peer_name} removed successfully"})
except Exception as e:
logger.error(f"Error removing peer: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/peers/<peer_name>/service-credentials', methods=['GET'])
def get_peer_service_credentials(peer_name: str):
"""Return service credentials for a peer across all provisioned services (admin only).
Returns filled peer_config_template values for each service the peer is provisioned on.
Intended for an admin to view or copy credentials to share with the peer during
device setup. The global enforce_auth gate already restricts this to admin sessions.
Phase 2 note: a peer-self-service variant should live at /api/peer/service-credentials
(no path arg) and restrict to session['username'] to prevent cross-peer enumeration.
"""
try:
from app import peer_registry, account_manager, service_registry, config_manager
peer = peer_registry.get_peer(peer_name)
if not peer:
return jsonify({'error': f'Peer {peer_name!r} not found'}), 404
raw_creds = account_manager.get_all_credentials(peer_name)
identity = config_manager.get_identity()
domain = config_manager.get_effective_domain() or identity.get('domain', '')
result = {}
for service_id, cred in raw_creds.items():
svc_info = service_registry.get_peer_service_info(service_id, peer_name, domain, cred)
result[service_id] = svc_info if svc_info is not None else cred
return jsonify({'peer': peer_name, 'services': result})
except Exception as e:
logger.error('get_peer_service_credentials(%s): %s', peer_name, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/peers/register', methods=['POST'])
def register_peer():
try:
+4
@@ -62,6 +62,10 @@ def install_service(service_id: str):
result = _ssm().install(service_id)
if result.get('ok'):
return jsonify(result)
# Normalize docker compose stderr into the error key so the frontend
# can display the actual failure reason rather than a generic message.
if not result.get('error') and result.get('stderr'):
result = {**result, 'error': result['stderr']}
return jsonify(result), 400
except Exception as e:
logger.error(f'install_service({service_id}): {e}')
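The `install_service` change copies `stderr` into `error` only when no explicit error is already present, so the frontend can always read the failure reason from a single key. A small sketch of that normalization, using the hypothetical helper name `normalize_install_result`:

```python
def normalize_install_result(result: dict) -> dict:
    """Surface docker compose stderr under 'error' if no error is set.

    Returns a new dict when a change is needed and leaves the input
    untouched, mirroring the {**result, 'error': ...} form in the route.
    """
    if not result.get('error') and result.get('stderr'):
        return {**result, 'error': result['stderr']}
    return result
```

An explicit `error` always wins over `stderr`, and `stderr` stays in the payload for callers that want the raw output.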
+260 -19
@@ -6,6 +6,194 @@ from flask import Blueprint, request, jsonify
logger = logging.getLogger('picell')
bp = Blueprint('services', __name__)
@bp.route('/api/services/catalog', methods=['GET'])
def get_services_catalog():
"""
Return all services (builtins + installed store packages) with merged config.
Used by the frontend to build navigation and service pages dynamically.
"""
try:
from app import service_registry
return jsonify({'services': service_registry.list_all()})
except Exception as e:
logger.error('get_services_catalog: %s', e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/active', methods=['GET'])
def get_active_services():
"""Return minimal info for all installed services. Used by webui to build nav."""
try:
from app import service_registry
active = service_registry.list_active()
return jsonify([
{
'id': svc['id'],
'name': svc.get('name', svc['id']),
'subdomain': svc.get('subdomain'),
'capabilities': svc.get('capabilities', {}),
}
for svc in active
])
except Exception as e:
logger.error('get_active_services: %s', e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>', methods=['GET'])
def get_service_catalog_entry(service_id: str):
"""Return a single service manifest+config, or 404 if unknown."""
try:
from app import service_registry
svc = service_registry.get(service_id)
if svc is None:
return jsonify({'error': f'Service {service_id!r} not found'}), 404
return jsonify(svc)
except Exception as e:
logger.error('get_service_catalog_entry(%s): %s', service_id, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>/status', methods=['GET'])
def get_service_container_status(service_id: str):
"""
Return container status for a service.
Builtins query the main compose stack; store services query their own compose project.
"""
try:
from app import service_registry, service_composer
svc = service_registry.get(service_id)
if svc is None:
return jsonify({'error': f'Service {service_id!r} not found'}), 404
result = service_composer.status_service(service_id, svc)
return jsonify(result)
except ValueError as e:
return jsonify({'error': str(e)}), 400
except Exception as e:
logger.error('get_service_container_status(%s): %s', service_id, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>/restart', methods=['POST'])
def restart_service_containers(service_id: str):
"""
Restart containers for a service.
Builtins restart via the main compose stack; store services via their own compose project.
"""
try:
from app import service_registry, service_composer
svc = service_registry.get(service_id)
if svc is None:
return jsonify({'error': f'Service {service_id!r} not found'}), 404
result = service_composer.restart_service(service_id, svc)
if result['ok']:
return jsonify({'message': f'Service {service_id!r} restarted', **result})
return jsonify({'error': result.get('stderr') or result.get('error', 'restart failed')}), 500
except ValueError as e:
return jsonify({'error': str(e)}), 400
except Exception as e:
logger.error('restart_service_containers(%s): %s', service_id, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>/reconfigure', methods=['POST'])
def reconfigure_service(service_id: str):
"""
Re-apply the stored compose file for a store service (rolling `up -d`).
The compose template must already exist on disk from the original install;
accepting templates from the request body is deliberately not supported
(arbitrary compose files can mount host paths or request privileged mode).
"""
try:
from app import service_registry, service_composer
svc = service_registry.get(service_id)
if svc is None:
return jsonify({'error': f'Service {service_id!r} not found'}), 404
if svc.get('kind') == 'builtin':
return jsonify({'error': 'Builtins are reconfigured via their settings routes'}), 400
if not service_composer.has_compose_file(service_id):
return jsonify({'error': f'No compose file for {service_id!r} — install it first'}), 400
result = service_composer.up(service_id)
if result['ok']:
return jsonify({'message': f'Service {service_id!r} reconfigured', **result})
return jsonify({'error': result.get('stderr') or result.get('error', 'reconfigure failed')}), 500
except ValueError as e:
return jsonify({'error': str(e)}), 400
except Exception as e:
logger.error('reconfigure_service(%s): %s', service_id, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>/accounts', methods=['GET'])
def list_service_accounts(service_id: str):
"""Return peer usernames provisioned on a service."""
try:
from app import account_manager
accounts = account_manager.list_accounts(service_id)
return jsonify({'service_id': service_id, 'accounts': accounts})
except Exception as e:
logger.error('list_service_accounts(%s): %s', service_id, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>/accounts', methods=['POST'])
def provision_service_account(service_id: str):
"""Provision a peer account on a service. Generates a password if none is given.
The generated or provided password is NOT echoed in this response; retrieve it
separately via GET /api/services/catalog/<id>/accounts/<username>/credentials.
This keeps passwords out of HTTP logs and browser network panels.
"""
try:
from app import account_manager
data = request.get_json(silent=True) or {}
peer_username = data.get('username')
if not peer_username:
return jsonify({'error': 'username is required'}), 400
account_manager.provision(service_id, peer_username,
password=data.get('password'))
return jsonify({'service_id': service_id, 'username': peer_username,
'provisioned': True}), 201
except ValueError as e:
return jsonify({'error': str(e)}), 400
except RuntimeError as e:
return jsonify({'error': str(e)}), 500
except Exception as e:
logger.error('provision_service_account(%s): %s', service_id, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>/accounts/<username>', methods=['DELETE'])
def deprovision_service_account(service_id: str, username: str):
"""Remove a peer's account from a service."""
try:
from app import account_manager
ok = account_manager.deprovision(service_id, username)
if ok:
return jsonify({'message': f'{username!r} deprovisioned from {service_id!r}'})
return jsonify({'error': 'deprovision failed'}), 500
except ValueError as e:
return jsonify({'error': str(e)}), 400
except Exception as e:
logger.error('deprovision_service_account(%s, %s): %s', service_id, username, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/catalog/<service_id>/accounts/<username>/credentials', methods=['GET'])
def get_service_account_credentials(service_id: str, username: str):
"""Return stored credentials for a peer on a service."""
try:
from app import account_manager
creds = account_manager.get_credentials(service_id, username)
if creds is None:
return jsonify({'error': f'{username!r} not provisioned on {service_id!r}'}), 404
return jsonify({'service_id': service_id, 'username': username, **creds})
except Exception as e:
logger.error('get_service_account_credentials(%s, %s): %s', service_id, username, e)
return jsonify({'error': str(e)}), 500
@bp.route('/api/services/bus/status', methods=['GET'])
def get_service_bus_status():
try:
@@ -144,39 +332,89 @@ def get_log_file_infos():
logger.error(f"Error listing log files: {e}")
return jsonify({"error": str(e)}), 500
# Container-ENV driven services need a container recreate before a level change
# takes effect (the others — caddy/coredns/api — apply hot).
_RESTART_CONTAINERS = {'wireguard', 'mailserver'}
@bp.route('/api/logs/verbosity', methods=['GET'])
def get_log_verbosity():
"""Return both the python (per-service + root) and container log levels."""
try:
from app import log_manager
return jsonify(log_manager.get_service_levels())
from app import config_manager
return jsonify(config_manager.get_logging_config())
except Exception as e:
logger.error(f"Error getting log verbosity: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/logs/verbosity', methods=['PUT'])
def set_log_verbosity():
"""Update python and/or container log levels.
Payload: {"python": {"root": "DEBUG", "services": {...}}, "containers": {...}}
Python levels apply hot to the running API. Container levels regenerate the
relevant config and hot-reload (caddy/coredns) or are queued for the next
container recreate (wireguard/mailserver). Returns an `applied` map of
"hot" | "pending_restart" per container entry.
"""
try:
from app import log_manager
from app import config_manager, log_manager, apply_root_log_level
data = request.get_json(silent=True) or {}
for service, level in data.items():
python = data.get('python', {}) or {}
containers = data.get('containers', {}) or {}
applied = {}
services = python.get('services', {}) or {}
for service, level in services.items():
config_manager.set_python_log_level(service, level)
log_manager.set_service_level(service, level)
levels_file = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'config', 'log_levels.json')
os.makedirs(os.path.dirname(levels_file), exist_ok=True)
current = {}
if os.path.exists(levels_file):
try:
with open(levels_file) as f:
current = json.load(f)
except Exception:
pass
current.update(data)
with open(levels_file, 'w') as f:
json.dump(current, f, indent=2)
return jsonify({"message": "Log levels updated", "levels": log_manager.get_service_levels()})
if 'root' in python:
config_manager.set_python_log_level('root', python['root'])
apply_root_log_level(python['root'])
for container, level in containers.items():
config_manager.set_container_log_level(container, level)
applied[container] = _apply_container_level(container)
return jsonify({
"message": "Log levels updated",
"logging": config_manager.get_logging_config(),
"applied": applied,
})
except ValueError as e:
return jsonify({"error": str(e)}), 400
except Exception as e:
logger.error(f"Error setting log verbosity: {e}")
return jsonify({"error": str(e)}), 500
def _apply_container_level(container: str) -> str:
"""Apply a container's log level. Returns "hot" or "pending_restart"."""
if container == 'caddy':
from app import caddy_manager, config_manager
caddy_manager.regenerate_with_installed(
list(config_manager.get_installed_services().values())
)
return "hot"
if container == 'coredns':
from app import firewall_manager, peer_registry, config_manager, cell_link_manager
peers = peer_registry.list_peers() if peer_registry else []
cell_links = cell_link_manager.list_connections() if cell_link_manager else None
firewall_manager.generate_corefile(
peers, domain=config_manager.get_internal_domain(), cell_links=cell_links)
firewall_manager.reload_coredns()
return "hot"
if container == 'api':
# The API container's own root level is applied hot via apply_root_log_level
# when python.root changes; the container entry is informational.
return "hot"
if container in _RESTART_CONTAINERS:
return "pending_restart"
return "pending_restart"
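Review note: the dispatch in `_apply_container_level` reduces to a lookup from container name to apply mode. A minimal sketch of just that classification, assuming hypothetical names `HOT_CONTAINERS`, `apply_mode` and `applied_map`; it leaves out the Caddy/CoreDNS regeneration that the real function performs before reporting "hot".

```python
# Hypothetical classification-only sketch; the real _apply_container_level also
# regenerates Caddy/CoreDNS config before reporting "hot".
HOT_CONTAINERS = {'caddy', 'coredns', 'api'}
RESTART_CONTAINERS = {'wireguard', 'mailserver'}


def apply_mode(container: str) -> str:
    """Return "hot" when a level change applies live, else "pending_restart"."""
    if container in HOT_CONTAINERS:
        return 'hot'
    # Known restart-only containers and unrecognised names are treated alike:
    # the level is stored and picked up on the next container recreate.
    return 'pending_restart'


def applied_map(containers: dict) -> dict:
    """Build the `applied` map returned by PUT /api/logs/verbosity."""
    return {name: apply_mode(name) for name in containers}


print(applied_map({'caddy': 'DEBUG', 'wireguard': 'INFO'}))
```

Unrecognised names fall through to "pending_restart", matching the final `return` in the diff.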
@bp.route('/api/services/status', methods=['GET'])
def get_all_services_status():
try:
@@ -195,7 +433,6 @@ def get_all_services_status():
if service_name == 'network':
clean_status.update({
'dns_status': status.get('dns_running', False),
'dhcp_status': status.get('dhcp_running', False),
'ntp_status': status.get('ntp_running', False)
})
elif service_name == 'wireguard':
@@ -279,12 +516,16 @@ def test_all_services_connectivity():
def get_backend_logs():
log_file = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'picell.log')
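# type=int falls back to the default on a non-numeric value instead of raising
# ValueError outside the try block.
lines = request.args.get('lines', 100, type=int)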
level = (request.args.get('level') or 'ALL').upper()
try:
if not os.path.exists(log_file):
return jsonify({"error": "Log file not found."}), 404
with open(log_file, 'r', encoding='utf-8', errors='ignore') as f:
all_lines = f.readlines()
tail_lines = all_lines[-lines:] if lines > 0 else all_lines
if level != 'ALL':
from app import log_manager
all_lines = [ln for ln in all_lines if log_manager._is_log_level(ln, level)]
tail_lines = all_lines[-lines:] if lines > 0 else all_lines
return jsonify({"log": ''.join(tail_lines)})
except Exception as e:
logger.error(f"Error reading log file: {e}")
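Review note: the level filter has to run before the tail, so that `lines=N` yields the last N matching lines rather than whatever happens to match within the last N lines. A sketch of that ordering, assuming a hypothetical `matches_level` predicate in place of `log_manager._is_log_level`, whose real matching rules are not shown in this diff.

```python
from typing import List


def matches_level(line: str, level: str) -> bool:
    # Hypothetical stand-in: treats the level as a whitespace-delimited token.
    return level in line.split()


def tail_filtered(all_lines: List[str], lines: int, level: str = 'ALL') -> List[str]:
    """Filter by level first, then keep the last `lines` entries."""
    if level != 'ALL':
        all_lines = [ln for ln in all_lines if matches_level(ln, level)]
    return all_lines[-lines:] if lines > 0 else all_lines


log = [
    '2026-06-11 10:00:00 ERROR disk full\n',
    '2026-06-11 10:00:01 INFO heartbeat\n',
    '2026-06-11 10:00:02 INFO heartbeat\n',
]
print(tail_filtered(log, 2, 'ERROR'))
```

Tailing first would have dropped the ERROR line here, because it is older than the last two entries.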
+93 -14
@@ -1,10 +1,17 @@
import logging
import re
import urllib.request
import urllib.error
import json as _json
from flask import Blueprint, request, jsonify
from setup_manager import DDNS_API_BASE
logger = logging.getLogger('picell')
setup_bp = Blueprint('setup', __name__, url_prefix='/api/setup')
_DOMAIN_RE = re.compile(r'^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z]{2,})+$', re.I)
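Review note: a quick sketch of what the `_DOMAIN_RE` pattern accepts. The sample hostnames are illustrative only. As written, only the first label may contain digits or hyphens; every later label must be two or more letters.

```python
import re

# Same pattern as _DOMAIN_RE in the diff.
DOMAIN_RE = re.compile(r'^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z]{2,})+$', re.I)

samples = [
    'cell.pic.ngo',          # plain multi-label name
    'My-Cell.example.com',   # case-insensitive, hyphen inside the first label
    'localhost',             # no dot-separated suffix
    '-bad.example.com',      # leading hyphen
    'a.b1.com',              # digit in a label after the first
]
for name in samples:
    print(f'{name!r}: {bool(DOMAIN_RE.match(name))}')
```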
def _get_setup_manager():
from app import setup_manager
@@ -24,8 +31,8 @@ def get_setup_status():
def validate_setup_step():
"""Validate a single wizard step.
Expects JSON body: ``{'step': '<step_name>', 'data': {...}}``.
Supported steps: ``cell_name``, ``password``.
Supported steps: ``cell_name``, ``password``,
``pic_ngo_available``, ``cloudflare_token``, ``duckdns_token``.
"""
sm = _get_setup_manager()
if sm.is_setup_complete():
@@ -37,12 +44,39 @@ def validate_setup_step():
if step == 'cell_name':
errors = sm.validate_cell_name(data.get('cell_name', ''))
elif step == 'password':
errors = sm.validate_password(data.get('password', ''))
else:
return jsonify({'valid': False, 'errors': [f"Unknown step: {step!r}"]}), 400
return jsonify({'valid': len(errors) == 0, 'errors': errors})
return jsonify({'valid': len(errors) == 0, 'errors': errors})
if step == 'password':
errors = sm.validate_password(data.get('password', ''))
return jsonify({'valid': len(errors) == 0, 'errors': errors})
if step == 'pic_ngo_available':
name = data.get('cell_name', '').strip()
errors = sm.validate_cell_name(name)
if errors:
return jsonify({'available': False, 'errors': errors})
try:
available = _check_pic_ngo_available(name)
return jsonify({'available': available})
except Exception:
return jsonify({'available': False, 'error': 'DDNS service unreachable'}), 503
if step == 'cloudflare_token':
token = data.get('token', '').strip()
if not token:
return jsonify({'valid': False, 'error': 'Token is required.'})
valid = _verify_cloudflare_token(token)
return jsonify({'valid': valid})
if step == 'duckdns_token':
subdomain = data.get('subdomain', '').strip()
token = data.get('token', '').strip()
if not token or not subdomain:
return jsonify({'valid': False, 'error': 'Subdomain and token are required.'})
valid = _verify_duckdns_token(subdomain, token)
return jsonify({'valid': valid})
return jsonify({'valid': False, 'errors': [f"Unknown step: {step!r}"]}), 400
@setup_bp.route('/complete', methods=['POST'])
@@ -54,12 +88,57 @@ def complete_setup():
payload = request.get_json(silent=True) or {}
result = sm.complete_setup(payload)
if result.get('success'):
try:
from app import config_manager, service_bus, EventType, network_manager
identity = config_manager.configs.get('_identity', {})
cell_name = identity.get('cell_name', '')
service_bus.publish_event(EventType.IDENTITY_CHANGED, 'setup', {
'cell_name': cell_name,
'domain': identity.get('domain'),
'domain_name': identity.get('domain_name'),
'domain_mode': identity.get('domain_mode'),
'effective_domain': config_manager.get_effective_domain(),
})
# Bootstrap wrote the zone with 'mycell'; rename to the real cell name.
if cell_name:
network_manager.apply_cell_name('', cell_name)
except Exception as exc:
logger.warning(f'Failed to publish IDENTITY_CHANGED after setup: {exc}')
status_code = 200 if result.get('success') else 400
# TODO (Phase 3): if result.get('success') and domain_mode == 'pic_ngo':
# from app import ddns_manager
# name = payload.get('cell_name', '')
# ip = payload.get('public_ip', '')
# ddns_manager.register(name, ip)
return jsonify(result), status_code
# ── external validation helpers ───────────────────────────────────────────────
def _check_pic_ngo_available(name: str) -> bool:
try:
url = f'{DDNS_API_BASE}/api/v1/check/{name}'
with urllib.request.urlopen(url, timeout=8) as resp:
body = _json.loads(resp.read())
return bool(body.get('available'))
except Exception as exc:
logger.warning(f'DDNS availability check failed for {name!r}: {exc}')
raise
def _verify_cloudflare_token(token: str) -> bool:
try:
req = urllib.request.Request(
'https://api.cloudflare.com/client/v4/user/tokens/verify',
headers={'Authorization': f'Bearer {token}'},
)
with urllib.request.urlopen(req, timeout=8) as resp:
body = _json.loads(resp.read())
return bool(body.get('success'))
except Exception:
return False
def _verify_duckdns_token(subdomain: str, token: str) -> bool:
try:
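from urllib.parse import urlencode
# Percent-encode both values so a stray '&' or '=' cannot alter the query string.
query = urlencode({'domains': subdomain, 'token': token, 'ip': ''})
url = f'https://www.duckdns.org/update?{query}'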
with urllib.request.urlopen(url, timeout=8) as resp:
return resp.read().strip() == b'OK'
except Exception:
return False
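Review note: values interpolated into a query string are safest percent-encoded, so that a token containing `&` or `=` cannot add or override parameters. A sketch with a hypothetical helper `duckdns_update_url`; the token value is made up for illustration.

```python
from urllib.parse import urlencode


def duckdns_update_url(subdomain: str, token: str) -> str:
    """Build the DuckDNS update URL with both values percent-encoded."""
    query = urlencode({'domains': subdomain, 'token': token, 'ip': ''})
    return f'https://www.duckdns.org/update?{query}'


# A hostile "token" that tries to smuggle in its own ip= parameter.
print(duckdns_update_url('mycell', 'abc&ip=203.0.113.9'))
```

The smuggled `&ip=` arrives as part of the token value (`%26ip%3D...`) instead of becoming a separate parameter.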
+61 -8
@@ -4,6 +4,20 @@ from flask import Blueprint, request, jsonify
logger = logging.getLogger('picell')
bp = Blueprint('wireguard', __name__)
def _effective_endpoint(wireguard_manager, config_manager) -> str:
"""Return the WireGuard endpoint to embed in peer configs.
Uses wireguard_endpoint from identity config when set (admin override),
falling back to get_external_ip() detection.
"""
srv = wireguard_manager.get_server_config()
override = (config_manager.get_identity().get('wireguard_endpoint') or '').strip()
if override:
port = srv.get('port', 51820)
return override if ':' in override else f'{override}:{port}'
return srv.get('endpoint') or '<SERVER_IP>'
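Review note: the precedence in `_effective_endpoint` (admin override, then detected endpoint, then placeholder) is easier to test as a pure function. A sketch under that assumption; `effective_endpoint` and its parameters are hypothetical names, and the hostnames are examples.

```python
from typing import Optional

DEFAULT_PORT = 51820  # same default the diff uses for srv.get('port', 51820)


def effective_endpoint(override: Optional[str], detected: Optional[str],
                       port: int = DEFAULT_PORT) -> str:
    """Resolve the endpoint: override first, then detected, then placeholder."""
    override = (override or '').strip()
    if override:
        # An override that already contains ':' is taken to carry its own port.
        return override if ':' in override else f'{override}:{port}'
    return detected or '<SERVER_IP>'


print(effective_endpoint('vpn.example.com', '198.51.100.7:51820'))
print(effective_endpoint('vpn.example.com:443', None))
print(effective_endpoint('', '198.51.100.7:51820'))
print(effective_endpoint(None, None))
```

One consequence of the `':' in override` check: a bare IPv6 literal contains colons, so it would be returned without a port appended.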
@bp.route('/api/wireguard/keys', methods=['GET'])
def get_wireguard_keys():
try:
@@ -171,8 +185,8 @@ def get_peer_config():
server_endpoint = data.get('server_endpoint', '')
if not server_endpoint:
srv = wireguard_manager.get_server_config()
server_endpoint = srv.get('endpoint') or '<SERVER_IP>'
from app import config_manager
server_endpoint = _effective_endpoint(wireguard_manager, config_manager)
allowed_ips = data.get('allowed_ips') or None
if not allowed_ips and registered:
@@ -198,12 +212,40 @@ def get_peer_config():
@bp.route('/api/wireguard/server-config', methods=['GET'])
def get_server_config():
try:
from app import wireguard_manager
return jsonify(wireguard_manager.get_server_config())
from app import wireguard_manager, config_manager
cfg = wireguard_manager.get_server_config()
cfg['endpoint_override'] = (config_manager.get_identity().get('wireguard_endpoint') or '').strip()
cfg['effective_endpoint'] = _effective_endpoint(wireguard_manager, config_manager)
return jsonify(cfg)
except Exception as e:
logger.error(f"Error getting server config: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/wireguard/endpoint', methods=['GET'])
def get_wireguard_endpoint():
try:
from app import wireguard_manager, config_manager
return jsonify({
'endpoint_override': (config_manager.get_identity().get('wireguard_endpoint') or '').strip(),
'detected_endpoint': wireguard_manager.get_server_config().get('endpoint'),
'effective_endpoint': _effective_endpoint(wireguard_manager, config_manager),
})
except Exception as e:
logger.error(f"Error getting wireguard endpoint: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/wireguard/endpoint', methods=['PUT'])
def set_wireguard_endpoint():
try:
from app import config_manager
data = request.get_json(silent=True) or {}
override = (data.get('endpoint_override') or '').strip()
config_manager.set_identity_field('wireguard_endpoint', override)
return jsonify({'endpoint_override': override, 'ok': True})
except Exception as e:
logger.error(f"Error setting wireguard endpoint: {e}")
return jsonify({"error": str(e)}), 500
@bp.route('/api/wireguard/refresh-ip', methods=['GET', 'POST'])
def refresh_external_ip():
try:
@@ -223,7 +265,7 @@ def refresh_external_ip():
def apply_wireguard_enforcement():
try:
from app import (peer_registry, wireguard_manager, firewall_manager,
cell_link_manager, _configured_domain, COREFILE_PATH)
cell_link_manager, _configured_dns_params, COREFILE_PATH)
peers = peer_registry.list_peers()
try:
_wg_addr = wireguard_manager._get_configured_address()
@@ -233,8 +275,10 @@ def apply_wireguard_enforcement():
_cell_links = cell_link_manager.list_connections()
_cell_subnets = [l['vpn_subnet'] for l in _cell_links if l.get('vpn_subnet')]
firewall_manager.apply_all_peer_rules(peers, wg_subnet=_wg_subnet, cell_subnets=_cell_subnets)
firewall_manager.apply_all_dns_rules(peers, COREFILE_PATH, _configured_domain(),
cell_links=_cell_links)
_dns_primary, _dns_szones = _configured_dns_params()
firewall_manager.apply_all_dns_rules(peers, COREFILE_PATH, _dns_primary,
cell_links=_cell_links,
split_horizon_zones=_dns_szones)
return jsonify({'ok': True, 'peers': len(peers)})
except Exception as e:
return jsonify({'error': str(e)}), 500
@@ -244,6 +288,15 @@ def check_wireguard_port():
try:
from app import wireguard_manager
port_open = wireguard_manager.check_port_open()
return jsonify({'port_open': port_open, 'port': wireguard_manager._get_configured_port()})
configured_port = wireguard_manager._get_configured_port()
listening_port = wireguard_manager._kernel_listening_port()
return jsonify({
'port_open': port_open,
'port': configured_port,
'listening_port': listening_port,
'port_mismatch': (
listening_port is not None and listening_port != configured_port
),
})
except Exception as e:
return jsonify({"error": str(e)}), 500
+3 -2
@@ -31,6 +31,7 @@ class EventType(Enum):
CERTIFICATE_EXPIRING = "certificate_expiring"
BACKUP_CREATED = "backup_created"
RESTORE_COMPLETED = "restore_completed"
IDENTITY_CHANGED = "identity_changed"
@dataclass
class Event:
@@ -185,7 +186,7 @@ class ServiceBus:
'email': ['cell-mail', 'cell-rainloop'], # Email service includes both mail server and web client
'calendar': ['cell-radicale'],
'files': ['cell-webdav', 'cell-filegator'], # Files service includes both webdav and file manager
'network': ['cell-dns', 'cell-dhcp', 'cell-ntp'], # Network service includes all network components
'network': ['cell-dns', 'cell-ntp'], # Network service includes all network components
'routing': None, # Routing is a system service, not a container
'vault': None, # Vault is part of API, not a separate container
'container': None # Container manager doesn't have its own container
@@ -236,7 +237,7 @@ class ServiceBus:
'email': ['cell-mail', 'cell-rainloop'], # Email service includes both mail server and web client
'calendar': ['cell-radicale'],
'files': ['cell-webdav', 'cell-filegator'], # Files service includes both webdav and file manager
'network': ['cell-dns', 'cell-dhcp', 'cell-ntp'], # Network service includes all network components
'network': ['cell-dns', 'cell-ntp'], # Network service includes all network components
'routing': None, # Routing is a system service, not a container
'vault': None, # Vault is part of API, not a separate container
'container': None # Container manager doesn't have its own container
+619
@@ -0,0 +1,619 @@
"""
ServiceComposer: docker-compose generation and container lifecycle for PIC services.
Responsibilities:
- Render compose-template.yml into a per-service docker-compose.yml with PIC_* substitution
- Manage store-service container lifecycle (up / down / restart / status / reconfigure)
- Manage builtin-service restarts and status via the main compose stack
- Generate and persist PIC_SECRET_* variables in a dedicated secrets file
Template variable reference (for compose-template.yml authors):
${PIC_CFG_<KEY>}      value from manifest config_schema, key uppercased
${PIC_SECRET_<NAME>}  auto-generated random secret, persisted across reconfigures
${PIC_DOMAIN}         effective domain (e.g. cell.pic.ngo)
${PIC_CELL_NAME}      cell name (e.g. mycell)
${PIC_SERVICE_ID}     service identifier (e.g. nextcloud)
${PIC_DATA_DIR}       resolved absolute path of the data directory
"""
import json
import logging
import os
import re
import secrets as _secrets_lib
import shutil
import subprocess
import threading
from pathlib import Path
from typing import Dict, List, Optional
from manifest_validator import validate_rendered_compose
logger = logging.getLogger('picell')
_SECRET_RE = re.compile(r'\$\{(PIC_SECRET_\w+)\}')
_SAFE_ID_RE = re.compile(r'^[a-z0-9][a-z0-9_-]{0,63}$')
_DIGEST_RE = re.compile(r'@sha256:[0-9a-f]{64}$')
# Bundled cosign public key — shipped in the repo (config/cosign/cosign.pub) so
# every cell can verify store-service image signatures offline. install.sh keeps
# it at /opt/pic/config/cosign/cosign.pub; in the cell-api container it is
# COPYed to /app/config/cosign/cosign.pub.
_COSIGN_PUBKEY_PATH = os.environ.get(
'PIC_COSIGN_PUBKEY', '/app/config/cosign/cosign.pub'
)
_COSIGN_BIN = os.environ.get('PIC_COSIGN_BIN', 'cosign')
class ServiceComposer:
def __init__(self, config_manager, data_dir: str):
self.cm = config_manager
self.data_dir = data_dir
self._services_dir = os.path.join(data_dir, 'services')
self._secrets_path = os.path.join(data_dir, 'service_secrets.json')
self._lock = threading.Lock()
# ── Path helpers ──────────────────────────────────────────────────────
@staticmethod
def _validate_service_id(service_id: str) -> None:
"""Raise ValueError if service_id could be used for path traversal."""
# fullmatch: with match(), `$` would also accept an id with a trailing newline.
if not _SAFE_ID_RE.fullmatch(service_id):
raise ValueError(
f'Invalid service_id {service_id!r}: '
'must match ^[a-z0-9][a-z0-9_-]{0,63}$'
)
def _svc_dir(self, service_id: str) -> str:
self._validate_service_id(service_id)
candidate = os.path.join(self._services_dir, service_id)
# Paranoia: ensure the resolved path stays inside _services_dir
real_base = os.path.realpath(self._services_dir)
real_cand = os.path.realpath(candidate)
if not real_cand.startswith(real_base + os.sep) and real_cand != real_base:
raise ValueError(f'service_id {service_id!r} escapes services directory')
return candidate
def _compose_path(self, service_id: str) -> str:
return os.path.join(self._svc_dir(service_id), 'docker-compose.yml')
def has_compose_file(self, service_id: str) -> bool:
try:
return os.path.exists(self._compose_path(service_id))
except ValueError:
return False
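Review note: the two-step guard in `_svc_dir` (allow-list regex, then realpath containment) can be exercised in isolation. A sketch assuming a hypothetical free function `service_dir`; the regex is the same as `_SAFE_ID_RE`, and the symlink named `evil` is created only to show why the realpath check exists.

```python
import os
import re
import tempfile

SAFE_ID_RE = re.compile(r'^[a-z0-9][a-z0-9_-]{0,63}$')


def service_dir(base: str, service_id: str) -> str:
    """Return base/service_id, rejecting ids that could leave `base`."""
    if not SAFE_ID_RE.fullmatch(service_id):
        raise ValueError(f'invalid service_id {service_id!r}')
    candidate = os.path.join(base, service_id)
    real_base = os.path.realpath(base)
    real_cand = os.path.realpath(candidate)
    # Symlinks are resolved first, so a link pointing elsewhere is caught here.
    if not real_cand.startswith(real_base + os.sep):
        raise ValueError(f'{service_id!r} escapes {base!r}')
    return candidate


with tempfile.TemporaryDirectory() as base:
    outside = tempfile.mkdtemp()
    os.symlink(outside, os.path.join(base, 'evil'))
    for sid in ('nextcloud', '../etc', 'evil'):
        try:
            print(sid, '->', service_dir(base, sid))
        except ValueError as exc:
            print(sid, 'rejected:', exc)
    os.rmdir(outside)
```

The regex alone stops `../etc`; the realpath comparison is what stops a well-formed id that resolves through a symlink.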
# ── Secrets management ────────────────────────────────────────────────
def _load_secrets(self) -> Dict:
if not os.path.exists(self._secrets_path):
return {}
try:
with open(self._secrets_path) as f:
return json.load(f)
except (OSError, json.JSONDecodeError) as e:
logger.warning('ServiceComposer: failed to load secrets: %s', e)
return {}
def _save_secrets(self, secrets: Dict) -> None:
tmp = self._secrets_path + '.tmp'
# 0o600: readable only by the process owner — secrets must not be world-readable
with open(tmp, 'w',
opener=lambda path, flags: os.open(path, flags, 0o600)) as f:
json.dump(secrets, f, indent=2)
f.flush()
os.fsync(f.fileno())
os.replace(tmp, self._secrets_path)
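Review note: the write-temp, fsync, rename sequence in `_save_secrets` is a common pattern for crash-safe writes of sensitive files. A standalone sketch, assuming a hypothetical helper `save_json_private` and a POSIX filesystem; the secret value is a placeholder.

```python
import json
import os
import stat
import tempfile


def save_json_private(path: str, payload: dict) -> None:
    """Write JSON atomically with owner-only permissions (0o600)."""
    tmp = path + '.tmp'
    # The opener controls the mode used when the temp file is created.
    with open(tmp, 'w', opener=lambda p, flags: os.open(p, flags, 0o600)) as f:
        json.dump(payload, f, indent=2)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.replace(tmp, path)     # atomic swap when both paths share a filesystem


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, 'service_secrets.json')
    save_json_private(target, {'nextcloud': {'PIC_SECRET_DB': 'example-only'}})
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```

The mode argument only takes effect when the temp file is newly created; a leftover `.tmp` file with looser permissions would keep them.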
def _get_or_create_secret(self, service_id: str, var_name: str) -> str:
with self._lock:
secrets = self._load_secrets()
svc_secrets = secrets.setdefault(service_id, {})
if var_name not in svc_secrets:
svc_secrets[var_name] = _secrets_lib.token_urlsafe(24)
self._save_secrets(secrets)
return svc_secrets[var_name]
def _clear_secrets(self, service_id: str) -> None:
with self._lock:
secrets = self._load_secrets()
if service_id in secrets:
del secrets[service_id]
self._save_secrets(secrets)
# ── Template rendering ────────────────────────────────────────────────
def render_template(self, service_id: str, manifest: Dict,
template_content: str,
instance_vars: Optional[Dict[str, str]] = None) -> str:
"""
Substitute all PIC_* variables in a compose-template.yml string.
Returns the rendered compose YAML.
instance_vars optionally supplies per-connection-instance values for
${INSTANCE_ID} and ${REDIRECT_PORT} so an instanceable connectivity
service can be rendered once per connection without collisions. They
are ignored for non-instanceable services (the placeholders simply
never appear in the template).
"""
schema = manifest.get('config_schema') or {}
saved = self.cm.configs.get(service_id, {})
config: Dict = {k: v['default'] for k, v in schema.items() if 'default' in v}
config.update({k: saved[k] for k in schema if k in saved})
identity = self.cm.get_identity()
domain = self.cm.get_effective_domain() or identity.get('domain', 'cell.local')
cell_name = identity.get('cell_name', 'mycell')
result = template_content
for key, value in config.items():
# Strip newlines/tabs to prevent YAML injection (a config string containing
# \n could inject new YAML keys into the compose file)
safe_val = str(value).replace('\n', '').replace('\r', '').replace('\t', ' ')
result = result.replace(f'${{PIC_CFG_{key.upper()}}}', safe_val)
result = result.replace('${PIC_DOMAIN}', domain)
result = result.replace('${PIC_CELL_NAME}', cell_name)
result = result.replace('${PIC_SERVICE_ID}', service_id)
result = result.replace('${PIC_DATA_DIR}', str(Path(self.data_dir).resolve()))
if instance_vars:
for var in ('INSTANCE_ID', 'REDIRECT_PORT'):
if var in instance_vars and instance_vars[var] is not None:
safe = str(instance_vars[var]).replace('\n', '').replace(
'\r', '').replace('\t', ' ')
result = result.replace(f'${{{var}}}', safe)
# PIC_SECRET_* — generate on first use, reuse on reconfigure
for match in _SECRET_RE.finditer(template_content):
var_name = match.group(1)
secret = self._get_or_create_secret(service_id, var_name)
result = result.replace(f'${{{var_name}}}', secret)
return result
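Review note: the newline stripping in `render_template` is what keeps a config value from adding new YAML lines. A reduced sketch of only the `PIC_CFG_*` step, assuming a hypothetical helper `substitute_cfg`; the template and the hostile value are invented for illustration.

```python
from typing import Dict


def substitute_cfg(template: str, config: Dict[str, object]) -> str:
    """Replace ${PIC_CFG_<KEY>} placeholders, flattening control characters."""
    result = template
    for key, value in config.items():
        # Drop CR/LF and turn tabs into spaces so the value stays on one line.
        safe = str(value).replace('\n', '').replace('\r', '').replace('\t', ' ')
        result = result.replace(f'${{PIC_CFG_{key.upper()}}}', safe)
    return result


template = 'environment:\n  ADMIN_USER: "${PIC_CFG_ADMIN_USER}"\n'
hostile = 'admin"\nprivileged: true'
print(substitute_cfg(template, {'admin_user': hostile}))
```

The injected `privileged: true` stays on the same line as the value. Quotes are not escaped by this step, which is presumably why the diff also runs `validate_rendered_compose` on the rendered output in `write_compose`.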
def write_compose(self, service_id: str, manifest: Dict,
template_content: str) -> str:
"""Render and atomically write the per-service compose file. Returns rendered content."""
os.makedirs(self._svc_dir(service_id), exist_ok=True)
content = self.render_template(service_id, manifest, template_content)
# Validate before writing the compose file so a bad template never reaches disk.
# Pass the resolved data_dir so that bind mounts created by ${PIC_DATA_DIR}
# substitution are allowed; all other absolute paths are still rejected.
# Connectivity services (wireguard-ext, openvpn-client, tor) set
# requires_host_network: true in their manifest to opt into network_mode: host.
allow_host_network = bool(manifest.get('requires_host_network'))
ok, errs = validate_rendered_compose(
content,
allowed_data_dir=str(Path(self.data_dir).resolve()),
allow_host_network=allow_host_network,
)
if not ok:
raise ValueError(
f'Compose template failed security validation: {"; ".join(errs)}'
)
path = self._compose_path(service_id)
tmp = path + '.tmp'
with open(tmp, 'w') as f:
f.write(content)
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
logger.info('ServiceComposer: wrote compose file for %s', service_id)
return content
# ── Subprocess helper ─────────────────────────────────────────────────
def _run(self, cmd: List[str], timeout: int = 120) -> Dict:
try:
r = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
if r.returncode != 0 and r.stderr:
logger.warning('ServiceComposer command failed: %s', r.stderr.strip())
return {
'ok': r.returncode == 0,
'stdout': r.stdout.strip(),
'stderr': r.stderr.strip(),
}
except subprocess.TimeoutExpired:
return {'ok': False, 'error': 'docker compose command timed out'}
except Exception as e:
logger.error('ServiceComposer._run error: %s', e)
return {'ok': False, 'error': str(e)}
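@staticmethod
def _parse_ps_json(output: str) -> List[Dict]:
"""Parse `docker compose ps --format json` output.
Usually one JSON object per line; a line holding a JSON array (as some
Compose releases print) is flattened into the result as well.
"""
containers = []
for line in output.splitlines():
line = line.strip()
if not line:
continue
try:
parsed = json.loads(line)
except json.JSONDecodeError:
continue
if isinstance(parsed, list):
containers.extend(c for c in parsed if isinstance(c, dict))
elif isinstance(parsed, dict):
containers.append(parsed)
return containers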
# ── Store-service lifecycle (per-service compose file) ────────────────
def _store_cmd(self, service_id: str, *args, timeout: int = 120) -> Dict:
compose_file = self._compose_path(service_id)
if not os.path.exists(compose_file):
return {'ok': False, 'error': f'No compose file found for service {service_id!r}'}
cmd = [
'docker', 'compose',
'-f', compose_file,
'--project-name', f'pic-{service_id}',
*args,
]
return self._run(cmd, timeout)
def up(self, service_id: str) -> Dict:
# 600s: image pulls on slow connections can take several minutes
return self._store_cmd(service_id, 'up', '-d', '--remove-orphans', timeout=600)
def down(self, service_id: str, remove_volumes: bool = False) -> Dict:
args = ['down']
if remove_volumes:
args.append('--volumes')
return self._store_cmd(service_id, *args)
def restart(self, service_id: str) -> Dict:
return self._store_cmd(service_id, 'restart')
def status(self, service_id: str) -> Dict:
result = self._store_cmd(service_id, 'ps', '--format', 'json')
result['containers'] = self._parse_ps_json(result.get('stdout', ''))
return result
def reconfigure(self, service_id: str, manifest: Dict,
template_content: str) -> Dict:
"""Re-render the compose file then re-apply with `up -d` (rolling update)."""
self.write_compose(service_id, manifest, template_content)
return self.up(service_id)
# ── Image signature verification ──────────────────────────────────────
def _verification_mode(self) -> str:
"""Resolve the configured image verification mode (off|warn|enforce)."""
getter = getattr(self.cm, 'get_image_verification_mode', None)
if callable(getter):
try:
return getter()
except Exception as e: # config corruption must not crash install
logger.warning('service_composer: could not read verification mode: %s', e)
return 'enforce'
def _cosign_verify(self, image_ref: str) -> Dict:
"""Run `cosign verify` against the bundled public key for one image ref.
Factored out so tests can mock it / mock the subprocess call. Returns a
_run-style dict ({'ok': bool, 'stdout', 'stderr'/'error'}).
"""
cmd = [
_COSIGN_BIN, 'verify',
'--key', _COSIGN_PUBKEY_PATH,
'--insecure-ignore-tlog=true',
image_ref,
]
return self._run(cmd, timeout=120)
def verify_image(self, service_id: str, manifest: Dict) -> Dict:
"""Verify a store image's signature subject to the configured mode.
Returns {'ok': True, 'skipped'|'verified'|'warned': ...} when the install
may proceed, or {'ok': False, 'error': ...} when it must abort (enforce
mode with a missing digest or a failed/absent signature).
"""
mode = self._verification_mode()
if mode == 'off':
return {'ok': True, 'skipped': True}
image_ref = (manifest or {}).get('image', '')
if not image_ref:
# No image to verify (e.g. builtin-style manifest); nothing to do.
return {'ok': True, 'skipped': True}
# Store images must be digest-pinned to be verifiable by digest.
if not _DIGEST_RE.search(image_ref):
msg = (f'image {image_ref!r} for {service_id} is not digest-pinned '
'(@sha256:) — cannot verify signature')
if mode == 'enforce':
logger.error('service_composer: %s; aborting install (enforce)', msg)
return {'ok': False, 'error': msg}
logger.warning('service_composer: %s; proceeding (warn)', msg)
return {'ok': True, 'warned': True}
result = self._cosign_verify(image_ref)
if result.get('ok'):
logger.info('service_composer: cosign verified %s', image_ref)
return {'ok': True, 'verified': True}
detail = result.get('stderr') or result.get('error') or 'signature verification failed'
msg = f'cosign verification failed for {image_ref}: {str(detail)[:200]}'
if mode == 'enforce':
logger.error('service_composer: %s; aborting install (enforce)', msg)
return {'ok': False, 'error': msg}
logger.warning('service_composer: %s; proceeding (warn)', msg)
return {'ok': True, 'warned': True}
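Review note: `verify_image` is effectively a decision table over mode, digest pinning and the cosign result. A sketch of that table as a pure function, assuming a hypothetical `decide` helper that takes the cosign check as a callable; the image references are examples.

```python
import re
from typing import Callable, Dict

DIGEST_RE = re.compile(r'@sha256:[0-9a-f]{64}$')


def decide(mode: str, image_ref: str, cosign_ok: Callable[[str], bool]) -> Dict:
    """Return the install decision for one image under off/warn/enforce."""
    if mode == 'off' or not image_ref:
        return {'ok': True, 'skipped': True}
    if not DIGEST_RE.search(image_ref):
        problem = 'not digest-pinned'
    elif cosign_ok(image_ref):
        return {'ok': True, 'verified': True}
    else:
        problem = 'signature verification failed'
    # Only enforce turns a problem into an aborted install.
    if mode == 'enforce':
        return {'ok': False, 'error': problem}
    return {'ok': True, 'warned': True}


pinned = 'registry.example/app@sha256:' + 'a' * 64
for mode in ('off', 'warn', 'enforce'):
    print(mode, decide(mode, 'registry.example/app:latest', lambda ref: True))
print('enforce', decide('enforce', pinned, lambda ref: True))
```

As in the diff, any mode other than `off` or `enforce` behaves like `warn`.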
def install(self, service_id: str, manifest: Dict,
template_content: str) -> Dict:
"""Write compose file, verify + pull image, then start containers.
Image signature verification runs before pull/up. Under enforce mode a
missing digest, missing signature, or failed verification aborts the
install (containers are never started); under warn mode the problem is
logged and the install proceeds; under off mode verification is skipped.
The pull runs before `up` so the up step does not time out on slow connections;
under enforce mode a failed pull also aborts the install. A single `up` retry
handles transient registry hiccups on first install.
"""
self.write_compose(service_id, manifest, template_content)
verify = self.verify_image(service_id, manifest)
if not verify.get('ok'):
return {'ok': False, 'error': verify.get('error', 'image verification failed')}
mode = self._verification_mode()
pull = self._store_cmd(service_id, 'pull', timeout=600)
if not pull.get('ok'):
pull_err = pull.get('stderr') or pull.get('error') or 'unknown error'
if mode == 'enforce':
logger.error('service_composer: image pull for %s failed under enforce, '
'aborting: %s', service_id, str(pull_err)[:200])
return {'ok': False,
'error': f'image pull failed (enforce): {str(pull_err)[:200]}'}
logger.warning('service_composer: image pull for %s failed, proceeding anyway: %s',
service_id, str(pull_err)[:200])
result = self.up(service_id)
if not result.get('ok'):
logger.info('service_composer: retrying up for %s after initial failure', service_id)
result = self.up(service_id)
return result
def remove(self, service_id: str, purge_data: bool = False) -> Dict:
"""Stop containers and delete the compose file; with purge_data, also delete secrets and the service data dir."""
result = self.down(service_id, remove_volumes=purge_data)
if purge_data:
self._clear_secrets(service_id)
svc_dir = self._svc_dir(service_id) # already validates service_id + realpath
if os.path.isdir(svc_dir):
# Final realpath check: reject symlinks that escape the services dir
real_svc = os.path.realpath(svc_dir)
real_base = os.path.realpath(self._services_dir)
if not real_svc.startswith(real_base + os.sep):
logger.error('ServiceComposer: refusing rmtree outside services dir: %s', svc_dir)
else:
try:
shutil.rmtree(svc_dir)
except OSError as e:
logger.warning('ServiceComposer: could not remove %s: %s', svc_dir, e)
elif os.path.exists(self._compose_path(service_id)):
# Remove compose file even without purge so stale file doesn't confuse future installs
try:
os.remove(self._compose_path(service_id))
except OSError:
pass
return result
# ── Connection-instance lifecycle (one container per connection) ──────
#
# An instanceable connectivity service (wireguard-ext / openvpn-client /
# sshuttle / proxy) backs MANY connections — one container per connection.
# The store service supplies the image + raw compose-template; each
# connection renders that template with its own ${INSTANCE_ID} (short id),
# ${REDIRECT_PORT} and a per-instance config dir, so two connections of the
# same type never collide on container name, config mount, or listen port.
#
# Layout (all under data/services/<service_id>/<instance_id>/):
# docker-compose.yml rendered per-instance compose
# config/ per-instance bind-mounted config dir
# Tor is single-instance and keeps using the plain store-service path.
@staticmethod
def instance_id_for(conn_id: str) -> str:
"""Derive a short, docker-safe INSTANCE_ID from a connection id."""
return conn_id.split('_')[-1][:12]
def _instance_dir(self, service_id: str, instance_id: str) -> str:
self._validate_service_id(service_id)
if not _SAFE_ID_RE.match(instance_id):
raise ValueError(f'invalid instance_id {instance_id!r}')
candidate = os.path.join(self._svc_dir(service_id), instance_id)
real_base = os.path.realpath(self._svc_dir(service_id))
real_cand = os.path.realpath(candidate)
if not real_cand.startswith(real_base + os.sep) and real_cand != real_base:
raise ValueError(f'instance_id {instance_id!r} escapes service directory')
return candidate
def _instance_compose_path(self, service_id: str, instance_id: str) -> str:
return os.path.join(self._instance_dir(service_id, instance_id),
'docker-compose.yml')
def instance_config_dir(self, service_id: str, instance_id: str) -> str:
"""Per-instance config dir that the compose template bind-mounts."""
return os.path.join(self._instance_dir(service_id, instance_id), 'config')
def has_instance_compose(self, service_id: str, instance_id: str) -> bool:
try:
return os.path.exists(self._instance_compose_path(service_id, instance_id))
except ValueError:
return False
def write_instance_compose(self, service_id: str, instance_id: str,
manifest: Dict, template_content: str,
redirect_port: Optional[int] = None) -> str:
"""Render + atomically write a per-instance compose file. Returns content."""
inst_dir = self._instance_dir(service_id, instance_id)
os.makedirs(os.path.join(inst_dir, 'config'), exist_ok=True)
instance_vars = {'INSTANCE_ID': instance_id}
if redirect_port is not None:
instance_vars['REDIRECT_PORT'] = str(redirect_port)
content = self.render_template(
service_id, manifest, template_content, instance_vars=instance_vars)
allow_host_network = bool(manifest.get('requires_host_network'))
ok, errs = validate_rendered_compose(
content,
allowed_data_dir=str(Path(self.data_dir).resolve()),
allow_host_network=allow_host_network,
)
if not ok:
raise ValueError(
f'Instance compose failed security validation: {"; ".join(errs)}')
path = self._instance_compose_path(service_id, instance_id)
tmp = path + '.tmp'
with open(tmp, 'w') as f:
f.write(content)
f.flush()
os.fsync(f.fileno())
os.replace(tmp, path)
logger.info('ServiceComposer: wrote instance compose %s/%s',
service_id, instance_id)
return content
def _instance_cmd(self, service_id: str, instance_id: str, *args,
timeout: int = 120) -> Dict:
compose_file = self._instance_compose_path(service_id, instance_id)
if not os.path.exists(compose_file):
return {'ok': False,
'error': f'No compose file for instance {service_id}/{instance_id}'}
cmd = [
'docker', 'compose',
'-f', compose_file,
'--project-name', f'pic-conn-{instance_id}',
*args,
]
return self._run(cmd, timeout)
def up_instance(self, service_id: str, instance_id: str, manifest: Dict,
template_content: str,
redirect_port: Optional[int] = None) -> Dict:
"""Render + bring up the container for one connection instance."""
try:
self.write_instance_compose(service_id, instance_id, manifest,
template_content, redirect_port)
except ValueError as e:
return {'ok': False, 'error': str(e)}
return self._instance_cmd(service_id, instance_id, 'up', '-d',
'--remove-orphans', timeout=600)
def down_instance(self, service_id: str, instance_id: str,
purge_data: bool = False) -> Dict:
"""Stop the connection instance's container and remove its compose/dir."""
result = {'ok': True}
if self.has_instance_compose(service_id, instance_id):
args = ['down']
if purge_data:
args.append('--volumes')
result = self._instance_cmd(service_id, instance_id, *args)
try:
inst_dir = self._instance_dir(service_id, instance_id)
except ValueError as e:
logger.warning('down_instance: %s', e)
return result
if os.path.isdir(inst_dir):
real_inst = os.path.realpath(inst_dir)
real_base = os.path.realpath(self._svc_dir(service_id))
if not real_inst.startswith(real_base + os.sep):
logger.error('ServiceComposer: refusing rmtree outside service dir: %s',
inst_dir)
else:
try:
shutil.rmtree(inst_dir)
except OSError as e:
logger.warning('ServiceComposer: could not remove %s: %s',
inst_dir, e)
return result
def status_instance(self, service_id: str, instance_id: str) -> Dict:
result = self._instance_cmd(service_id, instance_id, 'ps', '--format', 'json')
result['containers'] = self._parse_ps_json(result.get('stdout', ''))
return result
# ── Dependency resolution ─────────────────────────────────────────────
def _resolve_requires(self, manifest: Dict, installed_services: Dict) -> Optional[str]:
"""Return an error string if any required services are missing, else None."""
requires = manifest.get('requires') or []
missing = [r for r in requires if r not in installed_services]
if missing:
return f"Required services not installed: {', '.join(sorted(missing))}"
return None
def _resolve_dependents(self, service_id: str, installed_services: Dict) -> List[str]:
"""Return list of installed service IDs that declare service_id in their requires."""
dependents = []
for svc_id, record in installed_services.items():
if svc_id == service_id:
continue
m = (record.get('manifest') or {})
if service_id in (m.get('requires') or []):
dependents.append(svc_id)
return dependents
def reapply_active_services(self) -> None:
"""Call up() for every installed service that has a compose file. Called at startup."""
installed = self.cm.get_installed_services()
for svc_id in installed:
if not self.has_compose_file(svc_id):
logger.warning('reapply_active_services: no compose file for %s, skipping', svc_id)
continue
result = self.up(svc_id)
if not result.get('ok'):
logger.warning('reapply_active_services: up failed for %s: %s',
svc_id, result.get('error') or result.get('stderr', ''))
# ── Builtin-service lifecycle (main compose stack) ─────────────────────
@staticmethod
def _main_compose() -> str:
return os.environ.get('COMPOSE_FILE', '/app/docker-compose.yml')
def restart_builtin(self, container_names: List[str]) -> Dict:
"""Restart one or more containers that live in the main docker-compose stack."""
if not container_names:
return {'ok': False, 'error': 'No container names provided'}
cmd = ['docker', 'compose', '-f', self._main_compose(),
'restart', *container_names]
return self._run(cmd)
def status_builtin(self, container_names: List[str]) -> Dict:
"""Return status of containers from the main compose stack."""
if not container_names:
return {'ok': False, 'error': 'No container names provided'}
cmd = ['docker', 'compose', '-f', self._main_compose(),
'ps', '--format', 'json', *container_names]
result = self._run(cmd)
result['containers'] = self._parse_ps_json(result.get('stdout', ''))
return result
# ── Unified lifecycle (dispatches based on service kind) ───────────────
def restart_service(self, service_id: str, manifest: Dict) -> Dict:
"""
Restart any service (builtin or store) using the right compose stack.
Builtin: uses manifest.containers + main docker-compose.yml.
Store: uses per-service compose file.
"""
if manifest.get('kind') == 'builtin':
containers = manifest.get('containers') or []
return self.restart_builtin(containers)
return self.restart(service_id)
def status_service(self, service_id: str, manifest: Dict) -> Dict:
"""
Return container status for any service.
Builtin: queries manifest.containers from main compose stack.
Store: queries per-service compose project.
"""
if manifest.get('kind') == 'builtin':
containers = manifest.get('containers') or []
return self.status_builtin(containers)
return self.status(service_id)
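The `remove`, `_instance_dir`, and `down_instance` methods above all guard `shutil.rmtree` with the same realpath containment check. The following is a minimal standalone sketch of that check, not the project's code: `is_contained` and `demo` are hypothetical names, and the directory names are made up for illustration. It shows why the comparison appends `os.sep` and why symlinks are resolved first.

```python
import os
import tempfile


def is_contained(base_dir: str, candidate: str) -> bool:
    """Return True only if candidate resolves to a path strictly inside base_dir."""
    real_base = os.path.realpath(base_dir)
    real_cand = os.path.realpath(candidate)
    # Appending os.sep stops '/x/services-evil' from matching '/x/services'.
    return real_cand.startswith(real_base + os.sep)


def demo() -> dict:
    """Build a throwaway tree with one honest dir and one escaping symlink."""
    with tempfile.TemporaryDirectory() as root:
        base = os.path.join(root, 'services')
        outside = os.path.join(root, 'services-evil')
        os.makedirs(os.path.join(base, 'svc-a'))
        os.makedirs(outside)
        link = os.path.join(base, 'svc-link')
        os.symlink(outside, link)  # lives under base, resolves outside it
        return {
            'inside': is_contained(base, os.path.join(base, 'svc-a')),
            'symlink_escape': is_contained(base, link),
            'sibling_prefix': is_contained(base, outside),
            'base_itself': is_contained(base, base),
        }
```

Note that the strict form rejects the base directory itself, which is the behaviour wanted before a recursive delete; `_instance_dir` additionally accepts `real_cand == real_base` because it only builds a path.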
+177
@@ -0,0 +1,177 @@
"""
ServiceRegistry: single source of truth for all PIC services.
Merges two layers:
1. Manifest defaults (config_schema.*.default)
2. Admin-saved config from ConfigManager (cell_config.json)
All consumers (CaddyManager, backup, peer services endpoint) read from here
rather than hardcoding service names or subdomains.
"""
import logging
import re
from typing import Dict, List, Optional
from urllib.parse import quote as _urlquote
logger = logging.getLogger('picell')
_SUBDOMAIN_RE = re.compile(r'^[a-z][a-z0-9-]{0,30}$')
_BACKEND_RE = re.compile(r'^[A-Za-z0-9._-]+:\d{1,5}$')
_RESERVED_SUBS = frozenset({'api', 'webui', 'admin', 'www', 'ns1', 'ns2', 'git', 'registry', 'install'})
class ServiceRegistry:
def __init__(self, config_manager):
self._cm = config_manager
# ── Config merging ────────────────────────────────────────────────────
_TYPE_COERCIONS = {'integer': int, 'string': str, 'boolean': bool}
def _merged_config(self, manifest: Dict) -> Dict:
"""Return manifest defaults overridden by admin-saved values, type-coerced."""
svc_id = manifest.get('id', '')
saved = self._cm.configs.get(svc_id, {})
schema = manifest.get('config_schema') or {}
merged = {k: v['default'] for k, v in schema.items() if 'default' in v}
for k, spec in schema.items():
if k not in saved:
continue
raw = saved[k]
coerce = self._TYPE_COERCIONS.get(spec.get('type', ''))
if coerce is not None:
try:
raw = coerce(raw)
except (TypeError, ValueError):
raw = merged.get(k, raw)
merged[k] = raw
return merged
# ── Public API ────────────────────────────────────────────────────────
def get(self, service_id: str) -> Optional[Dict]:
"""Return manifest + merged config for one service, or None if unknown."""
record = self._cm.get_installed_services().get(service_id)
if not record:
return None
manifest = record.get('manifest')
if not manifest:
return None
return {**manifest, 'config': self._merged_config(manifest)}
def list_active(self) -> List[Dict]:
"""Return all installed store services, each with merged config."""
results = []
for _svc_id, record in self._cm.get_installed_services().items():
manifest = record.get('manifest') or {}
if manifest.get('id'):
results.append({**manifest, 'config': self._merged_config(manifest)})
return results
def list_all(self) -> List[Dict]:
"""Return all installed store services, each with merged config attached as the 'config' key."""
return self.list_active()
def get_caddy_routes(self) -> List[Dict]:
"""
Return routing info for all services that have a subdomain.
Used by CaddyManager to build service blocks without hardcoding.
Values are validated here as a chokepoint so Caddyfile/DNS builders
can safely interpolate them regardless of how manifests reached disk.
"""
routes = []
for svc in self.list_all():
caps = svc.get('capabilities') or {}
if not caps.get('has_subdomain'):
continue
sub = svc.get('subdomain', '')
bknd = svc.get('backend', '')
if not sub or not bknd:
continue
svc_id = svc.get('id', '?')
if not _SUBDOMAIN_RE.match(sub) or sub in _RESERVED_SUBS:
logger.warning('ServiceRegistry: skipping %s — invalid/reserved subdomain %r', svc_id, sub)
continue
if not _BACKEND_RE.match(bknd):
logger.warning('ServiceRegistry: skipping %s — invalid backend %r', svc_id, bknd)
continue
extra_subs = [
s for s in (svc.get('extra_subdomains') or [])
if isinstance(s, str) and _SUBDOMAIN_RE.match(s) and s not in _RESERVED_SUBS
]
extra_backends = {
k: v for k, v in (svc.get('extra_backends') or {}).items()
if (isinstance(k, str) and _SUBDOMAIN_RE.match(k) and k not in _RESERVED_SUBS
and isinstance(v, str) and _BACKEND_RE.match(v))
}
routes.append({
'service_id': svc_id,
'subdomain': sub,
'backend': bknd,
'extra_subdomains': extra_subs,
'extra_backends': extra_backends,
})
return routes
def get_backup_plan(self) -> List[Dict]:
"""
Return backup declarations for all services that have storage.
Used by the backup system instead of hardcoded file lists.
Each entry:
service_id: service identifier
volumes: list of {container, path, name} for docker-exec streaming
config_paths: host-relative paths copied directly (config files)
"""
plan = []
for svc in self.list_all():
caps = svc.get('capabilities') or {}
if not caps.get('has_storage'):
continue
backup = svc.get('backup') or {}
volumes = backup.get('volumes') or []
config_paths = backup.get('config_paths') or []
if not volumes and not config_paths:
continue
plan.append({
'service_id': svc['id'],
'volumes': volumes,
'config_paths': config_paths,
})
return plan
def get_peer_service_info(self, service_id: str, peer_username: str,
domain: str, credentials: Dict) -> Optional[Dict]:
"""
Fill peer_config_template for one service+peer combination.
credentials: dict of {field_name: value} for that peer+service.
Returns None if service unknown or has no peer template.
"""
svc = self.get(service_id)
if not svc:
return None
template = svc.get('peer_config_template')
if not template:
return None
# URL-safe peer username (safe='') — prevents path traversal in CalDAV/WebDAV URLs
safe_username = _urlquote(peer_username, safe='')
result = {}
for key, raw in template.items():
val = raw
val = val.replace('{domain}', domain)
val = val.replace('{peer.username}', safe_username)
for field, cred_val in credentials.items():
val = val.replace(
'{peer.service_credentials.' + service_id + '.' + field + '}',
str(cred_val) if cred_val is not None else '',
)
cfg = svc.get('config') or {}
for cfg_key, cfg_val in cfg.items():
val = val.replace('{config.' + cfg_key + '}', str(cfg_val) if cfg_val is not None else '')
result[key] = val
return result
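To make the two-layer merge in `_merged_config` concrete, here is a standalone sketch of the same logic with a worked input. The function name `merged_config`, the schema keys, and the saved values are all hypothetical; only the merge rules are taken from the method above.

```python
from typing import Any, Dict

# Same mapping as ServiceRegistry._TYPE_COERCIONS above.
TYPE_COERCIONS = {'integer': int, 'string': str, 'boolean': bool}


def merged_config(schema: Dict[str, Dict[str, Any]],
                  saved: Dict[str, Any]) -> Dict[str, Any]:
    """Manifest defaults overridden by saved values, coerced to the schema type."""
    merged = {k: spec['default'] for k, spec in schema.items() if 'default' in spec}
    for key, spec in schema.items():
        if key not in saved:
            continue
        raw = saved[key]
        coerce = TYPE_COERCIONS.get(spec.get('type', ''))
        if coerce is not None:
            try:
                raw = coerce(raw)
            except (TypeError, ValueError):
                # Fall back to the default if there is one, else keep the raw value.
                raw = merged.get(key, raw)
        merged[key] = raw
    return merged


# Hypothetical schema and admin-saved values, for illustration only.
schema = {
    'port': {'type': 'integer', 'default': 8080},
    'title': {'type': 'string', 'default': 'Files'},
    'quota': {'type': 'integer'},  # no default declared
}
saved = {'port': '9090', 'quota': 'lots', 'unknown': 1}
result = merged_config(schema, saved)
```

Here `port` is coerced to the integer 9090, `title` keeps its default, `quota` keeps the uncoercible raw string because no default exists, and `unknown` is dropped because it is not in the schema. One caveat worth knowing if saved values can arrive as strings: Python's `bool('false')` is `True`, so a plain `bool` coercion treats any non-empty string as true.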
+220 -285
@@ -14,15 +14,16 @@ import logging
import os
import re
import threading
import subprocess
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple
import json
import requests
import yaml
from base_service_manager import BaseServiceManager
from ip_utils import CONTAINER_OFFSETS
from constants import RESERVED_SUBDOMAINS
from manifest_validator import validate_manifest, validate_provision_hook
logger = logging.getLogger(__name__)
@@ -30,27 +31,36 @@ logger = logging.getLogger(__name__)
# Constants
# ---------------------------------------------------------------------------
SERVICE_POOL_START = 20
SERVICE_POOL_END = 254
INDEX_URL_DEFAULT = (
'https://git.pic.ngo/roof/pic-services/raw/branch/main/index.json'
)
MANIFEST_URL_TPL = (
'https://git.pic.ngo/roof/pic-services/raw/branch/main/services/{id}/manifest.json'
)
TEMPLATE_URL_TPL = (
'https://git.pic.ngo/roof/pic-services/raw/branch/main/services/{id}/compose-template.yml'
)
IMAGE_ALLOWLIST_RE = re.compile(
r'^git\.pic\.ngo/roof/[a-z0-9._/-]+(:[a-zA-Z0-9._-]+)?$'
r'^git\.pic\.ngo/roof/[a-z0-9._/-]+(:[a-zA-Z0-9._-]+)?(@sha256:[a-f0-9]{64})?$'
)
# Images from well-known vendors that pre-date digest pinning in PIC.
# These are allowed to ship without a @sha256 digest; all others require one
# or must come from git.pic.ngo/roof/*.
TRUSTED_IMAGES_NO_DIGEST = frozenset({
'mailserver/docker-mailserver',
'tomsquest/docker-radicale',
'bytemark/webdav',
'filegator/filegator',
'hardware/rainloop',
})
FORBIDDEN_MOUNTS = frozenset([
'/', '/etc', '/var', '/proc', '/sys', '/dev', '/app', '/run', '/boot',
])
RESERVED_SUBDOMAINS = frozenset([
'api', 'webui', 'admin', 'www', 'mail', 'ns1', 'ns2',
'git', 'registry', 'install',
])
ENV_VALUE_RE = re.compile(r'^[A-Za-z0-9._@:/+\-= ]*$')
SUBDOMAIN_RE = re.compile(r'^[a-z][a-z0-9-]{0,30}$')
BACKEND_RE = re.compile(r'^[A-Za-z0-9._-]+:\d{1,5}$')
# ---------------------------------------------------------------------------
@@ -61,11 +71,14 @@ class ServiceStoreManager(BaseServiceManager):
"""Manages service store: install, remove, and list available/installed services."""
def __init__(self, config_manager, caddy_manager, container_manager,
data_dir: str = '', config_dir: str = ''):
data_dir: str = '', config_dir: str = '',
service_composer=None, egress_manager=None):
super().__init__('service_store', data_dir, config_dir)
self.config_manager = config_manager
self.caddy_manager = caddy_manager
self.container_manager = container_manager
self.service_composer = service_composer
self.egress_manager = egress_manager
self.compose_override = os.environ.get(
'COMPOSE_SERVICES_PATH', '/app/docker-compose.services.yml'
)
@@ -110,6 +123,21 @@ class ServiceStoreManager(BaseServiceManager):
errors.append(
f'image must match git.pic.ngo/roof/* pattern, got: {image}'
)
elif image:
# Warn when a digest pin is absent so operators know exact-version
# tracking is not guaranteed. Images in TRUSTED_IMAGES_NO_DIGEST
# and images from our own git.pic.ngo/roof/* registry (which we
# build and tag) get warnings rather than hard errors; any other
# image that somehow passes the allowlist gets a hard error.
if '@sha256:' not in image:
image_base = image.split(':')[0].split('@')[0]
is_own_registry = image_base.startswith('git.pic.ngo/roof/')
if image_base in TRUSTED_IMAGES_NO_DIGEST or is_own_registry:
logger.warning('image %s has no digest pin', image)
else:
errors.append(
f'image {image!r} must include a @sha256:<digest> pin'
)
# Volume mount safety
for vol in m.get('volumes', []):
@@ -141,19 +169,55 @@ class ServiceStoreManager(BaseServiceManager):
f'iptables_rules[].proto must be tcp or udp, got: {proto}'
)
# Caddy route subdomain
# Legacy caddy_route dict subdomain (for store manifests using the old format)
caddy_route = m.get('caddy_route') or {}
if isinstance(caddy_route, dict):
subdomain = caddy_route.get('subdomain', '')
legacy_sub = caddy_route.get('subdomain', '')
else:
subdomain = ''
if subdomain:
if subdomain in RESERVED_SUBDOMAINS:
errors.append(f'caddy_route.subdomain is reserved: {subdomain}')
elif not re.match(r'^[a-z][a-z0-9-]{0,30}$', subdomain):
legacy_sub = ''
if legacy_sub:
if legacy_sub in RESERVED_SUBDOMAINS:
errors.append(f'caddy_route.subdomain is reserved: {legacy_sub}')
elif not SUBDOMAIN_RE.match(legacy_sub):
errors.append(
f'caddy_route.subdomain must match ^[a-z][a-z0-9-]{{0,30}}$, '
f'got: {subdomain}'
f'got: {legacy_sub}'
)
# Top-level subdomain + backend (consumed by ServiceRegistry.get_caddy_routes)
subdomain = m.get('subdomain', '')
if subdomain:
if subdomain in RESERVED_SUBDOMAINS:
errors.append(f'subdomain is reserved: {subdomain}')
elif not SUBDOMAIN_RE.match(subdomain):
errors.append(
f'subdomain must match ^[a-z][a-z0-9-]{{0,30}}$, got: {subdomain}'
)
backend = m.get('backend', '')
if backend and not BACKEND_RE.match(backend):
errors.append(f'backend must be host:port (e.g. cell-foo:8080), got: {backend}')
for sub in m.get('extra_subdomains') or []:
if not isinstance(sub, str):
errors.append('extra_subdomains entries must be strings')
elif sub in RESERVED_SUBDOMAINS:
errors.append(f'extra_subdomains entry is reserved: {sub}')
elif not SUBDOMAIN_RE.match(sub):
errors.append(
f'extra_subdomains entry must match ^[a-z][a-z0-9-]{{0,30}}$, got: {sub}'
)
for sub, bknd in (m.get('extra_backends') or {}).items():
if not isinstance(sub, str) or not SUBDOMAIN_RE.match(sub):
errors.append(
f'extra_backends key must match ^[a-z][a-z0-9-]{{0,30}}$, got: {sub!r}'
)
elif sub in RESERVED_SUBDOMAINS:
errors.append(f'extra_backends key is reserved: {sub}')
if not isinstance(bknd, str) or not BACKEND_RE.match(bknd):
errors.append(
f'extra_backends[{sub!r}] value must be host:port, got: {bknd!r}'
)
# Env value safety
@@ -164,139 +228,30 @@ class ServiceStoreManager(BaseServiceManager):
f'env[].value contains disallowed characters: {val!r}'
)
# Security layer: delegate to manifest_validator for cap_add, backend
# denylist, provision_hook, reserved container names, and kind guard.
ok, sec_errs = validate_manifest(m)
if not ok:
errors.extend(sec_errs)
return (len(errors) == 0, errors)
# ── IP allocation ─────────────────────────────────────────────────────
def _allocate_service_ip(self, service_id: str) -> str:
"""Allocate the next free IP from the service pool."""
identity = self.config_manager.get_identity()
ip_range = identity.get('ip_range', '172.20.0.0/16')
import ipaddress
network = ipaddress.IPv4Network(ip_range, strict=False)
base = int(network.network_address)
# IPs already assigned to named containers
reserved_offsets = set(CONTAINER_OFFSETS.values())
# IPs already assigned to installed services
service_ips: Dict[str, str] = identity.get('service_ips', {})
taken_ips = set(service_ips.values())
for offset in range(SERVICE_POOL_START, SERVICE_POOL_END + 1):
if offset in reserved_offsets:
continue
candidate = str(ipaddress.IPv4Address(base + offset))
if candidate not in taken_ips:
return candidate
raise RuntimeError('Service IP pool exhausted (offsets 20-254 all taken)')
# ── Compose override ──────────────────────────────────────────────────
def _render_compose_override(self, installed_records: dict) -> str:
"""Generate docker-compose YAML override for all installed services."""
services: Dict[str, Any] = {}
for svc_id, record in installed_records.items():
manifest = record.get('manifest', {})
container_name = record.get('container_name', svc_id)
image = manifest.get('image', record.get('image', ''))
service_ip = record.get('service_ip', '')
# Volumes
volumes = []
for vol in manifest.get('volumes', []):
vol_name = vol.get('name', '')
mount = vol.get('mount', '')
if vol_name and mount:
volumes.append(f'{vol_name}:{mount}')
# Environment
environment: Dict[str, str] = {}
for env_entry in manifest.get('env', []):
k = env_entry.get('key', '')
v = str(env_entry.get('value', ''))
if k:
environment[k] = v
svc_def: Dict[str, Any] = {
'image': image,
'container_name': container_name,
'restart': 'unless-stopped',
'logging': {
'driver': 'json-file',
'options': {
'max-size': '10m',
'max-file': '5',
},
},
'networks': {
'cell-network': {
'ipv4_address': service_ip,
}
},
}
if volumes:
svc_def['volumes'] = volumes
if environment:
svc_def['environment'] = environment
services[container_name] = svc_def
# Collect named volumes
named_volumes: Dict[str, Any] = {}
for svc_id, record in installed_records.items():
manifest = record.get('manifest', {})
for vol in manifest.get('volumes', []):
vol_name = vol.get('name', '')
if vol_name:
named_volumes[vol_name] = None # Docker default driver
doc: Dict[str, Any] = {
'version': '3.8',
'services': services,
'networks': {
'cell-network': {
'external': True,
}
},
}
if named_volumes:
doc['volumes'] = named_volumes
return yaml.dump(doc, default_flow_style=False, allow_unicode=True)
def _write_compose_override(self, content: str) -> None:
"""Atomic write of the compose override file."""
tmp_path = self.compose_override + '.tmp'
try:
os.makedirs(os.path.dirname(os.path.abspath(self.compose_override)),
exist_ok=True)
except (PermissionError, OSError):
pass
with open(tmp_path, 'w') as f:
f.write(content)
f.flush()
try:
os.fsync(f.fileno())
except OSError:
pass
os.replace(tmp_path, self.compose_override)
# ── Index / manifest fetching ─────────────────────────────────────────
def fetch_index(self) -> list:
"""Fetch and cache the service index."""
import time
_SIZE_LIMIT = 256 * 1024
now = time.time()
if self._index_cache is not None and (now - self._index_cache_time) < self._cache_ttl:
return self._index_cache
try:
resp = requests.get(self.index_url, timeout=10)
resp = requests.get(self.index_url, timeout=10, stream=True)
resp.raise_for_status()
data = resp.json()
content = resp.raw.read(_SIZE_LIMIT + 1, decode_content=True)
if len(content) > _SIZE_LIMIT:
raise ValueError('Index response exceeds 256 KB limit')
data = json.loads(content)
self._index_cache = data if isinstance(data, list) else data.get('services', [])
self._index_cache_time = now
return self._index_cache
@@ -306,19 +261,33 @@ class ServiceStoreManager(BaseServiceManager):
def _fetch_manifest(self, service_id: str) -> dict:
"""Fetch a service manifest by ID."""
_SIZE_LIMIT = 256 * 1024
url = MANIFEST_URL_TPL.format(id=service_id)
resp = requests.get(url, timeout=10)
resp = requests.get(url, timeout=10, stream=True)
resp.raise_for_status()
return resp.json()
content = resp.raw.read(_SIZE_LIMIT + 1, decode_content=True)
if len(content) > _SIZE_LIMIT:
raise ValueError(
f'Manifest response for {service_id} exceeds 256 KB limit'
)
return json.loads(content)
def _fetch_template(self, service_id: str, manifest: dict) -> str:
"""Fetch the compose template for a service."""
_SIZE_LIMIT = 256 * 1024
url = TEMPLATE_URL_TPL.format(id=service_id)
resp = requests.get(url, timeout=10, stream=True)
resp.raise_for_status()
content = resp.raw.read(_SIZE_LIMIT + 1, decode_content=True)
if len(content) > _SIZE_LIMIT:
raise ValueError(f'Compose template for {service_id} exceeds 256 KB limit')
return content.decode('utf-8')
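The three fetchers above share one pattern: read at most one byte more than the cap, and reject the response if that extra byte arrived. The sketch below isolates the pattern without any network access. `read_capped` is a hypothetical helper, and `io.BytesIO` stands in for the streamed response body, so this illustrates the idea rather than the project's implementation.

```python
import io
import json
from typing import BinaryIO

SIZE_LIMIT = 256 * 1024  # same 256 KB cap as the fetchers above


def read_capped(stream: BinaryIO, limit: int = SIZE_LIMIT) -> bytes:
    """Read up to limit bytes; raise if the stream holds more than that."""
    # Asking for limit + 1 bytes distinguishes "exactly at the cap" from "over".
    content = stream.read(limit + 1)
    if len(content) > limit:
        raise ValueError(f'response exceeds {limit} byte limit')
    return content


# A small body parses normally.
small = read_capped(io.BytesIO(b'{"services": []}'))
parsed = json.loads(small)

# A body one byte over the cap is rejected before any parsing happens.
try:
    read_capped(io.BytesIO(b'x' * (SIZE_LIMIT + 1)))
    oversized_rejected = False
except ValueError:
    oversized_rejected = True
```

Reading a bounded amount keeps memory use fixed regardless of what the server sends, which a plain `resp.json()` call does not guarantee.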
# ── Core operations ───────────────────────────────────────────────────
def install(self, service_id: str) -> dict:
"""Install a service from the store."""
from firewall_manager import apply_service_rules
with self._lock:
# Already installed?
installed = self.config_manager.get_installed_services()
if service_id in installed:
return {'ok': True, 'already_installed': True}
@@ -333,154 +302,111 @@ class ServiceStoreManager(BaseServiceManager):
if not ok:
return {'ok': False, 'errors': errs}
# Allocate IP
try:
ip = self._allocate_service_ip(service_id)
except RuntimeError as e:
return {'ok': False, 'error': str(e)}
ok2, errs2 = validate_manifest(manifest)
if not ok2:
return {'ok': False, 'errors': errs2}
# Build install record
# Digest-pin requirement is mode-dependent: the static validators
# above only warn on a missing @sha256: pin (so installs keep
# working until the publish pipeline writes digests). Under
# enforce, a store image without a digest pin is fatal.
mode = self.config_manager.get_image_verification_mode()
image = manifest.get('image', '')
if mode == 'enforce' and image and '@sha256:' not in image:
return {
'ok': False,
'error': (
f'image {image!r} must be digest-pinned (@sha256:) '
'under image_verification mode "enforce"'
),
}
# Dependency check
if self.service_composer is not None:
err = self.service_composer._resolve_requires(manifest, installed)
if err:
return {'ok': False, 'error': err}
# Fetch compose template
try:
template_content = self._fetch_template(service_id, manifest)
except Exception as e:
return {'ok': False, 'error': f'Failed to fetch compose template: {e}'}
# Write compose file and start containers (validation inside write_compose)
if self.service_composer is not None:
try:
result = self.service_composer.install(service_id, manifest, template_content)
except ValueError as e:
return {'ok': False, 'error': str(e)}
except Exception as e:
return {'ok': False, 'error': f'Failed to start service: {e}'}
if not result.get('ok'):
return {'ok': False, 'error': result.get('error') or result.get('stderr', 'docker up failed')}
# Persist minimal install record. For instanceable connectivity
# services the raw compose template is stored so ConnectivityManager
# can render one container per connection instance without re-fetching.
record = {
'id': service_id,
'name': manifest.get('name', service_id),
'container_name': manifest['container_name'],
'image': manifest.get('image', ''),
'service_ip': ip,
'caddy_route': manifest.get('caddy_route'),
'iptables_rules': manifest.get('iptables_rules', []),
'manifest': manifest,
'installed_at': datetime.utcnow().isoformat(),
}
# Persist to config
if manifest.get('instanceable'):
record['compose_template'] = template_content
self.config_manager.set_installed_service(service_id, record)
identity = self.config_manager.get_identity()
service_ips = dict(identity.get('service_ips', {}))
service_ips[service_id] = ip
self.config_manager.set_identity_field('service_ips', service_ips)
# Write compose override
all_installed = self.config_manager.get_installed_services()
# Regenerate Caddy (registry now drives routes, no caddy_routes list needed)
try:
content = self._render_compose_override(all_installed)
self._write_compose_override(content)
self.caddy_manager.regenerate_with_installed([])
except Exception as e:
logger.error(f'Failed to write compose override: {e}')
logger.warning('install: caddy regenerate failed for %s (non-fatal): %s', service_id, e)
# Apply iptables rules (best-effort)
try:
apply_service_rules(service_id, ip, manifest.get('iptables_rules', []))
except Exception as e:
logger.warning(f'apply_service_rules for {service_id} failed (non-fatal): {e}')
if self.egress_manager:
try:
self.egress_manager.apply_service(service_id)
except Exception as exc:
logger.warning('Egress apply failed for %s (non-fatal): %s', service_id, exc)
# Regenerate Caddyfile
try:
caddy_routes = [
r.get('caddy_route')
for r in all_installed.values()
if r.get('caddy_route')
]
self.caddy_manager.regenerate_with_installed(caddy_routes)
except Exception as e:
logger.warning(f'caddy regenerate for {service_id} failed (non-fatal): {e}')
# Start the container via docker compose
base_compose = os.environ.get('COMPOSE_FILE', '/app/docker-compose.yml')
try:
result = subprocess.run(
['docker', 'compose',
'-f', base_compose,
'-f', self.compose_override,
'up', '-d', manifest['container_name']],
capture_output=True, text=True, timeout=120,
)
if result.returncode != 0:
logger.warning(
f'docker compose up for {service_id} failed: {result.stderr.strip()}'
)
except Exception as e:
logger.warning(f'docker compose up for {service_id} failed (non-fatal): {e}')
return {
'ok': True,
'service_ip': ip,
'container_name': manifest['container_name'],
}
return {'ok': True}
def remove(self, service_id: str, purge_data: bool = False) -> dict:
"""Remove an installed service."""
from firewall_manager import clear_service_rules
with self._lock:
installed = self.config_manager.get_installed_services()
record = installed.get(service_id)
if not record:
if service_id not in installed:
return {'ok': False, 'error': f'Service {service_id} is not installed'}
container_name = record.get('container_name', service_id)
manifest = record.get('manifest', {})
base_compose = os.environ.get('COMPOSE_FILE', '/app/docker-compose.yml')
# Prevent removing a service that others depend on
if self.service_composer is not None:
dependents = self.service_composer._resolve_dependents(service_id, installed)
if dependents:
return {
'ok': False,
'error': f'Cannot remove {service_id}: required by {", ".join(sorted(dependents))}',
}
# Stop and remove container
try:
subprocess.run(
['docker', 'compose',
'-f', base_compose,
'-f', self.compose_override,
'stop', container_name],
capture_output=True, text=True, timeout=60,
)
except Exception as e:
logger.warning(f'docker compose stop for {service_id} failed (non-fatal): {e}')
if self.egress_manager:
try:
self.egress_manager.clear_service(service_id)
except Exception as exc:
logger.warning('Egress clear failed for %s (non-fatal): %s', service_id, exc)
try:
subprocess.run(
['docker', 'rm', '-f', container_name],
capture_output=True, text=True, timeout=30,
)
except Exception as e:
logger.warning(f'docker rm for {service_id} failed (non-fatal): {e}')
# Stop and remove containers (best-effort)
if self.service_composer is not None:
try:
self.service_composer.remove(service_id, purge_data=purge_data)
except Exception as e:
logger.warning('remove: composer.remove failed for %s (non-fatal): %s', service_id, e)
# Clear iptables rules
try:
clear_service_rules(service_id)
except Exception as e:
logger.warning(f'clear_service_rules for {service_id} failed (non-fatal): {e}')
# Remove from config, regenerate compose + caddy
# Remove from config
self.config_manager.remove_installed_service(service_id)
remaining = self.config_manager.get_installed_services()
# Regenerate Caddy
try:
content = self._render_compose_override(remaining)
self._write_compose_override(content)
self.caddy_manager.regenerate_with_installed([])
except Exception as e:
logger.error(f'Failed to write compose override after remove: {e}')
try:
caddy_routes = [
r.get('caddy_route')
for r in remaining.values()
if r.get('caddy_route')
]
self.caddy_manager.regenerate_with_installed(caddy_routes)
except Exception as e:
logger.warning(f'caddy regenerate after remove failed (non-fatal): {e}')
# Purge named volumes if requested
if purge_data:
for vol in manifest.get('volumes', []):
vol_name = vol.get('name', '')
if vol_name:
try:
subprocess.run(
['docker', 'volume', 'rm', vol_name],
capture_output=True, text=True, timeout=30,
)
except Exception as e:
logger.warning(
f'docker volume rm {vol_name} failed (non-fatal): {e}'
)
logger.warning('remove: caddy regenerate failed for %s (non-fatal): %s', service_id, e)
return {'ok': True}
@@ -495,16 +421,22 @@ class ServiceStoreManager(BaseServiceManager):
from firewall_manager import apply_service_rules
installed = self.config_manager.get_installed_services()
# Always regenerate the Caddyfile so a cell rename or fresh install
# produces the correct domain even when no store services are installed.
try:
caddy_routes = [
r.get('caddy_route')
for r in (installed or {}).values()
if r.get('caddy_route')
]
self.caddy_manager.regenerate_with_installed(caddy_routes)
except Exception as e:
logger.warning(f'reapply_on_startup: caddy regenerate failed: {e}')
if not installed:
return
# Regenerate compose override in case it was deleted
try:
content = self._render_compose_override(installed)
self._write_compose_override(content)
except Exception as e:
logger.warning(f'reapply_on_startup: compose override write failed: {e}')
# Re-apply iptables rules
for svc_id, record in installed.items():
ip = record.get('service_ip', '')
@@ -514,13 +446,16 @@ class ServiceStoreManager(BaseServiceManager):
except Exception as e:
logger.warning(f'reapply_on_startup: apply_service_rules({svc_id}) failed: {e}')
# Regenerate Caddyfile
try:
caddy_routes = [
r.get('caddy_route')
for r in installed.values()
if r.get('caddy_route')
]
self.caddy_manager.regenerate_with_installed(caddy_routes)
except Exception as e:
logger.warning(f'reapply_on_startup: caddy regenerate failed: {e}')
# Bring up per-service compose stacks
if self.service_composer is not None:
try:
self.service_composer.reapply_active_services()
except Exception as e:
logger.warning('reapply_on_startup: reapply_active_services failed: %s', e)
# Re-apply egress fwmark rules
if self.egress_manager is not None:
try:
self.egress_manager.apply_all()
except Exception as e:
logger.warning('reapply_on_startup: egress apply_all failed: %s', e)
+119 -15
@@ -60,12 +60,44 @@ VALID_DOMAIN_MODES = {'pic_ngo', 'cloudflare', 'duckdns', 'http01', 'lan'}
CELL_NAME_RE = re.compile(r'^[a-z][a-z0-9-]{1,30}$')
DDNS_API_BASE = os.environ.get('DDNS_URL', 'https://ddns.pic.ngo/api/v1').rstrip('/').replace('/api/v1', '')
DDNS_TOTP_SECRET = os.environ.get('DDNS_TOTP_SECRET', '')
def _build_ddns_config(domain_mode: str, cloudflare_api_token: str = '',
duckdns_token: str = '', duckdns_subdomain: str = '') -> dict:
"""Return the top-level ddns config dict for a given domain mode."""
if domain_mode == 'pic_ngo':
return {
'provider': 'pic_ngo',
'api_base_url': DDNS_API_BASE,
'totp_secret': DDNS_TOTP_SECRET,
'enabled': True,
}
if domain_mode == 'cloudflare':
cfg = {'provider': 'cloudflare', 'enabled': True}
if cloudflare_api_token:
cfg['api_token'] = cloudflare_api_token
return cfg
if domain_mode == 'duckdns':
cfg = {'provider': 'duckdns', 'enabled': True}
if duckdns_token:
cfg['token'] = duckdns_token
if duckdns_subdomain:
cfg['subdomain'] = duckdns_subdomain
return cfg
if domain_mode == 'http01':
return {'provider': 'http01', 'enabled': True}
return {'provider': 'none', 'enabled': False}
class SetupManager:
"""Manages the first-run setup wizard state and completion."""
def __init__(self, config_manager, auth_manager):
def __init__(self, config_manager, auth_manager, network_manager=None):
self.config_manager = config_manager
self.auth_manager = auth_manager
self.network_manager = network_manager
# ── state helpers ─────────────────────────────────────────────────────
@@ -74,11 +106,22 @@ class SetupManager:
return bool(self.config_manager.get_identity().get('setup_complete', False))
def get_setup_status(self) -> Dict[str, Any]:
"""Return current setup status and wizard metadata."""
"""Return current setup status, wizard metadata, and any pre-configured identity."""
identity = self.config_manager.get_identity()
preconfigured = {
k: v for k, v in {
'cell_name': identity.get('cell_name', ''),
'domain_mode': identity.get('domain_mode', ''),
'domain_name': identity.get('domain_name', ''),
'cloudflare_api_token': identity.get('cloudflare_api_token', ''),
'duckdns_token': identity.get('duckdns_token', ''),
}.items() if v
}
return {
'complete': self.is_setup_complete(),
'available_services': AVAILABLE_SERVICES,
'available_timezones': AVAILABLE_TIMEZONES,
'preconfigured': preconfigured,
}
# ── validation ────────────────────────────────────────────────────────
@@ -128,9 +171,11 @@ class SetupManager:
cell_name = payload.get('cell_name', '')
password = payload.get('password', '')
domain_mode = payload.get('domain_mode', '')
domain_name = payload.get('domain_name', '')
timezone = payload.get('timezone', '')
services_enabled = payload.get('services_enabled', [])
ddns_provider = payload.get('ddns_provider', 'none')
cloudflare_api_token = payload.get('cloudflare_api_token', '')
duckdns_token = payload.get('duckdns_token', '')
errors.extend(self.validate_cell_name(cell_name))
errors.extend(self.validate_password(password))
@@ -141,8 +186,6 @@ class SetupManager:
)
if not timezone or not isinstance(timezone, str):
errors.append('timezone is required.')
if not isinstance(services_enabled, list):
errors.append('services_enabled must be a list.')
if errors:
return {'success': False, 'errors': errors}
@@ -168,35 +211,96 @@ class SetupManager:
if self.is_setup_complete():
return {'success': False, 'errors': ['Setup has already been completed.']}
# ── create admin user ──────────────────────────────────────────
# ── create or update admin user ────────────────────────────────
# The installer may have bootstrapped an admin account from a
# generated password. The wizard's job is to set the real password,
# so update it if the account already exists.
ok = self.auth_manager.create_user(
username='admin',
password=password,
role='admin',
)
if not ok:
return {'success': False, 'errors': ['Failed to create admin user. The username may already exist.']}
ok = self.auth_manager.set_password_admin('admin', password)
if not ok:
return {'success': False, 'errors': ['Failed to set admin password.']}
# ── persist identity fields ────────────────────────────────────
self.config_manager.set_identity_field('cell_name', cell_name)
self.config_manager.set_identity_field('domain_mode', domain_mode)
if domain_name:
self.config_manager.set_identity_field('domain_name', domain_name)
self.config_manager.set_identity_field('timezone', timezone)
self.config_manager.set_identity_field('services_enabled', services_enabled)
self.config_manager.set_identity_field('ddns_provider', ddns_provider)
if cloudflare_api_token:
self.config_manager.set_identity_field('cloudflare_api_token', cloudflare_api_token)
if duckdns_token:
self.config_manager.set_identity_field('duckdns_token', duckdns_token)
# NOTE: DDNS registration is deferred to Phase 3.
# For now we just store ddns_provider in config.
logger.info(
'DDNS registration skipped (Phase 1). '
'DDNS registration will happen in Phase 3. '
f'ddns_provider={ddns_provider!r} stored in identity config.'
# ── write top-level ddns section so DDNSManager can find provider ──
duckdns_sub = domain_name.replace('.duckdns.org', '') if domain_mode == 'duckdns' else ''
ddns_cfg = _build_ddns_config(
domain_mode,
cloudflare_api_token=cloudflare_api_token,
duckdns_token=duckdns_token,
duckdns_subdomain=duckdns_sub,
)
self.config_manager.set_ddns_config(ddns_cfg)
# ── trigger DDNS registration for pic_ngo ─────────────────────────
warnings: List[str] = []
if domain_mode == 'pic_ngo':
try:
from ddns_manager import DDNSManager
ddns_mgr = DDNSManager(self.config_manager)
ddns_mgr.register(cell_name, '')
logger.info(f'DDNS registered: {cell_name}.pic.ngo')
except Exception as exc:
msg = str(exc)
logger.warning(f'DDNS registration failed: {msg}')
if '409' in msg or 'taken' in msg.lower():
warnings.append(
f'The name "{cell_name}" is already registered on pic.ngo. '
'HTTPS will not be active until you re-register: go to '
'Settings → DDNS and click Re-register, or choose a different name.'
)
else:
warnings.append(
'DDNS registration could not be completed right now '
f'({msg}). The cell will retry automatically. '
'HTTPS will activate once registration succeeds.'
)
# ── write the split-horizon DNS zone for non-LAN modes ─────────
# VPN clients use the cell's CoreDNS (DNS=<wg ip>) and must resolve
# the effective domain to the internal Caddy IP so traffic reaches
# Caddy through the tunnel. _bootstrap_dns runs at container start
# BEFORE setup completes (domain_mode still 'lan'), so it takes the
# LAN branch and never writes this zone — leaving CoreDNS pointing
# at a missing zone file and VPN lookups returning nothing
# (dns_probe_finished_bad_config). Write it here now that the mode
# and effective domain are known.
if domain_mode != 'lan' and self.network_manager is not None:
try:
effective_domain = self.config_manager.get_effective_domain()
primary_domain = self.config_manager.get_identity().get('domain', 'cell')
if effective_domain and effective_domain != primary_domain:
caddy_ip = self.network_manager._get_wg_server_ip()
self.network_manager.update_split_horizon_zone(
effective_domain, caddy_ip, primary_domain=primary_domain)
logger.info(
f'Split-horizon zone written for {effective_domain} -> {caddy_ip}')
except Exception as exc:
logger.warning(f'Split-horizon zone setup failed (non-fatal): {exc}')
# ── mark setup complete (must be last) ─────────────────────────
self.config_manager.set_identity_field('setup_complete', True)
logger.info(f"Setup completed. cell_name={cell_name!r}, domain_mode={domain_mode!r}")
return {'success': True, 'redirect': '/login'}
result: Dict[str, Any] = {'success': True, 'redirect': '/login'}
if warnings:
result['warnings'] = warnings
return result
finally:
try:
+57 -17
@@ -155,7 +155,9 @@ class WireGuardManager(BaseServiceManager):
f'iptables -t nat -A PREROUTING -i %i -d {server_ip} -p udp --dport 53 -j DNAT --to-destination {dns_ip}:53; '
f'iptables -t nat -A PREROUTING -i %i -d {server_ip} -p tcp --dport 53 -j DNAT --to-destination {dns_ip}:53; '
f'iptables -t nat -A PREROUTING -i %i -d {server_ip} -p tcp --dport 80 -j DNAT --to-destination {caddy_ip}:80; '
f'iptables -t nat -A PREROUTING -i %i -d {server_ip} -p tcp --dport 443 -j DNAT --to-destination {caddy_ip}:443; '
f'iptables -I FORWARD -i %i -o eth0 -p tcp --dport 80 -j ACCEPT; '
f'iptables -I FORWARD -i %i -o eth0 -p tcp --dport 443 -j ACCEPT; '
f'iptables -I FORWARD -i %i -o eth0 -p udp --dport 53 -j ACCEPT; '
f'iptables -I FORWARD -i %i -o eth0 -p tcp --dport 53 -j ACCEPT; '
f'iptables -I FORWARD -i eth0 -o %i -s 172.20.0.0/16 -j ACCEPT; '
@@ -165,7 +167,9 @@ class WireGuardManager(BaseServiceManager):
f'iptables -t nat -D PREROUTING -i %i -d {server_ip} -p udp --dport 53 -j DNAT --to-destination {dns_ip}:53 2>/dev/null || true; '
f'iptables -t nat -D PREROUTING -i %i -d {server_ip} -p tcp --dport 53 -j DNAT --to-destination {dns_ip}:53 2>/dev/null || true; '
f'iptables -t nat -D PREROUTING -i %i -d {server_ip} -p tcp --dport 80 -j DNAT --to-destination {caddy_ip}:80 2>/dev/null || true; '
f'iptables -t nat -D PREROUTING -i %i -d {server_ip} -p tcp --dport 443 -j DNAT --to-destination {caddy_ip}:443 2>/dev/null || true; '
f'iptables -D FORWARD -i %i -o eth0 -p tcp --dport 80 -j ACCEPT 2>/dev/null || true; '
f'iptables -D FORWARD -i %i -o eth0 -p tcp --dport 443 -j ACCEPT 2>/dev/null || true; '
f'iptables -D FORWARD -i %i -o eth0 -p udp --dport 53 -j ACCEPT 2>/dev/null || true; '
f'iptables -D FORWARD -i %i -o eth0 -p tcp --dport 53 -j ACCEPT 2>/dev/null || true; '
f'iptables -D FORWARD -i eth0 -o %i -s 172.20.0.0/16 -j ACCEPT 2>/dev/null || true; '
@@ -179,13 +183,11 @@ class WireGuardManager(BaseServiceManager):
f'PostUp = iptables -A FORWARD -i %i -j DROP; '
f'iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE; '
f'{hairpin}'
f'{dnat_up}; '
f'sysctl -q net.ipv4.conf.all.rp_filter=0 || true\n'
f'{dnat_up}\n'
f'PostDown = iptables -D FORWARD -i %i -j DROP 2>/dev/null || true; '
f'iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE 2>/dev/null || true; '
f'{hairpin_down}'
f'{dnat_down}; '
f'sysctl -q net.ipv4.conf.all.rp_filter=1 || true\n'
f'{dnat_down}\n'
)
@staticmethod
@@ -194,11 +196,11 @@ class WireGuardManager(BaseServiceManager):
t = token.strip()
if not t.startswith('iptables'):
return False
# PREROUTING DNAT on ports 53 or 80 (scoped or unscoped — we replace both)
if 'PREROUTING' in t and 'DNAT' in t and ('--dport 53' in t or '--dport 80' in t):
# PREROUTING DNAT on ports 53, 80, or 443 (scoped or unscoped — we replace both)
if 'PREROUTING' in t and 'DNAT' in t and ('--dport 53' in t or '--dport 80' in t or '--dport 443' in t):
return True
# FORWARD accept to eth0 for ports 53 or 80 (service traffic forwarding)
if 'FORWARD' in t and '-o eth0' in t and ('--dport 53' in t or '--dport 80' in t):
# FORWARD accept to eth0 for ports 53, 80, or 443 (service traffic forwarding)
if 'FORWARD' in t and '-o eth0' in t and ('--dport 53' in t or '--dport 80' in t or '--dport 443' in t):
return True
# Docker-to-WG FORWARD: eth0 → wg0 for 172.20.0.0/16
if 'FORWARD' in t and '-i eth0' in t and '172.20.0.0/16' in t:
@@ -290,6 +292,8 @@ class WireGuardManager(BaseServiceManager):
return self.generate_config()
def _write_config(self, content: str):
if content and not content.endswith('\n'):
content += '\n'
with open(self._config_file(), 'w') as f:
f.write(content)
self._syncconf()
@@ -801,12 +805,20 @@ class WireGuardManager(BaseServiceManager):
"""Remove the [Peer] block matching public_key from wg0.conf."""
try:
content = self._read_config()
# Split on blank lines between blocks
raw_blocks = ('\n' + content).split('\n\n')
# Normalise to ensure blank-line block separators before splitting.
# Without this, a file written without trailing newline will merge
# [Interface] and the first [Peer] into one block, and the filter
# below would then delete [Interface] together with the peer.
normalised = content.replace('\n[Peer]', '\n\n[Peer]')
raw_blocks = ('\n' + normalised).split('\n\n')
new_blocks = [
b for b in raw_blocks
if not (f'PublicKey = {public_key}' in b and '[Peer]' in b)
]
# Never write an empty file — that would destroy the [Interface] block.
if not any('[Interface]' in b for b in new_blocks):
logger.error('remove_peer: [Interface] block would be lost — aborting write')
return False
self._write_config('\n\n'.join(new_blocks).lstrip('\n'))
return True
except Exception as e:
@@ -972,19 +984,44 @@ class WireGuardManager(BaseServiceManager):
pass
return ip
def check_port_open(self, port: int = None) -> bool:
"""Check if WireGuard is running and listening on the configured UDP port."""
configured_port = port if port is not None else self._get_configured_port()
# Primary: verify wg0 is up AND listening on the configured port
def _kernel_listening_port(self) -> Optional[int]:
"""Return the UDP port wg0 is actually bound to per `wg show`, or None.
This reads the live kernel state, which is the source of truth for what
port traffic must reach; it may differ from wg0.conf's ListenPort if the
container has not been recreated since the port was changed.
"""
try:
result = subprocess.run(
['docker', 'exec', 'cell-wireguard', 'wg', 'show', 'wg0'],
capture_output=True, text=True, timeout=5,
)
if result.returncode == 0 and f'listening port: {configured_port}' in result.stdout.lower():
return True
if result.returncode != 0:
return None
for line in result.stdout.lower().splitlines():
line = line.strip()
if line.startswith('listening port:'):
try:
return int(line.split(':', 1)[1].strip())
except (ValueError, IndexError):
return None
except Exception:
pass
return None
def check_port_open(self, port: int = None) -> bool:
"""True when WireGuard is up and bound to a UDP port (reachable).
This is a liveness check, not a strict equality check against the
configured port: an interface that is up with a `listening port:` line
is serving traffic on that bound port. The bound port may differ from
wg0.conf's ListenPort if the container has not yet been recreated — that
is surfaced separately via the endpoint's actual-port field, not by
reporting the port closed.
"""
# Primary: wg0 is up and has a listening port → reachable on that port.
if self._kernel_listening_port() is not None:
return True
# Fallback: recent peer handshake confirms external reachability
try:
statuses = self.get_all_peer_statuses()
@@ -1080,11 +1117,14 @@ class WireGuardManager(BaseServiceManager):
capture_output=True, text=True, timeout=5,
)
running = 'cell-wireguard' in result.stdout
configured_addr = self._get_configured_address()
return {
'running': running,
'status': 'online' if running else 'offline',
'interface': 'wg0',
'ip_info': {'address': SERVER_ADDRESS} if running else {},
'listen_port': self._get_configured_port(),
'address': configured_addr if running else None,
'ip_info': {'address': configured_addr} if running else {},
'peers_count': len(self.get_peers()),
'timestamp': datetime.utcnow().isoformat(),
}
-36
@@ -1,36 +0,0 @@
{
"cell_name": "modified",
"domain": "cell.local",
"ip_range": "10.0.0.0/24",
"network": {
"dns_port": 53,
"dhcp_range": "10.0.0.100-10.0.0.200",
"ntp_servers": ["pool.ntp.org"]
},
"wireguard": {
"port": 51820,
"private_key": "test_key",
"address": "10.0.0.1/24"
},
"email": {
"domain": "cell.local",
"smtp_port": 25,
"imap_port": 143
},
"calendar": {
"port": 5232,
"data_dir": "/app/data/calendar"
},
"files": {
"port": 8080,
"data_dir": "/app/data/files"
},
"routing": {
"nat_enabled": true,
"firewall_enabled": true
},
"vault": {
"ca_configured": true,
"fernet_configured": true
}
}
+4
@@ -0,0 +1,4 @@
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEjzJzXg0lMxYRVnJXecvl5YZUhUpK
2WQnyK1SB8Bn9K2JRCHkTIk0D3/78Q4Y5cNuj7i6LFgqx21L/QAiDY21Zw==
-----END PUBLIC KEY-----
+1 -2
@@ -1,6 +1,5 @@
version: '3.3'
services: {}
networks:
cell-network:
external: true
name: pic_cell-network
name: cell-network
+32 -153
@@ -1,5 +1,3 @@
version: '3.3'
services:
# Reverse Proxy - Caddy for routing all .cell traffic
caddy:
@@ -14,6 +12,9 @@ services:
- ./data/caddy:/data
- ./config/caddy/certs:/config/caddy/certs
restart: unless-stopped
mem_limit: 256m
cpus: 0.5
pids_limit: 256
cap_add:
- NET_ADMIN
networks:
@@ -27,7 +28,7 @@ services:
# DNS Server - CoreDNS for .cell TLD resolution
dns:
image: coredns/coredns:latest
image: coredns/coredns:1.11.3@sha256:9caabbf6238b189a65d0d6e6ac138de60d6a1c419e5a341fbbb7c78382559c6e
container_name: cell-dns
profiles: ["core", "full"]
command: ["-conf", "/etc/coredns/Corefile"]
@@ -38,6 +39,9 @@ services:
- ./config/dns/Corefile:/etc/coredns/Corefile
- ./data/dns:/data
restart: unless-stopped
mem_limit: 128m
cpus: 0.25
pids_limit: 256
networks:
cell-network:
ipv4_address: ${DNS_IP:-172.20.0.3}
@@ -47,118 +51,24 @@ services:
max-size: "10m"
max-file: "5"
# DHCP Server - dnsmasq for IP leasing
dhcp:
image: alpine:latest
container_name: cell-dhcp
profiles: ["full"]
ports:
- "${DHCP_PORT:-67}:67/udp"
volumes:
- ./config/dhcp/dnsmasq.conf:/etc/dnsmasq.conf
- ./data/dhcp:/var/lib/misc
restart: unless-stopped
networks:
cell-network:
ipv4_address: ${DHCP_IP:-172.20.0.4}
command: ["/bin/sh", "-c", "apk add --no-cache dnsmasq && dnsmasq -d -C /etc/dnsmasq.conf"]
cap_add:
- NET_ADMIN
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
# NTP Server - chrony for time synchronization
ntp:
image: alpine:latest
build: ./ntp
container_name: cell-ntp
profiles: ["full"]
profiles: ["core", "full"]
ports:
- "${NTP_PORT:-123}:123/udp"
volumes:
- ./config/ntp/chrony.conf:/etc/chrony/chrony.conf
restart: unless-stopped
mem_limit: 128m
cpus: 0.25
pids_limit: 256
networks:
cell-network:
ipv4_address: ${NTP_IP:-172.20.0.5}
cap_add:
- SYS_TIME
command: ["/bin/sh", "-c", "apk add --no-cache chrony && rm -f /var/run/chrony/chronyd.pid && exec chronyd -d -f /etc/chrony/chrony.conf -n"]
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
# Email Server - Postfix + Dovecot
mail:
image: mailserver/docker-mailserver:latest
container_name: cell-mail
profiles: ["full"]
hostname: mail
domainname: cell.local
env_file: ./config/mail/mailserver.env
ports:
- "${MAIL_SMTP_PORT:-25}:25"
- "${MAIL_SUBMISSION_PORT:-587}:587"
- "${MAIL_IMAP_PORT:-993}:993"
volumes:
- ./data/maildata:/var/mail
- ./data/mailstate:/var/mail-state
- ./data/maillogs:/var/log/mail
- ./config/mail/config:/tmp/docker-mailserver/
- ./config/mail/ssl:/etc/letsencrypt
restart: unless-stopped
networks:
cell-network:
ipv4_address: ${MAIL_IP:-172.20.0.6}
cap_add:
- NET_ADMIN
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
# Calendar & Contacts - Radicale
radicale:
image: tomsquest/docker-radicale:latest
container_name: cell-radicale
profiles: ["full"]
ports:
- "127.0.0.1:${RADICALE_PORT:-5232}:5232"
volumes:
- ./config/radicale:/etc/radicale
- ./data/radicale:/data
restart: unless-stopped
networks:
cell-network:
ipv4_address: ${RADICALE_IP:-172.20.0.7}
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
# File Storage - WebDAV
webdav:
image: bytemark/webdav:latest
container_name: cell-webdav
profiles: ["full"]
ports:
- "127.0.0.1:${WEBDAV_PORT:-8080}:80"
environment:
- AUTH_TYPE=Basic
- USERNAME=${WEBDAV_USER:-admin}
- PASSWORD=${WEBDAV_PASS}
volumes:
- ./data/files:/var/lib/dav
restart: unless-stopped
networks:
cell-network:
ipv4_address: ${WEBDAV_IP:-172.20.0.8}
logging:
driver: json-file
options:
@@ -167,26 +77,25 @@ services:
# WireGuard VPN
wireguard:
image: linuxserver/wireguard:latest
build: ./wireguard
container_name: cell-wireguard
profiles: ["core", "full"]
environment:
- SERVERMODE=true
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
ports:
- "${WG_PORT:-51820}:${WG_PORT:-51820}/udp"
volumes:
- ./config/wireguard:/config
- /lib/modules:/lib/modules
restart: unless-stopped
mem_limit: 256m
cpus: 0.5
pids_limit: 256
networks:
cell-network:
ipv4_address: ${WG_IP:-172.20.0.9}
cap_add:
- NET_ADMIN
- SYS_MODULE
privileged: true
# FALLBACK for kernels lacking builtin WireGuard: re-add `privileged: true`,
# `- SYS_MODULE` under cap_add, and the `- /lib/modules:/lib/modules` volume.
# Default assumes a modern kernel (>= 5.6) with WireGuard compiled in.
sysctls:
- net.ipv4.conf.all.src_valid_mark=1
- net.ipv4.ip_forward=1
@@ -204,10 +113,14 @@ services:
profiles: ["core", "full"]
ports:
- "127.0.0.1:${API_PORT:-3000}:3000"
environment:
- DDNS_URL=${DDNS_URL:-https://ddns.pic.ngo/api/v1}
- DDNS_TOTP_SECRET=${DDNS_TOTP_SECRET:-S6UMA464YIKM74QHXWL5WELDIO3HFZ6K}
volumes:
- ./data/api:/app/data
- ./data/dns:/app/data/dns
- ./config/api:/app/config
- ./config/cosign:/app/config/cosign:ro
- ./config/caddy:/app/config-caddy
- ./config/wireguard:/app/config/wireguard
- ./config/dns:/app/config/dns
@@ -219,6 +132,9 @@ services:
- ./scripts:/app/scripts:ro
pid: host
restart: unless-stopped
mem_limit: 512m
cpus: 1.0
pids_limit: 256
networks:
cell-network:
ipv4_address: ${API_IP:-172.20.0.10}
@@ -237,8 +153,11 @@ services:
container_name: cell-webui
profiles: ["core", "full"]
ports:
- "${WEBUI_PORT:-8081}:80"
- "${WEBUI_PORT:-8081}:8080"
restart: unless-stopped
mem_limit: 256m
cpus: 0.5
pids_limit: 256
networks:
cell-network:
ipv4_address: ${WEBUI_IP:-172.20.0.11}
@@ -248,47 +167,7 @@ services:
max-size: "10m"
max-file: "5"
# Webmail - RainLoop
rainloop:
image: hardware/rainloop
container_name: cell-rainloop
profiles: ["full"]
restart: unless-stopped
networks:
cell-network:
ipv4_address: ${RAINLOOP_IP:-172.20.0.12}
ports:
- "127.0.0.1:${RAINLOOP_PORT:-8888}:8888"
volumes:
- ./data/rainloop:/rainloop/data
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
# File Manager - FileGator
filegator:
image: filegator/filegator
container_name: cell-filegator
profiles: ["full"]
restart: unless-stopped
networks:
cell-network:
ipv4_address: ${FILEGATOR_IP:-172.20.0.13}
ports:
- "127.0.0.1:${FILEGATOR_PORT:-8082}:8080"
volumes:
- ./data/filegator:/var/www/filegator/private
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
networks:
cell-network:
driver: bridge
ipam:
config:
- subnet: ${CELL_NETWORK:-172.20.0.0/16}
name: cell-network
external: true
-389
@@ -1,389 +0,0 @@
# Personal Internet Cell - Network Configuration Guide
This guide explains how to configure networking for the Personal Internet Cell to provide internet access to WireGuard VPN clients.
## Table of Contents
1. [Overview](#overview)
2. [Network Architecture](#network-architecture)
3. [Quick Setup](#quick-setup)
4. [Detailed Configuration](#detailed-configuration)
5. [Troubleshooting](#troubleshooting)
6. [Advanced Configuration](#advanced-configuration)
7. [Security Considerations](#security-considerations)
## Overview
The Personal Internet Cell provides a complete VPN solution with internet access. This requires proper configuration of:
- **IP Forwarding**: Allow traffic to pass through the server
- **NAT (Network Address Translation)**: Translate private IPs to public IPs
- **Routing**: Direct traffic from VPN clients to the internet
- **Firewall Rules**: Control traffic flow and security
## Network Architecture
```
Internet
[Host Server] (195.178.106.244)
├── [Docker Network] (172.20.0.0/16)
│ └── [WireGuard Container] (cell-wireguard)
│ └── [WireGuard Interface] (wg0: 10.0.0.1/24)
└── [VPN Clients] (10.0.0.2-10.0.0.254/24)
└── [Internet Access via NAT]
```
### Key Components
- **Host Interface**: `eth0` (or main network interface)
- **WireGuard Interface**: `wg0` (10.0.0.1/24)
- **Client Network**: `10.0.0.0/24`
- **NAT Translation**: Client IPs → Host IP
## Quick Setup
### 1. Run the Network Configuration Script
```bash
# Make the script executable (if not already done)
chmod +x /opt/pic/scripts/setup-network.sh
# Run the configuration
sudo /opt/pic/scripts/setup-network.sh setup
```
### 2. Verify Configuration
```bash
# Check status
sudo /opt/pic/scripts/setup-network.sh status
# Test configuration
sudo /opt/pic/scripts/setup-network.sh test
```
### 3. Connect a VPN Client
Use the generated WireGuard configuration to connect a client. The client should now have internet access.
## Detailed Configuration
### IP Forwarding
IP forwarding allows the server to route packets between different network interfaces.
**Enable on Host:**
```bash
echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
sysctl -p
```
**Enable in Container:**
```bash
docker exec cell-wireguard sh -c "echo 1 > /proc/sys/net/ipv4/ip_forward"
```
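The forwarding flag can also be checked programmatically. The following is a minimal sketch, assuming a Linux host where `/proc/sys/net/ipv4/ip_forward` is readable; the helper names `ip_forward_enabled` and `read_ip_forward` are illustrative and not part of the project.

```python
from pathlib import Path


def ip_forward_enabled(raw: str) -> bool:
    """Interpret the contents of /proc/sys/net/ipv4/ip_forward ("1" means on)."""
    return raw.strip() == "1"


def read_ip_forward(path: str = "/proc/sys/net/ipv4/ip_forward") -> bool:
    """Read the kernel flag; report False if the file cannot be read."""
    try:
        return ip_forward_enabled(Path(path).read_text())
    except OSError:
        # Missing file (non-Linux host) or no permission: treat as disabled.
        return False


if __name__ == "__main__":
    print("IP forwarding enabled:", read_ip_forward())
```

Run it on the host, or inside the container with `docker exec`, to confirm the value the troubleshooting section expects.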
### NAT Configuration
NAT (Network Address Translation) allows VPN clients to access the internet using the server's public IP.
**Container NAT Rules:**
```bash
# Allow forwarding for WireGuard traffic
iptables -A FORWARD -i wg0 -j ACCEPT
iptables -A FORWARD -o wg0 -j ACCEPT
# NAT rule for internet access
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE
```
**Host NAT Rules:**
```bash
# Allow traffic from WireGuard network
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE
iptables -A FORWARD -i wg0 -j ACCEPT
iptables -A FORWARD -o wg0 -j ACCEPT
```
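To confirm that the MASQUERADE rule above is present, the output of `iptables -t nat -S POSTROUTING` can be scanned. This is a hedged sketch rather than project code: the function names are hypothetical, and the sample text assumes the rule-spec format printed by `iptables -S`.

```python
import shlex


def _option(tokens, flag):
    """Return the value that follows `flag` in a rule, or None if absent."""
    if flag in tokens:
        i = tokens.index(flag)
        if i + 1 < len(tokens):
            return tokens[i + 1]
    return None


def has_masquerade_rule(rules_text: str, source_cidr: str, out_iface: str) -> bool:
    """True if a POSTROUTING rule masquerades `source_cidr` leaving `out_iface`."""
    for line in rules_text.splitlines():
        tokens = shlex.split(line)
        # Only appended rules matter; skip policy lines such as "-P POSTROUTING ACCEPT".
        if tokens[:2] != ["-A", "POSTROUTING"]:
            continue
        if (_option(tokens, "-s") == source_cidr
                and _option(tokens, "-o") == out_iface
                and _option(tokens, "-j") == "MASQUERADE"):
            return True
    return False


SAMPLE = """-P POSTROUTING ACCEPT
-A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE
"""

if __name__ == "__main__":
    print(has_masquerade_rule(SAMPLE, "10.0.0.0/24", "eth0"))
```

In practice the text would come from running the `iptables` command as root; the check is kept as a pure function so it can be exercised without privileges.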
### Routing Configuration
**WireGuard Interface Setup:**
```bash
# Create WireGuard interface
ip link add dev wg0 type wireguard
# Set private key
wg set wg0 private-key /path/to/private-key
# Set listen port
wg set wg0 listen-port 51820
# Add IP address
ip addr add 10.0.0.1/24 dev wg0
# Bring interface up
ip link set wg0 up
# Add peers
wg set wg0 peer <public-key> allowed-ips 10.0.0.2/32
```
## Troubleshooting
### Common Issues
#### 1. VPN Connected but No Internet
**Symptoms:**
- WireGuard shows connected
- Can ping server (10.0.0.1)
- Cannot access internet
**Solutions:**
```bash
# Check IP forwarding
cat /proc/sys/net/ipv4/ip_forward
# Should return 1
# Check NAT rules
iptables -t nat -L POSTROUTING -n
# Should show MASQUERADE rule for 10.0.0.0/24
# Check forwarding rules
iptables -L FORWARD -n
# Should show ACCEPT rules for wg0
# Restart network configuration
sudo /opt/pic/scripts/setup-network.sh reset
sudo /opt/pic/scripts/setup-network.sh setup
```
#### 2. Cannot Connect to VPN
**Symptoms:**
- WireGuard client cannot connect
- No handshake in server logs
**Solutions:**
```bash
# Check WireGuard interface
docker exec cell-wireguard wg show
# Check if port 51820 is open
netstat -ulnp | grep 51820
# Check firewall rules
ufw status
iptables -L INPUT -n
# Check Docker port mapping
docker port cell-wireguard
```
#### 3. DNS Issues
**Symptoms:**
- Can ping IP addresses
- Cannot resolve domain names
**Solutions:**
```bash
# Check DNS configuration in client config
# Should include: DNS = 8.8.8.8, 1.1.1.1
# Test DNS from container
docker exec cell-wireguard nslookup google.com
# Check if DNS is being blocked
docker exec cell-wireguard iptables -L -n | grep 53
```
### Diagnostic Commands
```bash
# Check network status
sudo /opt/pic/scripts/setup-network.sh status
# Test connectivity from container
docker exec cell-wireguard ping -c 3 8.8.8.8
# Check routing table
docker exec cell-wireguard ip route show
# Check interface status
docker exec cell-wireguard ip addr show wg0
# Check NAT rules
docker exec cell-wireguard iptables -t nat -L -n
# Check forwarding rules
docker exec cell-wireguard iptables -L FORWARD -n
```
## Advanced Configuration
### Custom DNS Servers
To use custom DNS servers, modify the WireGuard client configuration:
```ini
[Interface]
PrivateKey = <private-key>
Address = 10.0.0.2/32
DNS = 1.1.1.1, 1.0.0.1, 8.8.8.8, 8.8.4.4
[Peer]
PublicKey = <server-public-key>
Endpoint = 195.178.106.244:51820
AllowedIPs = 0.0.0.0/0
PersistentKeepalive = 25
```
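A client configuration like the one above can be generated from a template. The sketch below is illustrative only: `render_client_config` is a hypothetical helper, and the key and endpoint values used with it are placeholders, not real credentials.

```python
def render_client_config(private_key, address, dns_servers,
                         server_public_key, endpoint,
                         allowed_ips="0.0.0.0/0", keepalive=25) -> str:
    """Render a WireGuard client config with a custom DNS list."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",
        # Multiple resolvers go on one comma-separated DNS line.
        f"DNS = {', '.join(dns_servers)}",
        "",
        "[Peer]",
        f"PublicKey = {server_public_key}",
        f"Endpoint = {endpoint}",
        f"AllowedIPs = {allowed_ips}",
        f"PersistentKeepalive = {keepalive}",
    ]
    # End with a newline so later appended blocks stay separated.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_client_config(
        "<private-key>", "10.0.0.2/32", ["1.1.1.1", "1.0.0.1"],
        "<server-public-key>", "203.0.113.10:51820",  # placeholder endpoint
    ))
```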
### Split Tunneling
To allow only specific traffic through the VPN:
```ini
[Peer]
AllowedIPs = 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
# Only route private networks through VPN
```
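The effect of an `AllowedIPs` list can be illustrated with Python's standard `ipaddress` module. This is a simplified sketch of the routing decision: a destination goes through the tunnel when it falls inside one of the listed networks. The name `routed_through_vpn` is hypothetical.

```python
import ipaddress

# The split-tunnel list from the example above: private ranges only.
PRIVATE_ONLY = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]


def routed_through_vpn(destination: str, allowed_ips) -> bool:
    """True if `destination` falls inside any network in `allowed_ips`."""
    addr = ipaddress.ip_address(destination)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_ips)


if __name__ == "__main__":
    for dest in ("10.0.0.5", "172.20.0.3", "8.8.8.8"):
        print(dest, routed_through_vpn(dest, PRIVATE_ONLY))
```

With the private-only list, public destinations such as `8.8.8.8` bypass the tunnel, while `AllowedIPs = 0.0.0.0/0` would send all IPv4 traffic through it.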
### Port Forwarding
To forward specific ports to VPN clients:
```bash
# Forward port 8080 to client 10.0.0.2
iptables -t nat -A PREROUTING -p tcp --dport 8080 -j DNAT --to-destination 10.0.0.2:8080
iptables -A FORWARD -p tcp -d 10.0.0.2 --dport 8080 -j ACCEPT
```
### Bandwidth Limiting
To limit bandwidth for VPN clients:
```bash
# Install tc (traffic control)
apt-get install iproute2
# Limit traffic sent to client 10.0.0.2 (its download direction) to 1 Mbit/s
tc qdisc add dev wg0 root handle 1: htb default 30
tc class add dev wg0 parent 1: classid 1:1 htb rate 1mbit
tc class add dev wg0 parent 1:1 classid 1:10 htb rate 1mbit ceil 1mbit
tc filter add dev wg0 protocol ip parent 1:0 prio 1 u32 match ip dst 10.0.0.2 flowid 1:10
```
## Security Considerations
### Firewall Rules
**Basic Security Rules:**
```bash
# Drop invalid packets
iptables -A INPUT -m conntrack --ctstate INVALID -j DROP
# Allow established connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# Allow WireGuard traffic
iptables -A INPUT -p udp --dport 51820 -j ACCEPT
# Allow SSH (if needed)
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
# Drop everything else
iptables -A INPUT -j DROP
```
### Client Isolation
To prevent clients from communicating with each other:
```bash
# Block inter-client communication
iptables -A FORWARD -i wg0 -o wg0 -j DROP
```
### Logging
To log VPN traffic:
```bash
# Log all WireGuard traffic
iptables -A FORWARD -i wg0 -j LOG --log-prefix "WG-FORWARD: "
iptables -A FORWARD -o wg0 -j LOG --log-prefix "WG-FORWARD: "
# Log NAT traffic
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -j LOG --log-prefix "WG-NAT: "
```
## Monitoring
### Real-time Monitoring
```bash
# Monitor WireGuard connections
watch -n 1 "docker exec cell-wireguard wg show"
# Monitor traffic
watch -n 1 "docker exec cell-wireguard wg show wg0 transfer"
# Monitor NAT rules
watch -n 1 "iptables -t nat -L POSTROUTING -n -v"
```
### Log Analysis
```bash
# Check system logs
journalctl -u pic-network.service -f
# Check iptables logs
tail -f /var/log/kern.log | grep WG-
# Check Docker logs
docker logs cell-wireguard -f
```
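Kernel log lines written by the `LOG` rules above carry `KEY=value` fields after the configured prefix, so they are easy to post-process. The sketch below assumes the `WG-FORWARD: ` and `WG-NAT: ` prefixes from the Logging section. The sample line is made up for illustration, and `parse_wg_log_line` is a hypothetical helper.

```python
import re

# Prefixes configured with --log-prefix in the Logging section.
LOG_PREFIXES = ("WG-FORWARD: ", "WG-NAT: ")
FIELD_RE = re.compile(r"(\w+)=(\S*)")


def parse_wg_log_line(line):
    """Return the KEY=value fields of a WG-* log line, or None if no prefix matches."""
    for prefix in LOG_PREFIXES:
        idx = line.find(prefix)
        if idx != -1:
            fields = dict(FIELD_RE.findall(line[idx + len(prefix):]))
            fields["prefix"] = prefix.strip().rstrip(":")
            return fields
    return None


# Illustrative line only; real timestamps and hostnames will differ.
sample = (
    "Jun 11 14:26:48 host kernel: [12345.678] WG-FORWARD: IN=wg0 OUT=eth0 "
    "MAC= SRC=10.0.0.2 DST=1.1.1.1 LEN=84 TTL=63 PROTO=ICMP"
)

if __name__ == "__main__":
    parsed = parse_wg_log_line(sample)
    print(parsed["prefix"], parsed["SRC"], "->", parsed["DST"], parsed["PROTO"])
```

Feeding each line of `/var/log/kern.log` through this function gives per-packet dictionaries that can be counted or filtered by `SRC`, `DST`, or `PROTO`.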
## Backup and Recovery
### Backup Configuration
```bash
# Backup iptables rules
iptables-save > /opt/pic/backups/iptables-backup-$(date +%Y%m%d).rules
# Backup WireGuard configuration
cp /opt/pic/config/wireguard/wg_confs/wg0.conf /opt/pic/backups/wg0-backup-$(date +%Y%m%d).conf
# Backup network script
cp /opt/pic/scripts/setup-network.sh /opt/pic/backups/setup-network-backup-$(date +%Y%m%d).sh
```
### Restore Configuration
```bash
# Restore iptables rules
iptables-restore < /opt/pic/backups/iptables-backup-YYYYMMDD.rules
# Restore WireGuard configuration
cp /opt/pic/backups/wg0-backup-YYYYMMDD.conf /opt/pic/config/wireguard/wg_confs/wg0.conf
docker restart cell-wireguard
```
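The restore commands use a `YYYYMMDD` placeholder, so the right backup has to be picked by hand. As a rough sketch, the newest WireGuard backup can be selected from the file names alone, assuming they follow the `wg0-backup-YYYYMMDD.conf` pattern produced by the backup step. `latest_backup` is a hypothetical helper, not an existing project script.

```python
import re
from datetime import datetime

# Matches the names produced by the backup command: wg0-backup-YYYYMMDD.conf
STAMP_RE = re.compile(r"^wg0-backup-(\d{8})\.conf$")


def latest_backup(names):
    """Return the file name with the newest valid YYYYMMDD stamp, or None."""
    dated = []
    for name in names:
        match = STAMP_RE.match(name)
        if not match:
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            continue  # eight digits that do not form a real date
        dated.append((stamp, name))
    return max(dated)[1] if dated else None


if __name__ == "__main__":
    names = ["wg0-backup-20250101.conf", "wg0-backup-20260611.conf", "notes.txt"]
    print(latest_backup(names))
```

In practice the names could come from `Path("/opt/pic/backups").glob("wg0-backup-*.conf")`, with the result passed to the `cp` command shown above.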
## Support
If you encounter issues:
1. Check the troubleshooting section above
2. Run the diagnostic commands
3. Check the logs for error messages
4. Verify your network configuration
5. Test with a simple client configuration
For additional help, check the main Personal Internet Cell documentation or create an issue in the project repository.
-51
@@ -1,51 +0,0 @@
#!/usr/bin/env python3
"""
Script to fix import statements in test files
"""
import os
import re
from pathlib import Path
def fix_imports_in_file(file_path):
"""Fix import statements in a test file"""
with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()
# Fix relative imports to absolute imports from api package
content = re.sub(r'from \.(\w+) import', r'from \1 import', content)
content = re.sub(r'import \.(\w+)', r'import \1', content)
# Add path setup if not present
if 'sys.path.insert' not in content and 'api_dir' not in content:
path_setup = '''import sys
from pathlib import Path
# Add api directory to path
api_dir = Path(__file__).parent.parent / 'api'
sys.path.insert(0, str(api_dir))
'''
# Insert after the first import line
lines = content.split('\n')
for i, line in enumerate(lines):
if line.startswith('import ') or line.startswith('from '):
lines.insert(i, path_setup.rstrip())
break
content = '\n'.join(lines)
with open(file_path, 'w', encoding='utf-8') as f:
f.write(content)
print(f"Fixed imports in {file_path}")
def main():
"""Fix all test files"""
tests_dir = Path('tests')
for test_file in tests_dir.glob('test_*.py'):
if test_file.name not in ['test_cli_tool.py', 'test_peer_registry.py']: # Already fixed
fix_imports_in_file(test_file)
if __name__ == '__main__':
main()
-31
@@ -1,31 +0,0 @@
#!/usr/bin/env python3
"""
Fix import statements in test files
"""
import os
import re
from pathlib import Path
def fix_imports_in_file(file_path):
"""Fix import statements in a test file"""
with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()
# Replace 'from api.' with 'from .'
content = re.sub(r'from api\.', 'from .', content)
content = re.sub(r'import api\.', 'import .', content)
with open(file_path, 'w', encoding='utf-8') as f:
f.write(content)
print(f"Fixed imports in {file_path}")
def main():
tests_dir = Path('tests')
for test_file in tests_dir.glob('test_*.py'):
fix_imports_in_file(test_file)
if __name__ == '__main__':
main()
+204 -63
@@ -22,9 +22,9 @@
# =============================================================================
#
# Usage:
# sudo bash install.sh # Standard install
# sudo bash install.sh --force # Bypass idempotency check
# sudo PIC_DIR=/srv/pic bash install.sh # Custom install directory
# bash install.sh # Standard install (uses sudo internally for packages)
# bash install.sh --force # Bypass idempotency check
# PIC_DIR=/srv/pic bash install.sh # Custom install directory
#
# Supported OS: Debian/Ubuntu (apt), Fedora/RHEL (dnf), Alpine Linux (apk)
#
@@ -42,19 +42,31 @@ API_HEALTH_URL="http://127.0.0.1:3000/health"
API_HEALTH_TIMEOUT=60
WEBUI_PORT=8081
FORCE=0
PIC_DEBUG="${PIC_DEBUG:-0}"
# Parse flags
for arg in "$@"; do
case "$arg" in
--force) FORCE=1 ;;
--debug) PIC_DEBUG=1 ;;
*)
echo "Unknown argument: $arg" >&2
echo "Usage: $0 [--force]" >&2
echo "Usage: $0 [--force] [--debug]" >&2
exit 1
;;
esac
done
# ---------------------------------------------------------------------------
# Log file — /var/log/pic-install.log when writable (root via sudo), else /tmp
# ---------------------------------------------------------------------------
if touch /var/log/pic-install.log 2>/dev/null; then
LOGFILE="/var/log/pic-install.log"
else
LOGFILE="${TMPDIR:-/tmp}/pic-install.log"
fi
: > "$LOGFILE" # truncate / create
# ---------------------------------------------------------------------------
# Color output
# ---------------------------------------------------------------------------
@@ -77,17 +89,81 @@ log_ok() { printf " ${GREEN}✓${RESET} %s\n" "$1"; }
log_warn() { printf " ${YELLOW}⚠${RESET} %s\n" "$1"; }
log_error() { printf "\n${RED}${BOLD}ERROR:${RESET}${RED} %s${RESET}\n" "$1" >&2; }
die() { log_error "$1"; exit 1; }
die() {
log_error "$1"
if [ "$PIC_DEBUG" -eq 0 ]; then
printf "\n${YELLOW}Last output (full log: %s):${RESET}\n" "$LOGFILE" >&2
tail -n 30 "$LOGFILE" | sed 's/^/ /' >&2
fi
exit 1
}
# ---------------------------------------------------------------------------
# run_step <label_in_progress> <label_done> <cmd> [args...]
#
# Default mode: redirect stdout+stderr to LOGFILE; print a single "in
# progress" line then overwrite it with a checkmark on success. On failure
# print the last 30 log lines and die.
#
# Debug mode (PIC_DEBUG=1): tee output to LOGFILE AND stdout (indented),
# print the done line at the end.
#
# TERM safety: when stdout is not a TTY the \r trick does not work, so we
# fall back to a plain two-line "... / done" style.
# ---------------------------------------------------------------------------
_IS_TTY=0
[ -t 1 ] && _IS_TTY=1
run_step() {
local label_running="$1"
local label_done="$2"
shift 2
# "$@" is the command to run
if [ "$PIC_DEBUG" -eq 1 ]; then
printf " → %s\n" "$label_running"
# set -o pipefail: the pipeline below fails if "$@" fails, regardless
# of tee's or sed's exit code.
{ "$@" 2>&1 | tee -a "$LOGFILE" | sed 's/^/ /'; } || \
die "Command failed. See $LOGFILE for details."
log_ok "$label_done"
return
fi
# Default (quiet) mode
if [ "$_IS_TTY" -eq 1 ]; then
printf " → %s..." "$label_running"
else
printf " → %s...\n" "$label_running"
fi
local exit_code=0
"$@" >> "$LOGFILE" 2>&1 || exit_code=$?
if [ "$exit_code" -ne 0 ]; then
[ "$_IS_TTY" -eq 1 ] && printf "\n"
die "Step failed: $label_running"
fi
if [ "$_IS_TTY" -eq 1 ]; then
printf "\r ${GREEN}✓${RESET} %-60s\n" "$label_done"
else
printf " ${GREEN}✓${RESET} %s\n" "$label_done"
fi
}
TOTAL_STEPS=7
# ---------------------------------------------------------------------------
# Must run as root
# Sudo check — we need it for package installs and system user creation
# ---------------------------------------------------------------------------
if [ "$(id -u)" -ne 0 ]; then
die "This installer must be run as root (use sudo)."
if ! command -v sudo >/dev/null 2>&1; then
die "sudo is required. Install it and ensure your user has sudo access."
fi
printf " Full log: %s\n" "$LOGFILE"
[ "$PIC_DEBUG" -eq 1 ] && printf " ${YELLOW}Debug mode enabled — verbose output active${RESET}\n"
# ---------------------------------------------------------------------------
# Idempotency guard
# ---------------------------------------------------------------------------
@@ -145,43 +221,72 @@ log_ok "Detected OS: ${OS_ID} (package manager: ${PKG_MANAGER})"
# ---------------------------------------------------------------------------
log_step 2 "Installing dependencies..."
_install_deps() {
case "$PKG_MANAGER" in
apt)
export DEBIAN_FRONTEND=noninteractive
sudo apt-get update -qq
sudo apt-get install -y -qq git curl make docker.io docker-compose-plugin || true
if ! docker compose version >/dev/null 2>&1; then
sudo apt-get install -y -qq docker-compose || true
fi
# Ensure host clock is synchronised before DDNS/TOTP registration.
sudo apt-get install -y -qq chrony || true
if sudo systemctl enable --now chrony >/dev/null 2>&1; then
: # NTP enabled
elif sudo systemctl enable --now chronyd >/dev/null 2>&1; then
: # NTP enabled
fi
;;
dnf)
sudo dnf install -y -q git curl make docker || true
sudo systemctl enable --now docker >/dev/null 2>&1 || true
if ! docker compose version >/dev/null 2>&1; then
sudo dnf install -y -q docker-compose-plugin || true
fi
sudo dnf install -y -q chrony || true
sudo systemctl enable --now chronyd >/dev/null 2>&1 || true
;;
apk)
sudo apk add --quiet git curl make docker docker-cli-compose || true
sudo rc-update add docker default >/dev/null 2>&1 || true
sudo service docker start >/dev/null 2>&1 || true
sudo apk add --quiet chrony || true
sudo rc-update add chronyd default >/dev/null 2>&1 || true
sudo service chronyd start >/dev/null 2>&1 || true
;;
esac
}
run_step "Installing system packages" "System packages installed" _install_deps
# Report NTP status (informational, outside the noisy run_step)
case "$PKG_MANAGER" in
apt)
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq
apt-get install -y -qq git curl make docker.io docker-compose-plugin 2>&1 \
| grep -v "^$" | sed 's/^/ /' || true
# Verify docker compose plugin installed
if ! docker compose version >/dev/null 2>&1; then
log_warn "docker-compose-plugin not available; falling back to standalone docker-compose"
apt-get install -y -qq docker-compose 2>&1 | grep -v "^$" | sed 's/^/ /' || true
if sudo systemctl is-active --quiet chrony 2>/dev/null || \
sudo systemctl is-active --quiet chronyd 2>/dev/null; then
log_ok "Host NTP (chrony) is running"
else
log_warn "Could not start chrony — verify host clock is accurate before running the setup wizard"
fi
;;
dnf)
dnf install -y -q git curl make docker 2>&1 | sed 's/^/ /' || true
# Enable and start Docker (dnf installs but doesn't enable it)
systemctl enable --now docker >/dev/null 2>&1 || true
# Docker Compose plugin comes bundled with the Docker CE package on Fedora/RHEL.
# If not present, install via the docker-compose-plugin package (Docker CE repo).
if ! docker compose version >/dev/null 2>&1; then
log_warn "docker compose plugin not found; installing docker-compose-plugin..."
dnf install -y -q docker-compose-plugin 2>&1 | sed 's/^/ /' || true
dnf|apk)
if sudo systemctl is-active --quiet chronyd 2>/dev/null || \
sudo service chronyd status >/dev/null 2>&1; then
log_ok "Host NTP (chronyd) is running"
else
log_warn "Could not start chronyd — verify host clock is accurate before running the setup wizard"
fi
;;
apk)
apk add --quiet git curl make docker docker-cli-compose 2>&1 | sed 's/^/ /' || true
# Enable Docker on Alpine (OpenRC)
rc-update add docker default >/dev/null 2>&1 || true
service docker start >/dev/null 2>&1 || true
;;
esac
# Final sanity checks
@@ -204,10 +309,10 @@ log_step 3 "Configuring system user..."
if ! id "$PIC_USER" >/dev/null 2>&1; then
case "$PKG_MANAGER" in
apk)
adduser -S -D -H -s /sbin/nologin "$PIC_USER"
sudo adduser -S -D -H -s /sbin/nologin "$PIC_USER"
;;
*)
useradd --system --no-create-home --shell /usr/sbin/nologin "$PIC_USER"
sudo useradd --system --no-create-home --shell /usr/sbin/nologin "$PIC_USER"
;;
esac
log_ok "Created system user: ${PIC_USER}"
@@ -215,17 +320,18 @@ else
log_ok "System user already exists: ${PIC_USER}"
fi
# Ensure docker group exists and user is in it
# Ensure docker group exists and invoking user is in it
if ! getent group docker >/dev/null 2>&1; then
groupadd docker
sudo groupadd docker
log_ok "Created docker group"
fi
if ! id -nG "$PIC_USER" | grep -qw docker; then
usermod -aG docker "$PIC_USER"
log_ok "Added ${PIC_USER} to docker group"
CURRENT_USER="${USER:-$(id -un)}"
if ! id -nG "$CURRENT_USER" | grep -qw docker; then
sudo usermod -aG docker "$CURRENT_USER"
log_ok "Added ${CURRENT_USER} to docker group (re-login or newgrp docker to apply)"
else
log_ok "${PIC_USER} is already in docker group"
log_ok "${CURRENT_USER} is already in docker group"
fi
# ---------------------------------------------------------------------------
@@ -235,31 +341,54 @@ log_step 4 "Setting up repository at ${PIC_DIR}..."
if [ -d "${PIC_DIR}/.git" ]; then
log_warn "Repository already cloned — running git pull"
git -C "$PIC_DIR" pull --ff-only 2>&1 | sed 's/^/ /'
log_ok "Repository updated"
run_step "Updating repository" "Repository updated" \
git -C "$PIC_DIR" pull --ff-only
elif [ -d "$PIC_DIR" ] && [ "$(ls -A "$PIC_DIR" 2>/dev/null)" ]; then
die "${PIC_DIR} exists and is not empty and is not a git repo. Aborting to avoid data loss."
else
mkdir -p "$(dirname "$PIC_DIR")"
git clone "$PIC_REPO" "$PIC_DIR" 2>&1 | sed 's/^/ /'
log_ok "Repository cloned to ${PIC_DIR}"
run_step "Cloning repository" "Repository cloned to ${PIC_DIR}" \
git clone "$PIC_REPO" "$PIC_DIR"
fi
# Ensure the pic user owns the directory
chown -R "${PIC_USER}:${PIC_USER}" "$PIC_DIR"
sudo git config --system --add safe.directory "$PIC_DIR" 2>/dev/null || true
# The cosign public key ships in the repo and is bind-mounted into cell-api so
# store-service image signatures can be verified offline. It is checked in
# (config/cosign/cosign.pub), so the clone above should already provide it;
# warn loudly if it is somehow missing rather than silently skipping verify.
COSIGN_PUBKEY="${PIC_DIR}/config/cosign/cosign.pub"
if [ -f "$COSIGN_PUBKEY" ]; then
log_ok "cosign public key present at ${COSIGN_PUBKEY}"
else
log_warn "cosign public key missing at ${COSIGN_PUBKEY} — image signature verification will be unavailable"
fi
# ---------------------------------------------------------------------------
# Step 5 — Run make install
# ---------------------------------------------------------------------------
log_step 5 "Running 'make install'..."
log_step 5 "Generating configuration..."
# make install generates config, writes the systemd unit, and touches .installed.
# We run it as the pic user (via sudo -u) so files get correct ownership, but
# make install itself calls sudo internally where root is needed.
cd "$PIC_DIR"
if ! make install 2>&1 | sed 's/^/ /'; then
die "'make install' failed. Check the output above."
# run_step routes all output to LOGFILE. After it returns we scan LOGFILE
# for the admin password banner (printed once by setup_cell.py) and relay it
# to the user — it must never be silently buried in the log.
# We record the log byte-offset before the step so we only scan new output.
_LOG_OFFSET_BEFORE="$(wc -c < "$LOGFILE" 2>/dev/null || echo 0)"
run_step "Generating configuration" "Configuration generated" make install
# Extract only the lines added by this step.
_NEW_LOG="$(tail -c +"$(( _LOG_OFFSET_BEFORE + 1 ))" "$LOGFILE" 2>/dev/null || true)"
# Relay admin password banner if present.
if printf '%s\n' "$_NEW_LOG" | grep -qiE "(ADMIN PASSWORD|shown once)"; then
printf "\n"
printf '%s\n' "$_NEW_LOG" \
| awk '/ADMIN PASSWORD|shown once|={6}/{found=1} found{print} found && /^[[:space:]]*$/{exit}' \
| sed 's/^/ /'
printf "\n"
fi
log_ok "'make install' complete"
@@ -271,12 +400,24 @@ log_step 6 "Starting core services..."
cd "$PIC_DIR"
if ! make start-core 2>&1 | sed 's/^/ /'; then
die "'make start-core' failed. Check the output above."
fi
run_step \
"Downloading container images (first run can take a few minutes)" \
"Container images ready" \
make start-core
log_ok "Core services started"
# Enable and start the pic systemd unit so the stack survives a reboot.
# Skipped on Alpine (OpenRC) and on systems without systemd.
if command -v systemctl >/dev/null 2>&1; then
sudo systemctl daemon-reload 2>/dev/null || true
if sudo systemctl enable --now pic 2>/dev/null; then
log_ok "systemd unit pic.service enabled and started"
else
log_warn "Could not enable pic.service — run: sudo systemctl enable --now pic"
fi
fi
# ---------------------------------------------------------------------------
# Step 7 — Health check + print wizard URL
# ---------------------------------------------------------------------------
@@ -318,7 +459,7 @@ printf "\n${GREEN}${BOLD}=======================================================
printf "${GREEN}${BOLD} PIC installed successfully!${RESET}\n"
printf "${GREEN}${BOLD}============================================================${RESET}\n"
printf "\n"
printf " Open the setup wizard at:\n"
printf " Open the setup wizard to configure your cell:\n"
printf "\n"
printf " ${BOLD}http://${HOST_IP}:${WEBUI_PORT}/setup${RESET}\n"
printf "\n"
+7
@@ -0,0 +1,7 @@
FROM alpine:3.20@sha256:d9e853e87e55526f6b2917df91a2115c36dd7c696a35be12163d44e6e2a4b6bc
RUN apk add --no-cache chrony \
&& mkdir -p /var/run/chrony /var/lib/chrony /var/log/chrony
# chrony.conf is mounted at /etc/chrony/chrony.conf by compose.
ENTRYPOINT ["chronyd", "-d", "-n", "-f", "/etc/chrony/chrony.conf"]
+86
@@ -0,0 +1,86 @@
#!/usr/bin/env python3
"""
Update the cell's DDNS record with the current public IP.

Called by: make ddns-update
           systemd timer (optional, see scripts/pic-ddns-update.timer)

Reads the DDNS token from data/api/.ddns_token (written by setup_cell.py).
Exits 0 on success or if already up to date, non-zero on failure.
"""
import json
import os
import sys
import urllib.error
import urllib.request

DDNS_URL = os.environ.get('DDNS_URL', 'https://ddns.pic.ngo/api/v1')
ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
TOKEN_FILE = os.path.join(ROOT, 'data', 'api', '.ddns_token')
IP_CACHE_FILE = os.path.join(ROOT, 'data', 'api', '.ddns_last_ip')


def get_public_ip() -> str:
    return urllib.request.urlopen('https://api.ipify.org', timeout=5).read().decode().strip()


def read_token() -> str:
    if not os.path.exists(TOKEN_FILE):
        print('ERROR: DDNS token not found. Run "make setup" to register.', file=sys.stderr)
        sys.exit(1)
    return open(TOKEN_FILE).read().strip()


def read_last_ip() -> str:
    try:
        return open(IP_CACHE_FILE).read().strip()
    except FileNotFoundError:
        return ''


def write_last_ip(ip: str) -> None:
    with open(IP_CACHE_FILE, 'w') as f:
        f.write(ip)


def main() -> int:
    try:
        public_ip = get_public_ip()
    except Exception as e:
        print(f'ERROR: Could not detect public IP: {e}', file=sys.stderr)
        return 1
    last_ip = read_last_ip()
    if public_ip == last_ip:
        print(f'DDNS: IP unchanged ({public_ip}) — no update needed')
        return 0
    token = read_token()
    data = json.dumps({'token': token, 'ip': public_ip}).encode()
    req = urllib.request.Request(
        f'{DDNS_URL}/update',
        data=data,
        headers={'Content-Type': 'application/json'},
        method='PUT',
    )
    try:
        resp = urllib.request.urlopen(req, timeout=10)
        result = json.loads(resp.read())
        if result.get('updated'):
            write_last_ip(public_ip)
            print(f'DDNS: Updated to {public_ip}')
            return 0
        else:
            print(f'ERROR: Unexpected response: {result}', file=sys.stderr)
            return 1
    except urllib.error.HTTPError as e:
        body = e.read().decode()
        print(f'ERROR: DDNS update failed ({e.code}): {body}', file=sys.stderr)
        return 1
    except Exception as e:
        print(f'ERROR: DDNS update failed: {e}', file=sys.stderr)
        return 1


if __name__ == '__main__':
    sys.exit(main())
+53 -48
@@ -1,60 +1,65 @@
import requests
from bs4 import BeautifulSoup
import json
import sys
import urllib.request
import urllib.error
# Updated endpoints to use HTTPS
SERVICES = [
{"name": "Dashboard UI", "url": "https://localhost/"},
{"name": "Mail UI", "url": "https://localhost/mail"},
{"name": "Calendar UI", "url": "https://localhost/calendar"},
{"name": "Files UI", "url": "https://localhost/files"},
{"name": "DNS Management UI", "url": "https://localhost/dns"},
{"name": "API Health", "url": "https://localhost/api/health", "is_api": True},
{"name": "API Service Status", "url": "https://localhost/api/services/status", "is_api": True},
BASE = "http://127.0.0.1:3000"
CORE_CHECKS = [
{"name": "API health", "path": "/health"},
{"name": "API status", "path": "/api/status"},
{"name": "Active services", "path": "/api/services/active"},
]
def check_ui(url, name):
try:
resp = requests.get(url, timeout=5, verify=False)
if resp.status_code == 200:
# Try to parse HTML and look for a title or main element
soup = BeautifulSoup(resp.text, "html.parser")
title = soup.title.string if soup.title else "No title"
print(f"[OK] {name} ({url}) - {title}")
return True
else:
print(f"[FAIL] {name} ({url}) - HTTP {resp.status_code}")
return False
except Exception as e:
print(f"[ERROR] {name} ({url}) - {e}")
return False
OPTIONAL_SERVICE_CHECKS = {
"email": {"name": "Email status", "path": "/api/email/status"},
"calendar": {"name": "Calendar status", "path": "/api/calendar/status"},
"files": {"name": "Files status", "path": "/api/files/status"},
}
def check_api_status(url, name):
def get(path):
try:
resp = requests.get(url, timeout=5, verify=False)
if resp.status_code == 200:
print(f"[OK] {name}: {url}")
if 'services/status' in url:
data = resp.json()
for service, status in data.items():
s = status.get("status", "Unknown")
print(f" {service}: {s}")
else:
print(f" Response: {resp.text.strip()}")
return True
else:
print(f"[FAIL] {name}: HTTP {resp.status_code}")
return False
resp = urllib.request.urlopen(BASE + path, timeout=5)
body = resp.read().decode()
return resp.status, body
except urllib.error.HTTPError as e:
return e.code, e.read().decode()
except Exception as e:
print(f"[ERROR] {name}: {e}")
return False
return None, str(e)
def main():
print("=== UI & API Sanity Checks (Caddy-proxied, HTTPS) ===")
for svc in SERVICES:
if svc.get("is_api"):
check_api_status(svc["url"], svc["name"])
print("=== PIC Sanity Check ===")
for chk in CORE_CHECKS:
code, body = get(chk["path"])
if code == 200:
print(f"[OK] {chk['name']}")
else:
check_ui(svc["url"], svc["name"])
print(f"[FAIL] {chk['name']} — HTTP {code}: {body[:120]}")
# Discover installed services and check only those
code, body = get("/api/services/active")
installed_ids = set()
if code == 200:
try:
installed_ids = {svc["id"] for svc in json.loads(body)}
except Exception:
pass
print()
print("Optional services:")
for svc_id, chk in OPTIONAL_SERVICE_CHECKS.items():
if svc_id not in installed_ids:
print(f"[SKIP] {chk['name']} — not installed")
continue
code, body = get(chk["path"])
if code == 200:
print(f"[OK] {chk['name']}")
else:
print(f"[FAIL] {chk['name']} — HTTP {code}: {body[:120]}")
if __name__ == "__main__":
main()
+138 -31
@@ -17,38 +17,26 @@ import sys
REQUIRED_DIRS = [
'config/caddy/certs',
'config/dns',
'config/dhcp',
'config/ntp',
'config/mail/config',
'config/mail/ssl',
'config/radicale',
'config/webdav',
'config/wireguard',
'config/wireguard/wg_confs',
'config/api',
'data/caddy',
'data/dns',
'data/dhcp',
'data/maildata',
'data/mailstate',
'data/maillogs',
'data/radicale',
'data/files',
'data/api',
'data/vault/certs',
'data/vault/keys',
'data/vault/trust',
'data/vault/ca',
'data/logs',
'data/services',
'data/wireguard/keys/peers',
'data/wireguard/wg_confs',
]
REQUIRED_FILES = [
'config/dns/Corefile',
'config/dhcp/dnsmasq.conf',
'config/ntp/chrony.conf',
'config/mail/mailserver.env',
'config/webdav/users.passwd',
]
ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
@@ -146,9 +134,11 @@ def generate_wg_keys():
def write_wg0_conf(private_key: str, address: str, port: int):
wg_conf = os.path.join(ROOT, 'config', 'wireguard', 'wg0.conf')
wg_confs_dir = os.path.join(ROOT, 'config', 'wireguard', 'wg_confs')
os.makedirs(wg_confs_dir, exist_ok=True)
wg_conf = os.path.join(wg_confs_dir, 'wg0.conf')
if os.path.exists(wg_conf):
print('[EXISTS] config/wireguard/wg0.conf')
print('[EXISTS] config/wireguard/wg_confs/wg0.conf')
return
server_ip = address.split('/')[0]
content = (
@@ -157,19 +147,18 @@ def write_wg0_conf(private_key: str, address: str, port: int):
f'Address = {address}\n'
f'ListenPort = {port}\n'
f'PostUp = iptables -A FORWARD -i %i -j ACCEPT; '
f'iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE; '
f'sysctl -q net.ipv4.conf.all.rp_filter=0\n'
f'iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE\n'
f'PostDown = iptables -D FORWARD -i %i -j ACCEPT; '
f'iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE; '
f'sysctl -q net.ipv4.conf.all.rp_filter=1\n'
f'iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE\n'
)
with open(wg_conf, 'w') as f:
f.write(content)
os.chmod(wg_conf, 0o600)
print(f'[CREATED] config/wireguard/wg0.conf address={address} port={port}')
print(f'[CREATED] config/wireguard/wg_confs/wg0.conf address={address} port={port}')
def write_cell_config(cell_name: str, domain: str, port: int):
def write_cell_config(cell_name: str, domain: str, port: int,
domain_mode: str, domain_name: str) -> None:
cfg_path = os.path.join(ROOT, 'config', 'api', 'cell_config.json')
if os.path.exists(cfg_path):
try:
@@ -179,17 +168,46 @@ def write_cell_config(cell_name: str, domain: str, port: int):
return
except Exception:
pass
ddns: dict = {}
if domain_mode == 'pic_ngo':
ddns = {
'provider': 'pic_ngo',
'api_base_url': DDNS_URL.replace('/api/v1', ''),
'totp_secret': DDNS_TOTP_SECRET,
'enabled': True,
}
elif domain_mode == 'cloudflare':
ddns = {'provider': 'cloudflare', 'enabled': True}
if CLOUDFLARE_TOKEN:
ddns['api_token'] = CLOUDFLARE_TOKEN
elif domain_mode == 'duckdns':
ddns = {'provider': 'duckdns', 'enabled': True}
if DUCKDNS_TOKEN:
ddns['token'] = DUCKDNS_TOKEN
if DUCKDNS_SUBDOMAIN:
ddns['subdomain'] = DUCKDNS_SUBDOMAIN
elif domain_mode == 'http01':
ddns = {'provider': 'http01', 'enabled': True}
else: # lan
ddns = {'provider': 'none', 'enabled': False}
config = {
'_identity': {
'cell_name': cell_name,
'domain': domain,
'domain_mode': domain_mode,
'domain_name': domain_name,
'ip_range': '172.20.0.0/16',
'wireguard_port': port,
}
},
'ddns': ddns,
}
with open(cfg_path, 'w') as f:
json.dump(config, f, indent=2)
print(f'[CREATED] config/api/cell_config.json name={cell_name} domain={domain}')
os.chmod(cfg_path, 0o600)
print(f'[CREATED] config/api/cell_config.json name={cell_name} mode={domain_mode}'
+ (f' domain={domain_name}' if domain_name else ''))
def write_compose_env(ip_range: str):
@@ -238,6 +256,82 @@ def ensure_session_secret():
print('[CREATED] data/api/.session_secret')
DDNS_URL = os.environ.get('DDNS_URL', 'http://ddns.pic.ngo:8080/api/v1')
DDNS_TOTP_SECRET = os.environ.get('DDNS_TOTP_SECRET', 'S6UMA464YIKM74QHXWL5WELDIO3HFZ6K')
DOMAIN_MODE = os.environ.get('DOMAIN_MODE', 'lan')
CELL_DOMAIN_NAME = os.environ.get('CELL_DOMAIN_NAME', '')
CLOUDFLARE_TOKEN = os.environ.get('CLOUDFLARE_API_TOKEN', '')
DUCKDNS_TOKEN = os.environ.get('DUCKDNS_TOKEN', '')
DUCKDNS_SUBDOMAIN= os.environ.get('DUCKDNS_SUBDOMAIN', '')
def register_with_ddns(cell_name: str) -> None:
"""Register cell_name.pic.ngo with the DDNS server using TOTP auth.
Idempotent: if a token file already exists the registration is skipped.
Skipped silently if DDNS_TOTP_SECRET is not set.
"""
token_path = os.path.join(ROOT, 'data', 'api', '.ddns_token')
if os.path.exists(token_path):
print('[EXISTS] DDNS registration — token already present')
return
if not DDNS_TOTP_SECRET:
print('[SKIP] DDNS_TOTP_SECRET not set — skipping DDNS registration')
return
import urllib.request
import urllib.error
# Detect public IP
try:
public_ip = urllib.request.urlopen(
'https://api.ipify.org', timeout=5
).read().decode().strip()
except Exception as e:
print(f'[WARN] Could not detect public IP: {e} — skipping DDNS registration')
return
# Generate TOTP using stdlib only — no third-party package needed
otp = ''
try:
import base64 as _b64, hashlib as _hl, hmac as _hmac, struct as _struct
import time as _time
_key = _b64.b32decode(DDNS_TOTP_SECRET.upper())
_t = int(_time.time()) // 30
_h = _hmac.new(_key, _struct.pack('>Q', _t), _hl.sha1).digest()
_offset = _h[-1] & 0xF
_code = _struct.unpack('>I', _h[_offset:_offset + 4])[0] & 0x7FFFFFFF
otp = f'{_code % 1_000_000:06d}'
except Exception as e:
print(f'[WARN] Could not generate OTP: {e} — registering without OTP header')
data = json.dumps({'name': cell_name, 'ip': public_ip}).encode()
headers = {'Content-Type': 'application/json'}
if otp:
headers['X-Register-OTP'] = otp
req = urllib.request.Request(
f'{DDNS_URL}/register',
data=data,
headers=headers,
method='POST',
)
try:
resp = urllib.request.urlopen(req, timeout=10)
result = json.loads(resp.read())
token = result['token']
os.makedirs(os.path.dirname(token_path), exist_ok=True)
with open(token_path, 'w') as f:
f.write(token)
os.chmod(token_path, 0o600)
print(f'[CREATED] DDNS registration: {result["subdomain"]} ip={public_ip}')
except urllib.error.HTTPError as e:
body = e.read().decode()
print(f'[WARN] DDNS registration failed ({e.code}): {body}')
except Exception as e:
print(f'[WARN] DDNS registration failed: {e}')
def bootstrap_admin_password():
import secrets as _secrets
users_file = os.path.join(ROOT, 'data', 'api', 'auth_users.json')
@@ -279,15 +373,28 @@ def bootstrap_admin_password():
def main():
cell_name = os.environ.get('CELL_NAME', 'mycell')
domain = os.environ.get('CELL_DOMAIN', 'cell')
cell_name = os.environ.get('CELL_NAME', 'mycell')
domain_mode = DOMAIN_MODE # module-level, read from env
domain_name = CELL_DOMAIN_NAME
# Derive the legacy 'domain' TLD field and fill in domain_name if empty
if domain_mode == 'pic_ngo':
domain = 'pic.ngo'
if not domain_name:
domain_name = f'{cell_name}.pic.ngo'
elif domain_mode == 'lan':
domain = os.environ.get('CELL_DOMAIN', 'cell')
domain_name = ''
else:
# cloudflare / duckdns / http01 — domain_name is the full FQDN
domain = domain_name
vpn_address = os.environ.get('VPN_ADDRESS', '10.0.0.1/24')
wg_port = int(os.environ.get('WG_PORT', '51820'))
# Prefer existing config ip_range over env var so `make setup` is safe to re-run
ip_range = os.environ.get('CELL_IP_RANGE') or _read_existing_ip_range() or '172.20.0.0/16'
wg_port = int(os.environ.get('WG_PORT', '51820'))
ip_range = os.environ.get('CELL_IP_RANGE') or _read_existing_ip_range() or '172.20.0.0/16'
print('--- Personal Internet Cell: Setup ---')
print(f' cell={cell_name} domain={domain} vpn={vpn_address} port={wg_port}')
print(f' cell={cell_name} mode={domain_mode} domain={domain_name or "(lan)"} vpn={vpn_address} port={wg_port}')
print()
for d in REQUIRED_DIRS:
@@ -298,7 +405,7 @@ def main():
ensure_caddy_ca_cert()
priv, _pub = generate_wg_keys()
write_wg0_conf(priv, vpn_address, wg_port)
write_cell_config(cell_name, domain, wg_port)
write_cell_config(cell_name, domain, wg_port, domain_mode, domain_name)
write_compose_env(ip_range)
write_caddy_config(ip_range, cell_name, domain)
ensure_session_secret()
-559
@@ -1,559 +0,0 @@
#!/usr/bin/env python3
"""
Comprehensive tests for Flask app endpoints
"""
import unittest
import sys
import os
import tempfile
import shutil
import json
from pathlib import Path
from unittest.mock import patch, MagicMock
# Add api directory to path
api_dir = Path(__file__).parent / 'api'
sys.path.insert(0, str(api_dir))
class TestFlaskAppEndpoints(unittest.TestCase):
def setUp(self):
"""Set up test environment"""
# Create temporary directories
self.test_dir = tempfile.mkdtemp()
self.data_dir = os.path.join(self.test_dir, 'data')
self.config_dir = os.path.join(self.test_dir, 'config')
os.makedirs(self.data_dir, exist_ok=True)
os.makedirs(self.config_dir, exist_ok=True)
# Set environment variables
os.environ['TESTING'] = 'true'
os.environ['LOG_LEVEL'] = 'ERROR'
# Import and create app
from app import app
self.app = app
self.client = app.test_client()
# Mock external dependencies
self.patchers = []
# Mock subprocess.run
subprocess_patcher = patch('subprocess.run')
self.mock_subprocess = subprocess_patcher.start()
self.mock_subprocess.return_value.returncode = 0
self.mock_subprocess.return_value.stdout = b"test output"
self.patchers.append(subprocess_patcher)
# Mock docker
docker_patcher = patch('docker.from_env')
self.mock_docker = docker_patcher.start()
self.mock_docker_client = MagicMock()
self.mock_docker.return_value = self.mock_docker_client
self.patchers.append(docker_patcher)
# Mock file operations
file_patcher = patch('builtins.open', create=True)
self.mock_file = file_patcher.start()
self.mock_file.return_value.__enter__.return_value.read.return_value = '{}'
self.patchers.append(file_patcher)
def tearDown(self):
"""Clean up test environment"""
shutil.rmtree(self.test_dir)
for patcher in self.patchers:
patcher.stop()
def test_health_endpoint(self):
"""Test /health endpoint"""
response = self.client.get('/health')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_status_endpoint(self):
"""Test /api/status endpoint"""
response = self.client.get('/api/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_config_get_endpoint(self):
"""Test GET /api/config endpoint"""
response = self.client.get('/api/config')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, dict)
def test_api_config_put_endpoint(self):
"""Test PUT /api/config endpoint"""
test_config = {'test': 'value'}
response = self.client.put('/api/config',
data=json.dumps(test_config),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_config_backup_endpoint(self):
"""Test POST /api/config/backup endpoint"""
response = self.client.post('/api/config/backup')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('backup_id', data)
def test_api_config_backups_endpoint(self):
"""Test GET /api/config/backups endpoint"""
response = self.client.get('/api/config/backups')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
def test_api_config_restore_endpoint(self):
"""Test POST /api/config/restore/<backup_id> endpoint"""
response = self.client.post('/api/config/restore/test_backup')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_config_export_endpoint(self):
"""Test GET /api/config/export endpoint"""
response = self.client.get('/api/config/export')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, dict)
def test_api_config_import_endpoint(self):
"""Test POST /api/config/import endpoint"""
test_config = {'test': 'value'}
response = self.client.post('/api/config/import',
data=json.dumps(test_config),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_services_bus_status_endpoint(self):
"""Test GET /api/services/bus/status endpoint"""
response = self.client.get('/api/services/bus/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('services', data)
def test_api_services_bus_events_endpoint(self):
"""Test GET /api/services/bus/events endpoint"""
response = self.client.get('/api/services/bus/events')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
def test_api_services_bus_start_endpoint(self):
"""Test POST /api/services/bus/services/<service_name>/start endpoint"""
response = self.client.post('/api/services/bus/services/test/start')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_services_bus_stop_endpoint(self):
"""Test POST /api/services/bus/services/<service_name>/stop endpoint"""
response = self.client.post('/api/services/bus/services/test/stop')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_services_bus_restart_endpoint(self):
"""Test POST /api/services/bus/services/<service_name>/restart endpoint"""
response = self.client.post('/api/services/bus/services/test/restart')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_logs_services_endpoint(self):
"""Test GET /api/logs/services/<service> endpoint"""
response = self.client.get('/api/logs/services/test')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
def test_api_logs_search_endpoint(self):
"""Test POST /api/logs/search endpoint"""
search_data = {'query': 'test', 'level': 'INFO'}
response = self.client.post('/api/logs/search',
data=json.dumps(search_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
def test_api_logs_export_endpoint(self):
"""Test POST /api/logs/export endpoint"""
export_data = {'format': 'json', 'filters': {}}
response = self.client.post('/api/logs/export',
data=json.dumps(export_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('export_path', data)
def test_api_logs_statistics_endpoint(self):
"""Test GET /api/logs/statistics endpoint"""
response = self.client.get('/api/logs/statistics')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('total_entries', data)
def test_api_logs_rotate_endpoint(self):
"""Test POST /api/logs/rotate endpoint"""
response = self.client.post('/api/logs/rotate')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_dns_records_endpoints(self):
"""Test DNS records endpoints"""
# GET
response = self.client.get('/api/dns/records')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST
record_data = {'name': 'test.example.com', 'type': 'A', 'value': '192.168.1.1'}
response = self.client.post('/api/dns/records',
data=json.dumps(record_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# DELETE
response = self.client.delete('/api/dns/records',
data=json.dumps(record_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_dhcp_endpoints(self):
"""Test DHCP endpoints"""
# GET leases
response = self.client.get('/api/dhcp/leases')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST reservation
reservation_data = {'mac': '00:11:22:33:44:55', 'ip': '192.168.1.100'}
response = self.client.post('/api/dhcp/reservations',
data=json.dumps(reservation_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# DELETE reservation
response = self.client.delete('/api/dhcp/reservations',
data=json.dumps(reservation_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_ntp_status_endpoint(self):
"""Test GET /api/ntp/status endpoint"""
response = self.client.get('/api/ntp/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_network_info_endpoint(self):
"""Test GET /api/network/info endpoint"""
response = self.client.get('/api/network/info')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('interfaces', data)
def test_api_dns_status_endpoint(self):
"""Test GET /api/dns/status endpoint"""
response = self.client.get('/api/dns/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_network_test_endpoint(self):
"""Test POST /api/network/test endpoint"""
test_data = {'target': '8.8.8.8', 'type': 'ping'}
response = self.client.post('/api/network/test',
data=json.dumps(test_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_wireguard_endpoints(self):
"""Test WireGuard endpoints"""
# GET keys
response = self.client.get('/api/wireguard/keys')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('public_key', data)
# POST generate peer keys
response = self.client.post('/api/wireguard/keys/peer')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('public_key', data)
# GET config
response = self.client.get('/api/wireguard/config')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('config', data)
# GET peers
response = self.client.get('/api/wireguard/peers')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST add peer
peer_data = {'peer': 'test_peer', 'ip': '10.0.0.1', 'public_key': 'test_key'}
response = self.client.post('/api/wireguard/peers',
data=json.dumps(peer_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# DELETE remove peer
response = self.client.delete('/api/wireguard/peers',
data=json.dumps({'peer': 'test_peer'}),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# GET status
response = self.client.get('/api/wireguard/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_peers_endpoints(self):
"""Test peers endpoints"""
# GET peers
response = self.client.get('/api/peers')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST add peer
peer_data = {'peer': 'test_peer', 'ip': '10.0.0.1'}
response = self.client.post('/api/peers',
data=json.dumps(peer_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# DELETE remove peer
response = self.client.delete('/api/peers/test_peer')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
def test_api_email_endpoints(self):
"""Test email endpoints"""
# GET users
response = self.client.get('/api/email/users')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST create user
user_data = {'username': 'test_user', 'email': 'test@example.com'}
response = self.client.post('/api/email/users',
data=json.dumps(user_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# DELETE user
response = self.client.delete('/api/email/users/test_user')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# GET status
response = self.client.get('/api/email/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_calendar_endpoints(self):
"""Test calendar endpoints"""
# GET users
response = self.client.get('/api/calendar/users')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST create user
user_data = {'username': 'test_user', 'email': 'test@example.com'}
response = self.client.post('/api/calendar/users',
data=json.dumps(user_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# DELETE user
response = self.client.delete('/api/calendar/users/test_user')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# GET status
response = self.client.get('/api/calendar/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_files_endpoints(self):
"""Test files endpoints"""
# GET users
response = self.client.get('/api/files/users')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST create user
user_data = {'username': 'test_user'}
response = self.client.post('/api/files/users',
data=json.dumps(user_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# DELETE user
response = self.client.delete('/api/files/users/test_user')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# GET status
response = self.client.get('/api/files/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
def test_api_routing_endpoints(self):
"""Test routing endpoints"""
# GET status
response = self.client.get('/api/routing/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
# POST NAT rule
nat_data = {'type': 'masquerade', 'interface': 'eth0'}
response = self.client.post('/api/routing/nat',
data=json.dumps(nat_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('rule_id', data)
# GET NAT rules
response = self.client.get('/api/routing/nat')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
def test_api_vault_endpoints(self):
"""Test vault endpoints"""
# GET status
response = self.client.get('/api/vault/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('status', data)
# GET certificates
response = self.client.get('/api/vault/certificates')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST generate certificate
cert_data = {'common_name': 'test.example.com'}
response = self.client.post('/api/vault/certificates',
data=json.dumps(cert_data),
content_type='application/json')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('certificate', data)
# GET CA certificate
response = self.client.get('/api/vault/ca/certificate')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('certificate', data)
def test_api_containers_endpoints(self):
"""Test containers endpoints"""
# GET containers
response = self.client.get('/api/containers')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
# POST start container
response = self.client.post('/api/containers/test/start')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# POST stop container
response = self.client.post('/api/containers/test/stop')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('success', data)
# GET container logs
response = self.client.get('/api/containers/test/logs')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
def test_api_services_status_endpoint(self):
"""Test GET /api/services/status endpoint"""
response = self.client.get('/api/services/status')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('services', data)
def test_api_services_connectivity_endpoint(self):
"""Test GET /api/services/connectivity endpoint"""
response = self.client.get('/api/services/connectivity')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('results', data)
def test_api_health_history_endpoint(self):
"""Test GET /api/health/history endpoint"""
response = self.client.get('/api/health/history')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
def test_api_logs_endpoint(self):
"""Test GET /api/logs endpoint"""
response = self.client.get('/api/logs')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIsInstance(data, list)
if __name__ == '__main__':
unittest.main()
BIN
Binary file not shown.
+1 -1
@@ -26,7 +26,7 @@ def tmp_dir():
@pytest.fixture
def tmp_config_dir(tmp_dir):
"""Temporary config dir with the sub-directories expected by managers."""
for sub in ('api', 'caddy', 'dns', 'dhcp', 'ntp', 'wireguard'):
for sub in ('api', 'caddy', 'dns', 'ntp', 'wireguard'):
os.makedirs(os.path.join(tmp_dir, sub), exist_ok=True)
return tmp_dir
+18 -4
@@ -193,7 +193,7 @@ class TestCellPermissionsApi:
fake_dns_ip = '10.99.0.1'
fake_invite = {
'cell_name': 'e2etest-synthetic-cell',
'public_key': 'AAAAFakePublicKeyForE2eTestingAAAAAAAAAAAAAAAA=',
'public_key': 'FakePublicKeyForE2eCellTestAAAAAAAAAAAAAAAA=',
'endpoint': '127.0.0.2:51820',
'vpn_subnet': fake_subnet,
'dns_ip': fake_dns_ip,
@@ -334,7 +334,7 @@ class TestLiveCellConnection:
if cell2_name:
_remove_connection(admin_client, cell2_name)
if cell1_name:
if cell1_name and cell2_client:
_remove_connection(cell2_client, cell1_name)
def _connect_cells(self, admin_client, cell2_client,
@@ -433,10 +433,24 @@ class TestLiveCellConnection:
After cell1 sets outbound.calendar=True (= cell2 gets inbound.calendar=True
from cell1), we verify that cell2's stored remote view is updated.
This test requires the cells to be able to reach each other's API on port 3000.
Requires cells to reach each other's API via the WireGuard tunnel (DNS IP on
port 3000). Skipped when the WG tunnel between cells is not active.
"""
cell1_name, cell2_name = self._connect_cells(admin_client, cell2_client)
# Verify the WG tunnel is up: cell1 must be able to reach cell2's API
# at cell2's WireGuard DNS IP before we assert that the push succeeded.
invite2 = _get_invite(cell2_client)
cell2_dns_ip = invite2['dns_ip']
import requests as _req
try:
_req.get(f'http://{cell2_dns_ip}:3000/health', timeout=2)
except Exception:
pytest.skip(
f"Cell2 not reachable at http://{cell2_dns_ip}:3000 via WG tunnel — "
"peer-sync push requires an active tunnel between the two cells"
)
# cell1 enables outbound calendar to cell2
inbound = {'calendar': False, 'files': False, 'mail': False, 'webdav': False}
outbound = {'calendar': True, 'files': False, 'mail': False, 'webdav': False}
@@ -530,7 +544,7 @@ class TestCellServiceAccessRestrictions:
cell1_name = None
if cell2_name:
_remove_connection(admin_client, cell2_name)
if cell1_name:
if cell1_name and cell2_client:
_remove_connection(cell2_client, cell1_name)
def _get_forward_rules(self, client) -> str:
+4 -1
@@ -85,7 +85,10 @@ class TestServiceAccessUpdate:
if not rules:
return # can't verify without iptables access — skip silently
# No Caddy-targeted DROP for this peer; service blocking is DNS-ACL only
caddy_drop = f'{peer_ip}' in rules and 'DROP' in rules and 'dpt:80' in rules
caddy_drop = any(
peer_ip in line and 'DROP' in line and 'dpt:80' in line
for line in rules.splitlines()
)
assert not caddy_drop, (
f'Found Caddy DROP rule for {peer_ip} after service_access=[] — '
f'this blocks the PIC UI. Service access should be DNS-ACL only.\n{rules}'
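The change from a whole-string check to a per-line check matters because the three substrings can each appear on different rules. A small sketch of the difference, using a hypothetical helper `has_drop_rule` and made-up sample rules (not real output from the test host):

```python
def has_drop_rule(rules: str, peer_ip: str, port: int = 80) -> bool:
    """True only if one single rule line mentions the peer, DROP and the port.

    Like the test above, this is plain substring matching on each line.
    """
    needle = f'dpt:{port}'
    return any(
        peer_ip in line and 'DROP' in line and needle in line
        for line in rules.splitlines()
    )


# Hypothetical rules: the peer is only ACCEPTed on port 80, and the DROP
# belongs to a different peer and a different port.
SAMPLE = (
    "ACCEPT tcp -- 10.0.0.5 0.0.0.0/0 tcp dpt:80\n"
    "DROP   tcp -- 10.0.0.9 0.0.0.0/0 tcp dpt:25\n"
)

# The old whole-string check is a false positive on this sample:
whole_string = '10.0.0.5' in SAMPLE and 'DROP' in SAMPLE and 'dpt:80' in SAMPLE
per_line = has_drop_rule(SAMPLE, '10.0.0.5')
```

Substring matching still has limits (for instance `dpt:80` also matches `dpt:8080`), so a stricter parser may be worth considering if the rule set grows.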
+5 -1
@@ -10,7 +10,11 @@ class PicAPIClient:
def login(self, username: str, password: str) -> dict:
r = self.s.post(f"{self.base}/api/auth/login", json={'username': username, 'password': password})
r.raise_for_status()
return r.json()
data = r.json()
csrf = data.get('csrf_token', '')
if csrf:
self.s.headers['X-CSRF-Token'] = csrf
return data
def logout(self):
self.s.post(f"{self.base}/api/auth/logout")
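The CSRF handling added to `login` boils down to copying one field of the login response into a header that is sent with every later request. A minimal sketch with a plain dict standing in for the session headers; `apply_csrf_token` is a hypothetical helper, and the payload shape is assumed from the diff above:

```python
def apply_csrf_token(headers, login_payload):
    """Copy csrf_token from a login response into the request headers.

    Leaves the headers untouched when the server returned no token, so the
    same client keeps working against an API without CSRF protection.
    """
    csrf = login_payload.get('csrf_token', '')
    if csrf:
        headers['X-CSRF-Token'] = csrf
    return headers


if __name__ == '__main__':
    session_headers = {'Accept': 'application/json'}
    apply_csrf_token(session_headers, {'csrf_token': 'abc123'})
    print(session_headers)
```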
+10 -1
@@ -52,9 +52,18 @@ def build_wg_config(private_key: str, peer_ip: str, server_pubkey: str,
def cleanup_stale_e2e_interfaces():
"""Remove any leftover pic-e2e-* interfaces from previous failed runs."""
"""Remove any leftover pic-e2e-* interfaces and nftables tables from previous failed runs."""
result = subprocess.run(['ip', 'link', 'show'], capture_output=True, text=True)
for line in result.stdout.splitlines():
if 'pic-e2e-' in line:
iface = line.split(':')[1].strip().split('@')[0]
subprocess.run(['sudo', 'ip', 'link', 'delete', iface], capture_output=True)
# wg-quick creates an nftables table per interface; if the interface was never brought
# down cleanly the table persists and drops decrypted ICMP replies on future runs.
nft_result = subprocess.run(['sudo', 'nft', 'list', 'tables'], capture_output=True, text=True)
for line in nft_result.stdout.splitlines():
if 'wg-quick-pic-e2e-' in line:
table_name = line.strip().split()[-1]
subprocess.run(['sudo', 'nft', 'delete', 'table', 'ip', table_name],
capture_output=True)
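The table cleanup above depends on parsing the text printed by `nft list tables`. A sketch of that parsing step alone, assuming each line has the shape `table <family> <name>`; `stale_wg_quick_tables` is a hypothetical helper and the sample listing is invented:

```python
def stale_wg_quick_tables(nft_output, marker='wg-quick-pic-e2e-'):
    """Return (family, table_name) pairs for leftover e2e tables.

    Expects lines of the form "table <family> <name>" and ignores anything
    else, so unexpected output never yields a bogus table name.
    """
    tables = []
    for line in nft_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == 'table' and marker in parts[2]:
            tables.append((parts[1], parts[2]))
    return tables


if __name__ == '__main__':
    listing = (
        "table ip filter\n"
        "table ip wg-quick-pic-e2e-1\n"
        "table ip6 wg-quick-pic-e2e-1\n"
    )
    for family, name in stale_wg_quick_tables(listing):
        # The real cleanup would run: sudo nft delete table <family> <name>
        print(family, name)
```

Returning the family alongside the name is a design choice of this sketch: it would let the delete command cover tables in a family other than `ip`, which the hard-coded `ip` in the diff does not.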
+175
@@ -0,0 +1,175 @@
"""
Service Store E2E tests.
Tests that the admin can install and remove store services via the store page (served at /services).
Requires a running PIC stack with access to the service store index and registry.
Run with:
pytest tests/e2e/ui/test_service_store.py -v --base-url http://<pic-host>
"""
import pytest
pytestmark = pytest.mark.ui
STORE_ROUTE = '/services'
# Services to install in dependency order (webmail requires email)
INSTALL_ORDER = ['calendar', 'files', 'email', 'webmail']
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _goto_store(page, webui_base):
page.goto(f"{webui_base}{STORE_ROUTE}")
page.wait_for_load_state('networkidle')
def _service_card(page, service_name):
"""Return the card element containing the named service."""
return page.locator('.card', has=page.get_by_text(service_name, exact=False)).first
def _is_installed(page, service_name):
card = _service_card(page, service_name)
return card.get_by_text('Installed', exact=False).is_visible()
def _install_service(page, webui_base, service_name, timeout_ms=180_000):
"""Click Install on a service card and wait until the card shows Installed."""
_goto_store(page, webui_base)
card = _service_card(page, service_name)
install_btn = card.get_by_role('button', name='Install')
install_btn.click()
# Wait for the Installed badge to appear on the card (the Install button is
# replaced by Uninstall once the install completes).
card.get_by_text('Installed', exact=False).wait_for(state='visible', timeout=timeout_ms)
def _remove_service(page, webui_base, service_name, timeout_ms=60_000):
"""Click Uninstall on a service card and confirm, then wait until Install reappears."""
_goto_store(page, webui_base)
card = _service_card(page, service_name)
card.get_by_role('button', name='Uninstall').click()
# A confirmation dialog appears — click the confirm Uninstall button
page.get_by_role('button', name='Uninstall Service').wait_for(state='visible', timeout=5000)
page.get_by_role('button', name='Uninstall Service').click()
card.get_by_role('button', name='Install').wait_for(state='visible', timeout=timeout_ms)
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
def test_store_page_loads(admin_page, webui_base):
"""Store page must load and list available services without errors."""
page = admin_page
_goto_store(page, webui_base)
# Should not show a generic error message
assert 'Could not load the service store' not in page.content(), (
'Store page showed error: could not load the service store'
)
# At least one service card should be visible
cards = page.locator('.card').all()
assert len(cards) > 0, 'No service cards found on the store page'
def test_store_shows_known_services(admin_page, webui_base):
"""Store page must list email, calendar, files, and webmail."""
page = admin_page
_goto_store(page, webui_base)
for name in ('Email Server', 'Calendar', 'File Storage', 'Webmail'):
assert page.get_by_text(name, exact=False).first.is_visible(), (
f"Expected service '{name}' not visible on store page"
)
def test_install_calendar(admin_page, webui_base):
"""Admin can install the calendar service."""
page = admin_page
_goto_store(page, webui_base)
if _is_installed(page, 'Calendar'):
pytest.skip('calendar already installed — skipping install test')
_install_service(page, webui_base, 'Calendar & Contacts', timeout_ms=180_000)
assert _is_installed(page, 'Calendar'), (
'Calendar service card did not show Installed after install'
)
def test_install_files(admin_page, webui_base):
"""Admin can install the file storage service."""
page = admin_page
_goto_store(page, webui_base)
if _is_installed(page, 'File Storage'):
pytest.skip('files already installed — skipping install test')
_install_service(page, webui_base, 'File Storage', timeout_ms=180_000)
assert _is_installed(page, 'File Storage'), (
'Files service card did not show Installed after install'
)
def test_install_email(admin_page, webui_base):
"""Admin can install the email service."""
page = admin_page
_goto_store(page, webui_base)
if _is_installed(page, 'Email Server'):
pytest.skip('email already installed — skipping install test')
_install_service(page, webui_base, 'Email Server', timeout_ms=300_000)
assert _is_installed(page, 'Email Server'), (
'Email service card did not show Installed after install'
)
def test_install_webmail(admin_page, webui_base):
"""Admin can install webmail after email is installed."""
page = admin_page
_goto_store(page, webui_base)
if not _is_installed(page, 'Email Server'):
pytest.skip('email not installed — webmail requires email first')
if _is_installed(page, 'Webmail'):
pytest.skip('webmail already installed — skipping install test')
_install_service(page, webui_base, 'Webmail', timeout_ms=180_000)
assert _is_installed(page, 'Webmail'), (
'Webmail service card did not show Installed after install'
)
def test_installed_services_appear_on_dashboard(admin_page, webui_base):
"""After installation, services should appear as links on the dashboard."""
page = admin_page
_goto_store(page, webui_base)
page.goto(f"{webui_base}/")
page.wait_for_load_state('networkidle')
# Check that at least the Cell Home link is present
assert page.get_by_text('Cell Home', exact=False).is_visible(), (
'Dashboard does not show the Cell Home service link'
)
def test_uninstall_webmail(admin_page, webui_base):
"""Admin can uninstall the webmail service."""
page = admin_page
_goto_store(page, webui_base)
if not _is_installed(page, 'Webmail'):
pytest.skip('webmail not installed — skipping uninstall test')
_remove_service(page, webui_base, 'Webmail')
assert not _is_installed(page, 'Webmail'), (
'Webmail service card still shows Installed after uninstall'
)
+19 -1
@@ -39,10 +39,27 @@ def wg_server_info(admin_client, pic_host):
except Exception:
pass
# Server VPN IP (e.g. '10.0.0.1') and subnet (e.g. '10.0.0.0/24') from status
server_address = '10.0.0.1/24'
try:
server_address = admin_client.get('/api/wireguard/status').json().get('address', server_address)
except Exception:
pass
import ipaddress as _ip
try:
iface = _ip.ip_interface(server_address)
server_ip = str(iface.ip)
server_network = str(iface.network)
except Exception:
server_ip = '10.0.0.1'
server_network = '10.0.0.0/24'
return {
'public_key': server_pubkey,
'endpoint': pic_host,
'port': int(port),
'server_ip': server_ip,
'server_network': server_network,
}
@@ -65,7 +82,7 @@ def connected_peer(make_peer, wg_server_info, tmp_path):
server_pubkey=wg_server_info['public_key'],
server_endpoint=wg_server_info['endpoint'],
server_port=wg_server_info['port'],
allowed_ips='10.0.0.0/24',
allowed_ips=wg_server_info['server_network'],
)
# Write config with restricted permissions
@@ -78,6 +95,7 @@ def connected_peer(make_peer, wg_server_info, tmp_path):
iface.bring_up()
peer['iface'] = iface
peer['conf_path'] = conf_path
peer['server_ip'] = wg_server_info['server_ip']
yield peer
finally:
iface.bring_down()
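The fixture change derives both the server IP and the peer's `allowed_ips` from one interface address instead of hard-coding `10.0.0.0/24`. A standalone sketch of that derivation using the standard-library `ipaddress` module; `split_interface_address` is a hypothetical name, and the fallback mirrors the default in the diff above:

```python
import ipaddress


def split_interface_address(address, default='10.0.0.1/24'):
    """Split 'ip/prefix' into (host_ip, network_cidr).

    Falls back to `default` when the address cannot be parsed, mirroring the
    defensive defaults used by the fixture.
    """
    try:
        iface = ipaddress.ip_interface(address)
    except ValueError:
        iface = ipaddress.ip_interface(default)
    return str(iface.ip), str(iface.network)


if __name__ == '__main__':
    server_ip, server_network = split_interface_address('10.8.0.1/24')
    print(server_ip, server_network)
```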
+50 -21
@@ -32,7 +32,8 @@ def _config(admin_client) -> dict:
def _domain(admin_client) -> str:
return _config(admin_client).get('domain') or 'lan'
cfg = _config(admin_client)
return cfg.get('domain_name') or cfg.get('domain') or 'lan'
def _dns_ip(admin_client) -> str:
@@ -66,16 +67,27 @@ def _curl_host(ip: str, host: str, path: str = '/', timeout: int = 8) -> tuple[i
def _curl_domain(host: str, path: str = '/', dns_ip: str = '', timeout: int = 8) -> tuple[int, str]:
"""Make an HTTP request using curl's --dns-servers to resolve via CoreDNS."""
cmd = ['curl', '-s', '--connect-timeout', '5',
'-w', '\n__HTTP_CODE__:%{http_code}',
f'http://{host}{path}']
"""Make an HTTP request to host, optionally resolving via a custom DNS server.
Uses dig to resolve the host (avoiding --dns-servers which requires c-ares),
then curls to the resolved IP with the original Host header.
"""
if dns_ip:
cmd = ['curl', '-s', '--connect-timeout', '5',
'--dns-servers', dns_ip,
'-w', '\n__HTTP_CODE__:%{http_code}',
f'http://{host}{path}']
result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
dig = subprocess.run(
['dig', f'@{dns_ip}', host, 'A', '+short', '+time=3', '+tries=1'],
capture_output=True, text=True, timeout=5,
)
resolved_ips = [line for line in dig.stdout.strip().splitlines() if line and not line.startswith(';')]
if resolved_ips:
return _curl_host(resolved_ips[0], host, path, timeout)
return 0, ''
result = subprocess.run(
['curl', '-s', '--connect-timeout', '5',
'-w', '\n__HTTP_CODE__:%{http_code}',
f'http://{host}{path}'],
capture_output=True, text=True, timeout=timeout,
)
output = result.stdout
body = ''
code = 0
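The new `_curl_domain` resolves the host with `dig +short` and then connects to the first returned line. For a name behind a CNAME, `dig +short` prints the alias target before the address, so a stricter filter may be useful. A sketch under that assumption; `first_ipv4` is a hypothetical helper, not part of the test suite:

```python
import ipaddress


def first_ipv4(dig_short_output):
    """Return the first line that is a valid IPv4 address, or None.

    Skips CNAME targets, comment lines and blank lines that can appear in
    `dig +short` output.
    """
    for line in dig_short_output.splitlines():
        candidate = line.strip()
        try:
            return str(ipaddress.IPv4Address(candidate))
        except ValueError:
            continue
    return None


if __name__ == '__main__':
    sample = "alias.example.com.\n10.0.0.1\n"
    print(first_ipv4(sample))
```

The returned address can then be passed to a helper such as `_curl_host` together with the original hostname, as the diff above does with `resolved_ips[0]`.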
@@ -92,19 +104,21 @@ def _curl_domain(host: str, path: str = '/', dns_ip: str = '', timeout: int = 8)
# ── Scenario 35: api.<domain> routes to API ───────────────────────────────────
def test_api_domain_returns_json_not_webui(connected_peer, admin_client):
"""api.<domain>/api/status must return JSON, not the React WebUI HTML."""
"""api.<domain>/api/status must return JSON or a redirect, not the React WebUI HTML."""
dom = _domain(admin_client)
dns_ip = _dns_ip(admin_client)
code, body = _curl_domain(f'api.{dom}', '/api/status', dns_ip)
assert code not in (0, 000), f"curl to api.{dom}/api/status failed (code {code})"
assert code not in (0,), f"curl to api.{dom}/api/status failed completely (code {code})"
# 3xx means Caddy is routing (HTTP→HTTPS redirect in pic_ngo mode) — acceptable
if code in (301, 302, 308):
return
assert _WEBUI_MARKER not in body, (
f"api.{dom}/api/status returned WebUI HTML — "
"Caddy is not routing api.<domain> to the API; "
"check that the http://api.<domain> block exists in the Caddyfile "
"and uses the configured domain (not a stale .cell or .dev TLD)"
"check that the api.<domain> block exists in the Caddyfile"
)
assert '{' in body or '"' in body, (
f"api.{dom}/api/status did not return JSON (body: {body[:100]!r})"
f"api.{dom}/api/status did not return JSON (code={code}, body: {body[:100]!r})"
)
@@ -243,9 +257,16 @@ def test_vip_direct_access_not_webui(connected_peer, vip, expected_not):
# ── Scenario 41: Catch-all :80 routes API path correctly ─────────────────────
def test_catchall_api_path_returns_json(connected_peer):
"""The catch-all :80 block must route /api/* to the API (not WebUI)."""
def test_catchall_api_path_returns_json(connected_peer, admin_client):
"""The catch-all :80 block must route /api/* to the API (not WebUI).
Only applicable to HTTP-mode cells (e.g. lan/local domain). Cells using
pic_ngo / duckdns HTTPS mode have no catch-all :80 block; Caddy redirects
all plain-HTTP to HTTPS instead.
"""
code, body = _curl_host('172.20.0.2', 'localhost', '/api/status')
if code in (301, 302, 308):
pytest.skip("Caddy is in HTTPS-redirect mode — no catch-all :80 block (expected for pic_ngo cells)")
assert _WEBUI_MARKER not in body, (
"Catch-all :80 returned WebUI HTML for /api/status — "
"the `handle /api/*` directive in the :80 block is missing or wrong"
@@ -255,9 +276,14 @@ def test_catchall_api_path_returns_json(connected_peer):
)
def test_catchall_root_serves_webui(connected_peer):
"""The catch-all :80 block serves the WebUI for the root path."""
def test_catchall_root_serves_webui(connected_peer, admin_client):
"""The catch-all :80 block serves the WebUI for the root path.
Only applicable to HTTP-mode cells. HTTPS-mode cells redirect :80 to :443.
"""
code, body = _curl_host('172.20.0.2', 'localhost', '/')
if code in (301, 302, 308):
pytest.skip("Caddy is in HTTPS-redirect mode — no catch-all :80 block (expected for pic_ngo cells)")
assert _WEBUI_MARKER in body, (
"Catch-all :80 / did not return WebUI HTML — "
"something is broken with the catch-all :80 block"
@@ -269,7 +295,10 @@ def test_catchall_root_serves_webui(connected_peer):
def test_caddy_does_not_route_cell_tld(connected_peer):
"""Caddy must NOT have active routing for .cell domains — they are from old config."""
code, body = _curl_host('172.20.0.2', 'calendar.cell', '/')
assert _WEBUI_MARKER in body or code in (0, 404, 502, 503), (
"Caddy is still routing calendar.cell — stale .cell blocks remain in config. "
# 3xx redirects (e.g. HTTP→HTTPS) are acceptable — they mean Caddy is active but
# not serving a functional response. Only a 200-with-content or WebUI HTML is a problem.
assert _WEBUI_MARKER in body or code in (0, 301, 302, 308, 404, 502, 503), (
"Caddy is still routing calendar.cell with a functional response — "
"stale .cell blocks remain in config. "
"Check that write_caddyfile() is writing to the correct path that Caddy reads."
)
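The routing tests above sort a curl result into a few outcomes: no connection, an HTTP-to-HTTPS redirect, WebUI HTML, or a JSON-like body. A minimal sketch of that decision as a pure function follows. `classify_response`, `REDIRECT_CODES`, and the marker argument are hypothetical names chosen for illustration; they are not helpers from the test suite, and the real value of `_WEBUI_MARKER` is not shown in this diff.

```python
from typing import Literal

# Status codes the tests above accept as an HTTP-to-HTTPS redirect.
REDIRECT_CODES = (301, 302, 308)

Outcome = Literal["unreachable", "redirect", "webui", "json", "unknown"]


def classify_response(code: int, body: str, webui_marker: str) -> Outcome:
    """Classify a (status code, body) pair the way the routing assertions do."""
    if code == 0:
        # curl never got an HTTP response at all.
        return "unreachable"
    if code in REDIRECT_CODES:
        # Caddy is routing, but only to redirect plain HTTP to HTTPS.
        return "redirect"
    if webui_marker in body:
        # Checked before the JSON heuristic because HTML also contains quotes.
        return "webui"
    if "{" in body or '"' in body:
        # Same loose "looks like JSON" heuristic as the assertions above.
        return "json"
    return "unknown"
```

Checking the WebUI marker before the JSON heuristic matters: an HTML page almost always contains a double quote, so the reverse order would label it as JSON.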
+4 -2
@@ -7,8 +7,9 @@ pytestmark = pytest.mark.wg
def test_wg_connect_and_ping_server(connected_peer):
"""Scenario 25+26: create peer, connect, ping server VPN IP."""
iface = connected_peer['iface']
server_ip = connected_peer.get('server_ip', '10.0.0.1')
assert iface.up, "WireGuard interface should be up"
assert iface.is_connected('10.0.0.1'), "Server VPN IP 10.0.0.1 should be reachable via WireGuard"
assert iface.is_connected(server_ip), f"Server VPN IP {server_ip} should be reachable via WireGuard"
def test_wg_peer_has_assigned_ip(connected_peer):
@@ -21,8 +22,9 @@ def test_wg_peer_has_assigned_ip(connected_peer):
def test_wg_disconnect_removes_route(connected_peer):
"""Scenario 29: after disconnect, VPN IP is not reachable."""
iface = connected_peer['iface']
server_ip = connected_peer.get('server_ip', '10.0.0.1')
iface.bring_down()
result = subprocess.run(['ping', '-c', '1', '-W', '2', '10.0.0.1'],
result = subprocess.run(['ping', '-c', '1', '-W', '2', server_ip],
capture_output=True, timeout=5)
# After disconnect, ping should fail
assert result.returncode != 0, "VPN IP should not be reachable after disconnect"
+56 -30
@@ -19,17 +19,18 @@ import pytest
pytestmark = pytest.mark.wg
# Subdomain → expected offset in ip_utils.CONTAINER_OFFSETS / VIP list.
# These are the sub-names, not full FQDNs — the TLD is fetched from config.
SUBDOMAINS_TO_IPS = {
'api': '172.20.0.2', # must route through Caddy (not API container direct)
'webui': '172.20.0.2', # must route through Caddy
'calendar': '172.20.0.21', # Caddy VIP for CalDAV
'files': '172.20.0.22', # Caddy VIP for Filegator
'mail': '172.20.0.23', # Caddy VIP for Rainloop
'webmail': '172.20.0.23', # alias for mail VIP
'webdav': '172.20.0.24', # Caddy VIP for WebDAV
}
# Subdomain → service_ips key for the expected VIP (None = always Caddy).
# Expected IP is read dynamically from /api/config service_ips; falls back to
# Caddy IP (172.20.0.2) when the service is not enabled / VIP not configured.
_SUBDOMAIN_VIP_KEYS = [
('api', None),
('webui', None),
('calendar', 'vip_calendar'),
('files', 'vip_files'),
('mail', 'vip_mail'),
('webmail', 'vip_mail'),
('webdav', 'vip_webdav'),
]
# ── helpers ───────────────────────────────────────────────────────────────────
@@ -45,8 +46,9 @@ def _dns_ip(admin_client) -> str:
def _domain(admin_client) -> str:
"""Return the configured cell domain (e.g. 'lan', 'dev', 'home')."""
return _config(admin_client).get('domain') or 'lan'
"""Return the cell's fully-qualified domain (e.g. 'test5.pic.ngo', 'lan')."""
cfg = _config(admin_client)
return cfg.get('domain_name') or cfg.get('domain') or 'lan'
def _cell_name(admin_client) -> str:
@@ -55,12 +57,24 @@ def _cell_name(admin_client) -> str:
# ── Scenario 30: DNS resolution ───────────────────────────────────────────────
@pytest.mark.parametrize('subdomain,expected_ip', list(SUBDOMAINS_TO_IPS.items()))
def test_service_domain_resolves_to_expected_ip(connected_peer, admin_client, subdomain, expected_ip):
@pytest.mark.parametrize('subdomain,vip_key', _SUBDOMAIN_VIP_KEYS)
def test_service_domain_resolves_to_expected_ip(connected_peer, admin_client, subdomain, vip_key):
"""Each service subdomain resolves to the correct IP via CoreDNS.
The full FQDN is built from the configured domain, not hardcoded to any TLD.
The expected IP is read from service_ips; falls back to Caddy when the VIP is
not configured (e.g. when the service is disabled).
"""
cfg = _config(admin_client)
sips = cfg.get('service_ips', {})
caddy_ip = sips.get('caddy', '172.20.0.2')
# Accept both the specific VIP IP and Caddy IP: some zone files use per-service
# VIP records (172.20.0.21 etc.) while others use a wildcard pointing to Caddy.
# Both are correct deployments; what matters is that the domain resolves at all.
expected_ips = {caddy_ip}
if vip_key and sips.get(vip_key):
expected_ips.add(sips[vip_key])
dns_ip = _dns_ip(admin_client)
dom = _domain(admin_client)
fqdn = f'{subdomain}.{dom}'
@@ -70,8 +84,8 @@ def test_service_domain_resolves_to_expected_ip(connected_peer, admin_client, su
)
assert result.returncode == 0, f"dig failed for {fqdn}: {result.stderr}"
resolved = result.stdout.strip()
assert resolved == expected_ip, (
f"{fqdn} resolved to {resolved!r}, expected {expected_ip}. "
assert resolved in expected_ips, (
f"{fqdn} resolved to {resolved!r}, expected one of {expected_ips}. "
f"DNS server: {dns_ip}, configured domain: {dom!r}"
)
@@ -136,30 +150,43 @@ def test_caddy_ip_serves_http(connected_peer):
# ── Scenario 32: HTTP via domain ──────────────────────────────────────────────
def test_http_api_domain_reaches_api(connected_peer, admin_client):
"""curl http://api.<domain>/api/status returns a JSON response via Caddy + CoreDNS."""
"""api.<domain>/api/status is reachable via Caddy routing + CoreDNS resolution."""
dom = _domain(admin_client)
dns_ip = _dns_ip(admin_client)
result = subprocess.run(
['curl', '-s', '--connect-timeout', '5',
'--dns-servers', dns_ip,
f'http://api.{dom}/api/status'],
fqdn = f'api.{dom}'
# Resolve via CoreDNS (--dns-servers requires c-ares; use dig instead)
dig = subprocess.run(
['dig', f'@{dns_ip}', fqdn, 'A', '+short', '+time=5'],
capture_output=True, text=True, timeout=10,
)
assert result.stdout.strip(), (
f"curl http://api.{dom}/api/status returned no output via DNS {dns_ip}. "
resolved_ips = [l for l in dig.stdout.strip().splitlines() if l and not l.startswith(';')]
if not resolved_ips:
pytest.skip(f"api.{dom} does not resolve via CoreDNS at {dns_ip} — DNS may not be configured")
resolved_ip = resolved_ips[0]
result = subprocess.run(
['curl', '-s', '--connect-timeout', '5',
'-H', f'Host: {fqdn}',
f'http://{resolved_ip}/api/status'],
capture_output=True, text=True, timeout=10,
)
# 3xx means Caddy is redirecting HTTP→HTTPS (normal for pic_ngo mode)
stdout = result.stdout.strip()
assert result.returncode == 0 or stdout, (
f"curl to {resolved_ip} with Host: {fqdn} failed. "
f"stderr: {result.stderr[:200]}"
)
# ── Scenario 33: Config DNS field ─────────────────────────────────────────────
def test_peer_services_config_has_coredns_not_vpn_gateway(admin_client, make_peer):
def test_peer_services_config_has_coredns_not_vpn_gateway(admin_client, make_peer, api_base):
"""WireGuard config in /api/peer/services must use CoreDNS IP, not 10.0.0.1."""
from helpers.api_client import PicAPIClient
import os
peer = make_peer('e2etest-dns-config', password='DnsTest123!')
peer_client = PicAPIClient(os.environ.get('PIC_API_BASE', 'http://192.168.31.51:3000'))
peer_client = PicAPIClient(api_base)
peer_client.login(peer['name'], 'DnsTest123!')
r = peer_client.get('/api/peer/services')
@@ -188,14 +215,13 @@ def test_peer_services_config_has_coredns_not_vpn_gateway(admin_client, make_pee
break
def test_peer_services_caldav_url_uses_configured_domain(admin_client, make_peer):
def test_peer_services_caldav_url_uses_configured_domain(admin_client, make_peer, api_base):
"""CalDAV URL must use the configured domain, not hardcode 'radicale.dev:5232'."""
from helpers.api_client import PicAPIClient
import os
dom = _domain(admin_client)
peer = make_peer('e2etest-caldav-url', password='CaldavTest123!')
peer_client = PicAPIClient(os.environ.get('PIC_API_BASE', 'http://192.168.31.51:3000'))
peer_client = PicAPIClient(api_base)
peer_client.login(peer['name'], 'CaldavTest123!')
r = peer_client.get('/api/peer/services')
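The DNS tests in this file accept either the per-service VIP or the Caddy IP, and they filter `dig +short` output before using it. The two steps can be sketched as standalone functions. `expected_ips` and `parse_dig_short` are hypothetical names for illustration, and the `service_ips` dictionary shape is assumed from the keys used in the test (`caddy`, `vip_calendar`, and so on).

```python
from typing import List, Optional, Set

# Fallback Caddy address, taken from the default used in the test above.
DEFAULT_CADDY_IP = "172.20.0.2"


def expected_ips(service_ips: dict, vip_key: Optional[str]) -> Set[str]:
    """Return every IP a service subdomain may legitimately resolve to."""
    ips = {service_ips.get("caddy", DEFAULT_CADDY_IP)}
    # Add the per-service VIP only when the key is set and non-empty.
    if vip_key and service_ips.get(vip_key):
        ips.add(service_ips[vip_key])
    return ips


def parse_dig_short(stdout: str) -> List[str]:
    """Keep answer lines from `dig +short`, dropping blanks and ';' comments."""
    return [
        line
        for line in stdout.strip().splitlines()
        if line and not line.startswith(";")
    ]
```

Returning a set keeps the membership check in the assertion simple, and it collapses naturally to a single address when the VIP and the Caddy IP are the same.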
+5 -5
@@ -6,14 +6,14 @@ pytestmark = [pytest.mark.wg, pytest.mark.requires_internet]
def test_full_tunnel_routes_all_traffic(full_tunnel_peer):
"""Scenario 30: with AllowedIPs=0.0.0.0/0, external traffic routes through VPN."""
# Check routing table — 0.0.0.0/0 should be via the WG interface
result = subprocess.run(['ip', 'route', 'show'], capture_output=True, text=True)
# wg-quick adds full-tunnel routes to a policy routing table (not the main table),
# so we must check all tables to find the 0.0.0.0/1 + 128.0.0.0/1 split routes.
result = subprocess.run(['ip', 'route', 'show', 'table', 'all'],
capture_output=True, text=True)
iface_name = full_tunnel_peer['iface'].iface_name
# In full tunnel mode, the default route or the 0.0.0.0/1 + 128.0.0.0/1 split routes
# point to the WG interface
assert (iface_name in result.stdout or
'0.0.0.0/1' in result.stdout or
'128.0.0.0/1' in result.stdout), "Full tunnel routes not found"
'128.0.0.0/1' in result.stdout), "Full tunnel routes not found in any routing table"
@pytest.mark.requires_internet
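The full-tunnel assertion above passes if the interface name appears anywhere in the `ip route show table all` output, which is also true for the ordinary VPN subnet route. A stricter check, sketched below, only counts default or split-default routes whose `dev` field is the WireGuard interface. `has_full_tunnel_route` is a hypothetical helper, and the sample route lines in the usage note are illustrative, not captured output.

```python
# Destinations that indicate a full-tunnel route.
FULL_TUNNEL_PREFIXES = ("default", "0.0.0.0/0", "0.0.0.0/1", "128.0.0.0/1")


def has_full_tunnel_route(route_output: str, iface_name: str) -> bool:
    """True if a default or split-default route points at iface_name."""
    for line in route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] not in FULL_TUNNEL_PREFIXES:
            continue
        if "dev" in fields:
            idx = fields.index("dev")
            # The token after "dev" is the outgoing interface.
            if idx + 1 < len(fields) and fields[idx + 1] == iface_name:
                return True
    return False
```

For example, a line such as `default dev wg0 table 51820 scope link` would match for `wg0`, while `10.0.0.0/24 dev wg0 scope link` alone would not.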
+4 -4
@@ -90,7 +90,7 @@ class TestConfig:
# ---------------------------------------------------------------------------
EXPECTED_CONTAINERS = [
'cell-caddy', 'cell-dns', 'cell-dhcp', 'cell-ntp',
'cell-caddy', 'cell-dns', 'cell-ntp',
'cell-mail', 'cell-radicale', 'cell-webdav', 'cell-wireguard',
'cell-api', 'cell-webui', 'cell-rainloop', 'cell-filegator',
]
@@ -164,7 +164,7 @@ class TestWireGuard:
# ---------------------------------------------------------------------------
# Network services: DNS, DHCP, NTP
# Network services: DNS, NTP
# ---------------------------------------------------------------------------
class TestNetworkServices:
@@ -176,8 +176,8 @@ class TestNetworkServices:
r = get('/api/dns/status')
assert r.status_code == 200
def test_dhcp_leases_endpoint(self):
r = get('/api/dhcp/leases')
def test_dns_overview_endpoint(self):
r = get('/api/dns/overview')
assert r.status_code == 200
def test_ntp_status_endpoint(self):
@@ -11,7 +11,6 @@ Endpoints covered:
- /api/peers (POST, PUT, DELETE)
- /api/config (PUT)
- /api/dns/records (DELETE)
- /api/dhcp/reservations (POST, DELETE)
- /api/containers/<name>/restart
- /api/wireguard/keys/peer
@@ -240,43 +239,6 @@ class TestDnsRecordsNegative:
r.json()
# ---------------------------------------------------------------------------
# DHCP reservations — negative
# ---------------------------------------------------------------------------
class TestDhcpReservationsNegative:
def test_add_reservation_no_body_returns_400(self):
r = _S.post(
f"{API_BASE}/api/dhcp/reservations",
data='',
headers={'Content-Type': 'application/json'},
)
assert r.status_code == 400
def test_add_reservation_missing_ip_returns_400(self):
r = post('/api/dhcp/reservations', json={'mac': 'aa:bb:cc:dd:ee:ff'})
assert r.status_code == 400
_assert_json_error(r)
def test_add_reservation_missing_mac_returns_400(self):
r = post('/api/dhcp/reservations', json={'ip': '10.0.0.250'})
assert r.status_code == 400
_assert_json_error(r)
def test_delete_reservation_no_mac_returns_400(self):
r = delete('/api/dhcp/reservations', json={'ip': '10.0.0.250'})
assert r.status_code == 400
_assert_json_error(r)
def test_delete_reservation_empty_body_returns_400(self):
r = _S.delete(
f"{API_BASE}/api/dhcp/reservations",
data='',
headers={'Content-Type': 'application/json'},
)
assert r.status_code == 400
# ---------------------------------------------------------------------------
# Container endpoints — negative
# ---------------------------------------------------------------------------
+12 -73
@@ -1,10 +1,8 @@
"""
Network services integration tests: DNS records, DHCP leases, DHCP reservations.
Network services integration tests: DNS records, DNS overview.
Note on endpoint shapes discovered from app.py:
- DELETE /api/dns/records takes a JSON body (not a URL param)
- DELETE /api/dhcp/reservations takes JSON body with 'mac' field
- POST /api/dhcp/reservations requires 'mac' and 'ip' fields
- DELETE /api/dns/records takes a JSON body (not a URL param)
Run with: pytest tests/integration/test_network_services.py -v
"""
@@ -129,79 +127,20 @@ class TestDnsRecordsWrite:
# ---------------------------------------------------------------------------
# GET /api/dhcp/leases
# GET /api/dns/overview
# ---------------------------------------------------------------------------
class TestDhcpLeases:
def test_get_dhcp_leases_returns_200(self):
r = get('/api/dhcp/leases')
class TestDnsOverview:
def test_get_dns_overview_returns_200(self):
r = get('/api/dns/overview')
assert r.status_code == 200
def test_get_dhcp_leases_returns_list_or_dict(self):
data = get('/api/dhcp/leases').json()
assert isinstance(data, (list, dict))
# ---------------------------------------------------------------------------
# POST /api/dhcp/reservations + DELETE /api/dhcp/reservations
# ---------------------------------------------------------------------------
_TEST_MAC = 'de:ad:be:ef:11:22'
_TEST_RESERVATION_IP = '10.0.0.200'
class TestDhcpReservations:
def _cleanup(self):
delete('/api/dhcp/reservations', json={'mac': _TEST_MAC})
def test_add_dhcp_reservation_returns_non_error(self):
try:
r = post('/api/dhcp/reservations', json={
'mac': _TEST_MAC,
'ip': _TEST_RESERVATION_IP,
'hostname': 'inttest-dhcp-host',
})
assert r.status_code in (200, 201), (
f"Expected 200/201 for DHCP reservation, got {r.status_code}: {r.text}"
)
finally:
self._cleanup()
def test_add_dhcp_reservation_missing_mac_returns_400(self):
r = post('/api/dhcp/reservations', json={'ip': _TEST_RESERVATION_IP})
assert r.status_code == 400
assert 'error' in r.json()
def test_add_dhcp_reservation_missing_ip_returns_400(self):
r = post('/api/dhcp/reservations', json={'mac': _TEST_MAC})
assert r.status_code == 400
assert 'error' in r.json()
def test_add_dhcp_reservation_empty_body_returns_400(self):
r = post('/api/dhcp/reservations', data='')
assert r.status_code == 400
def test_delete_dhcp_reservation_missing_mac_returns_400(self):
r = delete('/api/dhcp/reservations', json={})
assert r.status_code == 400
assert 'error' in r.json()
def test_add_and_delete_dhcp_reservation_round_trip(self):
add_r = post('/api/dhcp/reservations', json={
'mac': _TEST_MAC,
'ip': _TEST_RESERVATION_IP,
})
assert add_r.status_code in (200, 201), (
f"Could not create DHCP reservation: {add_r.text}"
)
try:
del_r = delete('/api/dhcp/reservations', json={'mac': _TEST_MAC})
assert del_r.status_code in (200, 204), (
f"DHCP reservation delete failed: {del_r.status_code} {del_r.text}"
)
except Exception:
self._cleanup()
raise
def test_get_dns_overview_has_expected_keys(self):
data = get('/api/dns/overview').json()
assert isinstance(data, dict)
for key in ('mode', 'effective_domain', 'internal_domain',
'public_records', 'internal_records'):
assert key in data
# ---------------------------------------------------------------------------
+531
@@ -0,0 +1,531 @@
"""
Tests for AccountManager per-service credential provisioning.
Covers:
- provision: dispatches to right manager method, stores credentials, generates password
- deprovision: calls manager method, removes stored credentials
- get_credentials / list_accounts / list_peer_services
- deprovision_peer: bulk cleanup on peer deletion
- store_credentials: direct storage (used by peers-POST legacy route)
- get_all_credentials: returns all creds for a peer
- credential file is created with 0o600
- unknown service / missing manager errors
"""
import json
import os
import stat
import threading
import unittest
from pathlib import Path
from unittest.mock import MagicMock, patch
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'api'))
from account_manager import AccountManager
# ── helpers ────────────────────────────────────────────────────────────────────
def _make_am(tmp_path: Path, registry=None, **managers) -> AccountManager:
if registry is None:
registry = _make_registry()
return AccountManager(service_registry=registry, data_dir=str(tmp_path), **managers)
def _make_registry(services=None):
reg = MagicMock()
if services is None:
services = {
'email': {
'id': 'email', 'kind': 'builtin',
'accounts': {'manager': 'email_manager', 'credentials': ['password']},
'config': {'domain': 'example.com', 'smtp_port': 25},
},
'calendar': {
'id': 'calendar', 'kind': 'builtin',
'accounts': {'manager': 'calendar_manager', 'credentials': ['password']},
'config': {},
},
'files': {
'id': 'files', 'kind': 'builtin',
'accounts': {'manager': 'file_manager', 'credentials': ['password']},
'config': {},
},
}
reg.get.side_effect = lambda svc_id: services.get(svc_id)
return reg
def _make_email_mgr(ok=True):
m = MagicMock()
m.create_email_user.return_value = ok
m.delete_email_user.return_value = ok
return m
def _make_cal_mgr(ok=True):
m = MagicMock()
m.create_calendar_user.return_value = ok
m.delete_calendar_user.return_value = ok
return m
def _make_file_mgr(ok=True):
m = MagicMock()
m.create_user.return_value = ok
m.delete_user.return_value = ok
return m
# ── Provision ─────────────────────────────────────────────────────────────────
class TestProvision(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.email_mgr = _make_email_mgr()
self.cal_mgr = _make_cal_mgr()
self.file_mgr = _make_file_mgr()
self.am = _make_am(
self.tmp,
email_manager=self.email_mgr,
calendar_manager=self.cal_mgr,
file_manager=self.file_mgr,
)
def test_provision_email_calls_create_email_user(self):
self.am.provision('email', 'alice', password='s3cret')
self.email_mgr.create_email_user.assert_called_once_with('alice', 'example.com', 's3cret')
def test_provision_calendar_calls_create_calendar_user(self):
self.am.provision('calendar', 'alice', password='s3cret')
self.cal_mgr.create_calendar_user.assert_called_once_with('alice', 's3cret')
def test_provision_files_calls_create_user(self):
self.am.provision('files', 'alice', password='s3cret')
self.file_mgr.create_user.assert_called_once_with('alice', 's3cret')
def test_provision_generates_password_when_none_given(self):
creds = self.am.provision('email', 'alice')
self.assertIn('password', creds)
self.assertGreaterEqual(len(creds['password']), 16)
def test_provision_returns_credential_dict(self):
creds = self.am.provision('email', 'alice', password='mypassword')
self.assertEqual(creds, {'password': 'mypassword'})
def test_provision_stores_credentials(self):
self.am.provision('email', 'alice', password='pw')
stored = self.am.get_credentials('email', 'alice')
self.assertEqual(stored, {'password': 'pw'})
def test_provision_multiple_peers_stored_independently(self):
self.am.provision('email', 'alice', password='pw-alice')
self.am.provision('email', 'bob', password='pw-bob')
self.assertEqual(self.am.get_credentials('email', 'alice'), {'password': 'pw-alice'})
self.assertEqual(self.am.get_credentials('email', 'bob'), {'password': 'pw-bob'})
def test_provision_raises_for_unknown_service(self):
with self.assertRaises(ValueError):
self.am.provision('doesnotexist', 'alice')
def test_provision_raises_when_service_has_no_accounts(self):
reg = _make_registry({'nosvc': {'id': 'nosvc', 'accounts': {}, 'config': {}}})
am = _make_am(self.tmp, registry=reg, email_manager=self.email_mgr)
with self.assertRaises(ValueError):
am.provision('nosvc', 'alice')
def test_provision_raises_when_manager_not_registered(self):
am = _make_am(self.tmp) # no managers passed
with self.assertRaises(ValueError):
am.provision('email', 'alice')
def test_provision_raises_runtime_error_when_manager_returns_false(self):
am = _make_am(self.tmp, email_manager=_make_email_mgr(ok=False))
with self.assertRaises(RuntimeError):
am.provision('email', 'alice')
def test_provision_email_raises_when_domain_not_configured(self):
reg = _make_registry({'email': {
'id': 'email', 'accounts': {'manager': 'email_manager'},
'config': {'domain': ''},
}})
am = _make_am(self.tmp, registry=reg, email_manager=self.email_mgr)
with self.assertRaises(ValueError):
am.provision('email', 'alice')
# ── Credential file permissions ───────────────────────────────────────────────
class TestCredentialFilePermissions(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.am = _make_am(self.tmp, email_manager=_make_email_mgr())
def test_credentials_file_created_with_0600(self):
self.am.provision('email', 'alice', password='pw')
creds_path = self.tmp / 'peer_service_credentials.json'
mode = stat.S_IMODE(creds_path.stat().st_mode)
self.assertEqual(mode, 0o600, f'Expected 0o600, got {oct(mode)}')
# ── Deprovision ───────────────────────────────────────────────────────────────
class TestDeprovision(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.email_mgr = _make_email_mgr()
self.cal_mgr = _make_cal_mgr()
self.file_mgr = _make_file_mgr()
self.am = _make_am(
self.tmp,
email_manager=self.email_mgr,
calendar_manager=self.cal_mgr,
file_manager=self.file_mgr,
)
self.am.provision('email', 'alice', password='pw')
def test_deprovision_email_calls_delete_email_user(self):
self.am.deprovision('email', 'alice')
self.email_mgr.delete_email_user.assert_called_once_with('alice', 'example.com')
def test_deprovision_removes_stored_credentials(self):
self.am.deprovision('email', 'alice')
self.assertIsNone(self.am.get_credentials('email', 'alice'))
def test_deprovision_returns_true_on_success(self):
ok = self.am.deprovision('email', 'alice')
self.assertTrue(ok)
def test_deprovision_raises_for_unknown_service(self):
with self.assertRaises(ValueError):
self.am.deprovision('ghost', 'alice')
def test_deprovision_removes_service_entry_when_last_peer_gone(self):
self.am.deprovision('email', 'alice')
creds_file = self.tmp / 'peer_service_credentials.json'
data = json.loads(creds_file.read_text())
self.assertNotIn('email', data)
def test_deprovision_calendar_calls_delete_calendar_user(self):
self.am.provision('calendar', 'alice', password='pw')
self.am.deprovision('calendar', 'alice')
self.cal_mgr.delete_calendar_user.assert_called_once_with('alice')
def test_deprovision_files_calls_delete_user(self):
self.am.provision('files', 'alice', password='pw')
self.am.deprovision('files', 'alice')
self.file_mgr.delete_user.assert_called_once_with('alice')
# ── Queries ───────────────────────────────────────────────────────────────────
class TestQueries(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.am = _make_am(
self.tmp,
email_manager=_make_email_mgr(),
calendar_manager=_make_cal_mgr(),
file_manager=_make_file_mgr(),
)
self.am.provision('email', 'alice', password='pw-alice-email')
self.am.provision('email', 'bob', password='pw-bob-email')
self.am.provision('calendar', 'alice', password='pw-alice-cal')
def test_get_credentials_returns_stored(self):
self.assertEqual(self.am.get_credentials('email', 'alice'), {'password': 'pw-alice-email'})
def test_get_credentials_returns_none_for_unknown_peer(self):
self.assertIsNone(self.am.get_credentials('email', 'nobody'))
def test_get_credentials_returns_none_for_unknown_service(self):
self.assertIsNone(self.am.get_credentials('ghost', 'alice'))
def test_list_accounts_returns_provisioned_peers(self):
accounts = self.am.list_accounts('email')
self.assertIn('alice', accounts)
self.assertIn('bob', accounts)
def test_list_accounts_empty_for_unprovisioned_service(self):
self.assertEqual(self.am.list_accounts('files'), [])
def test_list_peer_services_returns_all_services_for_peer(self):
services = self.am.list_peer_services('alice')
self.assertIn('email', services)
self.assertIn('calendar', services)
def test_list_peer_services_returns_empty_for_unknown_peer(self):
self.assertEqual(self.am.list_peer_services('nobody'), [])
def test_is_provisioned_true_when_account_exists(self):
self.assertTrue(self.am.is_provisioned('email', 'alice'))
def test_is_provisioned_false_when_no_account(self):
self.assertFalse(self.am.is_provisioned('email', 'nobody'))
def test_get_all_credentials_returns_all_services(self):
all_creds = self.am.get_all_credentials('alice')
self.assertIn('email', all_creds)
self.assertIn('calendar', all_creds)
self.assertEqual(all_creds['email'], {'password': 'pw-alice-email'})
def test_get_all_credentials_empty_for_unknown_peer(self):
self.assertEqual(self.am.get_all_credentials('nobody'), {})
# ── Bulk deprovision ──────────────────────────────────────────────────────────
class TestDeprovisionPeer(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.email_mgr = _make_email_mgr()
self.cal_mgr = _make_cal_mgr()
self.am = _make_am(
self.tmp,
email_manager=self.email_mgr,
calendar_manager=self.cal_mgr,
file_manager=_make_file_mgr(),
)
self.am.provision('email', 'alice', password='pw')
self.am.provision('calendar', 'alice', password='pw')
def test_deprovision_peer_removes_from_all_services(self):
self.am.deprovision_peer('alice')
self.assertIsNone(self.am.get_credentials('email', 'alice'))
self.assertIsNone(self.am.get_credentials('calendar', 'alice'))
def test_deprovision_peer_returns_results_dict(self):
results = self.am.deprovision_peer('alice')
self.assertIn('email', results)
self.assertIn('calendar', results)
self.assertTrue(results['email'])
self.assertTrue(results['calendar'])
def test_deprovision_peer_continues_after_one_service_fails(self):
self.email_mgr.delete_email_user.side_effect = RuntimeError('smtp down')
results = self.am.deprovision_peer('alice')
self.assertFalse(results.get('email'))
# calendar should still succeed even though email failed
self.assertTrue(results.get('calendar'))
def test_deprovision_peer_no_op_for_unknown_peer(self):
results = self.am.deprovision_peer('nobody')
self.assertEqual(results, {})
# ── Direct credential storage ─────────────────────────────────────────────────
class TestStoreCredentials(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.am = _make_am(self.tmp)
def test_store_credentials_makes_them_retrievable(self):
self.am.store_credentials('email', 'alice', {'password': 'mypassword'})
self.assertEqual(self.am.get_credentials('email', 'alice'), {'password': 'mypassword'})
def test_store_credentials_overwrites_existing(self):
self.am.store_credentials('email', 'alice', {'password': 'old'})
self.am.store_credentials('email', 'alice', {'password': 'new'})
self.assertEqual(self.am.get_credentials('email', 'alice'), {'password': 'new'})
def test_store_credentials_creates_file_with_0600(self):
self.am.store_credentials('email', 'alice', {'password': 'pw'})
creds_path = self.tmp / 'peer_service_credentials.json'
mode = stat.S_IMODE(creds_path.stat().st_mode)
self.assertEqual(mode, 0o600)
# ── Thread safety ─────────────────────────────────────────────────────────────
class TestThreadSafety(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.am = _make_am(self.tmp)
def test_concurrent_store_credentials_no_data_loss(self):
errors = []
def worker(peer_name):
try:
self.am.store_credentials('email', peer_name, {'password': f'pw-{peer_name}'})
except Exception as e:
errors.append(e)
threads = [threading.Thread(target=worker, args=(f'peer{i}',)) for i in range(20)]
for t in threads:
t.start()
for t in threads:
t.join()
self.assertEqual(errors, [])
accounts = self.am.list_accounts('email')
self.assertEqual(len(accounts), 20)
class TestEdgeCases(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.email_mgr = _make_email_mgr()
self.am = _make_am(self.tmp, email_manager=self.email_mgr,
calendar_manager=_make_cal_mgr(),
file_manager=_make_file_mgr())
def test_deprovision_peer_never_provisioned_returns_empty(self):
self.assertEqual(self.am.deprovision_peer('ghost'), {})
def test_deprovision_clears_credentials_even_when_manager_returns_false(self):
"""Credentials are removed even if underlying manager reports failure."""
self.am.provision('email', 'alice', password='pw')
self.email_mgr.delete_email_user.return_value = False
self.am.deprovision('email', 'alice')
self.assertIsNone(self.am.get_credentials('email', 'alice'))
def test_provision_twice_overwrites_credentials(self):
self.am.provision('email', 'alice', password='first')
self.am.provision('email', 'alice', password='second')
self.assertEqual(self.am.get_credentials('email', 'alice'), {'password': 'second'})
def test_provision_twice_calls_manager_both_times(self):
self.am.provision('email', 'alice', password='first')
self.am.provision('email', 'alice', password='second')
self.assertEqual(self.email_mgr.create_email_user.call_count, 2)
def test_corrupted_credentials_file_returns_empty_and_continues(self):
"""A corrupted JSON file is treated as empty rather than crashing."""
creds_path = self.tmp / 'peer_service_credentials.json'
creds_path.write_text('{invalid json}')
result = self.am.get_all_credentials('alice')
self.assertEqual(result, {})
def test_file_permissions_preserved_on_second_write(self):
"""0o600 must hold even after overwriting with a second provision."""
self.am.provision('email', 'alice', password='first')
self.am.provision('email', 'bob', password='second')
creds_path = self.tmp / 'peer_service_credentials.json'
mode = stat.S_IMODE(creds_path.stat().st_mode)
self.assertEqual(mode, 0o600, f'Expected 0o600 after overwrite, got {oct(mode)}')
def test_generated_password_is_url_safe(self):
"""token_urlsafe must not produce + or / characters."""
creds = self.am.provision('email', 'alice')
pwd = creds['password']
self.assertNotIn('+', pwd)
self.assertNotIn('/', pwd)
def test_store_then_deprovision_removes_credentials(self):
"""store_credentials + deprovision should cleanly remove the entry."""
self.am.store_credentials('email', 'alice', {'password': 'stored'})
self.am.deprovision('email', 'alice')
self.assertIsNone(self.am.get_credentials('email', 'alice'))
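Several tests above require the credentials file to have mode 0o600 on first creation and after every rewrite. One way to satisfy both cases is sketched here; it is an assumption about how such a write could be done on a POSIX system, not the actual `AccountManager` code, and `write_private_json` and `file_mode` are hypothetical names.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_private_json(path: Path, data: dict) -> None:
    """Write data as JSON to a file readable and writable by the owner only."""
    # Passing 0o600 to os.open means a new file is never world-readable,
    # not even briefly between creation and a later chmod.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        # The mode argument is ignored for an existing file, so tighten it too.
        os.fchmod(fh.fileno(), 0o600)
        json.dump(data, fh)


def file_mode(path: Path) -> int:
    """Return only the permission bits of path."""
    return stat.S_IMODE(path.stat().st_mode)


if __name__ == "__main__":
    target = Path(tempfile.mkdtemp()) / "peer_service_credentials.json"
    write_private_json(target, {"email": {"alice": {"password": "pw"}}})
    print(oct(file_mode(target)))
```

The explicit `os.fchmod` covers the overwrite case exercised by `test_file_permissions_preserved_on_second_write`, where the file may already exist with looser permissions.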
# ── HTTP dispatch (manager == "http") ─────────────────────────────────────────
class TestHttpDispatch(unittest.TestCase):
"""AccountManager with manager='http' uses HTTP POST/DELETE to the service backend."""
def _make_http_registry(self, backend='cell-myapp:8080'):
reg = MagicMock()
reg.get.return_value = {
'id': 'myapp',
'backend': backend,
'accounts': {'manager': 'http', 'credentials': ['password']},
}
return reg
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.am = _make_am(self.tmp, registry=self._make_http_registry())
def test_provision_http_posts_to_service_api(self):
with patch('account_manager._requests') as mock_req:
mock_req.post.return_value = MagicMock(status_code=201)
creds = self.am.provision('myapp', 'alice', password='s3cret')
mock_req.post.assert_called_once_with(
'http://cell-myapp:8080/service-api/accounts',
json={'username': 'alice', 'password': 's3cret'},
timeout=10,
)
self.assertEqual(creds['password'], 's3cret')
def test_provision_http_stores_credentials_on_success(self):
with patch('account_manager._requests') as mock_req:
mock_req.post.return_value = MagicMock(status_code=200)
self.am.provision('myapp', 'alice', password='pw')
self.assertEqual(self.am.get_credentials('myapp', 'alice'), {'password': 'pw'})
def test_provision_http_raises_on_non_2xx(self):
with patch('account_manager._requests') as mock_req:
mock_req.post.return_value = MagicMock(status_code=409, text='conflict')
with self.assertRaises(RuntimeError):
self.am.provision('myapp', 'alice', password='pw')
def test_provision_http_raises_on_request_exception(self):
with patch('account_manager._requests') as mock_req:
mock_req.post.side_effect = Exception('connection refused')
with self.assertRaises(RuntimeError):
self.am.provision('myapp', 'alice', password='pw')
def test_deprovision_http_deletes_to_service_api(self):
self.am.store_credentials('myapp', 'alice', {'password': 'pw'})
with patch('account_manager._requests') as mock_req:
mock_req.delete.return_value = MagicMock(status_code=204)
ok = self.am.deprovision('myapp', 'alice')
mock_req.delete.assert_called_once_with(
'http://cell-myapp:8080/service-api/accounts/alice',
timeout=10,
)
self.assertTrue(ok)
def test_deprovision_http_treats_404_as_success(self):
"""404 means already deleted — still a clean deprovision."""
self.am.store_credentials('myapp', 'alice', {'password': 'pw'})
with patch('account_manager._requests') as mock_req:
mock_req.delete.return_value = MagicMock(status_code=404)
ok = self.am.deprovision('myapp', 'alice')
self.assertTrue(ok)
def test_deprovision_http_removes_stored_credentials(self):
self.am.store_credentials('myapp', 'alice', {'password': 'pw'})
with patch('account_manager._requests') as mock_req:
mock_req.delete.return_value = MagicMock(status_code=204)
self.am.deprovision('myapp', 'alice')
self.assertIsNone(self.am.get_credentials('myapp', 'alice'))
def test_resolve_service_http_does_not_require_python_manager(self):
"""manager='http' must not raise even with no named managers passed."""
am = AccountManager(
service_registry=self._make_http_registry(),
data_dir=str(self.tmp),
)
svc, manager_name, manager = am._resolve_service('myapp')
self.assertEqual(manager_name, 'http')
self.assertIsNone(manager)
def test_http_base_url_raises_when_no_backend(self):
svc = {'id': 'nobackend', 'backend': ''}
with self.assertRaises(ValueError):
AccountManager._http_base_url(svc)
if __name__ == '__main__':
unittest.main()
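The HTTP-dispatch tests above pin down a small contract: provisioning POSTs the username and password to `http://<backend>/service-api/accounts`, any non-2xx reply or transport error surfaces as `RuntimeError`, and deprovisioning treats 404 as success. A minimal sketch of that contract follows, assuming an injected `http` object with `requests`-style `post`/`delete` methods. The helper names `http_base_url`, `provision_http`, and `deprovision_http` are illustrative only and are not the project's actual implementation.

```python
from unittest.mock import MagicMock


def http_base_url(svc: dict) -> str:
    """Build the base URL for a service; an empty backend is a config error."""
    backend = (svc.get('backend') or '').strip()
    if not backend:
        raise ValueError(f"service {svc.get('id')!r} has no backend")
    return f'http://{backend}'


def provision_http(http, svc: dict, username: str, password: str) -> dict:
    """POST the new account; raise RuntimeError on any failure."""
    url = f'{http_base_url(svc)}/service-api/accounts'
    try:
        resp = http.post(
            url,
            json={'username': username, 'password': password},
            timeout=10,
        )
    except Exception as exc:
        # Transport errors are normalised to RuntimeError, as the tests expect.
        raise RuntimeError(f'provision request failed: {exc}') from exc
    if not 200 <= resp.status_code < 300:
        raise RuntimeError(f'provision rejected: HTTP {resp.status_code}')
    return {'password': password}


def deprovision_http(http, svc: dict, username: str) -> bool:
    """DELETE the account; 404 counts as success (already gone)."""
    url = f'{http_base_url(svc)}/service-api/accounts/{username}'
    resp = http.delete(url, timeout=10)
    # Assumption: any other non-2xx status is reported as False, not raised.
    return 200 <= resp.status_code < 300 or resp.status_code == 404


# Demo with a stand-in HTTP client, so no network access is needed.
svc = {'id': 'myapp', 'backend': 'cell-myapp:8080'}
http = MagicMock()
http.post.return_value = MagicMock(status_code=201)
creds = provision_http(http, svc, 'alice', 's3cret')

http.delete.return_value = MagicMock(status_code=404)
gone = deprovision_http(http, svc, 'alice')
```

Passing the client in as a parameter keeps the sketch testable without patching a module attribute; the tests above instead patch `account_manager._requests`.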
@@ -76,13 +76,44 @@ class TestAPIEndpoints(unittest.TestCase):
"""Test get config endpoint"""
response = self.client.get('/api/config')
self.assertEqual(response.status_code, 200)
data = json.loads(response.data)
self.assertIn('cell_name', data)
self.assertIn('domain', data)
self.assertIn('ip_range', data)
self.assertIn('wireguard_port', data)
+self.assertIn('installed_services', data)
+def test_get_config_installed_services_is_dict(self):
+"""installed_services must be a dict, never a list or primitive"""
+response = self.client.get('/api/config')
+self.assertEqual(response.status_code, 200)
+data = json.loads(response.data)
+self.assertIsInstance(data['installed_services'], dict)
+def test_get_config_installed_services_empty_when_none_installed(self):
+"""installed_services defaults to empty dict when no services are installed"""
+response = self.client.get('/api/config')
+self.assertEqual(response.status_code, 200)
+data = json.loads(response.data)
+# Fresh test environment has no installed services
+self.assertEqual(data['installed_services'], {})
+def test_get_config_installed_services_reflects_stored_value(self):
+"""installed_services in GET /api/config reflects what config_manager returns"""
+from app import config_manager
+config_manager.configs.setdefault('_identity', {})['installed_services'] = {
+'mailserver': {'status': 'running', 'installed_at': '2026-01-01T00:00:00'}
+}
+try:
+response = self.client.get('/api/config')
+self.assertEqual(response.status_code, 200)
+data = json.loads(response.data)
+self.assertIn('mailserver', data['installed_services'])
+self.assertEqual(data['installed_services']['mailserver']['status'], 'running')
+finally:
+config_manager.configs.get('_identity', {}).pop('installed_services', None)
def test_update_config_endpoint(self):
"""Test update config endpoint"""
update_data = {'cell_name': 'newcell'}
@@ -129,37 +160,6 @@ class TestAPIEndpoints(unittest.TestCase):
response = self.client.delete('/api/dns/records', data=json.dumps({'name': 'test'}), content_type='application/json')
self.assertEqual(response.status_code, 500)
-@patch('app.network_manager')
-def test_dhcp_endpoints(self, mock_network):
-# Mock get_dhcp_leases
-mock_network.get_dhcp_leases.return_value = [{'ip': '10.0.0.2', 'mac': '00:11:22:33:44:55'}]
-response = self.client.get('/api/dhcp/leases')
-self.assertEqual(response.status_code, 200)
-data = json.loads(response.data)
-self.assertIsInstance(data, list)
-# Mock add_dhcp_reservation
-mock_network.add_dhcp_reservation.return_value = True
-response = self.client.post('/api/dhcp/reservations', data=json.dumps({'ip': '10.0.0.2', 'mac': '00:11:22:33:44:55'}), content_type='application/json')
-self.assertEqual(response.status_code, 200)
-# Missing mac field → 400, not 500
-response = self.client.post('/api/dhcp/reservations', data=json.dumps({'ip': '10.0.0.2'}), content_type='application/json')
-self.assertEqual(response.status_code, 400)
-# Simulate manager error
-mock_network.add_dhcp_reservation.side_effect = Exception('fail')
-response = self.client.post('/api/dhcp/reservations', data=json.dumps({'ip': '10.0.0.2', 'mac': '00:11:22:33:44:55'}), content_type='application/json')
-self.assertEqual(response.status_code, 500)
-# Mock remove_dhcp_reservation
-mock_network.remove_dhcp_reservation.return_value = True
-response = self.client.delete('/api/dhcp/reservations', data=json.dumps({'mac': '00:11:22:33:44:55'}), content_type='application/json')
-self.assertEqual(response.status_code, 200)
-# Missing mac → 400
-response = self.client.delete('/api/dhcp/reservations', data=json.dumps({'ip': '10.0.0.2'}), content_type='application/json')
-self.assertEqual(response.status_code, 400)
-# Simulate manager error
-mock_network.remove_dhcp_reservation.side_effect = Exception('fail')
-response = self.client.delete('/api/dhcp/reservations', data=json.dumps({'mac': '00:11:22:33:44:55'}), content_type='application/json')
-self.assertEqual(response.status_code, 500)
@patch('app.network_manager')
def test_ntp_status_endpoint(self, mock_network):
# Mock get_ntp_status
@@ -362,10 +362,12 @@ class TestAPIEndpoints(unittest.TestCase):
self.assertEqual(response.status_code, 500)
mock_peers.update_peer_ip.side_effect = None
+@patch('app.service_registry')
@patch('app.email_manager')
-def test_email_endpoints(self, mock_email):
+def test_email_endpoints(self, mock_email, mock_sr):
+mock_sr.get.return_value = {'id': 'email', 'installed': True}
# Ensure all relevant mock methods return JSON-serializable values
-mock_email.get_users.return_value = [{'username': 'user1', 'domain': 'cell', 'email': 'user1@cell'}]
+mock_email.get_email_users.return_value = [{'username': 'user1', 'domain': 'cell', 'email': 'user1@cell'}]
mock_email.create_email_user.return_value = True
mock_email.delete_email_user.return_value = True
mock_email.get_status.return_value = {'postfix_running': True, 'dovecot_running': True, 'total_users': 1, 'total_size_bytes': 0, 'total_size_mb': 0.0, 'users': [{'username': 'user1', 'domain': 'cell', 'email': 'user1@cell'}]}
@@ -376,10 +378,10 @@ class TestAPIEndpoints(unittest.TestCase):
response = self.client.get('/api/email/users')
self.assertEqual(response.status_code, 200)
self.assertIsInstance(json.loads(response.data), list)
-mock_email.get_users.side_effect = Exception('fail')
+mock_email.get_email_users.side_effect = Exception('fail')
response = self.client.get('/api/email/users')
self.assertEqual(response.status_code, 500)
-mock_email.get_users.side_effect = None
+mock_email.get_email_users.side_effect = None
# /api/email/users (POST)
response = self.client.post('/api/email/users', data=json.dumps({'username': 'user1', 'domain': 'cell', 'password': 'pw'}), content_type='application/json')
self.assertEqual(response.status_code, 200)
@@ -423,8 +425,10 @@ class TestAPIEndpoints(unittest.TestCase):
self.assertEqual(response.status_code, 500)
mock_email.get_mailbox_info.side_effect = None
+@patch('app.service_registry')
@patch('app.calendar_manager')
-def test_calendar_endpoints(self, mock_calendar):
+def test_calendar_endpoints(self, mock_calendar, mock_sr):
+mock_sr.get.return_value = {'id': 'calendar', 'installed': True}
# Mock return values for all relevant calendar_manager methods
mock_calendar.get_users.return_value = [{'username': 'user1', 'collections': {'calendars': ['cal1'], 'contacts': ['c1']}}]
mock_calendar.create_calendar_user.return_value = True
@@ -492,8 +496,10 @@ class TestAPIEndpoints(unittest.TestCase):
self.assertEqual(response.status_code, 500)
mock_calendar.test_connectivity.side_effect = None
+@patch('app.service_registry')
@patch('app.file_manager')
-def test_file_endpoints(self, mock_file):
+def test_file_endpoints(self, mock_file, mock_sr):
+mock_sr.get.return_value = {'id': 'files', 'installed': True}
# Mock return values for all relevant file_manager methods
mock_file.get_users.return_value = [{'username': 'user1', 'storage_info': {'total_files': 1, 'total_size_bytes': 1000}}]
mock_file.create_user.return_value = True
@@ -0,0 +1,669 @@
"""
Tests for app.py: health_history (deque), health monitor logic,
connectivity endpoints, caddy endpoints, egress endpoints,
and before-request hooks (enforce_setup/enforce_auth/check_csrf).
"""
import sys
from pathlib import Path
import json
from collections import deque
from unittest.mock import patch, MagicMock
import pytest
sys.path.insert(0, str(Path(__file__).parent.parent / 'api'))
import app as app_module
from app import app
@pytest.fixture(autouse=True)
def reset_app_state():
"""Reset global mutable state between tests."""
orig_running = app_module.health_monitor_running
orig_counters = dict(app_module.service_alert_counters)
app.config['TESTING'] = True
yield
app_module.health_monitor_running = orig_running
app_module.service_alert_counters = orig_counters
@pytest.fixture
def client():
app.config['TESTING'] = True
with app.test_client() as c:
yield c
# ---------------------------------------------------------------------------
# health_history is a deque (not a list)
# ---------------------------------------------------------------------------
class TestHealthHistoryIsDeque:
def test_health_history_is_deque(self):
assert isinstance(app_module.health_history, deque)
def test_health_history_has_maxlen(self):
assert app_module.health_history.maxlen == app_module.HEALTH_HISTORY_SIZE
def test_health_history_appendleft_works(self):
"""appendleft (used in health_monitor_loop) should work on a deque."""
hh = app_module.health_history
entry = {'timestamp': '2026-01-01T00:00:00', 'alerts': []}
hh.appendleft(entry)
assert hh[0] == entry
def test_health_history_maxlen_evicts_old_entries(self):
hh = deque(maxlen=3)
for i in range(5):
hh.appendleft({'n': i})
assert len(hh) == 3
# Most recent is first
assert hh[0]['n'] == 4
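The eviction checked above is standard `collections.deque` behavior: with `maxlen` set, `appendleft` on a full deque discards from the right end, so index 0 is always the newest entry. A small illustration (the size 3 and the `n` key are arbitrary):

```python
from collections import deque

# A bounded history: newest entries go on the left, oldest fall off the right.
history = deque(maxlen=3)
for n in range(5):
    history.appendleft({'n': n})

# Entries 0 and 1 were evicted once the deque was full.
newest_first = [entry['n'] for entry in history]
```

This is why a bounded deque suits a rolling health history: no manual trimming is needed after each append.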
# ---------------------------------------------------------------------------
# startup regenerates the Caddyfile (stale-Caddyfile restart-loop fix)
# ---------------------------------------------------------------------------
class TestStartupCaddyRegen:
def test_startup_regenerates_caddyfile_first(self):
"""_apply_startup_enforcement must regenerate the Caddyfile before
anything else, so a stale on-disk Caddyfile (e.g. missing
`admin 0.0.0.0:2019`) can't wedge the health monitor into restarting
Caddy every few minutes."""
with patch.object(app_module, 'caddy_manager') as mock_caddy, \
patch.object(app_module, 'peer_registry') as mock_pr:
# Raise right after the caddy regen to short-circuit the rest of
# the (heavy, docker/iptables) startup work.
mock_pr.list_peers.side_effect = RuntimeError('stop here')
app_module._apply_startup_enforcement()
mock_caddy.regenerate_with_installed.assert_called_once_with([])
# ---------------------------------------------------------------------------
# GET /api/health/history
# ---------------------------------------------------------------------------
class TestGetHealthHistory:
def test_returns_200(self, client):
with patch.object(app_module, 'health_history', deque(maxlen=100)):
resp = client.get('/api/health/history')
assert resp.status_code == 200
def test_returns_list(self, client):
with patch.object(app_module, 'health_history', deque(maxlen=100)):
resp = client.get('/api/health/history')
data = json.loads(resp.data)
assert isinstance(data, list)
def test_returns_stored_entries(self, client):
hh = deque(maxlen=100)
hh.appendleft({'timestamp': 't1', 'alerts': []})
hh.appendleft({'timestamp': 't2', 'alerts': []})
with patch.object(app_module, 'health_history', hh):
resp = client.get('/api/health/history')
data = json.loads(resp.data)
assert len(data) == 2
def test_returns_empty_when_no_history(self, client):
with patch.object(app_module, 'health_history', deque(maxlen=100)):
resp = client.get('/api/health/history')
assert json.loads(resp.data) == []
# ---------------------------------------------------------------------------
# POST /api/health/history/clear
# ---------------------------------------------------------------------------
class TestClearHealthHistory:
def test_clear_returns_200(self, client):
hh = deque(maxlen=100)
hh.appendleft({'entry': 1})
with patch.object(app_module, 'health_history', hh):
resp = client.post('/api/health/history/clear')
assert resp.status_code == 200
def test_clear_empties_history(self, client):
hh = deque(maxlen=100)
hh.appendleft({'entry': 1})
with patch.object(app_module, 'health_history', hh):
client.post('/api/health/history/clear')
assert len(hh) == 0
def test_clear_resets_alert_counters(self, client):
app_module.service_alert_counters['network'] = 5
hh = deque(maxlen=100)
with patch.object(app_module, 'health_history', hh):
client.post('/api/health/history/clear')
assert app_module.service_alert_counters == {}
def test_clear_response_has_message(self, client):
hh = deque(maxlen=100)
with patch.object(app_module, 'health_history', hh):
resp = client.post('/api/health/history/clear')
data = json.loads(resp.data)
assert 'message' in data
# ---------------------------------------------------------------------------
# perform_health_check alerting logic
# ---------------------------------------------------------------------------
class TestPerformHealthCheck:
def test_healthy_service_resets_counter(self):
app_module.service_alert_counters['network'] = 2
mock_service_bus = MagicMock()
mock_service_bus.list_services.return_value = ['network']
network_svc = MagicMock()
network_svc.health_check.return_value = {'running': True}
mock_service_bus.get_service.return_value = network_svc
mock_cfg = MagicMock()
mock_cfg.get_installed_services.return_value = []
with patch.object(app_module, 'service_bus', mock_service_bus), \
patch.object(app_module, 'config_manager', mock_cfg), \
app.app_context():
result = app_module.perform_health_check()
assert app_module.service_alert_counters.get('network', 0) == 0
assert 'network' in result
def test_unhealthy_service_with_error_key_increments_counter(self):
"""Services that raise an exception get recorded with an 'error' key,
which the alerting logic recognises as unhealthy."""
app_module.service_alert_counters = {}
mock_service_bus = MagicMock()
mock_service_bus.list_services.return_value = ['network']
mock_service_bus.publish_event = MagicMock()
network_svc = MagicMock()
# Raise so the result gets {'error': ..., 'status': 'offline'}
network_svc.health_check.side_effect = Exception('container down')
mock_service_bus.get_service.return_value = network_svc
mock_cfg = MagicMock()
mock_cfg.get_installed_services.return_value = []
with patch.object(app_module, 'service_bus', mock_service_bus), \
patch.object(app_module, 'config_manager', mock_cfg), \
app.app_context():
app_module.perform_health_check()
# With an 'error' key and no 'running' key, healthy=False → counter increments
assert app_module.service_alert_counters.get('network', 0) == 1
def test_alert_triggered_at_threshold(self):
"""Counter reaching HEALTH_ALERT_THRESHOLD emits an alert."""
app_module.service_alert_counters = {'network': app_module.HEALTH_ALERT_THRESHOLD - 1}
mock_service_bus = MagicMock()
mock_service_bus.list_services.return_value = ['network']
mock_service_bus.publish_event = MagicMock()
network_svc = MagicMock()
# Use exception path to guarantee healthy=False
network_svc.health_check.side_effect = Exception('container down')
mock_service_bus.get_service.return_value = network_svc
mock_cfg = MagicMock()
mock_cfg.get_installed_services.return_value = []
with patch.object(app_module, 'service_bus', mock_service_bus), \
patch.object(app_module, 'config_manager', mock_cfg), \
app.app_context():
result = app_module.perform_health_check()
# Alert should be in result['alerts']
assert len(result['alerts']) >= 1
assert any('network' in a for a in result['alerts'])
def test_optional_store_services_skipped_when_not_installed(self):
mock_service_bus = MagicMock()
mock_service_bus.list_services.return_value = ['email_manager']
mock_cfg = MagicMock()
mock_cfg.get_installed_services.return_value = [] # email not installed
with patch.object(app_module, 'service_bus', mock_service_bus), \
patch.object(app_module, 'config_manager', mock_cfg), \
app.app_context():
result = app_module.perform_health_check()
# email_manager should not appear in result (was skipped)
assert 'email_manager' not in result
def test_optional_store_service_checked_when_installed(self):
mock_service_bus = MagicMock()
mock_service_bus.list_services.return_value = ['email_manager']
mock_service_bus.publish_event = MagicMock()
email_svc = MagicMock()
email_svc.health_check.return_value = {'running': True}
mock_service_bus.get_service.return_value = email_svc
mock_cfg = MagicMock()
mock_cfg.get_installed_services.return_value = ['email'] # email installed
with patch.object(app_module, 'service_bus', mock_service_bus), \
patch.object(app_module, 'config_manager', mock_cfg), \
app.app_context():
result = app_module.perform_health_check()
assert 'email_manager' in result
def test_service_without_health_check_falls_back_to_get_status(self):
mock_service_bus = MagicMock()
mock_service_bus.list_services.return_value = ['routing']
svc = MagicMock(spec=[]) # no health_check attribute
svc.get_status = MagicMock(return_value={'running': True})
mock_service_bus.get_service.return_value = svc
mock_cfg = MagicMock()
mock_cfg.get_installed_services.return_value = []
with patch.object(app_module, 'service_bus', mock_service_bus), \
patch.object(app_module, 'config_manager', mock_cfg), \
app.app_context():
result = app_module.perform_health_check()
assert 'routing' in result
def test_service_exception_recorded_as_error(self):
mock_service_bus = MagicMock()
mock_service_bus.list_services.return_value = ['vault']
svc = MagicMock()
svc.health_check.side_effect = Exception('vault down')
mock_service_bus.get_service.return_value = svc
mock_cfg = MagicMock()
mock_cfg.get_installed_services.return_value = []
with patch.object(app_module, 'service_bus', mock_service_bus), \
patch.object(app_module, 'config_manager', mock_cfg), \
app.app_context():
result = app_module.perform_health_check()
assert 'error' in result.get('vault', {})
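Taken together, the tests in this class describe a per-service failure counter: a healthy result resets it, an unhealthy one increments it, and an alert is emitted once the counter reaches the threshold. A rough sketch of that logic follows, assuming a threshold of 3, a health rule of "no `error` key and a truthy `running`", and a hypothetical `update_alerts` helper. The real `perform_health_check` and the actual `HEALTH_ALERT_THRESHOLD` value may differ.

```python
ALERT_THRESHOLD = 3  # assumed value, for illustration only


def update_alerts(counters: dict, results: dict) -> list:
    """Update consecutive-failure counters in place and return new alerts."""
    alerts = []
    for name, status in results.items():
        # Assumed health rule: no 'error' key and a truthy 'running' flag.
        healthy = 'error' not in status and bool(status.get('running'))
        if healthy:
            counters[name] = 0
            continue
        counters[name] = counters.get(name, 0) + 1
        if counters[name] >= ALERT_THRESHOLD:
            alerts.append(
                f'{name} unhealthy for {counters[name]} consecutive checks'
            )
    return alerts


counters = {'network': 2, 'vault': 1}
alerts = update_alerts(counters, {
    'network': {'error': 'container down', 'status': 'offline'},
    'vault': {'running': True},
})
```

Counting consecutive failures rather than alerting on the first one avoids noise from a single slow or flaky check.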
# ---------------------------------------------------------------------------
# GET /api/connectivity/status
# ---------------------------------------------------------------------------
class TestConnectivityEndpoints:
def test_connectivity_status_200(self, client):
mock_cm = MagicMock()
mock_cm.get_status.return_value = {'exits': [], 'peers': {}}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.get('/api/connectivity/status')
assert resp.status_code == 200
def test_connectivity_status_shape(self, client):
mock_cm = MagicMock()
mock_cm.get_status.return_value = {'exits': [], 'peers': {}}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.get('/api/connectivity/status')
data = json.loads(resp.data)
assert 'exits' in data
def test_connectivity_status_500_on_exception(self, client):
mock_cm = MagicMock()
mock_cm.get_status.side_effect = Exception('fail')
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.get('/api/connectivity/status')
assert resp.status_code == 500
def test_connectivity_list_exits_200(self, client):
mock_cm = MagicMock()
mock_cm.list_exits.return_value = []
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.get('/api/connectivity/exits')
assert resp.status_code == 200
def test_connectivity_list_exits_shape(self, client):
mock_cm = MagicMock()
mock_cm.list_exits.return_value = [{'type': 'wireguard_ext', 'name': 'exit1'}]
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.get('/api/connectivity/exits')
data = json.loads(resp.data)
assert 'exits' in data
assert len(data['exits']) == 1
def test_connectivity_upload_wireguard_missing_conf_text(self, client):
resp = client.post('/api/connectivity/exits/wireguard',
data=json.dumps({}), content_type='application/json')
assert resp.status_code == 400
data = json.loads(resp.data)
assert 'error' in data
def test_connectivity_upload_wireguard_empty_conf_text(self, client):
resp = client.post('/api/connectivity/exits/wireguard',
data=json.dumps({'conf_text': ' '}),
content_type='application/json')
assert resp.status_code == 400
def test_connectivity_upload_wireguard_success(self, client):
mock_cm = MagicMock()
mock_cm.upload_wireguard_ext.return_value = {'ok': True, 'message': 'Uploaded'}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.post('/api/connectivity/exits/wireguard',
data=json.dumps({'conf_text': '[Interface]\nPrivateKey = abc\n'}),
content_type='application/json')
assert resp.status_code == 200
def test_connectivity_upload_wireguard_failure(self, client):
mock_cm = MagicMock()
mock_cm.upload_wireguard_ext.return_value = {'ok': False, 'error': 'bad config'}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.post('/api/connectivity/exits/wireguard',
data=json.dumps({'conf_text': '[Interface]\nPrivateKey = abc\n'}),
content_type='application/json')
assert resp.status_code == 400
def test_connectivity_upload_openvpn_missing_ovpn_text(self, client):
resp = client.post('/api/connectivity/exits/openvpn',
data=json.dumps({}), content_type='application/json')
assert resp.status_code == 400
def test_connectivity_upload_openvpn_success(self, client):
mock_cm = MagicMock()
mock_cm.upload_openvpn.return_value = {'ok': True}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.post('/api/connectivity/exits/openvpn',
data=json.dumps({'ovpn_text': 'client\ndev tun\n'}),
content_type='application/json')
assert resp.status_code == 200
def test_connectivity_apply_routes_200(self, client):
mock_cm = MagicMock()
mock_cm.apply_routes.return_value = {'ok': True, 'applied': 0}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.post('/api/connectivity/exits/apply',
content_type='application/json')
assert resp.status_code == 200
def test_connectivity_set_peer_exit_missing_exit_via(self, client):
resp = client.put('/api/connectivity/peers/alice/exit',
data=json.dumps({}), content_type='application/json')
assert resp.status_code == 400
def test_connectivity_set_peer_exit_success(self, client):
mock_cm = MagicMock()
mock_cm.set_peer_exit.return_value = {'ok': True}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.put('/api/connectivity/peers/alice/exit',
data=json.dumps({'exit_via': 'wireguard_ext'}),
content_type='application/json')
assert resp.status_code == 200
def test_connectivity_set_peer_exit_failure(self, client):
mock_cm = MagicMock()
mock_cm.set_peer_exit.return_value = {'ok': False, 'error': 'not found'}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.put('/api/connectivity/peers/alice/exit',
data=json.dumps({'exit_via': 'wireguard_ext'}),
content_type='application/json')
assert resp.status_code == 400
def test_connectivity_get_peer_exits_200(self, client):
mock_cm = MagicMock()
mock_cm.get_peer_exits.return_value = {'alice': 'wireguard_ext'}
with patch.object(app_module, 'connectivity_manager', mock_cm):
resp = client.get('/api/connectivity/peers')
assert resp.status_code == 200
data = json.loads(resp.data)
assert 'peers' in data
# ---------------------------------------------------------------------------
# GET /api/caddy/cert-status and POST /api/caddy/cert-renew
# ---------------------------------------------------------------------------
class TestCaddyEndpoints:
def test_caddy_cert_status_200(self, client):
mock_caddy = MagicMock()
mock_caddy.get_cert_status_fresh.return_value = {'status': 'valid', 'days_remaining': 60}
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.get('/api/caddy/cert-status')
assert resp.status_code == 200
def test_caddy_cert_status_shape(self, client):
mock_caddy = MagicMock()
mock_caddy.get_cert_status_fresh.return_value = {'status': 'internal'}
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.get('/api/caddy/cert-status')
data = json.loads(resp.data)
assert 'status' in data
def test_caddy_cert_status_500_on_exception(self, client):
mock_caddy = MagicMock()
mock_caddy.get_cert_status_fresh.side_effect = Exception('Caddy unreachable')
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.get('/api/caddy/cert-status')
assert resp.status_code == 500
def test_caddy_cert_renew_success(self, client):
mock_caddy = MagicMock()
mock_caddy.renew_cert.return_value = {'ok': True, 'status': 'pending'}
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.post('/api/caddy/cert-renew',
content_type='application/json')
assert resp.status_code == 200
def test_caddy_cert_renew_failure(self, client):
mock_caddy = MagicMock()
mock_caddy.renew_cert.return_value = {'ok': False, 'error': 'LAN mode'}
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.post('/api/caddy/cert-renew',
content_type='application/json')
assert resp.status_code == 400
def test_caddy_cert_renew_500_on_exception(self, client):
mock_caddy = MagicMock()
mock_caddy.renew_cert.side_effect = Exception('fail')
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.post('/api/caddy/cert-renew',
content_type='application/json')
assert resp.status_code == 500
def test_caddy_upload_custom_cert_missing_fields(self, client):
resp = client.post('/api/caddy/custom-cert',
data=json.dumps({}), content_type='application/json')
assert resp.status_code == 400
def test_caddy_upload_custom_cert_success(self, client):
mock_caddy = MagicMock()
mock_caddy.upload_custom_cert.return_value = {'ok': True}
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.post('/api/caddy/custom-cert',
data=json.dumps({'cert_pem': 'CERT', 'key_pem': 'KEY'}),
content_type='application/json')
assert resp.status_code == 200
def test_caddy_upload_custom_cert_failure(self, client):
mock_caddy = MagicMock()
mock_caddy.upload_custom_cert.return_value = {'ok': False, 'error': 'invalid cert'}
with patch.object(app_module, 'caddy_manager', mock_caddy):
resp = client.post('/api/caddy/custom-cert',
data=json.dumps({'cert_pem': 'BAD', 'key_pem': 'BAD'}),
content_type='application/json')
assert resp.status_code == 422
# ---------------------------------------------------------------------------
# GET /api/egress/status and PUT /api/egress/services/<id>/exit
# ---------------------------------------------------------------------------
class TestEgressEndpoints:
def test_egress_status_200(self, client):
mock_egress = MagicMock()
mock_egress.get_status.return_value = {'services': {}}
with patch('app.egress_manager', mock_egress, create=True):
resp = client.get('/api/egress/status')
assert resp.status_code == 200
def test_egress_status_500_on_exception(self, client):
mock_egress = MagicMock()
mock_egress.get_status.side_effect = Exception('fail')
with patch('app.egress_manager', mock_egress, create=True):
resp = client.get('/api/egress/status')
assert resp.status_code == 500
def test_egress_set_service_exit_missing_exit_type(self, client):
mock_egress = MagicMock()
with patch('app.egress_manager', mock_egress, create=True):
resp = client.put('/api/egress/services/email/exit',
data=json.dumps({}), content_type='application/json')
assert resp.status_code == 400
def test_egress_set_service_exit_success(self, client):
mock_egress = MagicMock()
mock_egress.set_service_exit.return_value = {'ok': True}
with patch('app.egress_manager', mock_egress, create=True):
resp = client.put('/api/egress/services/email/exit',
data=json.dumps({'exit_type': 'wireguard_ext'}),
content_type='application/json')
assert resp.status_code == 200
def test_egress_set_service_exit_failure(self, client):
mock_egress = MagicMock()
mock_egress.set_service_exit.return_value = {'ok': False, 'error': 'not found'}
with patch('app.egress_manager', mock_egress, create=True):
resp = client.put('/api/egress/services/email/exit',
data=json.dumps({'exit_type': 'wireguard_ext'}),
content_type='application/json')
assert resp.status_code == 400
# ---------------------------------------------------------------------------
# enforce_setup hook: returns 428 when setup is not complete
# ---------------------------------------------------------------------------
class TestEnforceSetupHook:
def test_428_when_setup_incomplete(self):
"""Without TESTING=True, API requests are blocked if setup is not done."""
app.config['TESTING'] = False
mock_setup = MagicMock()
mock_setup.is_setup_complete.return_value = False
try:
with patch.object(app_module, 'setup_manager', mock_setup):
with app.test_client() as c:
resp = c.get('/api/status')
assert resp.status_code == 428
data = json.loads(resp.data)
assert 'redirect' in data
finally:
app.config['TESTING'] = True
def test_setup_route_passes_when_incomplete(self):
"""Setup routes always pass through regardless of setup status."""
app.config['TESTING'] = False
mock_setup = MagicMock()
mock_setup.is_setup_complete.return_value = False
try:
with patch.object(app_module, 'setup_manager', mock_setup):
with app.test_client() as c:
resp = c.get('/api/setup/status')
# Should NOT be 428
assert resp.status_code != 428
finally:
app.config['TESTING'] = True
def test_health_passes_when_incomplete(self):
"""The /health endpoint always passes through."""
app.config['TESTING'] = False
mock_setup = MagicMock()
mock_setup.is_setup_complete.return_value = False
try:
with patch.object(app_module, 'setup_manager', mock_setup):
with app.test_client() as c:
resp = c.get('/health')
assert resp.status_code == 200
finally:
app.config['TESTING'] = True
def test_setup_complete_passes_through(self):
"""All routes pass through when setup is complete."""
app.config['TESTING'] = False
mock_setup = MagicMock()
mock_setup.is_setup_complete.return_value = True
mock_auth = MagicMock()
mock_auth.list_users.return_value = []
try:
with patch.object(app_module, 'setup_manager', mock_setup), \
patch.object(app_module, 'auth_manager', mock_auth):
with app.test_client() as c:
resp = c.get('/api/status')
assert resp.status_code != 428
finally:
app.config['TESTING'] = True
# ---------------------------------------------------------------------------
# enforce_auth hook: 503 when users file exists but is empty
# ---------------------------------------------------------------------------
class TestEnforceAuthHook:
def test_503_when_users_file_empty_and_readable(self, tmp_path):
"""Returns 503 when users file exists + readable but has no accounts."""
import tempfile, os
app.config['TESTING'] = False
users_file = tmp_path / 'auth_users.json'
users_file.write_text('[]') # file exists but no accounts
from auth_manager import AuthManager
real_auth = MagicMock(spec=AuthManager)
real_auth.list_users.return_value = []
real_auth._users_file = str(users_file)
mock_setup = MagicMock()
mock_setup.is_setup_complete.return_value = True
try:
with patch.object(app_module, 'auth_manager', real_auth), \
patch.object(app_module, 'setup_manager', mock_setup):
with app.test_client() as c:
resp = c.get('/api/status')
assert resp.status_code == 503
data = json.loads(resp.data)
assert 'error' in data
finally:
app.config['TESTING'] = True
def test_401_when_no_session_and_users_exist(self, tmp_path):
"""Returns 401 when users exist but no session cookie is set."""
app.config['TESTING'] = False
users_file = tmp_path / 'auth_users.json'
# If the users file did not exist, enforcement would be bypassed,
# so write a file that DOES contain a user.
import json as _json
users_file.write_text(_json.dumps([{'username': 'admin', 'role': 'admin'}]))
from auth_manager import AuthManager
real_auth = MagicMock(spec=AuthManager)
real_auth.list_users.return_value = [{'username': 'admin', 'role': 'admin'}]
real_auth._users_file = str(users_file)
mock_setup = MagicMock()
mock_setup.is_setup_complete.return_value = True
try:
with patch.object(app_module, 'auth_manager', real_auth), \
patch.object(app_module, 'setup_manager', mock_setup):
with app.test_client() as c:
# No login — no session
resp = c.get('/api/status')
assert resp.status_code == 401
finally:
app.config['TESTING'] = True
# ---------------------------------------------------------------------------
# GET /api/status
# ---------------------------------------------------------------------------
class TestGetCellStatus:
def test_returns_200(self, client):
mock_sb = MagicMock()
mock_sb.list_services.return_value = []
mock_pr = MagicMock()
mock_pr.list_peers.return_value = []
mock_cm = MagicMock()
mock_cm.configs = {'_identity': {'cell_name': 'test', 'domain': 'cell'}}
mock_cm.get_effective_domain.return_value = 'cell'
with patch.object(app_module, 'service_bus', mock_sb), \
patch.object(app_module, 'peer_registry', mock_pr), \
patch.object(app_module, 'config_manager', mock_cm):
resp = client.get('/api/status')
assert resp.status_code == 200
def test_status_includes_expected_keys(self, client):
mock_sb = MagicMock()
mock_sb.list_services.return_value = []
mock_pr = MagicMock()
mock_pr.list_peers.return_value = []
mock_cm = MagicMock()
mock_cm.configs = {'_identity': {'cell_name': 'test', 'domain': 'cell'}}
mock_cm.get_effective_domain.return_value = 'cell'
with patch.object(app_module, 'service_bus', mock_sb), \
patch.object(app_module, 'peer_registry', mock_pr), \
patch.object(app_module, 'config_manager', mock_cm):
resp = client.get('/api/status')
data = json.loads(resp.data)
for key in ('cell_name', 'domain', 'uptime', 'peers_count', 'services'):
assert key in data, f"Missing key: {key}"
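The hook tests above pin down an ordering of checks: setup gate first (428), then the empty-users guard (503), then the session check (401), with /health and the setup routes always exempt. A minimal sketch of that decision order follows, written as a pure function so it runs without Flask. The function name, the parameter names, and the exempt-prefix tuple are assumptions made for illustration; the real hooks in app.py are not shown in this diff and may be structured differently.

```python
from typing import Optional

# Hypothetical exempt prefixes. The tests only show that /health and
# /api/setup/status pass through while setup is incomplete.
EXEMPT_PREFIXES = ('/health', '/api/setup')


def gate_status(path: str, setup_complete: bool, users_file_exists: bool,
                user_count: int, has_session: bool) -> Optional[int]:
    """Return the HTTP status to short-circuit with, or None to pass through."""
    if any(path == p or path.startswith(p + '/') for p in EXEMPT_PREFIXES):
        return None
    if not setup_complete:
        return 428  # setup wizard has not finished
    if not users_file_exists:
        return None  # per the test comment: no file means enforcement is bypassed
    if user_count == 0:
        return 503  # file exists and is readable but holds no accounts
    if not has_session:
        return 401  # accounts exist, caller is not logged in
    return None
```

For example, gate_status('/api/status', True, True, 1, False) yields 401, matching test_401_when_no_session_and_users_exist.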
+1
@@ -36,6 +36,7 @@ import app as app_module
class TestAppMisc(unittest.TestCase):
def setUp(self):
app_module.app.config['TESTING'] = True
# Patch managers to avoid side effects
self.patches = [
patch.object(app_module, 'network_manager', MagicMock()),
+213
@@ -0,0 +1,213 @@
#!/usr/bin/env python3
"""Tests for the audit after_request hook, auth-route audit calls, and audit API authz."""
import os
import sys
import json
from pathlib import Path
from unittest.mock import patch
import contextlib
import pytest
sys.path.insert(0, str(Path(__file__).parent.parent / 'api'))
from app import app
from auth_manager import AuthManager
from audit_manager import AuditManager
def _make_auth_manager(tmp_path):
data_dir = str(tmp_path / 'data')
config_dir = str(tmp_path / 'config')
os.makedirs(data_dir, exist_ok=True)
os.makedirs(config_dir, exist_ok=True)
mgr = AuthManager(data_dir=data_dir, config_dir=config_dir)
mgr.create_user('admin', 'AdminPass123!', 'admin')
mgr.create_user('alice', 'AlicePass123!', 'peer')
return mgr
def _login(client, username, password):
return client.post('/api/auth/login',
data=json.dumps({'username': username, 'password': password}),
content_type='application/json')
@contextlib.contextmanager
def _client(auth_mgr, audit_mgr, login_as=None):
app.config['TESTING'] = True
app.config['SECRET_KEY'] = 'test-secret'
with patch('app.auth_manager', auth_mgr), \
patch('app.audit_manager', audit_mgr):
import auth_routes
with patch.object(auth_routes, 'auth_manager', auth_mgr, create=True):
with app.test_client() as c:
if login_as == 'admin':
assert _login(c, 'admin', 'AdminPass123!').status_code == 200
elif login_as == 'peer':
assert _login(c, 'alice', 'AlicePass123!').status_code == 200
yield c
@pytest.fixture
def auth_mgr(tmp_path):
return _make_auth_manager(tmp_path)
@pytest.fixture
def audit_mgr(tmp_path):
return AuditManager(data_dir=str(tmp_path / 'auditdata'), config_dir=str(tmp_path / 'auditcfg'))
# ── after_request capture ─────────────────────────────────────────────────────
def test_post_peers_records_peer_create(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
with patch('app.peer_registry') as pr:
pr.add_peer.return_value = {'success': True, 'peer': {'name': 'bob'}}
c.post('/api/peers', json={'name': 'bob'})
res = audit_mgr.query({'action': 'peer.create'})
assert res['total'] >= 1
e = res['entries'][0]
assert e['target_type'] == 'peer'
assert e['method'] == 'POST'
assert e['actor'] == 'admin'
assert e['role'] == 'admin'
def test_4xx_records_failure(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
# missing body -> handler returns 400
c.post('/api/peers', json={})
res = audit_mgr.query({'action': 'peer.create'})
assert res['total'] >= 1
assert res['entries'][0]['result'] == 'failure'
def test_config_update_summary_lists_key_names_only(auth_mgr, audit_mgr):
# The summary is built from request-body key names regardless of the
# handler outcome, so we assert only on the recorded audit entry.
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
c.put('/api/config', json={'email': {'smtp_password': 'hunter2supersecret', 'smtp_port': 25}})
res = audit_mgr.query({'action': 'config.update'})
assert res['total'] >= 1
summary = res['entries'][0]['summary']
assert 'smtp_port' in summary
assert 'smtp_password' in summary # key NAME is allowed
assert 'hunter2supersecret' not in summary # value never recorded
def test_unmapped_mutating_endpoint_gets_generic_action(auth_mgr, audit_mgr):
# email.send_email is NOT in ROUTE_ACTION_MAP — it must still be recorded
# via the generic "<method>.<path>" fallback so nothing is invisible.
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
c.post('/api/email/send', json={})
entries = audit_mgr.query({})['entries']
match = [e for e in entries if e['path'] == '/api/email/send']
assert match, 'unmapped mutating endpoint was not audited'
assert match[0]['action'] == 'post./api/email/send'
assert match[0]['target_type'] == 'unknown'
# ── connectivity v2 connection routes are audited ─────────────────────────────
def test_connection_create_audited(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
with patch('app.connectivity_manager') as cm:
cm.create_connection.return_value = {'ok': True, 'connection': {'id': 'c'}}
c.post('/api/connectivity/connections',
json={'type': 'tor', 'name': 'T'})
res = audit_mgr.query({'action': 'connection.create'})
assert res['total'] >= 1
assert res['entries'][0]['target_type'] == 'connection'
def test_connection_delete_audited_with_id(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
with patch('app.connectivity_manager') as cm:
cm.delete_connection.return_value = {'ok': True}
c.delete('/api/connectivity/connections/conn_abc')
res = audit_mgr.query({'action': 'connection.delete'})
assert res['total'] >= 1
assert res['entries'][0]['target_id'] == 'conn_abc'
def test_peer_failopen_audited(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
with patch('app.connectivity_manager') as cm:
cm.set_peer_failopen.return_value = {'ok': True, 'peer': 'bob'}
c.put('/api/connectivity/peers/bob/failopen', json={'failopen': True})
res = audit_mgr.query({'action': 'peer.failopen'})
assert res['total'] >= 1
assert res['entries'][0]['target_id'] == 'bob'
# ── auth routes: never write password ─────────────────────────────────────────
def test_change_password_audited_without_value(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
c.post('/api/auth/change-password',
json={'old_password': 'AdminPass123!', 'new_password': 'BrandNewPass456!'})
res = audit_mgr.query({'action': 'user.password_change'})
assert res['total'] == 1
raw = json.dumps(res['entries'][0])
assert 'AdminPass123!' not in raw
assert 'BrandNewPass456!' not in raw
assert res['entries'][0]['summary'] == 'password changed'
def test_admin_reset_password_audited_without_value(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
c.post('/api/auth/admin/reset-password',
json={'username': 'alice', 'new_password': 'ResetPass789!'})
res = audit_mgr.query({'action': 'user.password_reset'})
assert res['total'] == 1
raw = json.dumps(res['entries'][0])
assert 'ResetPass789!' not in raw
assert 'alice' in res['entries'][0]['summary']
def test_auth_login_does_not_write_password(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr) as c:
_login(c, 'admin', 'AdminPass123!')
res = audit_mgr.query({})
for e in res['entries']:
assert 'AdminPass123!' not in json.dumps(e)
# ── audit API authz ───────────────────────────────────────────────────────────
def test_peer_forbidden_on_audit_list(auth_mgr, audit_mgr):
with _client(auth_mgr, audit_mgr, login_as='peer') as c:
r = c.get('/api/audit')
assert r.status_code == 403
def test_admin_allowed_on_audit_list(auth_mgr, audit_mgr):
audit_mgr.record('admin', 'admin', '', 'peer.create', 'peer', 'bob', '',
'success', 201, 'POST', '/api/peers', '')
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
r = c.get('/api/audit')
assert r.status_code == 200
body = r.get_json()
assert body['total'] >= 1
assert 'entries' in body
def test_audit_verify_endpoint(auth_mgr, audit_mgr):
audit_mgr.record('admin', 'admin', '', 'x', '', '', '', 'success', 200, 'POST', '/api/x', '')
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
r = c.get('/api/audit/verify')
assert r.status_code == 200
assert r.get_json()['ok'] is True
def test_audit_export_csv(auth_mgr, audit_mgr):
audit_mgr.record('admin', 'admin', '', 'peer.create', 'peer', 'bob', '',
'success', 201, 'POST', '/api/peers', '')
with _client(auth_mgr, audit_mgr, login_as='admin') as c:
r = c.get('/api/audit/export?format=csv')
assert r.status_code == 200
assert 'text/csv' in r.content_type
assert b'peer.create' in r.data
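Two behaviours tested above are easy to misread: the generic "<method>.<path>" fallback for unmapped mutating endpoints, and config summaries that list key names but never values. The sketch below shows one way both could work. EXAMPLE_ACTIONS, resolve_action, flatten_keys and summarize_keys are hypothetical names for this illustration (the real ROUTE_ACTION_MAP and AuditManager.summarize_keys are not shown in this diff), and the exact summary format beyond the 'changed:' prefix is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping keyed by (method, path); the real ROUTE_ACTION_MAP
# may be keyed differently (for example by Flask endpoint name).
EXAMPLE_ACTIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ('POST', '/api/peers'): ('peer.create', 'peer'),
    ('PUT', '/api/config'): ('config.update', 'config'),
}


def resolve_action(method: str, path: str) -> Tuple[str, str]:
    """Return (action, target_type), falling back to '<method>.<path>'."""
    hit = EXAMPLE_ACTIONS.get((method.upper(), path))
    if hit is not None:
        return hit
    return (f'{method.lower()}.{path}', 'unknown')


def flatten_keys(body: dict, prefix: str = '') -> List[str]:
    """Collect dotted key names from a nested dict, discarding all values."""
    keys: List[str] = []
    for key, value in body.items():
        dotted = f'{prefix}.{key}' if prefix else str(key)
        if isinstance(value, dict):
            keys.extend(flatten_keys(value, dotted))
        else:
            keys.append(dotted)
    return keys


def summarize_keys(keys: List[str]) -> str:
    """Build a summary from key names only, so no secret value can leak."""
    return 'changed: ' + ', '.join(sorted(keys))
```

Because the summary is built from key names alone, a value such as 'hunter2supersecret' never reaches the audit entry, whatever the handler does with the request.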
+198
@@ -0,0 +1,198 @@
#!/usr/bin/env python3
"""Tests for AuditManager and the audit capture hook / routes."""
import os
import sys
import json
import threading
from pathlib import Path
from unittest.mock import patch
import pytest
sys.path.insert(0, str(Path(__file__).parent.parent / 'api'))
from audit_manager import AuditManager
# ── manager fixture ───────────────────────────────────────────────────────────
@pytest.fixture
def audit(tmp_path):
return AuditManager(data_dir=str(tmp_path / 'data'), config_dir=str(tmp_path / 'config'))
def _lines(audit):
with open(audit._audit_file, 'r', encoding='utf-8') as f:
return [l for l in f.read().splitlines() if l.strip()]
# ── record / schema ───────────────────────────────────────────────────────────
def test_record_writes_one_jsonl_line(audit):
entry = audit.record('admin', 'admin', '10.0.0.1', 'peer.create',
'peer', 'bob', 'created', 'success', 201, 'POST', '/api/peers', 'req-1')
lines = _lines(audit)
assert len(lines) == 1
parsed = json.loads(lines[0])
for field in ('ts', 'actor', 'role', 'ip', 'action', 'target_type', 'target_id',
'summary', 'result', 'status', 'method', 'path', 'request_id',
'seq', 'prev_hash', 'hash'):
assert field in parsed
assert parsed['actor'] == 'admin'
assert parsed['action'] == 'peer.create'
assert parsed['ts'].endswith('Z') # UTC ISO
def test_result_derived_from_status(audit):
e = audit.record('a', 'admin', '', 'x', '', '', '', 'bogus', 500, 'POST', '/api/x', '')
assert e['result'] == 'failure'
e2 = audit.record('a', 'admin', '', 'x', '', '', '', 'bogus', 200, 'POST', '/api/x', '')
assert e2['result'] == 'success'
# ── redaction ─────────────────────────────────────────────────────────────────
def test_summarize_keys_lists_names_only(audit):
summary = AuditManager.summarize_keys(['network.dns_port', 'email.smtp_password', 'wireguard.private_key'])
# KEY NAMES are present (they are names, not values)...
assert 'dns_port' in summary
assert 'smtp_password' in summary
# ...behind the 'changed:' prefix; only key names were passed in, so no value material can appear
assert 'changed:' in summary
def test_secret_values_never_appear(audit):
secret_b64 = 'A' * 60 + '=='
bcrypt = '$2b$12$abcdefghijklmnopqrstuv'
age = 'AGE-SECRET-KEY-1QQQQQQQQQQQQQQQQQQQQQQQQQQQQQ'
e = audit.record('admin', 'admin', '', 'config.update', 'config', '',
f'token={secret_b64} hash={bcrypt} key={age}', 'success', 200,
'PUT', '/api/config', '')
raw = _lines(audit)[0]
assert secret_b64 not in raw
assert bcrypt not in raw
assert age not in raw
assert 'REDACTED' in e['summary']
# ── append-only ───────────────────────────────────────────────────────────────
def test_append_only_prior_unchanged(audit):
audit.record('a', 'admin', '', 'one', '', '', 's1', 'success', 200, 'POST', '/api/a', '')
first = _lines(audit)[0]
audit.record('b', 'admin', '', 'two', '', '', 's2', 'success', 200, 'POST', '/api/b', '')
lines = _lines(audit)
assert len(lines) == 2
assert lines[0] == first # prior line byte-for-byte unchanged
assert json.loads(lines[1])['seq'] == 2
# ── hash chain ────────────────────────────────────────────────────────────────
def test_hash_chain_links(audit):
e1 = audit.record('a', 'admin', '', 'one', '', '', '', 'success', 200, 'POST', '/api/a', '')
e2 = audit.record('b', 'admin', '', 'two', '', '', '', 'success', 200, 'POST', '/api/b', '')
assert e1['prev_hash'] == ''
assert e2['prev_hash'] == e1['hash']
assert audit.verify_chain() == {'ok': True, 'broken_at_seq': None}
def test_tamper_detected(audit):
audit.record('a', 'admin', '', 'one', '', '', 'orig', 'success', 200, 'POST', '/api/a', '')
audit.record('b', 'admin', '', 'two', '', '', 'orig2', 'success', 200, 'POST', '/api/b', '')
lines = _lines(audit)
tampered = json.loads(lines[0])
tampered['summary'] = 'HACKED'
lines[0] = json.dumps(tampered)
with open(audit._audit_file, 'w', encoding='utf-8') as f:
f.write('\n'.join(lines) + '\n')
res = audit.verify_chain()
assert res['ok'] is False
assert res['broken_at_seq'] == 1
def test_chain_can_be_disabled(tmp_path):
a = AuditManager(data_dir=str(tmp_path / 'd'), config_dir=str(tmp_path / 'c'), tamper_chain=False)
e = a.record('a', 'admin', '', 'one', '', '', '', 'success', 200, 'POST', '/api/a', '')
assert e['hash'] == ''
assert a.verify_chain().get('disabled') is True
# ── rotation ──────────────────────────────────────────────────────────────────
def test_rotation_rolls_and_chain_continues(tmp_path):
a = AuditManager(data_dir=str(tmp_path / 'd'), config_dir=str(tmp_path / 'c'))
a.MAX_FILE_SIZE = 2048 # tiny so a few records trigger rotation
for i in range(60):
a.record('admin', 'admin', '', f'act{i}', 'thing', str(i),
'x' * 40, 'success', 200, 'POST', '/api/x', '')
assert os.path.exists(a._audit_file + '.1'), 'rotation did not occur'
# Chain spans live + rotated segments and stays intact across rotation.
assert a.verify_chain() == {'ok': True, 'broken_at_seq': None}
q = a.query({}, limit=1000)
seqs = [e['seq'] for e in q['entries']]
# Newest-first ordering preserved across segment boundaries.
assert seqs == sorted(seqs, reverse=True)
# The newest record (seq 60) is always retained; order is never lost.
assert seqs[0] == 60
# Retained seqs form a contiguous run ending at the newest (older entries
# beyond BACKUP_COUNT segments are pruned, as designed).
assert seqs == list(range(60, 60 - len(seqs), -1))
# ── concurrency ───────────────────────────────────────────────────────────────
def test_concurrent_records_intact(audit):
N = 50
def worker(i):
audit.record('admin', 'admin', '', f'act{i}', 'thing', str(i),
'', 'success', 200, 'POST', '/api/x', '')
threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
t.start()
for t in threads:
t.join()
lines = _lines(audit)
assert len(lines) == N
for l in lines:
json.loads(l) # every line is valid JSON
assert audit.verify_chain()['ok'] is True
# ── filters + pagination ──────────────────────────────────────────────────────
def test_filters_and_pagination(audit):
for i in range(10):
audit.record('admin' if i % 2 == 0 else 'alice', 'admin', '',
'peer.create' if i < 5 else 'peer.delete',
'peer', f'p{i}', '', 'success' if i != 3 else 'failure',
200, 'POST', '/api/peers', '')
res = audit.query({'actor': 'alice'})
assert all(e['actor'] == 'alice' for e in res['entries'])
res = audit.query({'action': 'peer.delete'})
assert res['total'] == 5
res = audit.query({'result': 'failure'})
assert res['total'] == 1
page = audit.query({}, limit=3, offset=0)
assert len(page['entries']) == 3
assert page['total'] == 10
assert page['next_offset'] == 3
def test_export_csv(audit):
audit.record('admin', 'admin', '1.2.3.4', 'peer.create', 'peer', 'bob',
'created', 'success', 201, 'POST', '/api/peers', 'r1')
csv = audit.export_csv({})
lines = csv.strip().splitlines()
assert lines[0].startswith('ts,actor,role,ip,action')
assert 'peer.create' in csv
assert 'bob' in csv
def test_write_failure_does_not_raise(audit):
with patch('os.open', side_effect=OSError('disk full')):
result = audit.record('a', 'admin', '', 'x', '', '', '', 'success', 200, 'POST', '/api/x', '')
assert result is None # swallowed, never raised
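The hash-chain tests above (test_hash_chain_links, test_tamper_detected) rely on each entry carrying the previous entry's hash. A minimal in-memory sketch of that technique follows. It is not the AuditManager implementation: the helper names, the use of SHA-256 over sorted-key JSON, and the list-based log are assumptions, and rotation, pruning, file I/O and locking are left out.

```python
import hashlib
import json
from typing import Dict, List


def _digest(entry: Dict) -> str:
    """Hash every field except 'hash' itself, in a stable key order."""
    body = {k: v for k, v in entry.items() if k != 'hash'}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode('utf-8')).hexdigest()


def append_entry(log: List[Dict], **fields) -> Dict:
    """Append an entry that links to the previous entry's hash."""
    prev_hash = log[-1]['hash'] if log else ''
    entry = dict(fields, seq=len(log) + 1, prev_hash=prev_hash)
    entry['hash'] = _digest(entry)
    log.append(entry)
    return entry


def verify_chain(log: List[Dict]) -> Dict:
    """Walk the log and report the first seq whose link or hash is wrong."""
    prev_hash = ''
    for entry in log:
        if entry['prev_hash'] != prev_hash or entry['hash'] != _digest(entry):
            return {'ok': False, 'broken_at_seq': entry['seq']}
        prev_hash = entry['hash']
    return {'ok': True, 'broken_at_seq': None}
```

Editing any field of an earlier entry changes its recomputed digest, so verification fails at that entry's seq, which is the behaviour test_tamper_detected asserts.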
+354
@@ -0,0 +1,354 @@
"""
Tests for service-volume backup/restore in ConfigManager.
Covers:
- _backup_service_volumes: happy path, container not running, timeout
- _restore_service_volumes: happy path, missing archive, unknown service
- backup_config: passes service_registry, records includes_service_data
- restore_config: passes service_registry on full restore, not on selective
"""
import json
import subprocess
import unittest
from pathlib import Path
from unittest.mock import MagicMock, patch, call
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'api'))
from config_manager import ConfigManager
def _make_cm(tmp_path: Path) -> ConfigManager:
cfg_file = tmp_path / 'cell_config.json'
cfg_file.write_text('{}')
cm = ConfigManager(config_file=str(cfg_file), data_dir=str(tmp_path))
return cm
def _make_registry(plan=None):
"""Return a mock ServiceRegistry with a preset backup plan."""
reg = MagicMock()
reg.get_backup_plan.return_value = plan if plan is not None else [
{
'service_id': 'email',
'volumes': [
{'container': 'cell-mail', 'path': '/var/mail', 'name': 'maildata'},
{'container': 'cell-mail', 'path': '/var/mail-state', 'name': 'mailstate'},
],
'config_paths': [],
},
{
'service_id': 'calendar',
'volumes': [
{'container': 'cell-radicale', 'path': '/data', 'name': 'radicale_data'},
],
'config_paths': [],
},
]
return reg
class TestBackupServiceVolumesHappyPath(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.cm = _make_cm(self.tmp)
self.backup_path = self.tmp / 'test_backup'
self.backup_path.mkdir()
def _run_backup(self, registry=None):
if registry is None:
registry = _make_registry()
self.cm._backup_service_volumes(self.backup_path, registry)
@patch('config_manager.subprocess.run')
def test_creates_service_data_dir(self, mock_run):
mock_run.return_value = MagicMock(returncode=0, stderr=b'')
self._run_backup()
self.assertTrue((self.backup_path / 'service_data' / 'email').is_dir())
self.assertTrue((self.backup_path / 'service_data' / 'calendar').is_dir())
@patch('config_manager.subprocess.run')
def test_calls_docker_exec_tar_for_each_volume(self, mock_run):
mock_run.return_value = MagicMock(returncode=0, stderr=b'')
self._run_backup()
commands = [tuple(c.args[0]) for c in mock_run.call_args_list]
self.assertIn(
('docker', 'exec', '--', 'cell-mail', 'tar', '-C', '/var/mail', '-czf', '-', '.'),
commands,
)
self.assertIn(
('docker', 'exec', '--', 'cell-mail', 'tar', '-C', '/var/mail-state', '-czf', '-', '.'),
commands,
)
self.assertIn(
('docker', 'exec', '--', 'cell-radicale', 'tar', '-C', '/data', '-czf', '-', '.'),
commands,
)
@patch('config_manager.subprocess.run')
def test_writes_archive_files(self, mock_run):
mock_run.return_value = MagicMock(returncode=0, stderr=b'')
self._run_backup()
self.assertTrue((self.backup_path / 'service_data' / 'email' / 'maildata.tar.gz').exists())
self.assertTrue((self.backup_path / 'service_data' / 'email' / 'mailstate.tar.gz').exists())
self.assertTrue((self.backup_path / 'service_data' / 'calendar' / 'radicale_data.tar.gz').exists())
@patch('config_manager.subprocess.run')
def test_removes_archive_on_nonzero_returncode(self, mock_run):
mock_run.return_value = MagicMock(returncode=1, stderr=b'container not running')
self._run_backup()
self.assertFalse(
(self.backup_path / 'service_data' / 'email' / 'maildata.tar.gz').exists()
)
@patch('config_manager.subprocess.run')
def test_continues_after_one_volume_fails(self, mock_run):
def side_effect(cmd, **kwargs):
if 'cell-mail' in cmd:
return MagicMock(returncode=1, stderr=b'error')
return MagicMock(returncode=0, stderr=b'')
mock_run.side_effect = side_effect
self._run_backup()
# radicale should still succeed
self.assertTrue(
(self.backup_path / 'service_data' / 'calendar' / 'radicale_data.tar.gz').exists()
)
@patch('config_manager.subprocess.run', side_effect=subprocess.TimeoutExpired('docker', 300))
def test_timeout_removes_partial_archive(self, _mock_run):
self._run_backup()
# no archive should remain after a timeout
for svc in ('email', 'calendar'):
for name in ('maildata', 'mailstate', 'radicale_data'):
self.assertFalse(
(self.backup_path / 'service_data' / svc / f'{name}.tar.gz').exists()
)
@patch('config_manager.subprocess.run')
def test_empty_volumes_list_skipped(self, mock_run):
registry = _make_registry(plan=[
{'service_id': 'widget', 'volumes': [], 'config_paths': []}
])
self.cm._backup_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_get_backup_plan_exception_is_handled(self, mock_run):
registry = MagicMock()
registry.get_backup_plan.side_effect = RuntimeError('registry unavailable')
# should not raise
self.cm._backup_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_unsafe_container_name_rejected(self, mock_run):
registry = _make_registry(plan=[{
'service_id': 'evil', 'config_paths': [],
'volumes': [{'container': '-it cell-api', 'path': '/data', 'name': 'data'}],
}])
self.cm._backup_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_path_traversal_in_volume_path_rejected(self, mock_run):
registry = _make_registry(plan=[{
'service_id': 'evil', 'config_paths': [],
'volumes': [{'container': 'cell-mail', 'path': '/../etc', 'name': 'etc'}],
}])
self.cm._backup_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_relative_volume_path_rejected(self, mock_run):
registry = _make_registry(plan=[{
'service_id': 'evil', 'config_paths': [],
'volumes': [{'container': 'cell-mail', 'path': 'data/maildata', 'name': 'data'}],
}])
self.cm._backup_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_unsafe_volume_name_rejected(self, mock_run):
registry = _make_registry(plan=[{
'service_id': 'evil', 'config_paths': [],
'volumes': [{'container': 'cell-mail', 'path': '/var/mail', 'name': '../../etc/passwd'}],
}])
self.cm._backup_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_atomic_write_no_archive_on_partial_failure(self, mock_run):
"""If an exception occurs during subprocess, no .tar.gz file should remain."""
mock_run.side_effect = OSError('disk full')
self._run_backup()
for f in self.backup_path.rglob('*.tar.gz'):
self.fail(f'Archive {f} should not exist after exception during backup')
class TestRestoreServiceVolumes(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.cm = _make_cm(self.tmp)
self.backup_path = self.tmp / 'test_backup'
# Prepare a realistic backup structure
svc_data = self.backup_path / 'service_data'
(svc_data / 'email').mkdir(parents=True)
(svc_data / 'email' / 'maildata.tar.gz').write_bytes(b'fake-archive')
(svc_data / 'calendar').mkdir(parents=True)
(svc_data / 'calendar' / 'radicale_data.tar.gz').write_bytes(b'fake-archive')
def _make_registry_with_manifests(self):
reg = MagicMock()
def get_side_effect(service_id):
manifests = {
'email': {'backup': {'volumes': [
{'container': 'cell-mail', 'path': '/var/mail', 'name': 'maildata'},
]}},
'calendar': {'backup': {'volumes': [
{'container': 'cell-radicale', 'path': '/data', 'name': 'radicale_data'},
]}},
}
return manifests.get(service_id)
reg.get.side_effect = get_side_effect
return reg
@patch('config_manager.subprocess.run')
def test_calls_docker_exec_tar_for_each_archive(self, mock_run):
mock_run.return_value = MagicMock(returncode=0, stderr=b'')
registry = self._make_registry_with_manifests()
self.cm._restore_service_volumes(self.backup_path, registry)
commands = [tuple(c.args[0]) for c in mock_run.call_args_list]
self.assertIn(
('docker', 'exec', '-i', '--', 'cell-mail', 'tar', '-C', '/var/mail', '-xzf', '-'),
commands,
)
self.assertIn(
('docker', 'exec', '-i', '--', 'cell-radicale', 'tar', '-C', '/data', '-xzf', '-'),
commands,
)
@patch('config_manager.subprocess.run')
def test_skips_missing_archive(self, mock_run):
mock_run.return_value = MagicMock(returncode=0, stderr=b'')
registry = MagicMock()
registry.get.return_value = {'backup': {'volumes': [
{'container': 'cell-mail', 'path': '/var/mail', 'name': 'no_such_archive'},
]}}
self.cm._restore_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_skips_unknown_service(self, mock_run):
mock_run.return_value = MagicMock(returncode=0, stderr=b'')
registry = MagicMock()
registry.get.return_value = None
self.cm._restore_service_volumes(self.backup_path, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run')
def test_no_service_data_dir_is_noop(self, mock_run):
empty_backup = self.tmp / 'empty_backup'
empty_backup.mkdir()
registry = self._make_registry_with_manifests()
self.cm._restore_service_volumes(empty_backup, registry)
mock_run.assert_not_called()
@patch('config_manager.subprocess.run', side_effect=subprocess.TimeoutExpired('docker', 300))
def test_timeout_is_handled_gracefully(self, _mock_run):
registry = self._make_registry_with_manifests()
# should not raise
self.cm._restore_service_volumes(self.backup_path, registry)
@patch('config_manager.subprocess.run')
def test_continues_after_docker_exec_failure(self, mock_run):
call_count = [0]
def side_effect(cmd, **kwargs):
call_count[0] += 1
if call_count[0] == 1:
return MagicMock(returncode=1, stderr=b'container not running')
return MagicMock(returncode=0, stderr=b'')
mock_run.side_effect = side_effect
registry = self._make_registry_with_manifests()
self.cm._restore_service_volumes(self.backup_path, registry)
self.assertEqual(call_count[0], 2)
class TestBackupConfigWithRegistry(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.cm = _make_cm(self.tmp)
@patch.object(ConfigManager, '_backup_service_volumes')
def test_backup_calls_volume_backup_when_registry_given(self, mock_bsv):
registry = _make_registry()
self.cm.backup_config(service_registry=registry)
mock_bsv.assert_called_once()
args = mock_bsv.call_args
self.assertIs(args[0][1], registry)
@patch.object(ConfigManager, '_backup_service_volumes')
def test_backup_skips_volume_backup_when_no_registry(self, mock_bsv):
self.cm.backup_config(service_registry=None)
mock_bsv.assert_not_called()
@patch.object(ConfigManager, '_backup_service_volumes')
def test_manifest_records_includes_service_data_true(self, _mock_bsv):
registry = _make_registry()
backup_id = self.cm.backup_config(service_registry=registry)
manifest = json.loads((self.cm.backup_dir / backup_id / 'manifest.json').read_text())
self.assertTrue(manifest['includes_service_data'])
@patch.object(ConfigManager, '_backup_service_volumes')
def test_manifest_records_includes_service_data_false(self, _mock_bsv):
backup_id = self.cm.backup_config(service_registry=None)
manifest = json.loads((self.cm.backup_dir / backup_id / 'manifest.json').read_text())
self.assertFalse(manifest['includes_service_data'])
class TestRestoreConfigWithRegistry(unittest.TestCase):
def setUp(self):
import tempfile
self.tmp = Path(tempfile.mkdtemp())
self.cm = _make_cm(self.tmp)
# Create a minimal backup
backup_id = 'backup_20260101_000000'
bp = self.cm.backup_dir / backup_id
bp.mkdir(parents=True)
(bp / 'cell_config.json').write_text('{}')
manifest = {'backup_id': backup_id, 'timestamp': '2026-01-01T00:00:00', 'services': []}
(bp / 'manifest.json').write_text(json.dumps(manifest))
self.backup_id = backup_id
@patch.object(ConfigManager, '_restore_service_volumes')
def test_full_restore_calls_volume_restore_when_registry_given(self, mock_rsv):
registry = _make_registry()
self.cm.restore_config(self.backup_id, service_registry=registry)
mock_rsv.assert_called_once()
args = mock_rsv.call_args
self.assertIs(args[0][1], registry)
@patch.object(ConfigManager, '_restore_service_volumes')
def test_full_restore_skips_volume_restore_when_no_registry(self, mock_rsv):
self.cm.restore_config(self.backup_id, service_registry=None)
mock_rsv.assert_not_called()
@patch.object(ConfigManager, '_restore_service_volumes')
def test_selective_restore_never_calls_volume_restore(self, mock_rsv):
"""Volume restore is skipped for selective restores (service list specified)."""
registry = _make_registry()
self.cm.restore_config(self.backup_id, services=['email'], service_registry=registry)
mock_rsv.assert_not_called()
if __name__ == '__main__':
unittest.main()
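The rejection tests above (unsafe container name, path traversal, relative path, unsafe volume name) imply that every manifest-supplied value is validated before a docker command is built. The sketch below shows one way to do that. The function name build_backup_cmd and the allow-list regex are assumptions for illustration; only the shape of the resulting argv is taken from the tests.

```python
import re
from pathlib import PurePosixPath
from typing import List, Optional

# Hypothetical allow-list: must start with an alphanumeric character, which
# rules out leading '-' (option injection) and leading '.' (traversal).
_SAFE_NAME = re.compile(r'[A-Za-z0-9][A-Za-z0-9_.-]*')


def build_backup_cmd(container: str, path: str, name: str) -> Optional[List[str]]:
    """Return the docker argv for one volume, or None if any input is unsafe."""
    if not _SAFE_NAME.fullmatch(container) or not _SAFE_NAME.fullmatch(name):
        return None
    posix_path = PurePosixPath(path)
    if not posix_path.is_absolute() or '..' in posix_path.parts:
        return None
    # '--' ends docker's own option parsing before the container name.
    return ['docker', 'exec', '--', container, 'tar', '-C', path, '-czf', '-', '.']
```

Passing the command as an argv list (rather than a shell string) together with the '--' separator means a container value such as '-it cell-api' cannot be interpreted as flags even if validation were loosened later.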
+564 -24
@@ -59,16 +59,49 @@ class TestGenerateCaddyfileLan(unittest.TestCase):
 class TestGenerateCaddyfilePicNgo(unittest.TestCase):
     def test_pic_ngo_has_dns_plugin_and_wildcard(self):
         mgr = _mgr()
+        mgr.config_manager.configs = {
+            'ddns': {'url': 'https://ddns.pic.ngo/api/v1'},
+        }
+        mgr.config_manager.get_ddns_token.return_value = 'TESTSECRET123'
         identity = {'cell_name': 'alpha', 'domain_mode': 'pic_ngo'}
-        out = mgr.generate_caddyfile(identity, [])
+        with unittest.mock.patch.dict(os.environ, {'DDNS_URL': 'https://ddns.pic.ngo/api/v1'}):
+            out = mgr.generate_caddyfile(identity, [])
         self.assertIn('dns pic_ngo', out)
         self.assertIn('*.alpha.pic.ngo', out)
         self.assertIn('alpha.pic.ngo', out)
-        self.assertIn('{$PIC_NGO_DDNS_TOKEN}', out)
-        self.assertIn('{$PIC_NGO_DDNS_API}', out)
+        # Registration token (not TOTP secret) is embedded, with no {$VAR} placeholders
+        self.assertIn('token TESTSECRET123', out)
+        # /api/v1 is stripped because the plugin appends it itself
+        self.assertIn('api_base_url https://ddns.pic.ngo', out)
+        self.assertNotIn('api_base_url https://ddns.pic.ngo/api/v1', out)
+        self.assertNotIn('{$PIC_NGO_DDNS_TOKEN}', out)
+        self.assertNotIn('{$PIC_NGO_DDNS_API}', out)
         self.assertIn('email admin@alpha.pic.ngo', out)
-        # ACME staging hook
-        self.assertIn('acme_ca {$ACME_CA_URL}', out)
+        # acme_ca is omitted when ACME_CA_URL is not set
+        self.assertNotIn('acme_ca', out)
+    def test_pic_ngo_acme_ca_included_when_env_set(self):
+        mgr = _mgr()
+        mgr.config_manager.configs = {'ddns': {}}
+        mgr.config_manager.get_ddns_token.return_value = 'TESTSECRET123'
+        identity = {'cell_name': 'alpha', 'domain_mode': 'pic_ngo'}
+        with unittest.mock.patch.dict(os.environ, {
+            'DDNS_URL': 'https://ddns.pic.ngo/api/v1',
+            'ACME_CA_URL': 'https://acme-staging-v02.api.letsencrypt.org/directory',
+        }):
+            out = mgr.generate_caddyfile(identity, [])
+        self.assertIn('acme_ca https://acme-staging-v02.api.letsencrypt.org/directory', out)
+    def test_pic_ngo_has_api_route_without_registry(self):
+        mgr = _mgr()
+        identity = {'cell_name': 'alpha', 'domain_mode': 'pic_ngo'}
+        out = mgr.generate_caddyfile(identity, [])
+        # Without a registry only the api block is present
+        self.assertIn('@api host api.alpha.pic.ngo', out)
+        self.assertIn('reverse_proxy cell-api:3000', out)
+        self.assertNotIn('@calendar', out)
+        self.assertNotIn('@mail', out)
+        self.assertNotIn('@files', out)
 class TestGenerateCaddyfileCloudflare(unittest.TestCase):
@@ -77,13 +110,35 @@ class TestGenerateCaddyfileCloudflare(unittest.TestCase):
identity = {
'cell_name': 'beta',
'domain_mode': 'cloudflare',
'custom_domain': 'example.com',
'domain_name': 'example.com',
}
out = mgr.generate_caddyfile(identity, [])
self.assertIn('dns cloudflare {$CF_API_TOKEN}', out)
self.assertIn('*.example.com', out)
self.assertIn('email {$ACME_EMAIL}', out)
self.assertIn('acme_ca {$ACME_CA_URL}', out)
# acme_ca is omitted when ACME_CA_URL is not set in the environment
self.assertNotIn('acme_ca', out)
def test_caddyfile_cloudflare_uses_domain_name(self):
"""Caddyfile must use domain_name for TLS host, not any 'custom_domain' key."""
mgr = _mgr()
identity = {
'cell_name': 'beta',
'domain_mode': 'cloudflare',
'domain_name': 'home.example.com',
'domain': 'home.local',
}
out = mgr.generate_caddyfile(identity, [])
self.assertIn('*.home.example.com', out)
self.assertIn('home.example.com', out)
# Must not use the internal domain for TLS
self.assertNotIn('*.home.local', out)
# 'custom_domain' must not appear literally as a key in the output
self.assertNotIn('custom_domain', out)
# Without a registry only the api block is emitted for subdomain routing
self.assertIn('@api host api.home.example.com', out)
self.assertNotIn('@calendar', out)
self.assertNotIn('@files', out)
class TestGenerateCaddyfileDuckDns(unittest.TestCase):
@@ -93,6 +148,9 @@ class TestGenerateCaddyfileDuckDns(unittest.TestCase):
out = mgr.generate_caddyfile(identity, [])
self.assertIn('dns duckdns {$DUCKDNS_TOKEN}', out)
self.assertIn('*.gamma.duckdns.org', out)
self.assertIn('@api host api.gamma.duckdns.org', out)
self.assertNotIn('@calendar', out)
self.assertNotIn('@files', out)
class TestGenerateCaddyfileHttp01(unittest.TestCase):
@@ -101,26 +159,39 @@ class TestGenerateCaddyfileHttp01(unittest.TestCase):
identity = {
'cell_name': 'delta',
'domain_mode': 'http01',
'custom_domain': 'delta.noip.me',
'domain_name': 'delta.noip.me',
}
# Store-plugin service (not a core service name)
services = [
{'name': 'calendar', 'caddy_route':
'reverse_proxy cell-radicale:5232'},
{'name': 'files', 'caddy_route':
'reverse_proxy cell-filegator:8080'},
{'name': 'chat', 'caddy_route': 'reverse_proxy cell-chat:8090'},
]
out = mgr.generate_caddyfile(identity, services)
# No wildcard, no DNS-01 plugins.
self.assertNotIn('*.delta', out)
self.assertNotIn('dns ', out)
# No explicit tls block (no internal CA, no plugin) — the host block
# itself is left empty so Caddy uses HTTP-01 by default.
# No explicit tls block — Caddy uses HTTP-01 by default.
self.assertNotIn('tls {', out)
# Per-service blocks
self.assertIn('calendar.delta.noip.me {', out)
self.assertIn('files.delta.noip.me {', out)
self.assertIn('reverse_proxy cell-radicale:5232', out)
self.assertIn('reverse_proxy cell-filegator:8080', out)
# Without a registry only the api block is generated
self.assertIn('api.delta.noip.me {', out)
self.assertNotIn('calendar.delta.noip.me {', out)
self.assertNotIn('files.delta.noip.me {', out)
self.assertNotIn('mail.delta.noip.me {', out)
# Installed plugin service block still works
self.assertIn('chat.delta.noip.me {', out)
self.assertIn('reverse_proxy cell-chat:8090', out)
def test_http01_installed_service_with_caddy_route_appears(self):
"""An installed service with a caddy_route produces its own per-host block."""
mgr = _mgr()
identity = {
'cell_name': 'delta',
'domain_mode': 'http01',
'domain_name': 'delta.noip.me',
}
services = [{'name': 'notes', 'caddy_route': 'reverse_proxy cell-other:9000'}]
out = mgr.generate_caddyfile(identity, services)
self.assertIn('notes.delta.noip.me {', out)
self.assertIn('reverse_proxy cell-other:9000', out)
class TestServiceRoutesIncluded(unittest.TestCase):
@@ -138,7 +209,7 @@ class TestServiceRoutesIncluded(unittest.TestCase):
self.assertIn('reverse_proxy cell-filegator:8080', out)
# Core routes still emitted
self.assertIn('reverse_proxy cell-api:3000', out)
self.assertIn('reverse_proxy cell-webui:80', out)
self.assertIn('reverse_proxy cell-webui:8080', out)
class TestReloadCaddyAdminAPI(unittest.TestCase):
@@ -147,7 +218,7 @@ class TestReloadCaddyAdminAPI(unittest.TestCase):
# Point at a tmp Caddyfile so we can read it back during reload.
import tempfile
tmp = tempfile.NamedTemporaryFile('w', delete=False, suffix='.caddyfile')
tmp.write(":80 { reverse_proxy cell-webui:80 }\n")
tmp.write(":80 { reverse_proxy cell-webui:8080 }\n")
tmp.close()
mgr.caddyfile_path = tmp.name
@@ -161,7 +232,7 @@ class TestReloadCaddyAdminAPI(unittest.TestCase):
# First positional arg is the URL
self.assertEqual(args[0], 'http://cell-caddy:2019/load')
self.assertEqual(kwargs['headers']['Content-Type'], 'text/caddyfile')
self.assertIn('cell-webui:80', kwargs['data'])
self.assertIn('cell-webui:8080', kwargs['data'])
os.unlink(tmp.name)
@@ -172,8 +243,8 @@ class TestHealthCheck(unittest.TestCase):
mock_get.return_value = MagicMock(status_code=200)
self.assertTrue(mgr.check_caddy_health())
mock_get.assert_called_once()
# URL must be the admin API root
self.assertIn('cell-caddy:2019', mock_get.call_args[0][0])
# Must hit /config/ — not the root which returns 404
self.assertIn('/config/', mock_get.call_args[0][0])
def test_returns_false_on_connection_error(self):
mgr = _mgr()
@@ -224,5 +295,474 @@ class TestCertStatus(unittest.TestCase):
self.assertEqual(out['days_remaining'], 84)
class TestCaddyManagerIdentityChangedSubscription(unittest.TestCase):
def test_subscribes_to_identity_changed_on_init(self):
"""When service_bus is provided, CaddyManager subscribes to IDENTITY_CHANGED."""
from service_bus import EventType
mock_bus = MagicMock()
mgr = CaddyManager(config_manager=MagicMock(), service_bus=mock_bus)
mock_bus.subscribe_to_event.assert_called_once_with(
EventType.IDENTITY_CHANGED, mgr._on_identity_changed
)
def test_no_subscription_without_service_bus(self):
"""When service_bus is omitted, no subscription is attempted."""
mock_bus = MagicMock()
CaddyManager(config_manager=MagicMock())
mock_bus.subscribe_to_event.assert_not_called()
def test_on_identity_changed_calls_regenerate_with_installed(self):
"""_on_identity_changed calls regenerate_with_installed([])."""
mgr = _mgr()
with patch.object(mgr, 'regenerate_with_installed', return_value=True) as mock_regen:
event = MagicMock()
mgr._on_identity_changed(event)
mock_regen.assert_called_once_with([])
def test_on_identity_changed_swallows_exceptions(self):
"""_on_identity_changed must not propagate exceptions."""
mgr = _mgr()
with patch.object(mgr, 'regenerate_with_installed', side_effect=Exception('boom')):
event = MagicMock()
mgr._on_identity_changed(event) # must not raise
class TestRefreshCertStatus(unittest.TestCase):
"""refresh_cert_status() + _check_cert_via_ssl()."""
def _make_der_cert(self, days_remaining: int) -> bytes:
"""Return a minimal self-signed DER cert valid for *days_remaining* days."""
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
import datetime
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
now = datetime.datetime.now(datetime.timezone.utc)
expiry = now + datetime.timedelta(days=days_remaining)
cert = (
x509.CertificateBuilder()
.subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, 'test.example.com')]))
.issuer_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, 'test.example.com')]))
.public_key(key.public_key())
.serial_number(x509.random_serial_number())
.not_valid_before(expiry - datetime.timedelta(days=30))
.not_valid_after(expiry)
.sign(key, hashes.SHA256())
)
return cert.public_bytes(serialization.Encoding.DER)
def test_check_cert_via_ssl_returns_none_on_connection_error(self):
"""_check_cert_via_ssl returns None when connection fails."""
with patch('caddy_manager._socket.create_connection', side_effect=OSError('refused')):
result = CaddyManager._check_cert_via_ssl('host', 443)
self.assertIsNone(result)
def test_check_cert_via_ssl_returns_valid_status(self):
"""_check_cert_via_ssl returns valid status for a future-dated cert."""
der = self._make_der_cert(60)
mock_tls = MagicMock()
mock_tls.__enter__ = MagicMock(return_value=mock_tls)
mock_tls.__exit__ = MagicMock(return_value=False)
mock_tls.getpeercert.return_value = der
mock_raw = MagicMock()
mock_raw.__enter__ = MagicMock(return_value=mock_raw)
mock_raw.__exit__ = MagicMock(return_value=False)
with patch('caddy_manager._socket.create_connection', return_value=mock_raw):
with patch('caddy_manager._ssl.create_default_context') as mock_ctx:
mock_ctx.return_value.wrap_socket.return_value = mock_tls
result = CaddyManager._check_cert_via_ssl('host', 443)
self.assertIsNotNone(result)
self.assertEqual(result['status'], 'valid')
self.assertGreater(result['days_remaining'], 50)
def test_check_cert_via_ssl_returns_expired_for_past_cert(self):
"""_check_cert_via_ssl returns expired when cert is in the past."""
der = self._make_der_cert(-5)
mock_tls = MagicMock()
mock_tls.__enter__ = MagicMock(return_value=mock_tls)
mock_tls.__exit__ = MagicMock(return_value=False)
mock_tls.getpeercert.return_value = der
mock_raw = MagicMock()
mock_raw.__enter__ = MagicMock(return_value=mock_raw)
mock_raw.__exit__ = MagicMock(return_value=False)
with patch('caddy_manager._socket.create_connection', return_value=mock_raw):
with patch('caddy_manager._ssl.create_default_context') as mock_ctx:
mock_ctx.return_value.wrap_socket.return_value = mock_tls
result = CaddyManager._check_cert_via_ssl('host', 443)
self.assertIsNotNone(result)
self.assertEqual(result['status'], 'expired')
self.assertLess(result['days_remaining'], 0)
def test_refresh_cert_status_lan_mode_returns_internal(self):
"""LAN mode always returns status='internal' without SSL check."""
mgr = _mgr(identity={'cell_name': 'x', 'domain_mode': 'lan'})
with patch.object(CaddyManager, '_check_cert_via_ssl') as mock_ssl:
result = mgr.refresh_cert_status()
mock_ssl.assert_not_called()
self.assertEqual(result['status'], 'internal')
def test_refresh_cert_status_acme_mode_calls_ssl_check(self):
"""ACME mode calls _check_cert_via_ssl and persists the result."""
mgr = _mgr(identity={'cell_name': 'alpha', 'domain_mode': 'pic_ngo'})
expected = {'status': 'valid', 'expiry': '2026-12-01T00:00:00+00:00', 'days_remaining': 179}
with patch.object(CaddyManager, '_check_cert_via_ssl', return_value=expected):
result = mgr.refresh_cert_status()
self.assertEqual(result['status'], 'valid')
# Should have been persisted to identity
mgr.config_manager.set_identity_field.assert_called_with('tls', expected)
def test_refresh_cert_status_uses_effective_domain_as_sni(self):
"""refresh_cert_status passes the effective domain as SNI, not the container hostname.
Without this, Caddy receives SNI='cell-caddy' which matches no certificate
and the SSL handshake returns nothing, leaving cert status as 'unknown'.
"""
mgr = _mgr(identity={'cell_name': 'pic1', 'domain_mode': 'pic_ngo'})
mgr.config_manager.get_effective_domain.return_value = 'pic1.pic.ngo'
expected = {'status': 'valid', 'expiry': '2026-12-01T00:00:00+00:00', 'days_remaining': 179}
with patch.object(CaddyManager, '_check_cert_via_ssl', return_value=expected) as mock_ssl:
mgr.refresh_cert_status()
# The SNI keyword argument must be the effective domain, not the container name.
call_kwargs = mock_ssl.call_args
sni_passed = call_kwargs.kwargs.get('sni') or (
call_kwargs.args[2] if len(call_kwargs.args) > 2 else None
)
self.assertEqual(sni_passed, 'pic1.pic.ngo',
f'Expected SNI=pic1.pic.ngo but got {sni_passed!r}')
def test_check_cert_via_ssl_passes_sni_to_wrap_socket(self):
"""_check_cert_via_ssl uses sni parameter as server_hostname in SSL handshake."""
der = self._make_der_cert(60)
mock_tls = MagicMock()
mock_tls.__enter__ = MagicMock(return_value=mock_tls)
mock_tls.__exit__ = MagicMock(return_value=False)
mock_tls.getpeercert.return_value = der
mock_raw = MagicMock()
mock_raw.__enter__ = MagicMock(return_value=mock_raw)
mock_raw.__exit__ = MagicMock(return_value=False)
with patch('caddy_manager._socket.create_connection', return_value=mock_raw) as mock_conn:
with patch('caddy_manager._ssl.create_default_context') as mock_ctx:
mock_ctx.return_value.wrap_socket.return_value = mock_tls
CaddyManager._check_cert_via_ssl('cell-caddy', 443, sni='pic1.pic.ngo')
# TCP connects to container hostname, SSL handshake uses the public domain
mock_conn.assert_called_with(('cell-caddy', 443), timeout=5)
mock_ctx.return_value.wrap_socket.assert_called_with(
mock_raw, server_hostname='pic1.pic.ngo'
)
def test_refresh_cert_status_ssl_failure_returns_unknown(self):
"""When SSL check returns None, status is 'unknown'."""
mgr = _mgr(identity={'cell_name': 'alpha', 'domain_mode': 'pic_ngo'})
with patch.object(CaddyManager, '_check_cert_via_ssl', return_value=None):
result = mgr.refresh_cert_status()
self.assertEqual(result['status'], 'unknown')
def test_get_cert_status_fresh_refreshes_when_stale(self):
"""get_cert_status_fresh triggers a refresh when cache is None."""
mgr = _mgr(identity={'cell_name': 'alpha', 'domain_mode': 'pic_ngo'})
mgr._cert_refreshed_at = None
with patch.object(mgr, 'refresh_cert_status', return_value={'status': 'valid'}) as mock_ref:
with patch.object(mgr, 'get_cert_status', return_value={'status': 'valid'}):
mgr.get_cert_status_fresh()
mock_ref.assert_called_once()
def test_get_cert_status_fresh_skips_refresh_when_recent(self):
"""get_cert_status_fresh skips refresh when cache is fresh."""
import time
mgr = _mgr(identity={'cell_name': 'alpha', 'domain_mode': 'pic_ngo'})
mgr._cert_refreshed_at = time.monotonic() # just refreshed
with patch.object(mgr, 'refresh_cert_status') as mock_ref:
with patch.object(mgr, 'get_cert_status', return_value={'status': 'valid'}):
mgr.get_cert_status_fresh(max_age_seconds=300)
mock_ref.assert_not_called()
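The SNI tests above rely on one idea: the TCP connection goes to the container hostname, while the TLS handshake names the public domain. The block below is a minimal sketch of that idea using only the standard library. The names `fetch_peer_cert_der` and `classify_expiry` are hypothetical and are not the project's `_check_cert_via_ssl`; the two-state valid/expired rule is an assumption taken from what the tests assert, and the real method may distinguish more states.

```python
import datetime
import socket
import ssl
from typing import Optional


def fetch_peer_cert_der(host: str, port: int = 443, sni: Optional[str] = None) -> Optional[bytes]:
    """Hypothetical helper: connect to host:port, but present `sni` in the TLS handshake."""
    try:
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=5) as raw:
            # server_hostname selects which certificate the server offers (SNI).
            with ctx.wrap_socket(raw, server_hostname=sni or host) as tls:
                return tls.getpeercert(binary_form=True)
    except OSError:
        # ssl.SSLError is a subclass of OSError, so handshake failures land here too.
        return None


def classify_expiry(not_after: datetime.datetime, now: datetime.datetime) -> dict:
    """Hypothetical helper: turn an expiry timestamp into a status dict."""
    days_remaining = (not_after - now).days
    # Assumption: only 'valid' and 'expired' are modelled in this sketch.
    status = 'valid' if not_after > now else 'expired'
    return {
        'status': status,
        'expiry': not_after.isoformat(),
        'days_remaining': days_remaining,
    }
```

A caller inside the API container would use something like `fetch_peer_cert_der('cell-caddy', 443, sni='pic1.pic.ngo')`, then parse the DER bytes and pass the certificate's expiry to `classify_expiry`.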
class TestGetCertStatusEnriched(unittest.TestCase):
"""get_cert_status() returns domain, domain_mode, cert_type alongside tls fields."""
def test_includes_domain_and_mode_for_pic_ngo(self):
mgr = _mgr(identity={
'cell_name': 'alpha',
'domain_mode': 'pic_ngo',
'tls': {'status': 'valid', 'expiry': '2026-12-01T00:00:00+00:00', 'days_remaining': 180},
})
s = mgr.get_cert_status()
self.assertEqual(s['domain_mode'], 'pic_ngo')
self.assertEqual(s['domain'], '*.alpha.pic.ngo')
self.assertEqual(s['cert_type'], 'acme')
self.assertEqual(s['status'], 'valid')
def test_cert_type_is_internal_for_lan_mode(self):
mgr = _mgr(identity={'cell_name': 'x', 'domain_mode': 'lan', 'tls': {}})
s = mgr.get_cert_status()
self.assertEqual(s['cert_type'], 'internal')
self.assertIsNone(s['domain'])
def test_cert_type_is_custom_when_tls_says_so(self):
mgr = _mgr(identity={
'cell_name': 'x',
'domain_mode': 'lan',
'tls': {'cert_type': 'custom', 'status': 'valid',
'expiry': '2027-01-01T00:00:00+00:00', 'days_remaining': 200},
})
s = mgr.get_cert_status()
self.assertEqual(s['cert_type'], 'custom')
def test_domain_label_cloudflare(self):
ident = {'domain_mode': 'cloudflare', 'domain_name': 'example.com'}
self.assertEqual(CaddyManager._domain_label(ident), '*.example.com')
def test_domain_label_duckdns(self):
ident = {'cell_name': 'beta', 'domain_mode': 'duckdns'}
self.assertEqual(CaddyManager._domain_label(ident), '*.beta.duckdns.org')
def test_domain_label_http01(self):
ident = {'domain_mode': 'http01', 'domain_name': 'myhost.noip.me'}
self.assertEqual(CaddyManager._domain_label(ident), 'myhost.noip.me')
def test_domain_label_lan_is_none(self):
self.assertIsNone(CaddyManager._domain_label({'domain_mode': 'lan'}))
class TestRenewCert(unittest.TestCase):
"""renew_cert() — mode guard, reload call, cache invalidation."""
def test_lan_mode_returns_error(self):
mgr = _mgr(identity={'domain_mode': 'lan'})
result = mgr.renew_cert()
self.assertFalse(result['ok'])
self.assertIn('LAN', result['error'])
def test_acme_mode_calls_regenerate(self):
mgr = _mgr(identity={'domain_mode': 'pic_ngo'})
with patch.object(mgr, 'regenerate_with_installed', return_value=True) as mock_regen:
result = mgr.renew_cert()
mock_regen.assert_called_once_with([])
self.assertTrue(result['ok'])
self.assertEqual(result['status'], 'pending')
def test_reload_failure_propagated(self):
mgr = _mgr(identity={'domain_mode': 'cloudflare'})
with patch.object(mgr, 'regenerate_with_installed', return_value=False):
result = mgr.renew_cert()
self.assertFalse(result['ok'])
self.assertIn('reload failed', result['error'])
def test_invalidates_cache_on_success(self):
import time
mgr = _mgr(identity={'domain_mode': 'pic_ngo'})
mgr._cert_refreshed_at = time.monotonic()
with patch.object(mgr, 'regenerate_with_installed', return_value=True):
mgr.renew_cert()
self.assertIsNone(mgr._cert_refreshed_at)
class TestUploadCustomCert(unittest.TestCase):
"""upload_custom_cert() — validation, file writes, identity persistence, Caddyfile regen."""
def _make_pem_cert(self, days_remaining: int = 90):
"""Return (cert_pem, key_pem) for a self-signed cert."""
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
import datetime
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
now = datetime.datetime.now(datetime.timezone.utc)
expiry = now + datetime.timedelta(days=days_remaining)
not_before = (now - datetime.timedelta(days=abs(days_remaining) + 10)
if days_remaining < 0 else now - datetime.timedelta(days=1))
cert = (
x509.CertificateBuilder()
.subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, 'test.example.com')]))
.issuer_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, 'test.example.com')]))
.public_key(key.public_key())
.serial_number(x509.random_serial_number())
.not_valid_before(not_before)
.not_valid_after(expiry)
.sign(key, hashes.SHA256())
)
cert_pem = cert.public_bytes(serialization.Encoding.PEM).decode()
key_pem = key.private_bytes(
serialization.Encoding.PEM,
serialization.PrivateFormat.TraditionalOpenSSL,
serialization.NoEncryption(),
).decode()
return cert_pem, key_pem
def test_rejects_invalid_cert_pem(self):
mgr = _mgr()
result = mgr.upload_custom_cert('not a cert', '-----BEGIN PRIVATE KEY-----\nXXX\n-----END PRIVATE KEY-----')
self.assertFalse(result['ok'])
self.assertIn('Invalid certificate', result['error'])
def test_rejects_invalid_key_pem(self):
mgr = _mgr()
cert_pem, _ = self._make_pem_cert()
result = mgr.upload_custom_cert(cert_pem, 'not a key')
self.assertFalse(result['ok'])
self.assertIn('Invalid private key', result['error'])
def test_writes_files_to_certs_dir(self):
mgr = _mgr(identity={'domain_mode': 'lan', 'cell_name': 'x'})
cert_pem, key_pem = self._make_pem_cert()
written = {}
def fake_open(path, mode='r', **kw):
import unittest.mock
m = unittest.mock.mock_open()()
if 'w' in mode:
written[path] = True
return m
with patch('builtins.open', side_effect=fake_open):
with patch('os.makedirs'):
with patch.object(mgr, 'regenerate_with_installed', return_value=True):
mgr.upload_custom_cert(cert_pem, key_pem)
self.assertTrue(any('cert.pem' in p for p in written))
self.assertTrue(any('key.pem' in p for p in written))
def test_persists_custom_cert_type_to_identity(self):
mgr = _mgr(identity={'domain_mode': 'lan', 'cell_name': 'x'})
cert_pem, key_pem = self._make_pem_cert(days_remaining=90)
with patch('builtins.open', unittest.mock.mock_open()):
with patch('os.makedirs'):
with patch.object(mgr, 'regenerate_with_installed', return_value=True):
result = mgr.upload_custom_cert(cert_pem, key_pem)
self.assertTrue(result['ok'])
self.assertEqual(result['cert_type'], 'custom')
self.assertEqual(result['status'], 'valid')
mgr.config_manager.set_identity_field.assert_called_once()
call_args = mgr.config_manager.set_identity_field.call_args
self.assertEqual(call_args[0][0], 'tls')
self.assertEqual(call_args[0][1]['cert_type'], 'custom')
def test_expired_cert_flagged_as_expired(self):
mgr = _mgr(identity={'domain_mode': 'lan', 'cell_name': 'x'})
cert_pem, key_pem = self._make_pem_cert(days_remaining=-5)
with patch('builtins.open', unittest.mock.mock_open()):
with patch('os.makedirs'):
with patch.object(mgr, 'regenerate_with_installed', return_value=True):
result = mgr.upload_custom_cert(cert_pem, key_pem)
self.assertEqual(result['status'], 'expired')
def test_file_write_failure_returns_error(self):
mgr = _mgr(identity={'domain_mode': 'lan'})
cert_pem, key_pem = self._make_pem_cert()
with patch('os.makedirs'):
with patch('builtins.open', side_effect=OSError('no space')):
result = mgr.upload_custom_cert(cert_pem, key_pem)
self.assertFalse(result['ok'])
self.assertIn('Failed to write', result['error'])
class TestCaddyfileLanCustomCert(unittest.TestCase):
"""_caddyfile_lan() uses the custom cert path when cert_type=custom."""
def test_default_uses_internal_cert_path(self):
mgr = _mgr(identity={'cell_name': 'mycell', 'domain_mode': 'lan'})
out = mgr.generate_caddyfile({'cell_name': 'mycell', 'domain_mode': 'lan'}, [])
self.assertIn('/etc/caddy/internal/cert.pem', out)
def test_custom_cert_type_uses_shared_cert_path(self):
mgr = _mgr(identity={
'cell_name': 'mycell',
'domain_mode': 'lan',
'tls': {'cert_type': 'custom'},
})
out = mgr.generate_caddyfile({'cell_name': 'mycell', 'domain_mode': 'lan'}, [])
self.assertIn('/config/caddy/certs/cert.pem', out)
self.assertNotIn('/etc/caddy/internal/cert.pem', out)
class TestPicNgoNoTokenFallback(unittest.TestCase):
"""pic_ngo mode with no token falls back to lan so Caddy starts cleanly."""
def test_empty_token_generates_lan_caddyfile(self):
mgr = _mgr()
mgr.config_manager.configs = {'ddns': {'url': 'https://ddns.pic.ngo'}}
mgr.config_manager.get_ddns_token.return_value = ''
with patch.dict(os.environ, {}, clear=False):
os.environ.pop('DDNS_TOKEN', None)
os.environ.pop('DDNS_URL', None)
out = mgr.generate_caddyfile({'cell_name': 'x', 'domain_mode': 'pic_ngo'}, [])
self.assertIn('auto_https off', out)
self.assertNotIn('dns pic_ngo', out)
self.assertNotIn('token', out)
def test_missing_ddns_config_generates_lan_caddyfile(self):
mgr = _mgr()
mgr.config_manager.configs = {}
mgr.config_manager.get_ddns_token.return_value = ''
with patch.dict(os.environ, {}, clear=False):
os.environ.pop('DDNS_TOKEN', None)
os.environ.pop('DDNS_URL', None)
out = mgr.generate_caddyfile({'cell_name': 'x', 'domain_mode': 'pic_ngo'}, [])
self.assertIn('auto_https off', out)
self.assertNotIn('dns pic_ngo', out)
class TestDdnsApiStripsLegacySuffix(unittest.TestCase):
"""_caddyfile_pic_ngo strips /api/v1 from ddns_api so the plugin doesn't double it."""
def test_api_v1_suffix_stripped_from_config_url(self):
mgr = _mgr()
mgr.config_manager.configs = {
'ddns': {'url': 'https://ddns.pic.ngo/api/v1'},
}
mgr.config_manager.get_ddns_token.return_value = 'tok'
with patch.dict(os.environ, {}, clear=False):
os.environ.pop('DDNS_URL', None)
out = mgr.generate_caddyfile({'cell_name': 'x', 'domain_mode': 'pic_ngo'}, [])
self.assertIn('api_base_url https://ddns.pic.ngo', out)
self.assertNotIn('api_base_url https://ddns.pic.ngo/api/v1', out)
def test_clean_url_is_unchanged(self):
mgr = _mgr()
mgr.config_manager.configs = {
'ddns': {'url': 'https://ddns.pic.ngo'},
}
mgr.config_manager.get_ddns_token.return_value = 'tok'
with patch.dict(os.environ, {}, clear=False):
os.environ.pop('DDNS_URL', None)
out = mgr.generate_caddyfile({'cell_name': 'x', 'domain_mode': 'pic_ngo'}, [])
self.assertIn('api_base_url https://ddns.pic.ngo', out)
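The two tests above pin down one normalization rule: a configured DDNS URL ending in /api/v1 is trimmed before it is written as api_base_url, and a clean URL passes through unchanged. A minimal sketch of that rule follows, assuming a hypothetical helper named `strip_legacy_api_suffix`. The real logic lives inside `_caddyfile_pic_ngo` and may differ, for example in how it treats trailing slashes.

```python
def strip_legacy_api_suffix(url: str, suffix: str = '/api/v1') -> str:
    """Hypothetical helper: drop a trailing /api/v1 so the DNS plugin can append it itself."""
    # Assumption: a trailing slash carries no meaning and can be dropped first.
    trimmed = url.rstrip('/')
    if trimmed.endswith(suffix):
        trimmed = trimmed[:-len(suffix)]
    return trimmed
```

Trimming only at the end of the string keeps a URL such as `https://host/api/v1/proxy` intact, which seems the safer reading of "legacy suffix".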
class TestCaddyLogLevel(unittest.TestCase):
"""Container log level injects a global `log { level <X> }` block."""
def _mgr_with_level(self, level):
cm = MagicMock()
cm.get_identity.return_value = {}
cm.get_logging_config.return_value = {
'python': {'root': 'INFO', 'services': {}},
'containers': {'caddy': level},
}
return CaddyManager(config_manager=cm, data_dir='/tmp/pic-t', config_dir='/tmp/pic-t')
def test_debug_emits_global_log_block_lan(self):
mgr = self._mgr_with_level('DEBUG')
out = mgr.generate_caddyfile({'cell_name': 'c', 'domain_mode': 'lan'}, [])
self.assertIn('log {', out)
self.assertIn('level DEBUG', out)
def test_info_emits_no_log_block(self):
mgr = self._mgr_with_level('INFO')
out = mgr.generate_caddyfile({'cell_name': 'c', 'domain_mode': 'lan'}, [])
self.assertNotIn('log {', out)
def test_warning_maps_to_caddy_warn(self):
mgr = self._mgr_with_level('WARNING')
out = mgr.generate_caddyfile({'cell_name': 'c', 'domain_mode': 'lan'}, [])
self.assertIn('level WARN', out)
if __name__ == '__main__':
unittest.main()
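The TestCaddyLogLevel cases above imply a small mapping from the configured container level to Caddy's global log block: DEBUG produces a block, INFO produces nothing, and WARNING is spelled WARN. The sketch below shows one way to express that. `caddy_log_block` and `_CADDY_LEVELS` are hypothetical names, and only the WARNING to WARN translation is confirmed by the tests; other levels are assumed to pass through unchanged.

```python
from typing import Optional

# Assumption: only WARNING needs renaming; every other level name passes through.
_CADDY_LEVELS = {'WARNING': 'WARN'}


def caddy_log_block(level: Optional[str]) -> str:
    """Hypothetical helper: return a global `log { level X }` block, or '' for INFO."""
    normalized = (level or 'INFO').upper()
    if normalized == 'INFO':
        # INFO is Caddy's default level, so no block is emitted.
        return ''
    caddy_level = _CADDY_LEVELS.get(normalized, normalized)
    return f'log {{\n    level {caddy_level}\n}}\n'
```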
+532
@@ -0,0 +1,532 @@
"""Integration tests for registry-driven CaddyManager and NetworkManager routing.
These tests cover the new registry path introduced in Step 5 of the PIC Services
Architecture. The no-registry (fallback) paths are already covered by
test_caddy_manager.py and test_network_manager.py.
"""
import os
import sys
import shutil
import tempfile
import unittest
from unittest.mock import MagicMock
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'api'))
from caddy_manager import CaddyManager # noqa: E402
from network_manager import NetworkManager # noqa: E402
# ---------------------------------------------------------------------------
# Shared helpers
# ---------------------------------------------------------------------------
def _mgr_with_registry(registry=None):
"""Build a CaddyManager wired to an optional mock registry."""
cm = MagicMock()
cm.get_identity.return_value = {}
return CaddyManager(config_manager=cm, service_registry=registry)
def _mock_registry():
"""Return a mock ServiceRegistry that reproduces 3 store service routes."""
reg = MagicMock()
reg.get_caddy_routes.return_value = [
{
'service_id': 'calendar',
'subdomain': 'calendar',
'backend': 'cell-radicale:5232',
'extra_subdomains': [],
'extra_backends': {},
},
{
'service_id': 'email',
'subdomain': 'mail',
'backend': 'cell-rainloop:8888',
'extra_subdomains': ['webmail'],
'extra_backends': {},
},
{
'service_id': 'files',
'subdomain': 'files',
'backend': 'cell-filegator:8080',
'extra_subdomains': ['webdav'],
'extra_backends': {'webdav': 'cell-webdav:80'},
},
]
return reg
def _nm(registry=None):
"""Build a NetworkManager backed by temp dirs and an optional mock registry."""
tmpdir = tempfile.mkdtemp()
nm = NetworkManager(
data_dir=os.path.join(tmpdir, 'data'),
config_dir=os.path.join(tmpdir, 'config'),
service_registry=registry,
)
nm._tmpdir = tmpdir # stash so the caller can clean up
return nm
# ---------------------------------------------------------------------------
# TestBuildRegistryServiceRoutes
# ---------------------------------------------------------------------------
class TestBuildRegistryServiceRoutes(unittest.TestCase):
def test_returns_api_only_when_no_registry(self):
"""service_registry=None produces only the @api block."""
mgr = _mgr_with_registry(registry=None)
domain = 'alpha.pic.ngo'
result = mgr._build_registry_service_routes(domain)
self.assertIn('@api host api.alpha.pic.ngo', result)
self.assertIn('reverse_proxy cell-api:3000', result)
self.assertNotIn('@calendar', result)
self.assertNotIn('@mail', result)
def test_returns_api_only_when_registry_empty(self):
"""An empty route list from the registry produces only the @api block."""
reg = MagicMock()
reg.get_caddy_routes.return_value = []
mgr = _mgr_with_registry(registry=reg)
domain = 'alpha.pic.ngo'
result = mgr._build_registry_service_routes(domain)
self.assertIn('@api host api.alpha.pic.ngo', result)
self.assertIn('reverse_proxy cell-api:3000', result)
self.assertNotIn('@calendar', result)
self.assertNotIn('@mail', result)
def test_returns_api_only_on_registry_error(self):
"""When get_caddy_routes raises, only the @api block is produced."""
reg = MagicMock()
reg.get_caddy_routes.side_effect = Exception('registry unavailable')
mgr = _mgr_with_registry(registry=reg)
domain = 'alpha.pic.ngo'
result = mgr._build_registry_service_routes(domain)
self.assertIn('@api host api.alpha.pic.ngo', result)
self.assertIn('reverse_proxy cell-api:3000', result)
self.assertNotIn('@calendar', result)
self.assertNotIn('@mail', result)
def test_single_service_no_extras(self):
"""One service with no extra_subdomains produces one matcher + handle + api block."""
reg = MagicMock()
reg.get_caddy_routes.return_value = [
{
'service_id': 'calendar',
'subdomain': 'calendar',
'backend': 'cell-radicale:5232',
'extra_subdomains': [],
'extra_backends': {},
}
]
mgr = _mgr_with_registry(registry=reg)
result = mgr._build_registry_service_routes('test.cell')
self.assertIn('@calendar host calendar.test.cell', result)
self.assertIn('reverse_proxy cell-radicale:5232', result)
self.assertIn('@api host api.test.cell', result)
self.assertIn('reverse_proxy cell-api:3000', result)
# Only two named-matcher definition lines: @calendar and @api
matcher_lines = [line for line in result.splitlines() if line.strip().startswith('@') and 'host' in line]
self.assertEqual(len(matcher_lines), 2)
def test_extra_subdomain_same_backend(self):
"""An extra_subdomain NOT in extra_backends shares the primary matcher host line."""
reg = MagicMock()
reg.get_caddy_routes.return_value = [
{
'service_id': 'email',
'subdomain': 'mail',
'backend': 'cell-rainloop:8888',
'extra_subdomains': ['webmail'],
'extra_backends': {}, # webmail not listed → shares backend
}
]
mgr = _mgr_with_registry(registry=reg)
result = mgr._build_registry_service_routes('test.cell')
# Both subdomains appear in the same host matcher line
self.assertIn('@mail host mail.test.cell webmail.test.cell', result)
# Only one reverse_proxy for cell-rainloop (shared block)
self.assertEqual(result.count('reverse_proxy cell-rainloop:8888'), 1)
# No separate @webmail block
self.assertNotIn('@webmail host', result)
def test_extra_subdomain_different_backend(self):
"""An extra_subdomain listed in extra_backends gets its own matcher + handle block."""
reg = MagicMock()
reg.get_caddy_routes.return_value = [
{
'service_id': 'files',
'subdomain': 'files',
'backend': 'cell-filegator:8080',
'extra_subdomains': ['webdav'],
'extra_backends': {'webdav': 'cell-webdav:80'},
}
]
mgr = _mgr_with_registry(registry=reg)
result = mgr._build_registry_service_routes('test.cell')
# files gets its own block (webdav not in shared list)
self.assertIn('@files host files.test.cell', result)
self.assertIn('reverse_proxy cell-filegator:8080', result)
# webdav gets a separate block
self.assertIn('@webdav host webdav.test.cell', result)
self.assertIn('reverse_proxy cell-webdav:80', result)
# webdav must NOT appear in the @files host line
files_line = [line for line in result.splitlines() if '@files host' in line][0]
self.assertNotIn('webdav', files_line)
def test_api_always_appended(self):
"""The @api block is always the last block even when registry has no api entry."""
reg = _mock_registry()
mgr = _mgr_with_registry(registry=reg)
result = mgr._build_registry_service_routes('alpha.pic.ngo')
self.assertIn('@api host api.alpha.pic.ngo', result)
self.assertIn('reverse_proxy cell-api:3000', result)
# api block is at the end
api_idx = result.rfind('@api')
other_matchers = ['@calendar', '@mail', '@files', '@webdav']
for m in other_matchers:
self.assertLess(result.index(m), api_idx,
f'{m} should appear before @api')
def test_api_not_duplicated_when_registry_returns_api(self):
"""Even if registry somehow returns an 'api' route, the injected api block is cell-api:3000."""
reg = MagicMock()
reg.get_caddy_routes.return_value = [
{
'service_id': 'api',
'subdomain': 'api',
'backend': 'cell-other:9999', # wrong backend — should be overridden
'extra_subdomains': [],
'extra_backends': {},
}
]
mgr = _mgr_with_registry(registry=reg)
result = mgr._build_registry_service_routes('test.cell')
# The infrastructure api block is always appended with the canonical backend
self.assertIn('reverse_proxy cell-api:3000', result)
# The api host matcher must be present. This checks presence only: a count of 2
# (registry entry plus the appended block) would still pass this assertion.
self.assertGreaterEqual(result.count('@api host api.test.cell'), 1)
# ---------------------------------------------------------------------------
# TestHttp01ServicePairs
# ---------------------------------------------------------------------------
class TestHttp01ServicePairs(unittest.TestCase):
def test_pairs_from_registry(self):
"""With the 3 builtins the pairs list matches expected (subdomain, backend) tuples."""
reg = _mock_registry()
mgr = _mgr_with_registry(registry=reg)
pairs = mgr._http01_service_pairs()
pairs_dict = dict(pairs)
self.assertEqual(pairs_dict['calendar'], 'cell-radicale:5232')
self.assertEqual(pairs_dict['mail'], 'cell-rainloop:8888')
self.assertEqual(pairs_dict['webmail'], 'cell-rainloop:8888')
self.assertEqual(pairs_dict['files'], 'cell-filegator:8080')
self.assertEqual(pairs_dict['webdav'], 'cell-webdav:80')
self.assertEqual(pairs_dict['api'], 'cell-api:3000')
def test_webdav_gets_own_backend(self):
"""webdav must map to cell-webdav:80, not to cell-filegator:8080."""
reg = _mock_registry()
mgr = _mgr_with_registry(registry=reg)
pairs = mgr._http01_service_pairs()
webdav_entry = next((b for s, b in pairs if s == 'webdav'), None)
self.assertIsNotNone(webdav_entry)
self.assertEqual(webdav_entry, 'cell-webdav:80')
self.assertNotEqual(webdav_entry, 'cell-filegator:8080')
def test_only_api_when_no_registry(self):
"""Without a registry only the api pair is returned."""
mgr = _mgr_with_registry(registry=None)
pairs = mgr._http01_service_pairs()
subdomains = [s for s, _ in pairs]
self.assertIn('api', subdomains)
self.assertNotIn('calendar', subdomains)
self.assertNotIn('mail', subdomains)
self.assertNotIn('files', subdomains)
def test_only_api_on_registry_error(self):
"""When get_caddy_routes raises, only the api pair is present."""
reg = MagicMock()
reg.get_caddy_routes.side_effect = RuntimeError('boom')
mgr = _mgr_with_registry(registry=reg)
pairs = mgr._http01_service_pairs()
subdomains = [s for s, _ in pairs]
self.assertIn('api', subdomains)
self.assertNotIn('calendar', subdomains)
# ---------------------------------------------------------------------------
# TestCaddyfileWithRegistry
# ---------------------------------------------------------------------------
class TestCaddyfileWithRegistry(unittest.TestCase):
def _generate(self, domain_mode, cell_name='alpha', domain_name=None,
registry=None, services=None):
reg = registry if registry is not None else _mock_registry()
mgr = _mgr_with_registry(registry=reg)
identity = {'cell_name': cell_name, 'domain_mode': domain_mode}
if domain_name:
identity['domain_name'] = domain_name
return mgr.generate_caddyfile(identity, services or [])
def test_pic_ngo_with_registry_has_correct_routes(self):
"""pic_ngo Caddyfile has all service matchers with correct subdomains and backends."""
out = self._generate('pic_ngo', cell_name='alpha')
# calendar
self.assertIn('@calendar host calendar.alpha.pic.ngo', out)
self.assertIn('reverse_proxy cell-radicale:5232', out)
# mail + webmail share one matcher
self.assertIn('@mail host mail.alpha.pic.ngo webmail.alpha.pic.ngo', out)
self.assertIn('reverse_proxy cell-rainloop:8888', out)
# files
self.assertIn('@files host files.alpha.pic.ngo', out)
self.assertIn('reverse_proxy cell-filegator:8080', out)
# webdav separate block
self.assertIn('@webdav host webdav.alpha.pic.ngo', out)
self.assertIn('reverse_proxy cell-webdav:80', out)
# api always present
self.assertIn('@api host api.alpha.pic.ngo', out)
self.assertIn('reverse_proxy cell-api:3000', out)
def test_cloudflare_with_registry_uses_registry_routes(self):
"""cloudflare Caddyfile routes are sourced from registry, not hardcoded."""
out = self._generate('cloudflare', cell_name='beta',
domain_name='example.com')
self.assertIn('@calendar host calendar.example.com', out)
self.assertIn('@mail host mail.example.com webmail.example.com', out)
self.assertIn('@files host files.example.com', out)
self.assertIn('@webdav host webdav.example.com', out)
self.assertIn('@api host api.example.com', out)
# Correct DNS plugin block is still present
self.assertIn('dns cloudflare {$CF_API_TOKEN}', out)
def test_duckdns_with_registry_uses_registry_routes(self):
"""duckdns Caddyfile routes are sourced from registry."""
out = self._generate('duckdns', cell_name='gamma')
self.assertIn('@calendar host calendar.gamma.duckdns.org', out)
self.assertIn('@api host api.gamma.duckdns.org', out)
self.assertIn('dns duckdns {$DUCKDNS_TOKEN}', out)
def test_http01_with_registry_has_per_host_blocks(self):
"""http01 Caddyfile has individual per-host blocks for every service subdomain."""
out = self._generate('http01', cell_name='delta',
domain_name='delta.noip.me')
self.assertIn('calendar.delta.noip.me {', out)
self.assertIn('mail.delta.noip.me {', out)
self.assertIn('webmail.delta.noip.me {', out)
self.assertIn('files.delta.noip.me {', out)
self.assertIn('webdav.delta.noip.me {', out)
self.assertIn('api.delta.noip.me {', out)
# Correct backends
self.assertIn('reverse_proxy cell-radicale:5232', out)
self.assertIn('reverse_proxy cell-rainloop:8888', out)
self.assertIn('reverse_proxy cell-filegator:8080', out)
self.assertIn('reverse_proxy cell-webdav:80', out)
def test_pic_ngo_api_only_when_registry_empty(self):
"""pic_ngo emits only the api block when registry returns empty list."""
reg = MagicMock()
reg.get_caddy_routes.return_value = []
out = self._generate('pic_ngo', cell_name='alpha', registry=reg)
self.assertIn('@api host api.alpha.pic.ngo', out)
self.assertNotIn('@calendar', out)
self.assertNotIn('@mail', out)
# ---------------------------------------------------------------------------
# TestNetworkManagerGetServiceSubdomains
# ---------------------------------------------------------------------------
class TestNetworkManagerGetServiceSubdomains(unittest.TestCase):
def setUp(self):
self.managers = []
def tearDown(self):
for nm in self.managers:
shutil.rmtree(nm._tmpdir, ignore_errors=True)
def _make(self, registry=None):
nm = _nm(registry=registry)
self.managers.append(nm)
return nm
def test_no_registry_returns_empty(self):
"""Without a registry an empty list is returned."""
nm = self._make(registry=None)
subs = nm._get_service_subdomains()
self.assertEqual(subs, [])
def test_registry_returns_all_subdomains(self):
"""Primary + extra_subdomains from all routes are returned."""
reg = _mock_registry()
nm = self._make(registry=reg)
subs = nm._get_service_subdomains()
# calendar (primary), mail (primary), webmail (extra), files (primary), webdav (extra)
for expected in ('calendar', 'mail', 'webmail', 'files', 'webdav'):
self.assertIn(expected, subs)
def test_registry_error_returns_empty(self):
"""When get_caddy_routes raises, an empty list is returned."""
reg = MagicMock()
reg.get_caddy_routes.side_effect = Exception('broken registry')
nm = self._make(registry=reg)
subs = nm._get_service_subdomains()
self.assertEqual(subs, [])
def test_registry_extra_subdomains_included(self):
"""extra_subdomains from each route are included in the returned list."""
reg = MagicMock()
reg.get_caddy_routes.return_value = [
{
'service_id': 'files',
'subdomain': 'files',
'backend': 'cell-filegator:8080',
'extra_subdomains': ['webdav', 'dav'],
'extra_backends': {},
}
]
nm = self._make(registry=reg)
subs = nm._get_service_subdomains()
self.assertIn('files', subs)
self.assertIn('webdav', subs)
self.assertIn('dav', subs)
def test_build_dns_records_with_registry(self):
"""All registry subdomains appear as A records in _build_dns_records output."""
reg = _mock_registry()
nm = self._make(registry=reg)
# Override WG IP lookup so we get a predictable value
nm._get_wg_server_ip = lambda: '10.0.0.1'
records = nm._build_dns_records('mycell', '172.20.0.0/16')
names = [r['name'] for r in records]
for expected in ('mycell', 'api', 'webui', 'calendar', 'mail',
'webmail', 'files', 'webdav'):
self.assertIn(expected, names,
f'{expected!r} should be in DNS records but is not')
# All records must point to the WG server IP
for r in records:
self.assertEqual(r['value'], '10.0.0.1')
self.assertEqual(r['type'], 'A')
# ---------------------------------------------------------------------------
# TestNetworkManagerStaleSet
# ---------------------------------------------------------------------------
class TestNetworkManagerStaleSet(unittest.TestCase):
"""Verify that registry subdomains drive stale record cleanup in update_split_horizon_zone."""
def setUp(self):
self.test_dir = tempfile.mkdtemp()
data_dir = os.path.join(self.test_dir, 'data')
config_dir = os.path.join(self.test_dir, 'config')
os.makedirs(os.path.join(data_dir, 'dns'), exist_ok=True)
os.makedirs(os.path.join(config_dir, 'dns'), exist_ok=True)
self.reg = _mock_registry()
self.nm = NetworkManager(
data_dir=data_dir,
config_dir=config_dir,
service_registry=self.reg,
)
def tearDown(self):
shutil.rmtree(self.test_dir, ignore_errors=True)
def _write_zone(self, zone_name: str, content: str):
path = os.path.join(self.nm.dns_zones_dir, f'{zone_name}.zone')
with open(path, 'w') as f:
f.write(content)
def test_stale_set_includes_registry_subdomains(self):
"""Registry subdomains (calendar, mail, webmail, files, webdav) are treated as
stale service records and removed from the parent zone during
update_split_horizon_zone."""
# Build a parent zone with stale service records that the registry knows about
stale_records = [
{'name': 'pic2', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'api', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'webui', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'calendar', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'mail', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'webmail', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'files', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'webdav', 'type': 'A', 'value': '10.0.0.1'},
]
from unittest.mock import patch
with patch('subprocess.run'):
self.nm.update_dns_zone('pic.ngo', stale_records)
self.nm.update_split_horizon_zone(
'pic2.pic.ngo', '172.20.0.2', primary_domain='pic.ngo'
)
parent_zone = os.path.join(self.nm.dns_zones_dir, 'pic.ngo.zone')
with open(parent_zone) as f:
    content = f.read()
# All registry subdomains must be gone
for stale in ('api', 'webui', 'calendar', 'mail', 'webmail', 'files', 'webdav'):
# Check that no line *starts* with the stale name (to avoid false positives
# on SOA/NS lines that may contain the zone name as a suffix)
lines_with_stale = [
l for l in content.splitlines()
if l.startswith(stale + ' ') or l.startswith(stale + '\t')
]
self.assertEqual(
lines_with_stale, [],
f'Stale record {stale!r} should have been removed from pic.ngo zone'
)
def test_stale_set_uses_registry_not_hardcoded(self):
"""When a registry provides a custom subdomain, it is treated as stale too."""
custom_reg = MagicMock()
custom_reg.get_caddy_routes.return_value = [
{
'service_id': 'chat',
'subdomain': 'chat',
'backend': 'cell-chat:9000',
'extra_subdomains': ['im'],
'extra_backends': {},
}
]
data_dir = os.path.join(self.test_dir, 'data2')
config_dir = os.path.join(self.test_dir, 'config2')
os.makedirs(os.path.join(data_dir, 'dns'), exist_ok=True)
os.makedirs(os.path.join(config_dir, 'dns'), exist_ok=True)
nm = NetworkManager(data_dir=data_dir, config_dir=config_dir,
service_registry=custom_reg)
stale_records = [
{'name': 'pic3', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'chat', 'type': 'A', 'value': '10.0.0.1'},
{'name': 'im', 'type': 'A', 'value': '10.0.0.1'},
]
from unittest.mock import patch
with patch('subprocess.run'):
nm.update_dns_zone('pic.ngo', stale_records)
nm.update_split_horizon_zone(
'pic3.pic.ngo', '172.20.0.2', primary_domain='pic.ngo'
)
parent_zone = os.path.join(nm.dns_zones_dir, 'pic.ngo.zone')
with open(parent_zone) as f:
    content = f.read()
for stale in ('chat', 'im'):
lines_with_stale = [
l for l in content.splitlines()
if l.startswith(stale + ' ') or l.startswith(stale + '\t')
]
self.assertEqual(
lines_with_stale, [],
f'Custom registry subdomain {stale!r} should have been removed'
)
if __name__ == '__main__':
unittest.main()
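The route-building tests in this file pin down three behaviours: an extra subdomain without an entry in `extra_backends` joins the primary host matcher, an extra subdomain with its own backend gets a separate matcher and handle block, and the `@api` block always comes last with backend `cell-api:3000`. A minimal sketch that satisfies those assertions could look like the following. The names `build_service_routes` and `API_BACKEND` are hypothetical, and this is not the project's `_build_registry_service_routes`. The sketch also drops a registry-supplied `api` route outright, which is stricter than what the tests require.

```python
from typing import Dict, List

# Assumption: canonical API backend, taken from the assertions in the tests.
API_BACKEND = "cell-api:3000"


def build_service_routes(routes: List[Dict], domain: str) -> str:
    """Hypothetical stand-in that renders Caddy matcher + handle blocks."""
    blocks: List[str] = []

    def add_block(name: str, hosts: List[str], backend: str) -> None:
        # One named host matcher followed by the handle block that uses it.
        host_list = " ".join(f"{h}.{domain}" for h in hosts)
        blocks.append(
            f"@{name} host {host_list}\n"
            f"handle @{name} {{\n"
            f"    reverse_proxy {backend}\n"
            f"}}"
        )

    for route in routes:
        name = route["subdomain"]
        if name == "api":
            # Design choice of this sketch: ignore a registry 'api' entry so
            # the block appended at the end is the only one.
            continue
        extras = route.get("extra_subdomains", [])
        extra_backends = route.get("extra_backends", {})

        # Extras without their own backend share the primary matcher line.
        shared = [name] + [e for e in extras if e not in extra_backends]
        add_block(name, shared, route["backend"])

        # Extras with their own backend get a separate matcher and block.
        for extra in extras:
            if extra in extra_backends:
                add_block(extra, [extra], extra_backends[extra])

    # The infrastructure api block is always appended last.
    add_block("api", ["api"], API_BACKEND)
    return "\n".join(blocks)
```

Calling `build_service_routes(routes, 'test.cell')` with the `mail` and `files` fixtures used in the tests yields one `@mail` matcher covering both `mail.test.cell` and `webmail.test.cell`, separate `@files` and `@webdav` blocks, and a trailing `@api` block.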
+44
@@ -24,12 +24,20 @@ sys.path.insert(0, str(api_dir))
from app import app
_INSTALLED = {'id': 'calendar', 'installed': True}
class TestGetCalendarUsers(unittest.TestCase):
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
self._sr_patcher = patch('app.service_registry')
mock_sr = self._sr_patcher.start()
mock_sr.get.return_value = _INSTALLED
def tearDown(self):
self._sr_patcher.stop()
@patch('app.calendar_manager')
def test_get_users_returns_200_with_list(self, mock_cm):
@@ -63,6 +71,12 @@ class TestCreateCalendarUser(unittest.TestCase):
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
self._sr_patcher = patch('app.service_registry')
mock_sr = self._sr_patcher.start()
mock_sr.get.return_value = _INSTALLED
def tearDown(self):
self._sr_patcher.stop()
@patch('app.calendar_manager')
def test_create_user_returns_200_on_valid_body(self, mock_cm):
@@ -133,6 +147,12 @@ class TestDeleteCalendarUser(unittest.TestCase):
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
self._sr_patcher = patch('app.service_registry')
mock_sr = self._sr_patcher.start()
mock_sr.get.return_value = _INSTALLED
def tearDown(self):
self._sr_patcher.stop()
@patch('app.calendar_manager')
def test_delete_user_returns_200_on_success(self, mock_cm):
@@ -161,6 +181,12 @@ class TestCreateCalendar(unittest.TestCase):
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
self._sr_patcher = patch('app.service_registry')
mock_sr = self._sr_patcher.start()
mock_sr.get.return_value = _INSTALLED
def tearDown(self):
self._sr_patcher.stop()
@patch('app.calendar_manager')
def test_create_calendar_returns_200_on_valid_body(self, mock_cm):
@@ -228,6 +254,12 @@ class TestAddCalendarEvent(unittest.TestCase):
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
self._sr_patcher = patch('app.service_registry')
mock_sr = self._sr_patcher.start()
mock_sr.get.return_value = _INSTALLED
def tearDown(self):
self._sr_patcher.stop()
@patch('app.calendar_manager')
def test_add_event_returns_200_on_valid_body(self, mock_cm):
@@ -294,6 +326,12 @@ class TestGetCalendarEvents(unittest.TestCase):
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
self._sr_patcher = patch('app.service_registry')
mock_sr = self._sr_patcher.start()
mock_sr.get.return_value = _INSTALLED
def tearDown(self):
self._sr_patcher.stop()
@patch('app.calendar_manager')
def test_get_events_returns_200_with_events(self, mock_cm):
@@ -354,6 +392,12 @@ class TestCalendarConnectivity(unittest.TestCase):
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
self._sr_patcher = patch('app.service_registry')
mock_sr = self._sr_patcher.start()
mock_sr.get.return_value = _INSTALLED
def tearDown(self):
self._sr_patcher.stop()
@patch('app.calendar_manager')
def test_connectivity_returns_200_with_result(self, mock_cm):
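Each hunk above adds the same pair: `setUp` starts a `patch('app.service_registry')` patcher and `tearDown` stops it. An alternative worth considering is registering the stop with `addCleanup`, which runs even when `setUp` fails partway through and removes the need for a matching `tearDown`. The sketch below is self-contained and rests on assumptions: `app` is a `SimpleNamespace` standing in for the real Flask module, and `FakeRegistry` and `is_installed` are hypothetical helpers that exist only for illustration.

```python
import types
import unittest
from unittest.mock import patch


class FakeRegistry:
    """Hypothetical registry whose get() knows no services."""

    def get(self, service_id):
        return None


# Assumption: stand-in for the real `app` module and its registry attribute.
app = types.SimpleNamespace(service_registry=FakeRegistry())


def is_installed(service_id: str) -> bool:
    """Hypothetical gate, similar in spirit to an 'installed' check."""
    entry = app.service_registry.get(service_id)
    return bool(entry and entry.get('installed'))


class RegistryPatchedTestCase(unittest.TestCase):
    def setUp(self):
        patcher = patch.object(app, 'service_registry')
        mock_sr = patcher.start()
        # Registered right after start(), so the patch is undone even if a
        # later line in setUp raises.
        self.addCleanup(patcher.stop)
        mock_sr.get.return_value = {'id': 'calendar', 'installed': True}

    def test_calendar_reported_installed(self):
        self.assertTrue(is_installed('calendar'))
```

Saved as a module, this can be run with `python -m unittest <module_name>`. Once the test finishes, `app.service_registry` is the original `FakeRegistry` instance again.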
+435 -77
@@ -1,77 +1,435 @@
import sys
from pathlib import Path
# Add api directory to path
api_dir = Path(__file__).parent.parent / 'api'
sys.path.insert(0, str(api_dir))
import unittest
import tempfile
import shutil
import os
from unittest.mock import patch
from calendar_manager import CalendarManager
class TestCalendarManager(unittest.TestCase):
def setUp(self):
self.test_dir = tempfile.mkdtemp()
self.data_dir = os.path.join(self.test_dir, 'data')
self.config_dir = os.path.join(self.test_dir, 'config')
os.makedirs(self.data_dir, exist_ok=True)
os.makedirs(self.config_dir, exist_ok=True)
self.manager = CalendarManager(data_dir=self.data_dir, config_dir=self.config_dir)
def tearDown(self):
shutil.rmtree(self.test_dir)
def test_initialization(self):
self.assertTrue(os.path.exists(self.manager.calendar_dir))
self.assertTrue(os.path.exists(self.manager.radicale_dir))
def test_ensure_config_exists(self):
config_file = os.path.join(self.manager.radicale_dir, 'config')
if os.path.exists(config_file):
os.remove(config_file)
self.manager._ensure_config_exists()
self.assertTrue(os.path.exists(config_file))
def test_generate_radicale_config(self):
config_file = os.path.join(self.manager.radicale_dir, 'config')
if os.path.exists(config_file):
os.remove(config_file)
self.manager._generate_radicale_config()
self.assertTrue(os.path.exists(config_file))
with open(config_file) as f:
content = f.read()
self.assertIn('[server]', content)
self.assertIn('hosts = 0.0.0.0:5232', content)
def test_get_status(self):
status = self.manager.get_status()
self.assertIsInstance(status, dict)
self.assertIn('status', status)
@patch.object(CalendarManager, 'create_calendar', return_value=True)
@patch.object(CalendarManager, 'remove_calendar', return_value=True)
def test_create_and_remove_calendar(self, mock_remove, mock_create):
result = self.manager.create_calendar('testuser', 'testcal')
self.assertTrue(result)
result = self.manager.remove_calendar('testuser', 'testcal')
self.assertTrue(result)
@patch.object(CalendarManager, 'add_event', return_value=True)
@patch.object(CalendarManager, 'remove_event', return_value=True)
def test_add_and_remove_event(self, mock_remove, mock_add):
result = self.manager.add_event('testuser', 'testcal', {'summary': 'Test'})
self.assertTrue(result)
result = self.manager.remove_event('testuser', 'testcal', 'dummyuid')
self.assertTrue(result)
def test_error_handling(self):
# Force errors by passing invalid arguments, should return False
self.assertFalse(self.manager.create_calendar(None, None))
self.assertFalse(self.manager.add_event(None, None, None))
self.assertFalse(self.manager.remove_calendar(None, None))
self.assertFalse(self.manager.remove_event(None, None, None))
if __name__ == '__main__':
unittest.main()
import sys
from pathlib import Path
# Add api directory to path
api_dir = Path(__file__).parent.parent / 'api'
sys.path.insert(0, str(api_dir))
import unittest
import tempfile
import shutil
import os
import json
from unittest.mock import patch, MagicMock
from calendar_manager import CalendarManager
class TestCalendarManager(unittest.TestCase):
def setUp(self):
self.test_dir = tempfile.mkdtemp()
self.data_dir = os.path.join(self.test_dir, 'data')
self.config_dir = os.path.join(self.test_dir, 'config')
os.makedirs(self.data_dir, exist_ok=True)
os.makedirs(self.config_dir, exist_ok=True)
self.manager = CalendarManager(data_dir=self.data_dir, config_dir=self.config_dir)
def tearDown(self):
shutil.rmtree(self.test_dir)
def test_initialization(self):
self.assertTrue(os.path.exists(self.manager.calendar_dir))
self.assertTrue(os.path.exists(self.manager.radicale_dir))
def test_ensure_config_exists(self):
config_file = os.path.join(self.manager.radicale_dir, 'config')
if os.path.exists(config_file):
os.remove(config_file)
self.manager._ensure_config_exists()
self.assertTrue(os.path.exists(config_file))
def test_generate_radicale_config(self):
config_file = os.path.join(self.manager.radicale_dir, 'config')
if os.path.exists(config_file):
os.remove(config_file)
self.manager._generate_radicale_config()
self.assertTrue(os.path.exists(config_file))
with open(config_file) as f:
content = f.read()
self.assertIn('[server]', content)
self.assertIn('hosts = 0.0.0.0:5232', content)
def test_get_status(self):
status = self.manager.get_status()
self.assertIsInstance(status, dict)
self.assertIn('status', status)
@patch.object(CalendarManager, 'create_calendar', return_value=True)
@patch.object(CalendarManager, 'remove_calendar', return_value=True)
def test_create_and_remove_calendar(self, mock_remove, mock_create):
# Both methods are patched here, so this only checks the mocked return values.
result = self.manager.create_calendar('testuser', 'testcal')
self.assertTrue(result)
result = self.manager.remove_calendar('testuser', 'testcal')
self.assertTrue(result)
@patch.object(CalendarManager, 'add_event', return_value=True)
@patch.object(CalendarManager, 'remove_event', return_value=True)
def test_add_and_remove_event(self, mock_remove, mock_add):
# Both methods are patched here, so this only checks the mocked return values.
result = self.manager.add_event('testuser', 'testcal', {'summary': 'Test'})
self.assertTrue(result)
result = self.manager.remove_event('testuser', 'testcal', 'dummyuid')
self.assertTrue(result)
def test_error_handling(self):
# Force errors by passing invalid arguments, should return False
self.assertFalse(self.manager.create_calendar(None, None))
self.assertFalse(self.manager.add_event(None, None, None))
self.assertFalse(self.manager.remove_calendar(None, None))
self.assertFalse(self.manager.remove_event(None, None, None))
# --- New tests below ---
def test_create_calendar_user_creates_and_persists(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
result = self.manager.create_calendar_user('alice', 'password123')
self.assertTrue(result)
users = self.manager._load_users()
self.assertEqual(len(users), 1)
self.assertEqual(users[0]['username'], 'alice')
self.assertNotIn('password', users[0])
def test_create_calendar_user_duplicate_returns_false(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'password123')
result = self.manager.create_calendar_user('alice', 'other')
self.assertFalse(result)
users = self.manager._load_users()
self.assertEqual(len(users), 1)
def test_create_calendar_user_creates_user_directory(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'password123')
user_dir = os.path.join(self.manager.calendar_data_dir, 'users', 'alice')
self.assertTrue(os.path.exists(user_dir))
def test_delete_calendar_user_removes_user(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'password123')
with patch.object(self.manager, '_sync_users_to_cell_config'):
result = self.manager.delete_calendar_user('alice')
self.assertTrue(result)
users = self.manager._load_users()
self.assertEqual(len(users), 0)
def test_delete_calendar_user_nonexistent_returns_false(self):
result = self.manager.delete_calendar_user('nobody')
self.assertFalse(result)
def test_delete_calendar_user_removes_directory(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'password123')
user_dir = os.path.join(self.manager.calendar_data_dir, 'users', 'alice')
self.assertTrue(os.path.exists(user_dir))
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.delete_calendar_user('alice')
self.assertFalse(os.path.exists(user_dir))
def test_get_calendar_users_empty(self):
users = self.manager.get_calendar_users()
self.assertEqual(users, [])
def test_get_calendar_users_returns_created(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'pass')
self.manager.create_calendar_user('bob', 'pass')
users = self.manager.get_calendar_users()
self.assertEqual(len(users), 2)
usernames = [u['username'] for u in users]
self.assertIn('alice', usernames)
self.assertIn('bob', usernames)
def test_create_calendar_real_persists(self):
result = self.manager.create_calendar('alice', 'personal')
self.assertTrue(result)
calendars = self.manager._load_calendars()
self.assertEqual(len(calendars), 1)
cal = calendars[0]
self.assertEqual(cal['username'], 'alice')
self.assertEqual(cal['name'], 'personal')
def test_create_calendar_duplicate_returns_false(self):
self.manager.create_calendar('alice', 'personal')
result = self.manager.create_calendar('alice', 'personal')
self.assertFalse(result)
def test_create_calendar_with_description_and_color(self):
result = self.manager.create_calendar('alice', 'work', description='Work stuff', color='#ff0000')
self.assertTrue(result)
calendars = self.manager._load_calendars()
cal = calendars[0]
self.assertEqual(cal['description'], 'Work stuff')
self.assertEqual(cal['color'], '#ff0000')
def test_create_calendar_updates_user_count(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'pass')
self.manager.create_calendar('alice', 'personal')
users = self.manager._load_users()
alice = next(u for u in users if u['username'] == 'alice')
self.assertEqual(alice['calendars_count'], 1)
def test_remove_calendar_real_removes(self):
self.manager.create_calendar('alice', 'personal')
result = self.manager.remove_calendar('alice', 'personal')
self.assertTrue(result)
calendars = self.manager._load_calendars()
self.assertEqual(len(calendars), 0)
def test_remove_calendar_nonexistent_returns_true(self):
"""Removing a non-existent calendar is idempotent (returns True)."""
result = self.manager.remove_calendar('alice', 'nonexistent')
self.assertTrue(result)
def test_add_event_real_persists(self):
result = self.manager.add_event('alice', 'personal', {'summary': 'Meeting'})
self.assertTrue(result)
events = self.manager._load_events()
self.assertEqual(len(events), 1)
self.assertEqual(events[0]['summary'], 'Meeting')
self.assertEqual(events[0]['username'], 'alice')
self.assertEqual(events[0]['calendar'], 'personal')
def test_add_event_assigns_uid_if_missing(self):
self.manager.add_event('alice', 'personal', {'summary': 'Test'})
events = self.manager._load_events()
self.assertIn('uid', events[0])
def test_add_event_preserves_existing_uid(self):
self.manager.add_event('alice', 'personal', {'summary': 'Test', 'uid': 'my-uid-123'})
events = self.manager._load_events()
self.assertEqual(events[0]['uid'], 'my-uid-123')
def test_remove_event_real_removes_by_uid(self):
self.manager.add_event('alice', 'personal', {'summary': 'Test', 'uid': 'uid-1'})
result = self.manager.remove_event('alice', 'personal', 'uid-1')
self.assertTrue(result)
events = self.manager._load_events()
self.assertEqual(len(events), 0)
def test_remove_event_does_not_remove_wrong_uid(self):
self.manager.add_event('alice', 'personal', {'summary': 'Test', 'uid': 'uid-1'})
self.manager.add_event('alice', 'personal', {'summary': 'Other', 'uid': 'uid-2'})
self.manager.remove_event('alice', 'personal', 'uid-1')
events = self.manager._load_events()
self.assertEqual(len(events), 1)
self.assertEqual(events[0]['uid'], 'uid-2')
def test_create_calendar_event_persists(self):
result = self.manager.create_calendar_event(
'alice', 'personal', 'Team meeting',
'2026-01-01T09:00:00', '2026-01-01T10:00:00',
description='Weekly sync', location='Office')
self.assertTrue(result)
events = self.manager._load_events()
self.assertEqual(len(events), 1)
ev = events[0]
self.assertEqual(ev['title'], 'Team meeting')
self.assertEqual(ev['username'], 'alice')
def test_create_calendar_event_updates_calendar_count(self):
self.manager.create_calendar('alice', 'personal')
self.manager.create_calendar_event(
'alice', 'personal', 'Sync',
'2026-01-01T09:00:00', '2026-01-01T10:00:00')
calendars = self.manager._load_calendars()
self.assertEqual(calendars[0]['events_count'], 1)
def test_get_calendar_events_filters_by_user_and_calendar(self):
self.manager.create_calendar_event(
'alice', 'personal', 'Alice event', '2026-01-01T09:00', '2026-01-01T10:00')
self.manager.create_calendar_event(
'bob', 'personal', 'Bob event', '2026-01-01T09:00', '2026-01-01T10:00')
alice_events = self.manager.get_calendar_events('alice', 'personal')
self.assertEqual(len(alice_events), 1)
self.assertEqual(alice_events[0]['title'], 'Alice event')
def test_get_calendar_events_date_filter(self):
self.manager.create_calendar_event(
'alice', 'personal', 'Jan event', '2026-01-15T09:00', '2026-01-15T10:00')
self.manager.create_calendar_event(
'alice', 'personal', 'Feb event', '2026-02-15T09:00', '2026-02-15T10:00')
filtered = self.manager.get_calendar_events(
'alice', 'personal', start_date='2026-01-01', end_date='2026-01-31')
self.assertEqual(len(filtered), 1)
self.assertEqual(filtered[0]['title'], 'Jan event')
def test_get_calendar_status_returns_users(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'pass')
status = self.manager.get_calendar_status()
self.assertIn('users', status)
self.assertEqual(len(status['users']), 1)
self.assertEqual(status['users'][0]['username'], 'alice')
def test_get_metrics_empty(self):
with patch.object(self.manager, '_check_calendar_status', return_value=False):
metrics = self.manager.get_metrics()
self.assertIn('users_count', metrics)
self.assertIn('calendars_count', metrics)
self.assertIn('events_count', metrics)
self.assertEqual(metrics['users_count'], 0)
def test_get_metrics_with_data(self):
with patch.object(self.manager, '_sync_users_to_cell_config'):
self.manager.create_calendar_user('alice', 'pass')
self.manager.create_calendar('alice', 'personal')
self.manager.add_event('alice', 'personal', {'summary': 'Evt'})
with patch.object(self.manager, '_check_calendar_status', return_value=True):
metrics = self.manager.get_metrics()
self.assertEqual(metrics['users_count'], 1)
self.assertEqual(metrics['calendars_count'], 1)
self.assertEqual(metrics['events_count'], 1)
def test_apply_config_no_port_key(self):
result = self.manager.apply_config({})
self.assertEqual(result['restarted'], [])
def test_apply_config_updates_radicale_hosts(self):
# Generate config first
self.manager._generate_radicale_config()
result = self.manager.apply_config({'port': 5233})
self.assertEqual(result['restarted'], [])
config_file = os.path.join(self.manager.radicale_dir, 'config')
with open(config_file) as f:
content = f.read()
self.assertIn('hosts = 0.0.0.0:5233', content)
def test_apply_config_no_radicale_file_is_safe(self):
"""apply_config doesn't crash if radicale config file is missing."""
config_file = os.path.join(self.manager.radicale_dir, 'config')
if os.path.exists(config_file):
os.remove(config_file)
result = self.manager.apply_config({'port': 5234})
# Should not raise; warnings list may or may not be empty
self.assertIn('warnings', result)
def test_write_radicale_htpasswd_creates_entry(self):
"""_write_radicale_htpasswd writes a bcrypt entry for the user."""
htpasswd = self.manager._radicale_htpasswd_path()
os.makedirs(os.path.dirname(htpasswd), exist_ok=True)
self.manager._write_radicale_htpasswd('alice', 'mypassword')
self.assertTrue(os.path.exists(htpasswd))
with open(htpasswd) as f:
content = f.read()
self.assertIn('alice:', content)
def test_write_radicale_htpasswd_updates_existing_entry(self):
"""_write_radicale_htpasswd replaces a user's old entry."""
htpasswd = self.manager._radicale_htpasswd_path()
os.makedirs(os.path.dirname(htpasswd), exist_ok=True)
self.manager._write_radicale_htpasswd('alice', 'pass1')
self.manager._write_radicale_htpasswd('alice', 'pass2')
with open(htpasswd) as f:
lines = f.readlines()
alice_lines = [l for l in lines if l.startswith('alice:')]
self.assertEqual(len(alice_lines), 1)
def test_remove_radicale_htpasswd_removes_entry(self):
htpasswd = self.manager._radicale_htpasswd_path()
os.makedirs(os.path.dirname(htpasswd), exist_ok=True)
self.manager._write_radicale_htpasswd('alice', 'pass')
self.manager._write_radicale_htpasswd('bob', 'pass')
self.manager._remove_radicale_htpasswd('alice')
with open(htpasswd) as f:
content = f.read()
self.assertNotIn('alice:', content)
self.assertIn('bob:', content)
def test_remove_radicale_htpasswd_no_file_is_safe(self):
"""_remove_radicale_htpasswd doesn't raise when the file doesn't exist."""
htpasswd = self.manager._radicale_htpasswd_path()
if os.path.exists(htpasswd):
os.remove(htpasswd)
self.manager._remove_radicale_htpasswd('alice') # should not raise
def test_write_radicale_htpasswd_no_config_dir_is_safe(self):
"""_write_radicale_htpasswd is a no-op when the config dir doesn't exist."""
# Don't create the config dir
self.manager._write_radicale_htpasswd('alice', 'pass')
htpasswd = self.manager._radicale_htpasswd_path()
self.assertFalse(os.path.exists(htpasswd))
def test_test_database_connectivity_with_accessible_dir(self):
result = self.manager._test_database_connectivity()
self.assertIn('success', result)
self.assertTrue(result['success'])
def test_test_service_connectivity_unreachable(self):
"""_test_service_connectivity returns failure when cell-radicale isn't reachable."""
result = self.manager._test_service_connectivity()
self.assertIn('success', result)
# In the test environment Radicale is not running, so this should be False
self.assertFalse(result['success'])
def test_test_web_interface_unreachable(self):
result = self.manager._test_web_interface()
self.assertIn('success', result)
self.assertFalse(result['success'])
def test_restart_service_calls_container(self):
with patch.object(self.manager, '_restart_container', return_value=True) as mock_restart:
result = self.manager.restart_service()
self.assertTrue(result)
mock_restart.assert_called_once_with('cell-radicale')
def test_restart_service_failure_returns_false(self):
with patch.object(self.manager, '_restart_container', return_value=False):
result = self.manager.restart_service()
self.assertFalse(result)
def test_sync_users_to_cell_config_best_effort(self):
"""_sync_users_to_cell_config failure is non-fatal."""
with patch('config_manager.ConfigManager', side_effect=Exception('no config')):
# Should not raise
self.manager._sync_users_to_cell_config()
def test_check_calendar_status_returns_bool(self):
with patch('subprocess.run') as mock_sub:
mock_sub.return_value = MagicMock(returncode=0, stdout=':5232 LISTEN')
result = self.manager._check_calendar_status()
self.assertIsInstance(result, bool)
def test_check_calendar_status_false_when_no_port(self):
with patch('subprocess.run') as mock_sub:
mock_sub.return_value = MagicMock(returncode=0, stdout='no matching port')
result = self.manager._check_calendar_status()
self.assertFalse(result)
def test_load_users_returns_empty_on_missing_file(self):
users = self.manager._load_users()
self.assertEqual(users, [])
def test_load_calendars_returns_empty_on_missing_file(self):
calendars = self.manager._load_calendars()
self.assertEqual(calendars, [])
def test_load_events_returns_empty_on_missing_file(self):
events = self.manager._load_events()
self.assertEqual(events, [])
def test_load_users_handles_corrupt_file(self):
with open(self.manager.users_file, 'w') as f:
f.write('{corrupt')
users = self.manager._load_users()
self.assertEqual(users, [])
def test_get_configured_port_default(self):
port = self.manager._get_configured_port()
self.assertEqual(port, 5232)
def test_get_configured_port_from_config(self):
with patch.object(self.manager, 'get_config', return_value={'port': 5555}):
port = self.manager._get_configured_port()
self.assertEqual(port, 5555)
def test_test_connectivity_returns_dict(self):
with patch.object(self.manager, '_test_service_connectivity', return_value={'success': False, 'message': ''}):
with patch.object(self.manager, '_test_database_connectivity', return_value={'success': True, 'message': ''}):
with patch.object(self.manager, '_test_web_interface', return_value={'success': False, 'message': ''}):
result = self.manager.test_connectivity()
self.assertIn('service_connectivity', result)
self.assertIn('database_connectivity', result)
self.assertIn('web_interface', result)
self.assertIn('success', result)
self.assertFalse(result['success'])
if __name__ == '__main__':
unittest.main()
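The htpasswd tests above pin down a small contract: writing replaces a user's existing entry, removal leaves other users intact, and both are safe when the file or its directory is missing. A minimal sketch of that contract follows. `upsert_htpasswd` and `remove_htpasswd` are hypothetical names, the hash is passed in precomputed, and this is not the manager's actual implementation.

```python
import os


def upsert_htpasswd(path, user, hashed):
    """Write `user:hashed`, replacing any existing entry for `user`.

    Hypothetical helper: returns False (and writes nothing) when the
    parent directory does not exist.
    """
    if not os.path.isdir(os.path.dirname(path)):
        return False
    kept = []
    if os.path.exists(path):
        with open(path) as f:
            # Keep every non-empty line that belongs to a different user.
            kept = [line for line in f.read().splitlines()
                    if line and not line.startswith(user + ':')]
    kept.append(f'{user}:{hashed}')
    with open(path, 'w') as f:
        f.write('\n'.join(kept) + '\n')
    return True


def remove_htpasswd(path, user):
    """Drop the entry for `user`; a missing file is not an error."""
    if not os.path.exists(path):
        return False
    with open(path) as f:
        kept = [line for line in f.read().splitlines()
                if line and not line.startswith(user + ':')]
    with open(path, 'w') as f:
        f.write(''.join(line + '\n' for line in kept))
    return True
```

Matching on the `user + ':'` prefix rather than the bare name keeps `alice` from clobbering an entry for `alice2`.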
+390
@@ -0,0 +1,390 @@
#!/usr/bin/env python3
"""
Additional tests for cell_cli.py covering the functions NOT in test_cli_tool.py:
- list_peers (error path)
- list_nat_rules / add_nat_rule / delete_nat_rule
- list_peer_routes / add_peer_route / delete_peer_route
- list_firewall_rules / add_firewall_rule / delete_firewall_rule
- show_services_status
- list_wireguard_peers
- show_network_info / show_dns_status / show_ntp_status
- main() command routing
"""
import sys
import unittest
from pathlib import Path
from unittest.mock import patch, MagicMock
api_dir = Path(__file__).parent.parent / 'api'
sys.path.insert(0, str(api_dir))
from cell_cli import (
list_peers, add_peer, remove_peer, show_config, update_config,
list_nat_rules, add_nat_rule, delete_nat_rule,
list_peer_routes, add_peer_route, delete_peer_route,
list_firewall_rules, add_firewall_rule, delete_firewall_rule,
show_services_status, list_wireguard_peers,
show_network_info, show_dns_status, show_ntp_status,
)
class TestListPeersErrorPath(unittest.TestCase):
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_list_peers_failure_prints_error(self, mock_req, mock_print):
list_peers()
mock_print.assert_any_call('Failed to fetch peers.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=[])
def test_list_peers_empty_list(self, mock_req, mock_print):
list_peers()
mock_print.assert_any_call('No peers configured.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=[
{'name': 'alice', 'ip': '10.0.0.2',
'public_key': 'abcdefghijklmnopqrstuvwxyz', 'added_at': '2026-01-01'}
])
def test_list_peers_shows_peer_info(self, mock_req, mock_print):
list_peers()
self.assertTrue(any('alice' in str(c) for c in mock_print.call_args_list))
class TestNatRules(unittest.TestCase):
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'nat_rules': []})
def test_list_nat_rules_empty(self, mock_req, mock_print):
list_nat_rules()
mock_print.assert_any_call('No NAT rules configured.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'nat_rules': [
{'id': 1, 'source_network': '10.0.0.0/24', 'target_interface': 'eth0',
'masquerade': True, 'nat_type': 'MASQUERADE', 'protocol': 'ALL',
'external_port': '', 'internal_ip': '', 'internal_port': ''}
]})
def test_list_nat_rules_shows_rules(self, mock_req, mock_print):
list_nat_rules()
self.assertTrue(any('eth0' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_list_nat_rules_failure(self, mock_req, mock_print):
list_nat_rules()
mock_print.assert_any_call('Failed to fetch NAT rules.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'id': 1})
def test_add_nat_rule_success(self, mock_req, mock_print):
add_nat_rule('10.0.0.0/24', 'eth0', True, 'MASQUERADE', 'ALL', '', '', '')
mock_print.assert_any_call('✅ NAT rule added.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_add_nat_rule_failure(self, mock_req, mock_print):
add_nat_rule('10.0.0.0/24', 'eth0', False, 'DNAT', 'TCP', '80', '10.0.0.5', '8080')
mock_print.assert_any_call('❌ Failed to add NAT rule.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'ok': True})
def test_delete_nat_rule_success(self, mock_req, mock_print):
delete_nat_rule(1)
mock_print.assert_any_call('✅ NAT rule deleted.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_delete_nat_rule_failure(self, mock_req, mock_print):
delete_nat_rule(99)
mock_print.assert_any_call('❌ Failed to delete NAT rule.')
class TestPeerRoutes(unittest.TestCase):
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'peer_routes': []})
def test_list_peer_routes_empty(self, mock_req, mock_print):
list_peer_routes()
mock_print.assert_any_call('No peer routes configured.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'peer_routes': [
{'peer_name': 'alice', 'peer_ip': '10.0.0.2',
'allowed_networks': ['192.168.1.0/24'], 'route_type': 'split'}
]})
def test_list_peer_routes_shows_routes(self, mock_req, mock_print):
list_peer_routes()
self.assertTrue(any('alice' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_list_peer_routes_failure(self, mock_req, mock_print):
list_peer_routes()
mock_print.assert_any_call('Failed to fetch peer routes.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'ok': True})
def test_add_peer_route_success(self, mock_req, mock_print):
add_peer_route('alice', '10.0.0.2', '192.168.1.0/24', 'split')
mock_print.assert_any_call('✅ Peer route added.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_add_peer_route_failure(self, mock_req, mock_print):
add_peer_route('alice', '10.0.0.2', '192.168.1.0/24', 'split')
mock_print.assert_any_call('❌ Failed to add peer route.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'ok': True})
def test_delete_peer_route_success(self, mock_req, mock_print):
delete_peer_route('alice')
mock_print.assert_any_call('✅ Peer route deleted.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_delete_peer_route_failure(self, mock_req, mock_print):
delete_peer_route('alice')
mock_print.assert_any_call('❌ Failed to delete peer route.')
class TestFirewallRules(unittest.TestCase):
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'firewall_rules': []})
def test_list_firewall_rules_empty(self, mock_req, mock_print):
list_firewall_rules()
mock_print.assert_any_call('No firewall rules configured.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'firewall_rules': [
{'id': 1, 'rule_type': 'ACCEPT', 'source': '10.0.0.0/24',
'destination': 'any', 'protocol': 'TCP', 'port_range': '80', 'action': 'ACCEPT'}
]})
def test_list_firewall_rules_shows_rules(self, mock_req, mock_print):
list_firewall_rules()
self.assertTrue(any('ACCEPT' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_list_firewall_rules_failure(self, mock_req, mock_print):
list_firewall_rules()
mock_print.assert_any_call('Failed to fetch firewall rules.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'id': 1})
def test_add_firewall_rule_success(self, mock_req, mock_print):
add_firewall_rule('ACCEPT', '10.0.0.0/24', 'any', 'ACCEPT', 'TCP', '80')
mock_print.assert_any_call('✅ Firewall rule added.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_add_firewall_rule_failure(self, mock_req, mock_print):
add_firewall_rule('DROP', 'any', 'any', 'DROP', 'ALL', '')
mock_print.assert_any_call('❌ Failed to add firewall rule.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'ok': True})
def test_delete_firewall_rule_success(self, mock_req, mock_print):
delete_firewall_rule(1)
mock_print.assert_any_call('✅ Firewall rule deleted.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_delete_firewall_rule_failure(self, mock_req, mock_print):
delete_firewall_rule(99)
mock_print.assert_any_call('❌ Failed to delete firewall rule.')
class TestShowServicesStatus(unittest.TestCase):
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={
'email': {'status': 'online', 'running': True},
'dns': True
})
def test_show_services_status_with_dict_and_bool(self, mock_req, mock_print):
show_services_status()
self.assertTrue(any('email' in str(c) for c in mock_print.call_args_list))
self.assertTrue(any('dns' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_show_services_status_failure(self, mock_req, mock_print):
show_services_status()
mock_print.assert_any_call('Failed to fetch service status.')
class TestListWireguardPeers(unittest.TestCase):
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=[
{'name': 'alice', 'public_key': 'pk1', 'ip': '10.0.0.2', 'status': 'active'}
])
def test_list_wireguard_peers_shows_peers(self, mock_req, mock_print):
list_wireguard_peers()
self.assertTrue(any('alice' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_list_wireguard_peers_failure(self, mock_req, mock_print):
list_wireguard_peers()
mock_print.assert_any_call('Failed to fetch WireGuard peers.')
class TestNetworkDnsNtpStatus(unittest.TestCase):
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'gateway': '192.168.1.1', 'subnet': '10.0.0.0/24'})
def test_show_network_info_success(self, mock_req, mock_print):
show_network_info()
self.assertTrue(any('gateway' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_show_network_info_failure(self, mock_req, mock_print):
show_network_info()
mock_print.assert_any_call('Failed to fetch network info.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'running': True, 'port': 53})
def test_show_dns_status_success(self, mock_req, mock_print):
show_dns_status()
self.assertTrue(any('running' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_show_dns_status_failure(self, mock_req, mock_print):
show_dns_status()
mock_print.assert_any_call('Failed to fetch DNS status.')
@patch('builtins.print')
@patch('cell_cli.api_request', return_value={'synced': True, 'server': 'pool.ntp.org'})
def test_show_ntp_status_success(self, mock_req, mock_print):
show_ntp_status()
self.assertTrue(any('synced' in str(c) for c in mock_print.call_args_list))
@patch('builtins.print')
@patch('cell_cli.api_request', return_value=None)
def test_show_ntp_status_failure(self, mock_req, mock_print):
show_ntp_status()
mock_print.assert_any_call('Failed to fetch NTP status.')
class TestMainFunction(unittest.TestCase):
"""Cover main() by patching individual functions and simulating command dispatch."""
def _run_main(self, args):
import sys as _sys
from cell_cli import main
old_argv = _sys.argv
_sys.argv = ['cell_cli'] + args
try:
with patch('builtins.print'):
try:
main()
except SystemExit:
pass
finally:
_sys.argv = old_argv
def test_main_status_command(self):
with patch('cell_cli.show_status') as mock_fn:
self._run_main(['status'])
mock_fn.assert_called_once()
def test_main_peers_list_command(self):
with patch('cell_cli.list_peers') as mock_fn:
self._run_main(['peers', 'list'])
mock_fn.assert_called_once()
def test_main_peers_add_command(self):
with patch('cell_cli.add_peer') as mock_fn:
self._run_main(['peers', 'add', 'alice', '10.0.0.2', 'pubkey'])
mock_fn.assert_called_once_with('alice', '10.0.0.2', 'pubkey')
def test_main_peers_remove_command(self):
with patch('cell_cli.remove_peer') as mock_fn:
self._run_main(['peers', 'remove', 'alice'])
mock_fn.assert_called_once_with('alice')
def test_main_config_show_command(self):
with patch('cell_cli.show_config') as mock_fn:
self._run_main(['config', 'show'])
mock_fn.assert_called_once()
def test_main_config_update_command(self):
with patch('cell_cli.update_config') as mock_fn:
self._run_main(['config', 'update', 'cell_name', 'mycell'])
mock_fn.assert_called_once_with('cell_name', 'mycell')
def test_main_routing_nat_list(self):
with patch('cell_cli.list_nat_rules') as mock_fn:
self._run_main(['routing', 'nat', 'list'])
mock_fn.assert_called_once()
def test_main_routing_nat_add(self):
with patch('cell_cli.add_nat_rule') as mock_fn:
self._run_main(['routing', 'nat', 'add', '10.0.0.0/24', 'eth0'])
mock_fn.assert_called_once()
def test_main_routing_nat_delete(self):
with patch('cell_cli.delete_nat_rule') as mock_fn:
self._run_main(['routing', 'nat', 'delete', '1'])
mock_fn.assert_called_once_with('1') # argparse passes as string
def test_main_routing_peers_list(self):
with patch('cell_cli.list_peer_routes') as mock_fn:
self._run_main(['routing', 'peers', 'list'])
mock_fn.assert_called_once()
def test_main_routing_peers_add(self):
with patch('cell_cli.add_peer_route') as mock_fn:
self._run_main(['routing', 'peers', 'add', 'alice', '10.0.0.2',
'192.168.1.0/24'])
mock_fn.assert_called_once()
def test_main_routing_peers_delete(self):
with patch('cell_cli.delete_peer_route') as mock_fn:
self._run_main(['routing', 'peers', 'delete', 'alice'])
mock_fn.assert_called_once_with('alice')
def test_main_routing_firewall_list(self):
with patch('cell_cli.list_firewall_rules') as mock_fn:
self._run_main(['routing', 'firewall', 'list'])
mock_fn.assert_called_once()
def test_main_routing_firewall_add(self):
with patch('cell_cli.add_firewall_rule') as mock_fn:
self._run_main(['routing', 'firewall', 'add',
'ACCEPT', '10.0.0.0/24', 'any', 'ACCEPT'])
mock_fn.assert_called_once()
def test_main_routing_firewall_delete(self):
with patch('cell_cli.delete_firewall_rule') as mock_fn:
self._run_main(['routing', 'firewall', 'delete', '1'])
mock_fn.assert_called_once_with('1')
def test_main_services_status_command(self):
with patch('cell_cli.show_services_status') as mock_fn:
self._run_main(['services-status'])
mock_fn.assert_called_once()
def test_main_wireguard_list_command(self):
with patch('cell_cli.list_wireguard_peers') as mock_fn:
self._run_main(['wireguard-peers'])
mock_fn.assert_called_once()
def test_main_network_info_command(self):
with patch('cell_cli.show_network_info') as mock_fn:
self._run_main(['network-info'])
mock_fn.assert_called_once()
def test_main_dns_status_command(self):
with patch('cell_cli.show_dns_status') as mock_fn:
self._run_main(['dns-status'])
mock_fn.assert_called_once()
def test_main_ntp_status_command(self):
with patch('cell_cli.show_ntp_status') as mock_fn:
self._run_main(['ntp-status'])
mock_fn.assert_called_once()
if __name__ == '__main__':
unittest.main()
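Every test in this file stacks `@patch('builtins.print')` above `@patch('cell_cli.api_request', ...)` and then takes `(self, mock_req, mock_print)`. The argument order follows from how stacked `patch` decorators are applied: the decorator closest to the function supplies the first mock. The sketch below shows this in isolation; `cli` and `list_peers` are simplified stand-ins introduced here, not the real `cell_cli` code.

```python
import types
from unittest.mock import patch

# Hypothetical stand-in for the cell_cli module, so the sketch runs on its own.
cli = types.SimpleNamespace(api_request=lambda path: {'peers': []})


def list_peers():
    """Simplified stand-in for a CLI command that prints its result."""
    peers = cli.api_request('/api/peers')
    if peers is None:
        print('Failed to fetch peers.')
        return
    print(f'{len(peers)} peer(s)')


@patch('builtins.print')                              # outermost: last argument
@patch.object(cli, 'api_request', return_value=None)  # innermost: first argument
def check_failure_path(mock_req, mock_print):
    list_peers()
    mock_req.assert_called_once_with('/api/peers')
    mock_print.assert_any_call('Failed to fetch peers.')
    return mock_req.call_count
```

Swapping the two parameter names would still run, but the assertions would then be made against the wrong mock, which is an easy mistake to miss in review.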
+12 -3
@@ -61,8 +61,17 @@ class TestGenerateCorefileOneLink(unittest.TestCase):
self.assertIn('cache', content[idx_primary:])
def test_log_directive_present_in_forwarding_block(self):
# At default INFO the forwarding block carries the `errors` directive;
# at DEBUG it carries the verbose `log` plugin.
cell_links = [{'domain': 'remote.cell', 'dns_ip': '10.5.0.1'}]
- firewall_manager.generate_corefile([], self.path, cell_links=cell_links)
+ firewall_manager.generate_corefile([], self.path, cell_links=cell_links,
+ coredns_level='INFO')
content = self._read()
idx_primary = content.index('remote.cell {')
self.assertIn('errors', content[idx_primary:])
firewall_manager.generate_corefile([], self.path, cell_links=cell_links,
coredns_level='DEBUG')
content = self._read()
idx_primary = content.index('remote.cell {')
self.assertIn('log', content[idx_primary:])
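The test above expects the per-link forwarding block to carry `errors` at INFO and the verbose `log` plugin at DEBUG. A rough sketch of how such a block could be rendered is shown here. `forwarding_block` and the exact block layout are assumptions for illustration; only the `forward`, `cache`, `errors` and `log` plugin names come from CoreDNS itself, and this is not `firewall_manager.generate_corefile`.

```python
def forwarding_block(domain, dns_ip, coredns_level='INFO'):
    """Render one per-link server block (hypothetical layout)."""
    # DEBUG enables CoreDNS's verbose `log` plugin; any other level keeps `errors`.
    directive = 'log' if str(coredns_level).upper() == 'DEBUG' else 'errors'
    return (
        f'{domain} {{\n'
        f'    forward . {dns_ip}\n'
        f'    cache\n'
        f'    {directive}\n'
        f'}}\n'
    )
```

Treating every non-DEBUG value as `errors` means an unset level still produces a valid block.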
@@ -144,7 +153,7 @@ class TestApplyAllDnsRulesPassesCellLinks(unittest.TestCase):
cell_links=cell_links,
)
mock_gen.assert_called_once_with(
- [], '/tmp/fake_Corefile', 'cell', cell_links
+ [], '/tmp/fake_Corefile', 'cell', cell_links, None
)
def test_cell_links_none_forwarded_as_none(self):
@@ -156,7 +165,7 @@ class TestApplyAllDnsRulesPassesCellLinks(unittest.TestCase):
domain='cell',
cell_links=None,
)
- mock_gen.assert_called_once_with([], '/tmp/fake_Corefile', 'cell', None)
+ mock_gen.assert_called_once_with([], '/tmp/fake_Corefile', 'cell', None, None)
def test_reload_called_on_success(self):
with patch.object(firewall_manager, 'generate_corefile', return_value=True), \
+154
@@ -179,6 +179,45 @@ class TestConfigApplyRoute(unittest.TestCase):
self.assertIn('-d', cmd)
self.assertIn('dns', cmd)
# ── Race-condition fix: needs_restart cleared synchronously ────────────
# For non-'*' container restarts the background thread takes ~300 ms.
# The frontend polls /api/config/pending every 5 s; if needs_restart is
# still True when that poll fires, the banner re-appears after Apply.
# Fix: set needs_restart=False and applying=True before spawning the thread.
@patch('threading.Thread')
@patch('docker.from_env')
def test_specific_containers_clears_needs_restart_synchronously(
self, mock_docker, mock_thread):
"""needs_restart must be False as soon as apply returns, not only after the thread finishes."""
mock_docker.side_effect = Exception('no docker in test')
mock_thread.return_value = MagicMock() # thread is mocked — never runs
_set_pending_restart(['cell_name changed to pic2'], ['dns'])
self.client.post('/api/config/apply')
pending = config_manager.configs.get('_pending_restart', {})
self.assertFalse(pending.get('needs_restart', True),
'needs_restart must be False immediately after apply for non-* restarts')
self.assertTrue(pending.get('applying', False),
'applying must be True while the background thread runs')
@patch('threading.Thread')
@patch('docker.from_env')
def test_wildcard_containers_sets_applying_but_not_clears_needs_restart(
self, mock_docker, mock_thread):
"""For '*' restarts the helper container clears the flag; API must not."""
mock_docker.side_effect = Exception('no docker in test')
mock_thread.return_value = MagicMock()
_set_pending_restart(['ip_range changed'], ['*'])
self.client.post('/api/config/apply')
pending = config_manager.configs.get('_pending_restart', {})
# Wildcard restart: API sets applying=True but leaves needs_restart=True
# so the helper container can clear it on success.
self.assertTrue(pending.get('applying', False))
# ── Exception in route body returns 500 ───────────────────────────────
@patch('app.config_manager')
@@ -190,5 +229,120 @@ class TestConfigApplyRoute(unittest.TestCase):
self.assertIn('error', json.loads(r.data))
class TestDdnsConfigUpdatesFiresIdentityChanged(unittest.TestCase):
"""PUT /api/ddns must publish IDENTITY_CHANGED so CaddyManager regenerates."""
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
def _put_ddns(self, payload=None):
if payload is None:
payload = {'domain_mode': 'pic_ngo', 'cell_name': 'test', 'domain': 'pic_ngo'}
return self.client.put(
'/api/ddns',
data=json.dumps(payload),
content_type='application/json',
)
@patch('app.service_bus')
@patch('app.config_manager')
def test_fires_identity_changed_on_success(self, mock_cm, mock_bus):
mock_cm.configs = {
'_identity': {
'cell_name': 'test',
'domain': 'pic_ngo',
'domain_name': '',
'domain_mode': 'pic_ngo',
}
}
mock_cm.set_identity_field = MagicMock()
mock_cm.get_effective_domain = MagicMock(return_value='test.pic.ngo')
mock_cm.validate_ddns_config = MagicMock(return_value=None)
r = self._put_ddns()
self.assertIn(r.status_code, (200, 204))
self.assertTrue(mock_bus.publish_event.called,
'Expected service_bus.publish_event to be called')
args = mock_bus.publish_event.call_args
# first positional arg should be an EventType with value IDENTITY_CHANGED
event_arg = args[0][0]
self.assertEqual(str(event_arg).upper().replace('.', '_'),
'EVENTTYPE_IDENTITY_CHANGED')
@patch('app.service_bus')
@patch('app.config_manager')
def test_identity_changed_payload_contains_domain_fields(self, mock_cm, mock_bus):
mock_cm.configs = {
'_identity': {
'cell_name': 'mycell',
'domain': 'pic_ngo',
'domain_name': '',
'domain_mode': 'pic_ngo',
}
}
mock_cm.set_identity_field = MagicMock()
mock_cm.get_effective_domain = MagicMock(return_value='mycell.pic.ngo')
mock_cm.validate_ddns_config = MagicMock(return_value=None)
self._put_ddns({'domain_mode': 'pic_ngo', 'cell_name': 'mycell', 'domain': 'pic_ngo'})
if mock_bus.publish_event.called:
kwargs = mock_bus.publish_event.call_args[1] if mock_bus.publish_event.call_args[1] else {}
pos_args = mock_bus.publish_event.call_args[0]
# payload is 3rd positional arg
if len(pos_args) >= 3:
payload = pos_args[2]
self.assertIn('cell_name', payload)
self.assertIn('effective_domain', payload)
class TestCaddyCertStatusRoute(unittest.TestCase):
"""GET /api/caddy/cert-status delegates to CaddyManager and handles errors."""
def setUp(self):
app.config['TESTING'] = True
self.client = app.test_client()
def test_returns_cert_status_200(self):
expected = {
'status': 'valid',
'expiry': '2026-12-01T00:00:00+00:00',
'days_remaining': 179,
}
mock_caddy = MagicMock()
mock_caddy.get_cert_status_fresh.return_value = expected
with patch('app.caddy_manager', mock_caddy):
r = self.client.get('/api/caddy/cert-status')
self.assertEqual(r.status_code, 200)
data = json.loads(r.data)
self.assertEqual(data['status'], 'valid')
self.assertEqual(data['days_remaining'], 179)
def test_returns_500_on_exception(self):
mock_caddy = MagicMock()
mock_caddy.get_cert_status_fresh.side_effect = RuntimeError('ssl timeout')
with patch('app.caddy_manager', mock_caddy):
r = self.client.get('/api/caddy/cert-status')
self.assertEqual(r.status_code, 500)
data = json.loads(r.data)
self.assertIn('error', data)
def test_calls_get_cert_status_fresh_with_max_age(self):
mock_caddy = MagicMock()
mock_caddy.get_cert_status_fresh.return_value = {'status': 'internal'}
with patch('app.caddy_manager', mock_caddy):
self.client.get('/api/caddy/cert-status')
mock_caddy.get_cert_status_fresh.assert_called_once()
call_kwargs = mock_caddy.get_cert_status_fresh.call_args
# max_age_seconds should be passed (positional or keyword)
all_args = list(call_kwargs[0]) + list(call_kwargs[1].values())
self.assertTrue(
any(isinstance(a, int) and a > 0 for a in all_args),
'Expected a positive max_age_seconds argument',
)
if __name__ == '__main__':
unittest.main()
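The race-condition comments in `TestConfigApplyRoute` describe the fix as updating the pending-restart flags before the background thread is spawned. A minimal sketch of that ordering follows. `apply_pending`, the `containers` key and the `restart` callable are hypothetical names chosen for this example and are not taken from the route's real code.

```python
import threading


def apply_pending(pending, restart):
    """Start the restart in the background; update flags before returning.

    `pending` mirrors the `_pending_restart` dict; the 'containers' key and
    the `restart` callable are assumptions made for this sketch.
    """
    containers = pending.get('containers', [])
    pending['applying'] = True
    if '*' not in containers:
        # Clear synchronously so a poll arriving mid-restart sees False.
        pending['needs_restart'] = False
    # For '*' the flag stays set; the helper container clears it on success.
    worker = threading.Thread(target=restart, args=(containers,), daemon=True)
    worker.start()
    return worker
```

Because the flags change before `worker.start()`, a status poll cannot observe the stale `needs_restart=True` state however long the restart itself takes.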
+233
@@ -0,0 +1,233 @@
#!/usr/bin/env python3
"""Backup/restore overhaul tests for ConfigManager.
Covers the P0 data-loss fix:
- critical secrets/keys are INCLUDED in a backup
- trash (logs, nested backups, *.tmp, .test_admin_pass) is EXCLUDED
- optional passphrase encryption (encrypted archive named .tar.gz.age, plaintext 0600)
- restore ordering (vault/fernet restored first) + reapply step invoked
- round-trip: backup -> restore with passphrase recovers files
Docker/subprocess and the live managers used by the reapply step are mocked.
"""
import os
import sys
import json
import stat
import shutil
import tarfile
import tempfile
import unittest
from pathlib import Path
from unittest.mock import patch, MagicMock
api_dir = Path(__file__).parent.parent / 'api'
sys.path.insert(0, str(api_dir))
from config_manager import ConfigManager
import backup_crypto
def _write(p: Path, content: str = 'x'):
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(content)
class _BackupBase(unittest.TestCase):
def setUp(self):
self.tmp = tempfile.mkdtemp()
self.config_file = os.path.join(self.tmp, 'config', 'cell_config.json')
self.data_dir = Path(self.tmp) / 'data'
os.makedirs(os.path.dirname(self.config_file), exist_ok=True)
os.makedirs(self.data_dir, exist_ok=True)
self.cm = ConfigManager(self.config_file, str(self.data_dir))
self.cm.configs['_identity'] = {'cell_name': 'mycell', 'domain': 'cell'}
self.cm._save_all_configs()
self._seed_data()
def tearDown(self):
shutil.rmtree(self.tmp, ignore_errors=True)
def _seed_data(self):
d = self.data_dir
# Critical paths
_write(d / 'api' / 'auth_users.json', '{"admin": 1}')
_write(d / 'api' / '.flask_secret_key', 'secret')
_write(d / 'api' / 'peers.json', '{"peer1": "key"}')
_write(d / 'api' / 'peer_service_credentials.json', '{}')
_write(d / 'api' / 'cell_links.json', '{"link": 1}')
_write(d / 'api' / 'ddns_token', 'tok123')
_write(d / 'api' / 'audit' / 'audit.log', '{"seq": 1, "action": "peer.create"}')
_write(d / 'wireguard' / 'keys' / 'server_private.key', 'PRIV')
_write(d / 'wireguard' / 'wg_confs' / 'wg0.conf', '[Interface]')
_write(d / 'api' / 'wireguard' / 'keys' / 'private.key', 'P2')
_write(d / 'vault' / 'keys' / 'fernet.key', 'FERNETKEY')
_write(d / 'vault' / 'ca' / 'ca.key', 'CAKEY')
_write(d / 'vault' / 'secrets.json', 'ENC')
_write(d / 'api' / 'services' / 'wireguard-ext' / 'config' / 'wg_ext0.conf', 'EXT')
_write(d / 'caddy' / 'caddy' / 'cert.pem', 'CERT')
# Trash that must be excluded
_write(d / 'logs' / 'app.log', 'log line')
_write(d / 'api' / 'config_backups' / 'old' / 'manifest.json', '{}')
_write(d / 'api' / '.test_admin_pass', 'pw')
_write(d / 'api' / '.gitkeep', '')
_write(d / 'api' / 'scratch.tmp', 'tmp')
_write(d / 'api' / 'half.partial', 'partial')
_write(d / 'api' / '__pycache__' / 'x.pyc', 'bytecode')
def _backup_files(self, backup_id):
bp = self.cm.backup_dir / backup_id
return {p.relative_to(bp).as_posix()
for p in bp.rglob('*') if p.is_file()}
class TestBackupInclude(_BackupBase):
def test_critical_paths_included(self):
bid = self.cm.backup_config()
files = self._backup_files(bid)
expected = [
'data/api/auth_users.json',
'data/api/.flask_secret_key',
'data/api/peers.json',
'data/api/peer_service_credentials.json',
'data/api/cell_links.json',
'data/api/ddns_token',
'data/api/audit/audit.log',
'data/wireguard/keys/server_private.key',
'data/wireguard/wg_confs/wg0.conf',
'data/api/wireguard/keys/private.key',
'data/vault/keys/fernet.key',
'data/vault/ca/ca.key',
'data/vault/secrets.json',
'data/api/services/wireguard-ext/config/wg_ext0.conf',
'data/caddy/caddy/cert.pem',
]
for rel in expected:
self.assertIn(rel, files, f'{rel} missing from backup')
def test_absent_path_skipped_gracefully(self):
# Remove ddns_token before backup — should not error, just skip.
(self.data_dir / 'api' / 'ddns_token').unlink()
bid = self.cm.backup_config()
files = self._backup_files(bid)
self.assertNotIn('data/api/ddns_token', files)
self.assertIn('data/api/auth_users.json', files)
class TestBackupExclude(_BackupBase):
def test_trash_excluded(self):
bid = self.cm.backup_config()
files = self._backup_files(bid)
for rel in (
'data/logs/app.log',
'data/api/config_backups/old/manifest.json',
'data/api/.test_admin_pass',
'data/api/.gitkeep',
'data/api/scratch.tmp',
'data/api/half.partial',
'data/api/__pycache__/x.pyc',
):
self.assertNotIn(rel, files, f'{rel} should be excluded')
class TestPassphraseEncryption(_BackupBase):
def test_encrypted_archive_named_age(self):
archive_id = self.cm.backup_config(passphrase='hunter2')
self.assertTrue(archive_id.endswith('.tar.gz.age'))
archive = self.cm.backup_dir / archive_id
self.assertTrue(archive.is_file())
# Plaintext staging dir removed
self.assertFalse((self.cm.backup_dir / archive_id[:-len('.tar.gz.age')]).exists())
# Blob is recognised as encrypted
self.assertTrue(backup_crypto.is_encrypted(archive.read_bytes()))
# Mode 0600
mode = stat.S_IMODE(os.stat(archive).st_mode)
self.assertEqual(mode, 0o600)
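def test_plaintext_backup_is_0600(self):
# A plaintext backup is a directory, so the owner-only mode asserted
# below is 0700 (directories need the execute bit), not 0600.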
bid = self.cm.backup_config()
bp = self.cm.backup_dir / bid
mode = stat.S_IMODE(os.stat(bp).st_mode)
self.assertEqual(mode, 0o700)
def test_restore_wrong_passphrase_raises_permission(self):
archive_id = self.cm.backup_config(passphrase='correct')
with self.assertRaises(PermissionError):
self.cm.restore_config(archive_id, passphrase='wrong')
def test_restore_missing_passphrase_raises_permission(self):
archive_id = self.cm.backup_config(passphrase='correct')
with self.assertRaises(PermissionError):
self.cm.restore_config(archive_id, passphrase=None)
def test_roundtrip_with_passphrase_recovers_files(self):
archive_id = self.cm.backup_config(passphrase='secretpw')
# Wipe two critical files, then restore.
(self.data_dir / 'api' / 'auth_users.json').unlink()
(self.data_dir / 'vault' / 'keys' / 'fernet.key').unlink()
with patch.object(self.cm, '_reapply_runtime_state'):
ok = self.cm.restore_config(archive_id, passphrase='secretpw')
self.assertTrue(ok)
self.assertEqual(
(self.data_dir / 'api' / 'auth_users.json').read_text(), '{"admin": 1}')
self.assertEqual(
(self.data_dir / 'vault' / 'keys' / 'fernet.key').read_text(), 'FERNETKEY')
class TestRestoreOrderingAndReapply(_BackupBase):
def test_vault_restored_before_other_data(self):
bid = self.cm.backup_config()
# Record the destination of every copy2 call to observe the restore order.
order = []
real_copy = shutil.copy2
def tracking_copy(src, dst, *a, **k):
order.append(Path(dst).as_posix())
return real_copy(src, dst, *a, **k)
with patch.object(self.cm, '_reapply_runtime_state'), \
patch('config_manager.shutil.copy2', side_effect=tracking_copy):
self.cm.restore_config(bid)
def first_idx(needle):
for i, p in enumerate(order):
if needle in p:
return i
return 10 ** 9
vault_i = first_idx('/vault/')
auth_i = first_idx('auth_users.json')
wg_i = first_idx('/wireguard/')
self.assertLess(vault_i, auth_i, 'vault must restore before auth_users')
self.assertLess(vault_i, wg_i, 'vault must restore before wireguard keys')
def test_reapply_step_invoked(self):
bid = self.cm.backup_config()
with patch.object(self.cm, '_reapply_runtime_state') as mock_reapply:
self.cm.restore_config(bid)
mock_reapply.assert_called_once()
def test_reapply_calls_regenerate_and_apply_routes(self):
bid = self.cm.backup_config()
fake = MagicMock()
managers_mock = MagicMock()
managers_mock.caddy_manager = fake.caddy
managers_mock.firewall_manager = fake.firewall
managers_mock.connectivity_manager = fake.connectivity
managers_mock.cell_link_manager = fake.cell_link
managers_mock.service_composer = fake.composer
managers_mock.peer_registry = fake.peers
fake.peers.list_peers.return_value = []
fake.cell_link.list_connections.return_value = []
with patch.dict('sys.modules', {'managers': managers_mock}):
self.cm.restore_config(bid)
fake.caddy.regenerate_with_installed.assert_called_once()
fake.firewall.generate_corefile.assert_called_once()
fake.connectivity.apply_routes.assert_called_once()
fake.cell_link.replay_pending_pushes.assert_called_once()
fake.composer.reapply_active_services.assert_called_once()
if __name__ == '__main__':
    unittest.main()
+65 -2
@@ -30,6 +30,8 @@ api_dir = Path(__file__).parent.parent / 'api'
 sys.path.insert(0, str(api_dir))
 from app import app
+import backup_crypto
+import tarfile
 class TestCreateConfigBackup(unittest.TestCase):
@@ -119,14 +121,18 @@ class TestRestoreConfigBackup(unittest.TestCase):
             content_type='application/json',
         )
         mock_cm.restore_config.assert_called_once_with(
-            'backup_001', services=['network', 'wireguard']
+            'backup_001', services=['network', 'wireguard'], service_registry=None,
+            passphrase=None,
         )
     @patch('app.config_manager')
     def test_restore_passes_none_services_when_no_body(self, mock_cm):
+        from unittest.mock import ANY
         mock_cm.restore_config.return_value = True
         self.client.post('/api/config/restore/backup_001')
-        mock_cm.restore_config.assert_called_once_with('backup_001', services=None)
+        mock_cm.restore_config.assert_called_once_with(
+            'backup_001', services=None, service_registry=ANY, passphrase=None
+        )
 class TestExportConfig(unittest.TestCase):
@@ -341,6 +347,63 @@ class TestUploadBackup(unittest.TestCase):
         )
         self.assertEqual(r.status_code, 400)
+    @patch('app.config_manager')
+    def test_upload_stores_encrypted_blob_verbatim(self, mock_cm):
+        backup_dir = Path(self.tmp)
+        mock_cm.backup_dir = backup_dir
+        blob = backup_crypto.encrypt_bytes(b'payload-bytes', 'secret')
+        self.assertTrue(blob.startswith(backup_crypto.MAGIC))
+        r = self.client.post(
+            '/api/config/backup/upload',
+            data={'file': (io.BytesIO(blob), 'backup_20260101_010101.tar.gz.age')},
+            content_type='multipart/form-data',
+        )
+        self.assertEqual(r.status_code, 200)
+        data = json.loads(r.data)
+        self.assertTrue(data['encrypted'])
+        self.assertEqual(data['backup_id'], 'backup_20260101_010101')
+        archive = backup_dir / 'backup_20260101_010101.tar.gz.age'
+        self.assertTrue(archive.exists())
+        self.assertEqual(archive.read_bytes(), blob)
+    @patch('app.config_manager')
+    def test_upload_encrypted_then_restore_round_trip(self, mock_cm):
+        # Build a real encrypted backup archive (tar.gz of a manifest, then
+        # encrypted), upload it, then restore it through the real ConfigManager
+        # decrypt/resolve path with the correct and an incorrect passphrase.
+        from config_manager import ConfigManager
+        backup_dir = Path(self.tmp) / 'backups'
+        backup_dir.mkdir(parents=True, exist_ok=True)
+        mock_cm.backup_dir = backup_dir
+        tar_buf = io.BytesIO()
+        with tarfile.open(fileobj=tar_buf, mode='w:gz') as tar:
+            inner = json.dumps({'backup_id': 'rt', 'services': []}).encode()
+            info = tarfile.TarInfo('manifest.json')
+            info.size = len(inner)
+            tar.addfile(info, io.BytesIO(inner))
+        blob = backup_crypto.encrypt_bytes(tar_buf.getvalue(), 'pw123')
+        r = self.client.post(
+            '/api/config/backup/upload',
+            data={'file': (io.BytesIO(blob), 'rt.tar.gz.age')},
+            content_type='multipart/form-data',
+        )
+        self.assertEqual(r.status_code, 200)
+        backup_id = json.loads(r.data)['backup_id']
+        # Resolve+decrypt with the correct passphrase succeeds.
+        real_cm = ConfigManager.__new__(ConfigManager)
+        real_cm.backup_dir = backup_dir
+        path, cleanup = real_cm._resolve_backup_dir(f'{backup_id}.tar.gz.age', 'pw123')
+        self.assertTrue((path / 'manifest.json').exists())
+        shutil.rmtree(cleanup, ignore_errors=True)
+        # Wrong passphrase raises PermissionError → route returns 400.
+        with self.assertRaises(PermissionError):
+            real_cm._resolve_backup_dir(f'{backup_id}.tar.gz.age', 'wrong')
 if __name__ == '__main__':
     unittest.main()

Some files were not shown because too many files have changed in this diff.