Dev – Architecture
Dmitrii Iurco edited this page 2026-06-11 15:39:28 -04:00

Status: Active | Owner: @roof | Applies to: main (2026-06) | Updated: 2026-06-11



Container stack

Six core containers run on a Docker bridge network called cell-network (default subnet 172.20.0.0/16). Static IPs per container are set in docker-compose.yml and can be overridden via .env.

Browser / WireGuard peer
  └── Caddy (:80/:443)              TLS termination, reverse proxy
        └── React SPA (:8081→8080)  Vite + Tailwind (Nginx in container)
        └── Flask API (:3000)       REST API, bound to 127.0.0.1 only
              ├── NetworkManager        CoreDNS, chrony
              ├── WireGuardManager      WireGuard peer lifecycle
              ├── PeerRegistry          peer registration and trust
              ├── EmailManager          Postfix + Dovecot
              ├── CalendarManager       Radicale CalDAV/CardDAV
              ├── FileManager           WebDAV + Filegator
              ├── RoutingManager        iptables NAT and routing
              ├── FirewallManager       iptables INPUT/FORWARD rules
              ├── VaultManager          internal CA, cert lifecycle
              ├── ContainerManager      Docker SDK
              ├── CellLinkManager       cell-to-cell WireGuard links
              ├── ConnectivityManager   exit routing
              ├── DDNSManager           DDNS heartbeat
              ├── ServiceStoreManager   optional service install/remove
              ├── CaddyManager          Caddyfile generation and reload
              ├── AuthManager           session auth, RBAC
              ├── AuditManager          append-only activity log
              ├── AccountManager        per-service account provisioning
              └── SetupManager          first-run wizard state

Key container properties:

  • cell-wireguard runs unprivileged: it is granted only the NET_ADMIN capability, with no --privileged flag. It requires the WireGuard kernel module on the host.
  • cell-api and cell-webui use slim images.
  • The Docker socket is mounted only into cell-api. Other containers have no Docker access.
  • The Flask API binds to 127.0.0.1:3000 only. All external access goes through Caddy.
  • DHCP was removed. cell-dns runs CoreDNS only.

Installed optional service containers join cell-network with their own compose projects, managed by ServiceComposer. Each service is a separate compose project at data/services/<id>/docker-compose.yml.


Manager pattern

All managers inherit BaseServiceManager (api/base_service_manager.py), which requires implementing:

  • get_status() — current running state
  • get_config() / update_config() — config read/write
  • validate_config() — validation before write
  • test_connectivity() — reachability check
  • get_logs() — recent log lines
  • restart_service() — container restart via Docker SDK
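
As a rough illustration of the contract listed above, the abstract base could look like the sketch below. The method names come from this page; the constructor arguments, parameter lists, return types, and the DemoManager subclass are assumptions made for the example and may not match api/base_service_manager.py.

```python
import logging
from abc import ABC, abstractmethod


class BaseServiceManager(ABC):
    """Hypothetical outline of the manager contract (signatures are assumed)."""

    def __init__(self, name, config_manager=None):
        self.name = name
        self.logger = logging.getLogger(name)
        self.config_manager = config_manager

    @abstractmethod
    def get_status(self): ...

    @abstractmethod
    def get_config(self): ...

    @abstractmethod
    def update_config(self, config): ...

    @abstractmethod
    def validate_config(self, config): ...

    @abstractmethod
    def test_connectivity(self): ...

    @abstractmethod
    def get_logs(self, lines=50): ...

    @abstractmethod
    def restart_service(self): ...


class DemoManager(BaseServiceManager):
    """Toy in-memory manager, only here to show the contract being satisfied."""

    def __init__(self):
        super().__init__("demo")
        self._config = {"enabled": True}

    def get_status(self):
        return {"running": True}

    def get_config(self):
        return dict(self._config)

    def validate_config(self, config):
        return isinstance(config.get("enabled"), bool)

    def update_config(self, config):
        # Validation runs before any write, as the contract requires.
        if not self.validate_config(config):
            raise ValueError("invalid config")
        self._config = dict(config)

    def test_connectivity(self):
        return True

    def get_logs(self, lines=50):
        return []

    def restart_service(self):
        return True
```

A subclass that leaves out any of the seven methods cannot be instantiated, so a missing implementation fails at startup rather than at request time.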

Managers are instantiated as singletons in api/managers.py and injected into app.py as module-level names. Route handlers import them from app inside the route function (not at module load time) to avoid circular imports.

All managers use self.logger (from BaseServiceManager) and self.config_manager for config access. Direct file I/O on cell_config.json is a bug.

Shared state in managers is guarded by threading.RLock, because Flask serves requests on multiple threads and manager methods can run concurrently.
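
A minimal sketch of the locking pattern, assuming a hypothetical PeerTable class that is not part of the codebase. RLock is re-entrant, so a method that already holds the lock can call another locked method on the same object without deadlocking.

```python
import threading


class PeerTable:
    """Hypothetical shared-state holder guarded by a re-entrant lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self._peers = {}

    def count(self):
        with self._lock:
            return len(self._peers)

    def add_peer(self, name, ip):
        with self._lock:
            self._peers[name] = ip
            # Re-enters the lock already held by this thread; a plain
            # threading.Lock would deadlock here.
            return self.count()
```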


Service bus

ServiceBus (api/service_bus.py) is a pub/sub system between managers. Events include CONFIG_CHANGED, SERVICE_STARTED, SERVICE_STOPPED. Managers subscribe to events from their dependencies (for example, WireGuardManager subscribes to network changes).
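
The pub/sub idea can be sketched in a few lines. The event names are taken from this page; the subscribe and publish signatures and the callback arguments are assumptions for illustration and may differ from api/service_bus.py.

```python
from collections import defaultdict

CONFIG_CHANGED = "CONFIG_CHANGED"
SERVICE_STARTED = "SERVICE_STARTED"
SERVICE_STOPPED = "SERVICE_STOPPED"


class ServiceBus:
    """Hypothetical minimal bus: event name -> list of callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, callback):
        self._subscribers[event].append(callback)

    def publish(self, event, source, payload=None):
        # Iterate over a copy so a callback may subscribe during delivery.
        for callback in list(self._subscribers.get(event, [])):
            callback(source, payload)
```

A manager would typically subscribe in its constructor and react in the callback, for example by regenerating its own config when a dependency publishes CONFIG_CHANGED.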


Config and secrets

  • Runtime config: config/api/cell_config.json — managed by ConfigManager, never edited directly
  • Secrets: data/ — git-ignored; contains auth_users.json, WireGuard keys, DDNS token, CA key, vault secrets
  • _identity.domain in cell config is a plain string (the domain mode, for example "pic_ngo"), not a dict

ConfigManager validates on write and keeps automatic rolling backups in data/api/config_backups/.
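
To illustrate validate-on-write with rolling backups, here is a sketch under stated assumptions: the write_config function, the backup file naming, and the keep parameter are all hypothetical and are not ConfigManager's actual interface.

```python
import json
import shutil
import tempfile
from pathlib import Path


def write_config(config_path, backup_dir, new_config, validate, keep=5):
    """Validate, back up the current file, prune old backups, then write."""
    if not validate(new_config):
        raise ValueError("config failed validation; nothing written")
    config_path, backup_dir = Path(config_path), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    if config_path.exists():
        existing = sorted(backup_dir.glob("cell_config.*.json"))
        # Sequence numbers are zero-padded so name order equals age order.
        seq = int(existing[-1].name.split(".")[1]) + 1 if existing else 1
        shutil.copy2(config_path, backup_dir / f"cell_config.{seq:06d}.json")
        for old in sorted(backup_dir.glob("cell_config.*.json"))[:-keep]:
            old.unlink()
    # Write to a temporary file first so readers never see a partial file.
    tmp = config_path.with_suffix(".tmp")
    tmp.write_text(json.dumps(new_config, indent=2))
    tmp.replace(config_path)


# Usage: eight writes with keep=3 leave the three newest backups behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    cfg = Path(tmp_dir) / "cell_config.json"
    backups = Path(tmp_dir) / "config_backups"
    is_valid = lambda c: isinstance(c.get("version"), int)
    for version in range(8):
        write_config(cfg, backups, {"version": version}, is_valid, keep=3)
    try:
        write_config(cfg, backups, {"version": "bad"}, is_valid)
        rejected = False
    except ValueError:
        rejected = True
    kept = sorted(p.name for p in backups.iterdir())
    current = json.loads(cfg.read_text())
```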


Before-request hooks

Three Flask before-request hooks run on every request, in order:

  1. enforce_setup — returns 428 for all /api/* except /api/setup/* and /health until setup is complete. Skipped when app.config['TESTING'] is True.
  2. enforce_auth — returns 401 if no session; 503 if the users file is empty (misconfiguration). Skipped when testing.
  3. check_csrf — requires X-CSRF-Token header on POST, PUT, DELETE, PATCH on /api/* except /api/auth/* and /api/setup/*.

These hooks form the security boundary. Any change to them requires careful review.
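
The ordering can be modelled as a plain function, without Flask, to make the decision flow easier to follow. Several details are assumptions, not taken from the code: which paths enforce_auth exempts (the sketch reuses the check_csrf exemptions), the 503 check running before the 401 check, and the 403 status for a missing CSRF token. Other session-less endpoints such as peer-sync are left out.

```python
CSRF_METHODS = {"POST", "PUT", "DELETE", "PATCH"}


def _exempt(path, prefixes):
    return any(path.startswith(prefix) for prefix in prefixes)


def run_hooks(method, path, *, setup_complete, users, session_user,
              csrf_token=None, testing=False):
    """Return the status a hook would reject with, or None to proceed."""
    is_api = path.startswith("/api/")

    # 1. enforce_setup: block the API until the first-run wizard is done.
    if not testing and not setup_complete:
        if is_api and not _exempt(path, ("/api/setup/",)):
            return 428

    # 2. enforce_auth: exempt prefixes are an assumption of this sketch.
    if not testing and is_api and not _exempt(path, ("/api/auth/", "/api/setup/")):
        if not users:
            return 503  # empty users file means misconfiguration
        if session_user is None:
            return 401

    # 3. check_csrf: state-changing methods need the X-CSRF-Token header.
    if is_api and method in CSRF_METHODS:
        if not _exempt(path, ("/api/auth/", "/api/setup/")) and not csrf_token:
            return 403  # assumed status; the page does not state it
    return None
```

Because the hooks run in order, a request made before setup completes gets 428 even when it also lacks a session.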


Connectivity v2 data model

The Connectivity feature (v2) uses named connection instances instead of one-global-exit-per-type.

Each connection is a record in cell_config.json under connectivity.connections. A record contains:

  • id — UUID assigned at creation
  • type — one of wireguard_ext, openvpn, tor, sshuttle, proxy, cell_relay
  • name — human label
  • mark — fwmark hex value, allocated from the pool 0x1000–0x1FFF (stride 0x10)
  • table — routing table number, starting from 1000
  • For iface types (wireguard_ext, openvpn): iface — interface name (wgext_<suffix> or ovpn_<suffix>)
  • For redirect types (tor, sshuttle, proxy): redirect_port — allocated from 9100–9199
  • status — last health probe result (health, timestamp, detail); never contains secrets
  • Secrets are stored in the vault under conn_<id>_<field> and only the key references are kept in the record
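
The allocation rules above can be sketched as follows. The pools and type names come from this page; the new_connection function, the first-free allocation strategy, storing mark as a hex string, and deriving the interface suffix from the UUID are assumptions for illustration.

```python
import itertools
import uuid

MARK_POOL = range(0x1000, 0x2000, 0x10)   # 0x1000, 0x1010, ... 0x1FF0
REDIRECT_PORTS = range(9100, 9200)        # 9100 through 9199
IFACE_PREFIX = {"wireguard_ext": "wgext_", "openvpn": "ovpn_"}
REDIRECT_TYPES = {"tor", "sshuttle", "proxy"}
ALL_TYPES = set(IFACE_PREFIX) | REDIRECT_TYPES | {"cell_relay"}


def _first_free(candidates, used, what):
    for value in candidates:
        if value not in used:
            return value
    raise RuntimeError(f"{what} pool exhausted")


def new_connection(existing, conn_type, name):
    """Build a connection record that does not collide with existing ones."""
    if conn_type not in ALL_TYPES:
        raise ValueError(f"unknown connection type: {conn_type}")
    conn_id = str(uuid.uuid4())
    used_marks = {int(c["mark"], 16) for c in existing}
    used_tables = {c["table"] for c in existing}
    record = {
        "id": conn_id,
        "type": conn_type,
        "name": name,
        "mark": hex(_first_free(MARK_POOL, used_marks, "mark")),
        "table": _first_free(itertools.count(1000), used_tables, "table"),
    }
    if conn_type in IFACE_PREFIX:
        # Suffix choice is hypothetical; it keeps the name under 15 characters.
        record["iface"] = IFACE_PREFIX[conn_type] + conn_id[:8]
    elif conn_type in REDIRECT_TYPES:
        used_ports = {c.get("redirect_port") for c in existing}
        record["redirect_port"] = _first_free(REDIRECT_PORTS, used_ports, "redirect_port")
    return record
```

With a stride of 0x10, the mark pool holds 256 values, which caps the number of simultaneous connections in this sketch.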

cell_relay connections are auto-derived from cell links that offer an exit. They have mark and table allocated but no iface or redirect_port. They are reconciled automatically on list_connections().

Migration v1 to v2: on first get_connectivity() call after upgrade, ConfigManager calls ConnectivityManager._migrate_connectivity_v1_to_v2() if the stored version is less than 2. This creates one named connection per previously-configured exit type (which had fixed fwmarks 0x10–0x50), repoints vault secret references to the new conn_<id>_<field> naming, and deletes the old references.

Peer assignments store the connection id as exit_connection_id on the peer record. The legacy route_via field is kept in sync for backward compatibility.


Cell-to-cell networking

CellLinkManager manages WireGuard site-to-site tunnels. Each link is a WireGuard peer on wg0 with:

  • A /32 VPN address for the remote cell's API endpoint
  • AllowedIPs covering the remote cell's full VPN subnet
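
As an illustration of how the two address ranges might combine in a WireGuard peer entry, the sketch below builds a [Peer] stanza with the standard ipaddress module. The peer_stanza function and the example addresses are hypothetical, and whether the /32 actually appears in AllowedIPs is an assumption; the real logic in CellLinkManager may differ.

```python
import ipaddress


def peer_stanza(public_key, api_address, vpn_subnet):
    """Render a WireGuard [Peer] block for a remote cell (illustrative only)."""
    api = ipaddress.ip_network(f"{api_address}/32")
    subnet = ipaddress.ip_network(vpn_subnet)
    # collapse_addresses drops the /32 when the subnet already contains it.
    allowed = ipaddress.collapse_addresses([api, subnet])
    return "\n".join([
        "[Peer]",
        f"PublicKey = {public_key}",
        "AllowedIPs = " + ", ".join(str(net) for net in allowed),
    ])
```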

The peer-sync protocol (/api/cells/peer-sync/) allows two cells to exchange public keys and allowed networks without a session. Authentication is by source IP and WireGuard public key — not session cookies.

When a remote cell advertises that it offers an internet exit, reconcile_cell_relays() creates or updates a cell_relay connection in the local connectivity config. This is called automatically when list_connections() is invoked.
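
A rough sketch of the reconcile step, under several assumptions: the link fields (id, name, offers_exit), the link_id back-reference on the relay record, integer marks, and the removal of relays whose link no longer offers an exit are all hypothetical. The page only states that relays are created or updated.

```python
import itertools
import uuid


def reconcile_cell_relays(connections, cell_links):
    """Return a connection list whose cell_relay entries match current links."""
    relays = {c["link_id"]: c for c in connections if c["type"] == "cell_relay"}
    result = [c for c in connections if c["type"] != "cell_relay"]
    used_marks = {c["mark"] for c in connections}
    used_tables = {c["table"] for c in connections}
    for link in cell_links:
        if not link.get("offers_exit"):
            continue  # assumed: a relay without an offering link is dropped
        relay = relays.get(link["id"])
        if relay is None:
            mark = next(m for m in range(0x1000, 0x2000, 0x10) if m not in used_marks)
            table = next(t for t in itertools.count(1000) if t not in used_tables)
            used_marks.add(mark)
            used_tables.add(table)
            # mark and table only: no iface and no redirect_port.
            relay = {"id": str(uuid.uuid4()), "type": "cell_relay",
                     "link_id": link["id"], "mark": mark, "table": table}
        # Refresh the label on every pass so link renames propagate.
        result.append({**relay, "name": f"Relay via {link['name']}"})
    return result
```

Running the function twice on unchanged input returns the same records, which is what makes it safe to call on every list_connections().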

Access control for cell-to-cell service access (calendar, files, mail, WebDAV) is enforced at the iptables level by FirewallManager.


Frontend

The React SPA (webui/) is built with Vite and styled with Tailwind CSS utilities. There are no custom CSS files. All API calls go through webui/src/services/api.js (Axios). Page components live in webui/src/pages/; reusable components in webui/src/components/.

In development, the Vite dev server (npm run dev) proxies /api requests to :3000. In production, Caddy routes them.

The nav is dynamic: installed services are fetched via GET /api/services/active on load. After install or uninstall, the pic-services-changed custom DOM event is dispatched to trigger a re-fetch without a full page reload.