- Fix#2: Move DDNS bearer token from cell_config.json to data/api/ddns_token.
Token is now in the secrets store (data/) rather than the config store (config/).
Auto-migrates existing installs on first access. ConfigManager.get/set_ddns_token()
added. set_ddns_config() now strips 'token' key to prevent it leaking back.
- Fix#3: Set Caddyfile permissions to 0o600 after write so the token embedded
in the Caddyfile is not world-readable on the host filesystem.
- Fix#5: Heartbeat now fires IDENTITY_CHANGED after re-registration so Caddy
regenerates its config with the new token automatically — users no longer need
to click Re-register in Settings after a wizard registration failure.
Also: heartbeat skips the 401-cycle when no token exists and goes straight to
registration instead. DDNSManager now accepts service_bus= and is wired up.
- Fix#6: Settings page starts polling GET /api/caddy/cert-status every 15s
after a successful DDNS re-registration and shows "Acquiring certificate…"
feedback until Let's Encrypt issues the cert (up to 5 minutes).
- Fix#7: regenerate_with_installed() is debounced (5 s window) so two rapid
IDENTITY_CHANGED events (e.g. wizard + heartbeat) can't start simultaneous
ACME orders that interfere with each other.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Setup wizard (Issue 1 — UI):
- pic.ngo subdomain input now uses the same split-field style as DuckDNS:
input + static '.pic.ngo' suffix in a flex row, availability status below
Setup wizard (Issue 2 — Caddy not regenerating after completion):
- complete_setup route now fires IDENTITY_CHANGED after a successful wizard
submission so CaddyManager regenerates the Caddyfile immediately; users
no longer need to press 'Renew Certificate' to start ACME
Settings — DDNS status (Issue 2 — domain status missing):
- New GET /api/ddns/status endpoint: returns registered flag, domain_name,
public_ip (ipify with 30s cache), last_ip from heartbeat
- Settings DDNS section for pic_ngo now shows a live status row with
color-coded dot (green=registered+current, yellow=registered+stale,
gray=not registered), current public IP, and a Check button
- Status auto-refreshes on mount and after each successful re-registration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The method is named get_email_users in EmailManager; the route was
calling the non-existent get_users, causing an AttributeError on every
GET /api/email/users request.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- PicNgoDDNS.update(): send token in request body instead of Authorization
header; DDNS server validates it from body (was returning HTTP 422 on
every heartbeat, leaving IP record stale after fresh install)
- peers.py / Peers.jsx: webdav service_access only valid when 'files' store
service is installed; was always shown even with no services, confusing
users into thinking WebDAV was pre-installed
- 10 new regression tests: DDNS update body contract, Caddy always
regenerates on startup with no services, peer role allowed on
/api/services/active, webdav gating by installed services
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DNS (critical): add _configured_dns_params() that returns (primary_domain,
split_horizon_zones) from config_manager so all apply_all_dns_rules() callers
pass the correct primary zone (e.g. 'pic.ngo') and split-horizon list
(e.g. ['pic1.pic.ngo']) instead of the FQDN as the primary — fixes
DNS_PROBE_FINISHED_BAD_CONFIG for all external domains when on VPN
- firewall_manager: add split_horizon_zones param to apply_all_dns_rules()
and forward it to generate_corefile()
- Peers: filter service_access list to installed services only; peers.py
derives valid services from config_manager.get_installed_services() with
the email→mail ID mapping; Peers.jsx fetches from /api/store/installed
and filters the checkboxes and defaults accordingly
- Health check: fix file_manager→'files' ID mapping so files service health
is checked when installed (was silently skipped due to 'file' vs 'files')
- Verbosity persistence: move log_levels.json from non-mounted
/app/api/config/ to CONFIG_DIR (/app/config/) which maps to config/api/
on the host; both load (managers.py) and save (routes/services.py) updated
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The GET /api/cells/invite endpoint was returning domain='pic.ngo' instead
of the full FQDN 'test5.pic.ngo' because it read _identity.domain rather
than _identity.domain_name.
Apply the same domain_name preference (domain_name || domain) to:
- routes/cells.py get_cell_invite() — the invite shown to connecting cells
- routes/cells.py update_cell_permissions() — Corefile DNS regeneration
- cell_link_manager.py _check_invite_conflicts() — incoming domain collision check
- cell_link_manager.py exchange_invites() — own invite construction
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The e2e tests were reading a stale Corefile at a hardcoded fallback path
(/home/roof/pic/config/dns/Corefile) instead of the live one written by
the API (/opt/pic/config/dns/Corefile on pic1). Adding a proper API
endpoint eliminates the path ambiguity.
The iptables test was checking whether peer_ip, DROP, and dpt:80 appeared
anywhere in the full multi-line output rather than on the same rule line,
producing false positives. Now checks per line.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Allows fetching a single peer by name. E2E tests need this to verify
persisted peer state after PUT operations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Endpoint override:
- Add PUT /api/wireguard/endpoint to set endpoint_override in identity
config; GET returns detected, override, and effective endpoints
- _effective_endpoint() helper applies override in peer config generation
(wireguard.py and peer_dashboard.py); detected IP still shown in UI
- Add Endpoint Override input in WireGuard page — solves the common case
where auto-detected IP is a gateway/VPS but peers connect via LAN IP
Docker cell-network fix:
- Declare cell-network external in docker-compose.yml; Docker Compose v5
enforces label ownership and rejects networks created by older versions
- Makefile start/update pre-create cell-network idempotently
- reinstall/uninstall(full) explicitly delete and recreate the network
- Fix uninstall loop path: data/api/services/ (not data/services/)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- CaddyManager: add refresh_cert_status() and get_cert_status_fresh() that
open a live TLS connection to cell-caddy:443 to read cert expiry; avoids
needing a volume mount into the API container
- CaddyManager: periodic cert refresh in health_monitor_loop (every 60 cycles)
- config.py PUT /api/ddns: publish IDENTITY_CHANGED so CaddyManager regenerates
the Caddyfile immediately after any domain/cell_name change — previously the
event was never fired from this route
- config.py: remove all ip_utils.write_caddyfile() calls; CaddyManager is now
the sole authority for Caddyfile generation
- app.py: add GET /api/caddy/cert-status route
- app.py: add GET /api/egress/status and PUT /api/egress/services/<id>/exit routes
- Settings.jsx: display cert status badge (valid/expired/internal/unknown) with
expiry date and days-remaining in the domain section
- Tests: TestRefreshCertStatus (8 tests), TestDdnsConfigUpdatesFiresIdentityChanged,
TestCaddyCertStatusRoute added; fix expired-cert helper to set not_valid_before
relative to expiry so it's always earlier
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- add_peer() now calls account_manager.provision() for any installed store
service whose manifest declares accounts.manager == 'http', enabling
per-peer credential provisioning to third-party HTTP services
- reapply_on_startup() calls egress_manager.apply_all() so fwmark rules
survive container restarts without manual intervention
- add GET /api/egress/status and PUT /api/egress/services/<id>/exit routes
so the UI can read and override per-service egress policy
- tests: HTTP provision wiring (happy path + non-fatal failure), egress
apply_all at startup (wired/unwired/failure cases)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- docker-compose.services.yml: change external network name from
pic_cell-network to cell-network so store-service compose files can find
it. The project-prefixed name was overriding the explicit name: cell-network
fix in docker-compose.yml when both files were merged by make start.
- service_store.py: normalize docker compose stderr into the error key in
the 400 response so the Store page shows the actual failure reason instead
of the generic fallback message.
- app.py: skip health checks for email/calendar/files managers when those
optional store services are not installed — prevents false Down alerts and
unnecessary noise in health history.
- Logs.jsx: remove Email/Calendar/Files columns from the health history table;
they are optional store services, not core builtins that should always appear.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Email/calendar/files routes now return 404 when the service is not
installed, using a require_active_service decorator that checks
ServiceRegistry. Status endpoints are exempt so health checks always work.
SetupManager.complete_setup() now accepts a service_store_manager and
installs any wizard-selected services in a background daemon thread after
setup completes. Failures are logged but do not fail the wizard.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Email, calendar, and files no longer appear in the nav or as usable pages
unless they are installed. The nav refreshes whenever a service is installed
or removed via the new pic-services-changed CustomEvent.
Changes:
- routes/services.py: add GET /api/services/active endpoint
- api.js: add servicesAPI.listActive()
- App.jsx: replace hardcoded coreServiceChildren with dynamic state fetched
from /api/services/active; SERVICE_META maps ids to nav entry shapes
- ServiceNotInstalledBanner.jsx: new component — admin gets catalog link,
peer gets "contact admin" message
- EmailPage/CalendarPage/FilesPage: show banner when service not installed
- ServicesIndex.jsx: remove CoreServiceCard + CORE_SERVICES "Built-in"
section; rename Remove → Uninstall; dispatch pic-services-changed on
install/uninstall success
- MyServices.jsx: conditionally render service cards based on active list;
placeholder card when absent; page-level notice when nothing is installed
- tests/test_services_active_endpoint.py: 4 new endpoint tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously, CaddyManager and NetworkManager contained hardcoded lists of
service names (calendar, files, mail, webdav, etc.), meaning every new
service required a code change to appear in Caddy routes and DNS records.
Now both managers accept a service_registry parameter and derive their
service lists dynamically from the registry at runtime.
- CaddyManager: new _build_registry_service_routes() and
_http01_service_pairs() methods pull routes from the registry
- NetworkManager: new _get_service_subdomains() method returns registry
subdomains with a hardcoded fallback when no registry is wired in;
_build_dns_records, stale-record detection, and service name sets all
use the registry
- managers.py: service_registry constructed before network_manager so it
can be injected into both CaddyManager and NetworkManager
- service_registry.py: validation chokepoint in get_caddy_routes() rejects
invalid subdomain/backend values and reserved service names
- service_store_manager.py: _validate_manifest now validates top-level
subdomain, backend, extra_subdomains, and extra_backends fields
- tests: 24 new tests covering registry-driven routing and DNS subdomain
generation (test_caddy_registry_integration.py)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rename Store → Services: ServicesIndex.jsx shows built-in core services
(Email, Calendar, Files) with Manage links, plus the existing add-on
store below.
New service sub-pages at /services/email|calendar|files serve both
admin and peer roles. Admins see connection info, service status, users
list, and an inline config form (port/data-dir). Peers see connection
info and their personal credentials fetched from peerAPI.
Navigation restructured: a Services parent item expands to show the
three sub-pages via a collapsible sidebar group (ChevronDown toggle).
Both admin and peer navigation include the Services group. Sidebar
extracted NavItem/NavList components to eliminate the duplicate mobile/
desktop rendering.
Settings.jsx drops EmailForm, CalendarForm, FilesForm and their
SERVICE_DEFS entries. Port conflict detection and per-service validation
logic extracted to utils/serviceConfig.js, shared by Settings and the
new service pages. Service form flushers are registered without cleanup
so the Apply banner saves dirty config even when the user navigates away
from a service page before clicking Apply.
Legacy routes /email, /calendar, /files, /store redirect to their new
canonical paths.
GET /api/config now includes installed_services so the nav can derive
which add-ons are installed without a separate store fetch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In DDNS modes (pic_ngo, cloudflare, duckdns, http01), all built-in
services are now reachable as subdomains of the cell domain, e.g.
calendar.pic1.pic.ngo instead of pic1.pic.ngo/calendar.
Key changes:
- CaddyManager._build_core_service_routes(): new helper generates
Caddy named-matcher host blocks for calendar, mail/webmail, files,
webdav, and api subdomains within the wildcard TLS server block.
- All ACME modes (pic_ngo, cloudflare, duckdns) use the new
subdomain matchers; http01 emits a dedicated server block per service.
- http01: installed store-plugin services whose name clashes with a
core service are skipped to prevent duplicate server blocks.
- routes/config.py: ip_utils.write_caddyfile() is skipped in non-LAN
modes so LAN Caddy config never overwrites the ACME config.
- firewall_manager.generate_corefile(): new split_horizon_zones param
adds local authoritative file zones so LAN clients resolve
*.pic1.pic.ngo to the internal Caddy IP without hairpin NAT.
- NetworkManager.update_split_horizon_zone(): writes the wildcard zone
file and regenerates the Corefile with the split-horizon block;
called automatically after every identity change in non-LAN mode.
- Added @ to allowed record-name chars in update_dns_zone validation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ConfigManager.get_effective_domain(): returns domain_name when DDNS
active (pic_ngo/cloudflare/duckdns), domain otherwise. Used by all
public-facing services so they use the real registered FQDN.
- ConfigManager.get_internal_domain(): always returns _identity.domain
(CoreDNS zone name, dnsmasq, cell-link invites — stays internal).
- Silent migration: if domain_mode != lan and domain is generic "cell",
auto-set to {cell_name}.local for unique CoreDNS zone naming.
- caddy_manager: fix custom_domain bug — cloudflare/http01 modes were
reading identity.get('custom_domain') which never exists; now reads
domain_name correctly.
- routes/config, app: expose effective_domain in GET /api/config and
/api/status responses.
- email_manager, routes/email: use get_effective_domain() for
OVERRIDE_HOSTNAME, POSTMASTER_ADDRESS, and new-user email defaults.
- ServiceBus.IDENTITY_CHANGED event: emitted from PUT /api/config and
POST /api/ddns/register after identity writes; caddy_manager and
email_manager subscribe to regenerate config automatically.
- Settings.jsx: hide Local Domain input in non-LAN modes; show
read-only effective_domain with "managed by DDNS" badge and an
Advanced toggle for the internal CoreDNS zone name.
- 11 new test classes covering all new helpers, event subscriptions,
caddy/email handlers, and the custom_domain fix.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DDNSTokenExpired exception triggers auto re-register in update_ip()
so cells recover silently after a DDNS DB reset
- POST /api/ddns/register lets the user force re-registration from Settings
- Re-register button in Settings → External Domain & DDNS (pic_ngo only)
- 3 new tests covering register endpoint: wrong provider, missing name, success
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- GET /api/config now returns domain_mode, domain_name, ddns.{provider,subdomain,has_token}
- GET /api/ddns/check/<name> proxies availability check to DDNS service
- PUT /api/ddns validates and saves cloudflare/duckdns credentials post-setup
- When cell_name changes for pic_ngo provider, auto-registers the new subdomain
- Settings: Cell Name shows availability badge for pic_ngo; auto-save blocks on taken
- Settings: new External Domain & DDNS section — pic_ngo info, cloudflare/duckdns edit
- 11 new tests for the two new endpoints (all pass)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_check_pic_ngo_available was hardcoding https://ddns.pic.ngo, ignoring
DDNS_URL. Now imports DDNS_API_BASE from setup_manager so both the
availability check and DDNS registration use the same configured URL.
API container now receives DDNS_URL and DDNS_TOTP_SECRET from env.
Default DDNS_URL points to http://ddns.pic.ngo:8080/api/v1 (the
FastAPI service runs on port 8080 without TLS termination in front).
Also returns 503 (not 500) when the DDNS service is unreachable.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
install.sh no longer prompts for anything. It installs packages (with sudo),
creates the system user, clones the repo, and runs 'make install' — all as
the invoking user. Only package installs and system-level ops use sudo.
All folder creation happens under the user's own account, no chown needed.
/setup wizard gains the missing validation that was previously in install.sh:
- Step 1: checks pic.ngo name availability via backend (non-blocking)
- Step 4: 'Verify token' button for Cloudflare and DuckDNS tokens,
validated server-side through new /api/setup/validate steps
API changes (routes/setup.py):
- validate step 'pic_ngo_available': proxy check to ddns.pic.ngo
- validate step 'cloudflare_token': verify via Cloudflare tokens API
- validate step 'duckdns_token': verify via DuckDNS update endpoint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reverts 8d1ef39. The installer must collect cell name, domain mode, and
provider tokens before 'make install' so that DDNS registration,
availability checks, and Caddy TLS can be configured at first boot.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three related fixes for split-tunnel peers that need to reach connected cells:
1. apply_peer_rules/apply_all_peer_rules now accept wg_subnet (actual local VPN
subnet) and cell_subnets (connected cells' vpn_subnets) parameters instead of
hardcoding 10.0.0.0/24. All callers (startup, add_peer, update_peer,
apply-enforcement endpoint) pass the real values.
2. Explicit ACCEPT rules are inserted in FORWARD for each connected cell's
subnet so split-tunnel peers (internet_access=False) can still reach
connected cells via the wg0→wg0 path.
3. apply_ip_range in network_manager now loads cell_links.json and passes it
to generate_corefile(), fixing a race where the bootstrap DNS thread could
overwrite the Corefile and wipe cross-cell DNS forwarding zones on startup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a cell is connected to others, changing the local WireGuard address
or Docker ip_range to a subnet that overlaps a connected cell's vpn_subnet
would break routing. Both now return 409 with the conflicting cell name.
- wireguard.address: derive network from new address, check all connected
cells' vpn_subnet for overlap (after existing format validation)
- ip_range: check all connected cells' vpn_subnet for overlap (after
existing RFC-1918 validation)
Tests: 4 cases each (overlap → 409, no overlap → ok, no cells → ok,
format error still fires first → 400).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two gaps allowed a cell to take a domain already in use by a connected cell:
1. PUT /api/config domain change: added check against cell_link_manager's
connected cells list before saving — returns 409 if the new domain
collides with any connected cell's domain.
2. accept_invite healing path: a remote cell changing its domain via a
re-invite was not validated against other connected cells' domains.
Now calls _check_invite_conflicts(invite, exclude_cell=name) before
applying any change.
Also: the healing path now detects domain changes (alongside dns_ip/
vpn_subnet/endpoint), updates the stored domain, and refreshes the DNS
forward rule when the domain changes.
Tests: 3 new domain-conflict tests in test_config_validation.py;
3 new accept_invite healing tests in test_cell_link_manager.py.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three issues fixed together:
1. WireGuard address changes now go through the pending-restart queue
(shown in the UI banner) instead of restarting cell-wireguard immediately.
Only private_key changes still restart immediately; address and port
changes both defer to the user-initiated Apply flow. Previously the
address change was silently applied and never appeared in Settings →
Pending Configuration.
2. When the WG address changes, the API spawns a background thread that
pushes the updated invite to all connected cells (over LAN, before the
WG tunnel is back up). This lets remote cells automatically update
their dns_ip, AllowedIPs, and CoreDNS forwarding rules without manual
re-pairing.
3. accept_invite now handles the "already connected but changed" case:
if the remote cell re-sends an invite with a different dns_ip, vpn_subnet
or endpoint, we update the stored link, the WG AllowedIPs, and the
CoreDNS forward rule in place — no delete/re-add required. Previously
the endpoint was ignored and returned the stale record unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three fixes:
1. Extend the docker-exec safety guard in wireguard_manager to also check
for 'wg_confs' in the config path. When running unit tests on the host
the API uses /app/config/wireguard/wg0.conf (no wg_confs subdir), so the
old '/tmp/' | 'pytest' check didn't fire — _syncconf and friends were
executing live 'docker exec cell-wireguard wg set' calls against the
running container, removing real VPN peers that didn't appear in the
test config. The wg_confs subdir only exists inside the container mount,
so its presence reliably gates live calls.
2. Fix get_split_tunnel_ips() wrong path: self.data_dir + 'api/cell_links.json'
→ self.data_dir + 'cell_links.json'. The extra 'api/' segment produced
/app/data/api/cell_links.json inside the container instead of the real
/app/data/cell_links.json, so connected cells were silently excluded from
split-tunnel CIDRs.
3. update_peer_ip_registry and ip_update now also call
wireguard_manager.update_peer_ip so wg0.conf AllowedIPs stay in sync when
a peer's VPN IP changes at runtime (previously only peers.json was updated).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
**Auto mutual pairing**
When Cell A imports Cell B's invite (POST /api/cells on A), A now
immediately pushes its own invite to Cell B over the LAN (using the
endpoint IP, before the WG tunnel exists) via the new endpoint:
POST /api/cells/peer-sync/accept-invite
Cell B auto-adds Cell A as a WireGuard peer and DNS forward, completing
the bidirectional tunnel without any manual action on Cell B's UI.
The endpoint is idempotent and unauthenticated (runs before WG tunnel).
Previously, the pairing was one-sided: Cell A had Cell B as a WG peer
but Cell B never had Cell A — the tunnel never established and all
cross-cell operations silently failed.
**Conflict detection (add_connection + accept-invite)**
_check_invite_conflicts() now validates before connecting:
- VPN subnet must not overlap own subnet or any already-connected cell's subnet
- Domain must not match own domain or any already-connected cell's domain
Returns clear error messages so the admin knows which cell to reconfigure.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DNS A records now return the WireGuard server IP (10.0.0.1) instead of
Docker bridge VIPs so cross-cell peers resolve service names correctly
regardless of their bridge subnet. DNAT rules (wg0:53→cell-dns:53 and
wg0:80→cell-caddy:80) are applied at startup. Caddy routes by Host header,
eliminating the Docker bridge subnet conflict. Firewall cell rules allow
DNS and service (Caddy) traffic from linked cell subnets. Split-tunnel
AllowedIPs now dynamically includes connected-cell VPN subnets and drops
the 172.20.0.0/16 range. Peers with route_via set now receive full-tunnel
config (0.0.0.0/0) so all their traffic exits via the remote cell.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the ability to route a specific peer's internet traffic through a
connected cell acting as an exit relay.
Cell A side:
- PUT /api/peers/<peer>/route-via {"via_cell": "cellB"} sets route_via
- Updates WG AllowedIPs to include 0.0.0.0/0 for the exit cell peer
- Adds ip rule + ip route in policy table inside cell-wireguard so the
specific peer's traffic egresses via cellB's WG IP
- Sets exit_relay_active on the cell link and pushes use_as_exit_relay=True
to cellB via peer-sync
Cell B side:
- Receives use_as_exit_relay in the peer-sync payload
- Calls apply_cell_rules(..., exit_relay=True) to add FORWARD -o eth0 ACCEPT
- Stores remote_exit_relay_active flag for startup recovery
Startup recovery:
- apply_all_cell_rules passes exit_relay=remote_exit_relay_active (cellB)
- _apply_startup_enforcement reapplies ip rule for each peer with route_via (cellA)
since policy routing rules don't survive container restart
peer_registry gets route_via field with lazy migration.
22 new tests across test_cell_link_manager, test_peer_registry, test_peer_route_via.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the ability for a cell to signal to a peer that it's willing to
route internet traffic on their behalf. This is the signaling layer
for Phase 3 (per-peer routing via exit cell).
Changes:
- cell_links.json: exit_offered (bool) + remote_exit_offered (bool)
fields with lazy migration (default false for existing records)
- _push_permissions_to_remote: includes exit_offered in the push body
- apply_remote_permissions: accepts exit_offered kwarg; stores it as
remote_exit_offered on the matching cell link
- peer-sync receiver: passes exit_offered from body to apply_remote_permissions
- CellLinkManager.set_exit_offered(cell_name, offered): persists +
triggers push so the remote learns of our offer immediately
- PUT /api/cells/<name>/exit-offer: REST endpoint to toggle the flag
- 12 new tests covering all new paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When PIC A updates service sharing permissions, it immediately pushes
the mirrored state to PIC B over the WireGuard tunnel so B's UI shows
what A is sharing with it in real time.
Architecture:
- Push model: update_permissions() → _push_permissions_to_remote() →
POST /api/cells/peer-sync/permissions on remote cell
- Auth: source IP must be inside a known cell's vpn_subnet (WireGuard
tunnel proves identity) + body's from_public_key must match stored key
- Mirror semantics: our inbound (what we share) → their outbound view
- Non-fatal: push failures set pending_push=True; replay_pending_pushes()
retries at startup so offline cells catch up on reconnect
- add_connection() also pushes initial state so remote sees permissions
immediately on the first connect
New fields on cell_links.json records (lazy-migrated):
remote_api_url, last_push_status, last_push_at, last_push_error,
pending_push, last_remote_update_at
New endpoint: POST /api/cells/peer-sync/permissions
30 new tests (1101 total).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 1 — connection fixes:
- routing_manager.stop(): remove iptables -F / -t nat -F nuclear flush that
would wipe WireGuard MASQUERADE and all peer rules on any UI stop action
- wireguard_manager.add_cell_peer(): reject vpn_subnet that overlaps the local
WG network (routing blackhole — was the root cause of no handshake)
- wireguard_manager._syncconf(): pass Endpoint to 'wg set' so cell peers with
static endpoints are synced to the kernel (not just AllowedIPs)
Phase 2 — service-sharing permissions backend:
- firewall_manager: add _cell_tag(), clear_cell_rules(), apply_cell_rules(),
apply_all_cell_rules() — iptables FORWARD rules for cell-to-cell traffic
using 'pic-cell-<name>' comment tags, distinct from 'pic-peer-*'
- app.py startup enforcement: call apply_all_cell_rules(cell_links) so rules
survive API restarts
- cell_link_manager: permissions schema {inbound, outbound} per service;
lazy migration for existing entries; update_permissions(), get_permissions();
apply_cell_rules wired into add_connection/remove_connection
- routes/cells.py: GET /api/cells/services, GET+PUT /api/cells/<n>/permissions;
RuntimeError now returns 400 (not 500) from add_connection
Removed broken 'test' cell (subnet 10.0.0.0/24 collided with local WG network).
Second PIC must use a distinct subnet (e.g. 10.0.1.0/24) before reconnecting.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>