Site-to-site WireGuard tunnels between PIC cells with automatic DNS forwarding.
Each cell generates an invite JSON (public key, endpoint, VPN subnet, DNS IP,
domain); the remote cell imports it to establish a bidirectional tunnel and a
CoreDNS forwarding block, so each cell's domain resolves across the mesh.
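A minimal sketch of what the invite payload might look like (the field names here are assumptions for illustration; the actual schema lives in CellLinkManager):

```python
import json

def build_invite(public_key: str, endpoint: str, subnet: str,
                 dns_ip: str, domain: str) -> str:
    # Illustrative invite payload; field names are assumed, not the real schema.
    return json.dumps({
        "public_key": public_key,  # this cell's WireGuard public key
        "endpoint": endpoint,      # host:port the remote cell should dial
        "subnet": subnet,          # this cell's VPN subnet to route
        "dns_ip": dns_ip,          # CoreDNS address reachable via the tunnel
        "domain": domain,          # zone the remote CoreDNS should forward
    })

invite = build_invite("ExamplePublicKey=", "203.0.113.5:51820",
                      "10.0.0.0/24", "172.20.0.3", "alpha.cell")
```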
Backend:
- CellLinkManager: invite generation, add/remove connections, live WireGuard
handshake status; stores links in data/cell_links.json
- WireGuardManager: add_cell_peer() accepts subnet CIDRs (not /32) and an
optional endpoint for site-to-site peers; _read_iface_field() reads port,
address, and network directly from wg0.conf at runtime instead of constants
- NetworkManager: add/remove CoreDNS forwarding blocks per remote cell domain
- app.py: /api/cells/* routes; _next_peer_ip() derives VPN range from
configured address so peer allocation follows any address change
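The _next_peer_ip idea can be sketched like this (a simplified stand-in, assuming the server's Address line and a set of already-used IPs):

```python
import ipaddress

def next_peer_ip(configured_address: str, used: set) -> str:
    # Derive the peer pool from the configured Address (e.g. "10.0.0.1/24")
    # instead of a hardcoded subnet, so allocation follows address changes.
    iface = ipaddress.ip_interface(configured_address)
    for host in iface.network.hosts():
        candidate = str(host)
        if candidate != str(iface.ip) and candidate not in used:
            return candidate
    raise RuntimeError("VPN subnet exhausted")

# After changing Address to 10.9.0.1/24, allocation follows the new range:
ip = next_peer_ip("10.9.0.1/24", {"10.9.0.2"})
```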
Frontend:
- CellNetwork page: invite panel (JSON + QR), connect form (paste JSON),
connected cells list (green/red status, disconnect button)
- App.jsx: Cell Network nav entry and route
Tests: 25 new tests across test_wireguard_manager, test_network_manager,
test_cell_link_manager (263 total)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend:
- wireguard_manager: _get_configured_port/address/network() read from wg0.conf
instead of module-level constants; get_split_tunnel_ips() derives VPN network
from configured Address; get_server_config() returns configured port, dns_ip,
split_tunnel_ips, vpn_network
- add_peer() and get_peer_config() use configured port (not hardcoded 51820)
- _next_peer_ip() derives subnet from wireguard_manager._get_configured_address()
so new peers are allocated IPs from the correct VPN range after address change
- refresh-ip and check-port API endpoints return configured port, not 51820
- PUT /api/config: when wireguard port/address changes, all peers are marked
config_needs_reinstall so users know to re-download tunnel configs
- get_peer_config endpoint: uses configured split tunnel IPs (not hardcoded)
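A sketch of the runtime config-read approach (a simplified _read_iface_field-style helper; the real parser may differ):

```python
WG_CONF = """\
[Interface]
Address = 10.0.0.1/24
ListenPort = 51821
PrivateKey = abc=
"""

def read_iface_field(conf_text: str, field: str):
    # Pull a key from the [Interface] section of wg0.conf at call time,
    # rather than trusting module-level constants that can go stale.
    in_iface = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_iface = stripped.lower() == "[interface]"
        elif in_iface and "=" in stripped:
            key, _, value = stripped.partition("=")
            if key.strip().lower() == field.lower():
                return value.strip()
    return None

port = read_iface_field(WG_CONF, "ListenPort")  # "51821", not the old default
```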
Frontend:
- Peers.jsx: SERVICES domains use live domain from ConfigContext; generateConfig()
uses serverConf.dns_ip and serverConf.split_tunnel_ips; vpn_network shown in
peer-access description; DNS hint uses live domain; server config loaded at
mount time so it is available without re-fetching on every peer action;
handleUpdatePeer uses /32 for server-side AllowedIPs (was incorrectly using
full/split tunnel CIDRs which the backend rejects)
- WireGuard.jsx: generateWireGuardConfig() uses serverConfig.dns_ip,
split_tunnel_ips from server-config API; split-tunnel description shows
live IPs
Tests: 9 new tests in TestWireGuardConfigReads verify all config reads
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Changes:
- ConfigContext.jsx: React context that loads /api/config once; exposes domain,
cell_name, refresh() — wraps entire app in App.jsx
- Email/Calendar/Files pages: replace hardcoded 'mail.cell', 'calendar.cell',
'files.cell', 'webdav.cell' with domain from ConfigContext; hostname updates
immediately after Settings save (refreshConfig() called on save)
- /api/status: cell_name and domain now read from stored _identity in config_manager,
not hardcoded 'personal-internet-cell' / 'cell.local'
- network_manager.apply_cell_name(old, new): updates hostname A-record in primary
zone file and reloads CoreDNS; called from PUT /api/config when cell_name changes
- Old identity captured before save so apply_cell_name gets the correct old value
- Settings EmailForm: smtp/imap ports are read-only with note (docker-compose.yml level)
- Settings FilesForm: port is read-only with note (Caddy proxies on 80 externally)
- Settings CalendarForm: port labeled "Internal port; clients use 80 via Caddy"
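The apply_cell_name zone edit might look roughly like this (a sketch over an in-memory zone; the real code edits the primary zone file and reloads CoreDNS):

```python
def rename_host_record(zone_text: str, old: str, new: str) -> str:
    # Rename the hostname A record while leaving all other records intact.
    out = []
    for line in zone_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == old and fields[2] == "A":
            fields[0] = new
            out.append("\t".join(fields))
        else:
            out.append(line)
    return "\n".join(out)

zone = "old-cell\tIN\tA\t172.20.0.2\nmail\tIN\tA\t172.20.0.2"
updated = rename_host_record(zone, "old-cell", "new-cell")
```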
Tests added:
- test_apply_cell_name_renames_host_record
- test_apply_cell_name_noop_when_same
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
config_manager restore_config and import_config previously injected zero-filled
entries (port=0, domain='') for every service schema regardless of whether that
service was in the backup/import data. Removed this logic — only restore what's
actually in the backup.
network_manager.apply_domain now:
- updates dnsmasq.conf domain= line (reload cell-dhcp)
- rewrites Corefile zone blocks to the new domain name
- renames and rewrites the primary zone file $ORIGIN + SOA records
- reloads CoreDNS
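The Corefile rewrite step can be sketched as follows (assuming zone blocks open with `domain:53 {`; the zone file $ORIGIN/SOA rewrite is handled separately):

```python
def rewrite_corefile_zones(corefile: str, old_domain: str, new_domain: str) -> str:
    # Rename matching zone-block headers; leave the catch-all ".:53" alone.
    out = []
    for line in corefile.splitlines():
        if line.rstrip().endswith("{") and line.startswith(old_domain + ":"):
            out.append(line.replace(old_domain, new_domain, 1))
        else:
            out.append(line)
    return "\n".join(out)

corefile = ("cell.local:53 {\n    file /etc/coredns/cell.zone\n}\n"
            ".:53 {\n    forward . 1.1.1.1\n}")
updated = rewrite_corefile_zones(corefile, "cell.local", "home.cell")
```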
Tests added first (TDD):
- test_restore_does_not_zero_unconfigured_services
- test_restore_does_not_zero_import
- test_apply_domain_updates_corefile (zone file + Corefile)
- test_apply_domain_updates_dnsmasq
- test_apply_config_writes_dhcp_range / ntp_servers
- test_apply_config_updates_mailserver_env / no_domain_no_restart
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each service manager now has apply_config() that writes to the actual config:
- network: dhcp_range → dnsmasq.conf (reload cell-dhcp), ntp_servers → chrony.conf
(restart cell-ntp), domain → dnsmasq.conf domain= line
- email: domain → mailserver.env OVERRIDE_HOSTNAME + POSTMASTER_ADDRESS,
restart cell-mail
- wireguard: port/address/private_key → wg0.conf ListenPort/Address/PrivateKey,
restart cell-wireguard
- calendar: port → radicale config hosts=, restart cell-radicale
PUT /api/config now calls apply_config() after persisting JSON, and returns
{restarted: [...], warnings: [...]} so Settings UI can show which containers
were restarted. _restart_container() helper added to BaseServiceManager.
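One apply_config() step, sketched for the network manager (a simplified stand-in showing the dnsmasq rewrite and the {restarted, warnings} shape the UI consumes):

```python
def apply_network_config(dnsmasq_text: str, new_domain: str):
    # Rewrite the domain= line; report which container would need a reload.
    restarted, out = [], []
    for line in dnsmasq_text.splitlines():
        if line.startswith("domain=") and line != f"domain={new_domain}":
            line = f"domain={new_domain}"
            restarted.append("cell-dhcp")
        out.append(line)
    return "\n".join(out), {"restarted": restarted, "warnings": []}

conf, result = apply_network_config(
    "domain=cell.local\ndhcp-range=10.0.1.100,10.0.1.200", "home.cell")
```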
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- PUT /api/config now calls service_manager.update_config() for each service
so changes write to the service's own config file, not just cell_config.json
- email_manager.get_status() now reads smtp_port/imap_port/domain from its
config file (defaults: 587/993/cell.local) and includes them in the response
- calendar_manager.get_status() includes configured port (default 5232)
- file_manager.get_status() uses configured port from service config
- Email.jsx reads imap_port/smtp_port from API status instead of hardcoding
- Settings service sections show "port changes require container restart" note
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
app.py:
- Alert logic now checks status.running (container up/down) instead of healthy
  (which requires connectivity tests), so alerts fire only when a service is actually down
- Add POST /api/health/history/clear endpoint to reset history + alert counters
log_manager.py:
- get_all_log_file_infos: include rotated backup files (*.log.1, *.log.2 ...) in listing,
marked with backup=true so UI can dim them and hide rotate button
api.js: add monitoringAPI.clearHealthHistory
Logs page:
- Health History: add Clear button with confirmation
- File panel: show full filename (including .log.1 backups), explain host path and naming,
hide rotate button for backup files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Service manager fixes (connectivity tests):
- email_manager: replace telnet with socket.create_connection for SMTP/IMAP;
replace nslookup with socket.getaddrinfo for DNS; exclude unconfigured domain
from success (email healthy=False now correctly means ports refused, not missing domain)
- calendar_manager: replace localhost:5232 with cell-radicale:5232;
fix database check to test dir writability instead of file existence (files created on demand)
- file_manager: replace localhost:8080 with cell-webdav:80; add top-level success key
- network_manager: replace nslookup with socket.getaddrinfo;
add success key to dhcp_test and ntp_test return values
- routing_manager: exclude iptables_access from success
(iptables runs in cell-wireguard, not API container)
- wireguard_manager: add success key to no-arg test_connectivity result
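The socket-based replacements boil down to something like this (a sketch; timeouts and hostnames in the real managers may differ):

```python
import socket

def tcp_check(host: str, port: int, timeout: float = 3.0) -> bool:
    # Replacement for the telnet probe: a plain TCP connect is enough to
    # see whether an SMTP/IMAP port accepts connections.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def dns_check(name: str) -> bool:
    # Replacement for shelling out to nslookup: use the resolver directly.
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False
```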
Health history UI:
- SvcCol reads data?.status?.running || data?.status?.status — handles nested health check shape
Result: network/wireguard/calendar/files/routing/vault all healthy=True.
Email healthy=False is correct — mail server needs ≥1 account before Dovecot starts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docker-compose.yml:
- Add json-file logging driver (max-size: 10m, max-file: 5) to all 13 containers
- Docker now owns container stdout/stderr rotation automatically
- Add ./data/logs:/app/api/data/logs volume to API — service logs now persist across restarts
log_manager.py:
- Remove container log collection hack (Docker handles it natively)
- Add set_service_level(service, level) — change log level at runtime without restart
- Add get_service_levels() — return current per-service levels
- Simplify get_all_log_file_infos to return only service log files
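The runtime verbosity change can be sketched like this (service names here are illustrative; the real manager also persists levels to log_levels.json):

```python
import logging

SERVICE_LOGGERS = {
    "network": logging.getLogger("network"),
    "wireguard": logging.getLogger("wireguard"),
}

def set_service_level(service: str, level: str) -> None:
    # Adjust the logger object directly: takes effect immediately,
    # no container restart required.
    SERVICE_LOGGERS[service].setLevel(getattr(logging, level.upper()))

def get_service_levels() -> dict:
    return {name: logging.getLevelName(lg.getEffectiveLevel())
            for name, lg in SERVICE_LOGGERS.items()}

set_service_level("network", "debug")
levels = get_service_levels()
```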
app.py:
- Add GET /api/logs/verbosity — return current per-service log levels
- Add PUT /api/logs/verbosity — update levels at runtime, persist to config/log_levels.json
- Load persisted log level overrides at startup from log_levels.json
- Simplify rotate endpoint (service logs only, container logs owned by Docker)
wireguard_manager.py:
- get_keys(): return empty strings if key files don't exist (prevents a get_status
  crash when wg0.conf is missing at startup and the call falls through to generate_config)
Logs page (4 tabs):
- API Service Logs: structured JSON logs from Python managers, with search/filter/rotate panel
- Container Logs: live docker logs (read via existing /api/containers/<name>/logs endpoint)
- Verbosity Config: per-service level dropdowns, apply immediately + persist
- Health History: existing health poll table
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- log_manager: add collect_container_logs (appends docker logs to container_<name>.log),
get_container_log_lines, rotate_container_log, get_all_log_file_infos
- app.py: new endpoints /api/logs/files (all log file sizes), /api/logs/containers/<name>
(collect+return stored container logs); rotate endpoint now handles both service and container logs
- Logs page: split into API Service Logs tab (python manager logs) and Container Logs tab
(persistent docker stdout/stderr); Statistics tab shows both kinds with per-row rotate;
each tab has a description explaining what it shows and where files live
- wireguard_manager: test_connectivity peer_ip=None guard (already in previous commit, now rebuilt)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- WireGuardManager.test_connectivity: make peer_ip optional so health_check
can call it without args (was logging ERROR on every health poll)
- Logs page: add ALL option to service selector (uses search across all services)
- Logs page: show service tag on each log line when in ALL/search mode
- Logs page: require window.confirm before rotating logs to prevent accidental data loss
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The connectivity endpoint was calling routing_manager.test_connectivity()
(no args, internal health check) instead of test_routing_connectivity(target_ip).
Also ping/traceroute aren't installed in the API container; run them via
docker exec cell-wireguard instead.
Updated test_api_endpoints to mock test_routing_connectivity and cover
the new DELETE /firewall/<id> and GET /live-iptables endpoints.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Added a path guard: if the config file resolves to /tmp/ or a pytest
temp dir, _syncconf bails out immediately. Without this, tests calling
add_peer/remove_peer with a temp-dir WireGuardManager would connect to
the live cell-wireguard container and remove production peers.
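The guard might look roughly like this (function name is illustrative; the real check lives inside _syncconf):

```python
from pathlib import Path

def syncconf_allowed(config_path: str) -> bool:
    # Refuse to push config to the live container when the file resolves
    # into /tmp or a pytest temp dir, as it does for a WireGuardManager
    # constructed by unit tests.
    resolved = str(Path(config_path).resolve())
    return not ("/tmp/" in resolved or "/pytest-" in resolved)
```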
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Server-side access control:
- firewall_manager.py: per-peer iptables FORWARD rules in WireGuard container;
virtual IPs on Caddy (172.20.0.21-24) for per-service DROP/ACCEPT targeting
- CoreDNS Corefile regenerated with ACL blocks for blocked services per peer
- POST /api/wireguard/apply-enforcement re-applies rules after WireGuard restart;
wg0.conf PostUp calls it via curl so rules restore automatically on container start
WireGuard fixes:
- _syncconf uses `wg set peer` instead of `wg syncconf` to avoid resetting ListenPort
- add_peer validates AllowedIPs must be /32 — rejects full/split tunnel CIDRs that
would route internet or LAN traffic to that peer
- _config_file() checks for linuxserver wg_confs/ subdirectory first
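The /32 validation amounts to something like this (a sketch of the add_peer check):

```python
import ipaddress

def validate_peer_allowed_ips(allowed_ips: str) -> str:
    # A server-side peer entry must be a single /32 host route.
    # Full/split tunnel CIDRs (0.0.0.0/0, 10.0.0.0/24, ...) belong in the
    # client config only; on the server they would route internet or LAN
    # traffic into that peer.
    network = ipaddress.ip_network(allowed_ips, strict=True)
    if network.prefixlen != 32:
        raise ValueError(f"AllowedIPs must be a /32 host, got {allowed_ips}")
    return str(network)

ok = validate_peer_allowed_ips("10.0.0.5/32")
```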
UI:
- Peers page fetches /api/wireguard/peers/statuses for live handshake data;
status badge now shows real Online/Offline + seconds since last handshake
- IP field removed from Add Peer form (auto-assigned from 10.0.0.0/24)
Tests (246 pass):
- test_firewall_manager.py: 22 tests for ACL generation, iptables rule correctness,
comment tagging, clear_peer_rules filter logic
- test_peer_wg_integration.py: 10 tests for /32 enforcement, IP auto-assignment,
syncconf called on add/remove
- test_wireguard_manager.py: updated to reflect correct IPs and /32 requirement
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Peer creation/edit form now configures:
- Tunnel mode: full (0.0.0.0/0) or split (PIC only)
- Per-service access toggles (calendar, files, mail, webdav)
- Peer-to-peer communication toggle
- Optional calendar account creation
- Access capability badges in peer list
Bug fixes:
- DNS in client configs was 8.8.8.8 / 172.20.0.2 — now 172.20.0.3 (CoreDNS)
This was why .cell domains didn't resolve on connected VPN peers
- get_peer_config API uses stored internet_access to set AllowedIPs
- New PUT /api/peers/<name> endpoint with config_changed detection
- POST /api/peers/<name>/clear-reinstall clears reinstall flag after download
- Routing page reads real host routes via /proc/1/net/route (pid: host)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- WireGuard default changed to full tunnel (0.0.0.0/0) — all peer traffic
routes through PIC server so internet latency matches server's clean 41ms
- UI tunnel toggle now defaults to Full tunnel
- API /peers/config accepts allowed_ips param so UI toggle wires through
- Routing page reads real host routes via /proc/1/net/route (pid: host)
instead of mock data; shows ens18/192.168.31.1 correctly
- Add iproute2 + util-linux to API Dockerfile
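Reading /proc/1/net/route means decoding little-endian hex fields; a sketch of one line (the sample route values are illustrative):

```python
import socket
import struct

def parse_route_line(line: str):
    # /proc/net/route stores destination and gateway as little-endian
    # 32-bit hex; unpack them back into dotted-quad strings.
    fields = line.split()
    iface, dest_hex, gw_hex = fields[0], fields[1], fields[2]
    dest = socket.inet_ntoa(struct.pack("<L", int(dest_hex, 16)))
    gateway = socket.inet_ntoa(struct.pack("<L", int(gw_hex, 16)))
    return iface, dest, gateway

# A default-route line as it might appear in /proc/net/route:
iface, dest, gw = parse_route_line(
    "ens18\t00000000\t011FA8C0\t0003\t0\t0\t0\t00000000\t0\t0\t0")
```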
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
wg show outputs "listening port", not "listen port"; the substring mismatch
caused the port status to always show Blocked. Also add webdav.cell, webmail.cell,
and api.cell to the Caddyfile and cell.zone so VPN peers can reach all services.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Assign static IPs to all 13 containers (172.20.0.2–13) so DNS zone
records match actual container IPs regardless of start order.
- Update cell.zone: all .cell domains now point to cell-caddy (172.20.0.2)
which is the correct single entry point via Caddy reverse proxy.
- Create config/radicale/config so the calendar container actually starts.
- Fix webdav: replace empty users.passwd with USERNAME/PASSWORD env vars.
- Fix DNS fallback IP in wireguard_manager: 172.20.0.2→172.20.0.3 (cell-dns).
- Remove duplicate http://ui.cell from Caddyfile.
- Add persistent data volumes for rainloop and filegator.
- Fix mail domainname placeholder (yourdomain.com→cell.local).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- check_port_open now checks if wg0 interface is actually listening (via
'wg show wg0') instead of requiring a live peer handshake. This means
the port shows 'Open' whenever WireGuard is running, not only when a
peer has connected recently.
- get_peer_config defaults to split-tunnel AllowedIPs (10.0.0.0/24,
172.20.0.0/16) so VPN clients only route cell service traffic through
the tunnel. Local LAN traffic (192.168.x.x etc.) stays direct, fixing
the 60-120ms penalty when pinging local hosts while on VPN.
- Peer config modal now uses cell DNS (172.20.0.2) so .cell domains
resolve correctly with both split and full tunnel.
- Added split/full tunnel toggle in the peer config modal so users can
download either config variant.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a full-tunnel VPN client pings the server's own public IP, traffic
loops out through Docker's external interface and back, causing 60-120ms
jitter. The DNAT PostUp rule intercepts packets from wg0 destined for the
public IP and redirects them to 10.0.0.1 (the VPN interface), keeping
traffic entirely inside the tunnel.
Also updates SERVER_ADDRESS from 172.20.0.1/16 to 10.0.0.1/24 to avoid
routing conflict with the Docker bridge network on eth0.
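In wg0.conf terms, the rule described above might look like this (PUBLIC_IP stands in for the detected external address; exact flags are a sketch, not the committed config):

```ini
[Interface]
Address = 10.0.0.1/24
# Intercept wg0 traffic aimed at the server's public IP and redirect it to
# the VPN interface so it never leaves the tunnel:
PostUp = iptables -t nat -A PREROUTING -i wg0 -d PUBLIC_IP -j DNAT --to-destination 10.0.0.1
PostDown = iptables -t nat -D PREROUTING -i wg0 -d PUBLIC_IP -j DNAT --to-destination 10.0.0.1
```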
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix CoreDNS not loading .cell zones (wrong Corefile path, now uses -conf flag)
- Fix WireGuard server address conflict (172.20.0.1/16 overlapped with Docker
network; changed to 10.0.0.1/24 to eliminate duplicate routes)
- Add SERVERMODE=true and sysctls to WireGuard docker-compose for server mode
- Fix DNS zone file parser to handle 4-field records (name IN type value)
- Add get_dns_records() to NetworkManager; mount data/dns into API container
- Fix peer config endpoint: look up IP/key from registry, use real endpoint
- Add bulk peer statuses endpoint keyed by public_key
- Normalize snake_case API fields to camelCase in WireGuard UI
- Add port check endpoint (checks via live handshake, not unreliable TCP probe)
- Add Caddy virtual hosts for ui/calendar/files/mail .cell domains (HTTP only)
- Fix cell config domain default from cell.local to cell
- Fix Routing Network Config tab (was calling hardcoded localhost:3000)
- Fix DNS records display (record.value not record.ip)
- Move service access guide to top of Dashboard with login hints
- Add /api/routing/setup endpoint
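The 4-field record fix can be sketched like this (a simplified parser accepting both record forms):

```python
def parse_zone_record(line: str):
    # Accept both "name IN type value" (4 fields) and "name type value"
    # (3 fields); return None for anything else.
    fields = line.split()
    if len(fields) == 4 and fields[1].upper() == "IN":
        name, rtype, value = fields[0], fields[2], fields[3]
    elif len(fields) == 3:
        name, rtype, value = fields
    else:
        return None
    return {"name": name, "type": rtype, "value": value}

record = parse_zone_record("mail IN A 172.20.0.2")
```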
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- WireGuardManager: get_external_ip() (cached 1h), check_port_open(),
get_server_config() returning public_key + detected endpoint
- API: /api/wireguard/server-config returns real external IP;
/api/wireguard/refresh-ip forces re-detection;
/api/wireguard/peers/config now looks up peer IP + private key from
registry and uses real server endpoint automatically
- Fix doubled port in Endpoint (178.x:51820:51820 → 178.x:51820)
- Fix Address=/32 when peer_ip already has mask
- WebUI nginx: proxy /api/ and /health to cell-api (fixes localhost:3000
hardcode — UI now works from any machine)
- api.js: baseURL='' so all calls go through nginx proxy
- WireGuard page: show Server Endpoint card with external IP, endpoint,
public key, and Refresh IP button
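The 1-hour cache in get_external_ip() can be sketched as follows (the fetch callable is injected here for illustration; the real code queries an external detection service, and refresh-ip corresponds to force=True):

```python
import time

class ExternalIPCache:
    # Remember the detected external IP for an hour so every status poll
    # doesn't hit the detection service.
    TTL = 3600  # seconds

    def __init__(self, fetch):
        self._fetch = fetch
        self._ip = None
        self._at = 0.0

    def get(self, force: bool = False) -> str:
        if force or self._ip is None or time.time() - self._at > self.TTL:
            self._ip = self._fetch()
            self._at = time.time()
        return self._ip

cache = ExternalIPCache(lambda: "203.0.113.5")
ip = cache.get()
```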
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>