diff --git a/CLAUDE.md b/CLAUDE.md
index a78fc37..2904ed5 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,87 +1,282 @@
 # CLAUDE.md

-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+This file is the primary context source for Claude Code in this repository. Read it fully before touching any code.

-## What This Project Is
+---

-**Personal Internet Cell (PIC)** — a self-hosted digital infrastructure platform. It manages DNS, DHCP, NTP, WireGuard VPN, email, calendar/contacts (CalDAV), file storage (WebDAV), reverse proxy (Caddy), a certificate authority, and container orchestration, all from a single API + React UI.
+## Project Overview

-## Common Commands
+**Personal Internet Cell (PIC)** is a self-hosted digital infrastructure platform for individuals who want full ownership of their core internet services without relying on cloud providers.

-```bash
-# Full stack
-make start          # docker-compose up -d
-make stop           # docker-compose down
-make restart        # docker-compose restart
-make status         # docker status + API health
-make logs           # docker-compose logs -f
-make build          # rebuild api image
+A PIC instance runs DNS, DHCP, NTP, WireGuard VPN, email (SMTP/IMAP), calendar/contacts (CalDAV/CardDAV), file storage (WebDAV), HTTPS reverse proxy (Caddy), an internal certificate authority, and optional third-party services — all managed from a single REST API and a React web UI. No manual config-file editing is required for normal operations.

-# Tests
-make test           # pytest tests/ api/tests/
-make test-coverage  # pytest with coverage HTML report
-make test-api       # pytest tests/test_api_endpoints.py
-pytest tests/test_.py  # single test file
+**Primary users:** technically capable individuals, homelab operators, small families or teams.

-# Local dev (no Docker)
-pip install -r api/requirements.txt
-python api/app.py   # Flask API on :3000
+**What the product optimizes for:**
+- One-command install, browser-based first-run wizard, no manual `.env` editing for identity
+- Everything managed through the API and UI — the user should never need to `ssh` for day-to-day operations
+- Security by default: session auth, CSRF protection, WireGuard isolation, internal CA, no open API port
+- Reliability and observability: structured logs, health monitoring, automated config backups

-cd webui && npm install && npm run dev  # React UI on :5173 (proxies API to :3000)
+**Key constraints:**
+- Runs on a single Linux host with Docker; no Kubernetes, no swarm
+- Must work on Debian, Ubuntu, Fedora, RHEL, and Alpine
+- The Flask API must never be exposed directly; Caddy always proxies it
+- All secrets live in `data/` (git-ignored), never in the repo

-# WireGuard
-make show-routes
-make add-peer PEER_NAME=foo PEER_IP=10.0.0.5 PEER_KEY=
-make list-peers
-```
+---
+
+## Tech Stack
+
+### Backend
+- **Python 3.11** — Flask REST API (`api/app.py`)
+- **Flask** — routing, sessions, before-request hooks (enforce_setup, enforce_auth, check_csrf)
+- **bcrypt** — password hashing in `AuthManager`
+- **Docker SDK for Python** — container lifecycle in `ContainerManager`
+- **PyNaCl / Age** — encryption in `VaultManager`
+- **pyotp** — TOTP for DDNS registration
+
+### Frontend
+- **React 18** — SPA
+- **Vite** — dev server and build (proxies `/api` → `:3000`)
+- **Tailwind CSS** — all styling; no custom CSS files
+- **Axios** — all API calls go through `src/services/api.js`
+
+### Infrastructure
+- **Docker Compose** — all 12+ service containers
+- **Caddy** — reverse proxy, TLS termination (Let's Encrypt DNS-01 or HTTP-01 or internal CA)
+- **CoreDNS** — `.cell` TLD authoritative DNS
+- **dnsmasq** — DHCP
+- **chrony** — NTP
+- **WireGuard** — VPN (kernel module, not userspace)
+- **Postfix + Dovecot** — email via `docker-mailserver`
+- **Radicale** — CalDAV/CardDAV
+- **PowerDNS** — authoritative DNS on the DDNS VPS (separate repo: `pic-ddns`)
+
+### CI/CD
+- **Gitea Actions** — unit tests on every push, image builds on tag
+- **act_runner** — self-hosted runner on pic0 (192.168.31.51)
+- **Gitea Container Registry** — images pushed to `git.pic.ngo`
+
+Do not introduce: Redux, styled-components, SQLAlchemy, Celery, or any async framework (asyncio/FastAPI) into the main API unless explicitly requested.
+
+---

 ## Architecture

-### Backend (`api/`)
+```
+Browser / WireGuard peer
+  └── Caddy (:80/:443)                 TLS termination, reverse proxy
+      └── React SPA (:8081)            Vite + Tailwind (Nginx in container)
+      └── Flask API (:3000)            REST API, bound to 127.0.0.1 only
+          ├── NetworkManager           CoreDNS, dnsmasq, chrony
+          ├── WireGuardManager         WireGuard peer lifecycle
+          ├── PeerRegistry             peer registration and trust
+          ├── EmailManager             Postfix + Dovecot
+          ├── CalendarManager          Radicale CalDAV/CardDAV
+          ├── FileManager              WebDAV + Filegator
+          ├── RoutingManager           iptables NAT and routing
+          ├── FirewallManager          iptables INPUT/FORWARD rules
+          ├── VaultManager             internal CA, TLS certs, Age encryption
+          ├── ContainerManager         Docker SDK
+          ├── CellLinkManager          site-to-site WireGuard links
+          ├── ConnectivityManager      per-peer exit routing (WG ext, OpenVPN, Tor)
+          ├── DDNSManager              dynamic DNS heartbeat
+          ├── ServiceStoreManager      optional service install/remove
+          ├── CaddyManager             Caddyfile generation and reload
+          ├── AuthManager              bcrypt passwords, session auth, RBAC
+          └── SetupManager             first-run wizard state
+```

-All service managers inherit `BaseServiceManager` (`api/base_service_manager.py`). This enforces a consistent interface: `get_status()`, `get_config()`, `update_config()`, `validate_config()`, `test_connectivity()`, `get_logs()`, `restart_service()`. When adding or modifying a service manager, follow this pattern.
+### Key files

-The `ServiceBus` (`api/service_bus.py`) is a pub/sub event system used for inter-service communication. Services publish events (e.g., `SERVICE_STARTED`, `CONFIG_CHANGED`, `PEER_CONNECTED`) and subscribe to events from dependencies. Dependency graph is declared in the bus — e.g., `wireguard` depends on `network`; `email` depends on `network` and `vault`.
+| File | Role |
+|---|---|
+| `api/app.py` | Flask app, all REST endpoints, before-request hooks, health monitor thread |
+| `api/managers.py` | Singleton instantiation of all service managers |
+| `api/base_service_manager.py` | Abstract base class: `get_status`, `get_config`, `update_config`, `validate_config`, `test_connectivity`, `get_logs`, `restart_service` |
+| `api/config_manager.py` | Single source of truth for `cell_config.json` — all read/write goes through here |
+| `api/service_bus.py` | Pub/sub event system between managers |
+| `webui/src/services/api.js` | Axios API client — all UI→API calls |
+| `docker-compose.yml` | Container definitions and network topology |
+| `Makefile` | All operational commands |
+| `install.sh` | Bash installer served via `https://install.pic.ngo` |

-`ConfigManager` (`api/config_manager.py`) is the single source of truth. Config lives in `/app/config/cell_config.json` (mapped from `config/api/`). All managers read/write through ConfigManager, which validates against per-service schemas and maintains automatic backups.
+### Directory layout

-`LogManager` (`api/log_manager.py`) provides structured JSON logging with rotation (5 MB / 5 backups per service). Use it instead of `print()` or raw `logging`.

+```
+api/                 Flask API and all service managers
+webui/               React SPA (Vite + Tailwind)
+tests/               pytest unit tests (no running services required)
+tests/integration/   require a running PIC stack
+tests/e2e/           Playwright UI and WireGuard e2e tests
+config/              Runtime config per service (mostly git-ignored)
+data/                Runtime secrets and state (fully git-ignored)
+scripts/             Setup and maintenance scripts
+install.sh           One-line installer entry point
+Makefile             All make targets
+docker-compose.yml
+```

-`app.py` (2000+ lines) contains all Flask REST endpoints, organized by service. It runs a background health-monitoring thread.
+### Config and secrets

-Service managers:
-- `network_manager.py` — DNS (CoreDNS), DHCP (dnsmasq), NTP (chrony)
-- `wireguard_manager.py` — VPN peer lifecycle, QR codes
-- `peer_registry.py` — peer registration/lookup
-- `routing_manager.py` — NAT, firewall rules, VPN gateway
-- `vault_manager.py` — internal certificate authority
-- `email_manager.py` — Postfix + Dovecot
-- `calendar_manager.py` — Radicale CalDAV/CardDAV
-- `file_manager.py` — WebDAV storage
-- `container_manager.py` — Docker SDK wrappers
-- `cell_manager.py` — top-level orchestration
+- Runtime config: `config/api/cell_config.json` — managed by `ConfigManager`, never edit directly
+- Secrets and user data: `data/` — git-ignored, contains `auth_users.json`, WireGuard keys, DDNS token, CA key
+- DDNS config lives under the top-level `ddns` key in `cell_config.json`, accessed via `config_manager.configs.get('ddns', {})`
+- Do not read `_identity.domain` expecting a dict — it is a plain string (the domain mode, e.g. `"pic_ngo"`)

-### Frontend (`webui/`)
+### Before-request hooks (app.py)

-React 18 + Vite + Tailwind CSS. All API calls go through `src/services/api.js` (Axios). Vite dev server proxies `/api` to `localhost:3000`. Pages in `src/pages/`, shared components in `src/components/`.
+Three hooks run on every request in this order:
+1. `enforce_setup` — returns 428 for all `/api/*` except `/api/setup/*` and `/health` until setup is complete. Skipped when `app.config['TESTING']` is True.
+2. `enforce_auth` — returns 401 if no session; returns 503 if users file exists but is empty (misconfiguration). Skipped when `app.config['TESTING']` is True.
+3. `check_csrf` — requires `X-CSRF-Token` header on all mutating requests except `/api/auth/*` and `/api/setup/*`.

-### Infrastructure
+---

-`docker-compose.yml` defines 13 services on a custom bridge network `cell-network` (172.20.0.0/16). Cell IPs default to 10.0.0.0/24. Key ports: 53 (DNS), 80/443 (Caddy), 3000 (API), 5173/8081 (WebUI), 51820/udp (WireGuard), 25/587/993 (mail), 5232 (CalDAV), 8080 (WebDAV).
+## Coding Conventions

-Config files for each service live under `config//`. Persistent data is under `data/` (git-ignored). WireGuard configs are also git-ignored.
+### Python (API)

-## Testing
+- All managers inherit `BaseServiceManager` — always implement all abstract methods
+- Use `self.logger` (from `BaseServiceManager`) — never `print()` or raw `logging`
+- Config reads go through `self.config_manager` — never open `cell_config.json` directly
+- Use `threading.RLock` for shared state; managers run in a multi-threaded Flask app
+- Do not use `Any` in type hints; be explicit
+- Keep Flask route handlers thin — business logic belongs in the manager, not in `app.py`
+- Error responses must be JSON: `jsonify({'error': '...'}), `
+- Do not catch bare `Exception` and silently swallow it — log at minimum

-Tests live in `tests/` (28 files). Use mocking (`pytest-mock`) for external system calls. Integration tests in `test_integration.py` require Docker services running.

+### JavaScript (webui)

-## AI Collaboration Rules (Claude Code)
+- All API calls go through `src/services/api.js` — never use `fetch` or a new Axios instance directly
+- Use functional components; no class components
+- Tailwind utilities only — no inline styles, no custom CSS files
+- Keep page components in `src/pages/`, reusable UI in `src/components/`
+- State: local `useState`/`useEffect` is fine; no Redux or global state library
+
+### General
+
+- No comments that describe *what* the code does — only *why* if non-obvious
+- No dead code, no commented-out blocks
+- No backwards-compat shims for things being removed
+- Prefer editing existing files over creating new ones
+- Tests that write to disk: mock `builtins.open` with `OSError` rather than relying on `/nonexistent/path` (CI runs as root and can create any path)
+
+---
+
+## Testing and Quality
+
+Before considering any task complete:
+1. Run `make test` — all 1500+ unit tests must pass
+2. Fix failures before committing — the pre-commit hook will block the commit anyway
+
+### Rules
+
+- Use `unittest.mock` / `pytest-mock` for all Docker, filesystem, and subprocess calls
+- Tests must pass in CI (runs as root, so assumptions about unwritable paths don't hold)
+- When testing write-failure paths, mock `builtins.open` with `side_effect=OSError` — do not rely on unwritable paths
+- Integration tests (`tests/integration/`) require a running stack — exclude from CI with `--ignore=tests/integration`
+- E2e tests (`tests/e2e/`) require Playwright — exclude from CI with `--ignore=tests/e2e`
+- Add tests for any new API endpoint, manager method, or utility function
+- Do not add tests for Flask routing boilerplate or trivial getters — test behaviour, not structure
+
+---
+
+## File Placement Rules
+
+| New thing | Where it goes |
+|---|---|
+| New service manager | `api/_manager.py`, registered in `api/managers.py` and wired into `app.py` |
+| New API endpoints | `app.py` — grouped with the relevant manager's existing endpoints |
+| New React page | `webui/src/pages/` |
+| Reusable UI component | `webui/src/components/` |
+| New pytest test file | `tests/test_.py` |
+| Operational script | `scripts/` |
+| Documentation | Update `README.md`, `QUICKSTART.md`, or `Personal Internet Cell – Project Wiki.md` as appropriate |
+
+Do not create a new abstraction for a single use case. Do not create near-duplicate files — edit the existing one.
+
+---
+
+## Safety Rules
+
+- **Never expose the Flask API port (3000) directly** — it must always be behind Caddy
+- **Never commit secrets** — `data/`, `.env`, `*.key`, `*.pem` are all git-ignored; keep it that way
+- **Do not modify `enforce_setup` or `enforce_auth` hooks** without understanding the full auth flow — these are the security boundary
+- **Do not change the `cell_config.json` schema** without updating `ConfigManager` validation and all manager reads
+- **Do not rename API route paths** without checking the webui `api.js` client and any external callers
+- **Do not modify WireGuard key generation** — losing the server private key means all peers must be re-provisioned
+- Flag any change to auth flow, CSRF logic, or session management as security-sensitive before implementing
+
+---
+
+## Commands
+
+```bash
+# Stack lifecycle (always use make — never call docker/docker-compose directly)
+make start          # build and start all containers
+make stop           # stop all containers
+make restart        # restart containers
+make status         # container status + API health check
+make logs           # follow all container logs
+make logs-api       # follow API logs only
+make logs-caddy     # follow Caddy logs
+make shell-api      # shell inside the API container
+make build-api      # rebuild API image after code change
+make build-webui    # rebuild webui image after code change
+
+# Tests
+make test           # pytest tests/ --ignore=tests/e2e --ignore=tests/integration
+make test-coverage  # coverage report in htmlcov/
+pytest tests/test_.py -v  # single test file
+
+# Local dev (no Docker)
+pip install -r api/requirements.txt
+python3 api/app.py  # Flask API on :3000
+
+cd webui && npm install && npm run dev  # React UI on :5173 (proxies /api → :3000)
+
+# Peer / WireGuard
+make list-peers
+make show-routes
+
+# Admin password
+make show-admin-password
+make reset-admin-password
+
+# Backup / restore
+make backup
+make restore
+
+# Maintenance
+make update         # git pull + rebuild + restart
+make uninstall      # stop containers; prompt to delete config/ and data/
+```
+
+---
+
+## Infrastructure Topology
+
+| Machine | IP | Role |
+|---|---|---|
+| pic0 | 192.168.31.51 | Dev machine — you are here. Run all commands directly. |
+| pic1 | 192.168.31.52 | Test/staging PIC instance |
+| Gitea | 192.168.31.50 | Self-hosted git server (`gitea@192.168.31.50:roof/pic.git`) |
+| DDNS VPS | 192.168.31.101 (LAN) / 178.168.15.65 (public) | PowerDNS + FastAPI for `*.pic.ngo` DDNS |
+
+The `roof` user on pic0 has passwordless sudo and is in the `docker` group — use both freely.
+
+---
+
+## AI Collaboration Rules

 These rules apply to every Claude Code session in this repo:

-- **Read memory first** — load `/home/roof/.claude/projects/-home-roof/memory/MEMORY.md` and referenced files at session start.
-- **Dev machine context** — you are already on pic0 (192.168.31.51), the dev machine. Execute commands here directly; do not ask the user to run them.
-- **Use all available agents** — spawn specialized sub-agents (pic-remote, pic-qa, pic-architect, etc.) for tasks that match their description.
-- **make is the only interface** — never call docker/docker-compose directly. All container lifecycle operations go through `make start`, `make stop`, `make build`, `make logs`, etc.
-- **Test every new feature** — after implementing any change, run `make test` before considering the task done.
-- **Test before commit** — the pre-commit hook enforces this, but run `make test` manually first and fix all failures before staging files.
+- **Read memory first** — load `/home/roof/.claude/projects/-home-roof/memory/MEMORY.md` at session start; follow referenced memory files for relevant context.
+- **You are on pic0** — execute commands directly here; do not ask the user to run them.
+- **`make` is the only container interface** — never call `docker` or `docker-compose` directly. All container lifecycle goes through `make start`, `make stop`, `make build`, `make logs`, etc.
+- **Use specialized agents** — spawn `pic-remote` for VPS/pic1 SSH tasks, `pic-qa` for test writing, `pic-architect` for design decisions, `pic-designer` for UI review, `pic-devops` for docker-compose/Makefile changes, `pic-writer` for documentation.
+- **Test before commit** — run `make test` and fix all failures before staging. The pre-commit hook enforces this, but run it manually first.
+- **No skipping hooks** — never use `--no-verify` unless the only change is documentation or a workflow file with no Python/JS.
+- **Commits need context** — write commit messages that explain *why*, not just *what*. Always add the Co-Authored-By trailer.
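
Reviewer note: the write-failure testing rule added in this diff (mock `builtins.open` with `side_effect=OSError` instead of relying on an unwritable path) could look roughly like the sketch below. The `save_config` helper, its `logger` parameter, and the test name are hypothetical and invented for illustration; they are not taken from the repository. Only `unittest.mock.patch` and `builtins.open` are real, standard-library names.

```python
import json
from unittest import mock


def save_config(path, data, logger):
    """Hypothetical helper: write `data` as JSON; log and return False on I/O failure."""
    try:
        with open(path, "w") as fh:
            json.dump(data, fh)
    except OSError as exc:
        # Log instead of silently swallowing the error.
        logger.error("failed to write %s: %s", path, exc)
        return False
    return True


def test_save_config_logs_and_reports_write_failure():
    logger = mock.Mock()
    # Patching builtins.open forces the failure regardless of filesystem
    # permissions, so the test behaves the same when run as root.
    with mock.patch("builtins.open", side_effect=OSError("disk full")):
        assert save_config("/any/path.json", {"k": "v"}, logger) is False
    logger.error.assert_called_once()
```

Because the failure is injected at the `open` call, the path argument is never touched on disk, which is what makes the test independent of the user it runs as.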