Files
roof dde4d9a53f
Unit Tests / test (push) Successful in 8m54s
Rewrite CLAUDE.md following article best practices
Adds: tech stack, coding conventions, file placement rules, safety rules,
infrastructure topology table, and expands architecture with key-file table
and before-request hook documentation. Removes vague guidance, replaces
with actionable rules Claude can follow automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 07:25:53 -04:00

283 lines
13 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# CLAUDE.md
This file is the primary context source for Claude Code in this repository. Read it fully before touching any code.
---
## Project Overview
**Personal Internet Cell (PIC)** is a self-hosted digital infrastructure platform for individuals who want full ownership of their core internet services without relying on cloud providers.
A PIC instance runs DNS, DHCP, NTP, WireGuard VPN, email (SMTP/IMAP), calendar/contacts (CalDAV/CardDAV), file storage (WebDAV), HTTPS reverse proxy (Caddy), an internal certificate authority, and optional third-party services — all managed from a single REST API and a React web UI. No manual config-file editing is required for normal operations.
**Primary users:** technically capable individuals, homelab operators, small families or teams.
**What the product optimizes for:**
- One-command install, browser-based first-run wizard, no manual `.env` editing for identity
- Everything managed through the API and UI — the user should never need to `ssh` for day-to-day operations
- Security by default: session auth, CSRF protection, WireGuard isolation, internal CA, no open API port
- Reliability and observability: structured logs, health monitoring, automated config backups
**Key constraints:**
- Runs on a single Linux host with Docker; no Kubernetes, no swarm
- Must work on Debian, Ubuntu, Fedora, RHEL, and Alpine
- The Flask API must never be exposed directly; Caddy always proxies it
- All secrets live in `data/` (git-ignored), never in the repo
---
## Tech Stack
### Backend
- **Python 3.11** — Flask REST API (`api/app.py`)
- **Flask** — routing, sessions, before-request hooks (enforce_setup, enforce_auth, check_csrf)
- **bcrypt** — password hashing in `AuthManager`
- **Docker SDK for Python** — container lifecycle in `ContainerManager`
- **PyNaCl / Age** — encryption in `VaultManager`
- **pyotp** — TOTP for DDNS registration
### Frontend
- **React 18** — SPA
- **Vite** — dev server and build (proxies `/api``:3000`)
- **Tailwind CSS** — all styling; no custom CSS files
- **Axios** — all API calls go through `src/services/api.js`
### Infrastructure
- **Docker Compose** — all 12+ service containers
- **Caddy** — reverse proxy, TLS termination (Let's Encrypt DNS-01 or HTTP-01 or internal CA)
- **CoreDNS** — `.cell` TLD authoritative DNS
- **dnsmasq** — DHCP
- **chrony** — NTP
- **WireGuard** — VPN (kernel module, not userspace)
- **Postfix + Dovecot** — email via `docker-mailserver`
- **Radicale** — CalDAV/CardDAV
- **PowerDNS** — authoritative DNS on the DDNS VPS (separate repo: `pic-ddns`)
### CI/CD
- **Gitea Actions** — unit tests on every push, image builds on tag
- **act_runner** — self-hosted runner on pic0 (192.168.31.51)
- **Gitea Container Registry** — images pushed to `git.pic.ngo`
Do not introduce: Redux, styled-components, SQLAlchemy, Celery, or any async framework (asyncio/FastAPI) into the main API unless explicitly requested.
---
## Architecture
```
Browser / WireGuard peer
└── Caddy (:80/:443) TLS termination, reverse proxy
└── React SPA (:8081) Vite + Tailwind (Nginx in container)
└── Flask API (:3000) REST API, bound to 127.0.0.1 only
├── NetworkManager CoreDNS, dnsmasq, chrony
├── WireGuardManager WireGuard peer lifecycle
├── PeerRegistry peer registration and trust
├── EmailManager Postfix + Dovecot
├── CalendarManager Radicale CalDAV/CardDAV
├── FileManager WebDAV + Filegator
├── RoutingManager iptables NAT and routing
├── FirewallManager iptables INPUT/FORWARD rules
├── VaultManager internal CA, TLS certs, Age encryption
├── ContainerManager Docker SDK
├── CellLinkManager site-to-site WireGuard links
├── ConnectivityManager per-peer exit routing (WG ext, OpenVPN, Tor)
├── DDNSManager dynamic DNS heartbeat
├── ServiceStoreManager optional service install/remove
├── CaddyManager Caddyfile generation and reload
├── AuthManager bcrypt passwords, session auth, RBAC
└── SetupManager first-run wizard state
```
### Key files
| File | Role |
|---|---|
| `api/app.py` | Flask app, all REST endpoints, before-request hooks, health monitor thread |
| `api/managers.py` | Singleton instantiation of all service managers |
| `api/base_service_manager.py` | Abstract base class: `get_status`, `get_config`, `update_config`, `validate_config`, `test_connectivity`, `get_logs`, `restart_service` |
| `api/config_manager.py` | Single source of truth for `cell_config.json` — all read/write goes through here |
| `api/service_bus.py` | Pub/sub event system between managers |
| `webui/src/services/api.js` | Axios API client — all UI→API calls |
| `docker-compose.yml` | Container definitions and network topology |
| `Makefile` | All operational commands |
| `install.sh` | Bash installer served via `https://install.pic.ngo` |
### Directory layout
```
api/ Flask API and all service managers
webui/ React SPA (Vite + Tailwind)
tests/ pytest unit tests (no running services required)
tests/integration/ require a running PIC stack
tests/e2e/ Playwright UI and WireGuard e2e tests
config/ Runtime config per service (mostly git-ignored)
data/ Runtime secrets and state (fully git-ignored)
scripts/ Setup and maintenance scripts
install.sh One-line installer entry point
Makefile All make targets
docker-compose.yml
```
### Config and secrets
- Runtime config: `config/api/cell_config.json` — managed by `ConfigManager`, never edit directly
- Secrets and user data: `data/` — git-ignored, contains `auth_users.json`, WireGuard keys, DDNS token, CA key
- DDNS config lives under the top-level `ddns` key in `cell_config.json`, accessed via `config_manager.configs.get('ddns', {})`
- Do not read `_identity.domain` expecting a dict — it is a plain string (the domain mode, e.g. `"pic_ngo"`)
### Before-request hooks (app.py)
Three hooks run on every request in this order:
1. `enforce_setup` — returns 428 for all `/api/*` except `/api/setup/*` and `/health` until setup is complete. Skipped when `app.config['TESTING']` is True.
2. `enforce_auth` — returns 401 if no session; returns 503 if users file exists but is empty (misconfiguration). Skipped when `app.config['TESTING']` is True.
3. `check_csrf` — requires `X-CSRF-Token` header on all mutating requests except `/api/auth/*` and `/api/setup/*`.
---
## Coding Conventions
### Python (API)
- All managers inherit `BaseServiceManager` — always implement all abstract methods
- Use `self.logger` (from `BaseServiceManager`) — never `print()` or raw `logging`
- Config reads go through `self.config_manager` — never open `cell_config.json` directly
- Use `threading.RLock` for shared state; managers run in a multi-threaded Flask app
- Do not use `any` typing; be explicit
- Keep Flask route handlers thin — business logic belongs in the manager, not in `app.py`
- Error responses must be JSON: `jsonify({'error': '...'}), <status_code>`
- Do not catch bare `Exception` and silently swallow it — log at minimum
### JavaScript (webui)
- All API calls go through `src/services/api.js` — never use `fetch` or a new Axios instance directly
- Use functional components; no class components
- Tailwind utilities only — no inline styles, no custom CSS files
- Keep page components in `src/pages/`, reusable UI in `src/components/`
- State: local `useState`/`useEffect` is fine; no Redux or global state library
### General
- No comments that describe *what* the code does — only *why* if non-obvious
- No dead code, no commented-out blocks
- No backwards-compat shims for things being removed
- Prefer editing existing files over creating new ones
- Tests that write to disk: mock `builtins.open` with `OSError` rather than relying on `/nonexistent/path` (CI runs as root and can create any path)
---
## Testing and Quality
Before considering any task complete:
1. Run `make test` — all 1500+ unit tests must pass
2. Fix failures before committing — the pre-commit hook will block the commit anyway
### Rules
- Use `unittest.mock` / `pytest-mock` for all Docker, filesystem, and subprocess calls
- Tests must pass in CI (rootless environment where filesystem assumptions don't hold)
- When testing write-failure paths, mock `builtins.open` with `side_effect=OSError` — do not rely on unwritable paths
- Integration tests (`tests/integration/`) require a running stack — exclude from CI with `--ignore=tests/integration`
- E2e tests (`tests/e2e/`) require Playwright — exclude from CI with `--ignore=tests/e2e`
- Add tests for any new API endpoint, manager method, or utility function
- Do not add tests for Flask routing boilerplate or trivial getters — test behaviour, not structure
---
## File Placement Rules
| New thing | Where it goes |
|---|---|
| New service manager | `api/<name>_manager.py`, registered in `api/managers.py` and wired into `app.py` |
| New API endpoints | `app.py` — grouped with the relevant manager's existing endpoints |
| New React page | `webui/src/pages/` |
| Reusable UI component | `webui/src/components/` |
| New pytest test file | `tests/test_<module>.py` |
| Operational script | `scripts/` |
| Documentation | Update `README.md`, `QUICKSTART.md`, or `Personal Internet Cell Project Wiki.md` as appropriate |
Do not create a new abstraction for a single use case. Do not create near-duplicate files — edit the existing one.
---
## Safety Rules
- **Never expose the Flask API port (3000) directly** — it must always be behind Caddy
- **Never commit secrets** — `data/`, `.env`, `*.key`, `*.pem` are all git-ignored; keep it that way
- **Do not modify `enforce_setup` or `enforce_auth` hooks** without understanding the full auth flow — these are the security boundary
- **Do not change the `cell_config.json` schema** without updating `ConfigManager` validation and all manager reads
- **Do not rename API route paths** without checking the webui `api.js` client and any external callers
- **Do not modify WireGuard key generation** — losing the server private key means all peers must be re-provisioned
- Flag any change to auth flow, CSRF logic, or session management as security-sensitive before implementing
---
## Commands
```bash
# Stack lifecycle (always use make — never call docker/docker-compose directly)
make start # build and start all containers
make stop # stop all containers
make restart # restart containers
make status # container status + API health check
make logs # follow all container logs
make logs-api # follow API logs only
make logs-caddy # follow Caddy logs
make shell-api # shell inside the API container
make build-api # rebuild API image after code change
make build-webui # rebuild webui image after code change
# Tests
make test # pytest tests/ --ignore=tests/e2e --ignore=tests/integration
make test-coverage # coverage report in htmlcov/
pytest tests/test_<module>.py -v # single test file
# Local dev (no Docker)
pip install -r api/requirements.txt
python3 api/app.py # Flask API on :3000
cd webui && npm install && npm run dev # React UI on :5173 (proxies /api → :3000)
# Peer / WireGuard
make list-peers
make show-routes
# Admin password
make show-admin-password
make reset-admin-password
# Backup / restore
make backup
make restore
# Maintenance
make update # git pull + rebuild + restart
make uninstall # stop containers; prompt to delete config/ and data/
```
---
## Infrastructure Topology
| Machine | IP | Role |
|---|---|---|
| pic0 | 192.168.31.51 | Dev machine — you are here. Run all commands directly. |
| pic1 | 192.168.31.52 | Test/staging PIC instance |
| Gitea | 192.168.31.50 | Self-hosted git server (`gitea@192.168.31.50:roof/pic.git`) |
| DDNS VPS | 192.168.31.101 (LAN) / 178.168.15.65 (public) | PowerDNS + FastAPI for `*.pic.ngo` DDNS |
The `roof` user on pic0 has passwordless sudo and is in the `docker` group — use both freely.
---
## AI Collaboration Rules
These rules apply to every Claude Code session in this repo:
- **Read memory first** — load `/home/roof/.claude/projects/-home-roof/memory/MEMORY.md` at session start; follow referenced memory files for relevant context.
- **You are on pic0** — execute commands directly here; do not ask the user to run them.
- **`make` is the only container interface** — never call `docker` or `docker-compose` directly. All container lifecycle goes through `make start`, `make stop`, `make build`, `make logs`, etc.
- **Use specialized agents** — spawn `pic-remote` for VPS/pic1 SSH tasks, `pic-qa` for test writing, `pic-architect` for design decisions, `pic-designer` for UI review, `pic-devops` for docker-compose/Makefile changes, `pic-writer` for documentation.
- **Test before commit** — run `make test` and fix all failures before staging. The pre-commit hook enforces this, but run it manually first.
- **No skipping hooks** — never use `--no-verify` unless the only change is documentation or a workflow file with no Python/JS.
- **Commits need context** — write commit messages that explain *why*, not just *what*. Always add the Co-Authored-By trailer.