fix: resolve all Cell Identity banner and cert issues

Four bugs fixed:

1. Banner delay (up to 5 s): DraftConfigContext now exposes isDirty as
   reactive state (useState), so App.jsx re-renders as soon as any
   section marks itself dirty instead of waiting for the next
   checkPending() poll.

2. Banner re-triggers after Apply (race): For non-'*' container restarts
   (e.g., cell_name → DNS restart) the background thread took ~300 ms to
   clear _pending_restart. A concurrent checkPending() poll could see
   needs_restart=True and overwrite the frontend's optimistic clear.
   Fix: set needs_restart=False and applying=True synchronously before
   spawning the thread.

3. Apply showed banner during applyPending() when hasDirty()==false:
   setApplyStatus('saving') was skipped for the auto-save-then-apply
   path, leaving applyStatus=null while applyPending() ran and the
   banner stayed visible. Always set 'saving' before applyPending().

4. Cert status always 'unknown' in pic_ngo mode: _check_cert_via_ssl
   connected to cell-caddy:443 but sent SNI='cell-caddy'. Caddy had no
   certificate matching that name, so the handshake yielded none. Fix:
   pass the effective public domain (e.g. pic1.pic.ngo) as SNI so Caddy
   returns the right cert.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-10 04:17:56 -04:00
parent ec8995d41e
commit 649378b59b
7 changed files with 171 additions and 19 deletions
+17 -4
@@ -634,7 +634,16 @@ class CaddyManager(BaseServiceManager):
else:
caddy_host = os.environ.get('CADDY_CERT_HOST', 'cell-caddy')
caddy_port = int(os.environ.get('CADDY_HTTPS_PORT', '443'))
-result = self._check_cert_via_ssl(caddy_host, caddy_port)
+# Use the effective domain as TLS SNI so Caddy serves the right
+# certificate. Without this, Caddy receives SNI='cell-caddy' which
+# matches no cert and the handshake returns nothing.
+sni = None
+if self.config_manager:
+try:
+sni = self.config_manager.get_effective_domain()
+except Exception:
+pass
+result = self._check_cert_via_ssl(caddy_host, caddy_port, sni=sni)
status = result if result is not None else {
'status': 'unknown', 'expiry': None, 'days_remaining': None
}
@@ -649,14 +658,18 @@ class CaddyManager(BaseServiceManager):
return status
@staticmethod
-def _check_cert_via_ssl(hostname: str, port: int = 443) -> Optional[Dict[str, Any]]:
-"""Open an SSL connection and return cert expiry info, or None on failure."""
+def _check_cert_via_ssl(hostname: str, port: int = 443, sni: str = None) -> Optional[Dict[str, Any]]:
+"""Open an SSL connection and return cert expiry info, or None on failure.
+Connect to hostname:port but present sni (if given) as the TLS server
+name so Caddy returns the right certificate for the public domain.
+"""
ctx = _ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = _ssl.CERT_NONE
try:
with _socket.create_connection((hostname, port), timeout=5) as raw:
-with ctx.wrap_socket(raw, server_hostname=hostname) as tls:
+with ctx.wrap_socket(raw, server_hostname=sni or hostname) as tls:
der = tls.getpeercert(binary_form=True)
if not der:
return None
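The change above relies on the TLS server name being independent of the TCP endpoint. The following is a minimal offline sketch of that point, not code from this commit: it builds a client handshake over in-memory BIOs and shows that the name passed as `server_hostname` is what ends up in the ClientHello. The helper name `client_hello_for` is hypothetical, and `pic1.pic.ngo` is just the example domain from the commit message.

```python
import ssl


def client_hello_for(server_name: str) -> bytes:
    """Return the raw ClientHello a client would send for server_name.

    No socket is opened: the handshake runs against in-memory buffers,
    so the bytes depend only on server_hostname, not on any TCP target.
    """
    ctx = ssl.create_default_context()
    incoming = ssl.MemoryBIO()  # bytes "from the server" (left empty here)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and is now
        # waiting for a reply that never arrives in this sketch.
        pass
    return outgoing.read()


hello = client_hello_for("pic1.pic.ngo")
# The SNI extension carries the name in cleartext inside the ClientHello.
print(b"pic1.pic.ngo" in hello)
```

In the real check, the socket still connects to the Caddy container, while `server_hostname` carries the public domain, which is the name Caddy uses to select a certificate.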
+7 -1
@@ -792,6 +792,12 @@ def apply_pending_config():
+ (' (network_recreate)' if needs_network_recreate else '')
)
else:
+# Clear needs_restart immediately so frontend polls don't see stale
+# state while the container restart runs in the background thread.
+config_manager.configs['_pending_restart']['needs_restart'] = False
+config_manager.configs['_pending_restart']['applying'] = True
+config_manager._save_all_configs()
def _do_apply():
import time as _time
import subprocess as _subprocess
@@ -808,7 +814,7 @@ def apply_pending_config():
logger.error(f"docker compose up failed: {result.stderr.strip()}")
else:
logger.info(f'docker compose up completed for: {containers}')
-_clear_pending_restart()
+_clear_pending_restart()
threading.Thread(target=_do_apply, daemon=False).start()
return jsonify({
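The ordering that fixes the race in item 2 can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code in this commit: the `pending` dict stands in for `config_manager.configs['_pending_restart']`, and `apply_pending`, `check_pending`, and the lock are hypothetical names. The point is only that the flags are flipped before the worker thread starts, so a poll can never observe the old value while the restart is in flight.

```python
import threading

# Hypothetical stand-in for the persisted _pending_restart record.
pending = {"needs_restart": True, "applying": False}
_lock = threading.Lock()


def check_pending() -> dict:
    """What a frontend poll would read: a snapshot of the current flags."""
    with _lock:
        return dict(pending)


def apply_pending(restart_fn) -> threading.Thread:
    """Flip the flags synchronously, then run restart_fn in the background."""
    with _lock:
        # Done before the thread exists, so there is no window in which
        # a concurrent poll still sees needs_restart=True.
        pending["needs_restart"] = False
        pending["applying"] = True

    def worker():
        try:
            restart_fn()  # e.g. the slow container restart
        finally:
            # Always clear 'applying', even if the restart raised.
            with _lock:
                pending["applying"] = False

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

A poll that arrives while `restart_fn` is still running sees `needs_restart=False` and `applying=True`, which gives the frontend no reason to bring the banner back.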