Rotate Atlas secrets

Atlas reads every credential at process start. Rotation therefore reduces to: update the secret in its source of truth, point the env var or mounted file at the new value, and restart the affected service. The audit found live credentials in /opt/atlas/.env on disk; this runbook is the canonical way to change them without leaking the old value into a log or a git commit.

  • .env is gitignored (/.env* in .gitignore). Never commit a real .env.
  • The repository tracks only .env.example. If you need a real credential in your working copy, put it in .env, never in the example file.
  • make doctor should be green before and after every rotation.

Postgres password (ATLAS_DB_DSN)

  1. Rotate in PgBouncer / Postgres: ALTER USER atlas WITH PASSWORD '<new>';
  2. Update ATLAS_DB_DSN in .env (or in the orchestrator’s secret store).
  3. make ship-services — every Go service is restarted; pgxpool reconnects.
  4. Verify /readyz returns postgres: ok.
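
Step 2 is where a secret most easily leaks into shell history or a process listing. The following is a minimal sketch, assuming a plain KEY=value .env with unquoted values. The set_env_var helper is hypothetical (it is not part of the Atlas Makefile); it reads the new value from stdin and swaps the file in with a rename.

```shell
# Hypothetical helper: replace (or append) KEY=VALUE in a dotenv file.
# The value comes from stdin, so it never appears in argv or shell history.
set_env_var() {
  local file="$1" key="$2" value="" tmp
  [ -f "$file" ] || return 1
  IFS= read -r value || true
  [ -n "$value" ] || { echo "empty value for $key" >&2; return 1; }
  tmp="$(mktemp "${file}.XXXXXX")" || return 1
  chmod 600 "$tmp"
  # Copy every line except the old assignment, then append the new one.
  grep -v -E "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Same-directory rename: readers see the old file or the new one, never half.
  mv -f "$tmp" "$file"
}
```

For example, run set_env_var /opt/atlas/.env ATLAS_DB_DSN and paste the new DSN. The rewritten assignment moves to the end of the file, which only matters if something depends on line order.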

DJED AI gateway virtual key (DJED_AI_API_KEY) — preferred path

Atlas defaults to gateway mode: the planner talks to djed-litellm through one OpenAI-compatible client and the gateway owns provider routing, fallbacks, retries, cache, and per-cell virtual keys. Atlas’s key is minted by the DJED provisioner; rotating it never touches an upstream provider key.

  1. Mint a fresh cell virtual key:
    djed-provisioner ai mint --cell atlas
    # or, on the DJED host:
    /opt/djed/scripts/cell-provision.sh ai --cell atlas
  2. Replace DJED_AI_API_KEY in .env with the new value.
  3. make ship-planner (and make ship-services so the gateway picks up the new key for the /readyz AI dependency).
  4. Verify /readyz returns openrouter: gateway mode (no recent failure) and the next /answer/stream request emits a real LLM answer instead of the baseline fallback.
  5. Revoke the old virtual key:
    djed-provisioner ai revoke --cell atlas --key <old>

Mode check. curl -fsS http://127.0.0.1:58104/api/atlas/planner/llm-status includes "mode": "gateway" when DJED_AI_BASE_URL is set; legacy direct-OpenRouter mode reads "mode": "legacy".
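
The mode check can be scripted so a rotation fails loudly if the planner silently dropped to legacy mode. This is a sketch only: llm_mode and check_gateway_mode are hypothetical names, and the grep-based parsing assumes the response contains a single string-valued "mode" field (jq would be more robust where it is installed).

```shell
# Hypothetical helper: print the value of the first "mode" field in JSON on stdin.
llm_mode() {
  grep -oE '"mode"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed -E 's/^.*:[[:space:]]*"([^"]*)"$/\1/'
}

# Return non-zero unless the planner reports gateway mode.
check_gateway_mode() {
  local mode
  mode="$(curl -fsS http://127.0.0.1:58104/api/atlas/planner/llm-status | llm_mode)"
  [ "$mode" = "gateway" ] || {
    echo "planner mode is '${mode:-unknown}', expected gateway" >&2
    return 1
  }
}
```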

OpenRouter API key (OPENROUTER_API_KEY) — legacy fallback

Used only when DJED_AI_BASE_URL is unset (the planner emits a deprecation warning the first time it falls back). Prefer migrating to the gateway mode above.

  1. Mint a new key at openrouter.ai.
  2. Replace OPENROUTER_API_KEY in .env (and in the Zitadel-managed secret, if you push to it).
  3. make ship-planner and make ship-services (the gateway reads the key for clean-core readiness reporting).
  4. Verify /readyz returns openrouter: legacy mode (no recent failure) and the next /answer/stream request emits a real LLM answer rather than the baseline fallback.
  5. Revoke the old key.
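
Services need a moment to come back after a restart, so the /readyz check in step 4 is easier to script with a bounded retry. This is a sketch under assumptions: retry and readyz_has are hypothetical helpers, READYZ_URL is a placeholder because this runbook does not state the gateway's host and port, and the match is a plain substring test against whatever /readyz returns.

```shell
# Hypothetical helper: run a command up to N times, sleeping DELAY seconds
# between failed attempts. Returns 0 on the first success, 1 if all fail.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Succeed if the /readyz body contains the given literal text.
# READYZ_URL is an assumption; set it to the gateway's real endpoint.
readyz_has() {
  curl -fsS "$READYZ_URL" | grep -qF "$1"
}
```

Usage would look like retry 10 3 readyz_has 'openrouter: legacy mode', which gives the restart up to roughly half a minute before the rotation is declared failed.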

BFF session secret (ATLAS_BFF_SESSION_SECRET)

Rotating this signs everyone out — schedule accordingly.

  1. Generate a new value: openssl rand -hex 32.
  2. Replace ATLAS_BFF_SESSION_SECRET in .env.
  3. make ship-services. Existing atlas_session cookies fail signature verification and clients fall back to the OIDC code flow.
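
Because a truncated or empty secret would still sign everyone out while weakening session signing, it can help to validate the generated value before writing it. A minimal sketch, assuming the secret is always produced by openssl rand -hex 32 (which prints 64 lowercase hex characters); is_hex32 and new_session_secret are hypothetical names.

```shell
# Succeed only for exactly 64 lowercase hex characters (32 random bytes).
is_hex32() {
  printf '%s\n' "$1" | grep -Eqx '[0-9a-f]{64}'
}

# Generate a secret and refuse to print anything that fails the shape check.
new_session_secret() {
  local s
  s="$(openssl rand -hex 32)" || return 1
  is_hex32 "$s" || return 1
  printf '%s\n' "$s"
}
```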

BFF client secret (ATLAS_BFF_CLIENT_SECRET)

  1. Rotate in Zitadel for the Atlas web app application.
  2. Replace ATLAS_BFF_CLIENT_SECRET in .env.
  3. make ship-services. Verify /api/auth/session for an authenticated browser still returns 200.

Zitadel machine key (KEYSTONE_MACHINE_KEY_FILE)

  1. Mint a new machine key in Zitadel for the atlas-graphsync service user.
  2. Replace the file at /opt/djed/config/zitadel-keys/atlas-graphsync.key.json atomically (mv the new file over the old).
  3. make ship-services for atlas-gateway, atlas-graphsync, atlas-resolver.
  4. Verify make ship-ontology succeeds (still requires the ACL grant — see grant-ontology-acl).
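
The atomic replacement in step 2 only holds if the mv is a rename within one filesystem. The sketch below stages the new key next to the destination first. install_atomically is a hypothetical helper, and the 600 mode is an assumption; match whatever ownership and permissions the service user actually needs.

```shell
# Hypothetical helper: install NEW over DEST via a temp file in DEST's directory.
install_atomically() {
  local new="$1" dest="$2" tmp
  tmp="$(mktemp "${dest}.XXXXXX")" || return 1
  # Staging beside the destination keeps the final mv a rename rather than
  # a cross-filesystem copy, so readers never observe a partial file.
  if cp "$new" "$tmp" && chmod 600 "$tmp" && mv -f "$tmp" "$dest"; then
    return 0
  fi
  rm -f "$tmp"
  return 1
}
```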

Redis password (ATLAS_REDIS_URL)

  1. Rotate in Redis (CONFIG SET requirepass <new> then CONFIG REWRITE).
  2. Update ATLAS_REDIS_URL in .env (the URL form already includes auth).
  3. make ship-services to restart the gateway. BFF sessions are stored in Redis db=6 and survive a Redis password rotation as long as the new URL reaches the same db.
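
Sessions survive only if the old and new URLs select the same db, so that condition is worth checking before the restart. A minimal sketch, assuming the usual redis://[:password@]host:port/<db> URL shape, where a missing path means db 0; redis_db_of and same_redis_db are hypothetical helpers.

```shell
# Hypothetical helper: print the db index from a redis:// or rediss:// URL.
# A URL without a numeric path selects db 0.
redis_db_of() {
  local db
  db="$(printf '%s\n' "$1" | sed -nE 's#^rediss?://[^/]*/([0-9]+)(\?.*)?$#\1#p')"
  printf '%s\n' "${db:-0}"
}

# Succeed if both URLs point at the same db index.
same_redis_db() {
  [ "$(redis_db_of "$1")" = "$(redis_db_of "$2")" ]
}
```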

Pre-commit guard

Add the following hook (or wire it into your existing pre-commit framework) so that no file matching .env* at the repository root, other than .env.example, can land in a commit:

.git/hooks/pre-commit
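#!/usr/bin/env bash
# Block every staged root-level .env* file except the exact name .env.example.
# (Excluding names by their first letter would let .env.enc and similar through.)
if git diff --cached --name-only | grep -E '^\.env' | grep -qvx '\.env\.example'; then
  echo "Refusing to commit .env files. Use .env.example for template values." >&2
  exit 1
fi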

Run chmod +x .git/hooks/pre-commit to enable it. CI should run the same check against the diff, because a local hook can be bypassed with git commit --no-verify.
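
For the CI side, one way to express the rule "anything matching .env* at the repository root except .env.example" is a small filter over the changed file list. This is a sketch: has_forbidden_env and ci_check are hypothetical names, and origin/main as the comparison base is an assumption about the branch layout.

```shell
# Read file paths from stdin; exit 0 if any is a root-level .env* file
# other than the exact name .env.example.
has_forbidden_env() {
  grep -E '^\.env' | grep -qvx '\.env\.example'
}

# Hypothetical CI entry point; the origin/main base is an assumption.
ci_check() {
  if git diff --name-only origin/main...HEAD | has_forbidden_env; then
    echo "Refusing .env files in this change." >&2
    return 1
  fi
}
```

The allow-list is an exact match, so names such as .env.enc or .env.example.bak are rejected along with .env and .env.local.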