Release checklist
Atlas ships as one cell. make ship-all drives the seven phases of the
deploy contract and only reports success when every phase passes its
verification step. Use this page as the pre- and post-deploy playbook.
Pre-deploy
- Clean working tree. `git status` is empty and you are on `main`.
- Secrets present. `.env` has all of: `ATLAS_DB_DSN`, `DJED_AI_API_KEY` (or `OPENROUTER_API_KEY`), `KEYSTONE_MACHINE_KEY_FILE`, `OIDC_*`, and `ATLAS_BFF_*` if BFF is enabled.
- Lint + tests green. `make lint && make test-unit` both exit 0.
- Docs build clean. `bun run --cwd docs build` exits 0; link-check CI gates dead internal links.
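The first two items can be checked mechanically. The following is a minimal sketch, not part of the Atlas repository: the function names `missing_env_keys` and `clean_on_main` are hypothetical, and it assumes `.env` uses plain `KEY=value` lines. Wildcard groups such as `OIDC_*` are not covered because their exact key names are not listed here.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; names are illustrative only.

# Print each required key that has no non-empty assignment in the env file.
# Usage: missing_env_keys FILE KEY...
missing_env_keys() {
  local file=$1 key
  shift
  for key in "$@"; do
    # Match "KEY=value" with at least one character after the equals sign.
    grep -Eq "^${key}=.+" "$file" || echo "$key"
  done
}

# Succeed only when the working tree is clean and the current branch is main.
clean_on_main() {
  [ -z "$(git status --porcelain)" ] &&
    [ "$(git rev-parse --abbrev-ref HEAD)" = "main" ]
}
```

Calling `missing_env_keys .env ATLAS_DB_DSN KEYSTONE_MACHINE_KEY_FILE` prints nothing when both keys are set. The `DJED_AI_API_KEY` / `OPENROUTER_API_KEY` alternative needs two calls, treating the check as failed only if both keys are reported missing.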
Deploy — what each phase verifies
| Phase | Target | Verification |
|---|---|---|
| 1 | ship-migrate | docker compose run --rm atlas-migrate exits 0. Captured directly — there is no separate verify step to inspect a torn-down container. |
| 2 | ship-services | go vet ./..., compose build, compose up, every Go service reaches Docker-level healthy within 15 s, curl /health → 200 on the gateway, curl /health → 200 on graphsync. |
| 3 | ship-planner | compose up, planner reaches healthy within 10 s. |
| 4 | ship-ontology | Gateway container registers ontology + shapes via machine-token. A 403 is a release-blocking ACL regression now that Atlas ontology ACLs are provisioned; ATLAS_ONTOLOGY_ALLOW_ACL_GAP=1 is reserved for an explicit emergency override. |
| 5 | ship-landing | bun install + build + rsync to /srv/atlas-landing/. |
| 6 | ship-ui | npm ci + vite build + rsync to /srv/atlas-ui/app/. |
| 7 | ship-docs | bun run build + rsync to /srv/atlas-docs/. |
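Phases 2 and 3 wait for containers to reach Docker-level healthy within a deadline. One way such a wait could look is sketched below; this is an assumption about the approach, not the actual `ship-*` implementation. The names `docker_health` and `wait_healthy` are hypothetical, and the optional third argument exists only so the loop can be exercised without Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the real ship-* targets may implement this differently.

# Default probe: ask Docker for the container's health status string
# (starting, healthy, or unhealthy). The container needs a health check defined.
docker_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}

# Poll once per second until the container reports healthy or the deadline passes.
# Usage: wait_healthy CONTAINER TIMEOUT_SECONDS [PROBE_FUNCTION]
wait_healthy() {
  local name=$1 timeout=$2 probe=${3:-docker_health} elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    [ "$("$probe" "$name" 2>/dev/null)" = "healthy" ] && return 0
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}
```

For example, `wait_healthy CONTAINER 15` would mirror the 15 s budget of phase 2, where `CONTAINER` is whatever name `docker compose ps` shows for the service.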
Post-deploy smoke
Run these in order; every one must pass before calling the release green.

```shell
# 1. Gateway readiness
curl -sf https://atlas.naburis.cloud/health

# 2. API returns intents without auth (public route)
curl -sf https://atlas.naburis.cloud/api/atlas/resolve/intents | jq 'length'

# 3. Static surfaces answer
curl -sf https://atlas.naburis.cloud/
curl -sf https://atlas.naburis.cloud/app/
curl -sf https://atlas.naburis.cloud/docs/

# 4. Graph sync is draining
curl -sf https://atlas.naburis.cloud/api/atlas/graph/status \
  -H "X-Tenant-ID: $TENANT" -H "X-Workspace-ID: atlas" -H "X-Context-ID: atlas"

# 5. Corpus has at least one healthy collector
curl -sf https://atlas.naburis.cloud/api/atlas/corpus/status \
  -H "X-Tenant-ID: $TENANT" -H "X-Workspace-ID: atlas" -H "X-Context-ID: atlas" \
  | jq '.collectors | length'
```

If any step fails, roll forward with a new deploy rather than rolling back: the migration contract is forward-only.
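To enforce the "in order, every one must pass" rule, the public checks can be wrapped so the run stops at the first failure. The sketch below is illustrative only: `smoke_step`, `smoke_public`, and the `BASE_URL` variable are hypothetical names, and the tenant-scoped checks (steps 4 and 5) are left out because they depend on `$TENANT`.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the smoke steps; names are illustrative only.

BASE_URL=${BASE_URL:-https://atlas.naburis.cloud}

# Run one named check and report the result; the check is any command.
# Usage: smoke_step LABEL COMMAND [ARGS...]
smoke_step() {
  local label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok   $label"
  else
    echo "FAIL $label"
    return 1
  fi
}

# Run the unauthenticated checks in order; && stops at the first failure.
smoke_public() {
  smoke_step "gateway health" curl -sf "$BASE_URL/health" &&
    smoke_step "public intents" curl -sf "$BASE_URL/api/atlas/resolve/intents" &&
    smoke_step "landing" curl -sf "$BASE_URL/" &&
    smoke_step "app" curl -sf "$BASE_URL/app/" &&
    smoke_step "docs" curl -sf "$BASE_URL/docs/"
}
```

Running `smoke_public` prints one line per check and returns non-zero as soon as a check fails, so later checks do not hide the first failure.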
Observability checks
- OTel traces appear in the platform collector under `service.name=atlas-gateway` and for every downstream service (`atlas-resolver`, `atlas-planner`, `atlas-graphsync`).
- `/api/atlas/control/scorecard` returns a non-empty module list (or a warm “no landscape data” state, which is expected before first crawl).
- No service has status `unhealthy` in `docker compose ps`.
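The last check can be scripted rather than read by eye. The sketch below uses `docker ps` with its `health` filter as a stand-in for inspecting `docker compose ps` output. Both function names are hypothetical, and the filter matches every container on the host, not only the Atlas compose project.

```shell
#!/usr/bin/env bash
# Hypothetical helper; names are illustrative only.

# List the names of all containers Docker currently reports as unhealthy.
list_unhealthy() {
  docker ps --filter health=unhealthy --format '{{.Names}}'
}

# Fail and print the offenders if the lister returns any container names.
# Usage: assert_none_unhealthy [LISTER_FUNCTION]
assert_none_unhealthy() {
  local lister=${1:-list_unhealthy} names
  names=$("$lister")
  if [ -n "$names" ]; then
    echo "unhealthy containers:" >&2
    echo "$names" >&2
    return 1
  fi
}
```

The optional lister argument exists only so the check can be exercised without a running Docker daemon.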
After a rollback
Atlas does not support SQL rollback. If you must unship:

- Re-deploy the previous Git SHA of the UI and landing (they’re statically built; rsync-back is safe).
- For the Go services, `git revert` the offending commit and run `make ship-services` again; the services are stateless.
- Any migration already applied stays applied. Data shape is additive-only by contract.