Fedora Backend Update Runbook
This guide captures the repeatable process for upgrading the Fedora Repository container that powers the Michael J Wright archive. Follow it end-to-end whenever a new fcrepo/fcrepo:<version>-tomcat9 image is released.
1. Discovery & Change Control
- Subscribe to Fedora release announcements (GitHub releases or mailing list).
- When a new release is announced, record the change request in the roadmap backlog with:
  - Fedora version
  - Release notes link
  - Planned deployment window
  - Potential downstream impacts (metadata, RDF models, storage requirements)
- Assign an owner and confirm the staging and production dates.
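
The backlog entry described above can be modelled as a small record so that incomplete change requests are caught before scheduling. This is only a sketch: the class name, field names, and the `missing_fields` helper are hypothetical and do not correspond to any existing tooling in this project.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class FedoraChangeRequest:
    """Hypothetical record mirroring the backlog fields listed in this section."""

    fedora_version: str
    release_notes_url: str
    window_start: datetime
    window_end: datetime
    downstream_impacts: List[str] = field(default_factory=list)
    owner: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of fields that still need attention."""
        missing = []
        if not self.fedora_version:
            missing.append("fedora_version")
        if not self.release_notes_url:
            missing.append("release_notes_url")
        # A window that ends before (or when) it starts is treated as unset.
        if self.window_end <= self.window_start:
            missing.append("deployment_window")
        if not self.owner:
            missing.append("owner")
        return missing
```

A request is ready to schedule once `missing_fields()` returns an empty list.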
2. Preflight Checks
- Review release notes for schema changes, migrations, or deprecations.
- Confirm compatibility with the current Postgres version and Tomcat base image.
- Update observability plans:
  - New metrics to scrape
  - Log format changes
  - Health endpoint adjustments
- Notify stakeholders (curators, partner institutions) of the planned maintenance window.
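
As a rough aid for the preflight review, the current and target image references can be compared to flag upgrades that deserve extra scrutiny. The sketch below assumes tags shaped like `fcrepo/fcrepo:6.5.0-tomcat9`; the function names and the flag wording are hypothetical, and the release notes remain the authority on whether a migration is actually required.

```python
import re
from typing import List, NamedTuple


class FedoraTag(NamedTuple):
    major: int
    minor: int
    patch: int
    variant: str  # for example "tomcat9"


# Assumed tag shape: fcrepo/fcrepo:<major>.<minor>.<patch>-<variant>
_TAG_RE = re.compile(r"^fcrepo/fcrepo:(\d+)\.(\d+)\.(\d+)-([A-Za-z0-9]+)$")


def parse_fedora_image(image: str) -> FedoraTag:
    """Split an image reference into numeric version parts and base variant."""
    match = _TAG_RE.match(image)
    if match is None:
        raise ValueError(f"unrecognised Fedora image reference: {image!r}")
    major, minor, patch, variant = match.groups()
    return FedoraTag(int(major), int(minor), int(patch), variant)


def review_flags(current: str, target: str) -> List[str]:
    """List reasons the upgrade deserves a closer preflight review."""
    old, new = parse_fedora_image(current), parse_fedora_image(target)
    flags = []
    if new[:3] < old[:3]:
        flags.append("target is older than current (downgrade)")
    if new.major != old.major:
        flags.append("major version change: check for data migrations")
    if new.variant != old.variant:
        flags.append("base image variant changed: recheck Tomcat compatibility")
    return flags
```

An empty list does not prove the upgrade is safe; it only means none of these coarse heuristics fired.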
3. Backup Strategy
- Pause ingest jobs and integrations.
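- Snapshot the data volumes. If the maintenance window allows, stop the database and Fedora containers first; a file-level copy of a running Postgres data directory is not guaranteed to be consistent.

      # One timestamp shared by both archives so they can be paired later.
      $timestamp = Get-Date -Format 'yyyyMMdd-HHmmss'
      # Resolve the host path once; quoting "${backupDir}:/backup" passes it as a single argument.
      $backupDir = (Resolve-Path './backups').Path
      # Volume names are as given in this runbook; Compose may prefix them (check `docker volume ls`).
      docker run --rm -v postgres_data:/var/lib/postgresql/data `
        -v "${backupDir}:/backup" alpine `
        sh -c "cd /var/lib && tar czf /backup/postgres_${timestamp}.tgz postgresql"
      docker run --rm -v fcrepo_data:/data `
        -v "${backupDir}:/backup" alpine `
        sh -c "cd / && tar czf /backup/fcrepo_${timestamp}.tgz data"

- Verify archives and move them to offsite storage.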
- Document backup locations in the change request.
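
One way to carry out the "verify archives" step is to read every member of each `.tgz` back and record a checksum alongside the backup location. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it checks that the archive decompresses cleanly, not that the database inside it is restorable.

```python
import hashlib
import tarfile
from pathlib import Path


def verify_archive(path: Path) -> int:
    """Read every member of a gzip-compressed tar; return the entry count."""
    count = 0
    with tarfile.open(path, "r:gz") as archive:
        for member in archive:
            if member.isfile():
                extracted = archive.extractfile(member)
                # Drain the member in chunks so truncated or corrupt data raises.
                while extracted.read(1024 * 1024):
                    pass
            count += 1
    if count == 0:
        raise ValueError(f"{path} contains no entries")
    return count


def sha256_of(path: Path) -> str:
    """Compute a SHA-256 checksum to record in the change request."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the recorded checksum against the offsite copy confirms the transfer did not alter the file. A periodic test restore into a scratch environment is still the stronger check.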
4. Staging Environment Validation
- Update the Fedora image tag in staging (`docker-compose.override.yml` or environment variable) to the new release.
- Pull the image and restart staging services:

      docker compose pull fcrepo
      docker compose up -d fcrepo

- Run smoke tests:
  - Fedora REST API CRUD checks (create → read → delete test resources)
  - SPARQL queries, custom integrations, ingestion tool dry runs
  - Authentication via Cloudflare Worker
- Monitor logs (`docker compose logs -f fcrepo`) for warnings or migration output.
- Validate dashboards in Grafana and confirm Prometheus scrape success.
- Capture sign-off in the change request once staging passes.
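
The create → read → delete check from the smoke tests could be scripted roughly as follows. This is a sketch under several assumptions: the function names are hypothetical, HTTP Basic auth is assumed, and the expected status codes (201 on create, 200 on read, 204 on delete, then 404 or 410 afterwards) should be confirmed against the Fedora API documentation for the release being deployed. The HTTP call is injected as `send` so the logic can be exercised without a live server.

```python
import base64
import urllib.error
import urllib.request
from typing import Callable, Dict, Optional, Tuple

# A sender takes (method, url, body) and returns (status, lower-cased headers).
Send = Callable[[str, str, Optional[bytes]], Tuple[int, Dict[str, str]]]


def make_sender(username: str, password: str) -> Send:
    """Build a urllib-based sender that adds HTTP Basic credentials."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()

    def send(method: str, url: str, body: Optional[bytes] = None) -> Tuple[int, Dict[str, str]]:
        request = urllib.request.Request(url, data=body, method=method)
        request.add_header("Authorization", f"Basic {token}")
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                headers = {k.lower(): v for k, v in response.headers.items()}
                return response.status, headers
        except urllib.error.HTTPError as error:
            # urllib raises on 4xx/5xx; report those as ordinary status codes.
            return error.code, {k.lower(): v for k, v in error.headers.items()}

    return send


def expect(step: str, status: int, allowed: Tuple[int, ...]) -> None:
    """Raise if a step returned a status outside the allowed set."""
    if status not in allowed:
        raise RuntimeError(f"{step}: unexpected HTTP {status}, wanted one of {allowed}")


def crud_smoke_test(base_url: str, send: Send) -> str:
    """Create, read, and delete a throwaway resource; return its URL."""
    status, headers = send("POST", base_url, None)
    expect("create", status, (201,))
    resource_url = headers["location"]

    status, _ = send("GET", resource_url, None)
    expect("read", status, (200,))

    status, _ = send("DELETE", resource_url, None)
    expect("delete", status, (204,))

    # Assumption: a deleted resource answers 410 (tombstone) or 404.
    status, _ = send("GET", resource_url, None)
    expect("read after delete", status, (404, 410))
    return resource_url
```

Against staging this would be called as `crud_smoke_test(rest_url, make_sender(user, password))`, where `rest_url` is the REST endpoint of the staging instance and the credentials belong to a temporary test account.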
5. Production Deployment
- Schedule the maintenance window and notify stakeholders.
- Freeze ingest pipelines and ensure backups are current (repeat section 3 if needed).
- Update the production `FEDORA_IMAGE` value or `docker-compose.yml` tag to the new release.
- Redeploy using the automation script:

      pwsh -ExecutionPolicy Bypass -File .\quickstartdockerandcloudflared.ps1 -ForceRestartTunnel

- Watch logs in real time and verify Fedora starts cleanly.
- Run the production smoke test suite (API checks, ingestion dry run, Worker proxy test).
- Once healthy, lift the ingest freeze and inform stakeholders the system is live.
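
"Once healthy" can be made concrete with a polling gate that requires several consecutive successful probes before the ingest freeze is lifted. The sketch below is illustrative only: `wait_until_healthy` and its parameters are hypothetical, and the probe itself (for example, a GET against the Fedora REST endpoint) is left to the caller.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    required_successes: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once `probe` succeeds `required_successes` times in a row."""
    deadline = clock() + timeout_s
    streak = 0
    while True:
        try:
            ok = probe()
        except Exception:
            # A refused connection during startup counts as "not yet healthy".
            ok = False
        streak = streak + 1 if ok else 0
        if streak >= required_successes:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Requiring a streak instead of a single success guards against a service that answers once and then restarts. If the function returns `False`, keep the freeze in place and fall back to the previous image tag.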
6. Post-Deployment Tasks
- Update `docs/development-roadmap.md` and changelog with the new version.
- Rotate any credentials exposed during testing (for example, temporary service accounts).
- Archive logs and metrics snapshots from the deployment window for audit.
- Close the change request with:
  - Start/end time
  - Validation steps performed
  - Any follow-up actions
- Schedule a quick retrospective if incidents occurred.
7. Automation Opportunities
- Set up a scheduled job to watch the Fedora image digest and open an issue when it changes.
- Integrate container vulnerability scanning (`docker scout cves` or Trivy) into the pipeline.
- Parameterize the Fedora image tag via `.env` (`FEDORA_IMAGE=fcrepo/fcrepo:6.5.0-tomcat9`) for safer rollbacks.
- Create GitHub Actions that run staging smoke tests automatically when the tag changes.
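
The digest watcher from the first item above might look like the sketch below. It assumes the Docker CLI is available on the machine running the scheduled job and that a small state file is acceptable for remembering the last seen digest; the function names and the state-file approach are hypothetical. Opening the issue is left to whatever tracker integration the project already uses.

```python
import subprocess
from pathlib import Path
from typing import Optional


def current_digest(image: str) -> str:
    """Pull `image` and return its first repo digest (requires the Docker CLI)."""
    subprocess.run(["docker", "pull", "--quiet", image], check=True)
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{index .RepoDigests 0}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def digest_changed(state_file: Path, digest: str) -> Optional[str]:
    """Persist `digest`; return the previous digest if it differs, else None."""
    previous = state_file.read_text().strip() if state_file.exists() else None
    state_file.write_text(digest + "\n")
    if previous is not None and previous != digest:
        return previous
    return None
```

The first run only records a baseline and returns `None`. On later runs, a non-`None` result means the tag now points at different content and a change request should be opened.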
Keeping this runbook updated ensures every Fedora upgrade is predictable, auditable, and recoverable.