Michael J Wright Archive Documentation

Fedora Backend Update Runbook

This guide captures the repeatable process for upgrading the Fedora Repository container that powers the Michael J Wright archive. Follow it end-to-end whenever a new fcrepo/fcrepo:<version>-tomcat9 image is released.


1. Discovery & Change Control

  1. Subscribe to Fedora release announcements (GitHub releases or mailing list).
  2. When a new release is announced, record the change request in the roadmap backlog with:
    • Fedora version
    • Release notes link
    • Planned deployment window
    • Potential downstream impacts (metadata, RDF models, storage requirements).
  3. Assign an owner and confirm the staging and production dates.
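
The fields listed in step 2 can be captured as a structured record so none are left out when the request is filed. The following is a minimal sketch only: the helper name New-FedoraChangeRequest, its property names, and the placeholder values are assumptions for illustration and are not part of the archive's tooling.

```powershell
# Hypothetical helper: builds a change-request record with the fields from step 2.
function New-FedoraChangeRequest {
    param(
        [Parameter(Mandatory)][string]$FedoraVersion,
        [Parameter(Mandatory)][string]$ReleaseNotesUrl,
        [Parameter(Mandatory)][datetime]$WindowStart,
        [Parameter(Mandatory)][datetime]$WindowEnd,
        [string[]]$DownstreamImpacts = @(),
        [string]$Owner = 'unassigned'
    )
    # Reject a window that ends before (or when) it starts.
    if ($WindowEnd -le $WindowStart) {
        throw 'Deployment window must end after it starts.'
    }
    [pscustomobject]@{
        FedoraVersion     = $FedoraVersion
        # Image naming follows the fcrepo/fcrepo:<version>-tomcat9 pattern used in this runbook.
        ImageTag          = "fcrepo/fcrepo:${FedoraVersion}-tomcat9"
        ReleaseNotesUrl   = $ReleaseNotesUrl
        WindowStart       = $WindowStart.ToString('s')
        WindowEnd         = $WindowEnd.ToString('s')
        DownstreamImpacts = $DownstreamImpacts
        Owner             = $Owner
    }
}
```

Piping the result to ConvertTo-Json gives text that can be pasted into the backlog entry.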

2. Preflight Checks

  1. Review release notes for schema changes, migrations, or deprecations.
  2. Confirm compatibility with the current Postgres version and Tomcat base image.
  3. Update observability plans:
    • New metrics to scrape
    • Log format changes
    • Health endpoint adjustments.
  4. Notify stakeholders (curators, partner institutions) of the planned maintenance window.
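
The release-notes review in step 1 can be given a first pass by flagging lines that mention risky topics. This is a rough sketch, assuming the notes have been saved as plain text; the function name Find-ReleaseNoteRisks and its default keyword list are illustrative assumptions, and a keyword scan does not replace reading the notes in full.

```powershell
# Hypothetical helper: flags release-note lines that mention schema changes,
# migrations, deprecations, or breaking changes (case-insensitive).
function Find-ReleaseNoteRisks {
    param(
        [string[]]$Lines = @(),
        [string[]]$Keywords = @('schema', 'migrat', 'deprecat', 'breaking')
    )
    # Build one alternation pattern from the escaped keyword stems.
    $pattern = ($Keywords | ForEach-Object { [regex]::Escape($_) }) -join '|'
    for ($i = 0; $i -lt $Lines.Count; $i++) {
        if ($Lines[$i] -match $pattern) {
            [pscustomobject]@{
                LineNumber = $i + 1
                Text       = $Lines[$i].Trim()
            }
        }
    }
}

# Example use (the file name is a placeholder):
# Find-ReleaseNoteRisks -Lines (Get-Content './release-notes.txt')
```

Keyword stems such as 'migrat' and 'deprecat' are used so that "migration", "migrate", "deprecated", and "deprecation" all match.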

3. Backup Strategy

  1. Pause ingest jobs and integrations.
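  2. Snapshot the data volumes. Where possible, stop the Postgres and Fedora containers first, because a file-level copy of a live Postgres data directory may not restore cleanly:
    $timestamp = Get-Date -Format 'yyyyMMdd-HHmmss'
    # Resolve the host path once; "${backupDir}:/backup" keeps each bind mount a single argument.
    # Confirm the volume names with 'docker volume ls' (Compose may prefix them with the project name).
    $backupDir = (Resolve-Path './backups').Path
    docker run --rm -v postgres_data:/var/lib/postgresql/data `
      -v "${backupDir}:/backup" alpine `
      sh -c "cd /var/lib && tar czf /backup/postgres_$timestamp.tgz postgresql"
    
    docker run --rm -v fcrepo_data:/data `
      -v "${backupDir}:/backup" alpine `
      sh -c "cd / && tar czf /backup/fcrepo_$timestamp.tgz data"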
    
  3. Verify that each archive is non-empty and lists cleanly (for example, tar -tzf <archive>), then move the archives to offsite storage.
  4. Document backup locations in the change request.
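
Verification in step 3 and the record-keeping in step 4 can share one manifest of archive names, sizes, and checksums. A minimal sketch, assuming the archives sit in a local backups folder; the function name Get-BackupManifest is hypothetical. It checks only that each file is non-empty and records a SHA-256 hash, so it does not prove the archive can be restored.

```powershell
# Hypothetical helper: lists backup archives with size and SHA-256 checksum,
# and fails fast if any archive is empty.
function Get-BackupManifest {
    param(
        [Parameter(Mandatory)][string]$BackupDir,
        [string]$Filter = '*.tgz'
    )
    foreach ($file in Get-ChildItem -Path $BackupDir -Filter $Filter -File) {
        if ($file.Length -eq 0) {
            throw "Backup archive is empty: $($file.Name)"
        }
        [pscustomobject]@{
            Name   = $file.Name
            Bytes  = $file.Length
            Sha256 = (Get-FileHash -Path $file.FullName -Algorithm SHA256).Hash
        }
    }
}

# Example use (paths are placeholders):
# Get-BackupManifest -BackupDir './backups' | Export-Csv './backups/manifest.csv' -NoTypeInformation
```

Recomputing the hashes after the offsite copy and comparing them with the manifest confirms the transfer did not corrupt the files.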

4. Staging Environment Validation

  1. Update the Fedora image tag in staging (docker-compose.override.yml or environment variable) to the new release.
  2. Pull the image and restart staging services:
    docker compose pull fcrepo
    docker compose up -d fcrepo
    
  3. Run smoke tests:
    • Fedora REST API CRUD checks (create → read → delete test resources)
    • SPARQL queries, custom integrations, ingestion tool dry runs
    • Authentication via Cloudflare Worker.
  4. Monitor logs (docker compose logs -f fcrepo) for warnings or migration output.
  5. Validate dashboards in Grafana and confirm Prometheus scrape success.
  6. Capture sign-off in the change request once staging passes.
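
The create → read → delete check in step 3 can be scripted along the following lines. This is a sketch under stated assumptions, not the archive's actual test suite: the function name Test-FedoraCrud, the base URL http://localhost:8080/fcrepo/rest, and the unauthenticated default request are all assumptions, and a real deployment will likely need credentials or headers added to the request. It relies on -SkipHttpErrorCheck, which requires PowerShell 7 or later, and on Fedora answering 201 to a creating PUT, 200 to GET, and 204 to DELETE.

```powershell
# Hypothetical smoke test: creates, reads, and deletes one throwaway resource.
function Test-FedoraCrud {
    param(
        # Assumed staging endpoint; adjust to the real host and context path.
        [string]$BaseUrl = 'http://localhost:8080/fcrepo/rest',
        # Returns the HTTP status code for one request. Injectable so the
        # sequence can be exercised without a live server.
        [scriptblock]$Send = {
            param($Method, $Uri)
            (Invoke-WebRequest -Method $Method -Uri $Uri -SkipHttpErrorCheck).StatusCode
        }
    )
    # A unique path per run avoids colliding with earlier test resources.
    $target = "$BaseUrl/smoke-test-$([guid]::NewGuid().ToString('N'))"
    $steps = @(
        @{ Name = 'create'; Method = 'Put';    Expected = 201 },
        @{ Name = 'read';   Method = 'Get';    Expected = 200 },
        @{ Name = 'delete'; Method = 'Delete'; Expected = 204 }
    )
    foreach ($step in $steps) {
        $status = & $Send $step.Method $target
        [pscustomobject]@{
            Step   = $step.Name
            Status = $status
            Passed = ($status -eq $step.Expected)
        }
    }
}
```

Each run uses a fresh resource name because Fedora leaves a tombstone at a deleted path, which would block reuse of the same name.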

5. Production Deployment

  1. Schedule the maintenance window and notify stakeholders.
  2. Freeze ingest pipelines and ensure backups are current (repeat section 3 if needed).
  3. Update the production FEDORA_IMAGE value or docker-compose.yml tag to the new release.
  4. Redeploy using the automation script:
    pwsh -ExecutionPolicy Bypass -File .\quickstartdockerandcloudflared.ps1 -ForceRestartTunnel
    
  5. Watch logs in real time (docker compose logs -f fcrepo) and verify Fedora starts cleanly, with no errors or unexpected migration output.
  6. Run the production smoke test suite (API checks, ingestion dry run, Worker proxy test).
  7. Once healthy, lift the ingest freeze and inform stakeholders the system is live.
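
Between the redeploy in step 4 and the smoke tests in step 6, a polling loop can hold the procedure until Fedora responds instead of relying on a fixed delay. A minimal sketch, assuming a hypothetical helper named Wait-ForCondition; the endpoint in the commented example is an assumption and may need authentication.

```powershell
# Hypothetical helper: polls a probe until it returns a truthy value or the timeout expires.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory)][scriptblock]$Probe,
        [int]$TimeoutSeconds = 300,
        [int]$IntervalSeconds = 5
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        try {
            if (& $Probe) { return $true }
        } catch {
            # Probe errors (for example, connection refused while Tomcat starts)
            # are treated as "not ready yet".
        }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $IntervalSeconds
    }
}

# Example use (assumed URL; requires PowerShell 7+ for -SkipHttpErrorCheck):
# $ready = Wait-ForCondition -Probe {
#     (Invoke-WebRequest -Uri 'http://localhost:8080/fcrepo/rest' -Method Head -SkipHttpErrorCheck).StatusCode -eq 200
# }
# if (-not $ready) { throw 'Fedora did not become healthy within the timeout.' }
```

The probe always runs at least once, so a service that is already healthy returns immediately, and a timeout gives a clear signal to begin rollback rather than lifting the ingest freeze.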

6. Post-Deployment Tasks

  1. Update docs/development-roadmap.md and the changelog with the new Fedora version.
  2. Rotate any credentials exposed during testing (for example, temporary service accounts).
  3. Archive logs and metrics snapshots from the deployment window for audit.
  4. Close the change request with:
    • Start/end time
    • Validation steps performed
    • Any follow-up actions.
  5. Schedule a quick retrospective if incidents occurred.

7. Automation Opportunities

Keeping this runbook updated ensures every Fedora upgrade is predictable, auditable, and recoverable.