Michael J Wright Archive Documentation

Fedora Backend Update Runbook

This guide captures a repeatable process for upgrading the Fedora Repository container that powers the Michael J Wright archive.

Key principle: upgrades must be deliberate and reversible. Do not run Fedora from a floating tag.


1. Discovery & Change Control

  1. Subscribe to Fedora release announcements (GitHub releases or mailing list).
  2. When a new release is announced, record the change request in the roadmap backlog with:
    • Fedora version
    • Release notes link
    • Planned deployment window
    • Potential downstream impacts (metadata, RDF models, storage requirements).
  3. Assign an owner and confirm the staging and production dates.

2. Preflight Checks

  1. Review release notes for schema changes, migrations, or deprecations.
  2. Confirm compatibility with the current Postgres version and Tomcat base image.
  3. Update observability plans:
    • New metrics to scrape
    • Log format changes
    • Health endpoint adjustments.
  4. Notify stakeholders (curators, partner institutions) of the planned maintenance window.

Pinning check (required)
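
A pinning check can be as small as refusing any image reference that is not a digest or a full x.y.z version tag. The sketch below is one way to do that. The function name classify_image_ref and the rule it uses to decide what counts as floating are assumptions for illustration, not part of this project.

```shell
# Hypothetical helper (not part of the project): classify an image reference.
#   pinned   = digest reference (name@sha256:...) or a tag starting with x.y.z
#   floating = no tag, "latest", or a partial version such as 6-tomcat9
classify_image_ref() {
  ref="$1"
  case "$ref" in
    *@sha256:*) echo pinned; return 0 ;;
  esac
  # Only look for a tag after the last '/', so a registry port is not mistaken for one.
  last="${ref##*/}"
  case "$last" in
    *:*) tag="${last##*:}" ;;
    *)   echo floating; return 0 ;;
  esac
  if printf '%s\n' "$tag" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+([.-].*)?$'; then
    echo pinned
  else
    echo floating
  fi
}

# Usage (reads the value without sourcing .env):
#   classify_image_ref "$(grep -E '^FEDORA_IMAGE=' .env | cut -d= -f2-)"
```

Run it against FEDORA_IMAGE and POSTGRES_IMAGE before a deployment and stop if either prints floating.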


3. Backup Strategy

  1. Pause ingest jobs and integrations.

  2. Capture the currently running image digests (used for fast rollback):

    # IMPORTANT: do NOT `source .env` in bash here.
    # This project's .env is for Docker Compose and may contain characters bash treats as syntax.
    
    # Capture the immutable digests for the images actually running:
    docker image inspect "$(docker inspect -f '{{.Config.Image}}' mjw-fedora)" --format '{{index .RepoDigests 0}}'
    docker image inspect "$(docker inspect -f '{{.Config.Image}}' mjw-db)" --format '{{index .RepoDigests 0}}'
    
  3. (Recommended baseline) Pin current images by digest in .env so rollback is one command:

    cd ~/fedoraMJWArtist
    TS="$(date +%Y%m%d-%H%M%S)" && sudo cp .env ".env.bak.${TS}"
    
    FEDORA_DIGEST="$(docker image inspect "$(docker inspect -f '{{.Config.Image}}' mjw-fedora)" --format '{{index .RepoDigests 0}}')"
    POSTGRES_DIGEST="$(docker image inspect "$(docker inspect -f '{{.Config.Image}}' mjw-db)" --format '{{index .RepoDigests 0}}')"
    
    sudo sed -i "s|^FEDORA_IMAGE=.*|FEDORA_IMAGE=${FEDORA_DIGEST}|" .env
    sudo sed -i "s|^POSTGRES_IMAGE=.*|POSTGRES_IMAGE=${POSTGRES_DIGEST}|" .env
    
    docker compose pull mjw-fedora db
    docker compose up -d --force-recreate mjw-fedora db
    

    Notes:

    • After pinning by digest, docker ps may show images as fcrepo/fcrepo / postgres (without a tag). That is expected.
    • Your rollback handle is the .env.bak.* file plus the digests you recorded.
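  4. Back up the Docker volumes (Linux / Ubuntu). A tar of a live Postgres data directory can capture files mid-write, so for a consistent archive stop the mjw-db container before archiving and start it again afterwards: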

    cd ~/fedoraMJWArtist
    mkdir -p ./backups
    ts="$(date +%Y%m%d-%H%M%S)"
    
    # Compose may prefix volumes with the project name. Discover actual volume names.
    POSTGRES_VOL="$(docker volume ls --format '{{.Name}}' | grep -E 'postgres_data$' | head -n 1)"
    FCREPO_VOL="$(docker volume ls --format '{{.Name}}' | grep -E 'fcrepo_data$' | head -n 1)"
    
    echo "POSTGRES_VOL=$POSTGRES_VOL"
    echo "FCREPO_VOL=$FCREPO_VOL"
    
    docker run --rm -v "${POSTGRES_VOL}:/var/lib/postgresql/data" -v "$PWD/backups":/backup alpine:3.20 \
      sh -lc "cd /var/lib/postgresql && tar czf /backup/postgres_data_${ts}.tgz data"
    
    docker run --rm -v "${FCREPO_VOL}:/data" -v "$PWD/backups":/backup alpine:3.20 \
      sh -lc "cd / && tar czf /backup/fcrepo_data_${ts}.tgz data"
    
  5. (Azure recommended) Take a Managed Disk snapshot as an additional rollback layer:

    • Stop ingest and stop the containers (docker compose down).
    • Stop the VM.
    • Snapshot the disk(s) that hold Docker’s data (commonly the OS disk unless you’ve moved /var/lib/docker).
    • Start the VM, then proceed.
  6. Verify archives and move them to offsite storage.

  7. Document backup locations + image digests in the change request.
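
Verifying an archive (step 6) can be scripted. The following is a minimal sketch, assuming the .tgz files produced in step 4 live under ./backups and that sha256sum is available on the VM. The helper name verify_archive is hypothetical.

```shell
# Hypothetical helper: confirm a .tgz backup is non-empty and fully readable,
# then write a SHA-256 checksum file next to it for comparison after the offsite copy.
verify_archive() {
  archive="$1"
  [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }
  # Listing the archive makes tar decompress and read every member.
  tar -tzf "$archive" >/dev/null 2>&1 || { echo "unreadable: $archive" >&2; return 1; }
  # Record the checksum relative to the archive's own directory so it can be
  # re-checked wherever the files are copied.
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Usage:
#   for f in ./backups/*.tgz; do verify_archive "$f" || break; done
#   (cd ./backups && sha256sum -c *.sha256)
```

Running sha256sum -c again in the offsite location confirms the copies match what was verified on the VM.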


4. Staging Environment Validation

  1. Update the Fedora image tag in staging (.env FEDORA_IMAGE=...) to the new release.
  2. Pull the image and restart staging services:
    docker compose pull mjw-fedora
    docker compose up -d mjw-fedora
    
  3. Run smoke tests:
    • Fedora REST API CRUD checks (create → read → delete test resources)
    • SPARQL queries, custom integrations, ingestion tool dry runs
    • Authentication via Cloudflare Worker.
  4. Monitor logs (docker compose logs -f mjw-fedora) for warnings or migration output.
  5. Validate dashboards in Grafana and confirm Prometheus scrape success.
  6. Capture sign-off in the change request once staging passes.
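
The REST CRUD check in step 3 could be sketched with curl as follows. Everything environment-specific here is an assumption: FEDORA_BASE (for example a URL ending in /fcrepo/rest), FEDORA_USER and FEDORA_PASS are hypothetical variables, and the expected status codes reflect typical Fedora 6 behaviour (201 on create, 410 after delete until the tombstone is removed). Confirm them against your deployment before relying on the result.

```shell
# Hypothetical smoke test; FEDORA_BASE, FEDORA_USER, FEDORA_PASS are assumptions.

# usage: http_status METHOD URL  -> prints only the HTTP status code
http_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -u "${FEDORA_USER}:${FEDORA_PASS}" -X "$1" "$2"
}

# usage: expect_status EXPECTED METHOD URL
expect_status() {
  got="$(http_status "$2" "$3")"
  if [ "$got" = "$1" ]; then
    echo "ok   $2 $3 -> $got"
  else
    echo "FAIL $2 $3 -> $got (expected $1)" >&2
    return 1
  fi
}

# Create, read, and delete a throwaway container, then purge its tombstone.
fedora_crud_smoke() {
  res="${FEDORA_BASE}/smoke-test-$(date +%s)"
  expect_status 201 PUT    "$res" &&
  expect_status 200 GET    "$res" &&
  expect_status 204 DELETE "$res" &&
  expect_status 410 GET    "$res" &&
  expect_status 204 DELETE "$res/fcr:tombstone"
}

# Usage:
#   FEDORA_BASE=... FEDORA_USER=... FEDORA_PASS=... fedora_crud_smoke
```

The chain stops at the first unexpected status, so a non-zero exit code means staging should not be signed off.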

5. Production Deployment

  1. Schedule the maintenance window and notify stakeholders.
  2. Freeze ingest pipelines and ensure backups are current (repeat section 3 if needed).
  3. Update .env FEDORA_IMAGE to the new release (pin a specific version).
  4. Redeploy using the automation script:
    pwsh -ExecutionPolicy Bypass -File ./quickstartdockerandcloudflared.ps1 -ForceRestartTunnel
    
  5. Watch logs in real time and verify Fedora starts cleanly.
  6. Run the production smoke test suite (API checks, ingestion dry run, Worker proxy test).
  7. Once healthy, lift the ingest freeze and inform stakeholders the system is live.
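
Verifying that Fedora starts cleanly (step 5) usually means polling until it answers. A small retry helper is one way to do that; the sketch below is generic, and the name retry, the attempt count, and the endpoint in the usage comment are assumptions rather than project settings.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage: wait up to about five minutes for the REST root to respond.
# FEDORA_BASE, FEDORA_USER and FEDORA_PASS are assumed variables.
#   retry 30 10 curl -fsS -o /dev/null -u "$FEDORA_USER:$FEDORA_PASS" "$FEDORA_BASE"
```

Only lift the ingest freeze once this returns success and the smoke tests pass.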

5A. Concrete Upgrade / Rollback Commands (Ubuntu VM)

Use this section as the “copy/paste runbook” during the maintenance window.

Baseline: confirm pins and capture digests

cd ~/fedoraMJWArtist
grep -E '^(FEDORA_IMAGE|POSTGRES_IMAGE|PROMETHEUS_IMAGE|GRAFANA_IMAGE)=' .env

docker image inspect "$(docker inspect -f '{{.Config.Image}}' mjw-fedora)" --format '{{index .RepoDigests 0}}'
docker image inspect "$(docker inspect -f '{{.Config.Image}}' mjw-db)" --format '{{index .RepoDigests 0}}'

Verify the VM is actually running Fedora 6.5.1

This proves what the running container is using (tag and/or digest). It does not rely on what you intended via .env.

cd ~/fedoraMJWArtist

# What Compose is configured to use (from .env)
grep -E '^FEDORA_IMAGE=' .env

# What the running container was created from (image reference)
docker inspect -f '{{.Config.Image}}' mjw-fedora

# Immutable digest of the running container image (recommended “truth source”)
docker image inspect "$(docker inspect -f '{{.Config.Image}}' mjw-fedora)" --format '{{index .RepoDigests 0}}'
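
The comparison between the configured and the running image can be automated without sourcing .env. This is a sketch under two assumptions: each key appears once in .env, and values are unquoted (quotes are not stripped here). The helper names env_value and check_running_image are hypothetical.

```shell
# usage: env_value KEY [FILE]  -> prints the raw value of KEY from a Compose .env file
env_value() {
  key="$1"; file="${2:-.env}"
  # Keep everything after the first '=' so values containing '=' survive intact.
  grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-
}

# usage: check_running_image CONFIGURED RUNNING
check_running_image() {
  if [ "$1" = "$2" ]; then
    echo "match: $1"
  else
    echo "MISMATCH: .env=$1 running=$2" >&2
    return 1
  fi
}

# Usage:
#   check_running_image "$(env_value FEDORA_IMAGE)" \
#     "$(docker inspect -f '{{.Config.Image}}' mjw-fedora)"
```

A mismatch usually means .env was edited but the container was never recreated.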

Upgrade Fedora to a specific tag

Use a specific Fedora tag (example: fcrepo/fcrepo:6.5.1-tomcat9). Avoid 6-tomcat9 (floating) and avoid 7.0.0-* alpha tags unless you are explicitly testing.

cd ~/fedoraMJWArtist
TS="$(date +%Y%m%d-%H%M%S)" && sudo cp .env ".env.bak.${TS}"
sudo sed -i 's|^FEDORA_IMAGE=.*|FEDORA_IMAGE=fcrepo/fcrepo:6.5.1-tomcat9|' .env

docker compose pull mjw-fedora
docker compose up -d --force-recreate mjw-fedora
docker compose logs -n 200 mjw-fedora
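
One weakness of a bare sed edit is that it silently does nothing when the key is absent from .env. A guarded variant might look like the sketch below. The helper name set_env_key is hypothetical, it omits the sudo used above (add it if .env is root-owned), and it assumes the value contains neither | nor &, which holds for image references.

```shell
# Hypothetical helper: back up FILE, then replace KEY's value, failing if KEY is missing.
# usage: set_env_key KEY VALUE [FILE]
set_env_key() {
  key="$1"; value="$2"; file="${3:-.env}"
  grep -q "^${key}=" "$file" || { echo "${key} not found in ${file}" >&2; return 1; }
  # Same timestamped backup naming as the manual commands.
  cp "$file" "${file}.bak.$(date +%Y%m%d-%H%M%S)" || return 1
  tmp="${file}.tmp.$$"
  # Write through cat so the original file keeps its owner and permissions.
  sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage:
#   set_env_key FEDORA_IMAGE fcrepo/fcrepo:6.5.1-tomcat9
```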

Roll back (fast)

Pick one of the .env.bak.* files you created, restore it, and restart Fedora.

cd ~/fedoraMJWArtist
ls -1t .env.bak.* | head -n 5

sudo cp .env.bak.YYYYMMDD-HHMMSS .env
docker compose pull mjw-fedora
docker compose up -d --force-recreate mjw-fedora
docker compose logs -n 200 mjw-fedora

Roll back (data restore)

Use this if the upgrade wrote incompatible state or you need to return to known-good data.

cd ~/fedoraMJWArtist
bash ./scripts/restore-on-azure.sh

6. Post-Deployment Tasks

  1. Update docs/development-roadmap.md and changelog with the new version.
  2. Rotate any credentials exposed during testing (for example, temporary service accounts).
  3. Archive logs and metrics snapshots from the deployment window for audit.
  4. Close the change request with:
    • Start/end time
    • Validation steps performed
    • Any follow-up actions.
  5. Schedule a quick retrospective if incidents occurred.

7. Automation Opportunities


8. Rollback Paths (choose based on what changed)

A) Fast rollback (no migrations occurred)

  1. Set .env back to the previous FEDORA_IMAGE (or better: the previous digest you recorded).
  2. Redeploy:
    docker compose pull mjw-fedora
    docker compose up -d mjw-fedora
    

If Fedora performed migrations or wrote incompatible state, running the older image against the newer data may not be safe; use path B or C instead.

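B) Data rollback (restore volumes)

Restore the volume backups taken in section 3; see “Roll back (data restore)” in section 5A for the command.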
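
For orientation, a manual volume restore that mirrors the tar commands in section 3 might look like the sketch below. It is not the project's scripts/restore-on-azure.sh, whose contents are not shown in this runbook. The helper name restore_volume is hypothetical, it assumes the archive has a single top-level data directory (as the section 3 commands produce), and it must only be run with the containers stopped because it wipes the volume first.

```shell
# Hypothetical helper: replace the contents of a Docker volume with a section 3 archive.
# usage: restore_volume VOLUME_NAME ARCHIVE_PATH
restore_volume() {
  vol="$1"; archive="$2"
  [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
  dir="$(cd "$(dirname "$archive")" && pwd)"
  file="$(basename "$archive")"
  # Mount the volume at /restore/data so the archive's top-level "data"
  # directory unpacks straight into it. The archive name is passed as $1
  # to the inner shell to avoid quoting problems.
  docker run --rm -v "${vol}:/restore/data" -v "${dir}:/backup:ro" alpine:3.20 \
    sh -c 'rm -rf /restore/data/..?* /restore/data/.[!.]* /restore/data/* &&
           tar xzf "/backup/$1" -C /restore' sh "$file"
}

# Usage (containers stopped; volume name as discovered in section 3):
#   restore_volume "$FCREPO_VOL" ./backups/fcrepo_data_YYYYMMDD-HHMMSS.tgz
```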

C) Infrastructure rollback (Azure snapshot)

Restore the Managed Disk snapshot taken in section 3, step 5.


9. Notes on Postgres upgrades

Keep Fedora upgrades and Postgres major upgrades as separate change requests. Postgres 12 reached end of life in November 2024 and no longer receives fixes, so plan a dedicated Postgres upgrade window (dump/restore or logical replication) once Fedora is stable.
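
The dump half of a dump/restore upgrade could start from a logical dump taken inside the running database container. The following is a sketch only: the helper name dump_postgres is hypothetical, and the database user must be supplied by the operator because this runbook does not state it.

```shell
# Hypothetical helper: write a logical dump of every database in the container.
# usage: dump_postgres CONTAINER DB_USER OUTPUT_FILE
dump_postgres() {
  container="$1"; user="$2"; out="$3"
  # Dump to a .partial file first so a failed run never leaves a
  # truncated file under the final name.
  if docker exec "$container" pg_dumpall -U "$user" > "${out}.partial"; then
    mv "${out}.partial" "$out"
  else
    rm -f "${out}.partial"
    echo "pg_dumpall failed for $container" >&2
    return 1
  fi
}

# Usage (replace <db-user> with the actual Postgres user):
#   dump_postgres mjw-db <db-user> "./backups/pg_dumpall_$(date +%Y%m%d-%H%M%S).sql"
```

Unlike the volume tarball, a logical dump can be loaded into a newer Postgres major version.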

Keeping this runbook updated ensures every Fedora upgrade is predictable, auditable, and recoverable.