Security Architecture Overview
Michael J Wright Digital Archive - Production Security Model
Architecture Summary
The Michael J Wright Digital Archive is deployed with a defense-in-depth security architecture: the origin server has no direct public exposure, browser users authenticate through Azure AD SSO, and services authenticate with short-lived JWTs.
Internet Users
↓
Cloudflare (CDN + Access + Zero Trust)
↓
Azure AD SSO Authentication
↓
┌─────────────────────────────────────────┐
│ Cloudflare Workers (Edge Security) │
│ • Frontend (Pages) │
│ • Manage Worker (JWT Protected) │
│ • Submit-Ingest Worker (JWT Protected) │
│ • API Worker (JWT Protected) │
└─────────────────────────────────────────┘
↓
Cloudflare Tunnel (Encrypted, Outbound-Only)
↓
┌─────────────────────────────────────────┐
│ Azure VM (No Public IP) │
│ • Docker Host │
│ • Fedora Repository (Containerized) │
│ • PostgreSQL (Containerized) │
│ • Private Network Only │
└─────────────────────────────────────────┘
Layer 1: Network Isolation
No Public IP Address
The Azure VM hosting the Fedora repository has no public IP address:
- Cannot be reached directly from the internet
- No SSH exposure
- No HTTP/HTTPS listener on public interfaces
- All inbound connections rejected at Azure NSG level
Cloudflare Tunnel (Cloudflared)
- Outbound-only connection from VM to Cloudflare edge
- VM initiates persistent outbound connections to the Cloudflare edge
- No inbound ports required on firewall
- Traffic between the VM and the Cloudflare edge encrypted with TLS 1.3
- Tunnel authenticates using the credentials file stored on VM
- If tunnel disconnects, repository becomes unreachable (fail-secure)
Configuration:
# /etc/cloudflared/config.yml
tunnel: <tunnel-id>
credentials-file: /etc/cloudflared/<tunnel-id>.json
ingress:
  - hostname: fcrepo.michaeljwright.com.au
    service: http://localhost:8080
  - service: http_status:404
Layer 2: Azure AD Single Sign-On
Cloudflare Access (Zero Trust)
All user-facing endpoints are protected by Cloudflare Access with Azure AD as the identity provider:
Protected Domains:
- submit.michaeljwright.com.au - Content submission portal
- manage.michaeljwright.com.au - Archive management interface
- api.data.michaeljwright.com.au - REST API proxy
Authentication Flow:
- User requests protected URL
- Cloudflare Access intercepts request
- User redirected to Azure AD login
- After successful authentication, Cloudflare issues JWT
- JWT included in Cf-Access-Jwt-Assertion header
- Worker validates JWT before serving content
Access Policy:
- Identity Provider: Azure AD (Microsoft Entra ID)
- Allowed Users: Configured email addresses or Azure AD groups
- Session Duration: Configurable (default: 24 hours)
- Multi-factor authentication: Enforced via Azure AD
Layer 3: Worker-Level JWT Authentication
Dual Authentication Model
Each Cloudflare Worker implements dual authentication:
- Browser Users: Cloudflare Access JWT (Cf-Access-Jwt-Assertion header)
- Service-to-Service: HS256 JWT (Authorization: Bearer header)
Cloudflare Access JWT Validation
Implementation: workers/shared/auth.js
export async function validateCloudflareAccess(request, env) {
  const token = request.headers.get('Cf-Access-Jwt-Assertion');
  // Decode and validate JWT
  // - Check signature (optional JWKS verification)
  // - Verify expiration (exp claim)
  // - Verify not-before (nbf claim)
  // - Enforce audience (ACCESS_AUD)
  // - Extract user identity (email, groups)
  return { user, error: null, status: 200 };
}
Security Features:
- Expiration time validation
- Not-before time validation
- Optional signature verification via Cloudflare Access JWKS
- Optional audience tag enforcement
- User identity extraction (email, name, groups)
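The claim checks listed above can be illustrated with a minimal sketch. It is not the code in workers/shared/auth.js: `decodeJwtSegment` and `checkAccessClaims` are hypothetical names, the `{ user, error, status }` return shape is borrowed from the outline above, and the use of 401 versus 403 for each failure is an assumption. The sketch deliberately skips signature verification, so on its own it only shows the exp, nbf, and aud logic.

```javascript
// Hypothetical helper: decode one base64url JWT segment into a JSON object.
function decodeJwtSegment(segment) {
  const base64 = segment.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Checks exp, nbf, and (when expectedAud is set) aud.
// Assumption: signature verification happens elsewhere; this does NOT do it.
function checkAccessClaims(token, expectedAud, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = (token || '').split('.');
  if (parts.length !== 3) return { user: null, error: 'Malformed token', status: 401 };

  let claims;
  try {
    claims = decodeJwtSegment(parts[1]);
  } catch {
    return { user: null, error: 'Undecodable payload', status: 401 };
  }

  if (typeof claims.exp !== 'number' || claims.exp <= nowSeconds) {
    return { user: null, error: 'Token expired', status: 401 };
  }
  if (typeof claims.nbf === 'number' && claims.nbf > nowSeconds) {
    return { user: null, error: 'Token not yet valid', status: 401 };
  }
  if (expectedAud) {
    // Access tokens may carry aud as a string or an array of strings.
    const aud = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
    if (!aud.includes(expectedAud)) {
      return { user: null, error: 'Audience mismatch', status: 403 };
    }
  }
  return { user: { email: claims.email, groups: claims.groups || [] }, error: null, status: 200 };
}
```

Passing `env.ACCESS_AUD` as `expectedAud` would enforce the audience tag, while an empty string leaves that check off, which mirrors the "optional" wording above.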
Service-to-Service JWT (HS256)
For automation and inter-worker communication:
Token Generation:
const token = await generateServiceJwt(env, {
  sub: 'automation',
  name: 'Archive Automation',
}, {
  ttlSeconds: 300 // 5 minutes
});
Token Validation:
const result = await verifyServiceJwt(token, env);
// Validates:
// - Signature (HMAC-SHA256)
// - Issuer (iss claim)
// - Audience (aud claim)
// - Expiration (exp claim)
// - Not-before (nbf claim)
Configuration:
- Secret: SERVICE_JWT_SECRET (Wrangler secret, shared across workers)
- Algorithm: HS256 (HMAC-SHA256)
- Default TTL: 5 minutes
- Issuer/Audience: Configured per worker in wrangler.toml
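To make the HS256 flow concrete, here is a hedged sketch of signing and verifying a service token with the Web Crypto API. It is an illustration, not the project's `generateServiceJwt` / `verifyServiceJwt`: the `sketch*` function names, the `{ user, error, status }` return shape, and the choice to read `iss` and `aud` from `env.SERVICE_JWT_ISSUER` and `env.SERVICE_JWT_AUDIENCE` at signing time are all assumptions.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Workers and recent Node expose Web Crypto globally; the dynamic import is
// only a fallback for trying this sketch on older Node versions.
async function getSubtle() {
  return globalThis.crypto?.subtle ?? (await import('node:crypto')).webcrypto.subtle;
}

// base64url helpers (no padding), as used by JWT.
function toBase64Url(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary).replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}
function fromBase64Url(text) {
  const base64 = text.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  return Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
}

async function hmacKey(secret) {
  const subtle = await getSubtle();
  return subtle.importKey(
    'raw', encoder.encode(secret), { name: 'HMAC', hash: 'SHA-256' }, false, ['sign', 'verify']
  );
}

// Hypothetical signer: adds iss, aud, iat, nbf, exp to the caller's claims.
async function sketchGenerateServiceJwt(env, claims, { ttlSeconds = 300 } = {}) {
  const now = Math.floor(Date.now() / 1000);
  const payload = {
    ...claims,
    iss: env.SERVICE_JWT_ISSUER,
    aud: env.SERVICE_JWT_AUDIENCE,
    iat: now, nbf: now, exp: now + ttlSeconds,
  };
  const signingInput =
    toBase64Url(encoder.encode(JSON.stringify({ alg: 'HS256', typ: 'JWT' }))) + '.' +
    toBase64Url(encoder.encode(JSON.stringify(payload)));
  const subtle = await getSubtle();
  const key = await hmacKey(env.SERVICE_JWT_SECRET);
  const signature = await subtle.sign('HMAC', key, encoder.encode(signingInput));
  return signingInput + '.' + toBase64Url(new Uint8Array(signature));
}

// Hypothetical verifier: signature first, then iss, aud, exp, nbf.
async function sketchVerifyServiceJwt(token, env) {
  const fail = (error) => ({ user: null, error, status: 401 });
  const parts = (token || '').split('.');
  if (parts.length !== 3) return fail('Malformed token');
  try {
    const header = JSON.parse(decoder.decode(fromBase64Url(parts[0])));
    if (header.alg !== 'HS256') return fail('Unexpected algorithm');
    const subtle = await getSubtle();
    const key = await hmacKey(env.SERVICE_JWT_SECRET);
    const valid = await subtle.verify(
      'HMAC', key, fromBase64Url(parts[2]), encoder.encode(parts[0] + '.' + parts[1])
    );
    if (!valid) return fail('Bad signature');
    const claims = JSON.parse(decoder.decode(fromBase64Url(parts[1])));
    const now = Math.floor(Date.now() / 1000);
    if (claims.iss !== env.SERVICE_JWT_ISSUER) return fail('Issuer mismatch');
    if (claims.aud !== env.SERVICE_JWT_AUDIENCE) return fail('Audience mismatch');
    if (typeof claims.exp !== 'number' || claims.exp <= now) return fail('Token expired');
    if (typeof claims.nbf === 'number' && claims.nbf > now) return fail('Token not yet valid');
    return { user: { sub: claims.sub, name: claims.name }, error: null, status: 200 };
  } catch {
    return fail('Undecodable token');
  }
}
```

Rejecting any `alg` other than HS256 before verifying keeps a caller from downgrading the token to an unsigned or differently signed form, and checking the signature before reading the claims means untrusted claim values are never acted on.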
Unified Authentication Middleware
Implementation:
export async function authenticateRequest(request, env) {
  // 1. Check for Authorization: Bearer (service token)
  const authHeader = request.headers.get('Authorization');
  if (authHeader && authHeader.startsWith('Bearer ')) {
    // Strip the "Bearer " prefix to get the raw token
    const token = authHeader.slice('Bearer '.length).trim();
    return await verifyServiceJwt(token, env);
  }
  // 2. Fallback to Cloudflare Access (browser)
  return await validateCloudflareAccess(request, env);
}
All API endpoints in manage, submit-ingest, and api workers require authentication via this middleware.
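One way a worker could apply the middleware uniformly is a small wrapper around each route handler. The sketch below is illustrative only: `withAuth` is a hypothetical helper, the `authenticate` parameter stands in for `authenticateRequest`, and the `{ user, error, status }` result shape is assumed from the snippets above.

```javascript
// Hypothetical wrapper: runs `authenticate` first and only calls `handler`
// when authentication produced a user and no error.
function withAuth(handler, authenticate) {
  return async (request, env) => {
    const auth = await authenticate(request, env);
    if (auth.error || !auth.user) {
      // Fall back to 401 if the authenticator did not supply an error status.
      const status = auth.status && auth.status >= 400 ? auth.status : 401;
      return new Response(JSON.stringify({ error: auth.error || 'Unauthorized' }), {
        status,
        headers: { 'Content-Type': 'application/json' },
      });
    }
    // The authenticated identity is passed on as a third argument.
    return handler(request, env, auth.user);
  };
}
```

Because the handler is never invoked on failure, a route cannot serve content by accident when the authentication call is forgotten inside the handler body.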
Layer 4: Container Isolation
Docker Security
Fedora Repository and PostgreSQL run in isolated Docker containers:
Network Isolation:
# docker-compose.yml
networks:
  fedora-net:
    driver: bridge
    internal: false # Only fedora container needs external access
Container Security:
- Fedora runs as non-root user inside container
- PostgreSQL accessible only via Docker network
- Volume mounts restricted to necessary data directories
- Resource limits enforced (CPU, memory)
Container Images:
- fcrepo/fcrepo:6.5.0 - Official Fedora image
- postgres:15-alpine - Official PostgreSQL (minimal Alpine base)
Fedora Basic Authentication
Fedora itself is protected by HTTP Basic Authentication:
Configuration:
<!-- config/tomcat-users.xml -->
<user username="curator1"
password="<strong-password>"
roles="fedoraUser,fedoraAdmin"/>
Worker Access:
- Workers authenticate to Fedora using FEDORA_BASIC_AUTH secret
- Secret stored in Wrangler (encrypted at rest)
- Never exposed to browser clients
- Format: Basic base64(username:password)
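As a small illustration of the format, the value stored in FEDORA_BASIC_AUTH could be produced like this. `buildBasicAuthValue` is a hypothetical helper, not part of the repository; the credentials in the usage line are the well-known example from RFC 7617, not real ones.

```javascript
// Builds a "Basic <base64(username:password)>" header value.
// The pair is UTF-8 encoded first because btoa() only accepts Latin-1 input.
function buildBasicAuthValue(username, password) {
  const bytes = new TextEncoder().encode(`${username}:${password}`);
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return 'Basic ' + btoa(binary);
}

// Example with the RFC 7617 sample credentials:
console.log(buildBasicAuthValue('Aladdin', 'open sesame'));
// → Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

A worker would then send the stored value unchanged as the Authorization header on its requests to Fedora, so the username and password never need to appear in worker code.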
Layer 5: Data Protection
Encryption in Transit
- Browser ↔ Cloudflare: TLS 1.3 (HTTPS)
- Cloudflare ↔ Workers: Internal Cloudflare network (encrypted)
- Workers ↔ VM: Cloudflare Tunnel with TLS 1.3
- VM ↔ Fedora: HTTP on localhost only (Docker network)
Encryption at Rest
- Azure VM Disk: Azure managed disk encryption enabled
- PostgreSQL Data: Database files on encrypted volume
- Fedora Binaries: Object store on encrypted volume
Secrets Management
Cloudflare Workers Secrets:
# Set via Wrangler CLI (encrypted in Cloudflare)
npx wrangler secret put FEDORA_BASIC_AUTH
npx wrangler secret put SERVICE_JWT_SECRET
Secrets are:
- Encrypted at rest in Cloudflare
- Available only to worker runtime
- Never logged or exposed in responses
- Rotatable without code changes
Security Testing
Automated Security Validation
Test Script: scripts/test-worker-bypass.ps1
Validates that API endpoints are properly protected:
# Test 1: Attempt unauthenticated API access
Invoke-WebRequest https://manage.michaeljwright.com.au/api/items
# Expected: 401 Unauthorized
# Test 2: Attempt unauthenticated submission
Invoke-WebRequest https://submit.michaeljwright.com.au/api/submit -Method POST
# Expected: 401 Unauthorized
Expected Results:
- ✅ All API endpoints return 401/403 without valid authentication
- ✅ Browser users redirected to Azure AD login
- ✅ Valid service JWTs accepted
- ✅ Expired tokens rejected
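The same bypass check can be sketched in JavaScript for environments without PowerShell. This is not a port of scripts/test-worker-bypass.ps1: `checkEndpointsRejectAnonymous` is a hypothetical function, and treating a 3xx redirect as "blocked" is an assumption based on the expected redirect to the Azure AD login.

```javascript
// Sends unauthenticated requests and reports whether each one was blocked.
// `fetchImpl` is injectable so the check can be exercised without network access.
async function checkEndpointsRejectAnonymous(endpoints, fetchImpl = fetch) {
  const results = [];
  for (const { url, method = 'GET' } of endpoints) {
    // redirect: 'manual' keeps a login redirect visible as a 3xx status.
    const response = await fetchImpl(url, { method, redirect: 'manual' });
    const blocked =
      [401, 403].includes(response.status) ||
      (response.status >= 300 && response.status < 400);
    results.push({ url, method, status: response.status, blocked });
  }
  return results;
}
```

Called with the two endpoints from the PowerShell tests (the GET on /api/items and the POST on /api/submit), any entry with `blocked: false` would indicate an endpoint that answered an anonymous request.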
JWT Token Testing
Generate Service Token:
node scripts/generate-service-jwt.mjs `
--secret "<SERVICE_JWT_SECRET>" `
--iss mjw-manage `
--aud mjw-submit-ingest `
--sub automation
Test API with Token:
node scripts/test-service-auth.mjs `
--url "https://submit.michaeljwright.com.au/api/validate-catalog-id?id=TEST" `
--token <TOKEN>
Access Control Matrix
| Resource | Public Access | Authenticated Users | Service Tokens | Notes |
|---|---|---|---|---|
| Frontend (Pages) | ❌ No | ✅ Yes (Azure AD) | N/A | Read-only documentation |
| Manage Worker | ❌ No | ✅ Yes (Azure AD) | ✅ Yes (HS256) | Edit/delete operations |
| Submit Worker | ❌ No | ✅ Yes (Azure AD) | ✅ Yes (HS256) | Content ingestion |
| API Worker | ❌ No | ✅ Yes (Azure AD) | ✅ Yes (HS256) | Fedora proxy |
| Fedora (Direct) | ❌ No | ❌ No | ❌ No | No public access |
| Azure VM | ❌ No | ❌ No | ❌ No | No public IP |
| PostgreSQL | ❌ No | ❌ No | ❌ No | Docker network only |
Incident Response
Compromise Scenarios
If SERVICE_JWT_SECRET is compromised:
- Rotate secret immediately via Wrangler in every worker that shares it
- Deploy updated workers
- All service tokens signed with the old secret are rejected once the new secret is live
- Browser sessions unaffected (use different auth path)
If Cloudflare Tunnel credentials are compromised:
- Revoke tunnel in Cloudflare dashboard
- Generate new tunnel credentials
- Update VM configuration
- Restart cloudflared service
If FEDORA_BASIC_AUTH is compromised:
- Update config/tomcat-users.xml on VM
- Restart Fedora container
- Update Wrangler secrets in all workers
- Deploy workers
If Azure AD is compromised:
- Revoke user sessions in Cloudflare Access
- Reset passwords in Azure AD
- Review Access audit logs
- Verify that MFA is enforced for every account
Monitoring & Logging
Cloudflare Access Logs:
- All authentication attempts logged
- Failed login attempts tracked
- Unusual access patterns alerted
Worker Logs:
# Real-time worker logs
npx wrangler tail
# Filter for failed invocations (uncaught errors); 401/403 responses count as successful invocations
npx wrangler tail --status error
Fedora Access Logs:
- All API requests logged in container
- View with:
docker logs fedora
Security Hardening Checklist
Required (Already Implemented)
- ✅ No public IP on Azure VM
- ✅ Cloudflare Tunnel for inbound connectivity
- ✅ Azure AD SSO via Cloudflare Access
- ✅ JWT authentication on all worker endpoints
- ✅ Fedora Basic Auth for repository access
- ✅ HTTPS/TLS 1.3 for all external connections
- ✅ Docker container isolation
- ✅ Secrets encrypted in Wrangler
Recommended Enhancements
- ⚠️ Enable Access JWKS verification: Set ACCESS_JWKS_URL in worker config
- ⚠️ Set Access audience tags: Configure ACCESS_AUD for stricter validation
- ⚠️ Regular secret rotation: Rotate SERVICE_JWT_SECRET quarterly
- ⚠️ MFA enforcement: Require MFA in Azure AD policies
- ⚠️ Rate limiting: Configure Cloudflare rate limiting rules
- ⚠️ WAF rules: Enable Cloudflare Web Application Firewall
- ⚠️ Audit logging: Enable Azure VM diagnostics and Cloudflare audit logs
- ⚠️ Backup encryption: Ensure Fedora backups are encrypted
Optional (Advanced)
- 🔹 Implement role-based access control (RBAC) using Azure AD groups
- 🔹 Add Content Security Policy (CSP) headers to frontend
- 🔹 Enable Cloudflare Bot Management
- 🔹 Implement API request signing for sensitive operations
- 🔹 Add honeypot endpoints to detect scanning attempts
Configuration Reference
Worker Environment Variables
Manage Worker (workers/manage/wrangler.toml):
[vars]
FEDORA_URL = "https://fcrepo.michaeljwright.com.au"
ACCESS_AUD = "" # Optional: Cloudflare Access audience
ACCESS_JWKS_URL = "" # Optional: https://<team>.cloudflareaccess.com/cdn-cgi/access/certs
SERVICE_JWT_ISSUER = "mjw-manage"
SERVICE_JWT_AUDIENCE = "mjw-manage"
Submit-Ingest Worker (workers/submit-ingest/wrangler.toml):
[vars]
FEDORA_URL = "https://fcrepo.michaeljwright.com.au"
MAX_FILE_SIZE = "52428800"
MAX_FILES = "100"
ACCESS_AUD = ""
ACCESS_JWKS_URL = ""
SERVICE_JWT_ISSUER = "mjw-manage"
SERVICE_JWT_AUDIENCE = "mjw-submit-ingest"
Wrangler Secrets (Set Per Worker)
# Navigate to worker directory
cd workers/manage # or workers/submit-ingest
# Set secrets
npx wrangler secret put FEDORA_BASIC_AUTH
npx wrangler secret put SERVICE_JWT_SECRET
FEDORA_BASIC_AUTH Format:
Basic <base64(username:password)>
SERVICE_JWT_SECRET:
- Minimum 32 characters
- Cryptographically random
- Same value across all workers
- Rotate quarterly
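A value meeting these requirements could be generated with a few lines of Node. This is a sketch under assumptions: `generateServiceSecret` is a hypothetical helper, it is meant to run locally rather than inside a worker, and base64url is just one reasonable encoding for the random bytes.

```javascript
// Returns a cryptographically random secret, base64url encoded.
// 32 random bytes encode to 43 characters, above the 32-character minimum.
// The dynamic import keeps the snippet usable from both CommonJS and ES module scripts.
async function generateServiceSecret(byteLength = 32) {
  const { randomBytes } = await import('node:crypto');
  return randomBytes(byteLength).toString('base64url');
}

generateServiceSecret().then((secret) => console.log(secret.length));
```

The printed value can be pasted into the `npx wrangler secret put SERVICE_JWT_SECRET` prompt for each worker, so that all of them end up with the same secret.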
Compliance & Best Practices
OWASP Top 10 Coverage
| Risk | Mitigation |
|---|---|
| A01: Broken Access Control | Azure AD SSO + JWT authentication on all endpoints |
| A02: Cryptographic Failures | TLS 1.3, encrypted secrets, encrypted Azure disks |
| A03: Injection | Parameterized Fedora API calls, input validation |
| A04: Insecure Design | Defense-in-depth, no public IP, fail-secure defaults |
| A05: Security Misconfiguration | Minimal attack surface, container isolation, principle of least privilege |
| A06: Vulnerable Components | Official Docker images, regular updates |
| A07: Authentication Failures | Azure AD MFA, short-lived JWTs, session management |
| A08: Software/Data Integrity | Signed worker deployments, encrypted backups |
| A09: Logging/Monitoring Failures | Cloudflare Access logs, worker tail logs, Fedora access logs |
| A10: SSRF | Workers only connect to known Fedora endpoint via tunnel |
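The A10 mitigation can be illustrated with a guard that refuses to build an upstream URL outside the configured Fedora origin. This is a hedged sketch, not the workers' actual proxy code: `buildFedoraUrl` is a hypothetical helper and the `rest/items/1` path is only an example.

```javascript
// Resolves `resourcePath` against the configured Fedora base URL and rejects
// anything that would leave that origin (for example an absolute URL
// supplied by a client).
function buildFedoraUrl(fedoraBase, resourcePath) {
  const base = new URL(fedoraBase);
  const baseHref = base.href.endsWith('/') ? base.href : base.href + '/';
  // Leading slashes are stripped so "//host/path" cannot act as a
  // protocol-relative URL.
  const target = new URL(resourcePath.replace(/^\/+/, ''), baseHref);
  if (target.origin !== base.origin) {
    throw new Error('Refusing to call unexpected origin: ' + target.origin);
  }
  return target;
}
```

With `env.FEDORA_URL` as the base, a client-supplied path such as `https://evil.example/x` raises an error instead of being fetched, while ordinary relative paths resolve under the Fedora hostname.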
Zero Trust Principles
- Never Trust, Always Verify: Every request authenticated (browser or service)
- Least Privilege Access: Users only access needed resources via Access policies
- Assume Breach: Multiple security layers (network, authentication, authorization)
- Verify Explicitly: JWT signature and claims validation on every request
- Minimize Blast Radius: Container isolation, network segmentation, short-lived tokens
Summary
The Michael J Wright Digital Archive employs a multi-layered security architecture:
- Network Layer: No public IP, Cloudflare Tunnel with outbound-only connection
- Identity Layer: Azure AD SSO via Cloudflare Access for all user access
- Application Layer: JWT authentication (Access + service tokens) on all worker endpoints
- Container Layer: Docker isolation for Fedora and PostgreSQL
- Data Layer: TLS 1.3 in transit, encrypted disks at rest
Attack Surface: No direct public exposure of the origin. All access mediated through:
- Cloudflare's global edge network
- Azure AD authentication
- Cryptographically signed JWTs
- Fedora Basic Auth (worker-only)
Authentication Flow: Internet → Cloudflare Access (Azure AD) → Worker JWT validation → Cloudflare Tunnel (encrypted) → Fedora (localhost)
This architecture eliminates common attack vectors (direct SSH, exposed databases, unauthenticated APIs) while providing seamless authenticated access for authorized users and services.
Last Updated: November 7, 2025
Security Contact: Archive Administrator
Review Schedule: Quarterly