gps-denied-desktop/_docs/00_templates/rollback_strategy.md
Oleksandr Bezdieniezhnykh fd75243a84 more detailed SDLC plan
2025-12-10 19:05:17 +02:00


Rollback Strategy Template

Overview

| Field | Value |
|-------|-------|
| Service/Component | [Name] |
| Last Updated | [YYYY-MM-DD] |
| Owner | [Team/Person] |
| Max Rollback Time | [Target: X minutes] |

Rollback Triggers

Automatic Rollback Triggers

  • More than 3 consecutive health check failures
  • Error rate > 10% for 5 minutes
  • P99 latency > 2x baseline for 5 minutes
  • Critical alert triggered
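
The automatic triggers above can be sketched as a single evaluation function. This is a minimal illustration, not a prescribed implementation: the `HealthSnapshot` type and its field names are hypothetical, and it assumes the monitoring system already reports error rate and P99 latency as values sustained over the 5-minute window.

```python
from dataclasses import dataclass


@dataclass
class HealthSnapshot:
    """Hypothetical metrics sample; all field names are illustrative."""
    consecutive_health_failures: int
    error_rate: float               # fraction of failed requests, sustained for 5 minutes
    p99_latency_ms: float           # sustained for 5 minutes
    baseline_p99_latency_ms: float
    critical_alert: bool


def rollback_reasons(s: HealthSnapshot) -> list[str]:
    """Return the automatic triggers that fired; an empty list means no rollback."""
    reasons = []
    if s.consecutive_health_failures > 3:
        reasons.append("health check failures")
    if s.error_rate > 0.10:
        reasons.append("error rate")
    if s.p99_latency_ms > 2 * s.baseline_p99_latency_ms:
        reasons.append("p99 latency")
    if s.critical_alert:
        reasons.append("critical alert")
    return reasons


snapshot = HealthSnapshot(4, 0.02, 180.0, 100.0, False)
print(rollback_reasons(snapshot))  # ['health check failures']
```

Returning the list of reasons rather than a boolean keeps the rollback message ("Initiating rollback due to [reason]") easy to fill in.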

Manual Rollback Triggers

  • User-reported critical bug
  • Data corruption detected
  • Security vulnerability discovered
  • Stakeholder decision

Pre-Rollback Checklist

  • Incident acknowledged and documented
  • Stakeholders notified of rollback decision
  • Current state captured (logs, metrics snapshot)
  • Rollback target version identified
  • Database state assessed (migrations reversible?)

Rollback Procedures

Application Rollback

Option 1: Revert Deployment (Preferred)

# Using CI/CD
# Trigger previous successful deployment

# Manual (if needed): create a new commit that undoes the faulty change
git revert --no-edit <commit-hash>   # for a merge commit, add: -m 1
git push origin main

Option 2: Blue-Green Switch

# Switch traffic to previous version
# [Platform-specific commands]

Option 3: Feature Flag Disable

# Disable feature flag
# [Feature flag system commands]

Database Rollback

If Migration is Reversible

# Run down migration
# [Migration tool command]

If Migration is NOT Reversible

  1. Restore from backup
  2. Point-in-time recovery to the pre-deployment state
  3. WARNING: May cause data loss; requires approval before proceeding
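
The two database cases can be expressed as a small decision helper. This is a sketch under stated assumptions: the function name and the returned action labels are hypothetical and would map to whatever migration and backup tooling the service actually uses.

```python
def choose_database_rollback(migration_reversible: bool, restore_approved: bool) -> str:
    """Pick a database rollback path (action labels are illustrative)."""
    if migration_reversible:
        # Reversible migrations are rolled back with the down migration.
        return "run_down_migration"
    if restore_approved:
        # Irreversible migrations need a restore, which may lose data.
        return "restore_from_backup"
    # Never restore without explicit approval.
    return "await_approval"


print(choose_database_rollback(migration_reversible=False, restore_approved=False))
# await_approval
```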

Configuration Rollback

# Restore previous configuration
# [Config management commands]

Post-Rollback Verification

Immediate (0-5 minutes)

  • Service responding to health checks
  • No error spikes in logs
  • Basic functionality verified

Short-term (5-30 minutes)

  • All critical paths functional
  • Error rate returned to baseline
  • Performance metrics normal

Extended (30-60 minutes)

  • No delayed issues appearing
  • User reports resolved
  • All alerts cleared
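
The three verification windows above could drive a simple lookup that tells the on-call engineer which checks are due. This is an illustrative sketch: the `VERIFICATION_PHASES` structure and `checks_due` function are hypothetical, and the windows are treated as half-open, so minute 5 belongs to the short-term phase.

```python
# (start_minute, end_minute, checks) for each verification phase
VERIFICATION_PHASES = [
    (0, 5, [
        "Service responding to health checks",
        "No error spikes in logs",
        "Basic functionality verified",
    ]),
    (5, 30, [
        "All critical paths functional",
        "Error rate returned to baseline",
        "Performance metrics normal",
    ]),
    (30, 60, [
        "No delayed issues appearing",
        "User reports resolved",
        "All alerts cleared",
    ]),
]


def checks_due(minutes_since_rollback: float) -> list[str]:
    """Return the checks for the phase containing the elapsed time, or [] if none."""
    for start, end, checks in VERIFICATION_PHASES:
        if start <= minutes_since_rollback < end:
            return checks
    return []


print(checks_due(12)[0])  # All critical paths functional
```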

Communication Plan

During Rollback

| Audience | Message | Channel |
|----------|---------|---------|
| Engineering | "Initiating rollback due to [reason]" | Slack |
| Stakeholders | "Service issue detected, rollback in progress" | Email |
| Users | "We're aware of issues and working on a fix" | Status page |

After Rollback

| Audience | Message | Channel |
|----------|---------|---------|
| Engineering | "Rollback complete, monitoring" | Slack |
| Stakeholders | "Service restored, post-mortem scheduled" | Email |
| Users | "Issue resolved, service fully operational" | Status page |

Known Limitations

Cannot Roll Back If:

  • Database migration deleted columns with data
  • External API contracts changed
  • Third-party integrations updated
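
The first limitation, a migration that deleted columns holding data, can sometimes be caught before deployment with a text scan of the migration. The sketch below is a heuristic only, assuming migrations are plain SQL; the function name and the pattern list are hypothetical, and a match means "review manually", not "definitely irreversible".

```python
import re

# Statements that typically discard data and cannot be undone by a down migration.
DESTRUCTIVE_SQL = re.compile(r"\b(DROP\s+(COLUMN|TABLE)|TRUNCATE)\b", re.IGNORECASE)


def looks_destructive(migration_sql: str) -> bool:
    """Heuristic: True if the migration text contains a data-discarding statement."""
    return DESTRUCTIVE_SQL.search(migration_sql) is not None


print(looks_destructive("ALTER TABLE users DROP COLUMN email;"))      # True
print(looks_destructive("ALTER TABLE users ADD COLUMN email text;"))  # False
```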

Partial Rollback Scenarios

  • When only specific components are affected
  • When data migration is complex

Recovery After Rollback

Investigation

  1. Collect all relevant logs
  2. Identify root cause
  3. Document findings

Re-deployment Planning

  1. Fix identified in development
  2. Additional tests added
  3. Staged rollout planned
  4. Monitoring enhanced

Rollback Testing

Test Schedule

  • Monthly rollback drill
  • After major infrastructure changes
  • Before critical releases

Test Scenarios

  1. Application rollback
  2. Database rollback (in staging)
  3. Configuration rollback

Contacts

| Role | Name | Contact |
|------|------|---------|
| On-call | | |
| Database Admin | | |
| Platform Team | | |