High Availability Design.
Design for five-nines availability.
Five-nines (99.999%) availability means less than 5 minutes downtime per year. This requires: no single points of failure, automated failover, and graceful degradation.
Every component needs redundancy: multiple instances behind load balancers, database replication with automatic failover, and redundant network paths.
Test failures regularly. Chaos engineering—intentionally causing failures—validates that redundancy works. Untested redundancy provides false confidence.
LOAD BALANCING
Distribute traffic across multiple instances. Health checks remove failed instances automatically.
DATABASE REPLICATION
Synchronous replication for zero data loss. Automatic failover to replica on primary failure.
CIRCUIT BREAKERS
Prevent cascade failures. Fail fast when dependencies are unavailable.
GRACEFUL DEGRADATION
Continue with reduced functionality when components fail. Never full outage.