ADVANCED // ENTERPRISE
MODULE 04 // RESILIENCE

High Availability Design.

Design for five-nines availability.

AVAILABILITY ARCHITECTURE

Five-nines (99.999%) availability allows about 5.26 minutes of downtime per year. Meeting that budget requires no single points of failure, automated failover, and graceful degradation.
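The downtime budget follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is illustrative, not from the course material):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability: float) -> float:
    """Allowed downtime per year, in minutes, for an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for label, availability in [
    ("three nines", 0.999),
    ("four nines", 0.9999),
    ("five nines", 0.99999),
]:
    minutes = downtime_budget_minutes(availability)
    print(f"{label}: {minutes:.2f} minutes/year")
```

Each additional nine cuts the budget by a factor of ten, which is why five nines leaves almost no room for manual recovery.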

Every component needs redundancy: multiple instances behind load balancers, database replication with automatic failover, and redundant network paths.
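To see why redundancy matters, it can help to estimate combined availability. The sketch below assumes component failures are independent, which real systems only approximate (shared power, networks, or deploys can fail together); the function names are illustrative.

```python
def parallel_availability(instance_availability: float, instances: int) -> float:
    """Availability of N redundant instances where any one can serve traffic.

    Assumes independent failures: the group is down only if all are down.
    """
    return 1.0 - (1.0 - instance_availability) ** instances


def series_availability(*components: float) -> float:
    """Availability of a chain where every component must be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


# Two 99% instances behind a load balancer, under the independence assumption.
pair = parallel_availability(0.99, 2)

# A request that needs both a 99.9% app tier and a 99.9% database.
chain = series_availability(0.999, 0.999)

print(f"redundant pair: {pair:.6f}")
print(f"two tiers in series: {chain:.6f}")
```

Redundancy raises availability within a tier, while every non-redundant tier in the request path lowers the total.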

Test failures regularly. Chaos engineering, the practice of intentionally causing failures, validates that redundancy works. Untested redundancy provides false confidence.

LOAD BALANCING

Distribute traffic across multiple instances. Health checks remove failed instances automatically.
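A minimal sketch of round-robin distribution with health checks, assuming the caller supplies an `is_healthy` probe. The class name, instance names, and probe are hypothetical; production load balancers run health checks on a schedule rather than per request.

```python
from typing import Callable, List


class RoundRobinBalancer:
    """Rotate through instances, skipping any that fail the health check."""

    def __init__(self, instances: List[str], is_healthy: Callable[[str], bool]):
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)
        self._is_healthy = is_healthy
        self._next = 0

    def pick(self) -> str:
        # Try each instance at most once per call, starting where we left off.
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next]
            self._next = (self._next + 1) % len(self._instances)
            if self._is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy instances available")


# Hypothetical instance names; "app-2" is simulated as failed.
down = {"app-2"}
balancer = RoundRobinBalancer(
    ["app-1", "app-2", "app-3"],
    is_healthy=lambda name: name not in down,
)
print([balancer.pick() for _ in range(4)])
```

Traffic keeps flowing to the remaining instances, and the failed one rejoins the rotation as soon as its health check passes again.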

DATABASE REPLICATION

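Synchronous replication acknowledges a write only after a replica has confirmed it, so a primary failure loses no committed data, at the cost of added write latency. On primary failure, fail over to the replica automatically.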
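The idea can be illustrated with a toy in-memory model. This is a sketch under heavy simplification, not how any particular database implements replication: `Node` and `ReplicatedStore` are hypothetical names, and there is no network, durability, or consensus here.

```python
class Node:
    """A toy database node holding an append-only log."""

    def __init__(self, name: str):
        self.name = name
        self.log = []
        self.alive = True

    def append(self, record: str) -> None:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.log.append(record)


class ReplicatedStore:
    """Acknowledge writes only after both primary and replica hold them."""

    def __init__(self, primary: Node, replica: Node):
        self.primary = primary
        self.replica = replica

    def write(self, record: str) -> str:
        self.primary.append(record)
        try:
            self.replica.append(record)
        except ConnectionError:
            self.primary.log.pop()  # undo: the write was never acknowledged
            raise
        return "ack"

    def failover(self) -> bool:
        """Promote the replica if the primary is down. Returns True if promoted."""
        if self.primary.alive:
            return False
        self.primary, self.replica = self.replica, self.primary
        return True


store = ReplicatedStore(Node("db-a"), Node("db-b"))
store.write("order-1")
store.write("order-2")

store.primary.alive = False      # simulate primary failure
promoted = store.failover()
print(promoted, store.primary.name, store.primary.log)
```

Because every acknowledged write already exists on the replica, the promoted node has the full log. In this toy model, further synchronous writes would fail until a healthy replica is attached again.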

CIRCUIT BREAKERS

Circuit breakers prevent cascading failures. When a dependency is unavailable, fail fast instead of waiting on calls that are likely to time out.
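A minimal circuit breaker sketch, assuming a simple policy: open after a fixed number of consecutive failures, then allow one trial call once a timeout has elapsed. The class name, thresholds, and injectable `clock` parameter are illustrative choices, not a standard API.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: reject immediately without touching the dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping each outbound dependency call in `breaker.call(...)` means that once the breaker opens, callers get an immediate `CircuitOpenError` instead of tying up threads and connections on a dependency that is already down.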

GRACEFUL DEGRADATION

Continue with reduced functionality when components fail, so that a component failure never becomes a full outage.
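One common way to degrade gracefully is to wrap a non-critical call with a fallback. The sketch below is an assumption about how that might look; `with_fallback`, the recommendation functions, and the product names are all hypothetical.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def with_fallback(
    primary: Callable[..., T], fallback: Callable[..., T]
) -> Callable[..., Tuple[T, bool]]:
    """Return a function that tries `primary` and falls back on any error.

    The second element of the result is True when the degraded path was used,
    so callers can log it or adjust the page.
    """

    def wrapped(*args, **kwargs):
        try:
            return primary(*args, **kwargs), False
        except Exception:
            return fallback(*args, **kwargs), True

    return wrapped


def personalized_recommendations(user_id: str) -> list:
    # Simulates an outage of a hypothetical recommendation service.
    raise TimeoutError("recommendation service unavailable")


def bestsellers(user_id: str) -> list:
    # Static, precomputed content that needs no live dependency.
    return ["item-1", "item-2", "item-3"]


get_recommendations = with_fallback(personalized_recommendations, bestsellers)
items, degraded = get_recommendations("user-42")
print(items, degraded)
```

The page still renders with generic content while the dependency is down, and the `degraded` flag makes the reduced mode visible to monitoring.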

KNOWLEDGE CHECK // Q04
Why is untested redundancy worse than no redundancy?