Learn from Amazon Web Services

When Capacity Broke The Cloud:
Mastering Cloud Resilience After The Kinesis Outage

In November 2020, Amazon Kinesis suffered a major outage after a routine capacity addition pushed its front-end servers past the operating system's thread limit. Incident Drill helps your team prepare for and mitigate similar cloud infrastructure failures through realistic incident simulations.

Amazon Web Services | 2020 | Outage (Cloud)

The Hidden Threat to Cloud Scalability

Cloud infrastructure depends on careful resource management. Exceeding operating system limits, such as the maximum number of threads per process, can trigger cascading failures that bring down entire services. This incident highlights the critical need for proactive testing and incident response training to prevent costly downtime and maintain customer trust.
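
As a minimal sketch of the kind of guardrail this failure mode motivates (hypothetical names and thresholds, not AWS's implementation), a service can check its own thread count against a configured ceiling before spawning more workers:

    # Hypothetical guardrail: refuse new worker threads once the process
    # approaches a configured ceiling set below the OS thread limit.
    from pathlib import Path
    import threading

    THREAD_CEILING = 4096  # assumed per-process budget, below the OS limit

    def current_thread_count() -> int:
        """Read this process's thread count from /proc/self/status (Linux)."""
        for line in Path("/proc/self/status").read_text().splitlines():
            if line.startswith("Threads:"):
                return int(line.split()[1])
        raise RuntimeError("Threads field not found in /proc/self/status")

    def spawn_worker(target) -> threading.Thread | None:
        """Start a worker only while usage stays under 80% of the ceiling."""
        if current_thread_count() >= THREAD_CEILING * 0.8:
            return None  # shed load instead of exhausting the OS thread limit
        worker = threading.Thread(target=target, daemon=True)
        worker.start()
        return worker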

PREPARE YOUR TEAM

How Incident Drill helps

Incident Drill provides a platform for teams to simulate the 2020 Kinesis outage and other critical incidents. Teams collaborate in a safe environment to practice incident response, identify weaknesses in their systems, and build the resilience needed to handle real-world cloud infrastructure challenges. No production systems at risk. Real-world learning. Improved incident response.

🧑‍💻

Realistic Simulations

Experience incidents with realistic system behavior and data.

🤝

Collaborative Environment

Work together as a team to diagnose and resolve incidents.

⏱️

Time-Based Progression

Incidents unfold over time, requiring quick thinking and decisive action.

📊

Performance Metrics

Track your team's performance and identify areas for improvement.

📚

Post-Incident Analysis

Review incident timelines and learn from mistakes.

☁️

Cloud-Native Scenarios

Focus on incidents relevant to modern cloud architectures.

WHY TEAMS PRACTICE THIS

Build a More Resilient Cloud Infrastructure

  • Reduce downtime and service disruptions
  • Improve incident response times
  • Strengthen team collaboration
  • Identify vulnerabilities in your infrastructure
  • Increase confidence in your system's resilience
  • Meet compliance requirements

Simulated Incident Timeline

  • 0:00: Initial capacity deployment begins.
  • 0:30: OS thread limit reached on front-end servers. [ERROR]
  • 1:00: Servers begin to hang, impacting Kinesis streams. [ERROR]
  • 1:30: Full outage declared.
  • 4:00: Mitigation steps implemented.
  • 6:00: Service restored. [RESOLVED]
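
According to AWS's public post-event summary, each front-end server maintained an OS thread for every other server in the fleet, so adding capacity raised every server's thread count at once. A rough back-of-the-envelope illustration of the timeline above, with assumed numbers (not AWS's actual figures):

    # Assumed numbers for illustration only.
    OS_THREAD_LIMIT = 10_000   # hypothetical per-process thread cap
    BASELINE_THREADS = 2_000   # hypothetical threads for request handling

    def threads_per_server(fleet_size: int) -> int:
        """One peer thread per other front-end server, plus a fixed baseline."""
        return BASELINE_THREADS + (fleet_size - 1)

    for fleet_size in (6_000, 8_000, 9_000):
        used = threads_per_server(fleet_size)
        verdict = "OK" if used < OS_THREAD_LIMIT else "EXCEEDS LIMIT"
        print(f"fleet={fleet_size:>5}  threads/server={used:>6}  {verdict}")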

How It Works

Step 1: Incident Briefing

Understand the scenario and objectives.

Step 2: Collaborative Investigation

Analyze system logs and metrics to identify the root cause.

Step 3: Implement Solutions

Deploy fixes and monitor their effectiveness.

Step 4: Post-Incident Review

Discuss lessons learned and improve processes.

Ready to Master Cloud Resilience?

Join the Incident Drill waitlist and be among the first to access our realistic incident simulations. Prepare your team for anything.

Get Early Access
Founding client discounts · Shape the roadmap · Direct founder support

Join the Incident Drill waitlist

Drop your email and we'll reach out with private beta invites and roadmap updates.