Learn from Google Cloud

When a Global Quota Brought Down Filestore Worldwide

In 2022, an internal service overloaded the Filestore Admin API and tripped a global quota, leaving Filestore instances read-only for hours. Incident Drill helps your team prepare for and prevent similar catastrophic cloud events.

Google Cloud | 2022 | Outage (Cloud)

The Hidden Dangers of Internal Dependencies

This incident shows why teams need to understand and manage their internal dependencies. Over-reliance on a shared internal service, combined with insufficient monitoring, can cause cascading failures that reach global infrastructure and lead to significant downtime. Quota management matters just as much: here, a single global quota tripped by one internal caller affected Filestore instances worldwide.

PREPARE YOUR TEAM

Incident Drill: Practice Responding to Cloud Emergencies

Incident Drill provides realistic incident simulations that allow your team to practice responding to complex scenarios like the Filestore outage. We recreate the pressure and ambiguity of a real incident, helping your team develop the skills and processes needed to minimize downtime, improve communication, and prevent future incidents.

🚨

Realistic Simulations

Experience the chaos of a real incident without the real-world consequences.

💬

Collaborative Environment

Work together as a team to diagnose and resolve the incident.

📈

Data-Driven Insights

Receive detailed feedback on your team's performance and identify areas for improvement.

🔍

Root Cause Analysis

Dig deep to understand the underlying causes of the incident.

🛡️

Preventative Measures

Develop strategies to prevent similar incidents from happening in the future.

📚

Customizable Scenarios

Tailor simulations to your specific infrastructure and risks.

WHY TEAMS PRACTICE THIS

Improve Your Cloud Incident Response

  • Reduce downtime and minimize impact on customers.
  • Improve team communication and collaboration.
  • Identify and address vulnerabilities in your infrastructure.
  • Develop a proactive approach to incident management.
  • Increase confidence in your team's ability to handle emergencies.
  • Learn from past mistakes and prevent future incidents.

0:00  Internal Service Overloads Filestore Admin API
0:15  Global Quota Triggered
0:30  Filestore Instances Become Read-Only
3:30  Issue Resolved, Filestore Restored
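
To make the failure mode in the timeline above concrete, here is a minimal Python sketch of how one noisy caller can exhaust a shared global quota, and how a per-caller cap contains it. This is an illustration under stated assumptions, not a description of how Google Cloud enforces Filestore quotas: the class names, caller names, and limits (100 global, 50 per caller) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FixedWindowQuota:
    """Allows at most `limit` requests in a single window (hypothetical model)."""
    limit: int
    used: int = 0

    def try_acquire(self) -> bool:
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


@dataclass
class PerCallerQuota:
    """Caps each caller separately, then applies the shared global cap."""
    per_caller_limit: int
    global_quota: FixedWindowQuota
    callers: dict = field(default_factory=dict)

    def try_acquire(self, caller: str) -> bool:
        bucket = self.callers.setdefault(
            caller, FixedWindowQuota(self.per_caller_limit)
        )
        # Check the caller's own cap first, so a noisy caller is rejected
        # before it can consume the shared budget.
        if not bucket.try_acquire():
            return False
        return self.global_quota.try_acquire()


def simulate(use_per_caller_cap: bool) -> dict:
    """Replay one window: a noisy caller sends 1000 requests, then a customer sends 10."""
    global_quota = FixedWindowQuota(limit=100)
    guard = PerCallerQuota(per_caller_limit=50, global_quota=global_quota)
    accepted = {"noisy-internal-service": 0, "customer": 0}
    traffic = [("noisy-internal-service", 1000), ("customer", 10)]
    for caller, count in traffic:
        for _ in range(count):
            if use_per_caller_cap:
                ok = guard.try_acquire(caller)
            else:
                ok = global_quota.try_acquire()
            accepted[caller] += ok  # True counts as 1, False as 0
    return accepted


if __name__ == "__main__":
    print(simulate(use_per_caller_cap=False))
    print(simulate(use_per_caller_cap=True))
```

With only the global quota, the noisy caller takes all 100 slots and the customer gets none. With the per-caller cap, the noisy caller stops at 50 and the customer's 10 requests all succeed. In a drill, this is the kind of isolation boundary worth asking about when reviewing your own shared services.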

How It Works

Step 1: Understand the Incident

Review the details of the Google Cloud Filestore throttling incident.

Step 2: Simulate the Scenario

Participate in a realistic incident simulation with Incident Drill.

Step 3: Analyze the Response

Evaluate your team's performance and identify areas for improvement.

Step 4: Implement Preventative Measures

Develop strategies to prevent similar incidents from occurring in your own environment.

Ready to Level Up Your Incident Response?

Join the Incident Drill waitlist and be among the first to experience the future of incident training.

Get Early Access
  • Founding client discounts
  • Shape the roadmap
  • Direct founder support

Drop your email and we'll reach out with private beta invites and roadmap updates.