@squadcast ・ Jun 16, 2024 ・ 2 min read ・ 289 views ・ Originally posted on www.squadcast.com
Klever, a cryptocurrency and financial services company, faced challenges managing on-call rotations for their globally distributed workforce. This resulted in delayed responses to critical incidents.
Squadcast, an on-call scheduling and alerting platform, helped Klever automate on-call scheduling, streamline alert routing, and improve visibility into incident management. This led to faster incident resolution, reduced alert fatigue, and improved customer communication.
Empowering a Globally Distributed Team for Faster Incident Resolution
In today’s digital landscape, ensuring seamless and reliable service delivery is paramount. This is especially true for companies like Klever, a leading cryptocurrency and financial services provider with a global user base. With a geographically dispersed team, Klever faced challenges in managing on-call rotations and ensuring timely responses to critical incidents.
This blog explores how Klever leveraged Squadcast, an on-call scheduling and alerting platform, to optimize their operations and enhance customer experience.
Klever’s initial reliance on manual on-call scheduling, often managed through spreadsheets, proved cumbersome for their global workforce. Coordinating schedules across various time zones and ensuring the right personnel were notified during non-business hours presented significant roadblocks.
Squadcast’s automated on-call scheduling replaced that manual process. The platform made it simpler to create and manage on-call rotations, adjust them, apply overrides, and keep alert ownership clear, even during off-peak hours.
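To make the idea of a rotation with overrides concrete, here is a minimal Python sketch. It is not Squadcast's implementation or API; the function `on_call_at`, the engineer names, and the 8-hour shift length are all hypothetical. It assumes a simple round-robin rotation and uses timezone-aware datetimes so that a local time in any time zone resolves to the same shift.

```python
from datetime import datetime, timedelta, timezone


def on_call_at(moment, rotation, start, shift_hours, overrides=()):
    """Return who is on call at `moment` (a timezone-aware datetime).

    rotation:  ordered list of engineer names (hypothetical)
    start:     timezone-aware datetime when the first shift begins
    overrides: iterable of (begin, end, name) tuples that take precedence
    """
    # An override wins over the regular rotation for its time window.
    for begin, end, name in overrides:
        if begin <= moment < end:
            return name
    # Count whole shifts elapsed since the rotation started, then wrap around.
    shifts_elapsed = (moment - start) // timedelta(hours=shift_hours)
    return rotation[shifts_elapsed % len(rotation)]


utc = timezone.utc
rotation = ["alice", "bob", "carol"]
start = datetime(2024, 6, 17, 0, 0, tzinfo=utc)

# 06:30 in UTC-3 is 09:30 UTC, which falls in the second 8-hour shift.
utc_minus_3 = timezone(timedelta(hours=-3))
moment = datetime(2024, 6, 17, 6, 30, tzinfo=utc_minus_3)
regular = on_call_at(moment, rotation, start, shift_hours=8)

# A temporary override covering 08:00-12:00 UTC hands that window to carol.
override = (
    datetime(2024, 6, 17, 8, 0, tzinfo=utc),
    datetime(2024, 6, 17, 12, 0, tzinfo=utc),
    "carol",
)
covered = on_call_at(moment, rotation, start, shift_hours=8, overrides=[override])
```

Because every timestamp carries its offset, the comparison works the same whether the caller passes local or UTC time, which is the property a spreadsheet-based schedule tends to lack.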
Prior to Squadcast, alerts from various sources like AWS CloudWatch, Google Stackdriver, and Prometheus Alertmanager flooded Slack channels. This, coupled with time zone disparities, led to delayed acknowledgements and sluggish response times.
Squadcast’s on-call notifications and escalation policies addressed this. Alerts were routed to the designated on-call personnel on Slack, prompting faster action and reducing both Mean Time To Acknowledge (MTTA) and Mean Time To Respond (MTTR).
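An escalation policy can be thought of as a list of delays and targets: the longer an alert stays unacknowledged, the more people are notified. The sketch below illustrates that general idea only. The function `targets_to_notify`, the delays, and the role names are assumptions made for illustration, not Klever's actual policy or Squadcast's API.

```python
def targets_to_notify(minutes_unacknowledged, policy):
    """Return every target whose escalation delay has already elapsed.

    policy: list of (delay_minutes, target) steps (hypothetical values)
    """
    due = [step for step in policy if step[0] <= minutes_unacknowledged]
    # Order by delay so the earliest escalation level comes first.
    return [target for _, target in sorted(due, key=lambda step: step[0])]


policy = [
    (0, "primary on-call"),
    (10, "secondary on-call"),
    (30, "engineering manager"),
]

# After 12 minutes without acknowledgement, the first two levels are due.
notified = targets_to_notify(12, policy)
```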
Klever lacked a system to track the impact of incidents and their corresponding response metrics before adopting Squadcast. Squadcast’s Incident Dashboard and Analytics provided invaluable insights. The platform facilitated the monitoring of crucial metrics like MTTA, MTTR, and incident severity, allowing for a comprehensive analysis of incident impact on downstream and upstream services.
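As a rough illustration of how such metrics are derived, the following Python sketch averages the time between incident timestamps. The incident records and field names (`triggered`, `acknowledged`, `resolved`) are invented for the example. It also assumes MTTR is measured from trigger to resolution, which is one common convention; measuring to first response instead would only change which timestamp is passed in.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents, start_key, end_key):
    """Average minutes between two timestamps, skipping incidents missing either."""
    durations = [
        (incident[end_key] - incident[start_key]).total_seconds() / 60
        for incident in incidents
        if incident.get(start_key) and incident.get(end_key)
    ]
    return mean(durations) if durations else None


# Hypothetical incident records, for illustration only.
incidents = [
    {
        "triggered": datetime(2024, 6, 1, 10, 0),
        "acknowledged": datetime(2024, 6, 1, 10, 4),
        "resolved": datetime(2024, 6, 1, 10, 30),
    },
    {
        "triggered": datetime(2024, 6, 2, 12, 0),
        "acknowledged": datetime(2024, 6, 2, 12, 10),
        "resolved": datetime(2024, 6, 2, 13, 0),
    },
]

mtta = mean_minutes(incidents, "triggered", "acknowledged")  # (4 + 10) / 2
mttr = mean_minutes(incidents, "triggered", "resolved")      # (30 + 60) / 2
```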
The high volume of alerts, particularly during node outages, overwhelmed Klever’s team. Squadcast’s Suppression Rules empowered them to focus on critical issues by minimizing non-critical alerts, significantly reducing alert fatigue.
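The general mechanism behind suppression can be sketched as a set of matching rules applied before anyone is notified. This is a simplified Python illustration, not Squadcast's rule engine: the `Alert` and `SuppressionRule` classes, their fields, and the sample alerts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    source: str
    severity: str
    message: str


@dataclass(frozen=True)
class SuppressionRule:
    # Unset fields are ignored; every field that is set must match.
    source: Optional[str] = None
    severity: Optional[str] = None
    message_contains: Optional[str] = None

    def matches(self, alert):
        checks = []
        if self.source is not None:
            checks.append(alert.source == self.source)
        if self.severity is not None:
            checks.append(alert.severity == self.severity)
        if self.message_contains is not None:
            checks.append(self.message_contains in alert.message)
        # A rule with no conditions suppresses nothing.
        return bool(checks) and all(checks)


def filter_alerts(alerts, rules):
    """Keep only the alerts that no suppression rule matches."""
    return [a for a in alerts if not any(r.matches(a) for r in rules)]


rules = [
    SuppressionRule(source="prometheus", severity="info"),
    SuppressionRule(message_contains="node heartbeat"),
]
alerts = [
    Alert("prometheus", "info", "scrape latency slightly elevated"),
    Alert("cloudwatch", "warning", "node heartbeat missed on node-7"),
    Alert("prometheus", "critical", "block production halted"),
]
remaining = filter_alerts(alerts, rules)
```

In this example only the critical alert survives filtering, so the on-call engineer sees one notification instead of three.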
Previously, Klever engineers relied on their support team to communicate service statuses (downtime or degradation) to customers. Squadcast’s Status Page feature transformed this process. Engineers could directly update the Status Page, keeping customers informed and reducing dependency on the support team.
Klever’s Site Reliability Engineer, Kadu Relvas Barral, credits Squadcast with streamlining on-call scheduling, facilitating rotation adjustments, and enabling instant alert routing to the designated on-call engineer.
With Squadcast, Klever achieved faster incident resolution, better service visibility, and more direct customer communication. As Klever continues to scale its operations, Squadcast is positioned to remain a valuable partner in keeping its services reliable.
Read the entire Case Study here
Squadcast is an Incident Management tool that’s purpose-built for SRE. Get rid of unwanted alerts, receive relevant notifications and integrate with popular ChatOps tools. Work in collaboration using virtual incident war rooms and use automation to eliminate toil.