Presentation: Service Ownership @Slack

Track: Practices of DevOps & Lean Thinking

Location: Ballroom BC

Duration: 11:50am - 12:40pm

Level: Intermediate

Persona: Backend Developer, Developer

This presentation is now available to view on InfoQ.com

Abstract

As recently as 2017, developers at Slack didn’t carry a pager. They deployed to production over a hundred times a day, and a centralized operations team took the calls in the night. Most pages were not very actionable because they weren’t set up by the dev teams that knew their systems best. Heroes and knowledge islands saved the day over and over. Postmortems were poorly attended and did not encourage learning.

Slowly, then quickly, all that changed. Slack moved to teams of empowered developers on call, with embedded SREs, safer production deployments, and actionable alerts. Postmortems focus on learning, and meaningful analysis of incident patterns is done at all levels of the company.

In this talk you’ll hear all about the bumps and scrapes, triumphs and pitfalls of our journey from a centralized ops team to development teams that own the full lifecycle of their systems. It wasn’t easy, but it wasn’t impossible. Hopefully, it will inspire you to try something radically different at your company too.

Speaker: Holly Allen

Service Engineering @SlackHQ

Holly Allen is a leader in Service Engineering at Slack, with SRE, Safety Engineering, and Storage in her portfolio. She is tireless in her efforts to make Slack the software reliable and scalable, and Slack the company a delightful place to work. Prior to Slack, Holly worked at startups and at DreamWorks Animation, and was Director of Engineering at 18F, a civic tech startup in the US government.
