Presentation: Scaling Slack
Abstract
Slack is a communication and collaboration platform for teams. Our millions of users spend 10+ hours connected to the service on a typical working day. They expect reliability, low latency, and extraordinarily rich client experiences across a wide variety of devices and network conditions. In the talk, we'll examine the limitations that Slack's backend ran into and how we overcame them to scale from supporting small teams to serving gigantic organizations of hundreds of thousands of users. We'll hear stories about the edge cache service and the real-time messaging system, and how they evolved for major product efforts including Grid and Shared Channels.
What is the focus of your work today?
I work on the edge cache tier for Slack. The focus is to make the service more performant with our growing user base and more resilient to failures. The other important aspect is to support new product efforts at Slack. And we are always product first.
What’s the motivation for this talk?
Developers are generally interested in how other systems work. I’ll give a high-level introduction to how Slack works, and then focus on our two-year journey of scaling Slack. There were mistakes made and lessons learned. Other companies with similar rapid growth may learn a thing or two from our experience.
How would you describe the persona and level of the target audience?
Our ability to scale a service excites me day-to-day. As such, I think the problems that we deal with are highly relevant to architects, system engineers, full-stack engineers and site reliability engineers.
What do you want “that” persona to walk away from your talk knowing that they might not have known 50 minutes before?
Building Slack is not as easy as it may appear. Users expect low latency, high performance, and an extremely rich user experience. Slack contains a large, rapidly changing dataset. Individual components of the data (users, channels, files, etc.) reference each other, and changes to them need to be consistent across all clients. With the rapid growth of our user base and request volume, we at times have to make fundamental changes to our architecture, in addition to incremental steps, to accommodate that growth.