How do they do it? In QCon's marquee Architectures track, we learn what it takes to operate at large scale from well-known names in our industry. You will take away hard-earned architectural lessons on scalability, reliability, throughput, and performance.
Track: Architectures You've Always Wondered About
Location: Ballroom A
Day of week:
Track Host: Randy Shoup
Randy is a 30-year veteran of Silicon Valley, and has worked as a senior technology leader and executive at companies ranging from small startups, to mid-sized places, to eBay and Google. Randy is currently VP Engineering at WeWork in San Francisco. He is particularly passionate about the nexus of culture, technology, and organization.
10:35am - 11:25am
Scaling Patterns for Netflix's Edge
In 2008 Netflix had fewer than a million streaming members. Today we have over 150 million. That explosive growth in membership has led to similar growth in the number of microservices, in the amount of cloud resources, and in our overall architectural complexity. Eventually, the sheer number of computation resources becomes hard to manage and hurts our reliability. At Netflix, we’ve found a few techniques that have helped keep our computation growth manageable and reliable.
There are the obvious tasks of performance tuning, reducing features, or reducing data. Going beyond just “tightening the belt” tactics, we had to rethink how we handle every request. At our scale, we can no longer call a customer database on every request, we can no longer fan out to a cascade of mid-tier requests on every request, and we can no longer log every request, so we don’t. This session will introduce the architectural patterns we’ve adopted to skip those steps, which would normally be considered required for a functioning system.
I will also share successes we’ve had from partitioning computation into multiple services in unintuitive ways to get better runtime characteristics. Through this session, you will be introduced to useful probabilistic data structures, innovative bi-directional data passing, and open-source projects available from Netflix that make this all possible.
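As an illustration of the kind of probabilistic data structure the abstract refers to, here is a minimal Bloom filter sketch. The abstract does not say which structures Netflix uses, so this is an assumption chosen for illustration; the class name, the sizing parameters, and the `member-123` identifier are all hypothetical. A Bloom filter answers "definitely not present" or "possibly present" from a small in-memory bit array, which is one general way to avoid a database lookup on every request.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical usage: only consult the backing store when the filter says "maybe".
known_members = BloomFilter()
known_members.add("member-123")
print(known_members.might_contain("member-123"))  # True: added items are always reported
```

The trade-off is that a `True` answer can be a false positive, so it only saves work on the negative path; the bit-array size and hash count control how often that happens.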
11:50am - 12:40pm
Secrets at Planet-Scale: Engineering the Internal Google KMS
This talk discusses Google’s internal key management system (KMS) for cryptographic key material, which is a critical part of Google's overall strategy for user data protection. The talk will cover the design choices and strategies that Google adopted in order to build a highly reliable, highly scalable service. The talk will close with ongoing maintenance pain points and suggested practices for your own internal key management service.
This internal KMS underlies most storage, authentication, cross-site request forgery protection, and other critical security systems at Google, and hence needs to have very high availability. Furthermore, Google’s internal KMS not only manages the generation, distribution, and rotation of cryptographic keys, but also manages other secret data. It serves a massive volume of queries, more per second than Gmail or any other single Google service, and needs to be very reliable in order to do so, historically performing at more than 99.9999% availability.
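To put the 99.9999% figure in perspective, the short calculation below converts an availability percentage into a downtime budget. It assumes availability is measured as a fraction of wall-clock time over a 365-day year; the abstract does not say how the number is measured (it could equally be a fraction of successful requests), and the function name is hypothetical.

```python
def downtime_budget_seconds(availability_percent: float, period_seconds: float) -> float:
    """Seconds of allowed unavailability in a period at the given availability."""
    return period_seconds * (1 - availability_percent / 100)


SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a non-leap year

budget = downtime_budget_seconds(99.9999, SECONDS_PER_YEAR)
print(f"{budget:.1f} seconds per year")  # prints "31.5 seconds per year"
```

Under that assumption, "six nines" leaves roughly half a minute of total unavailability per year.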
The design choices that favored high availability have caused a few pain points for our clients. An example is the delay introduced between clients updating their keys/configs and the changes being reflected in production. For many of the system’s clients this delay is too long. We’ll discuss this and other pain points, and how we’re improving the user experience.
1:40pm - 2:30pm
Architectures That Scale Deep - Regaining Control in Deep Systems
We often hear about architectural "scale" as if it's one-dimensional and linear. In fact, it is neither, and that's breaking our tools and processes. Where modern, microservice-based architectures are concerned, "large-scale systems" aren't simply larger versions of "small-scale" systems – they are something completely different. Enter the "Deep System."
In this talk, we first develop a shared intuition and formal definition for "Deep Systems" and their common properties: they are layered, distributed, concurrent, multi-tenant, continuously changing, and a beast to manage with conventional tools! We then re-introduce the fundamentals of control theory from the 1960s, including the original conceptualizations of Observability and its conceptual cousin, Controllability. Finally, we use examples from Google and other organizations to illustrate how deep systems have damaged our ability to observe software, and what we need to do in order to regain confidence and control.
2:55pm - 3:45pm
Evolutionary Architecture as Product @ CircleCI
Organizations continually evolve their technical architectures in order to adjust to the changing needs of their business. For example: systems must scale with increasing customer demand, tools must create efficiency in growing teams, and implementations must generalize to support additional product features. At CircleCI, we face all of these drivers, but our role in the software delivery pipeline means we have the additional need to adapt to changes in how software is being built.
And the rate of change in software development approaches is like no other.
CircleCI's history has involved constantly adapting our product architecture to match transformations in the world of software development. From the explosive adoption of Docker to the steady rise of microservice architectures, the changing demands of software engineering teams have proven to be deeply coupled with the structure of CircleCI's service, far more than we anticipated when we started the business 8 years ago.
This talk will cover:
- How the evolution of software development since 2011 has driven the evolution of CircleCI's architecture
- Managing the cost of change when customers have the ability to customize almost anything
- Predictions of future trends in software delivery and the architectural approaches we will take to support them
4:10pm - 5:00pm
Snowflake Architecture: Building a Data Warehouse for the Cloud
At Snowflake, we wanted to architect a data warehouse from the ground up to leverage all the benefits of the cloud. Unlike shared-storage architectures that tie storage and compute together, we built a single integrated system with fully independent scaling for compute, storage and services. In the storage layer, we split data into micro-partitions and extract metadata for efficient query processing. At the compute layer, multiple virtual warehouses in separate compute clusters can simultaneously operate on the same data, giving high availability, performance isolation, scalability and concurrency. Virtual warehouses can also be automatically scaled up and down based on workload and performance.
This talk will cover the three pillars of the Snowflake architecture:
- Separating compute and storage to leverage abundant cloud compute resources
- Building an ACID compliant database system on immutable storage
- Delivering a scalable multi-tenant data warehouse system as a service
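The abstract mentions splitting data into micro-partitions and extracting metadata for efficient query processing. The sketch below shows one general way such metadata can help: keeping a min/max summary per partition so that a range query can skip partitions that cannot contain matches. This is an illustrative assumption, not Snowflake's implementation; `MicroPartition`, `build_partitions`, and `scan_range` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MicroPartition:
    """Hypothetical stand-in: a chunk of values plus min/max metadata."""
    rows: List[int]
    min_value: int
    max_value: int


def build_partitions(values: List[int], partition_size: int) -> List[MicroPartition]:
    """Split values into fixed-size chunks and record each chunk's min and max."""
    partitions = []
    for start in range(0, len(values), partition_size):
        chunk = values[start:start + partition_size]
        partitions.append(MicroPartition(chunk, min(chunk), max(chunk)))
    return partitions


def scan_range(partitions: List[MicroPartition], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and how many partitions were actually read."""
    matches: List[int] = []
    scanned = 0
    for partition in partitions:
        # The metadata alone proves no value here can match, so skip the data.
        if partition.max_value < low or partition.min_value > high:
            continue
        scanned += 1
        matches.extend(v for v in partition.rows if low <= v <= high)
    return matches, scanned


partitions = build_partitions(list(range(100)), partition_size=10)
matches, scanned = scan_range(partitions, 42, 47)
print(matches, scanned)  # prints "[42, 43, 44, 45, 46, 47] 1"
```

In this toy run only one of the ten partitions is read. Pruning of this kind is most effective when values are clustered, so that each partition covers a narrow range.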
5:25pm - 6:15pm
Architectures Panel
How do big operators differ from smaller disruptors? This panel will examine the different architectures that power these systems.
Anvita Pandit, Software Developer @Google
Ben Sigelman, CEO and co-founder @LightStepHQ, Co-creator @OpenTracing API standard
Robert Zuber, CTO @CircleCI
Thierry Cruanes, Co-founder Snowflake Computing @SnowflakeDB
Last Year's Tracks
Monday, 1 November
- Microservices / Serverless Patterns & Practices
  Evolving, observing, persisting, and building modern microservices
- Practices of DevOps & Lean Thinking
  Practical approaches using DevOps & Lean Thinking
- JavaScript & Web Tech
  Beyond JavaScript in the Browser. Exploring WebAssembly, Electron, & Modern Frameworks
- Modern CS in the Real World
  Thoughts pushing software forward, including consensus, CRDTs, formal methods, & probabilistic programming
- Modern Operating Systems
  Applied, practical, & real-world deep-dive into industry adoption of OS, containers and virtualization, including Linux on Windows, LinuxKit, and Unikernels
- Optimizing You: Human Skills for Individuals
  Better teams start with a better self. Learn practical skills for IC
- Open Spaces
Tuesday, 2 November
- Architectures You've Always Wondered About
  Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, & more
- 21st Century Languages
  Lessons learned from languages like Rust, Go, Swift, Kotlin, and more.
- Emerging Trends in Data Engineering
  Showcasing DataEng tech and highlighting the strengths of each in real-world applications.
- Bare Knuckle Performance
  Killing latency and getting the most out of your hardware
- Socially Conscious Software
  Building socially responsible software that protects users' privacy & safety
- Delivering on the Promise of Containers
  Runtime containers, libraries, and services that power microservices
- Open Spaces
Wednesday, 3 November
- Applied AI & Machine Learning
  Applied machine learning lessons for SWEs, including tech around TensorFlow, TPUs, Keras, PyTorch, & more
- Production Readiness: Building Resilient Systems
  More than just building software, building deployable, production-ready software
- Developer Experience: Level up your Engineering Effectiveness
  Improving the end-to-end developer experience - design, dev, test, deploy, operate/understand.
- Security: Lessons Attacking & Defending
  Security from the defender's AND the attacker's point of view
- Future of Human Computer Interaction
  IoT, voice, mobile: Interfaces pushing the boundary of what we consider to be the interface
- Enterprise Languages
  Workhorse languages found in modern enterprises. Expect Java, .NET, & Node in this track