Silicon Valley
Past Presentations
Building Resilience in Production Migrations
How do you migrate stateful systems with confidence? Especially when downtime is not an option? Netflix Billing Infrastructure needs to be up 24/7 to support 130+ million global customers. Billing services are the source of truth for a customer’s billing state which changes as customers...
Training Deep Learning Models at Scale on Kubernetes
Deep Learning has recently become very important for all kinds of AI applications from conversational chatbots to self-driving cars. In this talk, we will discuss how we use deep learning for natural language processing, utilize TensorFlow for training deep learning models, run TensorFlow on...
Reducing Risk of Credential Compromise @Netflix
Building a secure system is like constructing a good pizza – each individual layer adds flavor that ultimately builds to the perfect bite. At Netflix we have hand-crafted ingredients that by themselves are scrumptious, but when placed together strategically on the crust (read: cloud), constructs...
Capacity Planning for Crypto Mania
Over the course of 2017, Coinbase experienced exponential user and trading volume growth, which in turn led to periods of website instability and downtime. During this period, we saw our systems perform at the very edge of their capacity which inspired important capacity and performance...
Massively scaling MySQL using Vitess
Are you dealing with the challenges of rapid growth? Are you thinking about how to scale your database layer? Should you use NoSQL? Should you shard your relational database? If you are facing these kinds of problems, this session is for you. Vitess is a database solution for deploying, scaling...
Chaos Engineering with Containers
Chaos Engineering is the practice of running thoughtful planned experiments to reveal weakness in our systems. In this session, Ana discusses the benefits of using Chaos Engineering to inject failures in order to make your container infrastructure more reliable. She will also share how to improve...
Interviews
Yes, I Test In Production (And So Do You)
What's the motivation for this talk?
The motivation for this talk is to help people understand that deploying software carries an irreducible element of uncertainty and risk. Trying too hard to prevent failures will actually make your systems and your teams *more* vulnerable to failure and prolonged downtime. So what can you do about it?
Human-Centric Machine Learning Infrastructure @Netflix
Can you give an example of some of the questions you get from data scientists when you are trying to deploy models?
When it comes to common questions, as boring as it may sound, my experience is that machine learning infrastructure is much more about data than science. Most questions we get are related to data: how do I find the data I need, how do I set up the data pipeline, how do I handle the somewhat non-trivial amounts of data in Python and R,...
Artwork Personalization @Netflix
What work do you do at Netflix?
I lead one of the Machine Learning and Recommendation teams at Netflix. We're responsible for the end-to-end machine learning that decides what shows up on the Netflix homepage across all our different experiences. When you log into Netflix, my team is responsible for what rows of TV shows and movies you see on the homepage. We select...
Building Resilience in Production Migrations
What's the focus of the work that you do today?
I lead Billing Infrastructure Engineering at Netflix. We build the infrastructure that helps Netflix collect charges from its members. Part of that is to determine who should be charged and how much through our systems. We also hold all the gift codes and balances and track them. We also support major customer workflows. Our services...
Capacity Planning for Crypto Mania
What's the focus of the work that you do today?
Jordan: We’re on the Reliability Team at Coinbase. It was formed in response to the crazy spike of scaling challenges around 2017 with cryptocurrency. The work is focused on traditional SRE topics of monitoring and instrumentation. We act as consultants for other mostly feature-focused teams. For example, we embed in teams to make...
npm and the Future of JavaScript
Can you tell me more about what your talk is about?
I'll talk more about server-side stuff, and I’ll emphasize Node. We've found that the security message is important to people, so there will be quite a bit there. There's been this huge shift in how JavaScript is used and enterprises are only just beginning to catch up to that. There are many people working on JavaScript, and they're...