Presentation: Predictive Datacenter Analytics With Strymon
Abstract
A modern enterprise datacenter is a complex, multi-layered system whose components often interact in unpredictable ways. Yet, to keep operational costs low and maximize efficiency, we would like to foresee the impact of changing workloads, updating configurations, modifying policies, or deploying new services.
In this talk, I will share our research group’s ongoing work on Strymon: a system for predicting datacenter behavior in hypothetical scenarios using queryable online simulation. Strymon leverages the existing logging and monitoring pipelines of modern production datacenters to ingest cross-layer events in a streaming fashion and predict the possible effects of such events in what-if scenarios. Predictions are made online by simulating the hypothetical datacenter state alongside the real one. Drawing on a real use case from our industrial partners, I will highlight the challenges we are facing in building Strymon to support a diverse set of data representations, input sources, query languages, and execution models.
Finally, I will share our initial design decisions and give an overview of Timely Dataflow, a high-performance distributed streaming engine and our platform of choice for Strymon’s core implementation.