Presentation: Grading Observability (Note: Not a Product Pitch!)
Abstract
Nobody denies the importance of observability in modern production software: with microservices adding scale, concurrency, and frequent deploys, it’s getting harder and harder to answer even basic questions about application behavior. The conventional wisdom has been that metrics, logging, and tracing are “the three pillars” of observability, yet organizations check these boxes and still find themselves grasping at straws during emergencies. The problem is that metrics, logs, and traces are just data: if what we need is a car, all we’re talking about is the fuel. We will continue to disappoint ourselves until we reframe observability around two fundamental activities:
(1) detection and
(2) refinement.
For effective observability, “detection” must be both robust and precise, overcoming cardinality concerns amid massive data volumes. “Refinement” revolves around rapid hypothesis testing: we must understand global context across service boundaries, decipher the side effects of contention under peak load, and present everything with historical reference points so we can tell what has changed and what is normal behavior. In this session, we’ll summarize the contemporary observability dogma, then present a new observability scorecard for objectively reasoning about and assessing observability solutions for modern distributed systems.
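
To make the abstract’s framing concrete, here is a minimal, hypothetical sketch of how a detection/refinement scorecard could be expressed as a data structure. The criteria, weights, scores, and names (Criterion, Scorecard, "example-observability-stack") are assumptions drawn only from the abstract above, not anything published with the talk.

```python
# Hypothetical illustration: criteria and weights are assumptions, not the talk's scorecard.
from dataclasses import dataclass, field


@dataclass
class Criterion:
    name: str
    score: int      # 0-5, assessed per tool
    weight: float   # relative importance; assumed equal here


@dataclass
class Scorecard:
    tool: str
    detection: list[Criterion] = field(default_factory=list)   # robust, precise, cardinality-safe
    refinement: list[Criterion] = field(default_factory=list)  # context, contention, history

    def total(self) -> float:
        # Weighted average across both activities, on the same 0-5 scale.
        criteria = self.detection + self.refinement
        return sum(c.score * c.weight for c in criteria) / sum(c.weight for c in criteria)


card = Scorecard(
    tool="example-observability-stack",
    detection=[
        Criterion("robustness (few missed incidents)", 4, 1.0),
        Criterion("precision (few false alarms)", 3, 1.0),
        Criterion("handles high-cardinality data at volume", 2, 1.0),
    ],
    refinement=[
        Criterion("global context across service boundaries", 3, 1.0),
        Criterion("visibility into contention under peak load", 2, 1.0),
        Criterion("historical baselines for what changed / what is normal", 4, 1.0),
    ],
)
print(f"{card.tool}: {card.total():.1f} / 5")
```

Running the sketch prints a single aggregate grade; in practice one would compare such grades across candidate observability solutions rather than treat any absolute number as meaningful.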
Similar Talks
Observability in the SSC: Seeing Into Your Build System
Engineer @honeycombio
Ben Hartshorne
Observability in the Development Process: Not Just for Ops Anymore
Cofounder @honeycombio
Christine Yen
Architectures That Scale Deep - Regaining Control in Deep Systems
CEO and co-founder @LightStepHQ, Co-creator @OpenTracing API standard