Presentation: Fix Spark Failures and Bottlenecks Faster & Easier
Abstract
This talk presents the results of analyzing many Spark jobs on many multi-tenant production clusters. Kirk discusses the most common issues observed, their symptoms, and how developers can address them.
At Pepperdata, we have gathered trillions of performance data points on production clusters running Spark, covering a variety of industries, applications, and workload types. We will present key performance insights (best and worst practices, gotchas, and tuning recommendations) based on analyzing the behavior and performance of millions of Spark applications. In addition, we will describe how we are turning these findings into heuristics used in the open-source Dr. Elephant project.