Presentation: Fix Spark Failures and Bottlenecks Faster & Easier

Track: Sponsored Solutions Track I

Location: Pacific BC

Day of week:


Level: Intermediate

Persona: Architect, Data Engineering

Abstract

This talk presents the results of analyzing many Spark jobs on many multi-tenant production clusters. Kirk discusses the common issues observed, their symptoms, and how developers can address them.

At Pepperdata, we have gathered trillions of performance data points on production clusters running Spark, covering a variety of industries, applications, and workload types. We will present key performance insights — best and worst practices, gotchas, and tuning recommendations — based on analyzing the behavior and performance of millions of Spark applications. In addition, we will describe how we are turning these learnings into heuristics used in the open source Dr. Elephant project.

Speaker: Kirk Lewis

Field Engineer @Pepperdata

Kirk Lewis joined Pepperdata in 2015. Previously, Kirk was a Solutions Engineer at StackVelocity. Before that, he was the lead technical architect for big data production platforms at American Express. Kirk has a strong background in big data.
