Presentation: Panel: SQL Over Streams, Ask the Experts

Track: Stream Processing In The Modern Age

Location: Bayview AB

Level: Intermediate - Advanced

Persona: Architect, CTO/CIO/Leadership, Data Engineering, Data Scientist, General Software, ML Engineer

Abstract

Queries over streams are generally "continuous," executing for long periods of time and returning incremental results. Yet operations over streams must have the ability to be monotonic. A new generation of stream processing engines has added support for Stream SQL. This AMA / panel features a discussion with thought leaders who are evolving and shaping the space.

Speaker: Julian Hyde

Original Developer @ApacheCalcite, Co-Founder SQLstream, & Architect @Hortonworks

Julian Hyde is an expert in query optimization, in-memory analytics, and streaming. He is the original developer of Apache Calcite, the query planning framework behind Apache Hive, Drill, Kylin and Phoenix, and was also the original developer of the open source Mondrian OLAP engine. A longtime advocate of streaming analytics, he co-founded SQLstream, an engine that executes streaming analytics at scale via SQL, authored the 2010 paper “Data in Flight”, and has worked with projects including Apex, Beam, Flink, Samza, and Storm to add streaming extensions to standard SQL. He is an architect at Hortonworks.

Speaker: Tyler Akidau

Engineer @Google & Founder/Committer on Apache Beam

Tyler Akidau is a senior staff software engineer at Google Seattle. He leads technical infrastructure’s internal data processing teams in Seattle (MillWheel & Flume), is a founding member of the Apache Beam PMC, and has spent the last seven years working on massive-scale data processing systems. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer in batch and streaming as two sides of the same coin, with the real endgame for data processing systems being the seamless merging of the two. He is the author of the 2015 Dataflow Model paper, the Streaming 101 and Streaming 102 articles, and the upcoming Streaming Systems book. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.

Speaker: Jay Kreps

Co-Founder and CEO @Confluent

Jay Kreps is the CEO of Confluent, Inc., a company backing the popular Apache Kafka messaging system. Prior to founding Confluent, he was the lead architect for data infrastructure at LinkedIn. He is among the original authors of several open source projects, including Project Voldemort (a key-value store), Apache Kafka (a distributed messaging system), and Apache Samza (a stream processing system).

Speaker: Michael Armbrust

Initial Author of Apache Spark SQL & Leads Streaming Team @Databricks

Michael Armbrust is the initial author of Apache Spark SQL and now leads the Structured Streaming team at Databricks. He received his PhD from UC Berkeley in 2013, and was advised by Michael Franklin, David Patterson, and Armando Fox. His thesis focused on building systems that allow developers to rapidly build scalable interactive applications, and specifically defined the notion of scale independence. His interests broadly include distributed systems, large-scale structured storage and query optimization.

Speaker: Stephan Ewen

Committer @ApacheFlink, CTO @dataArtisans

Stephan Ewen is a PMC member and one of the original creators of Apache Flink, and co-founder and CTO of data Artisans (data-artisans.com). He holds a Ph.D. from the Berlin University of Technology.
