Presentation: Fault Tolerance at Speed
This presentation is now available to view on InfoQ.com
Abstract
Distributed systems that provide fault tolerance often sacrifice performance. That sacrifice tends to come late in a project, when a systems engineering approach has not been taken. Performance is an inherent aspect of distributed design and should be considered holistically throughout the systems engineering process. A well-designed distributed system can be both fault-tolerant and fast.
In this session, we discuss the techniques and lessons learned from implementing Aeron Cluster. The focus is on how Raft can be implemented on Aeron, how to minimize network round-trip overhead, and how a single process compares to a fully distributed cluster. Come to this session if you are interested in how performance can be a first-class design concern and in the results that approach can deliver.