Mastering the Art of Streaming Infrastructure

Thursday, 9:00 AM EST - SEA WATCH/SEA SHORE

Designing a distributed system architecture can be a daunting task, with conflicting requirements and constraints constantly at play. The CAP theorem is a classic example: a distributed data store can guarantee at most two of consistency, availability, and partition tolerance, so when a network partition occurs, developers must choose between consistency and availability. The same applies to streaming infrastructure, where optimizing for one aspect can come at the cost of another. With cost, throughput, accuracy, and latency as the main constraints for streaming systems, it's crucial to make informed decisions that align with your business goals.

In this session, you'll gain valuable insights into how your design choices affect your system's overall capabilities. You'll also learn about the differences between Flink Streaming and Spark Streaming, both conceptually and in practice. Lastly, you'll understand how combining multiple solutions can benefit your team and business. Join us to learn more about the cumbersome world of distributed stream processing systems.

About Adi Polak

Adi is an experienced Software Engineer and people manager. For most of her professional life, she has worked with data and machine learning for operations and analytics. As a data practitioner, she developed algorithms to solve real-world problems using machine learning techniques and leveraging expertise in Apache Spark, Kafka, HDFS, and distributed large-scale systems.

Adi has taught Spark to thousands of students and is the author of the successful book Scaling Machine Learning with Spark. Earlier this year, she embarked on a new adventure with data streaming, specifically Flink, and she can't get enough of it.