Tim is a teacher, author, and technology leader with Confluent, where he serves as the Vice President of Developer Relations. He is a regular speaker at conferences and a presence on YouTube explaining complex technology topics in an accessible way. He tweets as @tlberglund and blogs every few years at http://timberglund.com. He has three grown children and two grandchildren, a fact about which he is rather excited.
The toolset for building scalable data systems is maturing, having adapted well to our decades-old paradigm of update-in-place databases. We ingest events, we store them in high-volume OLTP databases, and we have new OLAP systems to analyze them at scale—even if the size of our operation requires us to grow to dozens or hundreds of servers in the distributed system. But something feels a little dated about the store-and-analyze paradigm, as if we are missing a new architectural insight that might more efficiently distribute the work of storing and computing the events that happen to our software. That new paradigm is stream processing.
In this workshop, we’ll learn the basics of Kafka as a messaging system, learning the core concepts of topic, producer, consumer, and broker. We’ll look at how topics are partitioned among brokers and see the simple Java APIs for getting data in and out. But more than that, we’ll look at how we can extend this scalable messaging system into a streaming data processing system—one that offers significant advantages in scalability and deployment agility, while locating computation in your data pipeline in precisely the places it belongs: in your microservices and applications, and out of costly, high-density systems.
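To hint at just how simple those Java APIs are, here is a minimal sketch of the producer side (a sketch only: it assumes a broker at localhost:9092 and a hypothetical greetings topic):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class HelloProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Send one record to a hypothetical "greetings" topic.
                producer.send(new ProducerRecord<>("greetings", "key-1", "hello, kafka"));
            }
        }
    }

The consumer side is just as compact, and we’ll build both up from scratch in the workshop.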
Come to this workshop to learn how to do streaming data computation with Apache Kafka!
Tasks that are normally simple, like running a program or storing and retrieving data, become much more complicated when we start to do them on collections of computers rather than single machines. Distributed systems have become a key architectural concern, affecting everything a program would normally do: they give us enormous power, but at the cost of increased complexity as well.
Using a series of examples all set in a coffee shop, we’ll explore topics like distributed storage, computation, timing, messaging, and consensus. You'll leave with a good grasp of each of these problems, and a solid understanding of the ecosystem of open-source tools in the space.
Developers and architects are increasingly called upon to solve big problems, and we are able to draw on a world-class set of open source tools with which to solve them. Problems of scale are no longer consigned to the web’s largest companies, but are increasingly a part of ordinary enterprise development. At the risk of only a little hyperbole, we are all distributed systems engineers now.
In this talk, we’ll look at four distributed systems architectural patterns based on real-world systems that you can apply to solve the problems you will face in the next few years. We’ll look at the strengths and weaknesses of each architecture and develop a set of criteria for knowing when to apply each one. You will leave knowing how to work with the leading data storage, messaging, and computation tools of the day to solve the daunting problems of scale in your near future.
Your goal is simple: take everything that is happening in your company (every click, every database change, every application log) and make it all available as a real-time stream of well-structured data. No big deal! You’re just taking your decades-old, batch-oriented data integration and data processing and migrating to real-time streams and real-time processing. In your shop, you call that Tuesday. But of the several challenges you’ll have to tackle, getting data in and out of that stream processing system looms large, and there’s a whole bunch of code there you don’t want to write. This is where Kafka Connect comes in.
Connect is a standard part of Apache Kafka that provides a scalable, fault-tolerant way to stream data between Kafka and other systems. It provides an API for developing standard connectors for common data sources and sinks, giving you the ability to ingest database changes, write streams to tables, store archives in HDFS, and more. We’ll explore the standard connector implementations offered in the Confluent Open Source download, and look at a few operational questions as well. Come to this session to get connected to Kafka!
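As a small taste, here is a sketch of a configuration for the FileStreamSource connector that ships with Apache Kafka; the file path and topic name here are hypothetical, but these few properties are all it takes to stream the lines of a file into a topic:

    name=local-file-source
    connector.class=FileStreamSource
    tasks.max=1
    file=/tmp/events.txt
    topic=file-events

Swap in a different connector class and a few of its properties, and the same handful of lines ingests database changes or archives topics to HDFS, with no bespoke glue code in sight.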
It has been a while since you left the easy days of batch processing behind: the lazy ETL jobs that had all night to run, the relaxed SLAs that let you take lunches like Don Draper, the languid bankers’ hours: the salad days of your data processing career. Those days are over now, and producing real-time results on streaming data is the new order of the day. Two seconds is the new overnight.
Apache Kafka is a de facto standard streaming data processing platform: it is widely deployed as a messaging system, and it includes a robust data integration framework (Kafka Connect) and a stream processing API (Kafka Streams) to meet the needs that commonly attend real-time message processing.
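To give a rough sense of the Kafka Streams API, here is a minimal sketch of a complete streams application (the application id and topic names are hypothetical, and it assumes a broker at localhost:9092); it reads one topic, transforms each value, and writes the results to another:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    public class ShoutingApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "shouting-app"); // hypothetical app id
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            // Read from an input topic, transform each value, write to an output topic.
            KStream<String, String> input = builder.stream("raw-events");
            input.mapValues(value -> value.toUpperCase())
                 .to("shouted-events");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }

Partitioning the work across instances, fault tolerance, and state management all come from the library, not from code you write.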
On top of that, Kafka now offers ksqlDB, a declarative, SQL-like stream processing language. What once took some moderately sophisticated Java code can now be done at the command line with a familiar and eminently approachable syntax. Come to this talk for an update on the latest ksqlDB features, with live coding on live streaming data.
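To give a feel for that syntax, here is a sketch of a ksqlDB stream definition and a windowed push query; the clicks stream and its columns are hypothetical, assumed to be backed by a Kafka topic of JSON events:

    CREATE STREAM clicks (user_id VARCHAR, url VARCHAR)
      WITH (KAFKA_TOPIC='clicks', VALUE_FORMAT='JSON');

    SELECT user_id, COUNT(*) AS click_count
      FROM clicks
      WINDOW TUMBLING (SIZE 1 MINUTE)
      GROUP BY user_id
      EMIT CHANGES;

That second statement is a continuous, per-minute count of clicks by user, running over the live stream rather than a stored table.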
Build and test software written in Java and many other languages with Gradle, the open source project automation tool that’s getting a lot of attention. This concise introduction provides numerous code examples to help you explore Gradle, both as a build tool and as a complete solution for automating the compilation, test, and release process of simple and enterprise-level applications.
Discover how Gradle improves on the best ideas of Ant, Maven, and other build tools, with standards for developers who want them and lots of flexibility for those who prefer less structure.