HZH-1944 Google Dataflow: The new open model for batch and stream processing | Devoxx

Conference

Big Data & Analytics

Paris 243

Thursday at 14:55 - 15:40

In 2004 Google published the MapReduce paper, describing a programming model that kick-started big data as we know it. Ten years later, Google introduced Dataflow, a new paradigm that integrates batch and stream processing in one common abstraction. This time the offering was more than a paper: it also included an open source Java SDK and a managed cloud service to run it. In 2016, big data players such as Cask, Cloudera, Data Artisans, PayPal, Slack, and Talend joined Google in proposing Dataflow for incubation at the Apache Software Foundation. Dataflow is here, unifying not only batch and streaming but also the big data world.

In this talk we are going to review Dataflow's differentiating elements and why they matter. We’ll demonstrate Dataflow’s capabilities through a real-time demo with practical insights on how to manage and visualize streams of data.

Sara Robinson

Sara is a Developer Advocate on Google's Cloud Platform team, where she helps with developer relations through online content, outreach and events. She has a bachelor’s degree in Business and International Studies from Brandeis University. When she's not programming, she can be found running, listening to country music, or finding the best ice cream in SF.

Felipe Hoffa

Google Developer Programs Engineer. Fully invested in big data and its possibilities.