Andy Petrella | Devoxx


From Data Fellas

Andy is a mathematician turned distributed computing entrepreneur. Besides being a Scala/Spark trainer, Andy has participated in many projects built with Spark, Cassandra, and other distributed technologies, in fields including geospatial, IoT, automotive, and smart-city projects. He is the creator of the Spark Notebook (https://github.com/andypetrella/spark-notebook), the only reactive and fully Scala notebook for Apache Spark. In 2015, Xavier Tordoir and Andy founded Data Fellas, whose product, the Agile Data Science Toolkit, facilitates the productization of data science projects. Andy is also a member of the program committees of O’Reilly Strata, Scala eXchange, Data Science eXchange, and Devoxx BE.

Blog: http://data-fellas.guru

Big Data & Analytics

New Data Science: Functional, Distributed, JVM... and Agile

Conference

Data Science: a buzzword we've seen popping up everywhere in 2015.

Why? It turns out engineers have explored Big Data's value and the way to deal with it, that is, digging for the gold: Data Science, which covers mathematics, statistics, machine learning, data preparation, software development, and more.

Data Science came to the front because data is accumulating and exploiting its value is key to competitiveness. Data Science, and Machine Learning in particular, had traditionally been a smart and helpful toolset designed and developed mostly in academia, one the enterprise could only grasp at a high premium.

Now the game is changing drastically: methods have matured, libraries are available, and more data scientists are entering the market.

Still, there are many friction points in the development process of services exploiting data. It's true that data scientists are developers, but usually they are not software developers, and even less DevOps engineers, which leads to a disjointed organization and a lack of efficiency.

We present here some solutions providing a unifying environment, helping different people with different tasks and backgrounds to develop a data service pipeline with minimal friction and maximal agility.