Big Data Now: 2015 Edition
by O'Reilly Media, Inc.
January 2016
Content level: Beginner to intermediate
162 pages
3h 50m
English

Chapter 4. Big Data Architecture and Infrastructure

As noted in O’Reilly’s 2015 Data Science Salary Survey, the same four tools (SQL, Excel, R, and Python) continue to be the most widely used in data science for the third year in a row. Spark also continues to be one of the most active projects in big data, seeing a 17% increase in users over the past 12 months. In his keynote at Strata + Hadoop San Jose, Matei Zaharia, creator of Spark, outlined two new goals Spark was pursuing in 2015. The first goal was to make distributed processing tools accessible to a wide range of users, beyond big data engineers. An example of this is the new DataFrames API, inspired by R and Python data frames. The second goal was to enhance integration: to allow Spark to interact efficiently with different environments, from NoSQL stores to traditional data warehouses.

In many ways, the two goals for Spark in 2015—greater accessibility for a wider user base and greater integration of tools/environments—are consistent with the changes we’re seeing in architecture and infrastructure across the entire big data landscape. In this chapter, we present a collection of blog posts that reflect these changes. 

Ben Lorica documents what startups like Tamr and Trifacta have learned about opening up data analysis to non-programmers. Benjamin Hindman laments the fact that we still don’t have an operating system that abstracts and manages hardware resources in the data center. Jim Scott discusses his use of Myriad ...

Publisher Resources

ISBN: 9781492042273