Introducing Data Science

by Arno Meysman, Davy Cielen, Mohamed Ali
May 2016
Beginner
320 pages
10h 39m
English
Manning Publications
Content preview from Introducing Data Science

Chapter 5. First steps in big data

This chapter covers

  • Taking your first steps with two big data applications: Hadoop and Spark
  • Using Python to write big data jobs
  • Building an interactive dashboard that connects to data stored in a big data database

Over the last two chapters, we’ve steadily increased the size of the data. In chapter 3 we worked with data sets that fit into the main memory of a single computer. Chapter 4 introduced techniques for data sets that were too large to fit in memory but could still be processed on a single computer. In this chapter you’ll learn to work with technologies that can handle data so large that a single node (computer) no longer suffices. In fact, it may not even fit on a hundred computers. ...

Publisher Resources

ISBN: 9781633430037