Python Machine Learning By Example - Second Edition

by Yuxi (Hayden) Liu
February 2019
Beginner to intermediate
382 pages
10h 1m
English
Packt Publishing

Best practice 5 – storing large-scale data

As datasets keep growing, they often no longer fit on a single local machine, and we need to store them in the cloud or on a distributed filesystem. As this is mainly a book on machine learning with Python, we will just touch on some basic areas that you can look into. The two main strategies for storing big data are scale-up and scale-out:

  • A scale-up approach increases storage capacity when data exceeds the current system capacity, for example by adding more disks to the existing system. This is useful in fast-access platforms.
  • In a scale-out approach, storage capacity grows incrementally with additional nodes in a storage cluster. Apache Hadoop (https://hadoop.apache.org/) is used to store and process big ...
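The difference between the two strategies can be illustrated with a small toy model. This is only a sketch for intuition, not how Hadoop or any real storage system is implemented: the `Node` and `Cluster` classes, their method names, and the 4 TB disk size are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A hypothetical storage machine; each disk adds capacity in terabytes."""
    disks_tb: List[float] = field(default_factory=list)

    @property
    def capacity_tb(self) -> float:
        return sum(self.disks_tb)

    def scale_up(self, disk_tb: float) -> None:
        # Scale-up: grow capacity by adding a disk to the same machine.
        self.disks_tb.append(disk_tb)


@dataclass
class Cluster:
    """A hypothetical storage cluster made of several nodes."""
    nodes: List[Node] = field(default_factory=list)

    @property
    def capacity_tb(self) -> float:
        return sum(node.capacity_tb for node in self.nodes)

    def scale_out(self, node: Node) -> None:
        # Scale-out: grow capacity by adding another node to the cluster.
        self.nodes.append(node)


# Scale-up: one machine, one extra disk (illustrative sizes).
single = Node(disks_tb=[4.0])
single.scale_up(4.0)

# Scale-out: start with one node and add two more.
cluster = Cluster(nodes=[Node(disks_tb=[4.0])])
cluster.scale_out(Node(disks_tb=[4.0]))
cluster.scale_out(Node(disks_tb=[4.0]))

print(single.capacity_tb, cluster.capacity_tb, len(cluster.nodes))
```

In this toy model, scale-up leaves the number of machines unchanged, while scale-out increases capacity one node at a time, which matches the incremental growth described above.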
ISBN: 9781789616729