
Analytics for the Internet of Things (IoT) by Andrew Minteer


Hadoop Distributed File System

The Hadoop Distributed File System (HDFS) is a filesystem spread across multiple servers, designed to run on low-cost commodity hardware. HDFS follows a write-once, read-many philosophy: it was designed for large-scale batch processing of large to enormous files.

Files are divided into blocks; a typical block size is 128 MB. A file stored in HDFS is sliced into 128 MB chunks (the blocks), which are distributed across different data nodes. Files in HDFS normally range from gigabytes to terabytes.
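To make the block arithmetic concrete, here is a small Python sketch (the function name is ours, not part of any Hadoop API) that computes how many blocks a file of a given size occupies under the 128 MB default:

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # common HDFS block size: 128 MB

def hdfs_block_count(file_size_bytes):
    """Number of HDFS blocks a file of the given size occupies.
    The last block may be partially filled; HDFS does not pad it
    out to the full block size on disk."""
    return max(1, math.ceil(file_size_bytes / BLOCK_SIZE))

# A 1 GB file splits into exactly 8 full 128 MB blocks.
print(hdfs_block_count(1024 * 1024 * 1024))   # 8
# A 300 MB file: two full blocks plus one 44 MB final block.
print(hdfs_block_count(300 * 1024 * 1024))    # 3
```

Note that a small file still occupies one block entry in the NameNode's metadata, which is why HDFS favors a modest number of large files over many tiny ones.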

HDFS was designed for batch processing rather than low-latency interactive queries from users. It is not meant for files that change frequently with data updates. New data is typically appended to files or added ...
