producing results more quickly. This technology is available through the use of the
SAS/ACCESS, SAS Scoring Accelerator, and SAS Analytics Accelerator products. The
supported databases include the following:
• DB2 (UNIX only)
Hadoop Data Storage
Data can be stored as Hadoop data, which is divided into blocks and stored across
multiple connected nodes that work together. The benefits of storing data in Hadoop
include the following:
• Hadoop accomplishes two tasks: massive data storage and distributed processing.
• Hadoop is a low-cost alternative to traditional data storage options. It uses
commodity hardware to reliably store large quantities of data.
• Data and application processing are protected against hardware failure. If a node
goes down, data is not lost because Hadoop replicates each block of data across
nodes in the cluster (three copies by default). Furthermore, jobs are automatically
redirected to working machines in the cluster.
• The distributed Hadoop model is designed to easily and economically scale up from
single servers to thousands of nodes, each offering local computation and storage.
• Unlike traditional relational databases, Hadoop does not require preprocessing of
data before storing it. You can easily store unstructured data.
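The block storage and replication behavior described in the list above can be illustrated with a small simulation. The following Python sketch is not Hadoop or SAS code. The function names, the node names, the 4-byte block size, and the round-robin placement rule are all assumptions chosen for illustration (real Hadoop clusters use much larger blocks and a rack-aware placement policy). The sketch shows only why data remains readable when a node fails, provided each block is stored on several distinct nodes.

```python
from __future__ import annotations

BLOCK_SIZE = 4      # bytes per block; tiny on purpose (illustrative assumption)
REPLICATION = 3     # number of copies kept of each block


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Divide the data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks: list[bytes], nodes: list[str],
                 replication: int = REPLICATION) -> dict[str, dict[int, bytes]]:
    """Store each block on `replication` distinct nodes.

    Hypothetical round-robin rule: block i goes to nodes i, i+1, ... (mod n).
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement: dict[str, dict[int, bytes]] = {node: {} for node in nodes}
    for idx, block in enumerate(blocks):
        for r in range(replication):
            node = nodes[(idx + r) % len(nodes)]
            placement[node][idx] = block
    return placement


def read_data(placement: dict[str, dict[int, bytes]], n_blocks: int,
              failed: frozenset[str] = frozenset()) -> bytes:
    """Reassemble the data, reading each block from any node that is still up."""
    result = []
    for idx in range(n_blocks):
        for node, stored in placement.items():
            if node not in failed and idx in stored:
                result.append(stored[idx])
                break
        else:
            # No surviving node holds this block.
            raise IOError(f"block {idx} is unavailable")
    return b"".join(result)


# Usage: store data on five simulated nodes, then read it back with one node down.
data = b"hadoop stores data in blocks"
nodes = ["node1", "node2", "node3", "node4", "node5"]
blocks = split_into_blocks(data)
placement = place_blocks(blocks, nodes)
recovered = read_data(placement, len(blocks), failed=frozenset({"node2"}))
print(recovered == data)
```

With three copies on distinct nodes, every block in this sketch survives the loss of any two nodes. A block becomes unreadable only if all three nodes that hold it fail at the same time.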
You can use Hadoop to stage large amounts of raw data for subsequent loading into an
enterprise data warehouse or to create an analytical store for high-value activities such as
advanced analytics, querying, and reporting. SAS enables you to use data stored in
Hadoop to do the following:
• explore data and develop and execute models using the following software: SAS
Visual Analytics, SAS Visual Statistics, SAS High-Performance Analytics products,
SAS In-Database Technology, SAS In-Memory Statistics, and SAS Scoring
Accelerator for Hadoop
• access and manage data using the following software: SAS Data Loader for Hadoop,
SAS Data Quality Accelerator for Hadoop, SAS In-Database Code Accelerator for
Hadoop, SAS/ACCESS interfaces, the Base SAS FILENAME statement and
HADOOP procedure, the SAS Scalable Performance Data (SPD) Engine, and SAS
Data Integration Studio
• use the features of SAS Event Stream Processing, SAS Federation Server, SAS Grid
Manager for Hadoop, SAS High-Performance Marketing Optimization, and SAS
Visual Scenario Designer