Big Data Analytics with SAS

Book Description

Leverage the capabilities of SAS to process and analyze Big Data

About This Book

  • Combine SAS with Hadoop, SAP HANA, and Cloud Foundry-based platforms for efficient Big Data analytics
  • Learn how to use the web browser-based SAS Studio and iPython Jupyter Notebook interfaces with SAS
  • Practical, real-world examples of predictive modeling, forecasting, optimization, and reporting on your Big Data analysis with SAS

Who This Book Is For

SAS professionals and data analysts who wish to perform analytics on Big Data using SAS to gain actionable insights will find this book to be very useful. If you are a data science professional looking to perform large-scale analytics with SAS, this book will also help you. A basic understanding of SAS will be helpful, but is not mandatory.

What You Will Learn

  • Configure a free version of SAS in order to do hands-on exercises dealing with data management, analysis, and reporting.
  • Understand the basic concepts of the SAS language, which consists of the data step (for data preparation) and procedures (or PROCs) for analysis.
  • Make use of the web browser-based SAS Studio and iPython Jupyter Notebook interfaces for coding in the SAS, DS2, and FedSQL programming languages.
  • Understand how the DS2 programming language plays an important role in Big Data preparation and analysis using SAS.
  • Integrate and work efficiently with Big Data platforms like Hadoop, SAP HANA, and Cloud Foundry-based systems.

In Detail

SAS has been recognized by Money Magazine and Payscale as one of the top business skills to learn in order to advance one's career. Through innovative data management, analytics, and business intelligence software and services, SAS helps customers solve their business problems by allowing them to make better decisions faster. This book introduces the reader to SAS and shows how to use it to perform efficient analysis on data of any size, including Big Data.

The reader will learn how to prepare data for analysis; perform predictive, forecasting, and optimization analysis; and then deploy or report on the results of these analyses. While working through the coding examples in this book, the reader will learn how to use the web browser-based SAS Studio and iPython Jupyter Notebook interfaces for working with SAS. Finally, the reader will learn how SAS's architecture is engineered and designed to scale up and/or out and to be combined with open source offerings such as Hadoop, Python, and R.

By the end of this book, you will clearly understand how to efficiently analyze Big Data using SAS.

Style and approach

The book starts off by introducing the reader to SAS and the SAS programming language, which provides data management, analytical, and reporting capabilities. Most chapters include hands-on examples that highlight how SAS provides The Power to Know©. Readers looking to perform large-scale data analysis will learn that SAS provides an open platform engineered and designed to scale both up and out, which allows the power of SAS to be combined with open source offerings such as Hadoop, Python, and R.

Table of Contents

  1. Big Data Analytics with SAS
    1. Title Page
      1. Big Data Analytics with SAS
    2. Credits
    3. Foreword
    4. About the Author
    5. About the Reviewer
    6. www.PacktPub.com
    7. Customer Feedback
    8. Dedication
    9. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Downloading the color images of this book
        3. Errata
        4. Piracy
        5. Questions
    10. 1. Setting Up the SAS® Software Environment
      1. What does SAS do?
        1. What is your perception of SAS?
        2. Let's get started with your free version of SAS
        3. History of SAS interfaces
      2. SAS Studio web-based GUI
        1. Describing the rest of SAS Studio
          1. SAS Studio section – Server Files and Folders 
          2. SAS Studio section – Tasks and Utilities
          3. SAS Studio section – Snippets
          4. SAS Studio section – Libraries
          5. SAS Studio section – File Shortcuts
        2. SAS programming language
          1. First SAS data step program
          2. First use of a SAS PROC
          3. Saving a SAS program
          4. Creating a new SAS program
          5. The AUTOEXEC file
          6. Visual Programmer versus SAS Programmer  
          7. What's in the SAS® University Edition?
          8. Different levels of the SAS analytic platform  
          9. SAS data storage
            1. The SAS dataset
            2. The SAS® Scalable Performance Data Engine 
            3. The Scalable Performance Data Server 
            4. SAS HDAT
          10. SAS formats and informats
          11. Date and time data
      3. Summary
    11. 2. Working with Data Using SAS® Software
      1. Preparing data for analytics
        1. Making data in SAS
          1. Data step code to make data 
          2. PROC SQL to make data
        2. Working with external data
          1. Data step code for importing external data
          2. PROC IMPORT
          3. Referencing external files 
            1. Directly referencing external files
            2. Indirectly referencing external files
        3. Specialty PROCs for working with external data
          1. PROC HADOOP and PROC HDMD
          2. PROC JSON
        4. Specialty PROCs for working with computer languages
          1. PROC GROOVY
          2. PROC LUA
      2. Summary
    12. 3. Data Preparation Using SAS Data Step and SAS Procedures
      1. Data preparation for analytics
        1. Creating indicators for the first and last observation in a by group
        2. Transposing
          1. PROC TRANSPOSE
          2. SAS Studio Transpose Data task
        3. Statistical and mathematical data transformations
          1. PROC MEANS
          2. Imputation
          3. Identifying missing values
          4. Characterizing data
          5. List Table Attributes 
        4. SAS macro facility
          1. Macro variables
          2. Macros
      2. Summary
    13. 4. Analysis with SAS® Software
      1. Analytics
        1. Descriptive and predictive analysis
          1. Descriptive analysis
            1. PROC FREQ
            2. PROC CORR
            3. PROC UNIVARIATE
          2. Predictive analysis
            1. Regression analysis
            2. PROC REG
          3. Forecasting analysis
            1. PROC TIMEDATA
            2. PROC ARIMA
          4. Optimization analysis
            1. SAS/IML
            2. Interacting with the R programming language
            3. PROC IML
      2. Summary
    14. 5. Reporting with SAS® Software
      1. Reporting
        1. SAS Studio tasks and snippets that generate reports and graphs
        2. BASE procedures designed for reporting
          1. TABULATE procedure examples
          2. REPORT procedure example
        3. The Output Delivery System
          1. ODS Tagsets
          2. ODS trace
          3. ODS document and the DOCUMENT procedure
          4. ODS Graphics
          5. How to make a user-defined snippet
      2. Summary
    15. 6. Other Programming Languages in BASE SAS® Software
      1. The DS2 programming language
        1. When to use DS2
        2. How is DS2 similar to the data step?
        3. How are DS2 and DATA step different?
        4. Programming in DS2
          1. DS2 methods
            1. DS2 system methods
            2. DS2 user-defined methods
          2. DS2 packages
            1. DS2 predefined packages
            2. DS2 user-defined packages
          3. Running DS2 programs
            1. The DS2 procedure
            2. DS2 Hello World program – example 1
            3. DS2 Hello World program – example 2
            4. DS2 Hello World program – example 3
            5. DS2 Hello World program – example 4
            6. DS2 Hello World program – example 5
            7. DS2 program with a method that returns a value
            8. DS2 program with a user-defined package
      2. The FedSQL programming language
        1. How to run FedSQL programs
          1. FedSQL program using the FEDSQL procedure
          2. Using FedSQL with DS
      3. Summary
    16. 7. SAS® Software Engineers the Processing Environment for You
      1. Architecture
        1. The SAS platform
        2. Service-Oriented Architecture and microservices
          1. Differences between SOA and microservices
        3. SAS server versus a SAS grid
        4. In-database processing
          1. In-database procedures
          2. Additional in-database processing SAS offerings
            1. SAS Scoring Accelerator
            2. SAS Code Accelerator
        5. In-memory processing
          1. SAS High-Performance Analytics Server
          2. SAS LASR Analytics Server
          3. SAS Cloud Analytics Server
          4. Dedicated hardware for in-memory processing
        6. Open platform and open source
          1. Running SAS from an iPython Jupyter Notebook
          2. SAS running in a cloud 
            1. A public cloud
            2. A private cloud
            3. A hybrid cloud
      2. Running SAS processing outside the SAS platform
        1. The SAS Embedded Process 
        2. The SAS Event Stream Processing engine
      3. SAS Viya the newest part of the SAS platform
        1. SAS Viya programming
        2. SAS Viya-based solutions
      4. Summary
    17. 8. Why SAS Programmers Love SAS
      1. Why SAS programmers love SAS
        1. Examples of why SAS programmers love SAS
          1. Additional coding examples
            1. The COMPARE procedure
            2. The OPTIONS procedure
      2. Analytics is a great career
        1. Analytics Center of Excellence
          1. The executive sponsor
          2. The data scientist
          3. The data manager
          4. The business analyst
          5. The ACE leader  
          6. Where should an ACE be located?
      3. Analytics across industries
        1. Analytics improving healthcare
        2. Analytics improving government services
        3. Analytics in financial services
        4. Analytics in energy
        5. Analytics in manufacturing
      4. Analytics are great for society
        1. Project Data Sphere®
        2. SAS and Data4Good
        3. GatherIQ™ – get involved in crowdsourcing to solve social issues
      5. References
      6. Summary