IBM Software Defined Infrastructure for Big Data Analytics Workloads

Book description

This IBM® Redbooks® publication documents how IBM Platform Computing, with its IBM Platform Symphony® MapReduce framework, IBM Spectrum Scale (based on IBM GPFS™), IBM Platform LSF®, and the Application Service Controller for Platform Symphony work together as an infrastructure to manage not just Hadoop-related offerings, but many popular industry offerings, such as Apache Spark, Storm, MongoDB, and Cassandra.

It describes the different ways to run Hadoop in a big data environment, and demonstrates how IBM Platform Computing solutions, such as Platform Symphony and Platform LSF with its MapReduce Accelerator, can improve performance and agility when running Hadoop on distributed workload managers offered by IBM. This information is for technical professionals (consultants, technical support staff, IT architects, and IT specialists) who are responsible for delivering cost-effective cloud services and big data solutions on IBM Power Systems™ to help uncover insights in clients’ data so that they can optimize product development and business results.

Table of contents

  1. Front cover
  2. Notices
    1. Trademarks
  3. IBM Redbooks promotions
  4. Preface
    1. Authors
    2. Now you can become a published author, too
    3. Comments welcome
    4. Stay connected to IBM Redbooks
  5. Chapter 1. Introduction to big data
    1. 1.1 Evolution and characteristics of big data
    2. 1.2 What’s in a bite
    3. 1.3 Is the demand for a big data solution real?
    4. 1.4 What is Hadoop?
    5. 1.5 Hadoop Distributed File System in more detail
    6. 1.6 MapReduce in more detail
      1. 1.6.1 The map phase
      2. 1.6.2 The reduce phase
    7. 1.7 The changing nature of distributed computing
    8. 1.8 IBM Platform Symphony grid manager
  6. Chapter 2. Big data, analytics, and risk calculation software portfolio
    1. 2.1 The impact of big data
    2. 2.2 Big data analytics
      1. 2.2.1 Big data analytics challenge
      2. 2.2.2 Big data analytics solutions
      3. 2.2.3 IBM big data and analytics areas with solutions
      4. 2.2.4 Big data analytics advantage
    3. 2.3 IBM risk analytics solution advantages
      1. 2.3.1 Algorithmics software
      2. 2.3.2 IBM OpenPages Operational Risk Management software
    4. 2.4 Scenario: How to minimize risk and build a better model
      1. 2.4.1 Algo Market Risk analysis
      2. 2.4.2 IBM SPSS Statistics, Monte Carlo simulation
      3. 2.4.3 Scenario
  7. Chapter 3. IBM Platform Symphony with Application Service Controller
    1. 3.1 Introduction to IBM Platform Symphony v7.1
    2. 3.2 How it operates
      1. 3.2.1 Cluster management
      2. 3.2.2 Application
      3. 3.2.3 How workload management differs from resource management
      4. 3.2.4 Platform Management Console
    3. 3.3 IBM Platform Symphony for multitenant designs
      1. 3.3.1 The narrow view of multitenancy
      2. 3.3.2 Advantages and challenges
      3. 3.3.3 Multitenant designs
      4. 3.3.4 Requirements gathering
      5. 3.3.5 Building a multitenant big data infrastructure
    4. 3.4 Platform Symphony concepts
      1. 3.4.1 Session manager
      2. 3.4.2 Resource groups
      3. 3.4.3 Applications
      4. 3.4.4 Application profiles
      5. 3.4.5 Consumers
      6. 3.4.6 Services
      7. 3.4.7 Sessions
      8. 3.4.8 Repositories
      9. 3.4.9 Tasks
    5. 3.5 Benefits of using Platform Symphony
      1. 3.5.1 Summary
    6. 3.6 Product edition highlights
      1. 3.6.1 Platform Symphony Developer Edition
      2. 3.6.2 Platform Symphony Advanced Edition
    7. 3.7 Optional applications to extend Platform Symphony capabilities
    8. 3.8 Overview of the Application Service Controller add-on
      1. 3.8.1 Application Service Controller lifecycle
      2. 3.8.2 Application framework integrations
      3. 3.8.3 Basic concepts
      4. 3.8.4 Key prerequisites
      5. 3.8.5 Application Service Controller application templates
    9. 3.9 Platform Symphony application implementation
      1. 3.9.1 Planning for Platform Symphony
      2. 3.9.2 Accessing the Platform Symphony management console
      3. 3.9.3 Configuring a cluster for multitenancy
      4. 3.9.4 Adding a new application or tenant
      5. 3.9.5 Configuring application properties
      6. 3.9.6 Associating applications with consumers
      7. 3.9.7 Summary
    10. 3.10 Application Service Controller in a big data solution
      1. 3.10.1 Hadoop implementations in IBM Technology
      2. 3.10.2 Adding Application Service Controller to improve a big data cluster
      3. 3.10.3 Advantages of Spark technology
    11. 3.11 Application Service Controller as the attachment for a cloud-native framework: Cassandra
      1. 3.11.1 Cassandra architecture
      2. 3.11.2 Cassandra and multitenancy
    12. 3.12 Summary
  8. Chapter 4. Mixed IBM Power Systems and Intel environment for big data
    1. 4.1 System components and default settings in the test environment
    2. 4.2 Supported system configurations
    3. 4.3 IBM Platform Symphony installation steps
      1. 4.3.1 Master host installation
      2. 4.3.2 Compute host installation
    4. 4.4 Compiling and installing Hadoop 1.1.1 for IBM PowerPC
      1. 4.4.1 Dependencies
      2. 4.4.2 Building Hadoop
    5. 4.5 IBM Spectrum Scale installation and configuration
      1. 4.5.1 Installation steps
      2. 4.5.2 Cluster configuration steps
      3. 4.5.3 Spectrum Scale network shared disk creation
    6. 4.6 Hadoop configuration
    7. 4.7 MapReduce test with Hadoop Wordcount in IBM Platform Symphony 7.1
  9. Chapter 5. IBM Spectrum Scale for big data environments
    1. 5.1 Spectrum Scale functions
    2. 5.2 Spectrum Scale benefits
    3. 5.3 Comparison of HDFS and Spectrum Scale features
    4. 5.4 What’s new in Spectrum Scale 4.1
      1. 5.4.1 File encryption and secure erase
      2. 5.4.2 Transparent flash cache
      3. 5.4.3 Network performance monitoring
      4. 5.4.4 AFM enhancements
      5. 5.4.5 NFS data migration
      6. 5.4.6 Backup and restore improvements
      7. 5.4.7 FPO enhancements
    5. 5.5 For more information
  10. Chapter 6. IBM Application Service Controller in a mixed environment
    1. 6.1 Enabling the Application Service Controller
      1. 6.1.1 Steps to enable the Application Service Controller
      2. 6.1.2 Verifying the Application Service Controller in the GUI
    2. 6.2 Sample application instance templates
    3. 6.3 Registering an application template for Cassandra
      1. 6.3.1 Preparing the application package
      2. 6.3.2 Registering the application instance
  11. Chapter 7. IBM Platform Computing cloud services
    1. 7.1 IBM Platform Computing cloud services
    2. 7.2 Cloud services architecture
    3. 7.3 IBM Spectrum Scale high-performance services
    4. 7.4 IBM Platform Symphony services
    5. 7.5 IBM High Performance Services for Hadoop
    6. 7.6 IBM Platform LSF services
    7. 7.7 Hybrid Platform LSF on-premises with a cloud service
      1. 7.7.1 Upgrading IBM Platform HPC to enable multicluster functions
      2. 7.7.2 Tasks to install IBM Platform LSF in the cloud
      3. 7.7.3 Configuring the multicluster feature
      4. 7.7.4 Configuring job forwarding
      5. 7.7.5 Testing your configuration
      6. 7.7.6 Hybrid cloud capacity increases and assistance
    8. 7.8 Data management on hybrid clouds
      1. 7.8.1 IBM Platform Data Manager for LSF
      2. 7.8.2 IBM Spectrum Scale Active File Management
  12. Related publications
    1. IBM Redbooks
    2. Other publications
    3. Online resources
    4. Help from IBM
  13. Back cover

Product information

  • Title: IBM Software Defined Infrastructure for Big Data Analytics Workloads
  • Author(s):
  • Release date: June 2015
  • Publisher(s): IBM Redbooks
  • ISBN: 9780738440774