IBM Reference Architecture for Genomics, Power Systems Edition

Book description

This IBM® Redbooks® publication introduces the IBM Reference Architecture for Genomics, IBM Power Systems™ edition on IBM POWER8®. It explains why you would implement Life Sciences workloads on IBM POWER8, and shows how to use such a solution to run those workloads, with IBM Platform™ Computing software helping to set them up. It also provides technical content that introduces the IBM POWER8 clustered solution for Life Sciences workloads.

This book customizes and tests Life Sciences workloads with a combination of an IBM Platform Computing software solution stack, OpenStack, and third-party applications. All of these applications run on IBM POWER8 and use IBM Spectrum Scale™ as a high-performance file system.

This book helps strengthen IBM Life Sciences solutions on IBM POWER8 with a well-defined and documented deployment model within an IBM Platform Computing and IBM POWER8 clustered environment. This system gives clients who need a modular, cost-effective, and robust solution a planned foundation for future growth.

This book highlights IBM POWER8 as a flexible infrastructure for clients looking to deploy Life Sciences workloads while reducing capital and operational expenditures and optimizing resources.

This book helps answer clients' workload challenges, in particular with Life Sciences applications. It provides expert-level documentation and how-to skills to worldwide teams that deliver Life Sciences solutions and support, giving them a broad understanding of a new architecture.

Table of contents

  1. Front cover
  2. Notices
    1. Trademarks
  3. IBM Redbooks promotions
  4. Preface
    1. Authors
    2. Now you can become a published author, too!
    3. Comments welcome
    4. Stay connected to IBM Redbooks
  5. Chapter 1. Introduction
    1. 1.1 Life sciences in genomics
    2. 1.2 IBM high-performance computing solutions
      1. 1.2.1 Cluster, grids, and clouds
      2. 1.2.2 HPC cluster for public and private cloud
      3. 1.2.3 Genomic discoveries with IBM Platform Computing solutions
  6. Chapter 2. Reference architecture
    1. 2.1 IBM Reference Architecture for Genomics
      1. 2.1.1 Reference architecture as master plan for deployment
      2. 2.1.2 Overview of AppCenter
    2. 2.2 Hardware and components
      1. 2.2.1 IBM scale-out servers
      2. 2.2.2 Hardware Management Console
      3. 2.2.3 Storage enclosure and components
    3. 2.3 Software and components
      1. 2.3.1 Operating systems and distributions
      2. 2.3.2 IBM Elastic Storage Server GUI
      3. 2.3.3 Applications
      4. 2.3.4 Open source tools
      5. 2.3.5 Platform Cluster Manager
      6. 2.3.6 Platform Application Center
      7. 2.3.7 IBM Platform Load Sharing Facility
      8. 2.3.8 IBM Platform Report Track Monitor
      9. 2.3.9 IBM Platform Process Manager
    4. 2.4 Operations
      1. 2.4.1 Submit a simple job
      2. 2.4.2 Submit flow
    5. 2.5 Network
      1. 2.5.1 Logical mappings
      2. 2.5.2 Network connections
      3. 2.5.3 Cabling
  7. Chapter 3. Scenarios using the reference architecture with workflow examples
    1. 3.1 What is needed to start scenarios for genomics workloads
    2. 3.2 Getting familiar with IBM Platform Computing software
      1. 3.2.1 Terminology
      2. 3.2.2 IBM Platform Load Sharing Facility job lifecycle
      3. 3.2.3 Access the Platform Computing graphical interface
      4. 3.2.4 Platform Computing software administration tips
      5. 3.2.5 Workflow management in PAC
      6. 3.2.6 Flow definitions in PPM
      7. 3.2.7 Managing job flows
      8. 3.2.8 Monitoring and statistics
    3. 3.3 Introduction to genomic workflow
      1. 3.3.1 Terminology
      2. 3.3.2 Introduction to genomics
      3. 3.3.3 Genomic sequencing pipeline
    4. 3.4 Preparing the environment for Life Sciences solution for genomics workloads
      1. 3.4.1 BioBuilds package
      2. 3.4.2 GATK
    5. 3.5 IBM Life Science Platform Provisioning package
      1. 3.5.1 Installation of sample life science workflows
      2. 3.5.2 User management and permissions
      3. 3.5.3 Sample genomic workflow customization
      4. 3.5.4 Visualization of data
    6. 3.6 Storage options using IBM Spectrum Scale
    7. 3.7 Adding compute nodes to a running cluster
      1. 3.7.1 Adding POWER CPU nodes
      2. 3.7.2 Adding POWER GPU nodes
    8. 3.8 Additional information
    9. 3.9 Other vendors packages for IBM Power Systems
      1. 3.9.1 Databiology
  8. Chapter 4. Medicine of the future with IBM
    1. 4.1 Introduction
    2. 4.2 Healthcare areas
      1. 4.2.1 IBM Watson
      2. 4.2.2 Apache Spark
    3. 4.3 Read and follow
  9. Appendix A. Useful software information
    1. Genomics package
  10. Related publications
    1. IBM Redbooks
    2. Other publications
    3. Online resources
    4. Help from IBM
  11. Back cover

Product information

  • Title: IBM Reference Architecture for Genomics, Power Systems Edition
  • Author(s): Dino Quintero, Luis Bolinches, Marcelo Correia Lima, Katarzyna Pasierb, William dos Santos
  • Release date: April 2016
  • Publisher(s): IBM Redbooks
  • ISBN: 9780738441634