Book description
This IBM® Redbooks® publication demonstrates and documents how to implement and manage an IBM PowerLinux™ cluster for big data. It focuses on hardware management; operating system provisioning; application provisioning; cluster readiness checks; hardware, operating system, IBM InfoSphere® BigInsights™, IBM Platform Symphony®, IBM Spectrum™ Scale (formerly IBM GPFS™), and application monitoring; and performance tuning. This publication shows that IBM PowerLinux clustering solutions (hardware and software) deliver significant value to clients that need cost-effective, highly scalable, and robust solutions for big data and analytics workloads.
This book explains how to use IBM Platform Cluster Manager to manage PowerLinux big data clusters through IBM InfoSphere BigInsights, Spectrum Scale, and Platform Symphony. It documents how to set up and manage a big data cluster on PowerLinux servers, customize application and programming solutions, and tune applications to use IBM hardware architectures. It draws on the architectural technologies and software solutions that are available from IBM to help solve challenging technical and business problems.
This book is targeted at technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) who are responsible for delivering cost-effective Linux on IBM Power Systems™ solutions that help uncover insights in clients' data so that they can act to optimize business results, product development, and scientific discoveries.
Table of contents
- Front cover
- Notices
- IBM Redbooks promotions
- Preface
- Chapter 1. Introduction to the solution
- Chapter 2. Reference architecture
- Chapter 3. Installation
- Chapter 4. Design considerations
  - 4.1 Important factors for sizing an InfoSphere BigInsights cluster
  - 4.2 Customizing the predefined configurations
  - 4.3 IBM Spectrum Scale (formerly GPFS) considerations
  - 4.4 High availability considerations
  - 4.5 Throughput and bandwidth considerations
  - 4.6 Data volumes considerations
  - 4.7 Security, user authentication, and edge nodes
  - 4.8 Impact of use cases in design
- Chapter 5. Solution customization
- Chapter 6. Cluster management
- Chapter 7. Tuning
- Appendix A. Integration and configuration for IBM Spectrum Scale, Hadoop, and IBM Platform Symphony
  - Test cluster description
  - Installation and configuration of IBM Spectrum Scale (formerly GPFS)
  - Configuration of the IBM Spectrum Scale File Placement Optimizer Hadoop Connector
  - Installing and configuring IBM Java and Apache Hadoop
  - Running a Hadoop MapReduce job
  - Installing and configuring IBM Platform Symphony V7.1
  - Running Hadoop MapReduce jobs on Platform Symphony
- Appendix B. Scripts
- Appendix C. BigData Enablement and Administration Toolkit introduction
- Related publications
- Back cover
Product information
- Title: Implementing an IBM InfoSphere BigInsights Cluster using Linux on Power
- Author(s):
- Release date: June 2015
- Publisher(s): IBM Redbooks
- ISBN: 9780738440743