IBM Cloud Pak for Data

Book description

Build end-to-end AI solutions with IBM Cloud Pak for Data to operationalize AI on a secure platform based on cloud-native reliability, cost-effective multitenancy, and efficient resource management.

Key Features

  • Explore data virtualization by accessing data in real time without moving it
  • Unify the data and AI experience with the integrated end-to-end platform
  • Explore the AI life cycle and learn to build, experiment, and operationalize trusted AI at scale

Book Description

Cloud Pak for Data is IBM's modern data and AI platform. It brings together strategic offerings from IBM's data and AI portfolio, delivered in a cloud-native fashion and deployable on any cloud. The platform offers a unique approach to addressing modern challenges with an integrated mix of proprietary, open-source, and third-party services.

You'll begin by getting to grips with key concepts in modern data management and artificial intelligence (AI), reviewing real-life use cases, and developing an appreciation of the AI Ladder principle. Once you're comfortable with the basics, you will explore how Cloud Pak for Data enables an elegant implementation of the AI Ladder practice to collect, organize, analyze, and infuse data and trustworthy AI across your business. As you advance, you'll discover the capabilities of the platform and extension services, including how they are packaged and priced. With the help of examples throughout the book, you will gain a deep understanding of the platform, from its rich capabilities and technical architecture to its ecosystem and key go-to-market aspects.

By the end of this IBM book, you'll be able to apply IBM Cloud Pak for Data's prescriptive practices and leverage its capabilities to build a trusted data foundation and accelerate AI adoption in your enterprise.

What you will learn

  • Understand the importance of digital transformations and the role of data and AI platforms
  • Get to grips with data architecture and its relevance in driving AI adoption using IBM's AI Ladder
  • Understand Cloud Pak for Data, its value proposition, capabilities, and unique differentiators
  • Delve into the pricing, packaging, key use cases, and competitors of Cloud Pak for Data
  • Use the Cloud Pak for Data ecosystem with premium IBM and third-party services
  • Discover IBM's vibrant ecosystem of proprietary, open-source, and third-party offerings from over 35 ISVs

Who this book is for

This book is for data scientists, data stewards, developers, and data-focused business executives interested in learning about IBM's Cloud Pak for Data. Knowledge of technical concepts related to data science and familiarity with data analytics and AI initiatives at various levels of maturity are required to make the most of this book.

Table of contents

  1. IBM Cloud Pak for Data
  2. Contributors
  3. About the authors
  4. About the reviewers
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the color images
    5. Conventions used
    6. Get in touch
    7. Share Your Thoughts
  6. Section 1: The Basics
  7. Chapter 1: The AI Ladder – IBM's Prescriptive Approach
    1. Market dynamics and IBM's Data and AI portfolio
    2. Introduction to the AI ladder
      1. The rungs of the AI ladder
    3. Collect – making data simple and accessible
    4. Organize – creating a trusted analytics foundation
      1. People empowering your data citizens
    5. Analyze – building and scaling models with trust and transparency
    6. Infuse – operationalizing AI throughout the business
      1. Customer service
      2. Risk and compliance
      3. IT operations
      4. Financial operations
      5. Business operations
      6. The case for a data and AI platform
    7. Summary
  8. Chapter 2: Cloud Pak for Data: A Brief Introduction
    1. The case of a data and AI platform – recap
    2. Overview of Cloud Pak for Data
    3. Exploring unique differentiators, key use cases, and customer adoption
      1. Key use cases
      2. Customer use case: AI claim processing
      3. Customer use case: data and AI platform
    4. Cloud Pak for Data: additional details
      1. An open ecosystem
      2. Premium IBM cartridges and third-party services
      3. Industry accelerators
      4. Packaging and deployment options
    5. Red Hat OpenShift
    6. Summary
  9. Section 2: Product Capabilities
  10. Chapter 3: Collect – Making Data Simple and Accessible
    1. Data – the world's most valuable asset
      1. Data-centric enterprises
    2. Challenges with data-centric delivery
    3. Enterprise data architecture
      1. NoSQL data stores – key categories
    4. Data virtualization – accessing data anywhere
    5. Data virtualization versus ETL – when to use what?
    6. Platform connections – streamlining data connectivity
    7. Data estate modernization using Cloud Pak for Data
    8. Summary
  11. Chapter 4: Organize – Creating a Trusted Analytics Foundation
    1. Introducing Data Operations (DataOps)
    2. Organizing enterprise information assets
    3. Establishing metadata and stewardship
      1. Business metadata components
      2. Technical metadata components
    4. Profiling to get a better understanding of your data
    5. Classifying data for completeness
      1. Automating data discovery and business term assignment
    6. Enabling trust with data quality
      1. Steps to assess data quality
      2. DataOps in action
      3. Automation rules around data quality
    7. Data privacy and activity monitoring
    8. Data integration at scale
      1. Considerations for selecting a data integration tool
      2. The extract, transform, and load (ETL) service in Cloud Pak for Data
      3. Advantages of leveraging a cloud-native platform for ETL
    9. Master data management
    10. Extending MDM toward a Digital Twin
    11. Summary
  12. Chapter 5: Analyzing: Building, Deploying, and Scaling Models with Trust and Transparency
    1. Self-service analytics of governed data
    2. BI and reporting
    3. Predictive versus prescriptive analytics
    4. Understanding AI
    5. AI life cycle – Transforming insights into action
    6. AI governance: Trust and transparency
    7. Automating the AI life cycle using Cloud Pak for Data
      1. Data science tools for a diverse data science team
      2. Distributed AI
      3. Establishing a collaborative environment and building AI models
      4. Choosing the right tools to use
      5. ModelOps – Deployment phase
      6. ModelOps – Monitoring phase
      7. Streaming data/analytics
      8. Distributed processing
    8. Summary
  13. Chapter 6: Multi-Cloud Strategy and Cloud Satellite
    1. IBM's multi-cloud strategy
    2. Supported deployment options
      1. Managed OpenShift
      2. AWS Quick Start
      3. Azure Marketplace and QuickStart templates
    3. Cloud Pak for Data as a Service
      1. Packaging and pricing
    4. IBM Cloud Satellite
    5. A data fabric for a multi-cloud future
    6. Summary
  14. Chapter 7: IBM and Partner Extension Services
    1. IBM and third-party extension services
    2. Collect extension services
      1. Db2 Advanced
      2. Informix
      3. Virtual Data Pipeline
      4. EDB Postgres Advanced Server
      5. MongoDB Enterprise Advanced
    3. Organize extension services
      1. DataStage
      2. Information Server
      3. Master Data Management
      4. Analyze cartridges – IBM Palantir
    4. Infuse cartridges
      1. Cognos Analytics
      2. Planning Analytics
      3. Watson Assistant
      4. Watson Discovery
      5. Watson API Kit
    5. Modernization upgrades to Cloud Pak for Data cartridges
      1. Extension services
    6. Summary
  15. Chapter 8: Customer Use Cases
    1. Improving health advocacy program efficiency
    2. Voice-enabled chatbots
    3. Risk and control automation
    4. Enhanced border security
    5. Unified Data Fabric
    6. Financial planning and analytics
    7. Summary
  16. Section 3: Technical Details
  17. Chapter 9: Technical Overview, Management, and Administration
    1. Technical requirements
    2. Architecture overview
      1. Characteristics of the platform
      2. Technical underpinnings
      3. The operator pattern
      4. The platform technical stack
    3. Infrastructure requirements, storage, and networking
      1. Understanding how storage is used
      2. Networking
    4. Foundational services and the control plane
      1. Cloud Pak foundational services
      2. Cloud Pak for Data control plane
      3. Management and monitoring
    5. Multi-tenancy, resource management, and security
      1. Isolation using namespaces
      2. Resource management and quotas
      3. Enabling tenant self-management
    6. Day 2 operations
      1. Upgrades
      2. Scale-out
      3. Backup and restore
    7. Summary
    8. References
  18. Chapter 10: Security and Compliance
    1. Technical requirements
    2. Security and Privacy by Design
      1. Development practices
      2. Vulnerability detection
      3. Delivering security assured container images
    3. Secure operations in a shared environment
      1. Securing Kubernetes hosts
      2. Security in OpenShift Container Platform
      3. Namespace scoping and service account privileges
      4. RBAC and the least privilege principle
      5. Workload notification and reliability assurance
      6. Additional considerations
      7. Encryption in motion and securing entry points
      8. Encryption at rest
      9. Anti-virus software
    4. User access and authorizations
      1. Authentication
      2. Authorization
      3. User management and groups
      4. Securing credentials
    5. Meeting compliance requirements
      1. Configuring the operating environment for compliance
      2. Auditing
      3. Integration with IBM Security Guardium
    6. Summary
    7. References
  19. Chapter 11: Storage
    1. Understanding the concept of persistent volumes
      1. Kubernetes storage introduction
      2. Types of persistent volumes
      3. In-cluster storage
      4. Optimized hyperconverged storage and compute
      5. Separated compute and storage Nodes
      6. Provisioning procedure summary
    2. Off-cluster storage
      1. NFS-based persistent volumes
    3. Operational considerations
      1. Continuous availability with in-cluster storage
      2. Data protection – snapshots, backups, and active-passive disaster recovery
      3. Quiescing Cloud Pak for Data services
      4. Db2 database backups and HADR
      5. Kubernetes cluster backup and restore
    4. Summary
    5. Further reading
  20. Chapter 12: Multi-Tenancy
    1. Tenancy considerations
      1. Designating tenants
      2. Organizational and operational implications
    2. Architecting for multi-tenancy
      1. Achieving tenancy with namespace scoping
      2. Ensuring separation of duties with Kubernetes RBAC and separation of duties with operators
      3. Securing access to a tenant instance
      4. Choosing dedicated versus shared compute nodes
    3. Reviewing the tenancy requirements
      1. Isolating tenants
      2. Tenant security and compliance
      3. Self-service and management
      4. A summary of the assessment
    4. In-namespace sub-tenancy with looser isolation
      1. Approach
      2. Assessing the limitations of this approach
    5. Summary
    6. Why subscribe?
  21. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts

Product information

  • Title: IBM Cloud Pak for Data
  • Author(s): Hemanth Manda, Sriram Srinivasan, Deepak Rangarao
  • Release date: November 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781800562127