Learning Informatica PowerCenter 10.x - Second Edition

Book Description

Harness the power and simplicity of Informatica PowerCenter 10.x to build and manage efficient data management solutions.

About This Book
  • Master PowerCenter 10.x components to create, execute, monitor, and schedule ETL processes with a practical approach
  • An ideal guide to building the skills and competencies needed to become an expert Informatica PowerCenter developer
  • A comprehensive guide to extracting, transforming, and loading huge volumes of data effectively, with reduced resource consumption
Who This Book Is For

If you wish to deploy Informatica in enterprise environments and build a career in data warehousing, then this book is for you. Whether you are a software developer or analytics professional who is new to Informatica, or an experienced user, you will learn all the features of Informatica 10.x. A basic knowledge of programming and data warehouse concepts is essential.

What You Will Learn
  • Install or upgrade the components of the Informatica PowerCenter tool
  • Build administrative skills and work with the Informatica PowerCenter developer screens, such as Designer, Workflow Manager, Workflow Monitor, and Repository Manager
  • Get practical, hands-on experience with the various sections of Informatica PowerCenter, such as the navigator, toolbar, workspace, control panel, and so on
  • Leverage basic and advanced utilities, such as the debugger, target load plan, and incremental aggregation to process data
  • Implement data warehousing concepts such as schemas and SCDs using Informatica
  • Migrate various components, such as sources and targets, to another region using the Designer and Repository Manager screens
  • Enhance code performance using tips such as pushdown optimization and partitioning
In Detail

Informatica PowerCenter is an industry-leading ETL tool, known for its accelerated data extraction, transformation, and data management strategies. This book will be your quick guide to exploring Informatica PowerCenter's powerful features, such as working with sources, targets, and transformations, performance optimization, scheduling, deploying for processing, and managing your data at speed.

First, you'll learn how to install and configure the tools. You will learn to implement various data warehouse and ETL concepts, and use PowerCenter 10.x components to build mappings, tasks, workflows, and so on. You will come across features such as transformations, SCD, XML processing, partitioning, constraint-based loading, incremental aggregation, and many more. Moreover, you'll also learn to deliver powerful visualizations for data profiling using the advanced monitoring dashboard functionality offered by the new version.

Using data transformation techniques, performance tuning, and the many new advanced features, this book will help you understand and process data for training or production purposes. The step-by-step approach and adoption of real-time scenarios will guide you through effectively accessing all core functionalities offered by Informatica PowerCenter version 10.x.

Style and approach

You'll get hands-on with sources, targets, transformations, performance optimization, scheduling, deploying for processing, and managing your data, and learn everything you need to become a proficient Informatica PowerCenter developer.

Table of Contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
      1. Errata
      2. Piracy
      3. Questions
  2. Downloading and Extracting Informatica PowerCenter Software
    1. Downloading the latest version of Informatica PowerCenter - 10.1.0
    2. Extracting the downloaded files - preparing the installable
    3. Informatica installation - the prerequisites
    4. Beginning the installation - installing the server
      1. Configuring the domain and node
    5. Getting the graphical interface ready - client installation
    6. Summary
  3. Understanding Admin Console
    1. The Informatica architecture
      1. Domain
      2. Node
    2. Informatica services
      1. The service manager
      2. Repository
      3. Repository services
      4. Integration services
      5. Model repository service
    3. The Administration Console - configuration
    4. Repository creation - the centralized database for Informatica
    5. Creating the integration service - the path for flow of data
    6. Model Repository Service - a storage place for other developer tools
    7. Informatica users and authentications
    8. Repository Manager - the client configuration
    9. Summary
  4. Understanding Designer Screen and its Components
    1. Understanding Designer Interface
      1. Designer screen components
    2. Working with Sources
      1. Adding new Open Database Connectivity (ODBC) data source
      2. Working with relational database tables - the Import option
      3. Working with flat files - Import option
        1. Working with delimited files
        2. Working with fixed width files
      4. Working with Sources - the Create option
    3. Working with Targets
      1. Working with Target relational database tables - the Import option
      2. Working with Target Flat Files - the Import option
        1. Working with delimited files
        2. Working with fixed width Files
      3. Working with Target - the Create option
      4. Working with Target - the Copy or Drag-Drop option
      5. Creating Source Definition from Target structure
    4. Feel of data inside Repository - preview
      1. Previewing the source data - flat files
      2. Previewing the Source Data - relational tables
    5. Creating a Database Table
    6. Creating a mapping and using transformation features
    7. Summary
  5. The Lifeline of Informatica - Transformations
    1. Creating the transformation
      1. Mapping Designer
      2. Mapplet Designer
      3. Transformation Developer
    2. Expression transformation
    3. Ports in transformations
    4. Using the expression editor
    5. Aggregator transformation
      1. Using Group By
      2. Using Sorted Input
    6. Sorter transformation
    7. Filter transformation
    8. Router transformation
    9. Rank transformation
      1. Group by Ranking
      2. Rank Index
    10. Sequence Generator transformation
      1. Ports of Sequence Generator transformation
      2. Properties of Sequence Generator transformation
    11. Joiner transformation
      1. Master and Detail Pipeline
      2. Join condition
      3. Join type
        1. Normal join
        2. Full join
        3. Master Outer join
        4. Detail Outer join
    12. Union transformation
    13. Source Qualifier transformation
      1. Viewing default query
      2. Overriding default query
        1. Using the WHERE clause
        2. Joining Source Data
        3. Sorting the data
        4. Selecting distinct records
    14. Classification of transformations
      1. Active and Passive
      2. Connected and Unconnected
    15. Lookup transformation
      1. Creating the Lookup transformation
      2. Configuring the Lookup transformation
      3. Configuring the Lookup transformation
      4. Lookup ports
      5. Lookup query
      6. Unconnected Lookup transformation
      7. Lookup transformation properties
    16. Update Strategy transformation
    17. Normalizer transformation
      1. Configuring Normalizer transformation - ports
    18. Stored Procedure transformation
      1. Importing Stored Procedure transformation
      2. Creating Stored Procedure transformation
      3. Using Stored Procedure transformation in Mapping
        1. Connected Stored Procedure transformation
        2. Unconnected Stored Procedure transformation
    19. Transaction Control transformation
    20. Types of Lookup cache
      1. Building the Cache - Sequential or Concurrent
        1. Sequential cache
        2. Concurrent cache
      2. Persistent cache - the permanent one
      3. Sharing the cache - named or unnamed
        1. Sharing unnamed cache
        2. Sharing named cache
      4. Modifying cache - static or dynamic
        1. Static cache
        2. Dynamic cache
    21. Tracing level
    22. Summary
  6. Using the Designer Screen - Advanced Features
    1. Debug me please - the debugger
    2. Reuse me please - reusable transformation
      1. Using Transformation Developer
      2. Making an existing transformation reusable
      3. Mapplet
    3. Managing the constraints - the target load plan
    4. Avoid hardcoding - parameters and variables
    5. Comparing objects
    6. Summary
  7. Implementing SCD Using Designer Screen Wizards
    1. Types of SCD
      1. SCD1
      2. SCD2 - Version number
      3. SCD2 - FLAG
      4. SCD2 - Date range
      5. SCD3
    2. SCD1 - I hate history!
    3. SCD2 (version number) - I need my ancestors!
    4. SCD2 (flag) - flag the history
    5. SCD2 (date range) - marking the dates
    6. SCD3 - store something if not everything
    7. Summary
  8. Using the Workflow Manager Screen
    1. Using the Workflow Manager
    2. Creating a workflow
      1. Creating a workflow manually
      2. Creating a workflow automatically
    3. Adding a task to the workflow
      1. Adding tasks to the workflow directly
      2. Adding tasks to the workflow by task developer
      3. Adding tasks to the workflow by task developer
    4. Working with the Session task and basic properties
    5. Assigning the integration service to the workflow
    6. Deleting a workflow
    7. Trigger - starting a workflow
      1. Running a complete workflow
      2. Running a part of the workflow
      3. Running a task
    8. Working with Connection objects
      1. Creating a Connection object
      2. Configuring the Relational Database
    9. Summary
  9. Learning Various Tasks in Workflow Manager screen
    1. Working with tasks
    2. Configuring a task
    3. Session task
      1. Tabs of a session task
      2. Creating a Session Task
    4. Command task
      1. Creating a Command task
    5. Email Task
      1. Creating an Email Task
    6. Assignment Task
      1. Creating an Assignment Task
    7. Timer task
      1. Creating a Timer Task
    8. Control task
      1. Creating a Control task
    9. Decision task
      1. Creating a Decision task
    10. Event tasks - Event Wait and Event Raise
      1. Creating an Event (Wait/Raise) task
    11. Link task
      1. Creating a link task
    12. Worklets - Groups of tasks
      1. Creating a Worklet
    13. Summary
  10. Advanced Features of Workflow Manager Screen
    1. Schedulers
    2. File list - the indirect way
    3. Incremental Aggregation
    4. Parameter file - parameters and variables
      1. Defining session-level variables
      2. Defining workflow-level variables
      3. Defining mapping-level variables
      4. Creating a parameter file
      5. Mentioning the Parameter file at the workflow level
      6. Mentioning the Parameter file at the session level
    5. Summary
  11. Working with Workflow Monitor - Monitoring the code
    1. Using the Workflow Monitor
    2. Connecting the Workflow Manager Screen
    3. Opening previous workflow runs
    4. Running or recovering workflow or task
    5. Stopping or aborting the workflow or task
    6. Status of the workflow and tasks
    7. Viewing session log and workflow log
    8. Working with the workflow log
    9. Working with the session Log
    10. Viewing workflow run properties
    11. Viewing session run properties
      1. Task detail properties
      2. Source/target statistics properties
    12. Common Errors
    13. Summary
  12. The Deployment Phase - Using Repository Manager
    1. Using the Repository Manager
    2. Take me to the next stage - Deployment or Migration
      1. Export/Import
        1. Migrating from Designer
        2. Migrating from Repository Manager
      2. Copy/Paste
      3. Drag/Drop
    3. Summary
  13. Optimization - Performance Tuning
    1. Bottlenecks
      1. Finding the target bottleneck
        1. Using thread statistics
        2. Configuring the sample target load
      2. Eliminating the target bottleneck
        1. Minimizing target table deadlocks
        2. Dropping indexes and constraints
        3. Increasing the checkpoint interval
        4. Using an external loader
        5. Increasing the network packet size
        6. Using bulk load
      3. Finding the source bottleneck
        1. Using thread statistics
        2. Test mapping
        3. Using filter transformation
        4. Checking the database query
      4. Eliminating the source bottleneck
        1. Increasing the network packet size
        2. Optimizing the database query
      5. Finding the mapping bottleneck
        1. Using thread statistics
        2. Using filter transformation
      6. Eliminating the mapping bottleneck
        1. Using single pass mapping
        2. Avoiding data type conversions
        3. Unchecking unnecessary ports
        4. Processing numeric data
        5. Using operators instead of functions
        6. Using Decode in place of multiple IIF
        7. Tracing level
        8. Using variable ports
        9. Optimizing filter transformation
        10. Optimizing Aggregator transformation
        11. Optimizing joiner transformation
        12. Optimizing lookup transformation
      7. Eliminating the session bottleneck
        1. Optimizing the commit interval
        2. Buffer memory
        3. Performance data
      8. Eliminating the system bottleneck
    2. Working on partitioning
      1. Partitioning properties
        1. Partition points
        2. Number of partitions
        3. Partition types
    3. Pushdown optimization
    4. Summary