IBM InfoSphere DataStage Data Flow and Job Design

Book description

IBM Information Server is a revolutionary new software platform that helps organizations derive more value from the complex heterogeneous information spread across their systems. It enables organizations to integrate disparate data and deliver trusted information wherever and whenever needed, in line and in context, to specific people, applications, and processes.

IBM InfoSphere™ DataStage® is a critical component of the IBM Information Server, and the parallel framework of IBM InfoSphere DataStage is also the foundation for IBM InfoSphere QualityStage and IBM InfoSphere Information Analyzer components.

This IBM® Redbooks® publication develops usage scenarios that describe the implementation of IBM InfoSphere DataStage flow and job design, with special emphasis on new features such as the distributed transaction stage (DTS) in Version 8.0.1, the slowly changing dimensions stage (Version 8.0.1), the complex flat file stage (Version 8.0.1), and access to mainframe data.

Table of contents

  1. Cover Page
  2. Title Page
  3. Copyright Page
  4. Figures
  5. Tables
  6. Examples
  7. Notices
    1. Trademarks
  8. Preface
    1. The team that wrote this book
    2. Become a published author
    3. Comments welcome
  9. Chapter 1. IBM InfoSphere DataStage overview
    1. 1.1 Introduction
    2. 1.2 IBM Information Server architecture
      1. 1.2.1 Component overview
      2. 1.2.2 Topologies supported
    3. 1.3 IBM InfoSphere DataStage within the IBM Information Server architecture
      1. 1.3.1 Shared components
      2. 1.3.2 Runtime architecture
    4. 1.4 IBM InfoSphere DataStage main functions
      1. 1.4.1 Data transformation
      2. 1.4.2 Jobs
      3. 1.4.3 Parallel processing
    5. 1.5 Best practices overview
      1. 1.5.1 Standards
      2. 1.5.2 Development guidelines
      3. 1.5.3 Component usage
      4. 1.5.4 DataStage data types
      5. 1.5.5 Partitioning data
      6. 1.5.6 Collecting data
      7. 1.5.7 Sorting
      8. 1.5.8 Stage specific guidelines
  10. Chapter 2. IBM InfoSphere DataStage stages
    1. 2.1 Introduction
    2. 2.2 Aggregator
    3. 2.3 Complex Flat File
    4. 2.4 Column Import
    5. 2.5 Column Export
    6. 2.6 Data Set
    7. 2.7 Distributed Transaction (new in Version 8.1)
    8. 2.8 FTP Enterprise
    9. 2.9 Funnel
    10. 2.10 Join
    11. 2.11 Lookup
    12. 2.12 Merge
    13. 2.13 Sequential File
    14. 2.14 Slowly Changing Dimension
    15. 2.15 Sort
    16. 2.16 Surrogate Key Generator
    17. 2.17 Transformer
  11. Chapter 3. Retail industry scenario
    1. 3.1 Retail industry scenario
      1. 3.1.1 One time tasks (Day 0)
      2. 3.1.2 Recurring tasks
      3. 3.1.3 Recurring tasks (Day 1)
      4. 3.1.4 Recurring tasks (Day 2)
      5. 3.1.5 Recurring tasks (Day 3)
  12. Appendix A. IBM Information Server setups
    1. A.1 Introduction
    2. A.2 Configure IBM InfoSphere Classic Federation Server for z/OS
      1. A.2.1 Installation
      2. A.2.2 Configuration of IBM InfoSphere Classic Federation for z/OS system catalog
      3. A.2.3 Configuration of Classic Data Architect
    3. A.3 Create the Queue Manager
    4. A.4 Set up the XA parameters on Queue Manager
    5. A.5 Create the queues
  13. Appendix B. Code and scripts used in the retail industry scenario
    1. B.1 Introduction
  14. Appendix C. Additional material
    1. Locating the Web material
    2. Using the Web material
      1. How to use the Web material
  15. Related publications
    1. Other publications
    2. Online resources
    3. How to get Redbooks
    4. Help from IBM
  16. Index
  17. Footnotes

Product information

  • Title: IBM InfoSphere DataStage Data Flow and Job Design
  • Author(s): Nagraj Alur, Celso Takahashi, Sachiko Toratani, Denis Vasconcelos
  • Release date: July 2008
  • Publisher(s): IBM Redbooks
  • ISBN: 9780738431116