Turning Spreadsheets into Corporate Data

Book Description

For years, business users have relied on spreadsheets to store and communicate data. Although spreadsheets are easy to create and update, basing important corporate decisions on them is risky because spreadsheet data lacks credibility. Whether you are a manager, developer, end user, or student, this book will help you turn spreadsheet data into credible, reliable data that can be trusted when making important decisions.
A chapter is dedicated to each of the following topics:
  • Brief history of spreadsheets
  • Spreadsheet paradox
  • Spreadsheet varieties
  • The PDF spreadsheet
  • Spreadsheet formatting
  • Spreadsheet disambiguation
  • The intermediate database
  • The ssdef database
  • The corporate database
  • The metadata database (mnemonic database)
  • Political considerations
  • Data modeling and the spreadsheet
  • Case study

Table of Contents

  1. Introduction
  2. 1: Brief History of Spreadsheets
    1. The IT Labyrinth
    2. End User Acceptance of the Spreadsheet
    3. Spreadsheet Hell
    4. A Tradeoff
    5. Responsibility—The Flip Side of Control
    6. Management’s Problem
    7. Differences Between Two Types of Data
    8. In Summary
  3. 2: Spreadsheet Paradox
    1. Public Data
    2. The Spreadsheet as a Medium of Exchange
    3. Recurring/Non-Recurring Spreadsheets
    4. The Spectrum of Spreadsheets
    5. The Cost of Transforming a Spreadsheet
    6. Factors Other than Cost
    7. Transcription of Data
    8. Cell Formula
    9. Spreadsheet Descriptors
    10. Artificially Supplying Descriptors
    11. In Summary
  4. 3: Spreadsheet Varieties
    1. Simple Demarcation—xlstab
    2. Other Special Characters—eold and Linefeed
    3. The Internal View of a Spreadsheet
    4. A Missing Column Heading
    5. A Missing Value
    6. A Multiline Row
    7. The “Standard” Spreadsheet Format
    8. Managing the User of the Spreadsheet
    9. The ssdef Table
    10. The Spreadsheet Processing Log
    11. The Lineage of Spreadsheet Data
    12. The Cell Formula
    13. Relating to the Real World
    14. Identifying the Header Line
    15. In Summary
  5. 4: The PDF Spreadsheet
    1. The Importance of Special Characters
    2. PDF and OCR
    3. A Final Option
    4. In Summary
  6. 5: The Basics of Spreadsheet Formatting
    1. The System Name
    2. Unreliability of Report Name
    3. Multiple Sheets in a Spreadsheet
    4. Other Special Characters
    5. Identifying Column Headings
    6. Similar Column Headings
    7. Blocking Off Sections of a Spreadsheet
    8. Non-Standard Spreadsheet Structures
    9. A Spreadsheet that Cannot be Mapped
    10. A Spreadsheet in a TXT Format
    11. In Summary
  7. 6: Spreadsheet Disambiguation
    1. Selecting Spreadsheets for Inclusion into Corporate Data
    2. Recasting the Spreadsheet
    3. Logging the Spreadsheet for Transformation
    4. Entry into the Path Queue
    5. Defining the Spreadsheet Headings
    6. Pairing the ssdef Specification to the Spreadsheet
    7. Finding and Creating Database Definitions and Values
    8. The Intermediate Database
    9. Some Anomalies
    10. What if an Error is Discovered?
    11. Manual Effort Required
    12. Spreadsheet Width
    13. Subdividing a Spreadsheet
    14. No Value for a Column Name
    15. No Column Headings
    16. Creating the ssdef Specification Once
    17. In Summary
  8. 7: The Intermediate Database
    1. Finding Errors
    2. The Contents of the Intermediate Database
    3. Functions Served by the Data Elements
    4. Alternate Name
    5. Adding Context to Data Values
    6. Editing Data in the Intermediate Database
    7. In Summary
  9. 8: The ssdef Database
    1. Organizing Data Inside the ssdef Table
    2. Processing Using ssdef Records
    3. Searching the Full Path Queue
    4. In Summary
  10. 9: The Corporate Database
    1. From Intermediate Data to Corporate Data
    2. Grouped Corporate Data
    3. Tracing the Lineage
    4. In Summary
  11. 10: The Mnemonic Dictionary
    1. The Contents of the Mnemonic Dictionary
    2. Grouping Like Data Elements
    3. Applying Naming Conventions
    4. Value of the Mnemonic Dictionary
    5. In Summary
  12. 11: Political Considerations Within the Organization
    1. Shifting Control
    2. Immutability of Data
    3. The Importance of Alternate Names
    4. Limited Editing
    5. Super Classifications of Data
    6. The Lineage of Corporate Data
    7. Relative Volumes of Data
    8. In Summary
  13. 12: Data Modeling and the Spreadsheet Environment
    1. The Entity Relationship Diagram
    2. The Data Item Set
    3. The Physical Model
    4. The Data Model
    5. The Data Model and Spreadsheet Data
    6. “Correctness” of Data
    7. Aligning Data from Different Spreadsheets
    8. An Algorithmic Resolution
    9. An Indexed Resolution
    10. Resolution and the Data Model
    11. Spreadsheet Data in the Data Warehouse
    12. Changing Spreadsheet Data
    13. In Summary
  14. 13: Case Study
  15. Glossary
  16. Index