Data Mining with Microsoft® SQL Server® 2008

Book description

Data Mining with Microsoft SQL Server 2008 shows you how to use the new data mining features of SQL Server 2008, including how to use the SQL Server Data Mining toolset with Office 2007 to mine and analyze data. Explore each of the major data mining algorithms: Naïve Bayes, decision trees, time series, clustering, association rules, and neural networks. Learn about mining OLAP databases, data mining with SQL Server Integration Services 2008, and using Microsoft data mining to solve business analysis problems.

Table of contents

  1. Copyright
  2. About the Authors
  3. Credits
  4. Acknowledgments
  5. Foreword
  6. Introduction
    1. How This Book Is Organized
    2. Who Should Read This Book
    3. Conventions
    4. Tools You Will Need
    5. What's on the Website
  7. 1. Introduction to Data Mining in SQL Server 2008
    1. 1.1. Business Problems for Data Mining
    2. 1.2. Data Mining Tasks
      1. 1.2.1. Classification
      2. 1.2.2. Clustering
      3. 1.2.3. Association
      4. 1.2.4. Regression
      5. 1.2.5. Forecasting
      6. 1.2.6. Sequence Analysis
      7. 1.2.7. Deviation Analysis
    3. 1.3. Data Mining Project Cycle
      1. 1.3.1. Business Problem Formation
      2. 1.3.2. Data Collection
      3. 1.3.3. Data Cleaning and Transformation
      4. 1.3.4. Model Building
      5. 1.3.5. Model Assessment
      6. 1.3.6. Reporting and Prediction
      7. 1.3.7. Application Integration
      8. 1.3.8. Model Management
    4. 1.4. Summary
  8. 2. Applied Data Mining Using Microsoft Excel 2007
    1. 2.1. Setting Up the Table Analysis Tools
      1. 2.1.1. Configuring Analysis Services with Administrative Privileges
      2. 2.1.2. Configuring Analysis Services without Administrative Privileges
      3. 2.1.3. What the Add-Ins Expect
      4. 2.1.4. What to Do If You Need Help
    2. 2.2. The Analyze Key Influencers Tool
      1. 2.2.1. The Main Influencers Report
      2. 2.2.2. The Discrimination Report
      3. 2.2.3. Summary of the Analyze Key Influencers Task
    3. 2.3. The Detect Categories Tool
      1. 2.3.1. Launching the Tool
      2. 2.3.2. The Categories Report
        1. 2.3.2.1. Categories and the Number of Rows in Each
        2. 2.3.2.2. Characteristics of Each Category
        3. 2.3.2.3. The Category Profiles Chart
      3. 2.3.3. Summary of the Detect Categories Tool
    4. 2.4. The Fill From Example Tool
      1. 2.4.1. Running the Tool and Interpreting the Results
      2. 2.4.2. Refining the Results
      3. 2.4.3. Summary of the Fill From Example Tool
    5. 2.5. The Forecasting Tool
      1. 2.5.1. Launching the Tool and Specifying Options
      2. 2.5.2. Interpreting the Results
      3. 2.5.3. Summary of the Forecast Tool
    6. 2.6. The Highlight Exceptions Tool
      1. 2.6.1. Using the Tool
      2. 2.6.2. More Complex Interactions
      3. 2.6.3. Limitations and Troubleshooting
      4. 2.6.4. Summary of the Highlight Exceptions Tool
    7. 2.7. The Scenario Analysis Tool
      1. 2.7.1. The Goal Seek Tool
      2. 2.7.2. Using Goal Seek for a Numeric Goal
      3. 2.7.3. Using Goal Seek for the Whole Table
      4. 2.7.4. The What-If Tool
      5. 2.7.5. Using What-If for the Whole Table
      6. 2.7.6. Summary of the Scenario Analysis Tool
    8. 2.8. The Prediction Calculator Tool
      1. 2.8.1. Running the Tool
        1. 2.8.1.1. The Prediction Calculator Spreadsheet
        2. 2.8.1.2. The Printable Calculator Spreadsheet
      2. 2.8.2. Refining the Results
      3. 2.8.3. Using the Results
      4. 2.8.4. Summary of the Prediction Calculator Tool
    9. 2.9. The Shopping Basket Analysis Tool
      1. 2.9.1. Using the Tool
      2. 2.9.2. The Bundled Item Report
      3. 2.9.3. The Recommendations Report
      4. 2.9.4. Tweaking the Tool
      5. 2.9.5. Summary of the Shopping Basket Analysis Tool
    10. 2.10. Technical Overview of the Table Analysis Tools
    11. 2.11. Summary
  9. 3. Data Mining Concepts and DMX
    1. 3.1. History of DMX
    2. 3.2. Why DMX?
    3. 3.3. The Data Mining Process
    4. 3.4. Key Concepts
      1. 3.4.1. Attribute
      2. 3.4.2. State
      3. 3.4.3. Case
      4. 3.4.4. Keys
      5. 3.4.5. Inputs and Outputs
    5. 3.5. DMX Objects
      1. 3.5.1. Mining Structure
      2. 3.5.2. Mining Model
    6. 3.6. DMX Query Syntax
      1. 3.6.1. Creating Mining Structures
        1. 3.6.1.1. Discretized Columns
        2. 3.6.1.2. Nested Tables
        3. 3.6.1.3. Partitioning into Testing and Training Sets
      2. 3.6.2. Creating Mining Models
        1. 3.6.2.1. Nested Tables
        2. 3.6.2.2. Complex Nesting Scenarios
        3. 3.6.2.3. Filters
      3. 3.6.3. Populating Mining Structures
        1. 3.6.3.1. Populating Nested Tables
        2. 3.6.3.2. Querying Structure Data
        3. 3.6.3.3. Querying Model Data
    7. 3.7. Prediction
      1. 3.7.1. Prediction Join
      2. 3.7.2. Prediction Query Syntax
        1. 3.7.2.1. Nested Source Data
        2. 3.7.2.2. Real-Time Prediction
        3. 3.7.2.3. Degenerate Predictions
      3. 3.7.3. Prediction Functions
        1. 3.7.3.1. PredictNodeID
        2. 3.7.3.2. External and User-Defined Functions
      4. 3.7.4. Predictions on Nested Tables
      5. 3.7.5. Predicting Nested Value Columns
    8. 3.8. Summary
  10. 4. Using SQL Server Data Mining
    1. 4.1. Introducing the Business Intelligence Development Studio
      1. 4.1.1. Understanding the User Interface
      2. 4.1.2. Offline Mode and Immediate Mode
        1. 4.1.2.1. Immediate Mode
        2. 4.1.2.2. Getting Started in Immediate Mode
        3. 4.1.2.3. Offline Mode
        4. 4.1.2.4. Getting Started in Offline Mode
        5. 4.1.2.5. Switching Project Modes
      3. 4.1.3. Creating Data Mining Objects
    2. 4.2. Setting Up Your Data Sources
      1. 4.2.1. Understanding Data Sources
        1. 4.2.1.1. Creating the MovieClick Data Source
      2. 4.2.2. Using the Data Source View
        1. 4.2.2.1. Creating the MovieClick Data Source View
        2. 4.2.2.2. Working with Named Calculations
        3. 4.2.2.3. Creating a Named Calculation on the Customers Table
        4. 4.2.2.4. Working with Named Queries
        5. 4.2.2.5. Creating a Named Query Based on the Customers Table
        6. 4.2.2.6. Organizing the DSV
        7. 4.2.2.7. Exploring Data
    3. 4.3. Creating and Editing Models
      1. 4.3.1. Structures and Models
      2. 4.3.2. Using the Data Mining Wizard
      3. 4.3.3. Creating the MovieClick Mining Structure and Model
      4. 4.3.4. Using Data Mining Designer
        1. 4.3.4.1. Working with the Mining Structure Editor
        2. 4.3.4.2. Adding the Genre Column to the Movies Nested Table
        3. 4.3.4.3. Working with the Mining Models Editor
        4. 4.3.4.4. Creating and Modifying Additional Models
    4. 4.4. Processing
      1. 4.4.1. Processing the MovieClick Mining Structure
    5. 4.5. Using Your Models
      1. 4.5.1. Understanding the Model Viewers
      2. 4.5.2. Using the Mining Accuracy Chart
        1. 4.5.2.1. Selecting Test Data
        2. 4.5.2.2. Understanding the Accuracy Charts
        3. 4.5.2.3. Using the Profit Chart
        4. 4.5.2.4. Multiple Target Accuracy Charts
        5. 4.5.2.5. Using the Classification Matrix
        6. 4.5.2.6. Scatter Accuracy Charts
      3. 4.5.3. Creating a Lift Chart on MovieClick
      4. 4.5.4. Using CrossValidation
      5. 4.5.5. Using the Mining Model Prediction Builder
      6. 4.5.6. Executing a Query on the MovieClick Model
      7. 4.5.7. Creating Data Mining Reports
    6. 4.6. Using SQL Server Management Studio
      1. 4.6.1. Understanding the Management Studio User Interface
      2. 4.6.2. Using Server Explorer
      3. 4.6.3. Using Object Explorer
      4. 4.6.4. Using the Query Editor
    7. 4.7. Summary
  11. 5. Implementing a Data Mining Process Using Office 2007
    1. 5.1. Introducing the Data Mining Client
    2. 5.2. Importing Data Using the Data Mining Client
    3. 5.3. Data Exploration and Preparation
      1. 5.3.1. Discretizing Data with the Explore Data Tool
      2. 5.3.2. Chopping Off the Long Tail
      3. 5.3.3. Consolidating Meaning
      4. 5.3.4. Eliminating Spurious Values
      5. 5.3.5. Rebalancing Data
    4. 5.4. Modeling
      1. 5.4.1. Task-Based Modeling
        1. 5.4.1.1. Introduction
        2. 5.4.1.2. Select Data
        3. 5.4.1.3. Select Columns and Options
        4. 5.4.1.4. Split Data
        5. 5.4.1.5. Finishing the Task
      2. 5.4.2. Advanced Modeling in the Data Mining Client
    5. 5.5. Accuracy and Validation
    6. 5.6. Model Usage
      1. 5.6.1. Browsing Models
      2. 5.6.2. Viewing Models with Visio
      3. 5.6.3. Querying Models
      4. 5.6.4. Query Wizard
    7. 5.7. Data Mining Cell Functions
      1. 5.7.1. DMPREDICT
      2. 5.7.2. DMPREDICTTABLEROW
      3. 5.7.3. DMCONTENTQUERY
    8. 5.8. Model Management
    9. 5.9. Trace
    10. 5.10. Summary
  12. 6. Microsoft Naïve Bayes
    1. 6.1. Introducing the Naïve Bayes Algorithm
    2. 6.2. Using the Naïve Bayes Algorithm
      1. 6.2.1. Creating a Predictive Model
      2. 6.2.2. Data Exploration
      3. 6.2.3. Analysis of Key Influencers
      4. 6.2.4. Document Classification
      5. 6.2.5. DMX
        1. 6.2.5.1. Drill-through
      6. 6.2.6. Understanding Naïve Bayes Content
      7. 6.2.7. Exploring a Naïve Bayes Model
        1. 6.2.7.1. Dependency Network
        2. 6.2.7.2. Attribute Profiles
        3. 6.2.7.3. Attribute Characteristics
        4. 6.2.7.4. Attribute Discrimination
    3. 6.3. Understanding Naïve Bayes Principles
      1. 6.3.1. Limitations of the Naïve Bayes Algorithm
    4. 6.4. Naïve Bayes Parameters
      1. 6.4.1. MAXIMUM_INPUT_ATTRIBUTES
      2. 6.4.2. MAXIMUM_OUTPUT_ATTRIBUTES
      3. 6.4.3. MAXIMUM_STATES
      4. 6.4.4. MINIMUM_DEPENDENCY_PROBABILITY
    5. 6.5. Summary
  13. 7. Microsoft Decision Trees Algorithm
    1. 7.1. Introducing Decision Trees
    2. 7.2. Using Decision Trees
      1. 7.2.1. Creating a Decision Tree Model
      2. 7.2.2. DMX Queries
        1. 7.2.2.1. Classification Model
        2. 7.2.2.2. Regression Model
        3. 7.2.2.3. Association
      3. 7.2.3. Model Content
      4. 7.2.4. Interpreting the Model
    3. 7.3. Decision Tree Principles
      1. 7.3.1. Basic Concepts of Tree Growth
      2. 7.3.2. Working with Many States in an Attribute
      3. 7.3.3. Avoiding Overtraining
      4. 7.3.4. Incorporating Prior Knowledge
      5. 7.3.5. Feature Selection
      6. 7.3.6. Using Continuous Inputs
      7. 7.3.7. Regression
      8. 7.3.8. Association Analysis with Microsoft Decision Trees
    4. 7.4. Parameters
      1. 7.4.1. COMPLEXITY_PENALTY
      2. 7.4.2. MINIMUM_SUPPORT
      3. 7.4.3. SCORE_METHOD
      4. 7.4.4. SPLIT_METHOD
      5. 7.4.5. MAXIMUM_INPUT_ATTRIBUTES
      6. 7.4.6. MAXIMUM_OUTPUT_ATTRIBUTES
      7. 7.4.7. FORCE_REGRESSOR
    5. 7.5. Stored Procedures
    6. 7.6. Summary
  14. 8. Microsoft Time Series Algorithm
    1. 8.1. Overview
    2. 8.2. Usage
      1. 8.2.1. Time Series Scenarios
        1. 8.2.1.1. Performing a Simple Forecast
        2. 8.2.1.2. Predicting Interdependent Series
        3. 8.2.1.3. Understanding Your Time Series
        4. 8.2.1.4. What-If Scenarios
        5. 8.2.1.5. Predicting New Series
    3. 8.3. DMX
      1. 8.3.1. Model Creation
      2. 8.3.2. Model Processing
      3. 8.3.3. Forecasting
        1. 8.3.3.1. Returning Supplemental Statistics
        2. 8.3.3.2. Changing the Future—Executing a What-If Forecast
        3. 8.3.3.3. Forecasting with Little Data—Applying Models to New Data
      4. 8.3.4. Drill-Through
    4. 8.4. Principles of Time Series
      1. 8.4.1. Autoregression
      2. 8.4.2. Periodicity
      3. 8.4.3. Autoregression Trees
      4. 8.4.4. Prediction
    5. 8.5. Parameters
      1. 8.5.1. MISSING_VALUE_SUBSTITUTION
      2. 8.5.2. PERIODICITY_HINT
      3. 8.5.3. AUTO_DETECT_PERIODICITY
      4. 8.5.4. MINIMUM and MAXIMUM_SERIES_VALUE
      5. 8.5.5. FORECAST_METHOD
      6. 8.5.6. PREDICTION_SMOOTHING
      7. 8.5.7. INSTABILITY_SENSITIVITY
      8. 8.5.8. HISTORIC_MODEL_COUNT and HISTORIC_MODEL_GAP
      9. 8.5.9. COMPLEXITY_PENALTY and MINIMUM_SUPPORT
    6. 8.6. Model Content
    7. 8.7. Summary
  15. 9. Microsoft Clustering
    1. 9.1. Overview
    2. 9.2. Usage of Clustering
      1. 9.2.1. Performing a Clustering
      2. 9.2.2. Clustering as an Analytical Step
      3. 9.2.3. Anomaly Detection Using Clustering
      4. 9.2.4. DMX
        1. 9.2.4.1. Model Creation
        2. 9.2.4.2. Drill-through
        3. 9.2.4.3. Cluster
        4. 9.2.4.4. ClusterProbability
        5. 9.2.4.5. PredictHistogram
        6. 9.2.4.6. PredictCaseLikelihood
      5. 9.2.5. Model Content
      6. 9.2.6. Understanding Your Cluster Models
        1. 9.2.6.1. Get a High-Level Overview
        2. 9.2.6.2. Pick a Cluster and Determine how It Is Different from the General Population
        3. 9.2.6.3. Determine how the Cluster Is Different from Nearby Clusters
        4. 9.2.6.4. Verify that Your Assertions Are True
        5. 9.2.6.5. Label the Cluster
    3. 9.3. Principles of Clustering
      1. 9.3.1. Hard Clustering versus Soft Clustering
      2. 9.3.2. Discrete Clustering
      3. 9.3.3. Scalable Clustering
      4. 9.3.4. Clustering Prediction
    4. 9.4. Parameters
      1. 9.4.1. CLUSTERING_METHOD
      2. 9.4.2. CLUSTER_COUNT
      3. 9.4.3. MINIMUM_CLUSTER_CASES
      4. 9.4.4. MODELLING_CARDINALITY
      5. 9.4.5. STOPPING_TOLERANCE
      6. 9.4.6. SAMPLE_SIZE
      7. 9.4.7. CLUSTER_SEED
      8. 9.4.8. MAXIMUM_INPUT_ATTRIBUTES
      9. 9.4.9. MAXIMUM_STATES
    5. 9.5. Summary
  16. 10. Microsoft Sequence Clustering
    1. 10.1. Introducing the Microsoft Sequence Clustering Algorithm
    2. 10.2. Using the Microsoft Sequence Clustering Algorithm
      1. 10.2.1. Creating a Sequence Clustering Model
      2. 10.2.2. DMX Queries
        1. 10.2.2.1. Executing Cluster Predictions
        2. 10.2.2.2. Executing Sequence Predictions
        3. 10.2.2.3. Extracting the Probability for the Sequence Predictions
        4. 10.2.2.4. Using the Histogram of the Sequence Predictions
        5. 10.2.2.5. Detecting Unusual Sequence Patterns
      3. 10.2.3. Interpreting the Model
        1. 10.2.3.1. Cluster Diagram
        2. 10.2.3.2. Cluster Profiles
        3. 10.2.3.3. Cluster Characteristics
        4. 10.2.3.4. Cluster Discrimination
        5. 10.2.3.5. State Transitions
    3. 10.3. Microsoft Sequence Clustering Algorithm Principles
      1. 10.3.1. Understanding a Markov Chain
      2. 10.3.2. Order of a Markov Chain
      3. 10.3.3. State Transition Matrix
      4. 10.3.4. Clustering with a Markov Chain
      5. 10.3.5. Cluster Decomposition
    4. 10.4. Model Content
    5. 10.5. Algorithm Parameters
      1. 10.5.1. CLUSTER_COUNT
      2. 10.5.2. MINIMUM_SUPPORT
      3. 10.5.3. MAXIMUM_STATES
      4. 10.5.4. MAXIMUM_SEQUENCE_STATES
    6. 10.6. Summary
  17. 11. Microsoft Association Rules
    1. 11.1. Introducing Microsoft Association Rules
    2. 11.2. Using the Association Rules Algorithm
      1. 11.2.1. Data Exploration Models
      2. 11.2.2. A Simple Recommendation Engine
      3. 11.2.3. Advanced Cross-Sales Analysis
      4. 11.2.4. DMX
      5. 11.2.5. Model Content
      6. 11.2.6. Interpreting the Model
    3. 11.3. Association Algorithm Principles
    4. 11.4. Understanding Basic Association Algorithm Terms and Concepts
      1. 11.4.1.
        1. 11.4.1.1. Itemset
        2. 11.4.1.2. Support
        3. 11.4.1.3. Probability (Confidence)
        4. 11.4.1.4. Importance
      2. 11.4.2. Finding Frequent Itemsets
      3. 11.4.3. Generating Association Rules
      4. 11.4.4. Prediction
    5. 11.5. Algorithm Parameters
      1. 11.5.1. MINIMUM_SUPPORT
      2. 11.5.2. MAXIMUM_SUPPORT
      3. 11.5.3. MINIMUM_PROBABILITY
      4. 11.5.4. MINIMUM_IMPORTANCE
      5. 11.5.5. MAXIMUM_ITEMSET_SIZE
      6. 11.5.6. MINIMUM_ITEMSET_SIZE
      7. 11.5.7. MAXIMUM_ITEMSET_COUNT
      8. 11.5.8. OPTIMIZED_PREDICTION_COUNT
      9. 11.5.9. AUTODETECT_MINIMUM_SUPPORT
    6. 11.6. Summary
  18. 12. Microsoft Neural Network and Logistic Regression
    1. 12.1. Same Principle, Two Algorithms
    2. 12.2. Using the Microsoft Neural Network
      1. 12.2.1. Text Classification Models
      2. 12.2.2. Utility Models
      3. 12.2.3. DMX Queries
    3. 12.3. Model Content
    4. 12.4. Interpreting the Model
    5. 12.5. Principles of the Microsoft Neural Network Algorithm
      1. 12.5.1. What Is a Neural Network?
      2. 12.5.2. Combination and Activation
      3. 12.5.3. Backpropagation, Error Function, and Conjugate Gradient
      4. 12.5.4. A Simple Example of Processing a Neural Network
      5. 12.5.5. Normalization and Mapping
      6. 12.5.6. Topology of the Network
      7. 12.5.7. Training: The Ending Condition
    6. 12.6. Nonlinearly Separable Classes
    7. 12.7. Algorithm Parameters
      1. 12.7.1. MAXIMUM_INPUT_ATTRIBUTES
      2. 12.7.2. MAXIMUM_OUTPUT_ATTRIBUTES
      3. 12.7.3. MAXIMUM_STATES
      4. 12.7.4. HOLDOUT_PERCENTAGE
      5. 12.7.5. HOLDOUT_SEED
      6. 12.7.6. HIDDEN_NODE_RATIO
      7. 12.7.7. SAMPLE_SIZE
    8. 12.8. Summary
  19. 13. Mining OLAP Cubes
    1. 13.1. Introducing OLAP
      1. 13.1.1. Understanding Star and Snowflake Schemas
      2. 13.1.2. Understanding Dimension and Hierarchy
      3. 13.1.3. Understanding Measures and Measure Groups
      4. 13.1.4. Understanding Cube Processing and Storage
      5. 13.1.5. Using Proactive Caching
      6. 13.1.6. Querying a Cube
    2. 13.2. Performing Calculations
    3. 13.3. Browsing a Cube
    4. 13.4. Understanding Unified Dimension Modeling
    5. 13.5. Understanding the Relationship between OLAP and Data Mining
      1. 13.5.1. Mining Aggregated Data
      2. 13.5.2. OLAP Pattern Discovery Needs
      3. 13.5.3. OLAP Mining versus Relational Mining
    6. 13.6. Building OLAP Mining Models Using Wizards and Editors
      1. 13.6.1. Using the Data Mining Wizard
        1. 13.6.1.1. Building the Customer Segmentation Model
        2. 13.6.1.2. Creating a Market Basket Model
        3. 13.6.1.3. Creating a Sales Forecast Model
      2. 13.6.2. Using the Data Mining Designer
    7. 13.7. Understanding Data Mining Dimensions
    8. 13.8. Using MDX within DMX Queries
    9. 13.9. Using Analysis Management Objects for the OLAP Mining Model
    10. 13.10. Summary
  20. 14. Data Mining with SQL Server Integration Services
    1. 14.1. An Overview of SSIS
      1. 14.1.1. Understanding SSIS Packages
      2. 14.1.2. Task Flow
        1. 14.1.2.1. Standard Tasks in SSIS
        2. 14.1.2.2. Containers
        3. 14.1.2.3. Debugging
        4. 14.1.2.4. Exploring a Control Flow Example
      3. 14.1.3. Data Flow
        1. 14.1.3.1. Transformations
        2. 14.1.3.2. Viewers
        3. 14.1.3.3. Exploring a Data Flow Example
    2. 14.2. Working with SSIS in Data Mining
      1. 14.2.1. Data Mining Tasks
        1. 14.2.1.1. Data Mining Query Task
        2. 14.2.1.2. Analysis Services Processing Task
        3. 14.2.1.3. Analysis Services Execute DDL Task
      2. 14.2.2. Data Mining Transformations
        1. 14.2.2.1. Data Mining Model Training Destination
        2. 14.2.2.2. Data Mining Query Transformation
        3. 14.2.2.3. Example Data Flows
        4. 14.2.2.4. Using Non-Predictive Data Mining Queries in an Integration Services Pipeline
      3. 14.2.3. Text Mining Transformations
        1. 14.2.3.1. Term Extraction Transformation
        2. 14.2.3.2. Term Lookup Transformation
        3. 14.2.3.3. More Details on the Text Mining Process
    3. 14.3. Summary
  21. 15. SQL Server Data Mining Architecture
    1. 15.1. Introducing Analysis Services Architecture
    2. 15.2. XML for Analysis
      1. 15.2.1. XMLA APIs
        1. 15.2.1.1. Discover
        2. 15.2.1.2. Execute
      2. 15.2.2. XMLA and Analysis Services
    3. 15.3. Processing Architecture
    4. 15.4. Predictions
    5. 15.5. Data Mining Administration
      1. 15.5.1. Server Configuration
      2. 15.5.2. Data Mining Security
      3. 15.5.3. Security Requirements for Creating and Training Mining Objects
      4. 15.5.4. Security for Various Deployment Scenarios
        1. 15.5.4.1. Local Database and Analysis Services
        2. 15.5.4.2. Local Analysis Services and a Remote Database
        3. 15.5.4.3. Intranet Analysis Services and Databases on the Same Server
        4. 15.5.4.4. Analysis Services and Databases behind an HTTP Endpoint in an Internet Deployment
        5. 15.5.4.5. Configuring Analysis Services for Use with Data Mining Excel Add-Ins over HTTP
    6. 15.6. Summary
  22. 16. Programming SQL Server Data Mining
    1. 16.1. Data Mining APIs
      1. 16.1.1. ADO
      2. 16.1.2. ADO.NET
      3. 16.1.3. ADOMD.NET
      4. 16.1.4. Server ADOMD.NET
      5. 16.1.5. AMO
    2. 16.2. Using Analysis Services APIs
    3. 16.3. Using Microsoft.AnalysisServices to Create and Manage Mining Models
      1. 16.3.1. AMO Basics
      2. 16.3.2. AMO Applications and Security
      3. 16.3.3. Object Creation
        1. 16.3.3.1. Creating Data Access Objects
        2. 16.3.3.2. Creating the Mining Structure
        3. 16.3.3.3. Creating the Mining Models
        4. 16.3.3.4. Processing Mining Models
        5. 16.3.3.5. Deploying Mining Models
        6. 16.3.3.6. Setting Mining Permissions
    4. 16.4. Browsing and Querying Mining Models
      1. 16.4.1. Predicting with ADOMD.NET
      2. 16.4.2. More on Table-Valued Parameters in ADOMD.NET
      3. 16.4.3. Browsing Models
    5. 16.5. Stored Procedures
      1. 16.5.1. Writing Stored Procedures
        1. 16.5.1.1. Stored Procedures and Prepare Invocations
      2. 16.5.2. A Stored Procedure Example
      3. 16.5.3. Executing Queries inside Stored Procedures
      4. 16.5.4. Returning Data Sets from Stored Procedures
      5. 16.5.5. Deploying and Debugging Stored Procedure Assemblies
    6. 16.6. Summary
  23. 17. Extending SQL Server Data Mining
    1. 17.1. Plug-in Algorithms
      1. 17.1.1. Plug-in Algorithm Framework
      2. 17.1.2. Lifetime of a Plug-in Algorithm Instance
      3. 17.1.3. Conceptual Overview
      4. 17.1.4. Model Creation and Processing
      5. 17.1.5. Prediction
      6. 17.1.6. Content Navigation
      7. 17.1.7. Custom Functions
      8. 17.1.8. PMML
      9. 17.1.9. Managed vs. Native Plug-ins
      10. 17.1.10. Installing Plug-in Algorithms
      11. 17.1.11. Where to Find Out More about Plug-in Algorithms
    2. 17.2. Data Mining Viewers
      1. 17.2.1. Interfaces to Be Implemented
      2. 17.2.2. Rendering the Information
      3. 17.2.3. Retrieving Information from Analysis Services
      4. 17.2.4. Registering the Viewer
      5. 17.2.5. Where to Find Out More about Plug-in Viewers
    3. 17.3. Summary
  24. 18. Implementing a Web Cross-Selling Application
    1. 18.1. Source Data Description
    2. 18.2. Building Your Model
      1. 18.2.1. Identifying the Data Mining Task
      2. 18.2.2. Using Decision Trees for Association
      3. 18.2.3. Using the Association Rules Algorithm
      4. 18.2.4. Comparing the Two Models
    3. 18.3. Making Predictions
      1. 18.3.1. Making Batch Prediction Queries
      2. 18.3.2. Using Singleton Prediction Queries
    4. 18.4. Integrating Predictions with Web Applications
      1. 18.4.1. Understanding Web Application Architecture
      2. 18.4.2. Setting the Permissions
      3. 18.4.3. Examining Sample Code for the Web Recommendation Application
    5. 18.5. Summary
  25. 19. Conclusion and Additional Resources
    1. 19.1. Recapping the Highlights of SQL Server 2008 Data Mining
      1. 19.1.1. State-of-the-Art Algorithms
      2. 19.1.2. Easy-to-Use Tools
      3. 19.1.3. Simple-Yet-Powerful API
      4. 19.1.4. Integration with Sibling BI Technologies
    2. 19.2. Exploring New Data Mining Frontiers and Opportunities
    3. 19.3. Further Reference
      1. 19.3.1. Microsoft Data Mining
      2. 19.3.2. General Data Mining
  26. A. Data Sets
    1. A.1. MovieClick Data Set
    2. A.2. Voting Records Data Set
    3. A.3. Wine Sales
    4. A.4. Foodmart
    5. A.5. College Plans Data Set
  27. B. Supported Functions
    1. B.1. DMX Language Functions
    2. B.2. VBA Functions
    3. B.3. Excel Functions
    4. B.4. ASSprocs Stored Procedures

Product information

  • Title: Data Mining with Microsoft® SQL Server® 2008
  • Author(s):
  • Release date: November 2008
  • Publisher(s): Wiley
  • ISBN: 9780470277744