Data Science with SQL Server Quick Start Guide

Get unique insights from your data by combining the power of SQL Server, R and Python

Key Features

  • Use the features of SQL Server 2017 to implement the data science project life cycle
  • Leverage the power of R and Python to design and develop efficient data models
  • Find unique insights from your data with powerful techniques for data preprocessing and analysis

Book Description

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you.

This book is the ideal introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, model evaluation, and deployment.

You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages and Transact-SQL. You will also learn how to choose the right algorithm for each task and how each algorithm works.

What you will learn

  • Use the popular programming languages, T-SQL, R, and Python, for data science
  • Understand your data with queries and introductory statistics
  • Create and enhance the datasets for ML
  • Visualize and analyze data using basic and advanced graphs
  • Explore ML using unsupervised and supervised models
  • Deploy models in SQL Server and perform predictions

Who this book is for

SQL Server professionals who want to start with data science, and data scientists who would like to start using SQL Server in their projects will find this book to be useful. Prior exposure to SQL Server will be helpful.

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

  1. Title Page
  2. Copyright and Credits
    1. Data Science with SQL Server Quick Start Guide
  3. Packt Upsell
    1. Why subscribe?
    2. PacktPub.com
  4. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. Writing Queries with T-SQL
    1. Before starting – installing SQL Server
      1. SQL Server setup 
    2. Core T-SQL SELECT statement elements
      1. The simplest form of the SELECT statement
      2. Joining multiple tables
      3. Grouping and aggregating data
    3. Advanced SELECT techniques
      1. Introducing subqueries
      2. Window functions
      3. Common table expressions
      4. Finding top n rows and using the APPLY operator
    4. Summary
  7. Introducing R
    1. Obtaining R
    2. Your first line of code in R
    3. Learning the basics of the R language
    4. Using R data structures
    5. Summary
  8. Getting Familiar with Python
    1. Selecting the Python environment
    2. Writing your first Python code
    3. Using functions, branches, and loops
    4. Organizing the data
    5. Integrating SQL Server and ML
    6. Summary
  9. Data Overview
    1. Getting familiar with a data science project life cycle
    2. Ways to measure data values
    3. Introducing descriptive statistics for continuous variables
      1. Calculating centers of a distribution
      2. Measuring the spread
      3. Higher population moments
    4. Using frequency tables to understand discrete variables
    5. Showing associations graphically
    6. Summary
  10. Data Preparation
    1. Handling missing values
    2. Creating dummies
    3. Discretizing continuous variables
      1. Equal width discretization
      2. Equal height discretization
      3. Custom discretization
    4. The entropy of a discrete variable
    5. Advanced data preparation topics
      1. Efficient grouping and aggregating in T-SQL
      2. Leveraging Microsoft scalable libraries in Python
      3. Using the dplyr package in R
    6. Summary
  11. Intermediate Statistics and Graphs
    1. Exploring associations between continuous variables
    2. Measuring dependencies between discrete variables
    3. Discovering associations between continuous and discrete variables
    4. Expressing dependencies with a linear regression formula
    5. Summary
  12. Unsupervised Machine Learning
    1. Installing ML Services (In-Database) packages
    2. Performing market-basket analysis
    3. Finding clusters of similar cases
    4. Principal components and factor analyses
    5. Summary
  13. Supervised Machine Learning
    1. Evaluating predictive models
    2. Using the Naive Bayes algorithm
    3. Predicting with logistic regression
    4. Trees, forests, and more trees
    5. Predicting with T-SQL
    6. Summary
  14. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think