SQL for Data Scientists

by Renee M. P. Teate
September 2021
Beginner
288 pages
6h 54m
English
Wiley

CHAPTER 1 Data Sources

As a data analyst or data scientist, you will encounter data from many sources, from databases to spreadsheets to Application Programming Interfaces (APIs), which you are expected to use for predictive modeling. Understanding the source system your data comes from, how it was initially gathered and stored, and how frequently it is updated will take you a long way toward an effective analysis. In my experience, issues with a predictive model can often be traced all the way back to the source data or the query that first pulls the data from the source. Exploring the data available for your analysis starts with exploring the structure of the source database.

Data Sources

Data can be stored in many forms and structures. Examples of unstructured data include text documents or images stored as individual files in a computer's file system. In this book, we'll be focusing on structured data, which is typically organized into a tabular format, like a spreadsheet or database table containing limited-length text or numeric values.
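
To make the idea of structured, tabular data concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and values are hypothetical and chosen only for illustration; they do not come from the book. Note also that SQLite records a declared length such as VARCHAR(50) but does not enforce it, whereas many other database systems do.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table: each column has a name and a declared type.
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        first_name  VARCHAR(50),
        zip_code    VARCHAR(10),
        total_spent NUMERIC
    )
    """
)

# Made-up example rows; every row supplies a value for the same columns.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, "Ana", "22801", 42.50), (2, "Ben", "22802", 17.25)],
)

# Inspect the table's structure: PRAGMA table_info yields one row per column,
# with the column name at index 1 and the declared type at index 2.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

# Read the data back as a list of tuples, one tuple per table row.
rows = conn.execute("SELECT * FROM customer ORDER BY customer_id").fetchall()

print(columns)
print(rows)
```

Every row shares the same fixed set of typed columns, which is what distinguishes this kind of data from a folder of free-form text documents or images.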

Many software applications enable the organization of data into structured forms. One example you are likely familiar with is Microsoft Excel, for creating and maintaining spreadsheets. Excel also includes some analysis capabilities, such as pivot tables for summarizing spreadsheets and data visualization tools for plotting data points from a spreadsheet. Some functions in Excel allow you to connect data in one spreadsheet to another, ...


Publisher Resources

ISBN: 9781119669364