Chapter 16

Using SQL in Data Science


Getting a grip on relational databases and SQL

Designing great relational databases

Doing data science tasks with SQL functions

SQL, or Structured Query Language, is a standard language for creating, maintaining, and securing relational databases. It gives you a set of commands for quickly and efficiently querying, updating, adding, or removing data in large and complex databases. For these tasks, SQL is often simpler and faster than Python or Excel, because it offers a plain, standardized set of core commands built for exactly this kind of work. In this chapter, I introduce you to basic SQL concepts and explain how you can use SQL to do cool things, like query, join, group, sort, and even text-mine structured datasets.
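To make the query, join, group, and sort operations mentioned above a little more concrete, here is a minimal sketch. It assumes a throwaway in-memory SQLite database, with Python's built-in sqlite3 module serving only as a convenient way to run the SQL. The customers and orders tables, their columns, and the sample rows are all hypothetical and exist only for illustration.

```python
import sqlite3

# Throwaway in-memory database; all table names, columns, and rows
# below are hypothetical and used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES
        (10, 1, 25.0), (11, 1, 75.0), (12, 2, 40.0), (13, 3, 10.0);
""")

# One statement that queries (SELECT ... WHERE), joins (JOIN ... ON),
# groups (GROUP BY), and sorts (ORDER BY).
query = """
    SELECT c.name,
           COUNT(*)      AS order_count,
           SUM(o.amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.amount >= 20
    GROUP BY c.name
    ORDER BY total_spent DESC;
"""

rows = conn.execute(query).fetchall()
for name, order_count, total_spent in rows:
    print(name, order_count, total_spent)

conn.close()
```

The SQL statement itself is the point here. The same SELECT should run with little or no change on most relational database management systems, although details such as data types and setup commands vary from one system to another.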

Although you can use SQL to work with structured data that resides in relational database management systems, you can’t use standard SQL as a solution for handling big data, because you can’t handle big data using relational database technologies. I give you more solutions for handling big data in Chapter 2, where I discuss data engineering and ...
