O'Reilly logo

Data Science from Scratch, 2nd Edition by Joel Grus

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 24. Databases and SQL

Memory is man’s greatest friend and worst enemy.

Gilbert Parker

The data you need will often live in databases, systems designed for efficiently storing and querying data. The bulk of these are relational databases, such as PostgreSQL, MySQL, and SQL Server, which store data in tables and are typically queried using Structured Query Language (SQL), a declarative language for manipulating data.

SQL is a pretty essential part of the data scientist’s toolkit. In this chapter, we’ll create NotQuiteABase, a Python implementation of something that’s not quite a database. We’ll also cover the basics of SQL while showing how they work in our not-quite database, which is the most “from scratch” way I could think of to help you understand what they’re doing. My hope is that solving problems in NotQuiteABase will give you a good sense of how you might solve the same problems using SQL.

CREATE TABLE and INSERT

A relational database is a collection of tables, and of relationships among them. A table is simply a collection of rows, not unlike some of the matrices we’ve been working with. However, a table also has associated with it a fixed schema consisting of column names and column types.

For example, imagine a users dataset containing for each user her user_id, name, and num_friends:

users = [[0, "Hero", 0],
         [1, "Dunn", 2],
         [2, "Sue", 3],
         [3, "Chi", 3]]

In SQL, we might create this table with:

CREATE TABLE users (
    user_id INT NOT NULL,
    name VARCHAR(200), ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required