O'Reilly logo

Data Science from Scratch by Joel Grus

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 23. Databases and SQL

Memory is man’s greatest friend and worst enemy.

Gilbert Parker

The data you need will often live in databases, systems designed for efficiently storing and querying data. The bulk of these are relational databases, such as Oracle, MySQL, and SQL Server, which store data in tables and are typically queried using Structured Query Language (SQL), a declarative language for manipulating data.

SQL is a pretty essential part of the data scientist’s toolkit. In this chapter, we’ll create NotQuiteABase, a Python implementation of something that’s not quite a database. We’ll also cover the basics of SQL while showing how they work in our not-quite database, which is the most “from scratch” way I could think of to help you understand what they’re doing. My hope is that solving problems in NotQuiteABase will give you a good sense of how you might solve the same problems using SQL.

CREATE TABLE and INSERT

A relational database is a collection of tables (and of relationships among them). A table is simply a collection of rows, not unlike the matrices we’ve been working with. However, a table also has associated with it a fixed schema consisting of column names and column types.

For example, imagine a users data set containing for each user her user_id, name, and num_friends:

users = [[0, "Hero", 0],
         [1, "Dunn", 2],
         [2, "Sue", 3],
         [3, "Chi", 3]]

In SQL, we might create this table with:

CREATE TABLE users (
    user_id INT NOT NULL,
    name VARCHAR(200),
    num_friends ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required