CHAPTER 14Storing and Modifying Data

We have covered many aspects of developing datasets for machine learning that involve selecting data from a database and preparing it for machine learning models, but what do you do once you have designed your query and are ready to start analyzing the results? Your SQL editor will often allow you to write the results of your query to a CSV file to be imported into Business Intelligence (BI) software such as Tableau or machine learning scripts in a language like Python. However, sometimes for data governance, data security, teamwork, or file size and processing speed purposes, it is preferable to store the dataset within the database.

In this chapter, we'll cover some types of SQL queries beyond SELECT statements, such as INSERT statements, which allow you to store the results of your query in a new table in the database.

Storing SQL Datasets as Tables and Views

In most databases, you can store the results of a query as either a table or a view. Storing results as a table takes a snapshot of whatever the results are at the time the query is run and saves the data returned as a new table object, or as new rows appended to an existing table, depending on how you write your SQL statement. A database view instead stores the SQL itself and runs it on-demand when you write a query that references the name of the view, to dynamically generate a new dataset based on the state of the referenced database objects at the time you run the query. (You ...

Get SQL for Data Scientists now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.