Data Science at the Command Line

by Jeroen Janssens
October 2014
Beginner to intermediate
210 pages
4h 32m
English
O'Reilly Media, Inc.

Chapter 2. Getting Started

In this chapter, we are going to make sure that you have all the prerequisites for doing data science at the command line. The prerequisites fall into two parts: (1) having a proper environment with all the command-line tools that we employ in this book, and (2) understanding the essential concepts that come into play when using the command line.

First, we describe how to install the Data Science Toolbox, which is a virtual environment based on GNU/Linux that contains all the necessary command-line tools. Subsequently, we explain the essential command-line concepts through examples.

By the end of this chapter, you’ll have everything you need to continue with the first step of doing data science: obtaining data.

Overview

In this chapter, you’ll learn:

  • How to set up the Data Science Toolbox

  • Essential concepts and tools necessary to do data science at the command line

Setting Up Your Data Science Toolbox

In this book we use many different command-line tools. The distribution of GNU/Linux that we are using, Ubuntu, comes with a whole bunch of command-line tools pre-installed. Moreover, Ubuntu offers many packages that contain other, relevant command-line tools. Installing these packages yourself is not too difficult. However, we also use command-line tools that are not available as packages and require a more manual, and more involved, installation. In order to acquire the necessary command-line tools without having to go through ...

ISBN: 9781491947845