Chapter 1. Development Setup
This chapter covers the downloads and software installations needed to use this book, and sketches out a recommended development environment. As you'll see, this isn't as onerous as it might once have been. I'll cover Python and JavaScript dependencies separately and give a brief overview of cross-language IDEs.
The Accompanying Code
There's a GitHub repository for the bulk of the code covered in this book, including the full Nobel Prize visualization. To get hold of it, just perform a git clone to a suitable local directory:
$ git clone https://github.com/Kyrand/dataviz-with-python-and-js-ed-2.git
This should create a local dataviz-with-python-and-js-ed-2 directory with the key source code covered by the book.
Python
The bulk of the libraries covered in the book are Python-based, but what might have been a challenging attempt to provide comprehensive installation instructions for the various operating systems and their quirks is made much easier by the existence of Anaconda, a Python platform that bundles together most of the popular analytics libraries in a convenient package. The book assumes you are using Python 3, which was released in 2008 and is now firmly established.
Anaconda
Installing some of the bigger Python libraries used to be a challenge all in itself, particularly those such as NumPy that depend on complex low-level C and Fortran packages. That's a great deal easier now, and most will happily install with Python's pip command:
$ pip install NumPy
But some big number-crunching libraries are still tricky to install. Dependency management and versioning (you might need to use different versions of Python on the same machine) can make things trickier still, and this is where Anaconda comes into its own. It does all the dependency checking and binary installs so you don't have to. It's also a very convenient resource for a book like this.
To get your free Anaconda install, just navigate your browser to the Anaconda site, choose the version for your operating system (ideally at least Python 3.5), and follow the instructions. Windows and OS X get a graphical installer (just download and double-click), whereas Linux requires you to run a little bash script:
$ bash Anaconda3-2021.11-Linux-x86_64.sh
The Anaconda site also has the latest installation instructions for each operating system.
I recommend sticking to defaults when installing Anaconda.
The official guide to checking your installation can be found at the Anaconda site. Windows and macOS users can use Anaconda's Navigator GUI or, along with Linux users, the Conda command-line interface.
Installing Extra Libraries
Anaconda contains almost all the Python libraries covered in this book (see the Anaconda documentation for the full list of Anaconda library packages). Where we need a non-Anaconda library, we can use pip (short for Pip Installs Packages), the de facto standard for installing Python libraries. Using pip to install is as easy as can be. Just call pip install followed by the name of the package from the command line and it should be installed or, with any luck, give a sensible error:
$ pip install dataset
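As a quick sanity check after a pip install, a short script along the following lines reports which packages Python can actually see. This is a minimal sketch rather than part of the book's code: the helper names installed_version and report are my own, and it assumes Python 3.8 or later, where importlib.metadata is part of the standard library.

```python
import sys
from importlib import metadata  # in the standard library since Python 3.8


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages):
    """Map each package name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in packages}


print("Python", sys.version.split()[0])
for name, version in report(["numpy", "dataset"]).items():
    print(name, version or "not installed")
```

A missing package is reported as "not installed" rather than raising an error, so the script is safe to run both before and after installing.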
Virtual Environments
Virtual environments provide a way of creating a sandboxed development environment with a particular Python version and/or set of third-party libraries. Using these virtual environments avoids polluting your global Python with these installs and gives you a lot more flexibility (you can play with different package versions or change your Python version if need be). The use of virtual environments is becoming a best practice in Python development, and I strongly suggest that you follow it.
Anaconda comes with a conda system command that makes creating and using virtual environments easy. Let's create a special one for this book, based on the full Anaconda package:
$ conda create --name pyjsviz anaconda
...
#
# To activate this environment, use:
# $ source activate pyjsviz
#
# To deactivate this environment, use:
# $ source deactivate
#
As the final message says, to use this virtual environment you need only source activate it (for Windows machines, you can leave out the source):
$ source activate pyjsviz
discarding /home/kyran/anaconda/bin from PATH
prepending /home/kyran/.conda/envs/pyjsviz/bin to PATH
(pyjsviz) $
Note that you get a helpful cue at the command line to let you know which virtual environment you're using.
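If you want the same cue from inside a script, the sketch below reports which environment the running interpreter belongs to. It relies on the CONDA_DEFAULT_ENV environment variable, which conda sets on activation, and on the fact that sys.prefix and sys.base_prefix differ inside a standard-library virtual environment. The function name active_environment and its (kind, name) return value are my own choices for illustration.

```python
import os
import sys


def active_environment(environ=None, prefix=None, base_prefix=None):
    """Describe the active environment as a (kind, name) tuple."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the underlying install.
    if prefix != base_prefix:
        return ("venv", os.path.basename(prefix))
    # conda exports CONDA_DEFAULT_ENV when an environment is activated.
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return ("conda", conda_env)
    return ("system", None)


print(active_environment())
```

The optional arguments exist only so the function can be exercised without actually switching environments.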
The conda command can do a lot more than just facilitate virtual environments, combining the functionality of Python's pip installer and virtualenv command, among other things. You can get a full rundown in the Anaconda documentation.
If you're confident with standard Python virtual environments, these have been made a lot easier to work with by their incorporation in Python's Standard Library. To create a virtual environment from the command line:
$ python -m venv python-js-viz
This creates a python-js-viz directory containing the various elements of the virtual environment. This includes some activation scripts. To activate the virtual environment with macOS or Linux, run the activate script:
$ source python-js-viz/bin/activate
On Windows machines, run the .bat file:
$ python-js-viz\Scripts\activate.bat
You can then use pip to install Python libraries to the virtual environment, avoiding polluting your global Python distribution:
(python-js-viz) $ pip install NumPy
To install all the libraries required by this book, you can use the requirements.txt file in the book's GitHub repo:
(python-js-viz) $ pip install -r requirements.txt
You can find more information on virtual environments in the Python documentation.
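The venv module behind python -m venv can also be driven from Python, which can be handy in setup scripts. The following is a sketch under a few assumptions: create_env and activation_script are hypothetical helper names, and the demo passes with_pip=False only to keep it fast (python -m venv installs pip by default).

```python
import tempfile
import venv
from pathlib import Path


def create_env(path, with_pip=True):
    """Create a virtual environment at path, much as `python -m venv path` does."""
    venv.create(path, with_pip=with_pip)
    return Path(path)


def activation_script(env_dir, windows=False):
    """Return where the activation script lives for the given platform."""
    env_dir = Path(env_dir)
    if windows:
        return env_dir / "Scripts" / "activate.bat"
    return env_dir / "bin" / "activate"


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    env = create_env(Path(tmp) / "python-js-viz", with_pip=False)
    print("created:", (env / "pyvenv.cfg").is_file())
    print("activate with:", activation_script(env))
```

Activation itself still happens in the shell, since it works by changing the shell's PATH.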
JavaScript
The good news is that you don't need much JavaScript software at all. The only must-have is the Chrome/Chromium web browser, which is used in this book. It offers the most powerful set of developer tools of any current browser and is cross-platform.
To download Chrome, just go to the home page and download the version for your operating system. This should be automatically detected.
All the JavaScript libraries used in this book can be found in the accompanying GitHub repo, but there are generally two ways to deliver them to the browser. You can use a content delivery network (CDN), which efficiently caches a copy of the library retrieved from the delivery network. Alternatively, you can use a local copy of the library served to the browser. Both of these methods use the script tag in an HTML document.
Content Delivery Networks
With CDNs, rather than having the libraries installed on your local machine, the JavaScript is retrieved by the browser over the web, from the closest available server. This should make things very fast, faster than if you served the content yourself.
To include a library via CDN, you use the usual <script> tag, typically placed at the bottom of your HTML page. For example, the following call adds a current version of D3:
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/7.1.1/d3.min.js" charset="utf-8"></script>
Installing Libraries Locally
If you need to install JavaScript libraries locally, because, for example, you anticipate doing some offline development work or can't guarantee an internet connection, there are a number of fairly simple ways to do so.
You can just download the separate libraries and put them in your local server's static folder. This is a typical folder structure. Third-party libraries go in the static/libs directory off root, like so:
nobel_viz/
└── static
    ├── css
    ├── data
    ├── libs
    │   └── d3.min.js
    └── js
If you organize things this way, to use D3 in your scripts now requires a local file reference with the <script> tag:
<script src="/static/libs/d3.min.js"></script>
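Setting up this layout by hand is simple enough, but it can also be scripted. The sketch below creates the static folders described above and can download a library file into static/libs. The helper names make_layout and fetch_library are my own, the D3 URL is the CDN address used earlier in the chapter, and the download step needs a network connection, so it is defined here but not called.

```python
import urllib.request
from pathlib import Path

# CDN address of D3, as used in the <script> tag earlier in the chapter.
D3_CDN_URL = "https://cdnjs.cloudflare.com/ajax/libs/d3/7.1.1/d3.min.js"
STATIC_SUBDIRS = ("css", "data", "libs", "js")


def make_layout(root):
    """Create root/static/{css,data,libs,js} and return the libs directory."""
    static = Path(root) / "static"
    for name in STATIC_SUBDIRS:
        (static / name).mkdir(parents=True, exist_ok=True)
    return static / "libs"


def fetch_library(url, libs_dir):
    """Download one library file into libs_dir and return its local path."""
    target = Path(libs_dir) / url.rsplit("/", 1)[-1]
    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())
    return target
```

Calling make_layout("nobel_viz") and then fetch_library(D3_CDN_URL, "nobel_viz/static/libs") should leave you with the d3.min.js file that the local <script> tag refers to. For a quick look in the browser, running python -m http.server from the nobel_viz directory serves the static folder.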
Databases
The recommended database for small to medium-sized dataviz projects is the brilliant, serverless, file-based, SQL-based SQLite. This database is used throughout the dataviz toolchain demonstrated in the book and is the only database you really need.
The book also covers basic Python interactions with MongoDB, the most popular nonrelational, or NoSQL, database:
- SQLite: SQLite should come as standard with macOS and Linux machines. For Windows, follow this guide.
- MongoDB: You can find installation instructions for the various operating systems in the MongoDB documentation.
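Python ships with its own sqlite3 module, so one way to confirm that SQLite is usable is to exercise it from Python. The following is a minimal sketch: the round_trip helper, the demo table, and its contents are made up for illustration, and the database is written to a temporary file to show that a SQLite database is just a file on disk.

```python
import os
import sqlite3
import tempfile

print("SQLite library version:", sqlite3.sqlite_version)


def round_trip(db_path):
    """Write one row to a SQLite file and read back everything in the table."""
    conn = sqlite3.connect(db_path)  # creates the file if it doesn't exist
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS demo (label TEXT, value INTEGER)")
        conn.execute("INSERT INTO demo VALUES (?, ?)", ("answer", 42))
        conn.commit()
        return conn.execute("SELECT label, value FROM demo").fetchall()
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    print(round_trip(path))
    print("database file exists:", os.path.exists(path))
```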
Note that we'll be using Python's SQLAlchemy SQL library either directly or through libraries that build on it. This means we can convert any SQLite examples to another SQL backend (e.g., MySQL or PostgreSQL) by changing a configuration line or two.
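To give a flavor of what changing a configuration line means in practice, here is a small sketch that assumes SQLAlchemy is installed. The database URL is the only backend-specific part. The PostgreSQL URL is a placeholder with made-up credentials and is never used, and store_and_count and the demo table are hypothetical names for illustration.

```python
from sqlalchemy import create_engine, text

# Connection URLs. The PostgreSQL one is a placeholder: the user, password,
# and database name are made up, and it needs a driver plus a running server.
SQLITE_URL = "sqlite:///:memory:"
POSTGRES_URL = "postgresql://user:password@localhost/nobel"

DB_URL = SQLITE_URL  # the configuration line to change when switching backends


def store_and_count(db_url, labels):
    """Create a small table, insert the labels, and return the row count."""
    engine = create_engine(db_url)
    with engine.begin() as conn:  # commits on success, rolls back on error
        conn.execute(text("CREATE TABLE demo (label VARCHAR(50))"))
        for label in labels:
            conn.execute(
                text("INSERT INTO demo (label) VALUES (:label)"),
                {"label": label},
            )
        return conn.execute(text("SELECT COUNT(*) FROM demo")).scalar()


print(store_and_count(DB_URL, ["physics", "chemistry"]))
```

Everything below the DB_URL line is written against SQLAlchemy rather than against SQLite itself, which is what makes the backend swappable.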
Getting MongoDB Up and Running
MongoDB can be a little trickier to install than some databases. As mentioned, you can follow this book perfectly well without going through the hassle of installing the server-based MongoDB, but if you want to try it out or find yourself needing to use it at work, here are some installation notes:
- For OS X users, check out the official docs for MongoDB installation instructions.
- This Windows-specific guide from the official docs should get your MongoDB server up and running. You will probably need to use administrator privileges to create the necessary data directories and so on.
- More often than not these days, you'll be installing MongoDB to a Linux-based server, most commonly an Ubuntu variant, which uses the deb file format to deliver its packages. The official MongoDB docs do a good job covering an Ubuntu install.
MongoDB stores its data in a data directory and, depending on how you install it, you may need to create this yourself. On OS X and Linux boxes, the default is a data directory off the root directory, which you can create using mkdir as a superuser (sudo):
$ sudo mkdir /data
$ sudo mkdir /data/db
You'll then want to set ownership to yourself:
$ sudo chown $(whoami) /data/db
On Windows, after installing the MongoDB Community Edition, you can create the necessary data directory with the following commands:
$ cd C:\
$ md "\data\db"
The MongoDB server will often be started by default on Linux boxes; otherwise, on Linux and OS X the following command will start a server instance:
$ mongod
On Windows Community Edition, the following, run from a command prompt, will start a server instance:
C:\mongodb\bin\mongod.exe
Easy MongoDB with Docker
MongoDB can be tricky to install. For example, current Ubuntu variants (> version 22.04) have incompatible SSL libs. If you have Docker installed, a working development DB on the default port 27017 is only a single command away:
$ sudo docker run -dp 27017:27017 -v local-mongo:/data/db --name local-mongo --restart=always mongo
This nicely side-steps local library incompatibilities and the like.
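Whichever way you start the server, you can check that something is listening on the default port with a few lines of Python. This sketch uses only the standard library. The function name port_is_open is my own, and an open port only tells you that some service accepts connections there, not that it is definitely MongoDB.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


print("MongoDB port reachable:", port_is_open())
```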
Integrated Development Environments
As I explain in "The Myth of IDEs, Frameworks, and Tools", you don't need an IDE to program in Python or JavaScript. The development tools provided by modern browsers, Chrome in particular, mean you only really need a good code editor to have pretty much the optimal setup.
One caveat here is that these days intermediate to advanced JavaScript tends to involve frameworks like React, Vue, and Svelte that do benefit from the bells and whistles provided by a decent IDE, particularly handling multiformat files (where HTML, CSS, and JS are all embedded together). The good news is that the freely available Visual Studio Code (VSCode) has become the de facto standard for modern web development. It's got plug-ins for pretty much everything and a very large and active community, so questions tend to be answered and bugs hunted down fast.
For Python, I have tried a few dedicated IDEs but they've never stuck. The main itch I was trying to scratch was finding a decent debugging system. Setting breakpoints in Python with a text editor isn't particularly elegant, and using the command-line debugger pdb feels a little too old school sometimes. Nevertheless, Python does have a pretty good logging system included, which takes the edge off its rather clunky default debugging. VSCode is pretty good for Python programming, but there are some Python-specific IDEs that are arguably a little smoother.
In no particular order, here are a few that I've tried and not disliked:
- PyCharm: This option offers solid code assistance and good debugging and would probably top a favorite IDE poll of seasoned Pythonistas.
- PyDev: If you like Eclipse and can tolerate its rather large footprint, this might well be for you.
- Wing Python IDE: This is a solid bet, with a great debugger and incremental improvements over a decade-and-a-half's worth of development.
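If you stick with a plain editor, the logging approach mentioned earlier goes a long way. Here is a minimal sketch of what that can look like. The logger name pyjsviz.demo and the mean function are made up for illustration, and the format string is just one reasonable choice.

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("pyjsviz.demo")  # hypothetical logger name


def mean(values):
    """Average a list of numbers, logging what happens along the way."""
    logger.debug("mean() called with %d values", len(values))
    if not values:
        logger.warning("empty input, returning 0.0")
        return 0.0
    result = sum(values) / len(values)
    logger.debug("mean() -> %s", result)
    return result


print(mean([1, 2, 3, 4]))
# To drop into pdb at a specific point instead, call the built-in
# breakpoint() there (available from Python 3.7).
```

Raising the level to logging.WARNING in basicConfig silences the debug messages without touching the function itself.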
Summary
With free, packaged Python distributions such as Anaconda, and the inclusion of sophisticated JavaScript development tools in freely available web browsers, the necessary Python and JavaScript elements of your development environment are a couple of clicks away. Add a favorite editor and a database of choice [1], and you are pretty much good to go. There are additional tools, such as Node.js, that can be useful but don't count as essential. Now that we've established our programming environment, the next chapters will teach the preliminaries needed to start our journey of data transformation along the toolchain, starting with a language bridge between Python and JavaScript.
[1] SQLite is great for development purposes and doesn't need a server running on your machine.
Get Data Visualization with Python and JavaScript, 2nd Edition now with the O’Reilly learning platform.