Data Visualization with Python and JavaScript, 2nd Edition

by Kyran Dale
Released December 2022
Publisher(s): O'Reilly Media, Inc.
ISBN: 9781098111878

Book description

How do you turn raw, unprocessed, or malformed data into dynamic, interactive web visualizations? In this practical book, author Kyran Dale shows data scientists and analysts, as well as Python and JavaScript developers, how to create the ideal toolchain for the job. By providing engaging examples and stressing hard-earned best practices, this guide teaches you how to leverage the power of best-of-breed Python and JavaScript libraries.

Python provides accessible, powerful, and mature libraries for scraping, cleaning, and processing data. And while JavaScript is the best language when it comes to programming web visualizations, its data processing abilities can't compare with Python's. Together, these two languages are a perfect complement for creating a modern web-visualization toolchain. This book gets you started.

You'll learn how to:

  • Obtain data you need programmatically, using scraping tools or web APIs: Requests, Scrapy, Beautiful Soup
  • Clean and process data using Python's heavyweight data processing libraries within the NumPy ecosystem: Jupyter notebooks with pandas+Matplotlib+Seaborn
  • Deliver the data to a browser with static files or by using Flask, the lightweight Python server, and a RESTful API
  • Pick up enough web development skills (HTML, CSS, JS) to get your visualized data on the web
  • Use the data you've mined and refined to create web charts and visualizations with Plotly, D3, Leaflet, and other libraries

Table of contents

  1. A Language-Learning Bridge Between Python and JavaScript
    1. Similarities and Differences
    2. Interacting with the Code
      1. Python
      2. JavaScript
    3. Basic Bridge Work
      1. Style Guidelines, PEP 8, and use strict
      2. CamelCase Versus Underscore
      3. Importing Modules, Including Scripts
      4. Keeping Your Namespaces Clean
      5. Outputting “Hello World!”
      6. Simple Data Processing
      7. String Construction
      8. Significant Whitespace Versus Curly Brackets
      9. Comments and doc-strings
      10. Declaring Variables, let, var
      11. Strings and Numbers
      12. Booleans
      13. Data Containers: Dicts, Objects, Lists, Arrays
      14. Functions
      15. Iterating: for Loops and Functional Alternatives
      16. Conditionals: if, else, elif, switch
      17. File Input and Output
      18. Classes and Prototypes
    4. Differences in Practice
      1. Method Chaining
      2. Enumerating a List
      3. Tuple Unpacking
      4. Collections
      5. Underscore
      6. Functional Array Methods and List Comprehensions
      7. Map, Reduce, and Filter with Python’s Lambdas
      8. JavaScript Closures and the Module Pattern
      9. This Is That
    5. A Cheat Sheet
    6. Summary
  2. Reading and Writing Data with Python
    1. Easy Does It
    2. Passing Data Around
    3. Working with System Files
    4. CSV, TSV, and Row-Column Data Formats
    5. JSON
      1. Dealing with Dates and Times
    6. SQL
      1. Creating the Database Engine
      2. Defining the Database Tables
      3. Adding Instances with a Session
      4. Querying the Database
      5. Easier SQL with Dataset
    7. MongoDB
    8. Dealing with Dates, Times, and Complex Data
    9. Summary
  3. Webdev 101
    1. The Big Picture
    2. Single-Page Apps
    3. Tooling Up
      1. The Myth of IDEs, Frameworks, and Tools
      2. A Text-Editing Workhorse
      3. Browser with Development Tools
      4. Terminal or Command Prompt
    4. Building a Web Page
      1. Serving Pages with HTTP
      2. The DOM
      3. The HTML Skeleton
      4. Marking Up Content
      5. CSS
      6. JavaScript
      7. Data
    5. Chrome’s Developer Tools
      1. The Elements Tab
      2. The Sources Tab
      3. Other Tools
    6. A Basic Page with Placeholders
    7. Positioning and Sizing Containers with Flex
      1. Filling the Placeholders with Content
    8. Scalable Vector Graphics
      1. The <g> Element
      2. Circles
      3. Applying CSS Styles
      4. Lines, Rectangles, and Polygons
      5. Text
      6. Paths
      7. Scaling and Rotating
      8. Working with Groups
      9. Layering and Transparency
      10. JavaScripted SVG
    9. Summary
  4. Getting Data off the Web with Python
    1. Getting Web Data with the requests Library
    2. Getting Data Files with requests
    3. Using Python to Consume Data from a Web API
      1. Consuming a RESTful Web API with requests
      2. The Worldbank’s climate change APIs
      3. Getting Country Data for the Nobel Dataviz
    4. Using Libraries to Access Web APIs
      1. Using Google Spreadsheets
      2. Using the Twitter API with Tweepy
    5. Scraping Data
      1. Why We Need to Scrape
      2. BeautifulSoup and lxml
      3. A First Scraping Foray
    6. Getting the Soup
    7. Selecting Tags
      1. Crafting Selection Patterns
      2. Caching the Web Pages
      3. Scraping the Winners’ Nationalities
    8. Summary
  5. Heavyweight Scraping with Scrapy
    1. Setting Up Scrapy
    2. Establishing the Targets
    3. Targeting HTML with Xpaths
      1. Testing Xpaths with the Scrapy Shell
      2. Selecting with Relative Xpaths
    4. A First Scrapy Spider
    5. Scraping the Individual Biography Pages
    6. Chaining Requests and Yielding Data
      1. Caching Pages
      2. Yielding Requests
    7. Scrapy Pipelines
    8. Scraping Text and Images with a Pipeline
      1. Specifying Pipelines with Multiple Spiders
    9. Summary
  6. Introduction to NumPy
    1. The NumPy Array
      1. Creating Arrays
      2. Array Indexing and Slicing
      3. A Few Basic Operations
    2. Creating Array Functions
      1. Calculating a Moving Average
    3. Summary
  7. Introduction to Pandas
    1. Why Pandas Is Tailor-Made for Dataviz
    2. Why Pandas Was Developed
    3. Heterogeneous Data and Categorizing Measurements
    4. The DataFrame
      1. Indices
      2. Rows and Columns
      3. Selecting Groups
    5. Creating and Saving DataFrames
      1. JSON
      2. CSV
      3. Excel Files
      4. SQL
      5. MongoDB
    6. Series into DataFrames
    7. Summary
  8. Cleaning Data with Pandas
    1. Coming Clean About Dirty Data
    2. Inspecting the Data
    3. Indices and Pandas Data Selection
      1. Selecting Multiple Rows
    4. Cleaning the Data
      1. Finding Mixed Types
      2. Replacing Strings
      3. Removing Rows
      4. Finding Duplicates
      5. Sorting Data
      6. Removing Duplicates
      7. Dealing with Missing Fields
      8. Dealing with Times and Dates
    5. The Full clean_data Function
    6. Saving the Cleaned Dataset
      1. Merging DataFrames
    7. Summary
  9. Visualizing Data with Matplotlib
    1. Pyplot and Object-Oriented Matplotlib
    2. Starting an Interactive Session
    3. Interactive Plotting with Pyplot’s Global State
      1. Configuring Matplotlib
      2. Setting the Figure’s Size
      3. Points, Not Pixels
      4. Labels and Legends
      5. Titles and Axes Labels
      6. Saving Your Charts
    4. Figures and Object-Oriented Matplotlib
      1. Axes and Subplots
    5. Plot Types
      1. Bar Charts
      2. Scatter Plots
    6. Seaborn
      1. FacetGrids
      2. Pairgrids
    7. Summary
  10. Exploring Data with Pandas
    1. Starting to Explore
    2. Plotting with Pandas
    3. Gender Disparities
      1. Unstacking Groups
      2. Historical Trends
    4. National Trends
      1. Prize Winners per Capita
      2. Prizes by Category
      3. Historical Trends in Prize Distribution
    5. Age and Life Expectancy of Winners
      1. Age at Time of Award
      2. Life Expectancy of Winners
      3. Increasing Life Expectancies over Time
    6. The Nobel Diaspora
    7. Summary
