Data Science Programming All-in-One For Dummies

Book description

Your logical, linear guide to the fundamentals of data science programming

Data science is exploding (in a good way), with a forecast of 1.7 megabytes of new information created every second for each human being on the planet by 2020 and 11.5 million job openings by 2026. It clearly pays dividends to be in the know. This friendly guide charts a path through the fundamentals of data science and then delves into the actual work: linear regression, logistic regression, machine learning, neural networks, recommender engines, and cross-validation of models.

Data Science Programming All-in-One For Dummies brings together the key programming languages for data science, machine learning, and deep learning: Python and R. It helps you decide which programming language is best for a specific data science need. It also gives you guidelines for building your own projects to solve problems in real time.

  • Get grounded: the ideal start for new data professionals
  • What lies ahead: learn about specific areas that data is transforming
  • Be meaningful: find out how to tell your data story
  • See clearly: pick up the art of visualization

Whether you’re a beginning student or already mid-career, get your copy now and add even more meaning to your life—and everyone else’s!

Table of contents

  1. Cover
  2. Introduction
    1. About This Book
    2. Foolish Assumptions
    3. Icons Used in This Book
    4. Beyond the Book
    5. Where to Go from Here
  3. Book 1: Defining Data Science
    1. Chapter 1: Considering the History and Uses of Data Science
      1. Considering the Elements of Data Science
      2. Defining the Role of Data in the World
      3. Creating the Data Science Pipeline
      4. Comparing Different Languages Used for Data Science
      5. Learning to Perform Data Science Tasks Fast
    2. Chapter 2: Placing Data Science within the Realm of AI
      1. Seeing the Data to Data Science Relationship
      2. Defining the Levels of AI
      3. Creating a Pipeline from Data to AI
    3. Chapter 3: Creating a Data Science Lab of Your Own
      1. Considering the Analysis Platform Options
      2. Choosing a Development Language
      3. Obtaining and Using Python
      4. Obtaining and Using R
      5. Presenting Frameworks
      6. Accessing the Downloadable Code
    4. Chapter 4: Considering Additional Packages and Libraries You Might Want
      1. Considering the Uses for Third-Party Code
      2. Obtaining Useful Python Packages
      3. Locating Useful R Libraries
    5. Chapter 5: Leveraging a Deep Learning Framework
      1. Understanding Deep Learning Framework Usage
      2. Working with Low-End Frameworks
      3. Understanding TensorFlow
  4. Book 2: Interacting with Data Storage
    1. Chapter 1: Manipulating Raw Data
      1. Defining the Data Sources
      2. Considering the Data Forms
      3. Understanding the Need for Data Reliability
    2. Chapter 2: Using Functional Programming Techniques
      1. Defining Functional Programming
      2. Understanding Pure and Impure Languages
      3. Comparing the Functional Paradigm
      4. Using Python for Functional Programming Needs
      5. Understanding How Functional Data Works
      6. Working with Lists and Strings
      7. Employing Pattern Matching
      8. Working with Recursion
      9. Performing Functional Data Manipulation
    3. Chapter 3: Working with Scalars, Vectors, and Matrices
      1. Considering the Data Forms
      2. Defining Data Type through Scalars
      3. Creating Organized Data with Vectors
      4. Creating and Using Matrices
      5. Extending Analysis to Tensors
      6. Using Vectorization Effectively
      7. Selecting and Shaping Data
      8. Working with Trees
      9. Representing Relations in a Graph
    4. Chapter 4: Accessing Data in Files
      1. Understanding Flat File Data Sources
      2. Working with Positional Data Files
      3. Accessing Data in CSV Files
      4. Moving On to XML Files
      5. Considering Other Flat-File Data Sources
      6. Working with Nontext Data
      7. Downloading Online Datasets
    5. Chapter 5: Working with a Relational DBMS
      1. Considering RDBMS Issues
      2. Accessing the RDBMS Data
      3. Creating a Dataset
      4. Mixing RDBMS Products
    6. Chapter 6: Working with a NoSQL DBMS
      1. Considering the Ramifications of Hierarchical Data
      2. Accessing the Data
      3. Interacting with Data from NoSQL Databases
      4. Working with Dictionaries
      5. Developing Datasets from Hierarchical Data
      6. Processing Hierarchical Data into Other Forms
  5. Book 3: Manipulating Data Using Basic Algorithms
    1. Chapter 1: Working with Linear Regression
      1. Considering the History of Linear Regression
      2. Combining Variables
      3. Manipulating Categorical Variables
      4. Using Linear Regression to Guess Numbers
      5. Learning One Example at a Time
    2. Chapter 2: Moving Forward with Logistic Regression
      1. Considering the History of Logistic Regression
      2. Differentiating between Linear and Logistic Regression
      3. Using Logistic Regression to Guess Classes
      4. Switching to Probabilities
      5. Working through Multiclass Regression
    3. Chapter 3: Predicting Outcomes Using Bayes
      1. Understanding Bayes' Theorem
      2. Using Naïve Bayes for Predictions
      3. Working with Networked Bayes
      4. Considering the Use of Bayesian Linear Regression
      5. Considering the Use of Bayesian Logistic Regression
    4. Chapter 4: Learning with K-Nearest Neighbors
      1. Considering the History of K-Nearest Neighbors
      2. Learning Lazily with K-Nearest Neighbors
      3. Leveraging the Correct k Parameter
      4. Implementing KNN Regression
      5. Implementing KNN Classification
  6. Book 4: Performing Advanced Data Manipulation
    1. Chapter 1: Leveraging Ensembles of Learners
      1. Leveraging Decision Trees
      2. Working with Almost Random Guesses
      3. Meeting Again with Gradient Descent
      4. Averaging Different Predictors
    2. Chapter 2: Building Deep Learning Models
      1. Discovering the Incredible Perceptron
      2. Hitting Complexity with Neural Networks
      3. Understanding More about Neural Networks
      4. Looking Under the Hood of Neural Networks
      5. Explaining Deep Learning Differences with Other Forms of AI
    3. Chapter 3: Recognizing Images with CNNs
      1. Beginning with Simple Image Recognition
      2. Understanding CNN Image Basics
      3. Moving to CNNs with Character Recognition
      4. Explaining How Convolutions Work
      5. Detecting Edges and Shapes from Images
    4. Chapter 4: Processing Text and Other Sequences
      1. Introducing Natural Language Processing
      2. Understanding How Machines Read
      3. Understanding Semantics Using Word Embeddings
      4. Using Scoring and Classification
  7. Book 5: Performing Data-Related Tasks
    1. Chapter 1: Making Recommendations
      1. Realizing the Recommendation Revolution
      2. Downloading Rating Data
      3. Leveraging SVD
    2. Chapter 2: Performing Complex Classifications
      1. Using Image Classification Challenges
      2. Distinguishing Traffic Signs
    3. Chapter 3: Identifying Objects
      1. Distinguishing Classification Tasks
      2. Perceiving Objects in Their Surroundings
      3. Overcoming Adversarial Attacks on Deep Learning Applications
    4. Chapter 4: Analyzing Music and Video
      1. Learning to Imitate Art and Life
      2. Mimicking an Artist
      3. Moving toward GANs
    5. Chapter 5: Considering Other Task Types
      1. Processing Language in Texts
      2. Processing Time Series
    6. Chapter 6: Developing Impressive Charts and Plots
      1. Starting a Graph, Chart, or Plot
      2. Setting the Axis, Ticks, and Grids
      3. Defining the Line Appearance
      4. Using Labels, Annotations, and Legends
      5. Creating Scatterplots
      6. Plotting Time Series
      7. Plotting Geographical Data
      8. Visualizing Graphs
  8. Book 6: Diagnosing and Fixing Errors
    1. Chapter 1: Locating Errors in Your Data
      1. Considering the Types of Data Errors
      2. Obtaining the Required Data
      3. Validating Your Data
      4. Manicuring the Data
      5. Dealing with Dates in Your Data
    2. Chapter 2: Considering Outrageous Outcomes
      1. Deciding What Outrageous Means
      2. Considering the Five Mistruths in Data
      3. Considering Detection of Outliers
      4. Examining a Simple Univariate Method
      5. Developing a Multivariate Approach
    3. Chapter 3: Dealing with Model Overfitting and Underfitting
      1. Understanding the Causes
      2. Determining the Sources of Overfitting and Underfitting
      3. Guessing the Right Features
    4. Chapter 4: Obtaining the Correct Output Presentation
      1. Considering the Meaning of Correct
      2. Determining a Presentation Type
      3. Choosing the Right Graph
      4. Working with External Data
    5. Chapter 5: Developing Consistent Strategies
      1. Standardizing Data Collection Techniques
      2. Using Reliable Sources
      3. Verifying Dynamic Data Sources
      4. Looking for New Data Collection Trends
      5. Weeding Old Data
      6. Considering the Need for Randomness
  9. Index
  10. About the Authors
  11. Connect with Dummies
  12. End User License Agreement

Product information

  • Title: Data Science Programming All-in-One For Dummies
  • Author(s): John Paul Mueller, Luca Massaron
  • Release date: January 2020
  • Publisher(s): For Dummies
  • ISBN: 9781119626114