Extending Power BI with Python and R

Book description

Perform more advanced analysis and manipulation of your data beyond what Power BI can do to unlock valuable insights using Python and R

Key Features

  • Get the most out of Python and R with Power BI by implementing non-trivial code
  • Leverage the toolset of Python and R chunks to inject scripts into your Power BI dashboards
  • Implement new techniques for ingesting, enriching, and visualizing data with Python and R in Power BI

Book Description

Python and R allow you to extend Power BI capabilities to simplify ingestion and transformation activities, enhance dashboards, and highlight insights. With this book, you'll be able to make your artifacts far more interesting and rich in insights using analytical languages.

You'll start by learning how to configure your Power BI environment to use your Python and R scripts. The book then explores data ingestion and data transformation extensions, and advances to focus on data augmentation and data visualization. You'll understand how to import data from external sources and transform them using complex algorithms. The book helps you implement personal data de-identification methods such as pseudonymization, anonymization, and masking in Power BI. You'll be able to call external APIs to enrich your data much more quickly using Python programming and R programming. Later, you'll learn advanced Python and R techniques to perform in-depth analysis and extract valuable information using statistics and machine learning. You'll also understand the main statistical features of datasets by plotting multiple visual graphs in the process of creating a machine learning model.

By the end of this book, you’ll be able to enrich your Power BI data models and visualizations using complex algorithms in Python and R.

What you will learn

  • Discover best practices for using Python and R in Power BI products
  • Use Python and R to perform complex data manipulations in Power BI
  • Apply data anonymization and data pseudonymization in Power BI
  • Log data and load large datasets in Power BI using Python and R
  • Enrich your Power BI dashboards using external APIs and machine learning models
  • Extract insights from your data using linear optimization and other algorithms
  • Handle outliers and missing values for multivariate and time-series data
  • Create any visualization, as complex as you want, using R scripts

Who this book is for

This book is for business analysts, business intelligence professionals, and data scientists who already use Microsoft Power BI and want to add more value to their analysis using Python and R. Working knowledge of Power BI is required to make the most of this book. Basic knowledge of Python and R will also be helpful.

Table of contents

  1. Extending Power BI with Python and R
  2. Foreword
  3. Contributors
  4. About the author
  5. About the reviewers
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Share Your Thoughts
  7. Section 1: Best Practices for Using R and Python in Power BI
  8. Chapter 1: Where and How to Use R and Python Scripts in Power BI
    1. Technical requirements
    2. Injecting R or Python scripts into Power BI
      1. Data loading
      2. Data transformation
      3. Data visualization
    3. Using R and Python to interact with your data
    4. R and Python limitations on Power BI products
    5. Summary
  9. Chapter 2: Configuring R with Power BI
    1. Technical requirements
    2. The available R engines
      1. The CRAN R distribution
      2. The Microsoft R Open distribution and MRAN
      3. Microsoft R Client
      4. Phasing out of Microsoft R Open
    3. Choosing an R engine to install
      1. The R engines used by Power BI
      2. Installing the suggested R engines
    4. Installing an IDE for R development
      1. Installing RStudio
    5. Configuring Power BI Desktop to work with R
    6. Configuring the Power BI service to work with R
      1. Installing the on-premises data gateway in personal mode
      2. Sharing reports that use R scripts in the Power BI service
    7. R visuals limitations
    8. Summary
  10. Chapter 3: Configuring Python with Power BI
    1. Technical requirements
    2. The available Python engines
    3. Choosing a Python engine to install
      1. The Python engines used by Power BI
      2. Installing the suggested Python engines
    4. Installing an IDE for Python development
      1. Configuring Python with RStudio
      2. Configuring Python with Visual Studio Code
    5. Configuring Power BI Desktop to work with Python
    6. Configuring the Power BI service to work with R
      1. Sharing reports that use Python scripts in the Power BI service
    7. Limitations of Python visuals
    8. Summary
  11. Section 2: Data Ingestion and Transformation with R and Python in Power BI
  12. Chapter 4: Importing Unhandled Data Objects
    1. Technical requirements
    2. Importing RDS files in R
      1. A brief introduction to Tidyverse
      2. Creating a serialized R object
      3. Using an RDS file in Power BI
    3. Importing PKL files in Python
      1. A very short introduction to the PyData world
      2. Creating a serialized Python object
      3. Using a PKL file in Power BI
    4. Summary
    5. References
  13. Chapter 5: Using Regular Expressions in Power BI
    1. Technical requirements
    2. A brief introduction to regexes
      1. The basics of regexes
      2. Checking the validity of email addresses
      3. Checking the validity of dates
    3. Validating data using regex in Power BI
      1. Using regex in Power BI to validate emails with Python
      2. Using regex in Power BI to validate emails with R
      3. Using regex in Power BI to validate dates with Python
      4. Using regex in Power BI to validate dates with R
    4. Loading complex log files using regex in Power BI
      1. Apache access logs
      2. Importing Apache access logs in Power BI with Python
      3. Importing Apache access logs in Power BI with R
    5. Extracting values from text using regex in Power BI
      1. One regex to rule them all
      2. Using regex in Power BI to extract values with Python
      3. Using regex in Power BI to extract values with R
    6. Summary
    7. References
  14. Chapter 6: Anonymizing and Pseudonymizing Your Data in Power BI
    1. Technical requirements
    2. De-identifying data
      1. De-identification techniques
      2. Understanding pseudonymization
      3. What is anonymization?
    3. Anonymizing data in Power BI
      1. Anonymizing data using Python
      2. Anonymizing data using R
    4. Pseudonymizing data in Power BI
      1. Pseudonymizing data using Python
      2. Pseudonymizing data using R
    5. Summary
    6. References
  15. Chapter 7: Logging Data from Power BI to External Sources
    1. Technical requirements
    2. Logging to CSV files
      1. Logging to CSV files with Python
      2. Logging to CSV files with R
    3. Logging to Excel files
      1. Logging to Excel files with Python
      2. Logging to Excel files with R
    4. Logging to an Azure SQL server
      1. Installing SQL Server Express
      2. Creating an Azure SQL database
      3. Logging to an Azure SQL server with Python
      4. Logging to an Azure SQL server with R
    5. Summary
    6. References
  16. Chapter 8: Loading Large Datasets beyond the Available RAM in Power BI
    1. Technical requirements
    2. A typical analytic scenario using large datasets
    3. Import large datasets with Python
      1. Installing Dask on your laptop
      2. Creating a Dask DataFrame
      3. Extracting information from a Dask DataFrame
      4. Importing a large dataset in Power BI with Python
    4. Importing large datasets with R
      1. Installing disk.frame on your laptop
      2. Creating a disk.frame instance
      3. Extracting information from disk.frame
      4. Importing a large dataset in Power BI with R
    5. Summary
    6. References
  17. Section 3: Data Enrichment with R and Python in Power BI
  18. Chapter 9: Calling External APIs to Enrich Your Data
    1. Technical requirements
    2. What a web service is
    3. Registering for Bing Maps Web Services
    4. Geocoding addresses using Python
      1. Using an explicit GET request
      2. Using an explicit GET request in parallel
      3. Using the Geocoder library in parallel
    5. Geocoding addresses using R
      1. Using an explicit GET request
      2. Using an explicit GET request in parallel
      3. Using the tidygeocoder package in parallel
    6. Accessing web services using Power BI
      1. Geocoding addresses in Power BI with Python
      2. Geocoding addresses in Power BI with R
    7. Summary
    8. References
  19. Chapter 10: Calculating Columns Using Complex Algorithms
    1. Technical requirements
    2. The distance between two geographic locations
      1. Spherical trigonometry
      2. The law of cosines distance
      3. The Law of Haversines distance
      4. Vincenty's distance
      5. What kind of distance to use and when
    3. Implementing distances using Python
      1. Calculating distances with Python
      2. Calculating distances in Power BI with Python
    4. Implementing distances using R
      1. Calculating distances with R
      2. Calculating distances in Power BI with R
    5. The basics of linear programming
      1. Linear equations and inequalities
      2. Formulating a linear optimization problem
    6. Definition of the LP problem to solve
      1. Formulating the LP problem
    7. Handling optimization problems with Python
      1. Solving the LP problem in Python
      2. Solving the LP problem in Power BI with Python
    8. Solving LP problems with R
      1. Solving the LP problem in R
      2. Solving the LP problem in Power BI with R
    9. Summary
    10. References
  20. Chapter 11: Adding Statistics Insights: Associations
    1. Technical requirements
    2. Exploring associations between variables
    3. Correlation between numeric variables
      1. Karl Pearson's correlation coefficient
      2. Charles Spearman's correlation coefficient
      3. Maurice Kendall's correlation coefficient
      4. Description of a real case
      5. Implementing correlation coefficients in Python
      6. Implementing correlation coefficients in R
      7. Implementing correlation coefficients in Power BI with Python and R
    4. Correlation between categorical and numeric variables
      1. Considering both variables categorical
      2. Considering a numeric variable and a categorical one
      3. Implementing correlation coefficients in Python
      4. Implementing correlation coefficients in R
      5. Implementing correlation coefficients in Power BI with Python and R
    5. Summary
    6. References
  21. Chapter 12: Adding Statistics Insights: Outliers and Missing Values
    1. Technical requirements
    2. What outliers are and how to deal with them
      1. The causes of outliers
      2. Dealing with outliers
    3. Identifying outliers
      1. Univariate outliers
      2. Multivariate outliers
    4. Implementing outlier detection algorithms
      1. Implementing outlier detection in Python
      2. Implementing outlier detection in R
      3. Implementing outlier detection in Power BI
    5. What missing values are and how to deal with them
      1. The causes of missing values
      2. Handling missing values
    6. Diagnosing missing values in R and Python
    7. Implementing missing value imputation algorithms
      1. Removing missing values
      2. Imputing tabular data
      3. Imputing time-series data
      4. Imputing missing values in Power BI
    8. Summary
    9. References
  22. Chapter 13: Using Machine Learning without Premium or Embedded Capacity
    1. Technical requirements
    2. Interacting with ML in Power BI with data flows
    3. Using AutoML solutions
      1. PyCaret
      2. Azure AutoML
      3. RemixAutoML for R
    4. Embedding training code in Power Query
      1. Training and using ML models with PyCaret
      2. Using PyCaret in Power BI
    5. Using trained models in Power Query
      1. Scoring observations in Power Query using a trained PyCaret model
    6. Using trained models in script visuals
      1. Scoring observations in a script visual using a trained PyCaret model
    7. Calling web services in Power Query
      1. Using Azure AutoML models in Power Query
      2. Using Cognitive Services in Power Query
    8. Summary
    9. References
  23. Section 3: Data Visualization with R in Power BI
  24. Chapter 14: Exploratory Data Analysis
    1. Technical requirements
    2. What is the goal of EDA?
      1. Understanding your data
      2. Cleaning your data
      3. Discovering associations between variables
    3. EDA with Python and R
    4. EDA in Power BI
      1. Dataset summary page
      2. Missing values exploration
      3. Univariate exploration
      4. Multivariate exploration
      5. Variable associations
    5. Summary
    6. References
  25. Chapter 15: Advanced Visualizations
    1. Technical requirements
    2. Choosing a circular barplot
    3. Implementing a circular barplot in R
    4. Implementing a circular barplot in Power BI
    5. Summary
    6. References
  26. Chapter 16: Interactive R Custom Visuals
    1. Technical requirements
    2. Why interactive R custom visuals?
    3. Adding a dash of interactivity with Plotly
    4. Exploiting the interactivity provided by HTML widgets
    5. Packaging it all into a Power BI Custom Visual
      1. Installing the pbiviz package
      2. Developing your first R HTML custom visual
    6. Importing the custom visual package into Power BI
    7. Summary
    8. References
    9. Why subscribe?
  27. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts

Product information

  • Title: Extending Power BI with Python and R
  • Author(s): Luca Zavarella
  • Release date: November 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781801078207