Chapter 1. Python vs. R for Data Science
Introduction
Python and R are two of the mainstream languages in data science. Fundamentally, Python is a language for programmers, whereas R is a language for statisticians. In a data science context, the two languages overlap significantly in their capabilities for regression analysis and machine learning. Your choice of language will depend largely on the environment in which you are operating. In a production environment, Python integrates with other languages much more seamlessly and is therefore the usual choice. R, however, is much more common in research environments because of its more extensive selection of libraries for statistical analysis.
Basics
| | Python | R |
| Current version | 3.6 | 3.4.3 |
| Self-defined as | According to the official website, Python is a programming language that lets you work quickly and integrate your systems effectively. This self-description emphasizes productivity as well as Python's use as a glue language. | R is an open source language designed specifically for statistical analysis. As such, it is highly popular within fields such as data science, engineering, and other quantitative disciplines. The R Project for Statistical Computing describes R as an environment for “statistical computing and graphics.” |
| Strengths | Python has significantly more flexibility in interacting with ... | |