Executive Summary

Now in its third edition, the 2015 version of the Data Science Salary Survey explores patterns in tools, tasks, and compensation through the lens of clustering and linear models. The research is based on data collected through an online 32-question survey, including demographic information, time spent on various data-related tasks, and the use/non-use of 116 software tools. Over 600 respondents from a variety of industries completed the survey, two-thirds of whom are based in the United States.

Key findings include:

The same four tools—SQL, Excel, R, and Python—remain at the top for the third year in a row

Spark (and Scala) use has grown tremendously from last year, and their users tend to earn more

Using last year’s data for comparison, R is now used by more data professionals who otherwise tend to use commercial tools

Conversely, R is no longer used as frequently by data practitioners who use other open source tools such as Python or Spark

Salaries in the software industry are highest

Even when all other variables are held equal, women are paid thousands less than their male counterparts

Cloud computing (still) pays

About 40% of variation in respondents’ salaries can be attributed to other pieces of data they provided
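To make the last finding concrete, here is a minimal Python sketch of how a linear model produces a salary estimate and how the share of explained variation (R-squared) is computed. The feature names, coefficients, and salary figures are invented for illustration only; they are not taken from the survey or from the report's models.

```python
def predict_salary(intercept, coefficients, respondent):
    """Linear model: intercept plus coefficient * value for each feature."""
    return intercept + sum(
        coef * respondent.get(name, 0) for name, coef in coefficients.items()
    )


def r_squared(actual, predicted):
    """Share of variation in `actual` accounted for by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical model, in thousands of dollars (NOT the report's coefficients).
intercept = 50.0
coefficients = {"years_experience": 2.0, "uses_spark": 10.0}

# A hypothetical respondent: 5 years of experience, uses Spark.
estimate = predict_salary(
    intercept, coefficients, {"years_experience": 5, "uses_spark": 1}
)
print(estimate)  # 70.0

# Toy salaries and toy predictions, chosen only to show the calculation.
actual = [60, 70, 80, 90]
predicted = [65, 65, 85, 85]
print(round(r_squared(actual, predicted), 2))
```

An R-squared of about 0.4, as reported above, would mean the model's predictions account for roughly 40% of the spread in salaries, with the remainder due to factors the survey did not capture.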

We invite you not only to read the report but also to participate: try plugging your own information into one of the linear models to predict your own salary. And, of course, the survey is open for the 2016 report. Spend just 5 to 10 minutes ...
