The top two tools in the sample were Excel and SQL, each used by 69% of the sample, followed by R (57%) and Python (54%). Compared to last year, Excel is up (from 59%), as is R (from 52%), while SQL and Python are only slightly higher.
The cluster with the largest coefficient was centered on Spark and Unix, contributing +3.9 per tool. Spark usage was 20%, up from last year’s modest 3%, and it continues to be used by the better-paid individuals in the sample.
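To make the "per tool" wording concrete, here is a minimal sketch of how such a coefficient would add up, assuming a simple additive model in which each tool a respondent uses from the cluster contributes the same amount, with no cap. The function name `cluster_contribution` is hypothetical and used only for illustration; the units are whatever units the report's model uses.

```python
# Assumption: the cluster term is additive, i.e. coefficient * number of
# tools used from that cluster, with no upper limit on the tool count.
SPARK_UNIX_COEFFICIENT = 3.9  # per-tool coefficient quoted in the text


def cluster_contribution(coefficient: float, tools_used: int) -> float:
    """Return the cluster's total contribution for a given tool count."""
    if tools_used < 0:
        raise ValueError("tools_used must be non-negative")
    return coefficient * tools_used


if __name__ == "__main__":
    # Show the contribution for respondents using 0 to 3 tools in the cluster.
    for n in range(4):
        total = cluster_contribution(SPARK_UNIX_COEFFICIENT, n)
        print(f"{n} tool(s): {total:+.1f}")
```

Under these assumptions, a respondent using two tools from the Spark/Unix cluster would see twice the single-tool contribution; the actual model may treat tool counts differently.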
In contrast to the largely open-source Spark/Unix cluster, the second-highest cluster coefficient (+2.4) was assigned to a cluster dominated by proprietary software: Tableau, Teradata, Netezza, Microstrategy, Aster Data, and Jaspersoft. In last year’s report, Teradata also featured as a tool with a large, ...