Eight categories of tools were included as binary questions on the survey; respondents simply marked the ones that they currently use in a professional context. The tool categories were operating systems, programming languages, text editors, IDEs, data tools, cloud/containers, build automation tools, and frameworks.
On average, respondents used 3.6 programming languages and 16 tools of any kind. Fewer than 3% of respondents used fewer than 6 tools, while 19% used at least 20. Some tools were associated with a larger toolkit: respondents who used Scala, Objective-C, Kubernetes, Google App Engine, Go, Groovy, YAML, Cassandra, Solr, or Spark used 21–23 tools on average.
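As a rough illustration of how figures like these can be derived from binary tool questions, the sketch below counts the tools each respondent marked and then averages that count over the users of each tool. The `responses` data and both function names are hypothetical and are not taken from the survey; the actual analysis may have been done differently.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical responses: each respondent is the set of tools they marked.
responses = [
    {"Java", "Vim", "MySQL", "Linux"},
    {"Scala", "Spark", "Cassandra", "IntelliJ", "Linux", "Kubernetes"},
    {"Python", "Vim"},
    {"Scala", "Java", "IntelliJ", "MySQL", "Spark"},
]


def toolkit_sizes(responses):
    """Number of tools marked by each respondent."""
    return [len(tools) for tools in responses]


def mean_toolkit_size_by_tool(responses):
    """For each tool, the average toolkit size among respondents who use it."""
    sizes = defaultdict(list)
    for tools in responses:
        for tool in tools:
            sizes[tool].append(len(tools))
    return {tool: mean(values) for tool, values in sizes.items()}


sizes = toolkit_sizes(responses)
by_tool = mean_toolkit_size_by_tool(responses)
print("average toolkit size:", mean(sizes))
print("average toolkit size among Scala users:", by_tool["Scala"])
```

In this toy data, Scala users have a larger average toolkit than Vim users, which is the same kind of comparison the paragraph above makes for the real sample.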
Vim remains by far the most popular text editor, and IntelliJ is the most popular IDE, well ahead of Eclipse and Visual Studio. MySQL still leads among databases, with PostgreSQL only slightly more popular than Excel. PostgreSQL users earn slightly more than MySQL users, but the highest salaries tend to go with NoSQL and cloud-related technologies such as Hadoop, Spark, MongoDB, and Cassandra, which do even better than Oracle.
Rather than feeding individual tools into the model (which would result in only a small selection of them being chosen as model coefficients), we first built clusters of the most frequently used tools. The motivation is that the use of one tool is often highly correlated with the use of others. (Operating systems were excluded from the clusters.)
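The report does not say which clustering method was used. As one possible sketch of the idea, the code below computes the correlation between binary usage columns (the phi coefficient) and groups tools whose pairwise correlation meets a threshold. The `usage` data, the `threshold` value, and the function names are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def phi(a, b):
    """Pearson correlation between two equal-length lists of 0/1 values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(var_a * var_b)


def cluster_tools(usage, threshold=0.5):
    """Group tools linked by a correlation >= threshold (single linkage)."""
    parent = {tool: tool for tool in usage}

    def find(tool):
        # Walk up to the group representative, shortening the path as we go.
        while parent[tool] != tool:
            parent[tool] = parent[parent[tool]]
            tool = parent[tool]
        return tool

    for a, b in combinations(usage, 2):
        if phi(usage[a], usage[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for tool in usage:
        clusters.setdefault(find(tool), []).append(tool)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical data: one 0/1 entry per respondent for each tool.
usage = {
    "Hadoop": [1, 1, 1, 0, 0, 0],
    "Spark":  [1, 1, 0, 0, 0, 0],
    "MySQL":  [0, 0, 0, 1, 1, 0],
    "PHP":    [0, 0, 0, 1, 1, 1],
}
clusters = cluster_tools(usage)
print(clusters)
```

Each resulting cluster can then enter the model as a single feature, for example the number of tools from that cluster a respondent uses, so that correlated tools do not compete for individual coefficients.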
The 18 clusters were ...