Now that we have far more data, and that data is being stored longer and in more accessible formats, demand for data scientists is growing sharply across many fields and sectors. The term “data scientist” can refer to specific training and background (with more and more advanced degree programs cropping up), but for the purposes of this discussion, let’s assume that data scientists are those who are asked to extract insight, draw conclusions, and make predictions from data. They work with data by analyzing and transforming it and by building models and databases. Sometimes those acting in data science capacities have relatively little formal training in the field. We certainly hope that everyone engaged in data science has a sufficient understanding of statistics to avoid employing dubious methods or arriving at erroneous conclusions.
We covered some tools specific to genomic analysis in Chapter 2. In this chapter we explore NoSQL database offerings and statistical tools for 21st-century data science. Data science is a vast topic that we cannot cover exhaustively here. If you’d like more information, we suggest consulting: