2 Review of fundamental statistical concepts

This chapter offers a brief introduction to the basic statistical knowledge needed to design and interpret disease biomarker discovery studies: statistical error types, data sampling and hypothesis testing for numerical and nominal data, odds scores, and the interpretation of other statistical indicators. It covers parametric and non-parametric techniques for comparing groups and models, a basic approach to detecting potential biomarkers in large-scale ‘omic’ studies. The chapter also explains different predictive evaluation techniques: traditional measures, techniques for classification and numerical predictions, and an introduction to the application of receiver operating characteristic curves and related methods.

2.1 Basic concepts and problems

Although we assume that the reader has some basic experience in statistical analysis, the first half of this chapter offers a quick overview of fundamental definitions and terminology that will be used in subsequent chapters.

A key approach to biomarker discovery research is to compare case vs. control samples to detect statistical differences, which could lead to the identification and prioritization of potential biomarkers. Control and case samples are commonly obtained before treatment or before their classification (e.g. diagnosis, prognosis) is known. Control samples are obtained from healthy patients, untreated patients, or from patients who did not experience ...
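As a rough illustration of the case vs. control comparison described above, the following Python sketch contrasts two small groups for a single candidate biomarker. It is not taken from the chapter: the function names, the measurement values, and the choice of Welch's t statistic (a parametric measure) alongside a permutation test (a non-parametric one) are all assumptions made for the example.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(cases, controls):
    """Welch's t statistic (parametric; does not assume equal variances)."""
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    return (mean(cases) - mean(controls)) / se


def permutation_p_value(cases, controls, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(cases) - mean(controls))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the case/control labels
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_cases]) - mean(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_permutations + 1)


# Hypothetical measurements of one candidate biomarker (illustrative values only)
cases = [5.1, 6.3, 5.8, 6.9, 6.1, 5.7]
controls = [4.2, 4.8, 5.0, 4.5, 4.9, 4.4]

t_stat = welch_t(cases, controls)
p_value = permutation_p_value(cases, controls)
print(f"Welch t = {t_stat:.2f}, permutation p = {p_value:.4f}")
```

The permutation test is used here because it makes no distributional assumption and needs only the standard library. In a real ‘omic’ study the same comparison would be repeated across many features, so the resulting p-values would need correction for multiple testing.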

