Guest commentary on chapter 5: Advances in biomarker discovery with gene expression data

Haiying Wang, Huiru Zheng

Computer Science Research Institute, School of Computing and Mathematics, University of Ulster, Newtownabbey, Co. Antrim, BT37 0QB, UK

With the ability to measure the expression levels of thousands of genes simultaneously in a single experiment, global gene expression profiling technologies such as microarrays and serial analysis of gene expression (SAGE) offer significant advantages in the search for new biomarkers. However, the massive amounts of genome-wide expression data they generate pose a great challenge for data mining and analysis. It has been shown that traditional statistical and classification techniques are not sufficient to address some fundamental issues in the search for novel and meaningful biomarkers. For example, one common practice is to apply statistical tests to score genes on the basis of their association with specific clinical outcomes and then to select the top-ranked genes as biomarker candidates. This may result in the identification of a set of highly correlated biomarkers, which carry largely overlapping information. Gerszten and Wang (2008) argued that, in order to achieve a significant improvement in predictive performance, new orthogonal biomarkers associated with new disease pathways are needed. Unsupervised clustering techniques and recent advances in network-based analysis offer great benefits in this endeavour.
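The ranking practice described above, and the redundancy it can produce, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the data are synthetic, the gene names (`pathway_gene_*`, `independent_gene`, `noise_gene_*`) and the helper `top_ranked_genes` are hypothetical, and Welch's t statistic merely stands in for whichever statistical test a given study might use.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x) * sum((yi - my) ** 2 for yi in y))
    return num / den


def top_ranked_genes(expr, labels, k):
    """Score each gene by |t| between outcome groups and return the k best names."""
    scores = {}
    for gene, values in expr.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(welch_t(cases, controls))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic cohort (assumption): 20 cases (label 1) and 20 controls (label 0).
rng = random.Random(0)
labels = [1] * 20 + [0] * 20

# Five hypothetical genes driven by one shared pathway that tracks the outcome.
pathway = [3.0 * y + rng.gauss(0, 1) for y in labels]
expr = {f"pathway_gene_{i}": [p + rng.gauss(0, 0.3) for p in pathway] for i in range(5)}

# One gene weakly associated with the outcome through an unrelated mechanism.
expr["independent_gene"] = [1.0 * y + rng.gauss(0, 1) for y in labels]

# Fifty genes with no association at all.
for i in range(50):
    expr[f"noise_gene_{i}"] = [rng.gauss(0, 1) for _ in labels]

selected = top_ranked_genes(expr, labels, k=3)
pairwise = [
    pearson(expr[a], expr[b])
    for i, a in enumerate(selected)
    for b in selected[i + 1:]
]

print("Top-ranked genes:", selected)
print("Pairwise correlations:", [round(r, 2) for r in pairwise])
```

In this toy setting the top of the ranking is filled by genes from the same simulated pathway, so the selected candidates are strongly correlated with one another, while the weaker but independent signal is left out. Real expression data are far noisier, and the size of this effect will depend on the test, the cohort, and the underlying biology.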

Unsupervised clustering approaches

Clustering is the process of ...
