CHAPTER 29
BICLUSTERING OF MICROARRAY DATA
29.1 INTRODUCTION
One of the main challenges in computational molecular biology is the design of efficient algorithms capable of analyzing biological data, such as microarray data. Gene expression data obtained from microarray experiments can be analyzed through biclustering. Such data are usually represented by a data matrix (see Table 29.1), where the ith row represents the ith gene, the jth column represents the jth condition, and the cell m_ij represents the expression level of the ith gene under the jth condition.
In general, subsets of genes are coexpressed only under certain conditions but behave almost independently under others. Discovering such coexpressions can help uncover genetic knowledge such as gene annotation or gene interaction. Hence, it is useful to cluster the rows (genes) and the columns (conditions) of the data matrix simultaneously, to identify groups of rows that are coherent with groups of columns (i.e., to identify clusters of genes that are coexpressed under clusters of conditions, or clusters of conditions under which clusters of genes are coexpressed). This type of clustering is called biclustering [9]. A cluster produced by biclustering is called a bicluster. Hence, a bicluster of genes (respectively ...