Analysis of Gene Expression Data using Functional Principal Components


The large amount of data produced by DNA microarrays calls for efficient computer algorithms to analyze gene expression, and thus to study the transcriptome. Numerous techniques already exist; we propose a new method based on the key idea that gene profiles may be considered as continuous curves. The set of curves obtained from a DNA microarray can then be studied with a functional analysis that exhibits the main modes of variation in this set, groups genes with similar variations, and extracts characteristic parameters of the gene profiles. We aim here to introduce this method, called Functional Principal Component Analysis. A prospective study has been performed on two available datasets: the sporulation data of Saccharomyces cerevisiae and data from tumor cell lines. The results are promising: the method is able to extract characteristic parameters from the datasets, to extract significant modes of variation in the set of gene profiles, and to link these variations to biological processes already studied in the literature.
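
To make the idea of treating gene profiles as curves more concrete, the following is a minimal sketch of a discretized functional principal component analysis. It is not the authors' implementation: the function name `fpca`, its parameters, and the choice of trapezoidal quadrature on a common time grid (with no basis-function smoothing) are all assumptions made for illustration.

```python
import numpy as np

def fpca(curves, t, n_components=2):
    """Discretized FPCA sketch (hypothetical helper, not from the paper).

    curves: array of shape (n_genes, n_times), one expression profile per row,
            all sampled on the same time grid t (an assumption of this sketch).
    t:      increasing array of at least 3 sampling times.
    """
    curves = np.asarray(curves, dtype=float)
    t = np.asarray(t, dtype=float)

    # Trapezoidal quadrature weights, so that sums approximate integrals over time.
    w = np.empty_like(t)
    w[1:-1] = (t[2:] - t[:-2]) / 2.0
    w[0] = (t[1] - t[0]) / 2.0
    w[-1] = (t[-1] - t[-2]) / 2.0

    # Centre the profiles around the mean curve.
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve

    # Empirical covariance function evaluated on the grid.
    cov = centered.T @ centered / curves.shape[0]

    # Symmetrised eigenproblem W^(1/2) C W^(1/2) u = lambda u,
    # with eigenfunctions recovered as phi = W^(-1/2) u.
    sw = np.sqrt(w)
    sym = sw[:, None] * cov * sw[None, :]
    eigvals, eigvecs = np.linalg.eigh(sym)

    # Keep the leading components (eigh returns ascending eigenvalues).
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    modes = (eigvecs[:, order] / sw[:, None]).T  # shape (n_components, n_times)

    # Scores: quadrature approximation of the integral of centred curve times mode.
    scores = (centered * w) @ modes.T
    return mean_curve, eigvals, modes, scores
```

Under these assumptions, each row of `modes` is a mode of variation of the set of profiles, and each row of `scores` gives a small set of characteristic parameters for one gene, which could then be used to group genes with similar variations.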

In Computer Methods and Programs in Biomedicine 75:1-9