Bioinformatics Methods for Analysis of Gene Expression Data
Recent technical advances in molecular biology have had a profound impact on biology and medicine. It is unlikely that this progress could have been made without the support of modern computers. For instance, the massive amounts of sequence data produced by the Human Genome Project and other genome sequencing efforts have been stored, shared, and analyzed electronically. After completion of the sequencing projects, research effort will shift towards analysis of the biological function of the thousands of genes in the genome(s), their mRNAs, and their proteins. One of the keys to understanding the function of genes and their products is their level of expression in various cell types and states. High-throughput technologies for measuring gene expression, such as spotted cDNA microarrays, synthesized oligonucleotide microarrays, and serial analysis of gene expression (SAGE), are maturing. Huge amounts of gene expression data have already been generated, and more will follow. As with sequence data, gene expression data will be analyzed with highly computerized methods. In contrast to sequence analysis, however, analysis of gene expression will benefit considerably from the integration of external sources of information, including clinical information on patients, published literature, and sequence data.
In summary, this thesis presents computational methods for various steps in the analysis of gene expression data measured by cDNA microarrays and similar technologies. These include methods to assess data quality, to compare measurements from different technological platforms, and to extract background knowledge and integrate it into the analysis.