The 2nd International Biometric Society Conference of The Eastern Mediterranean Region
NORMALIZATION FOR MICROARRAY DATA
Hasan ÖNDER, Yalçın TAHTALI, Zeynel CEBECİ
University of Çukurova, Faculty of Agriculture, Department of Animal Science
Abstract
Microarrays are a relatively new technology for monitoring the expression of thousands of genes at once. The technology has improved remarkably in recent years, and its applications have spread across many fields. This study focuses on the issue of normalization. The usual normalization method takes as its bias estimate the mean expression intensity of all genes or of the housekeeping genes. Normalization of the measured intensities is a prerequisite for the comparisons made in any statistical analysis, yet insufficient attention has been paid to its systematic study. The most straightforward normalization methods in use rest on the implicit assumption of a linear response between true expression level and output intensity.
Key Words: Normalization; cDNA; Microarray Data Analysis

Introduction
DNA microarray technology is a powerful technique that enables the simultaneous detection of thousands of genes in a given sample (1). However, systematic errors arise during microarray analysis and should be corrected. Normalization of microarray data is still one of the major bottlenecks in the analysis of gene expression. Normalization is needed to remove non-biological influences from the data and to allow one experiment to be compared with another (2). Through normalization, data taken from different studies can be evaluated on the same scale.
Normalization
Because sources of systematic variation may affect different microarray experiments to varying extents, this variation must be removed before microarrays can be compared. Normalization is the term used to describe the process of removing such variation (3).
A general normalization function covers not only common normalization methods such as mean centering or lowess normalization but also allows the user to apply more complicated functions (2). Analysis of variance is applied to the data before and after normalization to understand how normalization has affected the main sources of variance. The underlying assumption of the normalization technique is that the quantity of initial mRNA is the same for both labeled samples, as is the total quantity of mRNA hybridizing to the array from each sample (4). The major error sources in microarray experiments are given below:
• mRNA preparation
• Transcription
• Labeling
• Amplification
• Systematic variation in pin geometry
• Random fluctuations in target volume
• Target fixation
• Hybridization parameters
• Slide inhomogeneities
• Non-specific hybridization
• Non-specific background and overshining
• Image analysis
Different types of experiments are required to assess different types of fluctuation (5). In a common normalization technique, the scaling factors are derived from the total signal. This assumes that only a small fraction of the genes undergo significant experiment-related changes in expression, or that the changes tend to compensate so as not to significantly affect the normalization quantities. Variations on this theme are given below (6):
• Compute a mean relationship other than a linear trend
• Compute normalization quantities separately for groups of genes, where the partition captures an obvious source of non-experimental variation
• Compute normalization quantities iteratively
• Compute normalization quantities on a subset of genes that ought not to show systematic variation
• Compute normalization quantities through ad hoc wild type vs wild type comparisons
• Compute normalization quantities through models describing the sources of variation
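The total-signal scaling described above, together with the variation that restricts the calculation to a subset of genes, might be sketched as follows. This is only a minimal illustration under stated assumptions: the two channels are given as plain lists of background-corrected intensities, and the function names and the `subset` parameter are hypothetical, not taken from any of the cited sources.

```python
def total_intensity_factor(red, green, subset=None):
    """Scaling factor k = sum(red) / sum(green).

    `subset` (hypothetical parameter) is an optional list of spot indices,
    e.g. housekeeping genes assumed not to change between samples.
    """
    idx = list(range(len(red))) if subset is None else list(subset)
    total_red = sum(red[i] for i in idx)
    total_green = sum(green[i] for i in idx)
    if total_green == 0:
        raise ValueError("total green intensity is zero")
    return total_red / total_green


def normalize_ratios(red, green, subset=None):
    """Red/green ratios after rescaling the green channel by k."""
    k = total_intensity_factor(red, green, subset)
    return [r / (k * g) for r, g in zip(red, green)]
```

Passing `subset=None` uses every spot on the array, which corresponds to the assumption that only a small fraction of genes change or that the changes compensate. Passing a list of indices limits the calculation to genes that ought not to show systematic variation.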
Some algorithms for normalization are:
• Linear regression
• Log-linear regression
• Ratio statistics
• Log (ratio) mean/median centering
• Nonlinear regression
Furthermore, some optional user-defined quality values can be used (7). Most normalization methods assume that the average ratio is 1 or that the average log ratio is zero. This assumption implies that the average gene does not change its expression under the conditions being studied. Some researchers go further and maintain that not only should the mean/median log ratio be made equal to zero, but the distribution of the ratios should also be scaled to give a uniform standard deviation. Generally, a normalization method has two components. First, a set of elements on the microarray is selected; second, this set of elements is used to calculate either a normalization value or a normalization function, which is in turn applied to all of the raw data or to a subset of it. In addition, global mean or median normalization or intensity-dependent normalization is used to calculate the normalization function or factor (3).
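Log (ratio) median centering and the optional scaling to a common standard deviation could look roughly like the sketch below. It assumes strictly positive, background-corrected intensities for both channels and uses base-2 logarithms. The function names are hypothetical, and scaling each array to a standard deviation of 1 is only one possible reading of "uniform standard deviation".

```python
import math
import statistics


def log_ratios(red, green):
    """Base-2 log ratio of red to green for every spot."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def median_center(values):
    """Shift the log ratios so that their median becomes zero."""
    med = statistics.median(values)
    return [v - med for v in values]


def scale_to_unit_sd(values):
    """Divide by the sample standard deviation (assumed target: SD = 1)."""
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("log ratios have zero spread")
    return [v / sd for v in values]
```

Here the selected set of elements is simply every spot on the array, and the normalization value is the median log ratio. Replacing `statistics.median` with `statistics.mean` would give mean centering instead.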
Results and Discussion
In conclusion, appropriate normalization methods should be used in order to compare microarrays or experiments accurately. The choice of a suitable method depends on the structure of the data. The most straightforward normalization methods in use rest on the implicit assumption of a linear response between true expression level and output intensity.
References
1. WU, W., WILDSMITH, S. E., WINKLEY, A. J., YALLOP, R., ELCOCK, F. J., BUGELSKI, P. J., 2001. Chemometric Strategies for Normalisation of Gene Expression Data Obtained from cDNA Microarrays. Analytica Chimica Acta 446, 451-466.
2. SAARIKKO, I., VILJANEN, T., LAHESMAA, R., SALAKOSKI, T., UUSIPAIKKA, E., 2002. General Optimisation Approach for Normalising cDNA Microarray Data with Replicates. (URL: http://www3.btk.utu.fi:8080/Genomics/Bioinfo/Research/ISCB2002.pdf : 23.10.2002)
3. The MGED Data Transformation and Normalization Working Group. (URL: http://www.dnachip.org/mged/normalization.html : 19.10.2002)
4. Bacteria Diagnostic: Normalization. (URL: http://www.stanford.edu/~dalmassi/Biochem218/normalization.html : 18.10.2002)
5. SCHUCHHARDT, J., BEULE, D., MALIK, A., WOLSKI, E., EICKHOFF, H., LEHRACH, H., HERZEL, H., 2000. Normalization Strategies for cDNA Microarrays. Nucleic Acids Research, Vol. 28, No. 10.
6. Preprocessing of Microarray Data I: Normalization and Missing Values. (URL: http://globin.cse.psu.edu/courses/spring2002/3_Norm_miss.pdf : 12.10.2002)
7. EMBL-EBI, 2002. Gene Expression RFP Response. (URL: http://xml.coverpages.org/ebi-GE-00-11-16.pdf : 18.10.2002)