Application of Principal Component Analysis to Gene Arrays (cDNA Microarrays)

In this study, principal component analysis was applied to data comprising 6675 genes and 20 arrays, obtained with cDNA microarray technology from the livers of mice used in toxicology studies over defined time periods. The study explains how gene groups with similar expression profiles are formed and how related genes are identified through similar component loadings within those groups. It also describes the extraction and interpretation of factors (components) from the correlation matrix of the same data set. Several methods for deciding how many principal components to retain, so that a smaller number of components can account for the structure of the whole data set, were evaluated. According to these methods, the first 9 principal components (those with the 9 largest eigenvalues) are considered sufficient: describing the data with 9 components instead of all 20 retains 79.21% of the total variance, a loss of 20.79%.
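
The component-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' code or data: the matrix below is synthetic, with the same 6675 x 20 shape, and the two retention rules shown (eigenvalue greater than one, and a cumulative variance threshold) are common choices assumed here for illustration, since the abstract does not name the methods that were compared. The function names `pca_from_correlation` and `n_components_for` are hypothetical.

```python
import numpy as np


def pca_from_correlation(data):
    """Eigen-decompose the correlation matrix of the columns of `data`.

    Returns the eigenvalues in descending order, the matching eigenvectors
    (as columns), and the cumulative fraction of total variance explained.
    """
    corr = np.corrcoef(data, rowvar=False)    # columns (arrays) are the variables
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return eigvals, eigvecs, cumulative


def n_components_for(cumulative, target):
    """Smallest number of components whose cumulative share reaches `target`."""
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(cumulative))


# Synthetic stand-in for the expression matrix (assumption: rows are genes,
# columns are arrays). A few latent factors plus noise give correlated columns.
rng = np.random.default_rng(0)
n_genes, n_arrays, n_factors = 6675, 20, 4
factors = rng.normal(size=(n_genes, n_factors))
mixing = rng.normal(size=(n_factors, n_arrays))
data = factors @ mixing + rng.normal(scale=2.0, size=(n_genes, n_arrays))

eigvals, eigvecs, cumulative = pca_from_correlation(data)

# Rule 1 (assumed): keep components whose eigenvalue exceeds 1.
kaiser_k = int(np.sum(eigvals > 1.0))

# Rule 2 (assumed): keep enough components to reach a chosen variance share.
threshold_k = n_components_for(cumulative, 0.80)

print("components with eigenvalue > 1:", kaiser_k)
print("components needed for 80% of variance:", threshold_k)
print("variance lost with that many components:",
      round(1.0 - cumulative[threshold_k - 1], 4))
```

Because the decomposition is done on a correlation matrix, the eigenvalues sum to the number of variables (20 here), so the variance lost by keeping k components is one minus the cumulative share at k. This is the same arithmetic behind the 20.79% loss reported for 9 components.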

Türk Tarım - Gıda Bilim ve Teknoloji Dergisi (Turkish Journal of Agriculture - Food Science and Technology)
  • ISSN: 2148-127X
  • Publication frequency: Monthly
  • Established: 2013
  • Publisher: Turkish Science and Technology Publishing (TURSTEP)