Supervised Principal Component Analysis Approach Based on Artificial Neural Networks in Gene Expression Data
This study aimed to reduce the dimension of high-dimensional gene expression data using supervised principal component analysis (S-PCA) and supervised principal component analysis with artificial neural networks (S-ANN-PCA), which is proposed here as a new approach, and to compare the performance of the two methods using random survival forests (RSF). In the simulation study, 5000 genes were generated from a multivariate normal distribution for 100 units, together with survival times associated with these gene data. The simulation was repeated 1000 times. In addition, gene expression data from 240 patients with diffuse large B-cell lymphoma (DLBCL) were used. Dimension reduction was performed by selecting important genes with the Wald statistic. The new data sets obtained from the two methods were analyzed with RSF. In the simulation study, the explanatory power of S-PCA differed significantly from that of S-ANN-PCA (p
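The simulation, Wald-based gene selection, and linear S-PCA steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the identity covariance, the effect sizes, the exponential baseline hazard, the censoring mechanism, the smaller gene count (500 rather than 5000), the number of retained genes, and all function names are hypothetical choices made for the example. The neural-network variant (S-ANN-PCA) and the RSF comparison are not shown.

```python
import numpy as np

def simulate(n=100, p=500, n_signal=10, seed=0):
    """Toy data: n units, p genes, survival times tied to the first n_signal genes."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))        # multivariate normal, identity covariance (assumed)
    beta = np.zeros(p)
    beta[:n_signal] = 0.8                  # assumed effect sizes
    # Exponential baseline hazard (assumed): T = -log(U) / (lambda * exp(X beta))
    t_event = -np.log(rng.uniform(size=n)) / (0.1 * np.exp(X @ beta))
    t_cens = rng.exponential(scale=3 * np.median(t_event), size=n)  # assumed censoring
    return X, np.minimum(t_event, t_cens), (t_event <= t_cens).astype(float)

def _score_info(Xs, d, beta):
    """Partial-likelihood score and information, one univariate Cox model per column."""
    w = np.exp(np.clip(Xs * beta, -30, 30))
    s0 = np.cumsum(w, axis=0)              # rows sorted by descending time: risk set = rows 0..i
    m1 = np.cumsum(w * Xs, axis=0) / s0
    m2 = np.cumsum(w * Xs**2, axis=0) / s0
    score = np.sum(d * (Xs - m1), axis=0)
    info = np.maximum(np.sum(d * (m2 - m1**2), axis=0), 1e-12)
    return score, info

def univariate_cox_wald(X, time, event, n_iter=15):
    """Wald statistic (beta_hat / se)^2 per gene, fitted with damped Newton steps."""
    order = np.argsort(-time)
    Xs, d = X[order], event[order][:, None]
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        score, info = _score_info(Xs, d, beta)
        beta = beta + np.clip(score / info, -1.0, 1.0)
    _, info = _score_info(Xs, d, beta)     # information at the final estimate
    return beta**2 * info                  # se^2 = 1 / info

def supervised_pca(X, wald, n_genes=50, n_components=2):
    """Keep the genes with the largest Wald statistics, then run PCA on them via SVD."""
    keep = np.argsort(wald)[::-1][:n_genes]
    Z = X[:, keep] - X[:, keep].mean(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s[:n_components]**2 / np.sum(s**2)
    return keep, scores, explained

X, time, event = simulate()
wald = univariate_cox_wald(X, time, event)
keep, scores, explained = supervised_pca(X, wald)
```

The component scores in `scores` would then serve as the reduced data set passed to a survival model such as RSF. In a study like the one described, the whole procedure would be repeated over many simulated data sets.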