Prediction of protein disorder regions with support vector machines

Disorder in protein structure can cause disease and functional impairment, yet in some cases it is a precondition for a protein to carry out vital functions. Detecting disordered regions in protein structures is therefore of considerable importance. Clinical and laboratory studies in this area are expensive and time-consuming. To reduce that time and cost, this study uses data mining methods as an alternative to experimental approaches: radial basis function networks, a type of artificial neural network, and support vector machines. A dataset prepared by Su et al. in 2006 was used to examine ordered and disordered protein structures. Of the three methods compared, proximal support vector machines gave the best results.
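
The abstract does not describe how the classifiers were implemented, so the following is only a minimal sketch of a linear proximal support vector machine as the method is commonly formulated (Fung and Mangasarian, 2001). It is not the authors' code. In that formulation, training reduces to solving one small linear system, (I/nu + E^T E) [w; gamma] = E^T D e, where E = [A, -e], A holds the training points as rows, D is the diagonal matrix of +1/-1 labels, and e is a vector of ones. The function names, the value of `nu`, and the two-dimensional toy points are assumptions made for illustration; real use would need a feature encoding of amino acid windows, which is not shown here.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def train_linear_psvm(points, labels, nu=1.0):
    """Fit a linear proximal SVM; labels must be +1 or -1.

    Solves (I/nu + E^T E) [w; gamma] = E^T D e with E = [A, -e].
    Returns (w, gamma); the decision function is sign(x . w - gamma).
    """
    dim = len(points[0])
    size = dim + 1
    rows = [list(p) + [-1.0] for p in points]  # rows of E = [A, -e]
    lhs = [[(1.0 / nu if i == j else 0.0) for j in range(size)]
           for i in range(size)]
    rhs = [0.0] * size
    for row, label in zip(rows, labels):
        for i in range(size):
            rhs[i] += row[i] * label          # accumulates E^T D e
            for j in range(size):
                lhs[i][j] += row[i] * row[j]  # accumulates E^T E
    solution = solve_linear_system(lhs, rhs)
    return solution[:dim], solution[dim]


def predict(weights, gamma, point):
    """Return +1 or -1 depending on the side of the separating plane."""
    score = sum(w * x for w, x in zip(weights, point)) - gamma
    return 1 if score >= 0 else -1


# Hypothetical toy data: +1 could stand for "disordered", -1 for "ordered".
toy_points = [(2.0, 2.5), (3.0, 2.0), (2.5, 3.5), (3.5, 3.0),
              (-2.0, -2.5), (-3.0, -2.0), (-2.5, -3.5), (-3.5, -3.0)]
toy_labels = [1, 1, 1, 1, -1, -1, -1, -1]

weights, gamma = train_linear_psvm(toy_points, toy_labels, nu=1.0)
predictions = [predict(weights, gamma, p) for p in toy_points]
print(predictions)
```

Because training is a single linear solve of size (number of features + 1), this variant is typically cheaper to fit than a standard SVM, which requires solving a quadratic program. Whether that was a factor in the comparison reported here is not stated in the abstract.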
