Classification of Gene Samples Using Pair-Wise Support Vector Machines

The main difficulty in classifying gene samples is that the data are high-dimensional while only a small number of samples are available. A classifier for such problems must therefore be able to process high-dimensional data and extract as much information as possible from the few samples at hand. To this end, a classification methodology was developed that first transforms a binary or multi-class classification problem into separate pair-wise binary classification problems. An online classifier was then adapted to solve these pair-wise binary problems. On most of the real-world problems considered, the resulting classifier performed better than other popular classifiers.
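The abstract does not say how the pair-wise problems are built or which online classifier is adapted, so the following is only a minimal sketch under stated assumptions. It assumes that every unordered pair of samples becomes one training example, that a pair is represented by the element-wise absolute difference of its two samples, and that the pair is labelled +1 when both samples share a class and -1 otherwise. A plain perceptron stands in for the online classifier; it is not the adapted support vector machine of the paper. The names `make_pairs`, `train_perceptron`, and `predict`, as well as the toy data, are hypothetical.

```python
import itertools


def make_pairs(X, y):
    """Turn labelled samples into pair-wise binary examples (assumed encoding).

    Features: element-wise |x_i - x_j|. Label: +1 if same class, else -1.
    """
    feats, labels = [], []
    for i, j in itertools.combinations(range(len(X)), 2):
        feats.append([abs(a - b) for a, b in zip(X[i], X[j])])
        labels.append(1 if y[i] == y[j] else -1)
    return feats, labels


def score(w, b, d):
    """Linear score of one pair representation d."""
    return sum(wi * di for wi, di in zip(w, d)) + b


def train_perceptron(feats, labels, max_epochs=1000):
    """Online perceptron (a stand-in for the paper's online classifier).

    Processes one pair at a time and stops after a mistake-free epoch.
    """
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for d, t in zip(feats, labels):
            if t * score(w, b, d) <= 0:
                w = [wi + t * di for wi, di in zip(w, d)]
                b += t
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def predict(w, b, X, y, x_new):
    """Assign x_new to the class whose samples it is, on average, most 'same' as."""
    best_class, best_score = None, float("-inf")
    for c in sorted(set(y)):
        members = [x for x, label in zip(X, y) if label == c]
        mean = sum(
            score(w, b, [abs(a - v) for a, v in zip(x_new, x)]) for x in members
        ) / len(members)
        if mean > best_score:
            best_class, best_score = c, mean
    return best_class


# Toy data (invented for illustration): 6 samples, 2 features, 2 classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]
y = [0, 0, 0, 1, 1, 1]

feats, labels = make_pairs(X, y)       # 6 samples give 15 pair examples
w, b = train_perceptron(feats, labels)
print(predict(w, b, X, y, [0.1, 0.1]), predict(w, b, X, y, [5.0, 5.0]))
```

One reason such a transformation may suit small-sample problems is visible in the toy data: n samples yield n(n-1)/2 pair examples, so 6 samples already produce 15 training pairs.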
