Learning prototypes for multiple instance learning
Multiple instance learning (MIL) is a weakly supervised learning paradigm that operates on labeled bags of instances. Prototypical networks are a popular embedding approach in MIL. They avoid common problems that other MIL approaches must contend with, including high dimensionality, loss of instance-level information, and computational complexity, and they demonstrate competitive classification performance. This work proposes a simple model that provides a permutation-invariant prototype generator for a given MIL data set. We aim to identify prototypes in the feature space that map collections of instances (i.e. bags) into a distance-based feature space, while simultaneously learning a linear classifier for MIL; a minimal sketch of this architecture is given below. A further advantage of prototypes is that they are widely used in machine learning to facilitate interpretability. Our experiments on classical MIL benchmark data sets demonstrate that the proposed framework is an accurate and efficient classifier compared to existing approaches.
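The abstract describes the architecture only at a high level. The following is a minimal sketch, not the authors' implementation, of how such a model might be written in PyTorch. Everything in it is an assumption for illustration: the class name `PrototypeMIL`, the choice of minimum Euclidean distance as the permutation-invariant pooling over instances, and all dimensions.

```python
import torch
import torch.nn as nn

class PrototypeMIL(nn.Module):
    """Hypothetical sketch: learnable prototypes embed each bag into a
    distance feature space; a linear layer classifies the embedding."""

    def __init__(self, n_prototypes: int, n_features: int, n_classes: int):
        super().__init__()
        # Prototypes live in the instance feature space and are learned
        # jointly with the linear classifier.
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, n_features))
        self.classifier = nn.Linear(n_prototypes, n_classes)

    def forward(self, bag: torch.Tensor) -> torch.Tensor:
        # bag: (n_instances, n_features); instance order does not matter.
        # Pairwise Euclidean distances: (n_instances, n_prototypes).
        dists = torch.cdist(bag, self.prototypes)
        # Min over instances is permutation invariant: each embedding
        # coordinate is the distance from one prototype to its closest
        # instance in the bag.
        embedding, _ = dists.min(dim=0)
        return self.classifier(embedding)

# Usage: classify one bag of 12 five-dimensional instances.
model = PrototypeMIL(n_prototypes=8, n_features=5, n_classes=2)
logits = model(torch.randn(12, 5))
```

Taking the minimum over instances makes the bag embedding independent of instance order and bag size, which is the permutation invariance referred to above; any other symmetric reduction (e.g. the mean distance) would be invariant as well.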