Domain adaptation on graphs by learning graph topologies: theoretical analysis and an algorithm

Traditional machine learning algorithms assume that the training and test data follow the same distribution, while this assumption does not necessarily hold in real applications. Domain adaptation methods take into account the deviations in the data distribution. In this work, we study the problem of domain adaptation on graphs. We consider a source graph and a target graph constructed with samples drawn from data manifolds. We study the problem of estimating the unknown class labels on the target graph using the label information on the source graph and the similarity between the two graphs. We particularly focus on a setting where the target label function is learned such that its spectrum is similar to that of the source label function. We first propose a theoretical analysis of domain adaptation on graphs and present performance bounds that characterize the target classification error in terms of the properties of the graphs and the data manifolds. We show that the classification performance improves as the topologies of the graphs become more balanced, i.e. as the numbers of neighbors of different graph nodes become more proportionate and weak edges with small weights are avoided. Our results also suggest that graph edges between too distant data samples should be avoided for good generalization performance. We then propose a graph domain adaptation algorithm inspired by our theoretical findings, which estimates the label functions while simultaneously learning the source and target graph topologies. The joint graph learning and label estimation problem is formulated through an objective function relying on our performance bounds, which is minimized with an alternating optimization scheme. Experiments on synthetic and real data sets suggest that the proposed method outperforms baseline approaches.
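To make the spectrum-transfer idea concrete, the following is a minimal Python sketch, not the paper's full algorithm: it builds k-NN graphs for the two domains, projects the source label function onto the leading eigenvectors of the source graph Laplacian, and reconstructs a target label estimate from the same coefficients in the target eigenbasis. It assumes that corresponding eigenvectors of the two Laplacians are consistently ordered and signed, and it omits the joint topology learning and alternating optimization described above; all function names and parameter choices here are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def knn_graph_laplacian(X, k=5, sigma=1.0):
    """Build a symmetric k-NN graph with Gaussian edge weights and
    return its combinatorial Laplacian L = D - W."""
    D2 = cdist(X, X, 'sqeuclidean')
    W = np.zeros_like(D2)
    for i in range(len(X)):
        idx = np.argsort(D2[i])[1:k + 1]          # k nearest, skipping self
        W[i, idx] = np.exp(-D2[i, idx] / (2 * sigma**2))
    W = np.maximum(W, W.T)                        # symmetrize
    return np.diag(W.sum(axis=1)) - W

def transfer_labels(Xs, ys, Xt, n_eig=10, k=5):
    """Estimate target labels by copying the source label function's
    spectrum (Laplacian eigenvector coefficients) onto the target graph.
    Caveat: eigenvector sign/ordering alignment is glossed over here."""
    Ls, Lt = knn_graph_laplacian(Xs, k), knn_graph_laplacian(Xt, k)
    _, Us = eigh(Ls)                              # source eigenbasis (ascending)
    _, Ut = eigh(Lt)                              # target eigenbasis (ascending)
    alpha = Us[:, :n_eig].T @ ys                  # spectrum of source label function
    ft = Ut[:, :n_eig] @ alpha                    # reconstruct on target graph
    return np.sign(ft)                            # binary class decision

# Toy example: two Gaussian blobs per domain, target shifted from source.
rng = np.random.default_rng(0)
Xs = np.vstack([rng.normal(0, .3, (20, 2)), rng.normal(2, .3, (20, 2))])
ys = np.hstack([-np.ones(20), np.ones(20)])
Xt = Xs + 0.5                                     # shifted target domain
print(transfer_labels(Xs, ys, Xt))
```

Restricting the reconstruction to the first few eigenvectors keeps the estimated target label function smooth on the target graph, which is the same smoothness intuition that the performance bounds above formalize.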
