Convolutional auto-encoders for sentence representation generation

In this study, we propose an alternative approach to the sentence modeling problem. Answer selection is made difficult by the hardness of choosing among candidate answers, by semantically related questions, and by the lack of syntactic closeness among the answers. Deep learning has recently achieved pivotal successes in semantic analysis, machine translation, and text summarization. The essence of this work, inspired by the human orthographic processing mechanism, is to learn the basic features of the language without concern for input or output size, using multiple convolution filters over pre-rendered two-dimensional (2D) representations of sentences. To this end, the semantic relations in the sentence structure are first learned by convolutional variational auto-encoders, and the question and answer spaces learned by these auto-encoders are then linked with the proposed intermediate models. We benchmark five variations of our proposed model, which is based on a Variational Auto-Encoder with multiple latent spaces and achieves lower error rates than the baseline model, a Convolutional LSTM.
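As a rough illustration of the pipeline described above, the sketch below builds a single convolutional variational auto-encoder over 2D-rendered sentences, assuming TensorFlow/Keras. The rendering size (32×128), filter counts, latent dimension, and the placeholder array `question_images` are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal sketch (not the authors' exact architecture): a convolutional VAE
# over 2D sentence renderings, assuming TensorFlow/Keras. Shapes and sizes
# below are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

IMG_H, IMG_W, LATENT_DIM = 32, 128, 64  # hypothetical 2D sentence rendering size

class Sampling(layers.Layer):
    """Reparameterization trick: draw z from N(mu, sigma^2)."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        eps = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * eps

# Encoder: stacked convolutions compress the rendered sentence to a latent code.
enc_in = layers.Input(shape=(IMG_H, IMG_W, 1))
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(enc_in)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Flatten()(x)
z_mean = layers.Dense(LATENT_DIM)(x)
z_log_var = layers.Dense(LATENT_DIM)(x)
z = Sampling()([z_mean, z_log_var])
encoder = Model(enc_in, [z_mean, z_log_var, z], name="encoder")

# Decoder: transposed convolutions reconstruct the 2D sentence rendering.
dec_in = layers.Input(shape=(LATENT_DIM,))
x = layers.Dense((IMG_H // 4) * (IMG_W // 4) * 64, activation="relu")(dec_in)
x = layers.Reshape((IMG_H // 4, IMG_W // 4, 64))(x)
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
dec_out = layers.Conv2DTranspose(1, 3, padding="same", activation="sigmoid")(x)
decoder = Model(dec_in, dec_out, name="decoder")

class ConvVAE(Model):
    """Ties encoder and decoder; adds the KL term to the reconstruction loss."""
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder, self.decoder = encoder, decoder

    def call(self, x):
        z_mean, z_log_var, z = self.encoder(x)
        kl = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1))
        self.add_loss(kl)
        return self.decoder(z)

vae = ConvVAE(encoder, decoder)
vae.compile(optimizer="adam", loss="binary_crossentropy")
# vae.fit(question_images, question_images, epochs=10)  # hypothetical training data
```

Under this reading, one such VAE would be trained per space (questions and answers), and the proposed intermediate model would map codes from the question latent space to the answer latent space, for example with a small fully connected network; the exact form of that link is specific to the paper and is not reproduced here.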
