A method for ontology-based semantic relatedness measurement

Many methods, built on different approaches, assess semantic similarity and relatedness. They are used in application areas including web service discovery, invocation, and composition; word sense disambiguation; information retrieval; ontology alignment and merging; document clustering; and short answer grading. These methods can be categorized as path-based, information content-based, feature-based, geometric model-based, and hybrid approaches, and they compute similarity and relatedness from resources such as concept hierarchies, conceptual graphs, and corpora. With the rise of the semantic web, ontologies have attracted the attention of researchers, and ontologies represented in the Web Ontology Language (OWL) are likewise valuable resources for similarity and relatedness measurement. The method proposed in this paper interprets some OWL constructs to assess semantic relatedness, with the aim of exploiting the rich expressive power of OWL to obtain better measurement results. The method was validated against human judgments: the correlation between human judgments and automatically computed semantic relatedness values was 0.685, significant at the 0.01 level.
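The validation step described above, comparing automatically computed relatedness values with human judgments, can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient was used, so Pearson's r is assumed here. The function name `pearson` and the two score lists are hypothetical toy values chosen for illustration, not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: human ratings for five word pairs (assumed 0-4 scale)
# and the relatedness values a measure might assign to the same pairs.
human = [3.9, 3.1, 0.4, 1.7, 2.8]
computed = [0.95, 0.60, 0.10, 0.35, 0.80]

r = pearson(human, computed)
print(f"correlation with human judgments: {r:.3f}")
```

A value near 1 would indicate that the measure ranks and spaces word pairs much as human raters do. A significance test, such as the one behind the 0.01 level reported above, would be a separate step not shown in this sketch.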
