A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 3 Issue 2
Apr. 2016

IEEE/CAA Journal of Automatica Sinica

Qingpeng Zhang and David Haglin, "Semantic Similarity between Ontologies at Different Scales," IEEE/CAA J. of Autom. Sinica, vol. 3, no. 2, pp. 132-140, 2016.

Semantic Similarity between Ontologies at Different Scales


This work was supported by the National Natural Science Foundation of China (71402157), the Natural Science Foundation of Guangdong Province, China (2014A030313753), CityU Start-up (7200399), and the Center for Adaptive Supercomputing Software - MultiThreaded Architectures (CASS-MT) at the U.S. Department of Energy's Pacific Northwest National Laboratory. Pacific Northwest National Laboratory is operated by Battelle Memorial Institute (Contract DE-ACO6-76RL01830).

  • In the past decade, existing and new knowledge and datasets have been encoded in different ontologies for semantic web and biomedical research. These ontologies are often very large in terms of the number of concepts and relationships, which makes analyzing them and the knowledge graphs they represent computationally expensive and time-consuming. Because the ontologies of various semantic web and biomedical applications usually show explicit hierarchical structures, it is interesting to explore the trade-off between ontological scale and the preservation and precision of results when analyzing ontologies. This paper presents the first effort to examine this idea by studying the relationship between scaling biomedical ontologies to different levels and the resulting semantic similarity values. We evaluate the semantic similarity between three gene ontology slims (plant, yeast, and candida, of which the latter two belong to the same kingdom, fungi) using four popular measures commonly applied to biomedical ontologies (Resnik, Lin, Jiang-Conrath, and SimRel). The results demonstrate that, with a proper selection of scaling levels and similarity measures, the size of ontologies can be reduced significantly without losing substantial detail. In particular, Jiang-Conrath and Lin perform more reliably and stably than the other two measures in this experiment, as shown by 1) consistently indicating that yeast and candida are more similar to each other than either is to plant at different scales, and 2) small deviations in the similarity values after excluding a majority of nodes from several lower scales. This study provides a deeper understanding of the application of semantic similarity to biomedical ontologies, and sheds light on how to choose appropriate semantic similarity measures for biomedical engineering.
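To make the four measures named in the abstract concrete, the following is a minimal Python sketch of their commonly cited term-level definitions, all built on information content IC(c) = -log p(c). It is an illustration under stated assumptions, not the authors' implementation: the toy ontology, the term names, and the probabilities are hypothetical, the conversion of the Jiang-Conrath distance to a similarity via 1/(1+d) is only one of several conventions, and the abstract does not say how term-level scores are aggregated into a similarity between whole ontologies.

```python
import math

# Hypothetical toy ontology (not from the paper): term -> set of parent terms.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "transport": {"root"},
    "lipid_metabolism": {"metabolism"},
    "sugar_metabolism": {"metabolism"},
}

# Hypothetical annotation probabilities p(c); a term is never more
# probable than any of its ancestors.
PROB = {
    "root": 1.0,
    "metabolism": 0.5,
    "transport": 0.5,
    "lipid_metabolism": 0.25,
    "sugar_metabolism": 0.25,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic(term):
    """Information content: rarer terms are more informative."""
    return -math.log(PROB[term])


def common_ancestors(a, b):
    return ancestors(a) & ancestors(b)


def resnik(a, b):
    """IC of the most informative common ancestor (MICA)."""
    return max(ic(c) for c in common_ancestors(a, b))


def lin(a, b):
    """Resnik score normalized by the IC of the two terms."""
    denom = ic(a) + ic(b)
    return 2 * resnik(a, b) / denom if denom > 0 else 1.0


def jiang_conrath_distance(a, b):
    """Jiang-Conrath is natively a distance: 0 means identical."""
    return ic(a) + ic(b) - 2 * resnik(a, b)


def jiang_conrath_similarity(a, b):
    """One common conversion of the distance to a similarity (an assumption)."""
    return 1 / (1 + jiang_conrath_distance(a, b))


def sim_rel(a, b):
    """Lin-style score weighted by (1 - p(c)), maximized over common ancestors."""
    denom = ic(a) + ic(b)
    if denom == 0:
        return 0.0
    return max(2 * ic(c) / denom * (1 - PROB[c]) for c in common_ancestors(a, b))


if __name__ == "__main__":
    pair = ("lipid_metabolism", "sugar_metabolism")
    for measure in (resnik, lin, jiang_conrath_similarity, sim_rel):
        print(measure.__name__, measure(*pair))
```

In this sketch Resnik is unbounded and depends only on the shared ancestor, whereas Lin and Jiang-Conrath also use the IC of the two terms being compared. That difference gives a sense of why measures can respond differently when nodes at lower levels of an ontology are removed.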



