Health Query Expansion based on Graph Matching between DBpedia and UMLS

Sarah Dahir, Abderrahim El Qadi, Hamid Bennis


Information Retrieval (IR) in the medical domain is a challenging task for several reasons. Short health queries tend to lack information about the user's intent, and the target corpus may not contain enough information for relevance feedback. Even when users obtain documents relevant to their queries, they may find the technical terms difficult to understand. To address these issues, we propose an approach to health query reformulation based on graph matching between two external linked data sources: DBpedia and the Unified Medical Language System (UMLS). DBpedia has broad topic coverage and less noise than Wikipedia articles, while UMLS is specific to the medical domain. We also use degree centrality to measure graph connectivity and to select the most effective candidate terms for query expansion. Experimental results on the MEDLINE collection, using Okapi BM25 as the retrieval model, show that our approach outperforms related methods and that the two sources achieve good retrieval results. They help diversify the retrieved documents and improve recall.
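
As an illustration of how degree centrality can rank candidate expansion terms, the following is a minimal Python sketch and not the authors' implementation. The toy graph, the term names, the function names degree_centrality and select_expansion_terms, and the cutoff k are all hypothetical; in the approach described above the graph would come from matching DBpedia and UMLS concepts. Degree centrality is computed here in its usual normalized form, the number of neighbors of a node divided by (n - 1) for a graph with n nodes.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as edge pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def select_expansion_terms(edges, query_terms, k=2):
    """Return the k most central terms that are not already in the query."""
    scores = degree_centrality(edges)
    candidates = [(t, s) for t, s in scores.items() if t not in query_terms]
    # Highest centrality first; ties are broken alphabetically for determinism.
    candidates.sort(key=lambda ts: (-ts[1], ts[0]))
    return [t for t, _ in candidates[:k]]


# Hypothetical concept graph for the query "heart attack" (illustrative data only).
edges = [
    ("heart attack", "myocardial infarction"),
    ("heart attack", "chest pain"),
    ("myocardial infarction", "chest pain"),
    ("myocardial infarction", "coronary artery disease"),
    ("coronary artery disease", "atherosclerosis"),
]

expanded = select_expansion_terms(edges, {"heart attack"}, k=2)
print(expanded)
```

In this toy graph the best-connected concept is ranked first, so terms linked to many other matched concepts are preferred over peripheral ones. How the abstract's method weights or filters candidates beyond this is not specified here.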


Information Retrieval; Search Result Diversification; Query Reformulation; Linked data; Graph matching; Degree centrality

International Journal of Online and Biomedical Engineering (iJOE) – eISSN: 2626-8493
Creative Commons License