Author List: Wei, Chih-Ping; Hu, Paul Jen-Hwa; Tai, Chia-Hung; Huang, Chun-Neng; Yang, Chin-Sheng
Journal of Management Information Systems, 2007, Volume 24, Issue 3, Page 269-295.
Word mismatch represents a fundamental information retrieval challenge that has become increasingly important as electronic document repositories (e.g., Web resources, digital libraries) grow in number and sheer volume. In general, word mismatch refers to the phenomenon in which a concept is described by different terms in user queries and in source documents. Query expansion represents a promising avenue to address such problems. Previous research predominantly approaches query expansion on the basis of global or local analysis. However, these approaches emphasize a global perspective rather than taking a topic-specific view of term associations. As a consequence, their effectiveness can be severely constrained when the document corpus spans a diverse set of topics. In this study, we propose a topic-based approach for query expansion and develop and empirically evaluate two novel methods (nonfuzzy and fuzzy topic-based query expansion) to address word mismatch problems. According to our evaluation results, the proposed topic-based approach is more effective than a benchmark global analysis method, particularly when user queries consist of multiple query terms.
Keywords: document clustering; fuzzy clustering; information retrieval; query expansion; text mining; topic-based query expansion; word mismatch
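
To make the idea in the abstract concrete, the following is a minimal sketch of query expansion restricted to a single topic. It is not the authors' method: the abstract does not specify how topics are formed or how term associations are scored, so this sketch assumes documents are already grouped by topic, measures association by simple within-topic document co-occurrence, and routes a query to the topic that covers the most query terms. The function names, the sample documents, and the parameter `k` are all hypothetical.

```python
from collections import Counter, defaultdict


def build_topic_cooccurrence(docs_by_topic):
    """For each topic, count how often two distinct terms share a document."""
    cooc = {}
    for topic, docs in docs_by_topic.items():
        pairs = defaultdict(Counter)
        for doc in docs:
            terms = set(doc.lower().split())
            for t in terms:
                for u in terms:
                    if t != u:
                        pairs[t][u] += 1
        cooc[topic] = pairs
    return cooc


def pick_topic(query_terms, cooc):
    """Pick the topic covering the most query terms (first topic wins ties)."""
    return max(cooc, key=lambda topic: sum(1 for q in query_terms if q in cooc[topic]))


def expand_query(query, cooc, k=2):
    """Append the k terms most associated with the query inside its topic."""
    query_terms = query.lower().split()
    topic = pick_topic(query_terms, cooc)
    scores = Counter()
    for q in query_terms:
        for term, count in cooc[topic].get(q, {}).items():
            if term not in query_terms:
                scores[term] += count
    # Sort by descending score, then alphabetically, so results are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return query_terms + [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, already grouped by topic (an assumption of this sketch).
SAMPLE_DOCS = {
    "databases": [
        "relational database query language",
        "database query optimization",
        "query language semantics",
    ],
    "retrieval": [
        "document retrieval clustering",
        "document clustering text",
        "text retrieval word mismatch",
    ],
}

cooc = build_topic_cooccurrence(SAMPLE_DOCS)
expanded = expand_query("database query", cooc, k=2)
print(expanded)  # ['database', 'query', 'language', 'optimization']
```

Because associations are counted per topic, a query routed to the "databases" group is never expanded with terms that co-occur only in the "retrieval" group, which is the contrast with global analysis that the abstract draws. A fuzzy variant would presumably let a document contribute to several topics with graded membership; that is not sketched here.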

List of Topics

#81 0.174 applications application reasoning approach cases support hypertext case-based prototype problems consistency developed benchmarking described efficient practical address activity demonstrate effective
#231 0.141 information management data processing systems corporate article communications organization control distributed department capacity departments major user hardware cost applications expansion
#299 0.111 office document documents retrieval automation word concept clustering text based automated created individual functions major approach operations prototype identify report
#281 0.104 database language query databases natural data queries relational processing paper using request views access use matching automated semantic based languages
#289 0.101 qualitative methods quantitative approaches approach selection analysis criteria used mixed methodological aspects recent selecting combining known conclusions included article appropriateness
#291 0.079 local global link complex view links particularly need thought number supports efforts difficult previously linked achieving simple poor individual rise
#157 0.075 evaluation effectiveness assessment evaluating paper objectives terms process assessing criteria evaluations methodology provides impact literature potential important evaluated identifying multiple
#220 0.071 research study different context findings types prior results focused studies empirical examine work previous little knowledge sources implications specifically provide
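
For readers who want to process the topic list programmatically, here is a small parsing sketch. The source does not label the columns, so this assumes each line consists of a topic identifier prefixed with `#`, a numeric weight, and a whitespace-separated list of terms. The `Topic` type and `parse_topic_line` function are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class Topic(NamedTuple):
    topic_id: int      # number after the leading '#'
    weight: float      # second column; assumed to be a topic weight
    terms: List[str]   # remaining whitespace-separated terms, in listed order


def parse_topic_line(line: str) -> Topic:
    """Split one topic line into its identifier, weight, and terms."""
    ident, weight, *terms = line.split()
    return Topic(int(ident.lstrip("#")), float(weight), terms)


topic = parse_topic_line("#299 0.111 office document documents retrieval")
print(topic.topic_id, topic.weight, topic.terms[:2])
```

Parsed this way, the lines can be sorted by weight or filtered by term without further string handling.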