Author List: Mookerjee, Vijay S.; Mannino, Michael V.
Information Systems Research, 1997, Volume 8, Issue 1, Page 51.
Retrieval of a set of cases similar to a new case is a problem common to a number of machine learning approaches such as nearest neighbor algorithms, conceptual clustering, and case-based reasoning. A limitation of most case retrieval algorithms is their lack of attention to information acquisition costs. When information acquisition costs are considered, cost reduction is hampered by the practice of separating concept formation and retrieval strategy formation. To demonstrate this claim, we examine two approaches. The first approach separates concept formation and retrieval strategy formation. To form a retrieval strategy in this approach, we develop the CR<sub>lc</sub> (case retrieval loss criterion) algorithm, which selects attributes in ascending order of expected loss. The second approach jointly optimizes concept formation and retrieval strategy formation using a cost-based variant of the ID3 algorithm (ID3<sub>c</sub>). ID3<sub>c</sub> builds a decision tree wherein attributes are selected using entropy reduction per unit information acquisition cost. Experiments with four data sets are described in which algorithm, attribute cost coefficient of variation, and matching threshold are factors. The experimental results demonstrate that (i) jointly optimizing concept formation and retrieval strategy formation has substantial benefits, and (ii) using cost considerations can significantly reduce information acquisition costs, even if concept formation and retrieval strategy formation are separated.
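The selection rule described for ID3<sub>c</sub>, entropy reduction per unit information acquisition cost, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`entropy`, `information_gain`, `select_attribute`), the toy cases, and the cost figures are all hypothetical, and only the choice of a single split attribute is shown, not the construction of the full tree.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(cases, attribute, label_key="label"):
    """Entropy reduction obtained by partitioning the cases on one attribute."""
    base = entropy([case[label_key] for case in cases])
    partitions = {}
    for case in cases:
        partitions.setdefault(case[attribute], []).append(case[label_key])
    remainder = sum(
        (len(part) / len(cases)) * entropy(part) for part in partitions.values()
    )
    return base - remainder


def select_attribute(cases, costs, label_key="label"):
    """Pick the attribute with the largest entropy reduction per unit cost.

    `costs` maps attribute name -> acquisition cost (assumed positive).
    """
    return max(
        costs,
        key=lambda attr: information_gain(cases, attr, label_key) / costs[attr],
    )


# Hypothetical toy case base: "lab_test" separates the classes perfectly
# but is assumed to be ten times as expensive to acquire as "symptom".
cases = [
    {"symptom": "yes", "lab_test": "pos", "label": "sick"},
    {"symptom": "yes", "lab_test": "pos", "label": "sick"},
    {"symptom": "no", "lab_test": "neg", "label": "healthy"},
    {"symptom": "yes", "lab_test": "neg", "label": "healthy"},
]
costs = {"symptom": 1.0, "lab_test": 10.0}

chosen_with_costs = select_attribute(cases, costs)
chosen_equal_costs = select_attribute(cases, {"symptom": 1.0, "lab_test": 1.0})
```

With the assumed costs, the cheaper but less informative `symptom` attribute is chosen first; with equal costs the rule reduces to ordinary ID3 selection and picks `lab_test`.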
Keywords: Case Based Systems; Case Retrieval Algorithms; Cost Reduction; Information Costs; Joint Versus Separate Optimization

List of Topics

#97 0.220 set approach algorithm optimal used develop results use simulation experiments algorithms demonstrate proposed optimization present analytical distribution selection number existing
#151 0.167 costs cost switching reduce transaction increase benefits time economic production transactions savings reduction impact services reduced affect expected optimal associated
#299 0.134 office document documents retrieval automation word concept clustering text based automated created individual functions major approach operations prototype identify report
#10 0.124 strategies strategy based effort paper different findings approach suggest useful choice specific attributes explain effective affect employ particular online control
#147 0.120 process problem method technique experts using formation identification implicit analysis common proactive input improvements identify traditional stages identifying explicit setting
#193 0.061 time use size second appears form larger benefits combined studies reasons selected underlying appear various significantly result include make attention