Author List: Zhao, Huimin; Soofi, Ehsan S.
Journal of Management Information Systems, 2006, Volume 22, Issue 4, Page 305-336.
Identifying attribute correspondences across heterogeneous databases is a critical and time-consuming step in integrating the databases. Past research has applied correlation analysis techniques to explore correspondences between attributes. These techniques, however, are appropriate only for numeric attributes that are linearly related. This paper proposes an information-theoretic approach to exploring correspondences between attributes in heterogeneous databases. The proposed approach is applicable to character attributes as well as to numeric attributes, regardless of whether they are linearly related. It overcomes some serious shortcomings of previous approaches based on correlation analysis and has much broader applicability. The proposed procedure samples both matching and nonmatching pairs of records from the databases under consideration, applies matching functions to compare pairs of attributes, and then uses mutual information to measure the dependency between a matching function as applied to a pair of attributes and the class (i.e., matching or nonmatching) of a pair of records. A high mutual information index implies a potential attribute correspondence, which is presented to the analyst for further evaluation. The paper also presents some empirical results demonstrating the utility of the proposed approach.
Keywords: attribute correspondence; attribute matching; composite information systems; database interoperability; heterogeneous databases; information theory; interorganizational systems; mutual information
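
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exact-equality matching function, the helper names (`mutual_information`, `exact_match`, `score_attribute_pairs`), and the four toy record pairs are all assumptions made for illustration. The sketch takes labeled matching and nonmatching record pairs, applies a matching function to each candidate attribute pair, and computes the mutual information (in bits) between the matching outcome and the record-pair class.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # Adds p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def exact_match(a, b):
    """Hypothetical matching function: 1 if the two values are equal, else 0."""
    return int(a == b)


def score_attribute_pairs(record_pairs, labels, attrs_a, attrs_b, match=exact_match):
    """Score every (attribute of A, attribute of B) pair by mutual information
    between the matching-function outcome and the record-pair class."""
    scores = {}
    for a in attrs_a:
        for b in attrs_b:
            outcomes = [match(rec_a[a], rec_b[b]) for rec_a, rec_b in record_pairs]
            scores[(a, b)] = mutual_information(outcomes, labels)
    return scores


# Toy sample (invented data): two matching and two nonmatching record pairs.
record_pairs = [
    ({"name": "ann", "zip": "10001"}, {"cust": "ann", "postal": "10001"}),
    ({"name": "bob", "zip": "20002"}, {"cust": "bob", "postal": "20002"}),
    ({"name": "ann", "zip": "10001"}, {"cust": "bob", "postal": "20002"}),
    ({"name": "bob", "zip": "20002"}, {"cust": "cat", "postal": "10001"}),
]
labels = [1, 1, 0, 0]  # 1 = matching pair of records, 0 = nonmatching

scores = score_attribute_pairs(record_pairs, labels, ["name", "zip"], ["cust", "postal"])

# Rank candidate correspondences, highest mutual information first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for (attr_a, attr_b), mi in ranked:
    print(f"{attr_a} ~ {attr_b}: {mi:.3f} bits")
```

In this toy sample, the attribute pairs whose matching outcome tracks the record-pair class receive the highest scores and would be the ones shown to the analyst. Because the score depends only on discrete outcomes of the matching function, the same code handles character and numeric attributes alike; a real application would likely substitute approximate matching functions (for example, string similarity thresholds) for exact equality.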

List of Topics

#55 0.280 attributes credibility wikis tools wiki potential consequences gis potentially expectancy shaping exploring related anonymous attribute employing life comment comments 2.0
#281 0.185 database language query databases natural data queries relational processing paper using request views access use matching automated semantic based languages
#44 0.143 approach analysis application approaches new used paper methodology simulation traditional techniques systems process based using proposed method present provides various
#292 0.132 information research literature systems framework review paper theoretical based potential future implications practice discussed current concept propositions findings provided extant
#141 0.067 information approach article mis presents doctoral dissertations analysis verification management requirements systems list needs including user requirement systematic observation structured
#117 0.064 standards interorganizational ios standardization standard systems compatibility effects cooperation firms industry benefits open interoperability key heterogeneous vertical propose vendors collective