Author List: Padmanabhan, Balaji; Zheng, Zhiqiang (Eric); Kimbrough, Steven O.
MIS Quarterly, 2006, Volume 30, Issue 2, Page 247-267.
Due to the vast amount of user data tracked online, the use of data-based analytical methods is becoming increasingly common for e-businesses. Recently, the term analytical eCRM has been used to refer to the use of such methods in the online world. A characteristic of most of the current approaches in eCRM is that they use data collected about users' activities at a single site only and, as we argue in this paper, this can present an incomplete picture of user activity. However, it is possible to obtain a complete picture of user activity from across-site data on users. Such data is expensive but can be obtained by firms directly from their users or from market data vendors. A critical question is whether such data is worth obtaining, an issue that little prior research has addressed. In this paper, using a data mining approach, we present an empirical analysis of the modeling benefits that can be obtained by having complete information. Our results suggest that the gains that can be obtained from complete data range from a few percentage points to 50 percent, depending on the problem for which the data is used and the performance metrics considered. Qualitatively, we find that variables related to customer loyalty and browsing intensity are particularly important, and these variables are difficult to derive from data collected at a single site. More importantly, we find that a firm has to collect a reasonably large amount of complete data before any benefits can be reaped, and we caution against acquiring too little data.
Keywords: Data mining; eCRM; incomplete data; information value
Algorithm:

List of Topics

#6 0.227 data used develop multiple approaches collection based research classes aspect single literature profiles means crowd collected trend accuracy databases accurate
#130 0.172 online users active paper using increasingly informational user data internet overall little various understanding empirical despite lead cascades help availability
#86 0.098 methods information systems approach using method requirements used use developed effective develop determining research determine assessment useful series critical existing
#215 0.092 data classification statistical regression mining models neural methods using analysis techniques performance predictive networks accuracy method variables prediction problem measure
#219 0.090 response responses different survey questions results research activities respond benefits certain leads two-stage interactions study address respondents question directly categories
#114 0.075 performance firm measures metrics value relationship firms results objective relationships firm's organizational traffic measure market study improve accounting measuring aggregate
#33 0.053 web site sites content usability page status pages metrics browsing design use web-based guidelines results implications portal loyalty navigability addition
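
Each topic line above appears to follow the layout `#<topic id> <weight> <terms...>`. The following is a minimal sketch, assuming that layout, of how such lines could be parsed for further processing. The names `Topic` and `parse_topic_line` are hypothetical and do not come from the source, and the sample lines are truncated to their first few terms for brevity.

```python
from typing import List, NamedTuple


class Topic(NamedTuple):
    topic_id: int      # number after the leading '#'
    weight: float      # second field on the line
    terms: List[str]   # remaining whitespace-separated terms


def parse_topic_line(line: str) -> Topic:
    """Parse one line of the assumed form '#<id> <weight> <term> <term> ...'."""
    head, weight, *terms = line.split()
    if not head.startswith("#"):
        raise ValueError(f"expected a '#<id>' prefix, got {head!r}")
    return Topic(int(head[1:]), float(weight), terms)


# Sample lines copied from the list above, truncated to five terms each.
lines = [
    "#6 0.227 data used develop multiple approaches",
    "#215 0.092 data classification statistical regression mining",
    "#33 0.053 web site sites content usability",
]

topics = [parse_topic_line(line) for line in lines]
heaviest = max(topics, key=lambda t: t.weight)
print(heaviest.topic_id, heaviest.weight)  # prints: 6 0.227
```

Keeping the terms as a list preserves their original order, which matters if the order reflects term rank within each topic (an assumption; the source does not say).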