Author List: Park, Sung-Hyuk; Huh, Soon-Young; Oh, Wonseok; Han, Sang Pil
MIS Quarterly, 2012, Volume 36, Issue 4, Page 1217-1237.
Drawing from the social and relational perspectives, this study offers an innovative conceptualization and operational approach regarding the validation of self-reported customer demographic data, which has become an essential corporate asset for harnessing business intelligence. Specifically, based on social network and homophily paradigms in which individuals have a natural tendency to associate and interact frequently with others with similar characteristics, we constructed a relational inference model to determine the accuracy of self-administered consumer profiles. In addition, to further enhance the reliability of our model's prediction capability, we employed the entropy mechanism that minimizes potential biases that may arise from a simple probabilistic approach. To empirically validate the accuracy of our inference framework, we obtained and analyzed over 20 million actual call transactions supplied by one of the largest global telecommunication service providers. The results suggest that our social network-based inference model consistently outperforms other competing mechanisms (e.g., weighted average and simple relational classifier) regardless of the criteria choice (e.g., number of call receivers, call duration, and call frequency), with an accuracy rate of approximately 93 percent. Finally, to confirm the generalizability of our findings, we conducted simulation experiments to validate the robustness of the results in response to variations in parameter values and increases in potential noise in the data. We discuss several implications related to business intelligence for both research and practice, and offer new directions for future studies.
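The abstract does not spell out the relational inference model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a customer's likely attribute value (here a hypothetical age group) can be inferred from the self-reported values of the people they call, weighted by call frequency, and that Shannon entropy of that distribution can gate how much the inference is trusted. The function names, the `max_entropy` threshold, and the sample data are all illustrative assumptions.

```python
import math
from collections import defaultdict


def neighbor_distribution(calls, profiles, customer):
    """Call-frequency-weighted distribution of the self-reported values
    of the people `customer` calls (outgoing calls only, an assumption)."""
    weights = defaultdict(float)
    for caller, receiver, n_calls in calls:
        if caller == customer and receiver in profiles:
            weights[profiles[receiver]] += n_calls
    total = sum(weights.values())
    if total == 0:
        return {}
    return {value: w / total for value, w in weights.items()}


def normalized_entropy(dist):
    """Shannon entropy scaled to [0, 1] by the number of observed values.
    0 means all contacts agree; 1 means they are evenly split."""
    if len(dist) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in dist.values() if p > 0)
    return h / math.log(len(dist))


def check_profile(calls, profiles, customer, max_entropy=0.6):
    """Return (inferred_value, entropy, suspicious), or None when the
    customer has no usable contacts. `max_entropy` is a hypothetical
    threshold: a mismatch is flagged only when contacts largely agree."""
    dist = neighbor_distribution(calls, profiles, customer)
    if not dist:
        return None
    h = normalized_entropy(dist)
    inferred = max(dist, key=dist.get)
    suspicious = h <= max_entropy and inferred != profiles[customer]
    return inferred, h, suspicious


# Made-up sample data: (caller, receiver, number of calls).
profiles = {"a": "20s", "b": "20s", "c": "20s", "d": "50s", "e": "50s"}
calls = [("e", "a", 12), ("e", "b", 7), ("e", "c", 5), ("e", "d", 1)]

print(check_profile(calls, profiles, "e"))
```

In this toy data, customer `e` reports "50s" but calls people in their "20s" almost exclusively, so the low-entropy neighbor distribution contradicts the self-report and the profile is flagged. Gating on entropy reflects the abstract's point that a simple probabilistic majority can be biased: when contacts are evenly split, the sketch declines to flag a mismatch.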
Keywords: business intelligence; customer profile; data quality; inference model; query processing system; simulation experiment; social network
Algorithm:

List of Topics

#213 0.103 assimilation beliefs belief confirmation aggregation initial investigate observed robust particular comparative circumstances aggregated tendency factors examine stages uncertainty instead confidence
#97 0.098 set approach algorithm optimal used develop results use simulation experiments algorithms demonstrate proposed optimization present analytical distribution selection number existing
#133 0.085 data predictive analytics sharing big using modeling set power inference behavior explanatory related prediction statistical generated substantially novel building million
#37 0.079 intelligence business discovery framework text knowledge new existing visualization based analyzing mining genetic algorithms related techniques large proposed novel artificial
#234 0.077 social networks influence presence interactions network media networking diffusion implications individuals people results exchange paper sites evidence self-disclosure important examine
#6 0.072 data used develop multiple approaches collection based research classes aspect single literature profiles means crowd collected trend accuracy databases accurate
#220 0.067 research study different context findings types prior results focused studies empirical examine work previous little knowledge sources implications specifically provide
#281 0.065 database language query databases natural data queries relational processing paper using request views access use matching automated semantic based languages
#208 0.063 feedback mechanisms mechanism ratings efficiency role effective study economic design potential economics discuss profile recent component granularity turn compared using
#288 0.061 customer customers crm relationship study loyalty marketing management profitability service offer retention it-enabled web-based interactions operations sales strategy channels set