TY - JOUR
T1 - I've seen "enough"
T2 - 43rd International Conference on Very Large Data Bases, VLDB 2017
AU - Rahman, Sajjadur
AU - Aliakbarpour, Maryam
AU - Kong, Ha Kyung
AU - Blais, Eric
AU - Karahalios, Karrie
AU - Parameswaran, Aditya
AU - Rubinfeld, Ronitt
N1 - Publisher Copyright: © 2017 VLDB.
PY - 2017/8/1
Y1 - 2017/8/1
AB - Data visualization is an effective mechanism for identifying trends, insights, and anomalies in data. On large datasets, however, generating visualizations can take a long time, delaying the extraction of insights, hampering decision making, and reducing exploration time. One solution is to use online sampling-based schemes to generate visualizations faster while improving the displayed estimates incrementally, eventually converging to the exact visualization computed on the entire data. However, the intermediate visualizations are approximate, and often fluctuate drastically, leading to potentially incorrect decisions. We propose sampling-based incremental visualization algorithms that reveal the "salient" features of the visualization quickly, with a 46× speedup relative to baselines, while minimizing error, thus enabling rapid and error-free decision making. We demonstrate that these algorithms are optimal in terms of sample complexity, in that given the level of interactivity, they generate approximations that take as few samples as possible. We have developed the algorithms in the context of an incremental visualization tool, titled INCVISAGE, for trendline and heatmap visualizations. We evaluate the usability of INCVISAGE via user studies and demonstrate that users are able to make effective decisions with incrementally improving visualizations, especially compared to vanilla online sampling-based schemes.
UR - http://www.scopus.com/inward/record.url?scp=85037042988&partnerID=8YFLogxK
U2 - 10.14778/3137628.3137637
DO - 10.14778/3137628.3137637
M3 - Conference article
AN - SCOPUS:85037042988
SN - 2150-8097
VL - 10
SP - 1262
EP - 1273
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 11
Y2 - 28 August 2017 through 1 September 2017
ER -