TY - GEN

T1 - Ranged hash functions and the price of churn

AU - Aspnes, James

AU - Safra, Muli

AU - Yin, Yitong

PY - 2008

Y1 - 2008

N2 - Ranged hash functions generalize hash tables to the setting where hash buckets may come and go over time, a typical case in distributed settings where hash buckets may correspond to unreliable servers or network connections. Monotone ranged hash functions are a particular class of ranged hash functions that minimize item reassignments in response to churn: changes in the set of available buckets. The canonical example of a monotone ranged hash function is the ring-based consistent hashing mechanism of Karger et al. [13]. These hash functions give a maximum load of Θ (n/m log m) when n is the number of items and m is the number of buckets. The question of whether some better bound could be obtained using a more sophisticated hash function has remained open. We resolve this question by showing two lower bounds. First, the maximum load of any randomized monotone ranged hash function is Ω(√In m) when n - o(m log m). This bound covers almost all of the nontrivial case, because when n = Ω(m log m) simple random assignment matches the trivial lower bound of Ω(n/m). We give a matching (though impractical) upper bound that shows that our lower bound is tight over almost all of its range. Second, for randomized monotone ranged hash functions derived from metric spaces, there is a further trade-off between the expansion factor of the metric and the load balance, which for the special case of growth-restricted metrics gives a bound of Ω (n/m log m), asymptotically equal to that of consistent hashing. These are the first known non-trivial lower bounds for ranged hash functions. They also explain why in ten years no better ranged hash functions have arisen to replace consistent hashing.

AB - Ranged hash functions generalize hash tables to the setting where hash buckets may come and go over time, a typical case in distributed settings where hash buckets may correspond to unreliable servers or network connections. Monotone ranged hash functions are a particular class of ranged hash functions that minimize item reassignments in response to churn: changes in the set of available buckets. The canonical example of a monotone ranged hash function is the ring-based consistent hashing mechanism of Karger et al. [13]. These hash functions give a maximum load of Θ (n/m log m) when n is the number of items and m is the number of buckets. The question of whether some better bound could be obtained using a more sophisticated hash function has remained open. We resolve this question by showing two lower bounds. First, the maximum load of any randomized monotone ranged hash function is Ω(√In m) when n - o(m log m). This bound covers almost all of the nontrivial case, because when n = Ω(m log m) simple random assignment matches the trivial lower bound of Ω(n/m). We give a matching (though impractical) upper bound that shows that our lower bound is tight over almost all of its range. Second, for randomized monotone ranged hash functions derived from metric spaces, there is a further trade-off between the expansion factor of the metric and the load balance, which for the special case of growth-restricted metrics gives a bound of Ω (n/m log m), asymptotically equal to that of consistent hashing. These are the first known non-trivial lower bounds for ranged hash functions. They also explain why in ten years no better ranged hash functions have arisen to replace consistent hashing.

UR - http://www.scopus.com/inward/record.url?scp=58449124442&partnerID=8YFLogxK

M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???

AN - SCOPUS:58449124442

SN - 9780898716474

T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms

SP - 1066

EP - 1075

BT - Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms

T2 - 19th Annual ACM-SIAM Symposium on Discrete Algorithms

Y2 - 20 January 2008 through 22 January 2008

ER -