TY - GEN
T1 - An extendable data structure for incremental stable perfect hashing
AU - Bercea, Ioana Oriana
AU - Even, Guy
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/9/6
Y1 - 2022/9/6
N2 - We consider the problem of dynamically assigning n elements unique indices, known as hashcodes, in the range [(1+o(1))n]. This problem is known as perfect hashing and is considered a fundamental building block in the design of more involved data structures. The challenge we address is that of designing a data structure that meets several, seemingly opposing, requirements: (1) the range and the space of the data structure must be, at all times, proportional to the current cardinality nt of the input set, and (2) the hashcodes it assigns must be stable in that the hashcode of an element must not change while the element is continuously in the set. A simple argument shows that these two desiderata are impossible to achieve when arbitrary deletions and insertions are allowed. In this paper, we show that one can achieve them when only insertions occur and, more generally, when the hashcode range and the space are allowed to grow as a function of the maximum cardinality Nt of the set until time t. The data structure executes all operations in worst case constant time with high probability and requires space that is within a constant factor of the lower bound. In particular, this leads to a hash table design that does not need to move elements as its size grows. More generally, we present, as an application, a cyclic sequence of reductions between data structures that lead to the following bootstrapping result. Let B(u,n) denote the lower bound on the space of a dictionary for n elements over a universe [u]. Given a compact dynamic dictionary (i.e., space O(B(u,n))), we can use it to build a dynamic dictionary with space B(u,n)+O(nloglogn). This reduction increases the time per operation by an additive constant and applies both to the extendable and non-extendable settings (failure probability is 1/poly(n) per insertion).
AB - We consider the problem of dynamically assigning n elements unique indices, known as hashcodes, in the range [(1+o(1))n]. This problem is known as perfect hashing and is considered a fundamental building block in the design of more involved data structures. The challenge we address is that of designing a data structure that meets several, seemingly opposing, requirements: (1) the range and the space of the data structure must be, at all times, proportional to the current cardinality nt of the input set, and (2) the hashcodes it assigns must be stable in that the hashcode of an element must not change while the element is continuously in the set. A simple argument shows that these two desiderata are impossible to achieve when arbitrary deletions and insertions are allowed. In this paper, we show that one can achieve them when only insertions occur and, more generally, when the hashcode range and the space are allowed to grow as a function of the maximum cardinality Nt of the set until time t. The data structure executes all operations in worst case constant time with high probability and requires space that is within a constant factor of the lower bound. In particular, this leads to a hash table design that does not need to move elements as its size grows. More generally, we present, as an application, a cyclic sequence of reductions between data structures that lead to the following bootstrapping result. Let B(u,n) denote the lower bound on the space of a dictionary for n elements over a universe [u]. Given a compact dynamic dictionary (i.e., space O(B(u,n))), we can use it to build a dynamic dictionary with space B(u,n)+O(nloglogn). This reduction increases the time per operation by an additive constant and applies both to the extendable and non-extendable settings (failure probability is 1/poly(n) per insertion).
KW - dictionary
KW - extendable data structure
KW - filter
KW - perfect hashing
KW - succint data structure
UR - http://www.scopus.com/inward/record.url?scp=85132709167&partnerID=8YFLogxK
U2 - 10.1145/3519935.3520070
DO - 10.1145/3519935.3520070
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:85132709167
T3 - Proceedings of the Annual ACM Symposium on Theory of Computing
SP - 1298
EP - 1310
BT - STOC 2022 - Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing
A2 - Leonardi, Stefano
A2 - Gupta, Anupam
PB - Association for Computing Machinery
T2 - 54th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2022
Y2 - 20 June 2022 through 24 June 2022
ER -