TY - GEN
T1 - Efficient low-contention parallel algorithms
AU - Gibbons, Phillip B.
AU - Matias, Yossi
AU - Ramachandran, Vijaya
N1 - Publisher Copyright: © 1994 ACM.
PY - 1994/8/1
Y1 - 1994/8/1
N2 - The queue-read, queue-write (QRQW) PRAM model [GMR94] permits concurrent reading and writing, but at a cost proportional to the number of readers/writers to a memory location in a given step. The QRQW model reflects the contention properties of most parallel machines more accurately than either the well-studied CRCW or EREW models: the CRCW model does not adequately penalize algorithms with high contention to shared memory locations, while the EREW model is too strict in its insistence on zero contention at each step. Of primary practical and theoretical interest, then, is the design of fast and efficient QRQW algorithms for problems for which all previous algorithms either suffer from high contention, fail to be fast, or fail to be work-optimal. This paper describes low-contention, fast, work-optimal QRQW PRAM algorithms for the fundamental problems of finding a random permutation, parallel hashing, load balancing, and sorting. No fast, work-optimal EREW algorithm is known for finding a random permutation or for parallel hashing. For load balancing, we improve upon the EREW result whenever the ratio of the maximum to the average load is not too large. We show that the logarithmic dependence of the QRQW running time on this ratio is inherent by providing a matching lower bound. We demonstrate the performance advantage of a QRQW random permutation algorithm, compared with the popular EREW algorithm, by implementing and running both algorithms on the MasPar MP-1. Finally, we extend the work-time framework for the design of parallel algorithms to account for contention, and relate it to the QRQW PRAM model. We use our QRQW load balancing algorithm, as well as the QRQW linear compaction algorithm in [GMR94], to provide automatic tools for processor allocation, an issue that must be handled when translating an algorithm from its work-time presentation into an explicit PRAM description.
UR - http://www.scopus.com/inward/record.url?scp=85027418327&partnerID=8YFLogxK
U2 - 10.1145/181014.181382
DO - 10.1145/181014.181382
M3 - Conference contribution
AN - SCOPUS:85027418327
T3 - Proceedings of the 6th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 1994
SP - 236
EP - 247
BT - Proceedings of the 6th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 1994
PB - Association for Computing Machinery, Inc
T2 - 6th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 1994
Y2 - 27 June 1994 through 29 June 1994
ER -