Flat-combining NUMA locks

Dave Dice, Virendra J. Marathe, Nir Shavit

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Multicore machines are growing in size, and accordingly shifting from simple bus-based designs to NUMA and CC-NUMA architectures. With this shift, the need for scalable hierarchical locking algorithms is becoming crucial to performance. This paper presents a novel scalable hierarchical queue-lock algorithm based on the flat combining synchronization paradigm. At the core of the new algorithm is a scheme for building local queues of waiting threads in a highly efficient manner, and then merging them globally, all with little interconnect traffic and virtually no costly synchronization operations in the common case. In empirical testing on an Oracle SPARC Enterprise T5440 Server, a 256-way CC-NUMA machine, our new flat-combining hierarchical lock significantly outperforms all classic locking algorithms, and at high concurrency levels, provides up to a factor of two improvement over HCLH, the most efficient known hierarchical locking algorithm.
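
The "build local queues, then merge them globally" idea can be illustrated with a toy model. The sketch below is not the paper's algorithm: it is a single-threaded simulation with no atomics, combiner threads, or real locking, and every name in it (`build_local_queues`, `merge_globally`, `cross_node_handoffs`, the arrival pattern) is a hypothetical chosen for illustration. It only shows why batching waiters per NUMA node can reduce the number of lock handoffs that cross the interconnect.

```python
from collections import deque

def build_local_queues(requests, num_nodes):
    """Group waiting threads by NUMA node, keeping arrival order within a node."""
    local = [deque() for _ in range(num_nodes)]
    for thread_id, node in requests:
        local[node].append((thread_id, node))
    return local

def merge_globally(local_queues):
    """Splice each non-empty local queue onto the global queue as one batch."""
    global_queue = deque()
    splices = 0
    for q in local_queues:
        if q:
            global_queue.extend(q)  # one splice per node, not one per thread
            splices += 1
    return global_queue, splices

def cross_node_handoffs(order):
    """Count handoffs where consecutive lock owners sit on different nodes."""
    owners = list(order)
    return sum(1 for a, b in zip(owners, owners[1:]) if a[1] != b[1])

# Hypothetical arrival pattern: (thread_id, node) pairs interleaved across 2 nodes.
arrivals = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1)]

local_queues = build_local_queues(arrivals, num_nodes=2)
global_queue, splices = merge_globally(local_queues)

fifo_handoffs = cross_node_handoffs(arrivals)         # plain arrival-order queue
batched_handoffs = cross_node_handoffs(global_queue)  # node-batched queue
```

In this toy run, granting the lock in plain arrival order would move it between nodes on every handoff, while the node-batched order crosses nodes only once per batch. A real implementation would also have to bound batch sizes to avoid starving remote nodes, which this sketch does not model.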

Original language: English
Title of host publication: SPAA'11 - Proceedings of the 23rd Annual Symposium on Parallelism in Algorithms and Architectures
Pages: 65-74
Number of pages: 10
State: Published - 2011
Externally published: Yes
Event: 23rd ACM Symposium on Parallelism in Algorithms and Architectures, SPAA'11 - San Jose, CA, United States
Duration: 4 Jun 2011 to 6 Jun 2011

Publication series

Name: Annual ACM Symposium on Parallelism in Algorithms and Architectures

Conference

Conference: 23rd ACM Symposium on Parallelism in Algorithms and Architectures, SPAA'11
Country/Territory: United States
City: San Jose, CA
Period: 4/06/11 to 6/06/11

Keywords

  • flat combining
  • hierarchical locks
  • queue locks
