Fast concurrent queues for x86 processors

Research output: Contribution to journal › Article › peer-review


Conventional wisdom in designing concurrent data structures is to use the most powerful synchronization primitive, namely compare-and-swap (CAS), and to avoid contended hot spots. In building concurrent FIFO queues, this reasoning has led researchers to propose combining-based concurrent queues. This paper takes a different approach, showing how to rely on fetch-and-add (F&A), a less powerful primitive that is available on x86 processors, to construct a nonblocking (lock-free) linearizable concurrent FIFO queue which, despite the F&A being a contended hot spot, outperforms combining-based implementations by 1.5× to 2.5× at all concurrency levels on an x86 server with four multicore processors, in both single-processor and multi-processor executions.
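To make the contrast between the two primitives concrete, the following is a minimal C++ sketch, not the queue from the paper. It assumes a single shared counter from which threads claim slot indices, once with `fetch_add` and once with a CAS retry loop. The function names `claim_with_faa` and `claim_with_cas` are hypothetical and chosen for illustration only.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper (not from the paper): every thread claims `per_thread`
// indices from one shared counter with fetch-and-add. Returns all claimed
// indices in sorted order.
std::vector<std::uint64_t> claim_with_faa(int threads, int per_thread) {
    assert(threads > 0 && per_thread > 0);
    std::atomic<std::uint64_t> tail{0};
    std::vector<std::vector<std::uint64_t>> claimed(threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&, t] {
            claimed[t].reserve(per_thread);
            for (int i = 0; i < per_thread; ++i) {
                // fetch_add returns the previous value and cannot fail,
                // so there is no retry loop even under contention.
                claimed[t].push_back(tail.fetch_add(1));
            }
        });
    }
    for (auto& w : workers) w.join();
    std::vector<std::uint64_t> all;
    for (auto& c : claimed) all.insert(all.end(), c.begin(), c.end());
    std::sort(all.begin(), all.end());
    return all;
}

// Same task with a CAS loop. `retries_out` receives the number of failed
// CAS attempts (lost races plus any spurious failures of the weak variant).
std::vector<std::uint64_t> claim_with_cas(int threads, int per_thread,
                                          std::uint64_t& retries_out) {
    assert(threads > 0 && per_thread > 0);
    std::atomic<std::uint64_t> tail{0};
    std::atomic<std::uint64_t> retries{0};
    std::vector<std::vector<std::uint64_t>> claimed(threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&, t] {
            claimed[t].reserve(per_thread);
            std::uint64_t local_retries = 0;
            for (int i = 0; i < per_thread; ++i) {
                std::uint64_t cur = tail.load();
                // On failure, compare_exchange_weak reloads `cur` with the
                // current counter value, and the thread must try again.
                while (!tail.compare_exchange_weak(cur, cur + 1)) {
                    ++local_retries;
                }
                claimed[t].push_back(cur);  // `cur` is the index this thread won
            }
            retries.fetch_add(local_retries);
        });
    }
    for (auto& w : workers) w.join();
    retries_out = retries.load();
    std::vector<std::uint64_t> all;
    for (auto& c : claimed) all.insert(all.end(), c.begin(), c.end());
    std::sort(all.begin(), all.end());
    return all;
}
```

Both functions hand out each index exactly once. The difference is that the CAS version can waste work on failed attempts when threads collide, whereas each `fetch_add` completes in a single atomic step (on x86 it typically compiles to a `LOCK XADD` instruction). The retry count observed in any given run depends on the machine and the scheduling, so no particular figure is implied here.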

Original language: English
Pages (from-to): 103-112
Number of pages: 10
Journal: ACM SIGPLAN Notices
Issue number: 8
State: Published - Aug 2013


  • Concurrent queue
  • Fetch-and-add
  • Nonblocking algorithm

