Sorting is one of the fundamental problems in computer science. In this paper we present the first wait-free algorithm for sorting an input array of size N using P ≤ N processors to achieve optimal running time. We show two variants of the algorithm, one deterministic and one randomized, and prove that, with high probability, the latter suffers no more than O(√P) contention when run synchronously. Known sorting algorithms, when made wait-free through previously established transformation techniques, have complexity O(log³ N). The algorithm we present here, when run in the CRCW PRAM model, executes with high probability in O(log N) time when P = N, and in O((N log N)/P) time otherwise, which is optimal among comparison-based sorting algorithms. The wait-free property guarantees that the sort will complete despite any delays or failures incurred by the processors. This is a very desirable property from an operating systems point of view, since it allows oblivious thread scheduling as well as thread creation and deletion without compromising the algorithm's correctness.