## Abstract

The repair problem for an (n, k) error-correcting code calls for recovery of an unavailable coordinate of the codeword by downloading as little information as possible from a subset of the remaining coordinates. Using the terminology motivated by coding in distributed storage, we attempt to repair a failed node by accessing information stored on d helper nodes, where k ≤ d ≤ n - 1, and using as little repair bandwidth as possible to recover the lost information. By the so-called cut-set bound (Dimakis et al., 2010), the repair bandwidth of an (n, k = n - r) MDS code using d helper nodes is at least dl/(d + 1 - k), where l is the size of the node. A number of constructions of MDS array codes have been shown to meet this bound with equality. In a related but separate line of work, Guruswami and Wootters (2016) studied repair of Reed-Solomon (RS) codes, showing that it is possible to perform repair using a smaller bandwidth than under the trivial approach. At the same time, their work as well as follow-up papers stopped short of constructing RS codes (or any scalar MDS codes) that meet the cut-set bound with equality, which has been an open problem in coding theory. In this work we present a solution to this problem, constructing RS codes of length n over the field of size q^l, where l = exp((1 + o(1)) n log n), that meet the cut-set bound. We also prove an almost matching lower bound on l, showing that super-exponential scaling is both necessary and sufficient for achieving the cut-set bound using linear repair schemes. More precisely, we prove that for scalar MDS codes (including the RS codes) to meet this bound, the sub-packetization l must satisfy l ≥ exp((1 + o(1)) k log k).
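
To give a feel for the cut-set bound dl/(d + 1 - k), the short sketch below evaluates it for a range of helper counts d and compares it with the trivial approach, taken here to mean downloading k full nodes (kl in total). The parameter values n = 14, k = 10, l = 4 and the function names are assumptions chosen purely for illustration; they do not come from the constructions in this work.

```python
from fractions import Fraction


def cut_set_bound(d: int, k: int, l: int) -> Fraction:
    """Lower bound d*l/(d + 1 - k) on the repair bandwidth, in the same units as l."""
    if d < k:
        raise ValueError("the bound is stated for d >= k helper nodes")
    return Fraction(d * l, d + 1 - k)


def trivial_bandwidth(k: int, l: int) -> int:
    """Trivial repair: download k full nodes of size l each."""
    return k * l


# Hypothetical parameters, for illustration only.
n, k, l = 14, 10, 4

for d in range(k, n):  # d ranges over k, ..., n - 1
    print(f"d={d}: cut-set bound = {cut_set_bound(d, k, l)}, "
          f"trivial = {trivial_bandwidth(k, l)}")
```

With these values the bound coincides with the trivial bandwidth of 40 at d = k = 10 and drops to 13 at d = n - 1 = 13, which is the gap that codes meeting the cut-set bound exploit.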

## Keywords

- Cut-set bound
- Optimal sub-packetization
- Repair bandwidth