TY - JOUR

T1 - Optimal Rebuilding of Multiple Erasures in MDS Codes

AU - Wang, Zhiying

AU - Tamo, Itzhak

AU - Bruck, Jehoshua

N1 - Publisher Copyright:
© 2016 IEEE.

PY - 2017/2

Y1 - 2017/2

N2 - Maximum distance separable (MDS) array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with r redundancy nodes can correct any r node erasures by accessing (reading) all the remaining information in the surviving nodes. However, in practice, e erasures are a more likely failure event, for some 1 ≤ e < r. Hence, a natural question is how much information do we need to access in order to rebuild e storage nodes. We define the rebuilding ratio as the fraction of remaining information accessed during the rebuilding of e erasures. In our previous work, we constructed MDS codes, called zigzag codes, that achieve the optimal rebuilding ratio of 1/r for the rebuilding of any systematic node when e=1 ; however, all the information needs to be accessed for the rebuilding of the parity node erasure. The (normalized) repair bandwidth is defined as the fraction of information transmitted from the remaining nodes during the rebuilding process. For codes that are not necessarily MDS, Dimakis et al. proposed the regenerating codes framework where any r erasures can be corrected by accessing some of the remaining information, and any e=1 erasure can be rebuilt from some subsets of surviving nodes with optimal repair bandwidth. In this paper, we present three results on rebuilding of codes: 1) we show a fundamental outer bound on the storage size of the node and the repair bandwidth similar to the regenerating codes framework, and show that zigzag codes achieve the optimal rebuilding ratio of e/r for systematic nodes of MDS codes, for any 1 ≤ e ≤ r ; 2) we construct systematic codes that achieve optimal rebuilding ratio of 1/r, for any systematic or parity node erasure; and 3) we present error correction algorithms for zigzag codes, and in particular demonstrate how these codes can be corrected beyond their minimum Hamming distances.

AB - Maximum distance separable (MDS) array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with r redundancy nodes can correct any r node erasures by accessing (reading) all the remaining information in the surviving nodes. However, in practice, e erasures are a more likely failure event, for some 1 ≤ e < r. Hence, a natural question is how much information do we need to access in order to rebuild e storage nodes. We define the rebuilding ratio as the fraction of remaining information accessed during the rebuilding of e erasures. In our previous work, we constructed MDS codes, called zigzag codes, that achieve the optimal rebuilding ratio of 1/r for the rebuilding of any systematic node when e=1 ; however, all the information needs to be accessed for the rebuilding of the parity node erasure. The (normalized) repair bandwidth is defined as the fraction of information transmitted from the remaining nodes during the rebuilding process. For codes that are not necessarily MDS, Dimakis et al. proposed the regenerating codes framework where any r erasures can be corrected by accessing some of the remaining information, and any e=1 erasure can be rebuilt from some subsets of surviving nodes with optimal repair bandwidth. In this paper, we present three results on rebuilding of codes: 1) we show a fundamental outer bound on the storage size of the node and the repair bandwidth similar to the regenerating codes framework, and show that zigzag codes achieve the optimal rebuilding ratio of e/r for systematic nodes of MDS codes, for any 1 ≤ e ≤ r ; 2) we construct systematic codes that achieve optimal rebuilding ratio of 1/r, for any systematic or parity node erasure; and 3) we present error correction algorithms for zigzag codes, and in particular demonstrate how these codes can be corrected beyond their minimum Hamming distances.

KW - Distributed storage

KW - correcting erasures and errors

KW - multiple erasures

KW - regenerating codes

UR - http://www.scopus.com/inward/record.url?scp=85010376046&partnerID=8YFLogxK

U2 - 10.1109/TIT.2016.2633411

DO - 10.1109/TIT.2016.2633411

M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???

AN - SCOPUS:85010376046

SN - 0018-9448

VL - 63

SP - 1084

EP - 1101

JO - IEEE Transactions on Information Theory

JF - IEEE Transactions on Information Theory

IS - 2

M1 - 7762203

ER -