TY - JOUR
T1 - Optimal Rebuilding of Multiple Erasures in MDS Codes
AU - Wang, Zhiying
AU - Tamo, Itzhak
AU - Bruck, Jehoshua
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/2
Y1 - 2017/2
AB - Maximum distance separable (MDS) array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with r redundancy nodes can correct any r node erasures by accessing (reading) all the remaining information in the surviving nodes. However, in practice, e erasures are a more likely failure event, for some 1 ≤ e < r. Hence, a natural question is how much information we need to access in order to rebuild e storage nodes. We define the rebuilding ratio as the fraction of remaining information accessed during the rebuilding of e erasures. In our previous work, we constructed MDS codes, called zigzag codes, that achieve the optimal rebuilding ratio of 1/r for the rebuilding of any systematic node when e = 1; however, all the information needs to be accessed for the rebuilding of a parity node erasure. The (normalized) repair bandwidth is defined as the fraction of information transmitted from the remaining nodes during the rebuilding process. For codes that are not necessarily MDS, Dimakis et al. proposed the regenerating codes framework, where any r erasures can be corrected by accessing some of the remaining information, and any e = 1 erasure can be rebuilt from some subsets of surviving nodes with optimal repair bandwidth. In this paper, we present three results on the rebuilding of codes: 1) we show a fundamental outer bound on the storage size of the node and the repair bandwidth, similar to the regenerating codes framework, and show that zigzag codes achieve the optimal rebuilding ratio of e/r for systematic nodes of MDS codes, for any 1 ≤ e ≤ r; 2) we construct systematic codes that achieve the optimal rebuilding ratio of 1/r for any systematic or parity node erasure; and 3) we present error correction algorithms for zigzag codes, and in particular demonstrate how these codes can be corrected beyond their minimum Hamming distances.
KW - Distributed storage
KW - correcting erasures and errors
KW - multiple erasures
KW - regenerating codes
UR - http://www.scopus.com/inward/record.url?scp=85010376046&partnerID=8YFLogxK
U2 - 10.1109/TIT.2016.2633411
DO - 10.1109/TIT.2016.2633411
M3 - Article
AN - SCOPUS:85010376046
SN - 0018-9448
VL - 63
SP - 1084
EP - 1101
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 2
M1 - 7762203
ER -