TY - JOUR
T1 - Gradient Coding from Cyclic MDS Codes and Expander Graphs
AU - Raviv, Netanel
AU - Tamo, Itzhak
AU - Tandon, Rashish
AU - Dimakis, Alexandros G.
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/12
Y1 - 2020/12
AB - Gradient coding is a technique for straggler mitigation in distributed learning. In this paper, we design novel gradient codes using tools from classical coding theory, namely cyclic MDS codes, which compare favorably with existing solutions both in the applicable range of parameters and in the complexity of the involved algorithms. In addition, we introduce an approximate variant of the gradient coding problem, in which we settle for approximate gradient computation instead of the exact one. This approach enables graceful degradation, i.e., the $\ell_{2}$ error of the approximate gradient is a decreasing function of the number of non-stragglers. Our main result is that normalized adjacency matrices of expander graphs yield excellent approximate gradient codes, which require significantly less computation than exact gradient coding and guarantee faster convergence than trivial solutions under standard assumptions. We experimentally test our approach on Amazon EC2 and show that the generalization error of approximate gradient coding is very close to that of the full gradient, while requiring significantly less computation from the workers.
KW - Gradient descent
KW - coding theory
KW - distributed computing
KW - expander graphs
UR - http://www.scopus.com/inward/record.url?scp=85097347219&partnerID=8YFLogxK
U2 - 10.1109/TIT.2020.3029396
DO - 10.1109/TIT.2020.3029396
M3 - Article
AN - SCOPUS:85097347219
SN - 0018-9448
VL - 66
SP - 7475
EP - 7489
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 12
M1 - 9216021
ER -