Deep neural network (deep learning) models have traditionally been trained on dedicated servers, after data is collected from various edge devices and sent to the server. In recent years, new methodologies have emerged for training models in a distributed manner over edge devices, keeping the data on the devices themselves. This improves data privacy and reduces training costs. One of the main challenges for such methodologies is reducing the communication costs to, and especially from, the edge devices. In this work we compare the two main methodologies used for distributed edge training: Federated Learning and Large Batch Training. For each methodology we examine its convergence rate, communication cost, and final model performance. In addition, we present two techniques for compressing the communication from the edge devices, and examine their suitability for each of the training methodologies.