Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Mitchell Wortsman*, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, Ari S. Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon, Simon Kornblith, Ludwig Schmidt

*Corresponding author for this work

Research output: Contribution to journal › Conference article › Peer-review

143 Scopus citations


The conventional recipe for maximizing model accuracy is to (1) train multiple models with various hyperparameters and (2) pick the individual model which performs best on a held-out validation set, discarding the remainder. In this paper, we revisit the second step of this procedure in the context of fine-tuning large pre-trained models, where fine-tuned models often appear to lie in a single low error basin. We show that averaging the weights of multiple models fine-tuned with different hyperparameter configurations often improves accuracy and robustness. Unlike a conventional ensemble, we may average many models without incurring any additional inference or memory costs; we call the results “model soups.” When fine-tuning large pre-trained models such as CLIP, ALIGN, and a ViT-G pre-trained on JFT, our soup recipe provides significant improvements over the best model in a hyperparameter sweep on ImageNet. The resulting ViT-G model, which attains 90.94% top-1 accuracy on ImageNet, achieved a new state of the art. Furthermore, we show that the model soup approach extends to multiple image classification and natural language processing tasks, improves out-of-distribution performance, and improves zero-shot performance on new downstream tasks. Finally, we analytically relate the performance similarity of weight-averaging and logit-ensembling to flatness of the loss and confidence of the predictions, and validate this relation empirically. Code is available at
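The two recipes the abstract describes, uniform averaging of all fine-tuned models and a greedy variant that keeps only models which help held-out accuracy, can be sketched as follows. This is a minimal illustration, not the paper's released code: models are represented as plain dicts mapping parameter names to floats (standing in for full network state dicts), and `val_acc` is an assumed callable that scores a candidate soup on a validation set.

```python
# Minimal sketch of "uniform soup" and "greedy soup" weight averaging.
# Assumption: each model is a dict of parameter name -> float; in practice
# these would be full state dicts of networks fine-tuned from one pre-trained
# initialization, and val_acc would evaluate held-out accuracy.

def uniform_soup(state_dicts):
    """Average the weights of all given fine-tuned models element-wise."""
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n
            for k in state_dicts[0]}

def greedy_soup(state_dicts, val_acc):
    """Add models to the soup one at a time, keeping each addition only
    if it does not hurt validation accuracy (val_acc is a hypothetical
    scoring function supplied by the caller)."""
    # Consider candidates in order of individual validation accuracy.
    ranked = sorted(state_dicts, key=val_acc, reverse=True)
    soup = [ranked[0]]
    for sd in ranked[1:]:
        candidate = uniform_soup(soup + [sd])
        if val_acc(candidate) >= val_acc(uniform_soup(soup)):
            soup.append(sd)
    return uniform_soup(soup)
```

For example, averaging two toy "models" `{"w": 1.0}` and `{"w": 3.0}` with `uniform_soup` yields `{"w": 2.0}`; because averaging happens in weight space, a soup costs the same to run at inference time as any single model, unlike a logit ensemble.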

Original language: English
Pages (from-to): 23965-23998
Number of pages: 34
Journal: Proceedings of Machine Learning Research
State: Published - 2022
Event: 39th International Conference on Machine Learning, ICML 2022 - Baltimore, United States
Duration: 17 Jul 2022 – 23 Jul 2022


Funders and funder numbers:
Allen Institute
Blavatnik Family Foundation
Israel Science Foundation: 2486/21
NSF AI Institute for Foundations of Machine Learning
Yandex Initiative for Machine Learning
National Science Foundation: IIS 1652052, IIS 17303166
Defense Advanced Research Projects Agency: W911NF-15-1-0543, N66001-19-2-4031

