Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models. They got training to be both way faster and way more modular, to the extent that it might change standard practice for training models.
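For context, here is a minimal sketch of the branch-train-merge idea as I understand it: branch several copies of a seed model, train each copy independently on its own domain shard (no gradient synchronization between experts), then merge, e.g., by parameter averaging or by keeping the experts as an ensemble. All names here (`branch_train_merge`, `train_on_shard`, `domain_shards`) are illustrative, not the paper's code.

```python
# Hypothetical sketch of Branch-Train-Merge-style training.
# Function and variable names are made up for illustration.
import copy
import torch
import torch.nn as nn

def branch_train_merge(seed_model: nn.Module, domain_shards, train_on_shard):
    # Branch: each expert starts from a copy of the shared seed model.
    experts = [copy.deepcopy(seed_model) for _ in domain_shards]

    # Train: each expert sees only its own shard. There is no gradient
    # communication between experts, so these iterations can run on
    # separate machines ("embarrassingly parallel").
    for expert, shard in zip(experts, domain_shards):
        train_on_shard(expert, shard)  # an ordinary single-model training loop

    # Merge: one simple option is uniform parameter averaging, which works
    # because all experts share the seed model's architecture.
    merged = copy.deepcopy(seed_model)
    with torch.no_grad():
        for name, param in merged.named_parameters():
            stacked = torch.stack(
                [dict(e.named_parameters())[name] for e in experts]
            )
            param.copy_(stacked.mean(dim=0))
    return merged, experts  # experts can also be ensembled at inference
```

Keeping the experts around and ensembling their outputs, rather than averaging weights, is the more modular option: a new domain can be added by training one more expert without touching the others.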
Little bit of a slow week on arXiv, but we have some interesting science-of-deep-learning work and some…
An Impartial Take to the CNN vs Transformer Robustness Contest. A robustness throwdown featuring {ViT, Swin…
Did you know? People who share this newsletter are 98% less likely to fall prey to made-up statistics. Is…
Forward this to your coworkers for a chance to win…erm…my gratitude? Next-ViT: Next Generation Vision…
If you like it, consider forwarding it to a friend! DeepSpeed Inference: Enabling Efficient Inference of…
⭐ Solving Quantitative Reasoning Problems with Language Models. They finetune PaLM models on a corpus of math…
⭐ (Certified!!) Adversarial Robustness for Free! Existing work has shown that you can make any classifier…