Davis Summarizes Papers
2022-8-14 arXiv roundup: Branch-Train-Merge, Model patching, lots of LLM papers
This newsletter made possible by MosaicML. Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models They got training to be both…
Davis Blalock
Aug 15
2022-8-7 arXiv roundup: Adam and sharpness, Recursive self-improvement for coding, Training and model tweaks
This newsletter made possible by MosaicML. Little bit of a slow week on arXiv, but we have some interesting science-of-deep-learning work and some…
Davis Blalock
Aug 8
2022-7-31 arXiv roundup: Transformer vs CNN showdown, 1000x smaller DLRM, ELECTRA improvements
This newsletter made possible by MosaicML. An Impartial Take to the CNN vs Transformer Robustness Contest A robustness throwdown featuring {ViT, Swin…
Davis Blalock
Aug 1
2022-7-24 arXiv roundup: Int8 training at almost no accuracy loss, DataPerf, Scaling & inductive biases
This newsletter made possible by MosaicML. Did you know? People who share this newsletter are 98% less likely to fall prey to made-up statistics. Is…
Davis Blalock
Jul 25
2022-7-17 arxiv roundup: Next-ViT, Anthropic & DeepMind & Google interrogate giant language models, 16 other papers
This newsletter made possible by MosaicML. Forward this to your coworkers for a chance to win…erm…my gratitude? Next-ViT: Next Generation Vision…
Davis Blalock
Jul 18
2022-7-10 arXiv roundup: DeepSpeed inference, Simpler detection backbones, Spatial sparsification
This post made possible by MosaicML. If you like it, consider forwarding it to a friend! DeepSpeed Inference: Enabling Efficient Inference of…
Davis Blalock
Jul 11
2022-7-3 arXiv roundup: Minerva, Beating power laws, Surprising OOD linearity
This newsletter made possible by MosaicML. ⭐ Solving Quantitative Reasoning Problems with Language Models They finetune PaLM models on a corpus of math…
Davis Blalock
Jul 4
2022-6-26 arXiv roundup: Way better certified robustness, Progressive SSL, Empirical NTKs
This newsletter made possible by MosaicML. ⭐ (Certified!!) Adversarial Robustness for Free! Existing work has shown that you can make any classifier…
Davis Blalock
Jun 27
2022-6-19 arXiv roundup: RHO-LOSS, Pix2Seq v2, Fisher SAM
This post made possible by MosaicML. Measuring the Carbon Intensity of AI in Cloud Instances Different training jobs can have hugely different carbon…
Davis Blalock
Jun 20
2022-6-12: 7x Faster ResNet-50, BIG-Bench, Neural corpus indexer, DeepSpeed & fp8 quantization
Blazingly Fast Computer Vision Training with the Mosaic ResNet and Composer We (MosaicML) trained a ResNet-50 7x faster with no loss of accuracy. This…
Davis Blalock
Jun 13
2022-6-5 arXiv roundup: SAM for free, FlashAttention, Supervised MAE
This newsletter made possible by MosaicML. Relatedly, do you have any friends who are {ML, cloud, platform} engineers and who might be open to a new…
Davis Blalock
Jun 6
2022-5-28 arXiv roundup: OptFormer, Imagen, Thinking step by step, 23 other papers
Huge haul of papers this week thanks to last week’s NeurIPS deadline. If I counted correctly, there are 26 summaries in here. As always, this newsletter…
Davis Blalock
May 30