This newsletter made possible by MosaicML.

GPT-4 Technical Report
This is a 98-page document, so we’re just gonna go through some highlights. First, scaling is still going strong. We haven’t saturated the log-log-linear trend yet. This holds not just for the pretraining…
Pretraining BERT from Scratch for $20
We trained an optimized BERT model to match the results from the…
Full Stack Optimization of Transformer Inference: a Survey
Besides having a…
Poisoning Web-Scale Training Datasets is Practical
You can poison public datasets by buying domains…
This one’s a little delayed because it turns out combing through 1384 papers takes a while.
LUT-NN: Towards…
There were over 800 arXiv papers this week thanks to the ICML deadline, so this one ended up a little…
Training Stable Diffusion from Scratch Costs &lt;$160k
We showed you could train your own Stable Diffusion model…
Looks like we’ve got a little bit of a pre-ICML lull this week. But next week might be crazy… Learning…

Davis Summarizes Papers