2022-3-5 arXiv roundup: 5-bit training, What pretraining data to use, Expanding training sets via ML
dblalock.substack.com
This newsletter made possible by MosaicML (look at our shiny new website).

Full Stack Optimization of Transformer Inference: a Survey

Besides having a ton of links to relevant papers like most surveys, this paper also does a lot of profiling of transformer inference as a workload.