2022-8-7 arXiv roundup: Adam and sharpness, Recursive self-improvement for coding, Training and model tweaks
dblalock.substack.com
This newsletter made possible by MosaicML. Little bit of a slow week on arXiv, but we have some interesting science-of-deep-learning work and some actionable training tweaks.

Adaptive Gradient Methods at the Edge of Stability

Adam doesn’t train at the edge of stability, but it does train at the edge of