2022-8-14 arXiv roundup: Branch-Train-Merge, Model patching, lots of LLM papers
dblalock.substack.com
This newsletter made possible by MosaicML.
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
They got training to be both way faster and way more modular, to the extent that it might change standard practice for training models.