2022-11-27 arXiv roundup: Multimodal retrieval, int8 and int4 LLM quantization
dblalock.substack.com
This newsletter made possible by MosaicML.

Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers

They propose randomly dropping some fraction of the tokens for each transformer block except the first and last ones.
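To make the token-dropping idea concrete, here is a minimal NumPy sketch. It is not the paper's implementation. The function names, the `keep_fraction` parameter, and the choice to let dropped tokens bypass a block unchanged and rejoin the sequence afterward are all assumptions made for illustration; the blocks are stand-in callables rather than real transformer layers.

```python
import numpy as np

def random_token_drop(hidden, keep_fraction, rng):
    """Pick a random subset of token positions to keep (hypothetical helper).

    hidden: array of shape (seq_len, d_model).
    Returns the kept rows and their sorted positions.
    """
    seq_len = hidden.shape[0]
    n_keep = max(1, int(round(seq_len * keep_fraction)))
    keep_idx = np.sort(rng.choice(seq_len, size=n_keep, replace=False))
    return hidden[keep_idx], keep_idx

def apply_blocks_with_dropping(hidden, blocks, keep_fraction, rng):
    """Run blocks in order, dropping tokens for every block except the first and last."""
    last = len(blocks) - 1
    for i, block in enumerate(blocks):
        if i == 0 or i == last:
            # First and last blocks see the full sequence.
            hidden = block(hidden)
        else:
            # Assumption: dropped tokens skip this block and are merged back unchanged.
            kept, keep_idx = random_token_drop(hidden, keep_fraction, rng)
            out = hidden.copy()
            out[keep_idx] = block(kept)
            hidden = out
    return hidden
```

Each middle block draws a fresh random subset, so with `keep_fraction=0.5` a middle block processes roughly half the tokens per step, which is where the compute saving would come from in this sketch.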