2021-9-19 arXiv roundup - OMPQ, Don't pretrain?, EfficientBERT, Primer
dblalock.substack.com
OMPQ: Orthogonal Mixed Precision Quantization
They figure out how many bits to use for different layers in <9 seconds by using a proxy objective. In the camp of "read this if and only if you care about quantization."
Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative