2021-9-12 arXiv roundup

May 03, 2022

⭐ Bag of Tricks for Optimizing Transformer Efficiency

Really nice paper full of practical improvements.

Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

Their alternative objectives mostly just tie MLM, but a couple of plots in the appendix show predicting the first letter of masked words doing way better. Needs more detailed reading.
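To make the first-letter objective concrete, here is a rough sketch of how the training targets could be built. This is my guess at the setup, not the paper's code: the 27-way label space (a-z plus an "other" bucket), the `[MASK]` string, the `-100` ignore label, and the function name `first_letter_targets` are all assumptions for illustration.

```python
import string

MASK = "[MASK]"  # assumed mask token string
# Assumed label space: 26 lowercase letters plus one "other" bucket.
CLASSES = list(string.ascii_lowercase) + ["<other>"]
CLASS_TO_ID = {c: i for i, c in enumerate(CLASSES)}
IGNORE = -100  # assumed "no loss at this position" label


def first_letter_targets(words, masked_positions):
    """Replace chosen words with MASK and label each with its first letter.

    Unmasked positions get IGNORE so a loss function can skip them.
    """
    masked = set(masked_positions)
    inputs, labels = [], []
    for i, word in enumerate(words):
        if i in masked:
            first = word[:1].lower()
            inputs.append(MASK)
            # Anything that is not a-z (digits, punctuation) goes to "other".
            labels.append(CLASS_TO_ID.get(first, CLASS_TO_ID["<other>"]))
        else:
            inputs.append(word)
            labels.append(IGNORE)
    return inputs, labels


inputs, labels = first_letter_targets(["The", "cat", "sat", "42"], [1, 3])
```

The point of the sketch is that the output layer shrinks from vocabulary size to a few dozen classes, which is presumably where the "frustratingly simple" part comes from.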

Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models

To get better at going from keywords to a caption, they go keywords -> Google Images results -> captions from a pretrained captioning network -> final caption. Another interesting datapoint in the theme of leaning on external pretrained models to get better results. In this case, the “model” is Google image search.

C-MinHash: Rigorously Reducing K Permutations to Two

I don't think most deep learning people care about MinHash, but a lot of people do. Plus, this is an example of a pretty strong algorithms paper for anyone curious what those look like.
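For anyone who hasn't seen MinHash, here is a minimal sketch of the classic K-permutation version next to a two-permutation variant in the spirit of the title. The second function is my reading of the idea (one permutation to shuffle the data, a second one reused under K circular shifts), not a verified reimplementation of the paper; the function names and the shift convention are assumptions.

```python
import random


def minhash_k_perms(items, universe_size, k, seed=0):
    """Classic MinHash: one independent random permutation per hash."""
    rng = random.Random(seed)
    signature = []
    for _ in range(k):
        perm = list(range(universe_size))
        rng.shuffle(perm)
        signature.append(min(perm[i] for i in items))
    return signature


def minhash_two_perms(items, universe_size, k, seed=0):
    """Two-permutation sketch: sigma shuffles the data once, then pi is
    reused k times, each time under a different circular shift (assumed)."""
    rng = random.Random(seed)
    sigma = list(range(universe_size))
    rng.shuffle(sigma)
    pi = list(range(universe_size))
    rng.shuffle(pi)
    shuffled = [sigma[i] for i in items]
    return [
        min(pi[(j + shift) % universe_size] for j in shuffled)
        for shift in range(1, k + 1)
    ]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching hash slots, used as the Jaccard estimate."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


a = {1, 2, 3, 4, 5, 6}
b = {4, 5, 6, 7, 8, 9}
sig_a = minhash_two_perms(a, universe_size=100, k=64)
sig_b = minhash_two_perms(b, universe_size=100, k=64)
estimate = estimate_jaccard(sig_a, sig_b)  # true Jaccard here is 3/9
```

Both sets have to be hashed with the same seed so they see the same permutations. The practical appeal of the second version is storing two permutations instead of K.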

Davis Summarizes Papers
