2021-9-12 arXiv roundup
⭐ Bag of Tricks for Optimizing Transformer Efficiency Really nice paper full of practical improvements.
Frustratingly Simple Pretraining Alternatives to Masked Language Modeling Their alternatives mostly just tie MLM, but a couple of plots in the appendix show predicting the first letter of masked words doing way better. Needs a more detailed reading.
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models To do a better job of going from keywords to a caption, they go keywords -> Google Images results -> captions from a pretrained captioning network -> final caption. Another interesting datapoint in the theme of leaning on external pretrained models to get better results. In this case, the “model” is Google Image Search.
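To make the shape of that pipeline concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `search_images`, `caption_image`, and `generate_text` are hypothetical callables standing in for the image search, the pretrained captioner, and the final text generator, and `top_k` is a made-up knob. It is not the paper's code.

```python
from typing import Callable, List


def retrieve_caption_generate(
    keywords: List[str],
    search_images: Callable[[str], List[str]],   # hypothetical: query -> image references
    caption_image: Callable[[str], str],         # hypothetical: image reference -> caption
    generate_text: Callable[[List[str], List[str]], str],  # hypothetical final generator
    top_k: int = 3,                              # assumed number of images to keep
) -> str:
    """keywords -> image search results -> captions -> final text."""
    query = " ".join(keywords)
    image_refs = search_images(query)[:top_k]
    # Captions from the external pretrained model act as extra grounding context.
    captions = [caption_image(ref) for ref in image_refs]
    return generate_text(keywords, captions)


# Toy stand-ins so the sketch runs end to end without any network access.
def fake_search(query: str) -> List[str]:
    return [f"img://{query}/{i}" for i in range(5)]


def fake_caption(ref: str) -> str:
    return f"a photo for {ref}"


def fake_generate(keywords: List[str], captions: List[str]) -> str:
    return f"{' '.join(keywords)} | {len(captions)} captions"


result = retrieve_caption_generate(
    ["dog", "frisbee", "park"], fake_search, fake_caption, fake_generate
)
```

The point of passing the three stages in as callables is that any of them can be swapped for a different pretrained component without touching the rest.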
C-MinHash: Rigorously Reducing K Permutations to Two I don't think most deep learning people care about MinHash, but a lot of other people do. Plus, this is an example of a pretty strong algorithms paper for anyone curious what those look like.
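For anyone who hasn't seen MinHash, here is a rough sketch of the idea in the title, as I understand it: classic MinHash draws K independent permutations, while the circulant version draws one initial permutation plus a second one that gets reused K times with a circular shift. The function names, the shift direction, and the assumption that `k <= dim` are mine, not taken from the paper's code.

```python
import random
from typing import List, Set


def classic_minhash(items: Set[int], dim: int, k: int, seed: int = 0) -> List[int]:
    """Classic MinHash: one fresh random permutation per hash value."""
    rng = random.Random(seed)
    sketch = []
    for _ in range(k):
        perm = list(range(dim))
        rng.shuffle(perm)
        sketch.append(min(perm[i] for i in items))
    return sketch


def circulant_minhash(items: Set[int], dim: int, k: int, seed: int = 0) -> List[int]:
    """Two-permutation variant (sketch): sigma shuffles the data once,
    then pi is reused k times, circularly shifted by 1..k positions."""
    rng = random.Random(seed)
    sigma = list(range(dim))
    rng.shuffle(sigma)
    pi = list(range(dim))
    rng.shuffle(pi)
    permuted = [sigma[i] for i in items]
    return [
        min(pi[(j + shift) % dim] for j in permuted)
        for shift in range(1, k + 1)
    ]


def estimate_jaccard(sketch_a: List[int], sketch_b: List[int]) -> float:
    """Fraction of matching hash values estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sketch_a, sketch_b)) / len(sketch_a)


a = {1, 2, 3, 4, 5, 6}
b = {4, 5, 6, 7, 8, 9}  # true Jaccard similarity with a is 3/9
estimate = estimate_jaccard(
    circulant_minhash(a, dim=64, k=32), circulant_minhash(b, dim=64, k=32)
)
```

Both sets must be hashed with the same seed so they see the same permutations. The practical appeal is storage: two permutations of length `dim` instead of K of them.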