Oh man, we got some good stuff this week. But first, quick thanks to MosaicML for making this newsletter possible, as well as @andrew_n_carr, @dpaleka, @Muhtasham9, @code_star, @abhi_venigalla, and all the other nice people on Twitter who randomly recommended this newsletter this week. I normally just hit “publish” and don’t hear anything back, so it means a lot when people say they like it.
The Wide Attention paper is misleading. They only perform experiments on the tasks where even a bag-of-words model gets good performance. I got much worse performance when I tried this on other tasks.