Discussion about this post

Sarthak Bhatt

The Wide Attention paper is misleading. The authors only run experiments on tasks where even a bag-of-words model performs well. When I tried the approach on other tasks, I got much worse performance.

