Davis Summarizes Papers
2022-3-6: N:M sparse attention, Rethinking demonstrations, Shift instead of attention
Davis Blalock
May 3, 2022
Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models