Archive - Davis Summarizes Papers

2024-8-4 arXiv roundup: LLama 3.1, training a 100T biological neural net

In case you’re wondering what I’ve been up to instead of posting for the past couple months, I was kicking off a training run for a 100T parameter…

Aug 5, 2024

2024-8-25: Scaling curves for All of the Things

Good news: we got a bunch of important findings this week.

Aug 26, 2024

2024-4-7 arXiv roundup: DBRX, Backlog highlights part 1

It’s good to be back.

Apr 8, 2024

2023-11-19 arXiv roundup: Inverse-free inverse Hessians, Faster LLMs, Closed-form diffusion

A fundamental result in queueing theory is that, if items enter the queue faster than they’re processed, the length of the queue tends to infinity.

Nov 19, 2023

2023-9 arXiv roundup: A bunch of good ML systems and Empirical science papers

Got behind the curve again, and it ended up taking me more than a week to catch up.

Oct 6, 2023

2024-4-28 arXiv roundup: data and scaling, backlog highlights part 3

Besides getting to cover unusually interesting work, the upside of having a big backlog is that you can group your coverage thematically.

Apr 29, 2024

2024-4-14 arXiv roundup: backlog highlights part 2

Bunch of interesting stuff this week. Before we jump in, one quick clarification from last week: I mentioned how it was an interesting marketing lesson…

Apr 15, 2024

2023-11-26 arXiv roundup: Big potential wins, 1 bit per parameter, Simplifying transformers

Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication

Dec 1, 2023

2023-7-23 arXiv roundup: OpenAI breaking changes, Much better attention and image captions

This newsletter made possible by MosaicML.

Jul 25, 2023

2023-10-16 arXiv roundup: Cornucopia of easy (claimed) wins for LLMs

Also, I was on the AI Stories podcast!

Oct 17, 2023

2023-8 arXiv roundup: Look I gave a talk, SILO-ing language models, lots of MoE + tool use papers

(I’m still planning on doing weekly installments in general—I just got behind and it took a while to catch up).

Sep 1, 2023

2023-7-9 arXiv roundup: LLMs ignore the middle of their context, MoE + instruction tuning rocks

This newsletter made possible by MosaicML.

Jul 11, 2023
