2024-8-4 arXiv roundup: Llama 3.1, training a…

Aug 5, 2024

In case you’re wondering what I’ve been up to instead of posting for the past couple of months, I was kicking off a training run for a 100T-parameter biological neural network:

2 Comments

Yinxi

Aug 6, 2024

Congrats on the new addition to the family, and also a more dynamic training round! Insights on Llama 3.1 are very helpful as usual. I am mostly impressed by the simple architecture chosen to maximize training stability and the heavy use of synthetic data in post-training.

Tim Dingman

Aug 5, 2024

Just started training my second 100T model and haven't had time to read the Llama 3.1 paper. Thanks for putting in the work 🙂

Davis Summarizes Papers
