Interesting observation about Retentive Networks: https://twitter.com/ericzelikman/status/1682097753151660032?s=46&t=R1HcRy3wUpT5EYNQxGl8wg

They seem severely undertrained compared to other networks like Llama 2. I wonder whether they just converge a little faster early in training, which would explain the favorable performance compared to regular Transformers.

There might be a few corrections to make to the summary of "How is ChatGPT's behavior changing over time?". Please don't take this in bad faith; I read this newsletter and would like it to stay true to the facts.

> In many cases, GPT-4 got worse while GPT-3.5 got better.

You might not be aware that the claims of performance decreases seem to be misplaced, at least for the experiments investigated in the paper.

In particular, https://twitter.com/Si_Boehm/status/1681801371656536068 claims the LeetCode performance of the produced code got significantly better.

Similarly, https://twitter.com/tjade273/status/1682009691633614849 claims that when primality detection on 5-digit numbers is tested properly, the June version is either significantly better or about the same, depending on the setting.

> OpenAI’s APIs have *quietly* changed in quality a lot in the past few months

The paper investigates the difference between gpt-4-0314 and gpt-4-0613. The old version is to be supported until at least June 2024. Every OpenAI developer got an email introducing the new version.

The new Llama 2 (Meta/Facebook) is even more woke than ChatGPT (OpenAI/Microsoft).

Everybody agrees that ChatGPT-4 has gotten dumber over time.

Soon to be worthless.
