2 Comments

I don't think the authors claim LIMA outperforms GPT-4, Claude, or Bard: "responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases"

Ah, good catch. I read the colors in the legend backwards and thought they were *winning* 57% of the time vs GPT-4. Should be fixed now.
