2021-1-16: Grokking, Semantic segmentation with {BERT embeddings, only image-level labels}
dblalock.substack.com
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets
An OpenAI paper in which the authors train neural nets on problems that have exact algorithmic solutions. The Bayes error rate for these problems is zero, since every label is a deterministic function of the input, but the input-output relationships are not as smooth as in most traditional tasks, which makes them much harder to learn. The main finding is that long after the model has memorized the training set, it suddenly starts doing well on the validation set. The authors call this phenomenon "grokking". Not that actionable, but interesting work that thinks more deeply about the nature of intelligence than the typical deep learning paper.
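To make "small algorithmic dataset" concrete, here is a minimal sketch that assumes modular addition as the task. The function name, the modulus `p=7`, and the 50% training split are illustrative choices, not values taken from this summary. Every label is fully determined by its inputs, which is why the Bayes error is zero, and the validation set is just the part of the operation table the model never sees.

```python
import random


def make_modular_addition_dataset(p=7, train_fraction=0.5, seed=0):
    """Enumerate every (a, b, (a + b) % p) triple and split it at random.

    Hypothetical helper for illustration only; p, train_fraction and seed
    are assumed values.
    """
    # The full operation table: p * p examples, each with a deterministic label.
    examples = [(a, b, (a + b) % p) for a in range(p) for b in range(p)]

    # Shuffle with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(examples)

    # The first chunk is the training set; the rest is held out for validation.
    n_train = int(len(examples) * train_fraction)
    return examples[:n_train], examples[n_train:]


train, val = make_modular_addition_dataset()
```

A model can reach 100% training accuracy here by memorizing the training rows alone. Doing well on `val` requires it to have learned the underlying rule, which is the gap that grokking eventually closes.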