Convolutional Xformers for Vision
I'm not sure what to make of the overall efficacy of the approach, but the authors report lifts from 1) switching optimizers from AdamW to SGD during training, and 2) turning off RandAugment near the end of training. Both seem like actionable (if somewhat mysterious) optimizations.
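To make the two tricks concrete, here is a minimal sketch of an epoch-based schedule that decides which optimizer to use and whether RandAugment is still on. The function name `phase_for_epoch`, the `PhaseConfig` type, and the switch points (50% and 90% of training) are all hypothetical placeholders chosen for illustration; the summary above does not say when the paper makes either switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    optimizer: str         # "adamw" or "sgd"
    use_randaugment: bool  # whether RandAugment is applied this epoch


def phase_for_epoch(epoch, total_epochs, sgd_switch_frac=0.5, aug_off_frac=0.9):
    """Return the training settings for a 0-indexed epoch.

    The default fractions are assumptions, not values from the paper:
    switch AdamW -> SGD at the halfway point, and drop RandAugment
    for the final 10% of epochs.
    """
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must satisfy 0 <= epoch < total_epochs")
    progress = epoch / total_epochs
    optimizer = "sgd" if progress >= sgd_switch_frac else "adamw"
    use_randaugment = progress < aug_off_frac
    return PhaseConfig(optimizer, use_randaugment)
```

A training loop could call `phase_for_epoch` at the start of each epoch and rebuild the optimizer or the data transform only when the returned settings differ from the previous epoch's.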
2022-1-30: Xformers, ConvMixer…