What is best of n sampling? I didn’t follow. Is that like, which of these n imag...

alextp · on April 13, 2022

Choose the top N according to the proxy objective and then use the real objective to choose the best out of those N candidates.

jmalicki · on April 15, 2022

That was my initial understanding, which left me confused.

But they're taking the top n according to the model, then taking the top according to the proxy, not actual, objective. This avoids the Winner's Curse problem of top model ranking with reasonable probability.

They are then comparing this to the highest scoring actual preference.