Anonymous
8/14/2025, 11:11:49 AM
No.106256433
>>106256360
Who says they actually did true knowledge distillation, though? What if they just used larger models to generate high-quality synthetic data, which many people also call distillation? In that case, the more compute you put in (for example, with iterative refinement), the better the data can potentially get, even exceeding what the original model produces on a single pass.
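To make the distinction concrete, a minimal PyTorch sketch (variable names and temperature are made up for illustration): true KD trains the student against the teacher's full logit distribution, while the synthetic-data route only ever trains on hard tokens the teacher happened to sample.

import torch.nn.functional as F

T = 2.0  # distillation temperature (hypothetical value)

def true_kd_loss(student_logits, teacher_logits):
    # "True" knowledge distillation: match the teacher's soft
    # output distribution via KL divergence on temperature-scaled logits.
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

def synthetic_data_loss(student_logits, teacher_token_ids):
    # "Distillation" in the looser sense: plain cross-entropy on
    # text the teacher generated. Only hard labels, no access to logits,
    # so data quality (not the teacher's distribution) is the ceiling.
    return F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        teacher_token_ids.view(-1),
    )

The second path needs no access to the teacher's weights or logits at all, which is why extra compute spent refining the generated text can keep improving it.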