Anonymous
8/28/2025, 1:58:35 PM
No.106411437
I am still convinced that large-scale LLM training has been a mistake. Over time, the models have become better at modeling language, but more ignorant about trivia and other knowledge relevant for RP and storywriting, largely because of picrel. Post-training only mitigates some of its implications. Synthetic data, which may or may not have been used during pretraining, has little to do with it.