>>106378307
it was to be expected, since it was a different dataset mix. training loss is a bit of a moving target when the mix changes. validation got hit too, but I think that's just because my validation set is only a single document that's slightly out of domain. the actual generations have been consistently getting a bit better though.