>>8630350
On big pretrain runs, where you can't actually overfit the model, loss can be a pretty useful metric besides being an indicator that training is going smoothly (pic related). If you see shit like this, you can immediately tell that something is fucked up.
One shouldn't use it as a metric for tiny diffusion training runs at all, though.
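A minimal sketch of what catching that automatically could look like: flag any step whose loss jumps well above an exponential moving average of recent losses. The decay and spike threshold here are made-up example values, not anything from this post; tune them per run.

```python
# Sketch: flag loss spikes against an EMA of the smoothed loss.
# ema_decay and spike_factor are arbitrary example values.

class LossSpikeDetector:
    def __init__(self, ema_decay: float = 0.99, spike_factor: float = 2.0):
        self.ema_decay = ema_decay
        self.spike_factor = spike_factor
        self.ema = None  # smoothed loss, initialized on first step

    def update(self, loss: float) -> bool:
        """Return True if the current loss looks like a spike."""
        if self.ema is None:
            self.ema = loss
            return False
        is_spike = loss > self.spike_factor * self.ema
        self.ema = self.ema_decay * self.ema + (1.0 - self.ema_decay) * loss
        return is_spike


detector = LossSpikeDetector()
for step, loss in enumerate([0.9, 0.85, 0.8, 4.2, 0.78]):  # toy loss values
    if detector.update(loss):
        print(f"step {step}: loss {loss:.2f} spiked, go check the run")
```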
>>8630360
>Gradients
Why don't you actually try to look at them instead?
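For reference, actually looking at them in PyTorch is a few lines after backward(). A minimal sketch, assuming a standard nn.Module training loop; the print calls stand in for whatever logger you use:

```python
import torch

# Sketch: log per-parameter and global gradient norms after backward().
# `model` is whatever nn.Module you're training; nothing here is
# specific to diffusion.

def log_grad_norms(model: torch.nn.Module) -> float:
    total_sq = 0.0
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        norm = p.grad.detach().norm(2).item()
        total_sq += norm ** 2
        print(f"{name}: grad norm {norm:.4e}")  # swap for wandb/tensorboard
    total = total_sq ** 0.5
    print(f"global grad norm: {total:.4e}")
    return total

# usage, inside the training loop:
#   loss.backward()
#   log_grad_norms(model)
#   optimizer.step()
```

Exploding or dead (all-zero) norms in particular layers show up here long before the loss curve tells you anything.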