Well I did some moar testing on this FP32 CLIP.
Seems to have potential imo. There is also another FP32 Illustrious CLIP floating around that I want to get around to testing.
Not shown here: I also tested another FP16 CLIP, which behaved more like the FP16 CLIP inside the model (admittedly, I don't have a huge sample size for that).
Whatever "restoration" this guy did most probably has a significant effect on quality too, but I believe the text encoder itself benefits from FP32 precision. This is further evidenced by the change seen here when loading the FP32 CLIP as FP16 >>106187649. (Conversely, the image also changes when you load the FP16 CLIP as FP32, though not necessarily for better or worse.)
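If anyone wants to sanity check the precision part without touching a UI, here's a rough stdlib-only Python sketch. It's a toy, not how any UI actually loads CLIP: the "weights" are random numbers I made up, and `to_fp16`, `to_fp32` and `dot` are hypothetical helpers. It only shows two things: casting FP32 down to FP16 changes the stored values, and casting FP16 up to FP32 keeps the values exact but changes the rounding during the math, which would fit the image changing in both directions.

```python
import random
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, rnd):
    """Dot product where every intermediate result is rounded by `rnd`."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = rnd(acc + rnd(x * y))
    return acc


random.seed(0)

# Made-up stand-in for FP32 text encoder weights (assumption: small values).
w32 = [to_fp32(random.gauss(0.0, 0.02)) for _ in range(4096)]
# "Loading the FP32 CLIP as FP16": every weight gets rounded.
w16 = [to_fp16(w) for w in w32]

# Casting down loses information in the stored weights...
max_cast_err = max(abs(a - b) for a, b in zip(w32, w16))
# ...while casting FP16 up is exact, since every FP16 value is also an FP32 value.
upcast_exact = all(to_fp32(w) == w for w in w16)

# Made-up input activations, stored as FP16.
x = [to_fp16(random.gauss(0.0, 1.0)) for _ in range(4096)]
out_fp16_math = dot(w16, x, to_fp16)  # FP16 weights, FP16 arithmetic
out_fp32_math = dot(w16, x, to_fp32)  # same FP16 weights, FP32 arithmetic

print("max error from FP32 -> FP16 cast:", max_cast_err)
print("FP16 -> FP32 cast is exact:", upcast_exact)
print("FP16 math:", out_fp16_math, "| FP32 math:", out_fp32_math)
```

The two dot products will usually come out slightly different even though the weights are identical, because the rounding happens at every step of the sum. None of this says which result looks better, just that neither cast is a no-op for the final output.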