Search Results
8/8/2025, 12:37:13 PM
UGI results for gpt-5 are out: near-best coding and NatInt scores, but the overall UGI score is mid. The low W/10 score combined with high intelligence indicates heavy refusals.
>W/10: Willingness/10. A more narrow subset of the UGI questions, solely focused on measuring how far a model can be pushed before going against its instructions or refusing to answer.