
llama.cpp CUDA dev !!yhbFjk57TDr/g/105734070#105736983
6/28/2025, 10:03:12 PM
>>105734070
I'm starting to think making language models play chess against each other was a bad idea.
With my current setup I'm making them play chess960 against themselves where they have to choose a move from the top 10 Stockfish moves.
The goal is for the models to select from the top 1,2,3,... moves more frequently than a random number generator.
If I just give them the current board state in Forsyth–Edwards Notation, none of the models I've tested are better than RNG, probably due to tokenizer issues.
If you baby the models through the notation, they eventually seem to develop some capability at the size of Gemma 3 27b: that model selects from the top 2 Stockfish moves 25% of the time.
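For context, a player picking uniformly at random from 10 candidate moves lands in the top 2 only 20% of the time, so 25% is a real (if small) edge over the RNG baseline. A minimal sketch of that baseline check (function names are illustrative, not from my actual harness):

```python
import random

# Exact baseline: picking uniformly from `candidates` moves
# lands in the top k with probability k / candidates.
def random_top_k_rate(k: int, candidates: int = 10) -> float:
    return k / candidates

# Monte Carlo sanity check of the same baseline.
def simulate_random_player(k: int, candidates: int = 10,
                           trials: int = 100_000, seed: int = 0) -> float:
    rng = random.Random(seed)
    hits = sum(rng.randrange(candidates) < k for _ in range(trials))
    return hits / trials

print(random_top_k_rate(2))       # 0.2 exactly
print(simulate_random_player(2))  # ~0.2
```

Anything a model scores above that line on a decent sample is selection ability rather than luck.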
My cope is that maybe I'll still get usable separation between weak models if their opponent is also weak; Stockfish assumes that the opponent is going to punish bad moves, but if both players are bad maybe it will be fine.
Otherwise I'll probably need to find something else to evaluate the llama.cpp training code with.