>>105886333 (OP)
most of the recent AI gains have come from reinforcement learning, which doesn't need more example data. it just needs a signal for which output is better or worse. for subjective measures you can still get that signal from humans relatively cheaply, and for objective ones you can compute it automatically, so your only real limit becomes how much compute you have to spare. that's why the biggest recent model gains have been in objectively verifiable domains like math and coding.
data in the end is just a bootstrap for the process. if it was ALL you relied on, you wouldn't be able to surpass the performance of the best examples in the data, which recent models have been doing for a while.
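
toy sketch of the idea, if it helps. this is NOT how any lab actually trains models, just a bandit-style weight update i made up to show the shape of it. the names (`verify`, `STRATEGIES`, `train`) and the starting weights are all hypothetical. the "policy" starts out biased toward wrong strategies (a bad bootstrap), and the only feedback it ever gets is an automatic right/wrong check, no extra labeled examples:

```python
import random

def verify(a, b, answer):
    # objective check: no human label needed, the result is just recomputed
    return 1.0 if answer == a + b else 0.0

# hypothetical candidate behaviours the toy policy can pick from
STRATEGIES = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def train(steps=500, lr=0.1, seed=0):
    rng = random.Random(seed)
    # start biased toward the wrong strategies, like a weak bootstrap prior
    weights = {"add": 1.0, "sub": 5.0, "mul": 5.0}
    names = list(weights)
    for _ in range(steps):
        # operands start at 3 so neither a - b nor a * b can equal a + b
        a, b = rng.randint(3, 9), rng.randint(3, 9)
        name = rng.choices(names, weights=[weights[n] for n in names])[0]
        reward = verify(a, b, STRATEGIES[name](a, b))
        # boost whatever got rewarded, decay whatever didn't
        weights[name] *= (1 + lr) if reward else (1 - lr)
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

if __name__ == "__main__":
    print(train())
```

the policy ends up putting nearly all its probability on "add" even though the starting weights favoured the wrong answers, and the only thing that scales it is how many steps (compute) you run.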