Anonymous
8/4/2025, 10:37:12 PM No.106142025
Why are we teaching our LLMs and similar AI models backwards? We give them the whole internet, or just Wikipedia and a bunch of high-quality books, and expect them to generalize. If the transformer architecture really is anything like a human brain, shouldn't we at least try teaching it like a child? When you want your kid to become a physicist you don't throw PhD-level physics textbooks at them and tell them to keep reading until they start understanding, so why are we doing exactly this shit when teaching LLMs and multimodal models?
We should be making models that can do the basic stuff a toddler can do and then scale up from there. Make the LLM understand cause and effect and spelling before handing it texts explaining quantum mechanics. No fucking wonder all LLMs and multimodal models are extremely prone to overfitting and need a billion times more training data than a human, when they learn the fact that trees are green by reading 20 fantasy novels and maybe catching that when a character describes a green forest they mean the leaves are green rather than, say, the grass or the moss. Testing LLMs on PhD-level problems and pitting them against the smartest highschoolers at math, while the hardest test for them is connecting fucking boxes in an image, is an absolute joke. Moravec's paradox probably isn't a law of the universe, and people pretending it is is why there's next to no progress in the field. Why did none of the greatest minds being offered billion-dollar salaries by Zuck think of teaching it simple verbal and spatial puzzles first and only then dumping the internet on it? And if they did, then why isn't there news about some dumb toddler model crushing the basic visual benchmarks that hypemen billionaires waste billions on trying to edge out an extra 2% improvement?
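For what it's worth, the "toddler data first" idea has a name: curriculum learning, where you score training samples by difficulty and feed the model easy ones before hard ones. A minimal sketch of the ordering trick, assuming a hand-rolled difficulty score; the names here (difficulty, curriculum_batches, train_step) are made up for illustration, not anyone's actual training pipeline:

```python
# Hypothetical sketch of curriculum learning: rank samples by a crude
# difficulty proxy and release them to the model in easy-to-hard stages.

def difficulty(text: str) -> float:
    """Crude proxy: longer sentences with longer (rarer) words count as harder."""
    words = text.split()
    if not words:
        return 0.0
    avg_word_len = sum(len(w) for w in words) / len(words)
    return len(words) * avg_word_len

def curriculum_batches(corpus, stages=3, batch_size=2):
    """Sort the corpus easy-to-hard and yield batches stage by stage."""
    ordered = sorted(corpus, key=difficulty)
    stage_size = max(1, len(ordered) // stages)
    for stage in range(stages):
        # earlier, easier samples stay in the mix as the pool grows
        pool = ordered[: stage_size * (stage + 1)]
        for i in range(0, len(pool), batch_size):
            yield stage, pool[i : i + batch_size]

def train_step(batch):
    # stand-in for an actual gradient update on the batch
    print("training on:", batch)

if __name__ == "__main__":
    corpus = [
        "The cat sat on the mat.",
        "Trees have green leaves.",
        "If you drop a glass, it breaks.",
        "Renormalization removes divergences from quantum field theories.",
        "The path integral formulation sums over all field configurations.",
    ]
    for stage, batch in curriculum_batches(corpus):
        print(f"stage {stage}:")
        train_step(batch)
```

Real curricula use better difficulty signals than word length (model loss, sequence length, human grading), but the ordering idea is the same: toddler sentences first, quantum mechanics last.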