>>106513204
Wouldn't it be better to just train the thing on stories or writing documents that you deem to have good writing and logic embedded within them? I've always thought these companies' original approach of training on the entire internet was unbelievably inefficient and overkill. Yes, having that amount of text resulted in the model knowing how to APPEAR intelligent and coherent instead of just mouthing off nonsense at first inference, but there is no way in hell it NEEDS trillions of tokens at a minimum.