>>106159168
No company even pretrains on sequences longer than 8k or 4k tokens. They then do "length extension" with synthetic data, but the model is obviously not going to learn anything about writing full novels from that.