I'm currently training a tiny [text-image] to [text-image] diffusion model and it's fascinating how as training epochs pass, letters and (maybe?) language start to emerge. What I'm doing is unlikely to yield anything useful in practice, but I feel more confident about the idea being feasible in practice now.