What about a hybrid method?
Render simple, primitive meshes in real time, then use what is essentially a glorified inpainting model to generate the lighting and details. You could even (slowly) render a more detailed "reference" image and feed it to the model to check that things look right.
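
To make the idea a bit more concrete, here is a rough sketch of the loop I have in mind. Everything in it is a made-up stand-in: `render_proxy`, `render_reference`, `apply_lighting`, and `run_frame` are hypothetical names, images are plain lists of grayscale floats, and the `enhance` callback is where a real generative model would go. It only shows the control flow (cheap render every frame, occasional check against a slow reference), not any actual AI.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image, values in [0, 1]


def render_proxy(width: int, height: int) -> Image:
    """Stand-in for the fast real-time pass: a flat-shaded square on black."""
    img = [[0.0] * width for _ in range(height)]
    for y in range(height // 4, 3 * height // 4):
        for x in range(width // 4, 3 * width // 4):
            img[y][x] = 0.5
    return img


def apply_lighting(img: Image) -> Image:
    """Toy 'lighting': brighten the top of the image, darken the bottom."""
    height = len(img)
    return [
        [pixel * (1.5 - y / (height - 1)) for pixel in row]
        for y, row in enumerate(img)
    ]


def render_reference(width: int, height: int) -> Image:
    """Stand-in for the slow, detailed render used only for checking."""
    return apply_lighting(render_proxy(width, height))


def mean_abs_error(a: Image, b: Image) -> float:
    """Average per-pixel absolute difference between two same-sized images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def run_frame(
    enhance: Callable[[Image], Image],
    frame_index: int,
    check_every: int = 30,
    tolerance: float = 0.01,
    size: int = 16,
) -> Tuple[Image, bool]:
    """Render one frame; every `check_every` frames, compare to the reference.

    Returns the enhanced image and a flag saying whether the check failed.
    """
    proxy = render_proxy(size, size)
    output = enhance(proxy)  # a real generative model would be called here
    drifted = False
    if frame_index % check_every == 0:
        reference = render_reference(size, size)
        drifted = mean_abs_error(output, reference) > tolerance
    return output, drifted


if __name__ == "__main__":
    # An "enhancer" that does nothing, to show the reference check firing.
    identity = lambda img: [row[:] for row in img]
    _, drifted = run_frame(identity, frame_index=0)
    print("needs correction:", drifted)
```

What to do when the check fails (re-prompt the model, blend in the reference, fall back to the proxy) is left open here, since that depends entirely on the model you end up using.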