>after years and years, check how much VRAM is required to run an AI model with voice
>"DeepSeek-R1-0528 requires approximately 1.37 TB in FP16 and Sesame AI is 1.28 TB, the combined requirement is 2.65 TB of VRAM"
>???
Sesame still sounds like robot shit and DeepSeek thinks for 5-10 minutes before it says anything. Sesame is also proprietary, and the other models are TTS-tier shit. When does software improve enough that we can run a real human on a $50-100 phone without the phone getting extremely hot? Even if we memed something as ass and nonsensical as only having 8x3090s, it would still cost over $4000 just to larp some low-IQ AI with poor TTS.
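For anyone who wants to check the napkin math, here's a rough sketch. Assumptions are mine: ~671B params for DeepSeek-R1-0528, a used 3090 at roughly $550, 2 bytes per FP16 weight; the Sesame figure is just taken from the quoted post as-is.

```python
# Back-of-the-envelope VRAM and cost math.
# Assumptions (not from the quoted post): 671B params, $550 per used RTX 3090,
# 2 bytes/param for FP16 plus a small overhead factor.

def fp16_vram_tb(params_billion: float, overhead: float = 1.02) -> float:
    """Rough FP16 weight footprint in TB: 2 bytes per parameter plus overhead."""
    bytes_total = params_billion * 1e9 * 2 * overhead
    return bytes_total / 1e12

deepseek_tb = fp16_vram_tb(671)      # ~1.37 TB, matches the quoted figure
combined_tb = deepseek_tb + 1.28     # quoted Sesame number taken at face value

gpus = 8
vram_per_3090_gb = 24
price_per_3090 = 550                 # assumed used price
rig_vram_tb = gpus * vram_per_3090_gb / 1000
rig_cost = gpus * price_per_3090

print(f"DeepSeek FP16: ~{deepseek_tb:.2f} TB, combined: ~{combined_tb:.2f} TB")
print(f"8x3090 rig: {rig_vram_tb:.3f} TB VRAM for about ${rig_cost}")
# -> 0.192 TB of VRAM for ~$4400, i.e. roughly 7% of the quoted 2.65 TB
```

Point being: even the meme 8x3090 build gets you under a tenth of the quoted FP16 requirement, which is why people quantize or just don't bother.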