
Thread 106355930

42 posts 16 images /g/
Anonymous No.106355930 >>106357424 >>106359786 >>106360465 >>106360727 >>106361421 >>106361761 >>106362028 >>106363442
https://developer.nvidia.com/blog/inside-nvidia-blackwell-ultra-the-chip-powering-the-ai-factory-era/
Anonymous No.106357424 >>106358945 >>106359134
>>106355930 (OP)
all of that just to multiply matrices at 4-bit precision (16 possible values)
but hey that's 100 gorillions of them
Anonymous No.106358945
>>106357424
Umm acktually it's heckin sentient, luddite.
Anonymous No.106359134
>>106357424
Making electronics to multiply matrices is not easy.
Anonymous No.106359786
>>106355930 (OP)
It's not that complicated. Nvidia are just the biggest patent trolls in the world.
Anonymous No.106359928 >>106361421 >>106361426 >>106361835
spot the difference
Anonymous No.106360436 >>106360496 >>106360581 >>106361797
Just think, in 10 years this will all be obsolete.
Anonymous No.106360465 >>106361402
>>106355930 (OP)
why does it have two nvdecs but not a single nvenc
Anonymous No.106360496 >>106360515 >>106361999
>>106360436
Name one technology from 2015 that is now obsolete.
Anonymous No.106360515 >>106360542
>>106360496
AOL is shutting down dial up iirc.
Anonymous No.106360542
>>106360515
>Dial-up Internet access has existed since the 1980s via public providers such as NSFNET-linked universities in the United States.
That's not technology from 2015.
Anonymous No.106360581
>>106360436
the mid 00s were worse. shit was obsolete before it even left the factory. you had literal e-waste in just 2 years.
Anonymous No.106360727 >>106361640 >>106361679
>>106355930 (OP)
this thing has 192 rt cores. who even uses that shit?
Anonymous No.106361402
>>106360465
for ai you still want to train on video, so decoding is needed but encoding isn't.
also gb300 has 7 nvdecs, not 2.
https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new
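rough sketch of what that decode-only pipeline looks like (plain torchvision, made-up file name; whether the decode actually lands on NVDEC depends on how your build is configured):

import torch
import torchvision.io as io

def clip_for_training(path: str, clip_len: int = 16) -> torch.Tensor:
    # decode-only path: frames come out as a (T, H, W, C) uint8 tensor.
    # nothing ever gets encoded back to video, which is why nvenc is dead weight here
    video, _audio, _info = io.read_video(path, pts_unit="sec")
    clip = video[:clip_len].permute(0, 3, 1, 2).float() / 255.0  # (T, C, H, W), scaled to 0..1
    return clip

# hypothetical usage: batch = clip_for_training("some_clip.mp4")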
Anonymous No.106361421 >>106361443
>>106355930 (OP)
>>106359928
yeah this is just 7900xtx but bigger
Anonymous No.106361426
>>106359928
spot the difference
Anonymous No.106361443 >>106361690
>>106361421
They're all the same, dumbfuck. Nvidia is just more expensive.
Anonymous No.106361640 >>106361774
>>106360727
wrong.
nvidia datacenter chips have zero rt cores.
so naturally none are used. there's nothing to ray trace when you don't output graphics.
Anonymous No.106361679
>>106360727
rt cores are in tpc. blackwell ultra only has gpc.
compare to some consumer chip.
Anonymous No.106361690 >>106361793 >>106361823
>>106361443
they should compare $/tflops instead. the GB200 is just two H100s put together on the same chip, so it's 2x more expensive per chip.

Also realistically, everybody wants to use cuda because it's the industry standard for training and inference. competitors are only competing on inference, and most tools (debuggers, profilers, etc.) are written for cuda, so rocm and the rest are always second-class citizens.
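if anyone wants to actually run the $/tflops math, here's a back-of-the-envelope sketch; the prices are rough assumptions, not quotes, and the BF16 numbers are the dense figures cited downthread plus H100's datasheet value:

# assumed street prices (USD) and dense BF16 tensor TFLOPS; both are ballpark
cards = {
    "H100":     (30_000,  989),
    "B200":     (35_000, 2250),
    "RTX 5090": ( 2_000,  209.5),
}

for name, (usd, tflops) in cards.items():
    print(f"{name:9s} ~${usd / tflops:,.0f} per BF16 TFLOP")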
Anonymous No.106361761
>>106355930 (OP)
I was surprised to see the 5090's theoretical BF16 TFLOPS at just 209.5. That's not even 10% of server Blackwell (B200 is 2250, GB200 is 2500). A B200 costs around $30-40k per GPU, so they are pretty close in performance per dollar.
Starting with the 4090, NVIDIA limits the performance of tensor cores on gaming cards, specifically for ops that might be used in ML training. FP8 and FP16 matmuls run at full speed if accumulating in FP16 (I've never seen anyone use this), but only half speed when accumulating in FP32. This restriction is not present for lower-precision matmuls like FP4, and it is removed entirely on workstation-class cards like the RTX Pro 6000.

It doesn't seem worth it to use NVIDIA gaming cards as a "cheaper FLOPs" alternative anymore (e.g. diffusion models could have been cheaper to run on a 3090 than an A100). They are generous with memory bandwidth though; nearly 2 TB/s on the 5090 is amazing!
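if you want to see that accumulate penalty on your own card, here's a crude probe; note pytorch's allow_fp16_reduced_precision_reduction flag only permits FP16 accumulation, cuBLAS still picks the kernel, so treat the numbers as indicative rather than proof:

import time
import torch

def bench_fp16_matmul(allow_fp16_accum: bool, n: int = 8192, iters: int = 50) -> float:
    # toggle whether FP16 GEMMs are allowed to accumulate in reduced precision
    torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = allow_fp16_accum
    a = torch.randn(n, n, device="cuda", dtype=torch.half)
    b = torch.randn(n, n, device="cuda", dtype=torch.half)
    for _ in range(5):  # warmup
        a @ b
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - t0
    return 2 * n**3 * iters / elapsed / 1e12  # achieved TFLOPS

print("fp16 accumulate allowed:", bench_fp16_matmul(True))
print("fp32 accumulate forced: ", bench_fp16_matmul(False))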
Anonymous No.106361774 >>106361841
>>106361640
https://www.techpowerup.com/gpu-specs/nvidia-gb202.g1072
Anonymous No.106361793 >>106361810
>>106361690
>Also realistically, everybody wants to use cuda, because it's the industry standard for training and inference, competitors are only competing on inference and most tools (debuggers, profilers, etc..) are written for cuda
bullshit.

google is totally fine with using jax and training on tpus.
Anonymous No.106361797
>>106360436
half a year you mean
not because of AI of course, that won't be notable for decades, but because nvidia will have a new 5% better version to keep the shovel sales up
Anonymous No.106361810 >>106361823 >>106361895
>>106361793
SAAR

https://www.cnbc.com/2025/08/21/google-scores-six-year-meta-cloud-deal-worth-over-10-billion.html
Anonymous No.106361823
>>106361810
meant for >>106361690
Anonymous No.106361835
>>106359928
Is this a troll? If you change one transistor, the design becomes completely different.
Anonymous No.106361841 >>106361887
>>106361774
gb202 is rtx 5090, a consumer chip, and looks like this.
b300 uses a different chip.
Anonymous No.106361887 >>106361922
>>106361841
yeah so who uses 192 rt cores?
Anonymous No.106361895 >>106361954
>>106361810
google cloud offers nvidia gpus for customers.
https://cloud.google.com/blog/products/compute/google-cloud-goes-to-nvidia-gtc
meta's llamas were trained on nvidia gpus, not tpus.
Anonymous No.106361922 >>106361986 >>106362077
>>106361887
the games of the future won't be rasterized but fully path-traced. you will have to use all the rt cores you can get.
Anonymous No.106361954 >>106362030
>>106361895
https://huggingface.co/blog/keras-llama-32
Anonymous No.106361986
>>106361922
>of the future
who is buying them NOW? scalpers were getting them for $10k when they came out.
Anonymous No.106361999
>>106360496
you still use a gpu from 2015 nowadays champ? or are you just being purposely obtuse? lemme see your gtx980 gpuz window.
Anonymous No.106362028
>>106355930 (OP)
Gotta love how NVIDIA killed NVLink on consumer cards just to force people to pay $10000+++ for enterprise cards if they want it back.
Anonymous No.106362030 >>106362071 >>106362121
>>106361954
that's the keras dev, a google employee, porting meta's llama to google's keras.

meta is a pytorch shop since pytorch is meta's.
in the meantime, google also gave up on keras and that keras dev isn't employed by google anymore.
Anonymous No.106362071
>>106362030
Speaking of Keras
Anonymous No.106362077
>>106361922
>of the future
yeah and that future is still far off. Path tracing even on a 5090 is still expensive not just in frames but also in your fucking wallet. At some point, yes this will be cheap enough for everyone to enjoy. But you can say that about nearly every technology. At some point they will work and be cheap. Such a dumb statement to make.
Anonymous No.106362121 >>106362270
>>106362030
https://github.com/keras-team/keras/discussions/20217
last keras update was a week ago

https://docs.pytorch.org/xla/release/r2.8/index.html

also why would meta need gcp other than for tpus? it's already spending 20x that contract on gpus.
Anonymous No.106362270 >>106362614
>>106362121
I have been convinced by a friend working at Google that TPUs are way better than GPUs when deployed at scale. A big part of the reason is that GPUs have a huge configuration space across software/firmware/hardware, so on a large deployment of GPUs provisioned over time you end up with a mess of different drivers, motherboards, CUDA versions, and hardware revisions. A significant part of your cluster ends up down while someone wrangles that mess. In a sense TPUs are simpler, as they are special-purpose computers. Another interesting design in the same spirit is Groq's LPU: very predictable hardware behaviour enables impressive compiler optimization and high efficiency.
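the config-drift point is easy to picture: fingerprint every node and count distinct combos. toy sketch (single node here; the node list and the fleet tooling that would gather it are imaginary):

from collections import Counter
import torch

def node_fingerprint() -> tuple:
    # the kind of tuple you'd collect from every box in the fleet
    return (
        torch.version.cuda,                   # CUDA version torch was built against
        torch.cuda.get_device_name(0),        # GPU model
        torch.cuda.get_device_capability(0),  # compute capability
    )

# imagine one entry per node, gathered however your fleet tooling does it
fleet = [node_fingerprint()]
print(Counter(fleet))  # more than one distinct key means drift someone has to go wrangle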
Anonymous No.106362614
>>106362270
Gemini is the most useful AI right now. It saves me a ton of time when I'm looking stuff up. It's also the only AI other than Grok integrated into technology we already use. Unlike Grok, it has a user base of virtually everyone on the internet.
Anonymous No.106363442
>>106355930 (OP)
AMD WILL WIN.
https://github.com/Painter3000/AMD-GPU-BOOST