Bob — Power User with a 24GB GPU and 32GB RAM
Wants to experiment with advanced workflows using FLUX.1-dev, ControlNet, and IPAdapter
Bob has a strong machine:
A 24GB GPU (e.g., 3090, 4090)
32GB of system RAM
He's running FLUX.1-dev as his base model, with:
CLIP-L and T5-XXL text encoders, ~11GB of memory combined
FLUX's 24GB bf16 transformer quantized to FP8 (≈12GB VRAM)
VAE for decoding
Plus extras like ControlNet, IPAdapter, LoRA adapters
Bob has to offload constantly to fit everything, and if he adds too many extras, he hits system RAM limits during offload. Memory juggling becomes the bottleneck.
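The squeeze is easy to see with back-of-the-envelope arithmetic on the figures above (the transformer and encoder sizes are the estimates quoted in this scenario; the VAE figure is my rough assumption):

```python
# Approximate VRAM footprint of Bob's FLUX.1-dev stack, in GB.
# Transformer and encoder figures are the estimates quoted above;
# the VAE size is an assumed ballpark, not a measurement.
components = {
    "fp8_transformer": 12.0,  # FLUX's 24GB bf16 transformer, quantized to FP8
    "text_encoders": 11.0,    # CLIP-L + T5-XXL combined
    "vae": 0.3,               # decode-only VAE: small, but not free
}

gpu_vram = 24.0
total = sum(components.values())
headroom = gpu_vram - total

print(f"total: {total:.1f} GB, headroom: {headroom:.1f} GB")
# total: 23.3 GB, headroom: 0.7 GB
```

With under a gigabyte left before activations, ControlNet, IPAdapter, or LoRA weights even enter the picture, something has to be shuffled out to system RAM on every step.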
With Honey, Bob offloads text encoders and VAE remotely, freeing up the rest of his system for more ControlNets, IPAdapters and LoRAs. Honey gives Bob breathing room to experiment and push boundaries instead of memory limits.
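The remote-offload idea can be sketched in a few lines. This is not Honey's actual protocol (the discussion below mentions ZeroMQ); it is an illustrative stand-in using Python's stdlib `multiprocessing.connection`, with a dummy encoder in place of a real CLIP-L/T5-XXL model:

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"honey-demo"  # demo-only shared key, not a real auth scheme

def fake_text_encoder(prompt: str) -> list[float]:
    """Stand-in for the CLIP-L/T5-XXL forward pass that would run on
    the remote box with the actual encoder weights loaded."""
    return [float(ord(c)) for c in prompt[:4]]

# "Remote" side: bind first (port 0 = any free port) so the client
# cannot race the listener.
listener = Listener(("127.0.0.1", 0), authkey=AUTHKEY)

def serve_one_request() -> None:
    """Accept one connection, encode the prompt, send embeddings back."""
    with listener.accept() as conn:
        conn.send(fake_text_encoder(conn.recv()))

def encode_remotely(address, prompt: str) -> list[float]:
    """'Local' side: ship the prompt out and get embeddings back,
    without ever loading the ~11GB of text encoders into local VRAM."""
    with Client(address, authkey=AUTHKEY) as conn:
        conn.send(prompt)
        return conn.recv()

server = threading.Thread(target=serve_one_request)
server.start()
embedding = encode_remotely(listener.address, "honey")
server.join()
listener.close()
print(embedding)  # [104.0, 111.0, 110.0, 101.0]
```

The same request/reply shape works for VAE decode: send latents out, receive pixels back, and the local GPU keeps its VRAM for the transformer and the extras.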
>>105583448
:3
>>105583535
yeah, zeromq handles queuing internally. not sure whether the comfy api supports queueing: does it return the result directly, or hand back something like a uuid you poll for the result? either way comfy would have to process the entire workflow
i'll have to write up some user stories for the internal deployment case, should help explain it better
>what about llm?
for hidream's llama usage, yes, i can handle that; just not right now because i only have 1 gpu. regular text-generation llm usage, no: there's a million other solutions for that
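re: the comfy api question above, my understanding of stock ComfyUI (worth double-checking against your version) is that POST /prompt returns a prompt_id right away and you poll GET /history/<prompt_id> until outputs appear. a minimal stdlib sketch, with the endpoints assumed from upstream comfyui:

```python
import json
import time
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # ComfyUI's default port

def extract_prompt_id(response: dict) -> str:
    """Pull the uuid out of the POST /prompt response body."""
    return response["prompt_id"]

def queue_prompt(workflow: dict) -> str:
    """Submit a workflow. ComfyUI answers immediately with a prompt_id,
    not the result, so the caller is free to poll (or listen on /ws)."""
    data = json.dumps({"prompt": workflow}).encode()
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_prompt_id(json.load(resp))

def wait_for_outputs(prompt_id: str, poll_seconds: float = 1.0) -> dict:
    """Poll /history/<prompt_id>; the entry only shows up once the
    whole workflow has finished executing."""
    while True:
        url = f"{COMFY_URL}/history/{prompt_id}"
        with urllib.request.urlopen(url) as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]["outputs"]
        time.sleep(poll_seconds)
```

so it does queue and hand back a uuid, but the point above stands: comfy still has to run the full workflow before /history has anything for you.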