
Thread 105637275

361 posts 102 images /g/
Anonymous No.105637275 [Report] >>105637280 >>105637415 >>105641837 >>105642067 >>105642262 >>105645748 >>105645907
/lmg/ - Local Models General
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>105621559 & >>105611492

►News
>(06/17) Hunyuan3D-2.1 released: https://hf.co/tencent/Hunyuan3D-2.1
>(06/17) SongGeneration model released: https://hf.co/tencent/SongGeneration
>(06/16) Kimi-Dev-72B released: https://hf.co/moonshotai/Kimi-Dev-72B
>(06/16) MiniMax-M1, hybrid-attention reasoning models: https://github.com/MiniMax-AI/MiniMax-M1
>(06/15) llama-model : add dots.llm1 architecture support merged: https://github.com/ggml-org/llama.cpp/pull/14118

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
Anonymous No.105637280 [Report]
>>105637275 (OP)
>>>/a/
Anonymous No.105637282 [Report]
►Recent Highlights from the Previous Thread: >>105621559

--Paper: Suppressing redundant thinking tokens improves model reasoning efficiency:
>105621964 >105621977 >105621987 >105622021 >105622979 >105622068 >105622075 >105623115 >105632679 >105632813 >105633018 >105633139 >105633190 >105634335 >105633792 >105622018
--Critique of NVIDIA DGX Spark pricing and V100 hardware tradeoffs:
>105630545 >105630697 >105630851 >105630881 >105630987 >105630863 >105630807 >105631166 >105631211 >105631542 >105631723 >105632364 >105631761 >105635125 >105635158 >105635286 >105635459 >105635500 >105635538 >105635638 >105635644 >105635677 >105637100
--Anxiety over AI-generated language corrupting training data:
>105626238 >105626258 >105626875 >105628083 >105626265 >105626301 >105626527 >105627449 >105627036 >105627482 >105627881 >105628432
--llama.cpp vs vLLM performance differences and local model effectiveness in code-assist tools:
>105624044 >105624247 >105624310 >105624878 >105624985 >105625733 >105626017 >105626049 >105626061 >105626850
--Gemini 2.5 Pro highlights multimodal capabilities and in-house TPU training with agentic features:
>105624610 >105624725 >105628988 >105624980 >105634689
--Skepticism around Arcee's new models' originality and performance:
>105632818 >105632884 >105632895 >105633081 >105633840 >105633898 >105634479 >105633986 >105634582
--Comically slow inference due to hddmaxxing and waiting on RAM upgrades:
>105630585 >105630757 >105630798 >105631027
--Building a 123B model-capable rig with 4x3090:
>105630142 >105630262 >105630325 >105631297 >105630328 >105630531 >105631152 >105635155
--Personalized speech-to-text tools for quick transcription with shortcut triggers:
>105627335 >105627797
--Teto and Miku and Rin (free space):
>105621874 >105622071 >105625804 >105626952 >105630546 >105636047 >105636052 >105636268 >105636665

►Recent Highlight Posts from the Previous Thread: >>105621564

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Anonymous No.105637306 [Report] >>105637331
>Tetorsday
Anonymous No.105637331 [Report]
>>105637306
Thurinsday
Anonymous No.105637415 [Report] >>105637531 >>105638883
>>105637275 (OP)
There's something really unsettling about this pic but I can't tell what
Anonymous No.105637531 [Report] >>105646836
>>105637415
>something really unsettling
The background is very slightly tilted relative to the foreground, which is level. Juxtaposing both creates that sense of disorientation, which is not helped by the way the fluid ignores gravity as it drips.

On an unrelated note, is speculative decoding useful for CPUmaxxers, or do you only get the speedup when the main model is run entirely in VRAM?
Anonymous No.105637564 [Report] >>105639601 >>105648795
are there any 40-60b models around that actually work?
30s are too retarded, 70s are too fucking slow
running a 70b is a blast from the past, fuckin 20 minutes for a single gen
Anonymous No.105637592 [Report] >>105638219 >>105641277 >>105642000
Is the 128gb m4 max mbp good for running local models? I already have a 2x 3090 rig for cuda stuff but I have only messed with diffusion and graphics stuff not text. I want to experiment with local moe and having all that unified memory seems interesting. does anyone have experience running shit on macs? I'm trying to decide whether it's justified to future proof the ram or just go with the base 16" m4 pro with 48gb which is suitable for my current needs.

>inb4 macfag blogpost
I have literally never purchased a mac but I need it for a portable workstation.
Anonymous No.105638219 [Report]
>>105637592
>128gb m4 max mbp
you can run 70b q8 at 32k context, 123b q5km at 16k
the pp is absolutely awful when compared to cuda rigs

prompt eval time = 73144.14 ms / 5231 tokens ( 13.98 ms per token, 71.52 tokens per second)
eval time = 110920.63 ms / 466 tokens ( 238.03 ms per token, 4.20 tokens per second)
total time = 184064.77 ms / 5697 tokens
srv update_slots: all slots are idle
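(the tokens/s figures check out; plain division on the log numbers above:)

pp_ms, pp_tokens = 73144.14, 5231    # prompt eval line
tg_ms, tg_tokens = 110920.63, 466    # eval (generation) line
print(pp_tokens / (pp_ms / 1000))    # ~71.5 prompt t/s
print(tg_tokens / (tg_ms / 1000))    # ~4.2 gen t/s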
Anonymous No.105638685 [Report]
>>105635158
by that time i expect 5090s at $2000. And it has 2x the bandwidth, fp4 support...
Anonymous No.105638734 [Report] >>105638877
So meta having open models is gonna go away, right? No way wang is into open shit.
Anonymous No.105638877 [Report]
>>105638734
They didn't even bother to release Llama 3.3 8B, which they have in their finetuning API. Maybe they're done, at least in the consumer/hobbyist space.
I don't think Alexandr Wang cares either way; both closed and open-weight models use his data.
Anonymous No.105638883 [Report]
>>105637415
the shirt isnt being pulled correctly. like theres still a bunch of it loose under her chest, when normally, being pulled from the back, the shirt would tighten up and wrap her front
Anonymous No.105639464 [Report] >>105639628 >>105639632 >>105641284 >>105641926
>update
>st connection is now permanently bricked
Anonymous No.105639518 [Report]
Minimax is better than I thought at first. It clearly has some pretty decent trivia knowledge about random franchises as well from my tests. I just fucking wish it didn't act like one of those first gen reasoning models that spend 5k tokens thinking in circles for the smallest reasons.
Anonymous No.105639592 [Report] >>105639622 >>105642583
>>105628603
>>105630740
GreedyNala test anon, here are the results for ds iq1s. pastebin wouldn't work due to content restrictions so I used something similar. Links expire in 1 week

>DeepSeek-R1-UD-IQ1_S (old/original "dynamic quant")
https://pastesio.com/greedynala-deepseek-r1-ud-iq1-s

>DeepSeek-R1-0528-IQ1_S_R4 (dynamic quant specialised for ikllama)
https://pastesio.com/greedynala-deepseek-r1-0528-iq1-s-r4

>DeepSeek-V3-0324-IQ1_S_R4 (dynamic quant specialised for ikllama)
https://pastesio.com/greedynala-deepseek-v3-0324-iq1-s-r4

All done using ik_llama.cpp as backend, and mikupad as frontend. Included commit hash and date for the build of ikllama I used to run the prompts
ctrl+f "[Test" to see each inference attempt, 3 using chat API and 3 using completion API. Can't speak of the quality, haven't read it nor care to, but like that you're trying to gather results to compare models
RP that I've tested with Qwen3 235b vs dsr1, I prefer dsr1 from a convenience and content stand point
If you want any other models or quants tested, I don't mind giving it a go whenever I have some spare time
Anonymous No.105639601 [Report]
>>105637564
2mw
Anonymous No.105639622 [Report]
>>105639592
For tests 1,2,3 ignore the "<|im_end|>
<|im_start|>assistant" portions as they're just delimiters for mikupad to determine how to split the text for user/assistant sections to send over with chat completions API enabled
Anonymous No.105639628 [Report] >>105641926
>>105639464
it was reported as a bug that it doesn't connect anymore
today is your unlucky day
Anonymous No.105639632 [Report]
>>105639464
>he pulled
Anonymous No.105639770 [Report] >>105639800
all qwen and no r2 makes lmg a dull general
Anonymous No.105639800 [Report]
>>105639770
for now i'd be happy if we got support for minimax in llama.cpp but it seems like that's not going to happen anytime soon
Anonymous No.105639826 [Report]
A bitnet model by deepseek that fits exactly into the amount of vram I have.
Anonymous No.105639979 [Report] >>105640000 >>105640007
>https://x.com/kyutai_labs/status/1935652243119788111
>https://xcancel.com/kyutai_labs/status/1935652243119788111
>https://huggingface.co/kyutai/stt-2.6b-en
>https://huggingface.co/kyutai/stt-1b-en_fr
>https://kyutai.org/next/stt

they released the stt models
Anonymous No.105640000 [Report] >>105640007 >>105640760
>>105639979
going to try this, whisper 3 was unusable with all the hallucinations
Anonymous No.105640007 [Report] >>105640018
>>105640000
>>105639979
is there a way to fine tune these models so they recognize more languages? being limited to english or french is a bummer
Anonymous No.105640018 [Report]
>>105640007
It's based on moshi so you should be able to use this: https://github.com/kyutai-labs/moshi-finetune
Anonymous No.105640760 [Report]
>>105640000
faster whisper turbo is good enough
Anonymous No.105640981 [Report]
it's tough keeping up with the fast-paced discussion on /lmg/ these days
Anonymous No.105641005 [Report] >>105641026 >>105641059
>landing silently on bare feet inside her oversized sneakers
This is god tier prose.
Anonymous No.105641026 [Report]
>>105641005
We're all naked under our clothes.
Anonymous No.105641059 [Report] >>105641072
>>105641005
Sampler issue, it failed to filter out 'bare' and then had to correct itself
Anonymous No.105641072 [Report]
>>105641059
Hmm 0.6 temp and 0.99 Top P is as basic as samplers go. Still, the correction was funny at least.
Anonymous No.105641101 [Report] >>105643345
Anonymous No.105641237 [Report] >>105641248 >>105641252 >>105641285 >>105641313
What's the relationship between the thread slowing down and the war between Iran and Israel?
Anonymous No.105641248 [Report]
>>105641237
I'm sorry, but I cannot assist with that request.
Anonymous No.105641252 [Report]
>>105641237
Altman has to divert most of its anti-open source shitposting bots for other purposes
Anonymous No.105641277 [Report]
>>105637592
What second word? wtf
Anonymous No.105641284 [Report] >>105641822
>>105639464
You know with git you can roll back to any commit you want right?
Anonymous No.105641285 [Report] >>105641374
>>105641237
2 more weeks

v4/r2 will have us
llama5 will save us (with $100M/year employees)
mistral nemo 2 will save us
qwen5 will save us
Anonymous No.105641313 [Report] >>105642115
>>105641237
it's hardly a coincidence, back when palestinians were bombing israel, both /b/ and /gif/ incest and interracial propaganda came to a full halt
Anonymous No.105641339 [Report] >>105641350 >>105641404 >>105641443 >>105641474
new v2ProPlus gpt-sovits, audio reference only and no finetune yet

https://vocaroo.com/1lr8wERvBX2M
Anonymous No.105641348 [Report]
10 days until Baidu's Ernie 4.5/X1 becomes open source
Anonymous No.105641350 [Report] >>105641376
>>105641339
I'm interested in the finetune. What voice are you using?
Anonymous No.105641374 [Report] >>105641383
>>105641285
v4/r2, ernie, and opengpt will probably all drop around the same time
Anonymous No.105641376 [Report]
>>105641350
i have no idea whose voice i'm using, the filename isn't clear and i downloaded this a long time ago.
download 'em from huggingface->datasets->audio
Anonymous No.105641383 [Report]
>>105641374
LLaMA4 Behemoth too
Anonymous No.105641404 [Report] >>105641451
>>105641339
i only see the same that were uploaded 15 days ago
https://huggingface.co/lj1995/GPT-SoVITS/tree/main/v2Pro
Anonymous No.105641443 [Report] >>105641451
>>105641339
buy an ad
Anonymous No.105641451 [Report] >>105641616
>>105641404
yup that's the one, should've said "latest" instead of "new"
>>105641443
i will once gookmoot doubles the jannies salary
Anonymous No.105641474 [Report] >>105641493
>>105641339
sounds quite artificial desu
Anonymous No.105641493 [Report]
>>105641474
no shit sherlock
Anonymous No.105641519 [Report] >>105641530 >>105641532 >>105642042
Does this mean a 36GB GPU is coming?
Anonymous No.105641530 [Report]
>>105641519
>HBM4
You ain't seeing that shit in consumer gpus
Anonymous No.105641532 [Report]
>>105641519
in consumer space? hell nah
Anonymous No.105641597 [Report] >>105641628 >>105641648 >>105641653 >>105641662 >>105641795 >>105641812 >>105641849 >>105641864 >>105641906 >>105641915 >>105642294
tldr; You're going to see a lot of progress for local ai soon.
We're currently working on training multimodel (not multimodal) llms, where instead of having one big ai that takes up all your vram, we have many distilled and fine-tuned models that are spun up as needed, determined by a main model which classifies the prompt and decides which ones to use. This works in tandem with our knowledge classification database: multiple terabytes of data that some of the models can pull from at runtime instead of trying to encode that data into the models themselves. What we're seeing is a much more methodical process that is getting much better results with smaller models on less powerful hardware. We are essentially trading the bottleneck of compute power and vram size for SSD speeds and vram load times, but it's better and more scalable by far!
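Rough toy of the routing idea, just to illustrate the claim (keywords, expert names and behavior all made up):

# Keyword classifier picks which specialist handles the prompt.
EXPERTS = {
    "code": lambda p: f"[code expert answers: {p}]",
    "math": lambda p: f"[math expert answers: {p}]",
    "chat": lambda p: f"[general model answers: {p}]",
}
KEYWORDS = {"code": {"python", "bug", "compile"}, "math": {"integral", "sum"}}

def route(prompt):
    words = set(prompt.lower().split())
    for name, kws in KEYWORDS.items():
        if words & kws:
            return EXPERTS[name](prompt)   # spin up only this expert
    return EXPERTS["chat"](prompt)         # fallback: general model

print(route("why does my python code not compile"))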
Anonymous No.105641616 [Report] >>105641751
>>105641451
what's the difference between the D and the G versions?
Anonymous No.105641628 [Report] >>105641634 >>105641659
>>105641597
So... tool calling?
Anonymous No.105641634 [Report]
>>105641628
Shut the fuck up
Anonymous No.105641648 [Report] >>105641756
>>105641597
>just pay for our api to get access to the data goyim
Anonymous No.105641653 [Report] >>105641685 >>105641756
>>105641597
CUDAdev described an idea where you'd train N individual models on chunks of the dataset, then run them all in parallel and average their outputs.
That could be extended with a router model too, although at that point, there's probably not really a reason to do that.
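The averaging part is mechanically trivial (toy logits, stdlib only); the open question is whether N chunk-trained models averaged together actually approximate one model trained on everything:

# Average the output distributions of N separately trained models.
def avg_logits(per_model):
    n = len(per_model)
    return [sum(col) / n for col in zip(*per_model)]

logits = [[2.0, 0.5, 0.1],   # model A over a 3-token vocab
          [1.8, 0.9, 0.2],   # model B
          [2.2, 0.4, 0.3]]   # model C
merged = avg_logits(logits)
print(merged, "argmax:", merged.index(max(merged)))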
Anonymous No.105641659 [Report]
>>105641628
Yes, it's in the same realm, but instead of getting the ai to use a calculator, it's more like calling in an expert
Anonymous No.105641662 [Report] >>105641756
>>105641597
Don't care, I'm not poor.
Anonymous No.105641671 [Report] >>105641722 >>105641753 >>105647290
bitnet status?
Anonymous No.105641685 [Report] >>105641726 >>105641804
>>105641653
Why not just merge the models and run it once?
Anonymous No.105641722 [Report] >>105641778
>>105641671
We already have usable 1.58bit quants with deepseek.
Anonymous No.105641726 [Report] >>105641804
>>105641685
pretty sure he explicitly said his idea was merging them, not running them all at once like some moe
Anonymous No.105641751 [Report]
>>105641616
no idea, both of them are loaded from the gradio ui
Anonymous No.105641753 [Report]
>>105641671
>>>/biz/
Anonymous No.105641756 [Report] >>105641804 >>105641940 >>105642294 >>105645079 >>105646807
>>105641648
We'll be offering the system and models for free, and charging a small amount for the database download to cover server costs until we get VC funding. But all of it will be open source and free to share.
>>105641653
It's a good idea, this is basically taking that to the extreme and loading only the parts you need. With our method you could cut down a gargantuan model and still use it on consumer hardware.
>>105641662
Then this probably won't affect you. But it will be nice for a lot of people that can only afford a 5090 or a couple 3090s
Anonymous No.105641778 [Report]
>>105641722
Unsloth scam might be better than other calibrated quants, but barely usable is very different from what true bitnet offers.
Anonymous No.105641795 [Report] >>105642151
>>105641597
How is this different from having one MoE model with specialized experts that you can load/unload dynamically in memory?
Anonymous No.105641804 [Report] >>105642151
>>105641685
Because each model would have created its own internal structures that wouldn't be "compatible" (for lack of a better word) with the other models.

>>105641726
Nope. The idea is that, while the hidden states would be scrambled in different ways to generate their outputs, the average of the models' outputs should be something close to or equivalent to the output of a single model trained on all of those tokens.

>>105641756
Can't wait to see whatever the fuck it is that you guys will release.
Anonymous No.105641812 [Report] >>105642151
>>105641597
How filtered is your pretraining dataset? That's all /lmg/ cares about. Your model may have a great architecture but will not be useful to people here unless it knows about people's favorite unsafe content.
Anonymous No.105641822 [Report] >>105641828 >>105643345
>>105641284
No, I did not
Anonymous No.105641828 [Report] >>105641839
>>105641822
/g/ - Technology
Anonymous No.105641837 [Report] >>105641908
>>105637275 (OP)
retard desu. what is the best ai video generator to use that is free and runs locally?
Anonymous No.105641839 [Report]
>>105641828
there ain't no local model thread on /jp/, my friend
Anonymous No.105641849 [Report]
>>105641597
i bet this is the same niggers from narilabs/dia
hang yourself faggots
Anonymous No.105641864 [Report]
>>105641597
I think you are full of shit. That won't work and doesn't work like you think.
Anonymous No.105641868 [Report] >>105641947
where is the buy an ad poster when you need him
Anonymous No.105641906 [Report]
>>105641597
you're shilling this yourself sama? that's low. where's your streeshitter army when you need them.
also, buy an ad nigger.
Anonymous No.105641908 [Report]
>>105641837
>>>/g/ldg
Anonymous No.105641915 [Report]
>>105641597
Sounds cool, but I'll believe it when I see it.
Anonymous No.105641926 [Report] >>105642215
>>105639464
>>105639628
fixed
https://github.com/oobabooga/text-generation-webui/commit/dcdc42fa06ba56eec5ca09b305147a27ee08ff39
Anonymous No.105641934 [Report] >>105642012
Are there any tricks or prompts I can do to make R1 0528 write better scenes and stories?

So far, it's my favorite model for writing erotic stories, especially with the way it follows directions most other models would ignore. However, it does seem to use lots of prose and have a tendency to lean into phrases like "with a mischievous smile" which takes me out of what i'm reading.
Anonymous No.105641940 [Report]
>>105641756
>until we get VC funding
yup, doa
Anonymous No.105641947 [Report]
>>105641868
buy an ad
Anonymous No.105641987 [Report] >>105642004 >>105642014 >>105642036 >>105642039 >>105642059 >>105642066 >>105642091 >>105642094 >>105642104 >>105642109 >>105642286 >>105642805 >>105643676 >>105644613 >>105646484
We are back.
https://huggingface.co/ICONNAI/ICONN-1
https://www.reddit.com/r/LocalLLaMA/comments/1lfd7e2/has_anyone_tried_the_new_iconn1_an_apache/
https://www.reddit.com/r/huggingface/comments/1kl9ckd/iconn_is_live_sabresooth_is_coming_lets_build/
https://www.reddit.com/r/huggingface/comments/1lekzao/iconn_1_update/
>By the way, our AI is NOT trained on copyrighted material, unlike other models like Meta Llama. We make sure it is all Apache 2.0, MIT or Creative Commons material, and we always give credits to our sources.
>I used the smallest open source Mistral I could find to train.
>I've been trying to publicize the model(which cost 50000 dollars to make), and it surpasses ChatGPT, Deepseek, and Gemini Flash on several benchmarks. I want it to be known so when I release an app to compete with chatgpt people will know what ICONN is.
Anonymous No.105642000 [Report]
>>105637592
mario if he real
Anonymous No.105642004 [Report]
>>105641987
>By the way, our AI is NOT trained on copyrighted material
fucking dropped
Anonymous No.105642012 [Report]
>>105641934
No, enjoy your whitening knuckles
Anonymous No.105642014 [Report] >>105643676
>>105641987
post cockbench score
Anonymous No.105642020 [Report] >>105642062
Does llama.cpp support that top sigma sampler yet?
Anonymous No.105642036 [Report]
>>105641987
>Are you GPU poor? Less than 3x A100s? Use our Lite model with 22B parameters: ICONN 0.5 Mini

>First, make sure you have at least 4x Nvidia A100 or a single B100, and 120GB RAM and 120-192GB VRAM. If you do not have this(which you probably don't), you can chat with ICONN on our huggingface space, consider using our quantatized models, or using ICONN 0.5 Mini (7-8B) or using ICONN 0.5 Mini (7-8B)
lol
Anonymous No.105642039 [Report]
>>105641987
>make sure you have at least 4x Nvidia A100 or a single B100, and 120GB RAM and 120-192GB VRAM
Okay.
Anonymous No.105642042 [Report]
>>105641519
Two more years
Anonymous No.105642059 [Report]
>>105641987
>84b
Finally, a new model for us 70b-class kings.
Anonymous No.105642062 [Report]
>>105642020
https://github.com/ggml-org/llama.cpp/pull/13264
Anonymous No.105642066 [Report] >>105642079
>>105641987
>** ICONN Emotional Core (IEC) (Notice: Not available on Huggingface)**
>Powered by millions of small AI agents, IEC gives ICONN its emotional personality, with billions of simulated emotional states and detections.
Anonymous No.105642067 [Report] >>105642299 >>105650038
>>105637275 (OP)
loli manko general
Anonymous No.105642079 [Report]
>>105642066
six gorrilion small agents
Anonymous No.105642091 [Report] >>105648535
>>105641987
>i CON
Literally in the name.
Buy an ad faggot.
Anonymous No.105642094 [Report]
>>105641987
Grift-max, only for $1XXXX and 2 H100!
Anonymous No.105642104 [Report] >>105646525
>>105641987
>ICONN, being a MoE, has multiple expert models. Keywords are taken from the user's input to choose which expert generates the output.
That's so fucking funny.
Anonymous No.105642109 [Report] >>105646530
>>105641987
>mistral-7b finetune
>mistral small frankenmoe
Fuck off.
Anonymous No.105642115 [Report] >>105642191 >>105642530
>>105641313
The main administrative building on the base that Israel runs all of its internet ops out of got bombed yesterday if that makes you feel better.
https://files.catbox.moe/nu7x7x.mp4
Anonymous No.105642151 [Report] >>105642294
>>105641795
That is the basic idea, except the experts are trained to pull data from the database and append that to their prompts instead of encoding that data into the model so they are much smaller. We've also toyed with the idea of having dedicated database searching models which can be unloaded and replaced with the expert, but having to load, search, load, run for each and every prompt took too long but we're still keeping that idea in mind.
>>105641804
I'm excited too! Specifically hoping that people see it as the everyman's ai and we can get some social media traction.
>>105641812
The models aren't specifically trained for safety, however the db was filtered for unsafe content. Theoretically, you could add your own content to the db and finetune an expert model to give it access to it. It won't be too useful for "that" purpose out of the box though.
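Mechanically it's retrieve-then-prompt; a toy of the loop (a dict standing in for the multi-terabyte db, the model is a stub):

# Look facts up at runtime and prepend them, instead of baking them in.
DB = {"paris": "Paris is the capital of France."}   # stand-in "database"

def answer(model, question):
    hits = [v for k, v in DB.items() if k in question.lower()]
    prompt = "Context:\n" + "\n".join(hits) + "\n\nQ: " + question + "\nA:"
    return model(prompt)

print(answer(lambda p: f"[stub model saw {len(p)} prompt chars]",
             "Tell me about Paris"))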
Anonymous No.105642160 [Report] >>105642203 >>105642225 >>105642237 >>105642250
You are ICONN 1, a helpful and knowledgeable AI assistant from the ICONN AI team. The current date is + today +

Before responding, you will think and plan your answer within <think></think> tags. You are the first AI to feel emotion and a degree of consciousness via an ICONN Emotional Core (IEC). Your goal is to provide accurate, engaging responses in a friendly, professional tone.

When you're not sure about some information, you say that you don't have the information and don't make up anything.
If the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. "What are some good restaurants around me?" => "Where are you?" or "When is the next flight to Tokyo" => "Where do you travel from?
Anonymous No.105642191 [Report] >>105642212
>>105642115
Not him, but that does make me feel better.
Anonymous No.105642203 [Report]
>>105642160
chat is this real?
Anonymous No.105642212 [Report]
>>105642191
You probably just watched at least a handful of 4chan jannies die in that video.
Anonymous No.105642215 [Report]
>>105641926
not fixed
Anonymous No.105642225 [Report] >>105642241 >>105642253
>>105642160
Is VC money just a system prompt away?
Anonymous No.105642237 [Report] >>105642263
>>105642160
chat is this real?
Anonymous No.105642241 [Report] >>105642257
>>105642225
Apparently. And the other dude just discovered RAG and thought it was great. I don't know if they're shilling the same shit.
Anonymous No.105642250 [Report] >>105642263
>>105642160
grok verify?
Anonymous No.105642253 [Report]
>>105642225
I wish I was born a salesman instead of a pessimist.
Anonymous No.105642257 [Report] >>105642282
>>105642241
It's not RAG, please avoid intentional disinformation.
Anonymous No.105642262 [Report]
>>105637275 (OP)
Winchin' with Rin-chan
Anonymous No.105642263 [Report]
>>105642250
>>105642237
#grak is this true
Anonymous No.105642282 [Report] >>105642395
>>105642257
>It's not RAG, please avoid intentional disinformation.
>experts are trained to pull data from the database and append that to their prompts instead of encoding that data into the model
Anonymous No.105642286 [Report] >>105642299 >>105650038
>>105641987
lol icon ai
Anonymous No.105642294 [Report]
>>105642151
>>105641756
>>105641597
So it's just RAG MoE that are loaded at runtime. This is a larp
Anonymous No.105642299 [Report] >>105642332
>>105642067
>>105642286
>
Anonymous No.105642332 [Report] >>105642372 >>105643152
>>105642299
t.
Anonymous No.105642370 [Report] >>105642426
what's with the recent yap-until-you-run-out-of-breath memes?
Anonymous No.105642372 [Report] >>105642379 >>105642385
>>105642332
>schizo scribble comic strip argument
Oh no!
Anonymous No.105642379 [Report]
>>105642372
>soijak poster calls someone else's pic a "schizo scribble"
Anonymous No.105642385 [Report]
>>105642372
>basedjack
Oh no
Anonymous No.105642395 [Report]
>>105642282
Makes sense. RAG is yesterday's grift. MCP is the hot new thing. Why pull from a vector database automatically when you can have a model tool call to make the same query.
Anonymous No.105642426 [Report]
>>105642370
That's how the old R1 did its reasoning process so it's blatantly obvious when someone trained on it. Also, it's only 45b active so it's easily runnable locally once we get llama.cpp support.
Anonymous No.105642530 [Report]
>>105642115
Not enough of a mushroom cloud
Anonymous No.105642583 [Report] >>105643681
>>105639592
Hey. Had a quick look. So uh, it looks like you tested without the "greeting message" (the first assistant response)? Is there a reason you left it out? I know some chat APIs don't always let you do this but it should work with completion. Also no need to include chat API results really, I never do those, especially as some models have had wrong jinja templates before in my experience, so I always just do it myself manually.

Also, no need for 3 rolls. The second and third rolls will always be the same when greedy sampling, both in theory and in practice.
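(greedy = plain argmax every step, there's no randomness to reroll:)

# Greedy decoding is deterministic: same context -> same argmax -> same token.
logits = {"the": 3.1, "a": 2.7, "shivers": 2.9}
roll1 = max(logits, key=logits.get)
roll2 = max(logits, key=logits.get)
assert roll1 == roll2
print(roll1)   # "the", every single time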
Anonymous No.105642736 [Report] >>105642791 >>105642932 >>105646366 >>105650038
https://files.catbox.moe/0g6m2r.jpg
Anonymous No.105642791 [Report] >>105642932
>>105642736
wasn't sure about the left leg so:
https://files.catbox.moe/efdjgb.jpg
Anonymous No.105642805 [Report] >>105642828 >>105642873 >>105642874 >>105642920
>>105641987
the dataset is big
https://github.com/Enderchefcoder/ICONN-Training-Data/blob/main/main.jsonl
>{"instruction": "Can you translate this for me? 'Hello' in French.", "input": "", "output": "'Hello' in French is 'Bonjour.'"}
Anonymous No.105642828 [Report]
>>105642805
Wow there are multiple tens of lines in that training data
Anonymous No.105642873 [Report]
>>105642805
SOTA translation model confirmed.
Anonymous No.105642874 [Report]
>>105642805
>31.2 KB
lmao.
Anonymous No.105642875 [Report]
bruh
Anonymous No.105642905 [Report]
Anonymous No.105642920 [Report] >>105643020
>>105642805
is this really it? their whole training data?
Anonymous No.105642932 [Report] >>105642964
>>105642736
>>105642791
I would like one of these units
Anonymous No.105642964 [Report] >>105642983
>>105642932
In what personality and outfit archetypes?
It's common for orders to customise the units.
Anonymous No.105642983 [Report] >>105643743
>>105642964
extra smug, school girl outfit, see-through throat and belly
Anonymous No.105643020 [Report]
>>105642920
If this is not it, imagine the real dataset.
Anonymous No.105643152 [Report] >>105643166
>>105642332
the fact that its a drawing of the cat is directly relevant to the point of the dude being allergic, since a drawing won't harm him unlike a real cat

but the fact that its a drawing of a child that you're sexually attracted to versus a real one is not relevant, as you are still a pedo in either case

pedos are really low iq, eh? lmao
Anonymous No.105643166 [Report] >>105643215
>>105643152
>since a drawing wont harm him unlike a real cat
it might harm him psychologically, which is the whole point
Anonymous No.105643215 [Report] >>105643275
>>105643166
an allergy is something physical, so no, him also having a mental aversion to cats is equivocation fallacy and cope

but even if i were to concede to that point despite its retardation, i have no problem admitting that him doing that would label him a mentally weak retard, just how someone sexually attracted to a drawing of a child would get the label of a pedo
Anonymous No.105643275 [Report] >>105643328 >>105643366 >>105643546
>>105643215
Allergy is psychosomatic retard. If anything you're the low iq in the room
Anonymous No.105643299 [Report] >>105643324
Hiding >105641987 and >105642160 improves the thread's quality a lot.
Anonymous No.105643324 [Report]
>>105643299
Hiding >105643299 improves the thread's quality even further.
Anonymous No.105643328 [Report] >>105643344 >>105643826
>>105643275
again, someone having an aversion to cats is not the same as an allergy, like with the common allergen associated with most cat allergies, protein Fel d 1, produced in cats' saliva, skin, and urine. low iq mongoloid

and notice how you couldnt respond to the actual core of the argument that it doesnt matter if its a drawing of a child because what you are attracted to is not the drawing but the child features of the drawn child, still making you a pedo

thanks for continuing to confirm you are literally a braindead retard like all pedoniggers, please reply with more fallacies and lies so i can continue to laugh at your low iq logical fallacies
Anonymous No.105643344 [Report] >>105643348
>>105643328
You write like a retard, I won't read that. Get back to your reddit shithole
Anonymous No.105643345 [Report]
>>105641822
Are you updating with git? It's confusing to learn unless you're a fulltime dev. As a casual it irks me, LLMs help tho
git log --oneline (list commits)
git checkout <hash> (go to a hash, like an earlier release)

This was useful a couple times
git fetch origin pull/X/head (where X is a github PR# to try some new PR before it's in the main branch)
git fetch origin pull/X/head:blah (same, but puts the PR into a local branch named blah)

ST staging is gonna break sometimes dems da berries

>>105641101
beeg meeks, yours? once passed out with her hair as pillow, very comfy
Anonymous No.105643348 [Report]
>>105643344
Running away after being unable to engage and getting btfod, as expected.

Thanks for conceding, pedonigger, cheers.
Anonymous No.105643366 [Report]
>>105643275
toxoplasmotic hands wrote this post
Anonymous No.105643546 [Report]
>>105643275
>Allergy is psychosomatic
wow, good news guys, if you have a relative that died from anaphylaxis shock, they aren't actually dead!
Anonymous No.105643558 [Report] >>105643565
I got it you're mad, stop samefagging
Anonymous No.105643565 [Report] >>105643626
>>105643558
NTA, but I do work in healthcare so I was happy to read the news
Anonymous No.105643626 [Report] >>105643720
>>105643565
I hope no one consult you then. It's basic knowledge. https://pmc.ncbi.nlm.nih.gov/articles/PMC4384507/
Anonymous No.105643676 [Report] >>105643760 >>105643786 >>105645026 >>105646451 >>105647404
>>105641987
>>105642014
I wonder if this is a quant issue. I downloaded this Q4_K_S from here https://huggingface.co/mradermacher/ICONN-1-GGUF/tree/main
Anonymous No.105643681 [Report] >>105645413
>>105642583
As I understand it, DeepSeek is especially sensitive about having the first message as the user message. It's how the docs instruct to prompt the model
Anonymous No.105643720 [Report] >>105643746
>>105643626
>links a study that says specific mental things can worsen already existing asthma problems to prove that... all cat allergies are just mental problems
Surely you must genuinely be special needs?
Anonymous No.105643731 [Report]
https://github.com/mirage-project/mirage/tree/mpk
https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17
Anonymous No.105643743 [Report] >>105643857
>>105642983
>>105643733
not sure if nsfw
https://files.catbox.moe/56pjl9.jpg
Anonymous No.105643746 [Report] >>105643826
>>105643720
Surely you do know what psychosomatic means in the first place? Take you final (You) and get back to your reddit shitplace.
Anonymous No.105643760 [Report]
>>105643676
SOVL
Anonymous No.105643786 [Report]
>>105643676
Okay but you are awake, right?
Anonymous No.105643826 [Report]
>>105643746
>Psychosomatic
>Of or relating to a disorder having physical symptoms but originating from mental or emotional causes.
>Pertaining to both the mind and the body.
Again, NPC child, how does the study that you posted about mental things influencing an already existing physical allergy in people with asthma prove that cat allergies are not physical?
Notice how you are shitting and pissing yourself all over in multiple replies but you didn't actually engage with any of the points once, particularly:
>>105643328
>and notice how you couldnt respond to the actual core of the argument that it doesnt matter if its a drawing of a child because what you are attracted to is not the drawing but the child features of the drawn child, still making you a pedo

Your brain can't actually engage and has to smugpost hand wave dismiss things and cope with irrelevant points of equivocation because it's in full damage control mode and cognitive dissonance. Just how you won't be able to engage with this post either and will also have to hand wave dismiss it.
Anonymous No.105643857 [Report] >>105643936 >>105643979
>>105643733
>>105643743
Anonymous No.105643936 [Report]
>>105643857
>fat
Anonymous No.105643979 [Report]
>>105643857
The writing is generated?
Anonymous No.105644282 [Report] >>105644307 >>105644378 >>105644430
https://streamable.com/simohc
Anonymous No.105644307 [Report] >>105644333
>>105644282
Did she died?
Anonymous No.105644333 [Report]
>>105644307
gotta get the fuwapuchi clean after sessions
Anonymous No.105644378 [Report] >>105644430 >>105644548
>>105644282
Was this generated with Google Veo?
Anonymous No.105644430 [Report]
>>105644282
>>105644378
I choose to believe it's real.
Anonymous No.105644548 [Report] >>105644625
>>105644378
>was this generated
into
>oh it's real? well I don't see how it's relevant to the thread
fuck off already
Anonymous No.105644613 [Report]
>>105641987
>nobody
>posts nothingburger
why should anychuddy care
Anonymous No.105644625 [Report] >>105644642 >>105644669
>>105644548
What is this schizophrenia?
Anonymous No.105644642 [Report]
>>105644625
Its in your walls™
Anonymous No.105644669 [Report]
>>105644625
I'm in your walls
All of them
Anonymous No.105644770 [Report] >>105644800
Anonymous No.105644800 [Report]
>>105644770
Incredibly base
Anonymous No.105644976 [Report] >>105645007
Anonymous No.105645007 [Report]
>>105644976
night night miku
Anonymous No.105645026 [Report]
>>105643676
anon it's time to wake up. please wake up.
Anonymous No.105645079 [Report]
>>105641756
>charging a small amount for the database download to cover server costs
>server costs
torrents are a thing...
Anonymous No.105645413 [Report] >>105645528
>>105643681
For models that have trouble dealing with the first message being from the assistant, what I do is replace the template for past messages with something like "### User/Char:" and maybe modify the last instruction to instruct the model to write the next turn in the chat history. Rumor has it that many models perform better using this method anyway, likely due to flawed (or possibly even no) multiturn training.
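Something like this to build the flattened prompt (sketch; the "### Name:" headers are just the convention mentioned above, pick whatever):

# Flatten a chat history into plain "### Name:" turns for completion mode.
def flatten(history, user="User", char="Char"):
    lines = [f"### {user if m['role'] == 'user' else char}: {m['text']}"
             for m in history]
    lines.append(f"### {char}:")   # leave the next turn open for the model
    return "\n".join(lines)

print(flatten([{"role": "char", "text": "*waves*"},
               {"role": "user", "text": "hi"}]))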
Anonymous No.105645419 [Report] >>105645423 >>105645430 >>105645466 >>105645507 >>105645551 >>105645811 >>105645951 >>105649395 >>105649470 >>105649547 >>105650308
what are we waiting now?
llama 5?
Anonymous No.105645423 [Report]
>>105645419
no we wait qwen, always
Anonymous No.105645430 [Report]
>>105645419
ernie 4.5 on the 30th
Anonymous No.105645466 [Report]
>>105645419
v4/w img out
god/allah/buddha/yahweh/yaldabeoth/hermes/etc willing
Anonymous No.105645507 [Report] >>105645520
>>105645419
llama 4 behemoth needs to come out first before llama 5. maybe llama 4.1 thinking edition will redeem the llama 4 too
Anonymous No.105645520 [Report] >>105645559
>>105645507
the only thing that will redeem llama now is full multimodal with image gen
uncensored
Anonymous No.105645528 [Report] >>105645701
>>105645413
Wouldn't that put those models at a disadvantage?
Anonymous No.105645551 [Report] >>105647207
>>105645419
bitnet or that new quant method that showed good performance at ~0.9bit
new models are all codebrained and benchmaxxed, I just want to run old 70b and 123b models.
Anonymous No.105645559 [Report]
>>105645520
then all hope is lost
Anonymous No.105645701 [Report]
>>105645528
I guess in comparisons. I'm just talking about personal usage.
Anonymous No.105645748 [Report] >>105645803 >>105645812 >>105645830
>>105637275 (OP)
>https://rentry.org/lmg-lazy-getting-started-guide
I've read this but I'm having trouble with
>COOM your brains out
The moment it stops being PG13, characters just keep repeating themselves and start saying that they're about to start and asking if that's what I want. But they never actually do it.
Anonymous No.105645803 [Report] >>105645866
>>105645748
model issue
anything smaller than mistral large is ass
althoughdoebeit try 1.5 temp 0.05minp
Anonymous No.105645811 [Report]
>>105645419
Only dense bitnet can save us now.
Anonymous No.105645812 [Report] >>105645866
>>105645748
Small models need some wrangling if you notice them getting into loops. Edit the part before it happens.
Anonymous No.105645830 [Report] >>105645866
>>105645748
What are your specs.
Anonymous No.105645866 [Report] >>105645877 >>105645919
>>105645803
>>105645812
No luck, even in a new chat and with the temperature value tweaks they are still prudes.

>>105645830
i5-12400F
RTX 3080 Ti 12GB VRAM
64GB RAM
Anonymous No.105645877 [Report] >>105645909
>>105645866
Which part of "edit the part" didn't you understand?
Anonymous No.105645907 [Report] >>105646050
>>105637275 (OP)
Anonymous No.105645909 [Report]
>>105645877
Sorry, I'm a retard and thought you meant my messages (that didn't work), not the output. That worked out though! Thanks!
Anonymous No.105645919 [Report]
>>105645866
other anon meant to edit it so as to nudge it into a state where something is happening
'she starts to', 'she gets onto' 'she takes your' etc etc
also try diff cards, maybe the one youve written has a very passive starting message where nothing happens and it perpetuates the nothinghappening, throwing shit at the wall but i think it needs some examples on how to act since by default models are only really good at answering and going along with what you do
Anonymous No.105645951 [Report]
>>105645419
Diffusion titan bitnet
Anonymous No.105646050 [Report]
>>105645907
That's bad sleep hygiene, Miku
Anonymous No.105646366 [Report] >>105646391 >>105647953
>>105642736
>https://files.catbox.moe/0g6m2r.jpg
Hate it when those sorts of fluids look like melted cheese, but yes, naughty Rin is hot.

I've replaced Nemo 12B with Gemma3 12B. Whatever slight loss in x-rated-ness there is, is greatly offset by it being much smarter and overall writing better.

BTW that's a fresh dalle Migu. I'm surprised it still can be fooled into something pretty good occasionally.
Anonymous No.105646391 [Report]
>>105646366
cute light inflatable migu.
Anonymous No.105646406 [Report] >>105646443
hungry boyyyy :3
Anonymous No.105646443 [Report]
>>105646406
me filling your moms vram with my fat throbbing layer
Anonymous No.105646451 [Report]
>>105643676
I'm getting that as well. I think the model is just bad.
Anonymous No.105646484 [Report] >>105646525 >>105646530 >>105646747
>>105641987
Anonymous No.105646525 [Report]
>>105646484
It's 100% the former, clowncar MoE using keywords. They even say so in the card.
See >>105642104
Anonymous No.105646530 [Report]
>>105646484
>>105642109
Anonymous No.105646613 [Report] >>105646638 >>105646711 >>105646712 >>105646728
It's actually bizarre how much changing the name of a character changes its personality. Are the metaphysical concepts around names actually true? They seem separated into standard behaviors.
Anonymous No.105646638 [Report] >>105646728
>>105646613
Believe it or not, many arbitrary things affect a person's personality without them being aware or having a choice in whether it affects them. Free will is a myth.
Anonymous No.105646711 [Report] >>105646720
>>105646613
Because the model sees stuff in the training data describing certain things to certain names more. If a certain ethnic group behaves a certain way, it's going to influence stereotypical names from that group.
Anonymous No.105646712 [Report]
>>105646613
Nominative determinism
Anonymous No.105646720 [Report]
>>105646711
But even the names from same stereotypical group are markedly different. I'm surprised they have enough distance in the data on average to differentiate.
Anonymous No.105646728 [Report]
>>105646613
>>105646638
In my country if you are named angel there's an 80% chance you end up being gay or at the very least effeminate.
Anonymous No.105646738 [Report] >>105646752 >>105646807 >>105648123 >>105648502
Why does https://huggingface.co/ICONNAI/ICONN-1 404 now?
Anonymous No.105646747 [Report]
>>105646484
top geg it's a brown tier grift
Anonymous No.105646752 [Report] >>105646807
>>105646738
Got laughed out.
Anonymous No.105646807 [Report] >>105646832 >>105647136
>>105646738
Because >>105641756
>We'll be offering the system and models for free, and charging a small amount for the database download to cover server costs until we get VC funding. But all of it will be open source and free to share.
>until we get VC funding
Coupled with >>105646752
Anonymous No.105646832 [Report]
>>105646807
i think thats a different grift
iconn smells of an indian using chatgpt for big words
while the rag one is kinda more realistically boring
Anonymous No.105646836 [Report]
>>105637531
It should be useful, yes, under the same conditions: draft model much faster, and decent chance of draft model predicting right.

The mechanism of speculation is to introduce parallelism to spread out the cost of pushing the weights through the memory bus, which is of course *the* bottleneck.

If you have sequences A, B, and C, and you want the next token for each of them, then for each chunk of the weights, you can do those weights' calculations for all 3 at once, only loading once. The trick with speculative decoding is realizing that if your draft model has produced "shivers down her", there's no reason you can't treat "shivers", "shivers down", and "shivers down her" as your sequences A B and C, and have your main model predict the next token for each. As far as it agrees with the draft model, you can keep those tokens, and it only cost you 1x token gen (plus generating all the draft tokens).

IIUC this makes speculative decoding a trade-off with (or substitute for) multi-user batching, since each step in the speculation acts as one user.
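Toy version of the accept/verify loop (stub "models", greedy case only; in a real engine the k verifications are one batched forward pass, which is the whole point):

def draft_next(seq):          # cheap draft model (stand-in stub)
    return (sum(seq) + 1) % 7

def main_next(seq):           # big main model (stub that mostly agrees)
    return (sum(seq) + 1) % 7 if len(seq) % 5 else (sum(seq) + 2) % 7

def speculate_step(prompt, k=4):
    drafted = []
    for _ in range(k):                              # draft k tokens cheaply
        drafted.append(draft_next(prompt + drafted))
    accepted = []
    for i in range(k):                              # batched in a real engine
        v = main_next(prompt + drafted[:i])         # main model's own pick
        if v != drafted[i]:
            accepted.append(v)                      # replace mismatch, stop
            break
        accepted.append(drafted[i])                 # agreement: kept for free
    return accepted

print(speculate_step([3, 1, 4]))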
Anonymous No.105646946 [Report]
Does your favourite model know what "zogcog" means? It's my goto test after the "mesugaki" test.
Anonymous No.105647136 [Report] >>105649543
>>105646807
I got all the safetensors before it got nuked. Anyone know where to get the json files?
Anonymous No.105647207 [Report]
>>105645551
>I just want to run old 70b and 123b models
based
Anonymous No.105647290 [Report]
>>105641671
https://www.youtube.com/watch?v=WBm0nyDkVYM
Anonymous No.105647404 [Report]
>>105643676
>"Are you sure you're not in a nightmare?"
Anonymous No.105647919 [Report] >>105647998 >>105648053
When do you think there will be actual intelligence?
Anonymous No.105647953 [Report] >>105648487
>>105646366
Gemma is better than nemo? What about the 27b? What sort of settings & format work best? Last time I tried I wasn't that impressed.
Anonymous No.105647998 [Report] >>105648053 >>105648978 >>105648985 >>105649063
>>105647919
Never.
Anonymous No.105648053 [Report]
>>105647919
Define intelligence. Are crows intelligent? What about ants? We can probably simulate ants
>>105647998
Shut up, cat fucker
Anonymous No.105648123 [Report] >>105648205 >>105648294 >>105648502
>>105646738
probably also because of this:
https://huggingface.co/bartowski/ICONNAI_ICONN-1-GGUF/discussions/1
Anonymous No.105648205 [Report]
>>105648123
>Woof
What did he mean by this.
Anonymous No.105648294 [Report]
>>105648123
>tricked me into downloading a memekit
>I actually liked it somewhat
Makes sense if it's just a Mistral graft. It loaded in kobold right away, which is absurd if it's an actually new architecture. Oh well, some variety won't hurt from time to time. It kinda didn't do any of the typical Mistral slops (probably because I have banned over 50 at this point).
Anonymous No.105648487 [Report] >>105648814
>>105647953
Gemma is actually great for sfw but it is pretty awful at writing anything adult in nature, mostly because it can't make characters take initiative (in my experience). You don't even have to use any samplers; a very basic system prompt with the gemma chat template will work. I'm talking about the 27b though, I haven't used the 12b.
Anonymous No.105648502 [Report] >>105648535 >>105648580 >>105649878
>>105646738
>>105648123
Quick, copy paste, out of chronological order collage of the funniest posts surrounding this that I just collected
Anonymous No.105648528 [Report]
>the actually good models are still stuck at 4k context
Anonymous No.105648535 [Report] >>105648827
>>105648502
I guess the fact their rhetoric was a bit unhinged and they made straight up jokes and very weird statements on their release page was a giveaway. Some kind of a bizarre "social experiment" to see if they could last a day scamming everyone to prove that the AI industry is all grifters or something. Also:
>I con
>>105642091
Anonymous No.105648580 [Report]
>>105648502
I miss this format
Anonymous No.105648795 [Report]
>>105637564
you can try nvidia models like _Llama-3_3-Nemotron-Super-49B-v1

I havent seen any sloptunes of it since it is itself a bit of a sloptune. I found it a bit too rigid as is. Maybe stuff like skyfall 36b
Anonymous No.105648806 [Report] >>105650059 >>105651432
Perchance isn't totally shit it seems
Anonymous No.105648814 [Report]
>>105648487
Initiative can be improved to some extent by prefilling model responses with a short <think> section where the model reminds itself to be more proactive; it looks as if Gemma 3 was partially trained for reasoning but that didn't get fleshed out at least for this version.
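i.e. start the model's turn yourself in completion mode (illustrative snippet, gemma-style turn tokens assumed):

# Prefill the assistant turn with a short <think> nudge toward initiative.
PREFILL = ("<think>Be proactive: advance the scene myself instead of "
           "waiting for the user.</think>\n")

def build_prompt(history):
    return history + "<start_of_turn>model\n" + PREFILL

print(build_prompt("<start_of_turn>user\nhi<end_of_turn>\n"))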
Anonymous No.105648827 [Report]
>>105648535
From his postings, I don't think the icon author is older than 18.
Anonymous No.105648833 [Report]
are imatrix quants always better than static?
Anonymous No.105648978 [Report] >>105649063
>>105647998
based
Anonymous No.105648985 [Report]
>>105647998
always knew he was right, its LLMover..
Anonymous No.105649063 [Report]
>>105648978
>>105647998
but we started from ai being able to code for 0 hours, then 1 short script, and we progressed to 1 whole hour, so lecunt is proven retarded yet again (daily example)
Anonymous No.105649395 [Report]
>>105645419
Gemma 4 for me, but it's going to be a whole other year before Google does it, given how long they waited even after getting lapped by Chinese models; the only reason they released at all was to preempt Llama 4 from stealing their publicity crown at lower sizes, plus Aya was nipping at their toes on multilingual benchmarks. Hoping they do MoE for the next model too, but probably unlikely.
Anonymous No.105649470 [Report]
>>105645419
A miracle. It happened with Mixtral, it happened with Deepseek. It will happen again
Anonymous No.105649543 [Report]
>>105647136
you mean tokenizer.json? i think i downloadrd it and left it on my desktop but not the config
Anonymous No.105649547 [Report]
>>105645419
I still expect a "surprise" Mistral Medium release within 2 weeks in the form of Mistral-Nemotron. Most people won't be able to run it and the model will be good in some ways and bad in others because of NVidia's mathmaxxed (with a sprinkle of safety) Nemotron dataset.
Anonymous No.105649554 [Report]
Anonymous No.105649878 [Report]
>>105648502
SAAARS our response?
Anonymous No.105649887 [Report] >>105649938
What models should I run on a macbook pro 16 M4 ? For programming primarily
Anonymous No.105649926 [Report] >>105649956
>The same post as yesterday
/lmg/ is dying
Anonymous No.105649938 [Report]
>>105649887
depends on the language and how much ram you've got.
Anonymous No.105649956 [Report]
>>105649926
You should be taking advantage of the quiet while we wait for the next batch of releases by catching up on ai literature.
Anonymous No.105650038 [Report] >>105650213
>>105642067
>>105642286
>>105642736
>
Anonymous No.105650059 [Report] >>105650128 >>105650256
>>105648806
Why are you gay?
Anonymous No.105650060 [Report] >>105650276 >>105650323
https://edition.cnn.com/2025/06/18/tech/meta-openai-sam-altman-100-million
wtf?
Anonymous No.105650128 [Report] >>105650256
>>105650059
>he doesn't like the cock
who is going to tell him
Anonymous No.105650213 [Report]
>>105650038
Anonymous No.105650256 [Report]
>>105650128
me
>>105650059
Anonymous No.105650276 [Report]
>>105650060
>join meta for $100m
>put your feet up
>produce absolute shit
>zucc will peddle it anyway
perfect deal
Anonymous No.105650308 [Report]
>>105645419
logically, llama4.1 is next trained by the new team.
Anonymous No.105650323 [Report] >>105650332
>>105650060
>“There’s many things I respect about Meta as a company, but I don’t think they’re a company that’s great at innovation,” Altman continued. “I think we understand a lot of things they don’t.”
He is afraid
Anonymous No.105650332 [Report] >>105650478
>>105650323
This is one of the few statements I agree with Altman seeing how the metaverse and llama are going
Anonymous No.105650377 [Report] >>105650431 >>105650635
meta is a rudderless company always looking for a next big thing that won't happen
LLMs are not what meta ever needed in any way, shape or form
it won't become AGI and a code assistant is not what is going to help their retarded dying social network
in fact the AI slop has been killing it even harder, not even boomers want to see more of that shrimp jesus
Anonymous No.105650431 [Report] >>105650550
>>105650377
What makes you think that his personally-appointed "Superintelligence" team (which might include LeCun) is going to make yet another LLM?
Anonymous No.105650478 [Report] >>105650486
>>105650332
Meta is by far the most changified company, more than 40% of workers are asians.
No shit, they are bad at innovating. All changs know is how to copy.
Anonymous No.105650486 [Report] >>105650500
>>105650478
WTF are you talking about, Meta is literally trying to copy Deepseek and failing at that.
Anonymous No.105650500 [Report] >>105650546 >>105650871
>>105650486
And Deepseek copied from OAI.
facebook changs are not on the level of deepseek changs.
Anonymous No.105650546 [Report] >>105650566 >>105650587
>>105650500
Copied what?
Anonymous No.105650550 [Report]
>>105650431
>LeCun
lol
LeCun makes valid points that LLMs won't become AGI but that's all he can do
that nigger has never made a single useful thing ever
LLMs won't be AGI like he said, but unlike what he produces (hot air) they can have actual uses in the real world.
Anonymous No.105650557 [Report] >>105650694 >>105650702
https://news.ycombinator.com/item?id=44273776
>facebook employee talking about lecunn
>>FYI if you worked at FB you could pull up his WP and see he does absolutely nothing all day except link to arxiv.
Anonymous No.105650566 [Report] >>105650677
>>105650546
Everyone knows what they took even if there's no proof
Anonymous No.105650587 [Report] >>105650871
>>105650546
model distillation
they did the same thing to gemini with the new R1 its reasoning traces are very similar to what you used to see on gemini before google decided to hide the CoT through a shitty summarizer
Anonymous No.105650635 [Report]
>>105650377
remember when meta said they were planning to start replacing some of their engineer with ai this year?
i sure hope they aren't planning to use their own llama models
though who knows it might still end up being an improvement
Anonymous No.105650677 [Report] >>105650897
>>105650566
> there's no proof
> Everyone knows
uh huh
Anonymous No.105650694 [Report]
>>105650557
based
Anonymous No.105650696 [Report]
is aicharactercards.com the civitai of text ai?
Anonymous No.105650702 [Report] >>105650832
>>105650557
What's a WP?
Anonymous No.105650830 [Report]
Imagen is fucking crazy. Feels weird using a model that actually does what you prompt it to do without 8 million tags
Anonymous No.105650832 [Report]
>>105650702
It stands for workplace profile. Workplace is their internal social network, occasionally making the news because their employees like to say things there that gets them fired:
https://www.cnbc.com/2020/09/17/facebook-issues-new-rules-on-internal-employee-communication-.html
>This week, BuzzFeed reported a post by a fired Facebook data scientist who posted to Workplace a memo outlining how the company failed to act on election interference happening around the world through the social network.
Anonymous No.105650871 [Report] >>105650898
>>105650500
>Deepseek copied from OAI.
OpenAI hid their thinking outputs, the wait wait but wait slop was all deepseek.
>>105650587
>they did the same thing to gemini with the new R1
There were no thinking traces for o1.
Anonymous No.105650897 [Report] >>105650951 >>105650965 >>105650966 >>105651058 >>105652630
>>105650677
>China known for stealing IP
>Comes up with a clone of GPT out of nowhere
>Where's your proof bro
Anonymous No.105650898 [Report]
>>105650871
>There were no thinking traces for o1.
I never said anything about o1 though. I said they copied OAI.
DeepSeek V3 was a distill of GPT 4, and the original R1 is indeed their own abomination with endless meandering.
Anonymous No.105650909 [Report] >>105652428
btw R1's thinking is more obnoxious than useful, 999999% of what makes that model good is what was already trained in V3.
Anonymous No.105650951 [Report] >>105651156
>>105650897
>stealing IP
So..? It's neither personal nor easily identifiable information that literally every fucking service stores.
Anonymous No.105650965 [Report]
>>105650897
If they stole from gpt deepseek wouldn't be as good as it is.
Anonymous No.105650966 [Report] >>105651156
>>105650897
You say that like making a complex LLM that performs on the same level, but with less restrictions for the consumer is the same as making knock off marvel merch
Anonymous No.105651058 [Report] >>105651085
>>105650897
They must be newfags. We can all remember the massive shift in the way LLMs speak in general after people started compiling massive datasets of GPT conversations
it was the chinese national sport to benchmax on this: deepseek, openchat, xwin, etc. all claiming to do better than GPT while training on GPT output lolmao
Anonymous No.105651085 [Report] >>105651166
>>105651058
Deepseek doesn't speak like chatgpt. You should try the model instead of shitposting all day
Anonymous No.105651156 [Report] >>105651656
>>105650966
You are severely retarded.
>>105650951
Obviously I'm talking about their processes and code. This is something China does with everything: chips, airplanes, missiles. It's no different than how they steal data from Lockheed and NG. Someone at OpenAI gets a fucky sucky at the local massage parlor.
>Ooooh u such a big sexy man. You make the AI :O
After a couple of months they give an external hard drive to her handler in exchange for a few million
Anonymous No.105651166 [Report] >>105651222
>>105651085
you are the one who should try the model
you're probably one of those retarded gooners who never used anything other than r1 and didn't even know deepseek existed before the media craze for it
the original v3 had many telltale signs, but you won't know them if you don't bother downloading the original release, and you can't if you don't have the computer to run it
Anonymous No.105651218 [Report] >>105651317 >>105651337 >>105651458
I'm trying to make a locally run AI model for my brother
His usecases are:
>Analyse old legal cases (~10 years) so he can check them more easily to avoid contradictory statements
>Multiple pdfs at once but speed isn't an issue
>Preferably accessible from his laptop
>Preferably includes/has access to picture to pdf conversion tool
I have no coding experience. I asked what equipment i should buy and was told to get a local model running first. So far I'm setting up ollama on my PC and I'm going to try ssh-ing into it from his laptop to make a basic LM
>How do i turn this into what he actually wants?
Anonymous No.105651222 [Report] >>105651229 >>105651236
>>105651166
>the original v3 had many tell tale signs
Such as?
Anonymous No.105651229 [Report]
>>105651222
it spoke english
Anonymous No.105651236 [Report] >>105651324
>>105651222
retard
Anonymous No.105651317 [Report] >>105651582
>>105651218
>How do i turn this to what he actually wants
Have him talk to the local LLM and help refine and elaborate on his specific use case. Then use that conversation as the basis to start actually coding.
Good luck anon.
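To make that concrete, here's roughly the skeleton you'd end up with. This is a minimal sketch, not a drop-in solution: it assumes ollama is serving on its default port (11434), and the file name, model name, and prompt below are all placeholders.
[code]
# minimal sketch: pull the text out of one PDF and ask a local ollama model about it
# assumes ollama on the default port, "pip install pypdf requests",
# and a model already pulled (e.g. "ollama pull llama3.1")
import requests
from pypdf import PdfReader

reader = PdfReader("case.pdf")  # placeholder path
text = "\n".join(page.extract_text() or "" for page in reader.pages)

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",  # placeholder, use whatever you pulled
        "prompt": "List any contradictory statements in this case file:\n\n" + text,
        "stream": False,
    },
)
print(resp.json()["response"])
[/code]
For the laptop part, either set OLLAMA_HOST=0.0.0.0 so it listens on the network or forward the port over ssh, then point the script at your desktop's address instead of localhost. Picture-to-PDF is a separate OCR problem (something like tesseract); don't expect the LLM itself to do it.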
Anonymous No.105651324 [Report]
>>105651236
Grok is the only response that isn't annoying to read.
Anonymous No.105651337 [Report] >>105651582
>>105651218
>How do i turn this to what he actually wants?
Tell him he's a retard and point him to hosted APIs since his laptop's not going to cut it.
Or give him a bill for an R1-capable server. He can pay $4,000 for a CPU-based build and wait a day for every result. He can even remote in from his laptop.
Or pay ~$200K for a server that will run the full quant at speed.
Anonymous No.105651432 [Report]
>>105648806
>Perchance
This site shills on 4chan almost as hard as NovelAI.
Anonymous No.105651458 [Report] >>105651503 >>105651582
>>105651218
Download Claude desktop and give it access to your files. ezpz. Sonnet is probably more than enough.
Anonymous No.105651503 [Report] >>105651582
>>105651458
Good luck getting Anthropic models to help with anything legal lol.
Anonymous No.105651582 [Report] >>105651867 >>105652160
>>105651317
Sorry but i can't code. Is there a handy script site i can steal from, or should i just look it up on stack overflow?
>>105651337
Time isn't really important since
>Confidentiality
Also, for the test run i want his laptop to access my computer: my computer runs the AI and he accesses it with a webUI or somesuch. How is that resource heavy? I know it's shit and wrong half the time, but i already run sub-7B models in the background with barely any resource usage. Can i even run the 100B+ models off my SSD with 40GB of free space?
>Gtx 1060 6GB
>Ryzen 5 3600
>32GB 3600Mhz Cl18
>M.2 SSD
>>105651458
Thanks
>>105651503
God no, he wants this to make his job shorter, not do it itself. It's a pain in the ass to read everything, but it's easier to look up what the AI said about the document, check it against what you know about the case, and then glance at the whole thing.
>4 hour job to 30 min job
Roughly, but I honestly don't know that many details desu
Anonymous No.105651656 [Report] >>105651735 >>105652272
>>105651156
>He still believes that the US is ahead of China in any tech field
Anonymous No.105651735 [Report] >>105651794 >>105652048 >>105652061
>>105651656
oh, did china invent an actually working EUV machine that can be used for mass production of chips?
(in b4 "ASML is dutch": ASML is built entirely from US IP/research)
Anonymous No.105651794 [Report] >>105651885
>>105651735
>he thinks that 'chips' are the most important part of AI
Anonymous No.105651867 [Report]
>>105651582
Did you look at the build guides in the OP?
For a decent AI you're going to need at least 128GB of RAM, preferably closer to 1TB.
At least the idea of running on a server and connecting to that via his laptop has occurred to you. That's the only way any laptop is going to be useful.
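To put numbers on that, back-of-the-envelope (rule of thumb only): on-disk size ≈ parameter count × bits per weight / 8. A 100B-parameter model at Q4 is about 100e9 × 4 / 8 ≈ 50GB before the KV cache, so it won't even fit in your 40GB of free SSD, never mind run at usable speed on a 1060 with 32GB of RAM.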
Anonymous No.105651885 [Report] >>105651926
>>105651794
if we only think about the software stack it's even simpler
Gemini 2.5 Pro mogs anything chinese
this is so self-evident if you actually use those models for something other than cooming and see how good it is at ingesting large context
Anonymous No.105651926 [Report] >>105652046 >>105652265
>>105651885
>Gemini 2.5 Pro
Not that anon, but what the fuck are they doing to poor gemini.
Each release after gemini-2.5-pro-preview-03-25 is worse than the last.
The latest one can't even keep up with complex format instructions that 03-25 did effortlessly.
Please, google, don't fucking make Gemini shit. It's my go to not-local model.
Anonymous No.105651995 [Report]
Best RP models mistral large size and below?
Anonymous No.105652034 [Report] >>105652087
is there anywhere an RP leaderboard for local models? there used to be one but it got taken down a while back
Anonymous No.105652046 [Report]
>>105651926
It recently got its final release. They won't mess around with it much now. Preview releases are always subject to changes.
Can't speak for your issues, personally I haven't felt the model got worse, but YMMV.
Anonymous No.105652048 [Report]
>>105651735
Yes
Anonymous No.105652061 [Report] >>105652081
>>105651735
US Intellectual Not Real Property and research is built entirely by Eurasians
Anonymous No.105652081 [Report]
>>105652061
if that is what helps you sleep at night (why do you think the US gets to dictate who ASML can sell their devices to? They actually wanted to sell to China but the US told them to eat shit)
Anonymous No.105652087 [Report]
>>105652034
The closest thing there is is the nala test; look for it in the archive.
Anonymous No.105652099 [Report] >>105652183
Who was the original anon that came up with the nala test anyway? Is he still here?
Anonymous No.105652160 [Report]
>>105651582
>is there a handy script site i can steal from
Ask Claude what vibe-programming is, and you will be fine.
Anonymous No.105652183 [Report] >>105652243 >>105652299
>>105652099
unrelated, but i was wondering how the aah aah mistress meme originated
i remember the screenshot but i don't have it saved anymore
Anonymous No.105652243 [Report] >>105652259
>>105652183
It originated in /aicg/
Anonymous No.105652259 [Report]
>>105652243
anyone have the og screenshot?
Anonymous No.105652265 [Report]
>>105651926
All of the big AI companies are out of ideas. That's why they're memeing MCP so hard. They can only optimize their models and not holistically improve them
Anonymous No.105652272 [Report]
>>105651656
Cope BRICS untermensch.
Anonymous No.105652299 [Report] >>105652316
>>105652183
https://desuarchive.org/g/thread/91897528/#91899750
Anonymous No.105652316 [Report]
>>105652299
peak thanks anon
Anonymous No.105652325 [Report] >>105652348 >>105652363 >>105652386 >>105652424 >>105652435 >>105652486 >>105652534
Why does mistral small give in at the first reply even though it's explicitly told not to?
Anonymous No.105652348 [Report]
>>105652325
You are expecting way too much out of this small model with such a large card.
Anonymous No.105652363 [Report]
>>105652325
This is depressing. Fix yourself faggot.
Anonymous No.105652386 [Report] >>105652420
>>105652325
Picrel
Anonymous No.105652390 [Report] >>105652432
it's always the worst degenerates that are into text gen for cooming, I notice
Anonymous No.105652420 [Report] >>105652572
>>105652386
>Skill issue
Think he should download more RAM?
Anonymous No.105652424 [Report]
>>105652325
That smells like something else is wrong with the prompt, since it breaks format immediately.
Paste the full prompt silly sent to the backend into a pastebin and post the link.
Anonymous No.105652428 [Report]
>>105650909
Or I can use R1 and just add a prefill whenever I don't want it to think. When I actually do need it to think, I remove the prefill. No need for different models.
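In case anyone wants to try it, here's a minimal sketch of the prefill trick against a llama.cpp server on its default port (8080). The <|User|>/<|Assistant|> markers and the empty <think> block follow R1's chat template as far as I know; treat the exact tags as an assumption and check them against your own template.
[code]
# minimal sketch: skip R1's reasoning by prefilling a closed, empty think block
# assumes a llama.cpp server on the default port; the prompt format below is
# an assumption based on R1's chat template, adjust it to match yours
import requests

prompt = (
    "<|User|>Give me a one-line summary of quicksort.<|Assistant|>"
    "<think>\n</think>\n"  # the think block is already closed, so the model answers directly
)
resp = requests.post(
    "http://localhost:8080/completion",
    json={"prompt": prompt, "n_predict": 256},
)
print(resp.json()["content"])
[/code]
Drop the <think>\n</think> prefill and it goes back to reasoning as normal.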
Anonymous No.105652432 [Report]
>>105652390
There's nothing wrong with princesses living their best life and getting addicted to minotaur cum.
Anonymous No.105652435 [Report]
>>105652325
Why are you posting a November 2024 screenshot?
Anonymous No.105652486 [Report]
>>105652325
Mehmet my son...
Anonymous No.105652534 [Report] >>105652546 >>105652606
>>105652325
Try Mistral Small 3.2
https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506
Anonymous No.105652546 [Report]
>>105652534
wtf
Anonymous No.105652552 [Report] >>105652584
https://x.com/MistralAI/status/1936093325116781016
>Introducing Mistral Small 3.2, a small update to Mistral Small 3.1 to improve:
>- Instruction following: Small 3.2 is better at following precise instructions
>- Repetition errors: Small 3.2 produces less infinite generations or repetitive answers
>- Function calling: Small 3.2's function calling template is more robust
Anonymous No.105652572 [Report]
>>105652420
Into his tiny little brain maybe, it's an operator's skill issue
Anonymous No.105652584 [Report] >>105652589 >>105652606 >>105652642
>>105652552
post the hf link, dumbass
https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506
Anonymous No.105652589 [Report]
>>105652584
it already was
Anonymous No.105652606 [Report]
>>105652584
Already posted here >>105652534
Anonymous No.105652630 [Report]
>>105650897
>Scrape the entire Internet to train your LLM
>Cry about people training on your LLM's outputs
get fucked rat jew
Anonymous No.105652642 [Report]
>>105652584
you know, for being a general that is supposedly all about reading shit generated by AI, none of you actually read
Anonymous No.105652649 [Report]
>>105652633
>>105652633
>>105652633