►Recent Highlights from the Previous Thread: >>106163327

--Paper: HRM: 27M-parameter model outperforms larger models on reasoning tasks:
>106163637 >106163660 >106163706
--GPT-OSS refusal patterns on SFW prompts reveal overblocking and political/copyright sensitivity:
>106164711 >106164734 >106164739 >106164755 >106164769 >106164783 >106164791 >106164817 >106164834 >106164829 >106165316 >106164818 >106164839 >106164849 >106164860 >106164921 >106165250 >106165305 >106165470 >106165385 >106164908 >106164918 >106165288 >106165401 >106164943 >106164954 >106165311 >106164745 >106165327 >106165338 >106165348 >106165203 >106164789 >106164804 >106164815 >106164928
--Qwen3-4B model variants benchmarked across multiple AI evaluation tasks:
>106164159
--Mocking OpenAI's delayed open-weights model as an underwhelming distill, not a breakthrough:
>106163392 >106163746 >106163774 >106163789 >106163894 >106163913 >106163937 >106163985 >106164364
--AI hallucinates policy rulebooks from training data artifacts:
>106163403 >106163468
--Running massive models via mmap and partial offloading in llama.cpp:
>106164211 >106164249 >106165447
--Upcoming GLM support in ik_llama to improve VRAM efficiency:
>106164256 >106164295
--vLLM supports schema-guided generation via tool_choice and internal JSON parsing:
>106163504
--gpt-oss-120b and gpt-oss-20b underperform despite high expectations:
>106163680 >106163708 >106163734 >106164339 >106163753
--MikuPad integration limitations with Ollama and workarounds:
>106163505 >106163586 >106163675 >106163815 >106163998 >106163596
--Anthropic's values-based hiring evokes cult-like corporate alignment culture:
>106163539 >106163627 >106163635 >106163652
--Qwen/Qwen3-4B-Thinking-2507:
>106163454 >106163490
--Miku (free space):
>106163430 >106163590

►Recent Highlight Posts from the Previous Thread: >>106163346 >>106164719

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script