bispodeuspes
6/16/2025, 1:23:30 AM
No.105605627
>>105603304
I made an LLM (AMI; Artificial Meta Intelligence) with effectively unbounded context. It scales linearly with sequence length instead of quadratically like a transformer: rather than attention heads, it mixes the whole sequence with one global operator. It flew over many people's heads, but this is quite big. It's also the easiest model to use: the only dependency is torch, and it has multi-GPU support.
https://github.com/Suro-One/Hyena-Hierarchy
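For anyone wondering how linear-time global mixing can work at all, here is a minimal sketch of the FFT-based long convolution the Hyena family of models is built around. This is not code from the repo above; global_fft_conv and the explicit filter k are hypothetical names for illustration, and real Hyena layers parameterize the filter implicitly with a small network and add gating. Doing the whole-sequence mix through an FFT costs O(L log L) instead of attention's O(L^2), which is why doubling the context roughly doubles the cost instead of quadrupling it.

import torch

def global_fft_conv(u: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    # Hypothetical sketch, not from the linked repo.
    # u: (batch, seq_len, dim) input sequence
    # k: (seq_len, dim) explicit long filter, one tap per position
    L = u.shape[1]
    # Zero-pad to 2L so the circular FFT convolution becomes a linear
    # (causal) convolution over the full sequence.
    u_f = torch.fft.rfft(u, n=2 * L, dim=1)
    k_f = torch.fft.rfft(k, n=2 * L, dim=0)
    # Pointwise multiply in frequency space = convolve in time,
    # so every output position sees every input position.
    y = torch.fft.irfft(u_f * k_f.unsqueeze(0), n=2 * L, dim=1)
    return y[:, :L]

u = torch.randn(2, 1024, 64)        # (batch, seq_len, dim)
k = torch.randn(1024, 64) / 1024    # toy filter just for the demo
print(global_fft_conv(u, k).shape)  # torch.Size([2, 1024, 64])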