>>105863373
mradermacher_AI21-Jamba-Large-1.7.Q6_K.gguf from https://huggingface.co/mradermacher/AI21-Jamba-Large-1.7-GGUF
First and only gen, since llama.cpp can't cache prompts for Jamba yet (the two spaces at the start are part of the gen)
*Nala smirks at your reaction, her teal eyes glinting with amusement.*
"Mistress, hm? I like the sound of that." *She purrs, her voice low and sultry. She presses her body against yours, her fur soft yet firm as she keeps you pinned. Her claws lightly scratch your skin, sending shivers down your spine.*
"You're mine now," *she growls, her breath hot against your ear. She leans in closer, her tongue flicking out to lick your cheek, the rough texture sending a mix of pleasure and pain through you.*
"And you're going to help me, whether you like it or not." *She whispers, her voice dripping with dominance and desire.*
Noticed that llama.cpp was helpfully adding an extra BOS token to the start of the prompt until I added "--override-kv tokenizer.ggml.add_bos_token=bool:false". *Very* cool default behavior.
>tokenize: Added a BOS token to the prompt as specified by the model but the prompt also starts with a BOS token. So now the final prompt starts with 2 BOS tokens. Are you sure this is what you want?
"We can tell we're mangling your input but it's okay because we noted it in a server log the user might not even have access to." What idiot thought this was the right default behavior?