changing the tokenizer settings makes a vast difference
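
to make that concrete, here is a minimal sketch using a toy tokenizer. The `tokenize` function and its `lowercase` and `split_punctuation` flags are hypothetical and written only for this illustration; they are not the settings of any particular library, and the actual tokenizer in question may behave differently.

```python
import re


def tokenize(text, lowercase=True, split_punctuation=True):
    """Toy tokenizer (hypothetical, for illustration only)."""
    if lowercase:
        text = text.lower()
    if split_punctuation:
        # Runs of word characters and single punctuation marks
        # each become their own token.
        return re.findall(r"\w+|[^\w\s]", text)
    # Otherwise split on whitespace only, leaving punctuation attached.
    return text.split()


sample = "Don't panic: the Tokenizer's settings matter."

# Same input, two different configurations.
default_tokens = tokenize(sample)
raw_tokens = tokenize(sample, lowercase=False, split_punctuation=False)

print(len(default_tokens), default_tokens)
print(len(raw_tokens), raw_tokens)
```

with these two configurations the same sentence comes out as 12 tokens versus 6, so anything downstream that depends on the token sequence (vocabulary, counts, sequence length) sees different input.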