Transformers v5 support
#15 opened by AntonV (HF Staff)
Depends on https://github.com/huggingface/transformers/pull/42028 and requires the latest transformers version (installed from main).
Usage example:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
    "MiniMaxAI/MiniMax-M2.1",
    device_map="auto",
    revision="refs/pr/15",
)
tokenizer = AutoTokenizer.from_pretrained("MiniMaxAI/MiniMax-M2.1", revision="refs/pr/15")
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]
# return_dict=True is needed so that apply_chat_template returns a mapping
# that can be unpacked into model.generate (a bare tensor cannot be **-unpacked).
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", return_dict=True, add_generation_prompt=True).to("cuda")
generated_ids = model.generate(**model_inputs, max_new_tokens=100)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)
AntonV changed pull request status to open
SGLang and vLLM still use transformers v4. Changing the tokenizer_class from GPT2Tokenizer to TokenizersBackend will break vLLM/SGLang inference:
[2026-01-09 07:37:53 TP0] Scheduler hit an exception: Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/managers/scheduler.py", line 2932, in run_scheduler_process
scheduler = Scheduler(
^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/managers/scheduler.py", line 330, in __init__
self.init_tokenizer()
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/managers/scheduler.py", line 443, in init_tokenizer
self.tokenizer = get_tokenizer(
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/utils/hf_transformers_utils.py", line 496, in get_tokenizer
raise e
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/utils/hf_transformers_utils.py", line 461, in get_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/transformers/models/auto/tokenization_auto.py", line 1137, in from_pretrained
raise ValueError(
ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported.
Perhaps PreTrainedTokenizerFast is a more compatible option?
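To make the failure mode concrete, here is a simplified, purely illustrative sketch (not the actual transformers code) of how a v4-style loader resolves the tokenizer_class string from tokenizer_config.json: the name is looked up among the classes that version knows about, and TokenizersBackend only exists in v5, so v4 raises the ValueError seen in the traceback above.

```python
# Hypothetical stand-in for the v4 class registry; names are illustrative.
KNOWN_V4_TOKENIZER_CLASSES = {"GPT2Tokenizer", "GPT2TokenizerFast", "PreTrainedTokenizerFast"}

def resolve_tokenizer_class_v4(tokenizer_config: dict) -> str:
    """Resolve the tokenizer class name from a tokenizer_config.json dict."""
    name = tokenizer_config.get("tokenizer_class")
    if name not in KNOWN_V4_TOKENIZER_CLASSES:
        # Mirrors the error raised in tokenization_auto.py in the traceback above.
        raise ValueError(
            f"Tokenizer class {name} does not exist or is not currently imported."
        )
    return name

resolve_tokenizer_class_v4({"tokenizer_class": "GPT2Tokenizer"})        # resolves fine on v4
# resolve_tokenizer_class_v4({"tokenizer_class": "TokenizersBackend"})  # raises ValueError on v4
```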
Let me check if we can simply revert to GPT2Tokenizer. https://github.com/huggingface/transformers/pull/42894 should enable TokenizersBackend by default when it isn't specified in the auto mapping.
Yes, it works and still loads the tokenizers backend. That should work for both v4 (GPT2Tokenizer) and v5 (TokenizersBackend).
Updating the other PR as well.
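The resulting behaviour can be sketched as follows. This is a simplified, hypothetical model of the resolution logic described above (the class sets and function names are illustrative, not the real transformers internals): v4 imports the configured class name directly, while v5, per the linked PR, falls back to TokenizersBackend for class names absent from its auto mapping.

```python
# Illustrative class sets per major version; not the real registry contents.
V4_CLASSES = {"GPT2Tokenizer", "GPT2TokenizerFast", "PreTrainedTokenizerFast"}
V5_CLASSES = {"TokenizersBackend", "PreTrainedTokenizerFast"}

def resolve_v4(name):
    # v4 requires the configured class name to exist as an importable class.
    if name in V4_CLASSES:
        return name
    raise ValueError(f"Tokenizer class {name} does not exist")

def resolve_v5(name):
    # Sketch of the post-#42894 behaviour: unknown / legacy class names
    # fall back to the default TokenizersBackend instead of erroring.
    return name if name in V5_CLASSES else "TokenizersBackend"

# With tokenizer_class reverted to "GPT2Tokenizer" in tokenizer_config.json:
print(resolve_v4("GPT2Tokenizer"))  # v4 engines (vLLM/SGLang) keep working
print(resolve_v5("GPT2Tokenizer"))  # v5 still ends up on the new backend
```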