Transformers v5 support
#15 opened by AntonV (HF Staff)
Depends on https://github.com/huggingface/transformers/pull/42028 and requires the latest transformers version (installed from main).
Usage example:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
    "MiniMaxAI/MiniMax-M2.1",
    device_map="auto",
    revision="refs/pr/15",
)
tokenizer = AutoTokenizer.from_pretrained("MiniMaxAI/MiniMax-M2.1", revision="refs/pr/15")
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]
# return_dict=True is needed so that apply_chat_template returns a mapping
# that can be unpacked into model.generate (a bare tensor cannot be **-unpacked).
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", return_dict=True, add_generation_prompt=True).to("cuda")
generated_ids = model.generate(**model_inputs, max_new_tokens=100)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)
AntonV changed pull request status to open
SGLang and vLLM still use transformers v4. Changing the tokenizer_class from GPT2Tokenizer to TokenizersBackend will break vLLM/SGLang inference:
[2026-01-09 07:37:53 TP0] Scheduler hit an exception: Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/managers/scheduler.py", line 2932, in run_scheduler_process
scheduler = Scheduler(
^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/managers/scheduler.py", line 330, in __init__
self.init_tokenizer()
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/managers/scheduler.py", line 443, in init_tokenizer
self.tokenizer = get_tokenizer(
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/utils/hf_transformers_utils.py", line 496, in get_tokenizer
raise e
File "/usr/local/lib/python3.12/dist-packages/sglang/srt/utils/hf_transformers_utils.py", line 461, in get_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/transformers/models/auto/tokenization_auto.py", line 1137, in from_pretrained
raise ValueError(
ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported.
Perhaps PreTrainedTokenizerFast is a more compatible option?
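To make the failure mode concrete, here is a simplified, purely illustrative sketch (not the actual transformers code) of how a v4-style loader resolves the tokenizer_class string from tokenizer_config.json: the name is looked up among the classes that version knows about, and TokenizersBackend only exists in v5, so v4 raises the ValueError seen in the traceback above.

```python
# Hypothetical stand-in for the v4 class registry; names are illustrative.
KNOWN_V4_TOKENIZER_CLASSES = {"GPT2Tokenizer", "GPT2TokenizerFast", "PreTrainedTokenizerFast"}

def resolve_tokenizer_class_v4(tokenizer_config: dict) -> str:
    """Resolve the tokenizer class name from a tokenizer_config.json dict."""
    name = tokenizer_config.get("tokenizer_class")
    if name not in KNOWN_V4_TOKENIZER_CLASSES:
        # Mirrors the error raised in tokenization_auto.py in the traceback above.
        raise ValueError(
            f"Tokenizer class {name} does not exist or is not currently imported."
        )
    return name

resolve_tokenizer_class_v4({"tokenizer_class": "GPT2Tokenizer"})        # resolves fine on v4
# resolve_tokenizer_class_v4({"tokenizer_class": "TokenizersBackend"})  # raises ValueError on v4
```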
Let me check if we can simply revert to GPT2Tokenizer. https://github.com/huggingface/transformers/pull/42894 should enable TokenizersBackend by default when it isn't specified in the auto mapping.
Yes, it works and still loads the tokenizers backend. That should work for both v4 (GPT2Tokenizer) and v5 (TokenizersBackend).
Updating the other PR as well.
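The resulting behaviour can be sketched as follows. This is a simplified, hypothetical model of the resolution logic described above (the class sets and function names are illustrative, not the real transformers internals): v4 imports the configured class name directly, while v5, per the linked PR, falls back to TokenizersBackend for class names absent from its auto mapping.

```python
# Illustrative class sets per major version; not the real registry contents.
V4_CLASSES = {"GPT2Tokenizer", "GPT2TokenizerFast", "PreTrainedTokenizerFast"}
V5_CLASSES = {"TokenizersBackend", "PreTrainedTokenizerFast"}

def resolve_v4(name):
    # v4 requires the configured class name to exist as an importable class.
    if name in V4_CLASSES:
        return name
    raise ValueError(f"Tokenizer class {name} does not exist")

def resolve_v5(name):
    # Sketch of the post-#42894 behaviour: unknown / legacy class names
    # fall back to the default TokenizersBackend instead of erroring.
    return name if name in V5_CLASSES else "TokenizersBackend"

# With tokenizer_class reverted to "GPT2Tokenizer" in tokenizer_config.json:
print(resolve_v4("GPT2Tokenizer"))  # v4 engines (vLLM/SGLang) keep working
print(resolve_v5("GPT2Tokenizer"))  # v5 still ends up on the new backend
```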