When attempting to load Qwen models (e.g., Qwen2.5 / Qwen3) using nanovllm, the initialization crashes during the RoPE (Rotary Position Embedding) setup. This occurs even if the model's config.json explicitly sets "rope_scaling": null.
The crash happens in two sequential stages, depending on how the rope_scaling parameter is passed:
First, it throws TypeError: unhashable type: 'dict' because the rope_scaling dictionary is passed to get_rope, which is wrapped by a cache decorator that must hash its arguments.
Second, even if the dict gets past the cache (for example, by being made hashable), it hits an AssertionError in rotary_embedding.py because nanovllm currently does not support rope_scaling.
[rank0]: Traceback (most recent call last):
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/chat_cli.py", line 74, in <module>
[rank0]:     main()
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/chat_cli.py", line 9, in main
[rank0]:     llm = LLM(
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/KvChat/engine/llm_engine.py", line 30, in __init__
[rank0]:     self.model_runner = ModelRunner(config, 0, self.events)
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/KvChat/engine/model_runner.py", line 30, in __init__
[rank0]:     self.model = Qwen3ForCausalLM(hf_config,config)
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/KvChat/models/qwen3.py", line 216, in __init__
[rank0]:     self.model = Qwen3Model(config,vllm_config)
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/KvChat/models/qwen3.py", line 185, in __init__
[rank0]:     self.layers = nn.ModuleList([Qwen3DecoderLayer(config,vllm_config) for _ in range(config.num_hidden_layers)])
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/KvChat/models/qwen3.py", line 185, in <listcomp>
[rank0]:     self.layers = nn.ModuleList([Qwen3DecoderLayer(config,vllm_config) for _ in range(config.num_hidden_layers)])
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/KvChat/models/qwen3.py", line 132, in __init__
[rank0]:     self.self_attn = Qwen3Attention(
[rank0]:   File "/mnt/e/study/nanovllm/nano-kvllm-main/KvChat/models/qwen3.py", line 55, in __init__
[rank0]:     self.rotary_emb = get_rope(
[rank0]: TypeError: unhashable type: 'dict'
Implicit Mutation by transformers: In recent versions of transformers (e.g., 4.51.0) or specific Qwen config implementations, even if "rope_scaling": null is set in the config.json, the Qwen3Config object initializes config.rope_scaling to a default dictionary (or dynamically parses one for long-context models).
Cache Hashing Failure: nanovllm/layers/rotary_embedding.py likely wraps the get_rope function in a caching decorator (such as @lru_cache). Passing the rope_scaling dict straight from the config makes the cache key unhashable and crashes with unhashable type: 'dict' (see the sketch after this list).
Lack of RoPE Scaling Implementation: Even if the dict is sanitized, rotary_embedding.py at line 59 explicitly asserts assert rope_scaling is None. This indicates that RoPE scaling algorithms (like YaRN or Dynamic NTK) are not yet implemented in nanovllm.
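For illustration, here is a minimal, self-contained sketch of the first failure mode. It is not nanovllm's actual code; the cached function below is a hypothetical stand-in for get_rope. functools.lru_cache builds a hashable key from every argument, so a dict argument raises TypeError before the function body even runs.

from functools import lru_cache

@lru_cache(maxsize=None)
def get_rope_cached(head_size: int, base: float = 10000.0, rope_scaling=None):
    # A real implementation would construct and return a rotary embedding module here.
    return ("rope", head_size, base, rope_scaling)

get_rope_cached(128, rope_scaling=None)                    # OK: None is hashable
get_rope_cached(128, rope_scaling={"rope_type": "yarn"})   # TypeError: unhashable type: 'dict'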
Since nanovllm currently does not support RoPE scaling, we should explicitly discard the rope_scaling parameter during Qwen3Attention initialization to allow the model to load properly for standard context lengths.
self.rotary_emb = get_rope(
self.head_dim,
rotary_dim=self.head_dim,
max_position=max_position,
base=rope_theta,
# Explicitly set to None to bypass unhashable dict error
# and satisfy the assertion in rotary_embedding.py
rope_scaling=None,
)
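If silently hard-coding None feels too blunt, the drop can be centralized in a small helper that warns whenever a non-trivial scaling config is being ignored. This is a hypothetical sketch, not existing nanovllm API; the helper name is illustrative.

import logging

def drop_rope_scaling(rope_scaling):
    # transformers may populate config.rope_scaling with a dict even when
    # config.json says "rope_scaling": null; until nanovllm implements RoPE
    # scaling (YaRN, Dynamic NTK, ...), ignore it but make the drop visible.
    if rope_scaling:
        logging.warning("rope_scaling=%s is not supported by nanovllm and will be ignored", rope_scaling)
    return None

The call site then becomes rope_scaling=drop_rope_scaling(getattr(config, "rope_scaling", None)) instead of a literal None.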
Environment:
Models: Qwen3 / Qwen2.5 series
Transformers version: 4.51.0