Add PLaMo 3 model support #1234
Open
mitmul wants to merge 12 commits into ml-explore:main from
Conversation
Author
Hi @angeloskath, sorry for the direct ping. This PR has been ready for review for a few days and currently has no reviewer assigned. The diff is intentionally scoped to native PLaMo 3 model support plus focused model tests:
I also re-ran the focused test locally:
Result:

Would you or another maintainer be able to take a look when you have bandwidth?
Thank you to the mlx-lm maintainers for building and maintaining this excellent library. It is a pleasure to contribute support for another open model family to the project.
Summary
- `plamo3` model support for conversion and generation.
- Tokenizer loading goes through the standard `AutoTokenizer` path. The official PLaMo 3 repositories include `tokenization_plamo.py` and `auto_map` metadata for `Plamo3Tokenizer`, so this PR does not vendor a tokenizer implementation into mlx-lm.

Tokenizer note
Using the PLaMo 3 tokenizer requires Hugging Face remote code, so users should pass `--trust-remote-code` or set `tokenizer_config={"trust_remote_code": True}` when loading these checkpoints. The upstream tokenizer/modeling code also has additional runtime dependencies (`torch` and `numba`) that are not added to mlx-lm's core dependencies by this PR. This follows the existing PLaMo 2 behavior, where using the upstream tokenizer can require model-specific remote-code dependencies.

About PLaMo 3
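As a concrete sketch of the flag usage described above (the repository id below is a placeholder, not one of the published checkpoint names, and this assumes the generate CLI forwards `--trust-remote-code` to the tokenizer loader):

```shell
# Placeholder repo id -- substitute a published PLaMo 3 checkpoint.
# --trust-remote-code lets transformers execute the upstream
# tokenization_plamo.py, which itself needs torch and numba installed.
python -m mlx_lm.generate \
  --model pfnet/plamo-3-example \
  --trust-remote-code \
  --prompt "Hello"
```

The equivalent Python path is passing `tokenizer_config={"trust_remote_code": True}` to `mlx_lm.load`, as noted above.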
PLaMo 3 is a next-generation LLM series developed by Preferred Networks in collaboration with NICT. The official PFN blog describes it as part of an effort to build safe, high-performance Japanese domestic LLMs using large, high-quality datasets with attention to Japanese culture and society.
The blog explains that PLaMo 3 moves away from the Samba-based architecture used in PLaMo 2 and instead combines full attention with sliding-window attention, similar in spirit to Gemma 3. This is intended to reduce inference time and KV-cache memory usage while still allowing full-attention layers to capture relationships between distant tokens. PFN reports pretraining experiments for 2B, 8B, and 31B base models, with data mixed across English, Japanese, code, and multilingual corpora, and has published PLaMo 3 NICT 2B/8B/31B Base checkpoints on Hugging Face.
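The KV-cache argument can be made concrete with toy attention masks: a full-attention causal mask grows with the sequence, while a sliding-window mask caps how far back each token can look. The window size below is illustrative only, not the value PLaMo 3 actually uses.

```python
def causal_mask(n):
    # Full attention: token i may attend to every token j <= i.
    return [[j <= i for j in range(n)] for i in range(n)]

def sliding_window_mask(n, window):
    # Sliding-window attention: token i may attend only to tokens j with
    # i - window < j <= i, bounding the KV cache at `window` entries.
    return [[i - window < j <= i for j in range(n)] for i in range(n)]

full = causal_mask(6)
local = sliding_window_mask(6, window=3)

# Row 5 of the full mask attends to all 6 positions; the same row of the
# windowed mask attends only to positions 3..5, regardless of sequence length.
print(sum(full[5]), sum(local[5]))  # -> 6 3
```

Interleaving a few full-attention layers among windowed ones, as the blog describes, is what preserves the ability to relate distant tokens while most layers keep a bounded cache.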
Reference: https://tech.preferred.jp/ja/blog/plamo_3_8b_31b/
Hugging Face:
Validation
```shell
python -m pytest tests/test_models.py -k plamo3 -q
```