Pure Elixir parser for the safetensors file format.
This library reads .safetensors files and loads tensors directly into Nx tensors. It's designed for loading ML model weights in Elixir applications.
The safetensors format is the standard for storing ML model weights. It's:
- Safe: No arbitrary code execution (unlike pickle)
- Fast: Memory-mapped access, zero-copy when possible
- Simple: JSON header + raw tensor data
To run ML models in Elixir, we need to load these weights. This library provides that capability without requiring Python.
```
┌─────────────────────────────────────────────────────────────┐
│                   Model Loading Pipeline                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │ .safetensors │ -> │ Safetensors  │ -> │  Nx Tensors  │   │
│  │     file     │    │   (parser)   │    │    (EMLX)    │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│                                                             │
│  File Format:                                               │
│  ┌────────────────────────────────────────────────────────┐ │
│  │ 8 bytes │ N bytes (JSON)      │ tensor data...         │ │
│  │ header  │ {"tensor_name":     │ [raw bytes]            │ │
│  │ size    │   {dtype, shape,    │                        │ │
│  │         │    offsets}, ...}   │                        │ │
│  └────────────────────────────────────────────────────────┘ │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
Add to your mix.exs:

```elixir
def deps do
  [
    {:safetensors, "~> 0.1"}
  ]
end
```

Or from GitHub:

```elixir
def deps do
  [
    {:safetensors, github: "notactuallytreyanastasio/safetensors_ex"}
  ]
end
```

```elixir
# Read all tensors from a file
{:ok, tensors} = Safetensors.read("model.safetensors")

# tensors is a map: %{"tensor_name" => %Nx.Tensor{}}
weight = tensors["model.embed_tokens.weight"]

# Only load specific tensors (more memory efficient)
{:ok, tensors} = Safetensors.read("model.safetensors",
  only: ["model.embed_tokens.weight", "lm_head.weight"]
)

# Get tensor info without loading data
{:ok, metadata} = Safetensors.metadata("model.safetensors")
# metadata: %{
#   "tensor_name" => %{
#     dtype: :f16,
#     shape: [32000, 4096],
#     data_offsets: [0, 262144000]
#   },
#   ...
# }
```

For models that don't fit in memory, stream tensors:
```elixir
Safetensors.stream("model.safetensors", fn name, tensor ->
  # Process each tensor as it's loaded
  process_tensor(name, tensor)
end)
```

| Safetensors dtype | Nx type | Notes |
|---|---|---|
| F32 | :f32 | 32-bit float |
| F16 | :f16 | 16-bit float |
| BF16 | :bf16 | Brain float 16 |
| I64 | :s64 | Signed 64-bit int |
| I32 | :s32 | Signed 32-bit int |
| I16 | :s16 | Signed 16-bit int |
| I8 | :s8 | Signed 8-bit int |
| U8 | :u8 | Unsigned 8-bit int |
| U32 | :u32 | Unsigned 32-bit int (for quantized weights) |
| BOOL | :u8 | Boolean as u8 |
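The table above is a plain lookup. A minimal sketch of it — module and function names here are illustrative, not the library's API:

```elixir
defmodule DtypeSketch do
  # Illustrative lookup mirroring the dtype table above.
  @mapping %{
    "F32" => :f32, "F16" => :f16, "BF16" => :bf16,
    "I64" => :s64, "I32" => :s32, "I16" => :s16, "I8" => :s8,
    "U8" => :u8, "U32" => :u32, "BOOL" => :u8
  }

  # Returns {:ok, nx_type} or {:error, {:unsupported_dtype, dtype}},
  # matching the error shape shown in the error-handling example below.
  def to_nx_type(dtype) do
    case Map.fetch(@mapping, dtype) do
      {:ok, type} -> {:ok, type}
      :error -> {:error, {:unsupported_dtype, dtype}}
    end
  end
end
```

For example, `DtypeSketch.to_nx_type("BF16")` returns `{:ok, :bf16}`, while an unknown name like `"F64"` produces the tagged error tuple.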
Tensors are created with the current Nx default backend:

```elixir
# Load directly to GPU
Nx.default_backend({EMLX.Backend, device: :gpu})
{:ok, tensors} = Safetensors.read("model.safetensors")
# All tensors now on GPU
```

For 4-bit quantized models (like Qwen3-8B-4bit), weights are stored as `:u32` with packed int4 values:

```elixir
{:ok, tensors} = Safetensors.read("model.safetensors")

# Quantized weights come in triplets
w = tensors["model.layers.0.self_attn.q_proj.weight"]       # u32, packed int4
scales = tensors["model.layers.0.self_attn.q_proj.scales"]  # f16 or bf16
biases = tensors["model.layers.0.self_attn.q_proj.biases"]  # f16 or bf16

# Use with EMLX.quantized_matmul
output = EMLX.quantized_matmul(input, w, scales, biases, true, 64, 4)
```

The safetensors format is simple:
- Header size (8 bytes, little-endian u64): Size of JSON header
- Header (N bytes, UTF-8 JSON): Tensor metadata
- Data (remaining bytes): Raw tensor data, concatenated

Header structure:

```json
{
  "__metadata__": {"format": "pt"},
  "tensor_name": {
    "dtype": "F16",
    "shape": [4096, 4096],
    "data_offsets": [0, 33554432]
  }
}
```

Data offsets are relative to the start of the data section.
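The three sections can be split apart with plain binary pattern matching. A minimal sketch, not the library's actual implementation — `FormatSketch` is a name invented here, and JSON decoding is left out:

```elixir
defmodule FormatSketch do
  # Split a .safetensors binary into {raw_json_header, data_section}.
  def parse(<<header_size::little-unsigned-integer-size(64), rest::binary>>) do
    <<header_json::binary-size(header_size), data::binary>> = rest
    {header_json, data}
  end

  # data_offsets are relative to the data section, so a tensor's bytes
  # are simply a slice of it.
  def tensor_bytes(data, [begin_off, end_off]) do
    binary_part(data, begin_off, end_off - begin_off)
  end
end
```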
```elixir
case Safetensors.read("model.safetensors") do
  {:ok, tensors} ->
    # Use tensors
    tensors

  {:error, :file_not_found} ->
    # File doesn't exist
    {:error, :file_not_found}

  {:error, :invalid_header} ->
    # Corrupted or invalid file
    {:error, :invalid_header}

  {:error, {:unsupported_dtype, dtype}} ->
    # Unknown data type
    {:error, {:unsupported_dtype, dtype}}
end
```
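If you'd rather raise than match on every error, a bang-style wrapper is easy to build on top. `read!/2` below is a hypothetical helper, not part of this library; the read function is injectable only so the sketch is self-contained:

```elixir
defmodule MyApp.Weights do
  # Hypothetical wrapper: raise on any error instead of returning a
  # tagged tuple. Defaults to the library's Safetensors.read/1.
  def read!(path, read_fun \\ &Safetensors.read/1) do
    case read_fun.(path) do
      {:ok, tensors} -> tensors
      {:error, reason} -> raise "failed to read #{path}: #{inspect(reason)}"
    end
  end
end
```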
- Memory-mapped: Large files are memory-mapped for efficiency
- Lazy loading: Tensors loaded on-demand when streaming
- Zero-copy: When possible, data is used directly without copying
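On-demand reads can be done with Erlang's stdlib file primitives. A sketch of the idea — this is an assumption about the approach, not the library's actual internals, and `LazyRead` is a name invented here:

```elixir
defmodule LazyRead do
  # Read a single tensor's bytes without loading the whole file.
  # data_offsets are relative to the data section, as described above.
  def read_tensor_bytes(path, [begin_off, end_off]) do
    {:ok, fd} = :file.open(path, [:read, :raw, :binary])

    # 8-byte little-endian header-size prefix, then the JSON header
    {:ok, <<header_size::little-unsigned-integer-size(64)>>} = :file.pread(fd, 0, 8)
    data_start = 8 + header_size

    {:ok, bytes} = :file.pread(fd, data_start + begin_off, end_off - begin_off)
    :ok = :file.close(fd)
    bytes
  end
end
```

Opening the file in `:raw` mode and using `:file.pread/3` reads only the requested byte range, which is what keeps streaming memory-bounded.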
This is used by:
- bobby_posts: Loads Qwen3-8B-4bit weights
- bumblebee: Model weight loading (Bumblebee has its own loader, but this works standalone)
License: MIT