Add SizedArray{T, n} #2

Open
andyferris wants to merge 2 commits into master from ajf/sized-array

andyferris commented Mar 16, 2026

Add a new mutable, fixed-size, contiguous buffer type SizedMemory{T, n} to Core, where T is the element type and n is a compile-time integer type parameter specifying the number of elements. This provides a primitive for building statically-sized AbstractArray types that support all element kinds (isbits, boxed, unions). By contrast, the NTuple-based approach used by StaticArrays.jl's MArray is a hack that only works for isbits types.

SizedMemory stores element data inline with no header fields (no length or ptr), making it minimal-overhead and a candidate for stack allocation via escape analysis. It uses the same element storage strategies as GenericMemory (arrayelem layout flags, GC pointer scanning) but is structurally simpler — no MemoryRef indirection, no ownership modes, and the length is derived from the type parameter rather than stored per-instance.

Zero-sized instances (n=0 or sizeof(T)==0) are singletons, following Memory's existing pattern.
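
The "length derived from the type parameter" point can be tried without this branch. The sketch below uses a hypothetical stand-in type, MockSized{T, n}, backed by an ordinary Vector. It is not how SizedMemory is implemented (the real type lives in Core and stores its elements inline); it only illustrates how size queries can be answered from the type alone, with no per-instance length field.

```julia
# Hypothetical stand-in for illustration only. The real SizedMemory is defined
# in Core and stores elements inline; this mock delegates storage to a Vector.
struct MockSized{T, n} <: AbstractVector{T}
    data::Vector{T}
    # Allocate uninitialized storage for exactly n elements.
    MockSized{T, n}() where {T, n} = new{T, n}(Vector{T}(undef, n))
end

# The size is read from the type parameter, not from a stored field.
Base.size(::MockSized{T, n}) where {T, n} = (n,)
Base.IndexStyle(::Type{<:MockSized}) = IndexLinear()

# Element access is forwarded to the backing Vector (which bounds-checks).
Base.getindex(m::MockSized, i::Int) = m.data[i]
Base.setindex!(m::MockSized, x, i::Int) = (m.data[i] = x; m)

m = MockSized{Float64, 3}()
fill!(m, 0.0)
m[2] = 2.0
```

With these definitions, length(m) is 3 and sum(m) is 2.0, mirroring the REPL session in this description. Unlike SizedMemory, the mock still carries a pointer to separately allocated storage, so it says nothing about the layout or allocation behaviour of the real type.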


Example codegen

Here's an example at the REPL defining some basic copy and summation functions.

julia> f3 = SizedMemory{Float64, 3}()
3-element SizedMemory{Float64, 3}:
 5.0e-324
 6.3540281108534e-310
 6.354028114115e-310

julia> f3[:] .= 0
3-element view(::SizedMemory{Float64, 3}, :) with eltype Float64:
 0.0
 0.0
 0.0

julia> f3[2] = 2
2

julia> f3
3-element SizedMemory{Float64, 3}:
 0.0
 2.0
 0.0

julia> sum(f3)
2.0

julia> function my_copy(v::SizedMemory{T, n}) where {T, n}
           v2 = SizedMemory{T, n}()
           for i = 1:n
               v2[i] = v[i]
           end
           return v2
       end
my_copy (generic function with 1 method)

julia> my_copy(f3)
3-element SizedMemory{Float64, 3}:
 0.0
 2.0
 0.0

julia> @code_native my_copy(f3)
        .file   "my_copy"
        .section        .ltext,"axl",@progbits
        .globl  julia_my_copy_0                 # -- Begin function julia_my_copy_0
        .p2align        4
        .type   julia_my_copy_0,@function
julia_my_copy_0:                        # @julia_my_copy_0
; Function Signature: my_copy(SizedMemory{Float64, 3})
; ┌ @ REPL[16]:1 within `my_copy`
# %bb.0:                                # %top
        #DEBUG_VALUE: my_copy:v <- [$rdi+0]
        push    rbp
        mov     rbp, rsp
        push    r14
        push    rbx
        mov     rbx, rdi
        #APP
        mov     rax, qword ptr fs:[0]
        #NO_APP
        mov     rax, qword ptr [rax - 8]
; │ @ REPL[16]:2 within `my_copy`
; │┌ @ boot.jl:672 within `SizedMemory`
        movabs  rcx, offset "+Core.SizedMemory#0"
        mov     r14, qword ptr [rcx]
        mov     rdi, qword ptr [rax + 16]
        movabs  rax, offset ijl_gc_small_alloc
        mov     esi, 408
        mov     edx, 32
        mov     rcx, r14
        call    rax
        mov     qword ptr [rax - 8], r14
; │└
; │ @ REPL[16]:4 within `my_copy`
; │┌ @ sizedmemory.jl:12 within `getindex`
        vmovsd  xmm0, qword ptr [rbx]           # xmm0 = mem[0],zero
; │└
; │┌ @ sizedmemory.jl:19 within `setindex!`
        vmovsd  qword ptr [rax], xmm0
; │└
; │┌ @ sizedmemory.jl:12 within `getindex`
        vmovsd  xmm0, qword ptr [rbx + 8]       # xmm0 = mem[0],zero
; │└
; │┌ @ sizedmemory.jl:19 within `setindex!`
        vmovsd  qword ptr [rax + 8], xmm0
; │└
; │┌ @ sizedmemory.jl:12 within `getindex`
        vmovsd  xmm0, qword ptr [rbx + 16]      # xmm0 = mem[0],zero
; │└
; │┌ @ sizedmemory.jl:19 within `setindex!`
        vmovsd  qword ptr [rax + 16], xmm0
; │└
; │ @ REPL[16]:6 within `my_copy`
        pop     rbx
        pop     r14
        pop     rbp
        ret
.Lfunc_end0:
        .size   julia_my_copy_0, .Lfunc_end0-julia_my_copy_0
; └
                                        # -- End function
        .section        ".note.GNU-stack","",@progbits

The only call is a small allocation, and bounds checks are elided.

Here's an example where escape analysis lets us create scratch space on the stack and work on it to produce a result:

julia> function my_sum(v::SizedMemory{T, n}) where {T, n}
           v2 = SizedMemory{T, n}()
           for i = 1:n
               v2[i] = v[i]
           end
           out = zero(T)
           for i = 1:n
               out += v2[i]
           end
           return out
       end

my_sum (generic function with 1 method)

julia> my_sum(f3)
2.0

julia> @code_native my_sum(f3)
        .file   "my_sum"
        .section        .ltext,"axl",@progbits
        .globl  julia_my_sum_0                  # -- Begin function julia_my_sum_0
        .p2align        4
        .type   julia_my_sum_0,@function
julia_my_sum_0:                         # @julia_my_sum_0
; Function Signature: my_sum(SizedMemory{Float64, 3})
; ┌ @ REPL[18]:1 within `my_sum`
# %bb.0:                                # %top
        #DEBUG_VALUE: my_sum:v <- [$rdi+0]
        push    rbp
        mov     rbp, rsp
        vxorpd  xmm0, xmm0, xmm0
; │ @ REPL[18]:8 within `my_sum`
; │┌ @ float.jl:492 within `+`
        vaddsd  xmm0, xmm0, qword ptr [rdi]
        vaddsd  xmm0, xmm0, qword ptr [rdi + 8]
        vaddsd  xmm0, xmm0, qword ptr [rdi + 16]
; │└
; │ @ REPL[18]:10 within `my_sum`
        pop     rbp
        ret
.Lfunc_end0:
        .size   julia_my_sum_0, .Lfunc_end0-julia_my_sum_0
; └
                                        # -- End function
        .section        ".note.GNU-stack","",@progbits

Codegen did pretty well. I think the sum could possibly have used a wider SIMD instruction for elements 1 and 2?

These patterns cover most of our @generated functions in StaticArrays.jl. We should be able to write StaticArrays as standard methods, with no @inbounds or manually unrolled loops, and use imperative algorithms where they make more sense. Obviously MArray can be a direct wrapper around SizedMemory, but SizedMemory can also serve as "scratch space" for building up the elements of an SArray (or any other StaticArray, for that matter), which is then constructed on return.
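
As a rough sketch of the scratch-space pattern described above, here is a hypothetical helper, cumsum_tuple, that runs on stock Julia. It fills a mutable buffer imperatively and returns an immutable tuple. A plain Vector stands in for the scratch buffer so the example works without this branch; on this branch the assumption is that SizedMemory{T, n}() would take its place. The helper name and the choice of a cumulative sum are illustrative only, and the code assumes a non-empty tuple whose element type is closed under +.

```julia
# Hypothetical helper for illustration: cumulative sum of an NTuple, computed
# imperatively in a mutable scratch buffer and returned as an immutable tuple.
function cumsum_tuple(t::NTuple{n, T}) where {n, T}
    # Stand-in scratch buffer; with this PR it would be SizedMemory{T, n}().
    scratch = Vector{T}(undef, n)
    acc = zero(T)
    for i in 1:n
        acc += t[i]
        scratch[i] = acc
    end
    # Build the immutable result from the scratch contents on return.
    return ntuple(i -> scratch[i], Val(n))
end

result = cumsum_tuple((1.0, 2.0, 3.0))
```

Here result is (1.0, 3.0, 6.0). The Vector in this sketch is heap-allocated; whether a SizedMemory scratch buffer in the same position avoids the allocation would depend on escape analysis, as in the my_sum example.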

andyferris and others added 2 commits March 16, 2026 13:51
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@andyferris andyferris self-assigned this Mar 16, 2026