This file contains the old releases of this repo when it was called gstaichi, before we renamed it to Quadrants.
v4.7.0b1 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.7.0b1 |
| Date | 2026-01-16 |
| Commit | hp/disable-o3-windows |
This pre-release is to test changing the LLVM optimization level on Windows from -O3 to -O1.
- [cuda,amdgpu,cpu,vulkan] Add public API for kernel clock counter. by @duburcqa in #314
- [Perf] dlpack with amdgpu by @hughperkins in #313
- [MISC] Rename excluded_parameters => template_slot_locations by @hughperkins in #338
- [MISC] Create ASTTransformerGlobalContext and remove Kernel from Runtime by @hughperkins in #340
- [MISC] Add xfailing py dataclass tests, and test instrumentation by @hughperkins in #342
- [MISC] Rename leaves => parameters by @hughperkins in #343
- [MISC] Improve information in exceptions during fuse_args by @hughperkins in #341
- [Misc] Add ti.clock_speed_hz() by @hughperkins in #346
- [Type] Py dataclass arguments can be renamed by @hughperkins in #353
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.6.0...v4.7.0b1
| Field | Value |
|---|---|
| Tag | v4.6.0 |
| Date | 2025-12-30 |
| Commit | main |
This release introduces a number of performance improvements when running single-threaded on CPU.
- [Perf] Early return earlier when materializing kernels. by @duburcqa in #324
- [Perf] Cache compiled kernel data systematically. by @duburcqa in #325
- [Perf] Enable dlpack on metal for pytorch ! <= 2.9.1 by @hughperkins in #336
- [Misc] Refactor part1: ONLY moves by @hughperkins in #326
- [Misc] Refactor _func_base.py functions into new base class FuncBase by @hughperkins in #327
- [Misc] Factorize out ASTGenerator, and remove debug dump to checksums.csv by @hughperkins in #328
- [Misc] Fuse extract_args method by @hughperkins in #330
- [Misc] Factorize out _try_load_fastcache by @hughperkins in #331
- [Misc] Factorize out launch context buffer cache by @hughperkins in #329
- [Misc] Miscellaneous refactors around kernel.py and associated files by @hughperkins in #332
- [Misc] Rename ASTTransformerContext to ASTTransformerFuncContext by @hughperkins in #334
- [Misc] Renaming args => py_args; process_args => fuse_args by @hughperkins in #335
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.5.0...v4.6.0
v4.6.0b4 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.6.0b4 |
| Date | 2025-12-27 |
| Commit | hp/renaming-dataclass-args-v2 |
This pre-release is to test enabling renaming py dataclass parameters when passing to sub-functions.
- [Type] Can rename py dataclass structs when calling sub functions by @hughperkins in #333
- [Perf] Early return earlier when materializing kernels. by @duburcqa in #324
- [Perf] Cache compiled kernel data systematically. by @duburcqa in #325
- [Misc] Refactor part1: ONLY moves by @hughperkins in #326
- [Misc] Refactor _func_base.py functions into new base class FuncBase by @hughperkins in #327
- [Misc] Factorize out ASTGenerator, and remove debug dump to checksums.csv by @hughperkins in #328
- [Misc] Fuse extract_args method by @hughperkins in #330
- [Misc] Factorize out _try_load_fastcache by @hughperkins in #331
- [Misc] Factorize out launch context buffer cache by @hughperkins in #329
- [Misc] Miscellaneous refactors around kernel.py and associated files by @hughperkins in #332
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.5.0...v4.6.0b4
v4.6.0b3 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.6.0b3 |
| Date | 2025-12-27 |
| Commit | hp/renaming-dataclass-args-v2 |
This pre-release is to test enabling renaming py dataclass parameters when passing to sub-functions.
- [Type] Can rename py dataclass structs when calling sub functions by @hughperkins in #333
- [Perf] Early return earlier when materializing kernels. by @duburcqa in #324
- [Perf] Cache compiled kernel data systematically. by @duburcqa in #325
- [Misc] Refactor part1: ONLY moves by @hughperkins in #326
- [Misc] Refactor _func_base.py functions into new base class FuncBase by @hughperkins in #327
- [Misc] Factorize out ASTGenerator, and remove debug dump to checksums.csv by @hughperkins in #328
- [Misc] Fuse extract_args method by @hughperkins in #330
- [Misc] Factorize out _try_load_fastcache by @hughperkins in #331
- [Misc] Factorize out launch context buffer cache by @hughperkins in #329
- [Misc] Miscellaneous refactors around kernel.py and associated files by @hughperkins in #332
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.5.0...v4.6.0b3
v4.6.0rc3 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.6.0rc3 |
| Date | 2025-12-27 |
| Commit | main |
This pre-release provides two performance improvements from @duburcqa.
- [Perf] Early return earlier when materializing kernels. by @duburcqa in #324
- [Perf] Cache compiled kernel data systematically. by @duburcqa in #325
- [Misc] Refactor part1: ONLY moves by @hughperkins in #326
- [Misc] Refactor _func_base.py functions into new base class FuncBase by @hughperkins in #327
- [Misc] Factorize out ASTGenerator, and remove debug dump to checksums.csv by @hughperkins in #328
- [Misc] Fuse extract_args method by @hughperkins in #330
- [Misc] Factorize out _try_load_fastcache by @hughperkins in #331
- [Misc] Factorize out launch context buffer cache by @hughperkins in #329
- [Misc] Miscellaneous refactors around kernel.py and associated files by @hughperkins in #332
- [Misc] Rename ASTTransformerContext to ASTTransformerFuncContext by @hughperkins in #334
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.5.0...v4.6.0rc3
v4.6.0b2 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.6.0b2 |
| Date | 2025-12-26 |
| Commit | hp/renaming-dataclass-args-v2 |
This pre-release is to test enabling renaming py dataclass parameters when passing to sub-functions.
- [Type] Can rename py dataclass structs when calling sub functions by @hughperkins in #333
- [Perf] Early return earlier when materializing kernels. by @duburcqa in #324
- [Perf] Cache compiled kernel data systematically. by @duburcqa in #325
- [Misc] Refactor part1: ONLY moves by @hughperkins in #326
- [Misc] Refactor _func_base.py functions into new base class FuncBase by @hughperkins in #327
- [Misc] Factorize out ASTGenerator, and remove debug dump to checksums.csv by @hughperkins in #328
- [Misc] Fuse extract_args method by @hughperkins in #330
- [Misc] Factorize out _try_load_fastcache by @hughperkins in #331
- [Misc] Factorize out launch context buffer cache by @hughperkins in #329
- [Misc] Miscellaneous refactors around kernel.py and associated files by @hughperkins in #332
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.5.0...v4.6.0b2
v4.6.0rc2 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.6.0rc2 |
| Date | 2025-12-18 |
| Commit | hp/func-base-refactorization |
This pre-release is for a behind-the-scenes refactor of kernel_impl.py
- [Misc] Refactorizing kernel_impl.py by @hughperkins in #319
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.6.0...v4.6.0rc2
v4.6.0rc1 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.6.0rc1 |
| Date | 2025-12-18 |
| Commit | hp/func-base-refactorization |
This pre-release is for a behind-the-scenes refactor of kernel_impl.py
- [Misc] Refactorizing kernel_impl.py by @hughperkins in #319
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.6.0...v4.6.0rc1
| Field | Value |
|---|---|
| Tag | v4.5.0 |
| Date | 2025-12-18 |
| Commit | main |
This release provides new performance improvements.
- [perf] Speed up python-side arg processing. by @duburcqa in #323
- [Perf] Speed up computation of cache key. by @duburcqa in #321
- [Misc] Add optimization instrumentation to dump kernels to files by @hughperkins in #317
- [Misc] Add TI_DUMP_CFG to dump CFG graph during optimization passes by @hughperkins in #318
- [Misc] Remove unused inlining module by @hughperkins in #316
- [Misc] TI_DUMP_IR honors config.debug_dump_path by @hughperkins in #322
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.4.0...v4.5.0
v4.5.0b2 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.5.0b2 |
| Date | 2025-12-17 |
| Commit | hp/func-base-refactorization |
This pre-release is to test that a refactorization of kernel_impl.py, under the hood, doesn't break anything.
- [Misc] Refactorizing kernel_impl.py by @hughperkins in https://github.com/Genesis-Embodied-AI/gstaichi/pulls
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.4.0...v4.5.0b2
v4.5.0b1 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.5.0b1 |
| Date | 2025-12-17 |
| Commit | hp/func-base-refactorization |
This pre-release is to test that a refactorization of kernel_impl.py, under the hood, doesn't break anything.
- [Misc] Refactorizing kernel_impl.py by @hughperkins in https://github.com/Genesis-Embodied-AI/gstaichi/pulls
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.4.0...v4.5.0b1
| Field | Value |
|---|---|
| Tag | v4.4.0 |
| Date | 2025-12-17 |
| Commit | main |
This release enables AMDGPU in the Windows and Linux wheels, improves performance for kernels having templated primitive parameters, and fixes a segfault when using to_dlpack on fields.
- [Perf] Fix edge-cases defeating caching mechanism by @duburcqa in #320
- [Bug] Fix dlpack segfault for field by @erizmr in #312
- [AMDGPU] Enable AMDGPU build and AMDGPU test runner by @hughperkins in #306
- [Build] Remove conda by @hughperkins in #311
- [Build] Remove pybind11 and libc++-*-dev from linux build by @hughperkins in #305
- [Build] Windows build uses clang 20 by @hughperkins in #310
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.3.1...v4.4.0
v4.4.0rc2 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.4.0rc2 |
| Date | 2025-12-17 |
| Commit | duburcqa/fix_caching_edge_cases |
- [Perf] Fix edge-cases defeating caching mechanism by @duburcqa in #320
- [Build] Remove conda by @hughperkins in #311
- [Build] Remove pybind11 and libc++-*-dev from linux build by @hughperkins in #305
- [Bug] Fix dlpack segfault for field by @erizmr in #312
- [Build] Windows build uses clang 20 by @hughperkins in #310
- [AMDGPU] Enable AMDGPU build and AMDGPU test runner by @hughperkins in #306
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.3.1...v4.4.0rc2
v4.4.0rc1 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.4.0rc1 |
| Date | 2025-12-17 |
| Commit | duburcqa/fix_caching_edge_cases |
- [Perf] Fix edge-cases defeating caching mechanism by @duburcqa in #320
- [Build] Remove conda by @hughperkins in #311
- [Build] Remove pybind11 and libc++-*-dev from linux build by @hughperkins in #305
- [Bug] Fix dlpack segfault for field by @erizmr in #312
- [Build] Windows build uses clang 20 by @hughperkins in #310
- [AMDGPU] Enable AMDGPU build and AMDGPU test runner by @hughperkins in #306
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.3.1...v4.4.0rc1
v4.4.0b3 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.4.0b3 |
| Date | 2025-12-17 |
| Commit | hp/func-base-refactorization-factorize-cache |
This pre-release is to test the kernel_impl.py refactor.
- [Misc] Refactorizing kernel_impl.py
- [Build] Remove conda by @hughperkins in #311
- [Build] Remove pybind11 and libc++-*-dev from linux build by @hughperkins in #305
- [Bug] Fix dlpack segfault for field by @erizmr in #312
- [Build] Windows build uses clang 20 by @hughperkins in #310
- [AMDGPU] Enable AMDGPU build and AMDGPU test runner by @hughperkins in #306
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.3.1...v4.4.0b3
v4.4.0b2 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.4.0b2 |
| Date | 2025-12-16 |
| Commit | hp/func-base-refactorization |
This pre-release is to test the kernel_impl.py refactor.
- [Misc] Refactorizing kernel_impl.py
- [Build] Remove conda by @hughperkins in #311
- [Build] Remove pybind11 and libc++-*-dev from linux build by @hughperkins in #305
- [Bug] Fix dlpack segfault for field by @erizmr in #312
- [Build] Windows build uses clang 20 by @hughperkins in #310
- [AMDGPU] Enable AMDGPU build and AMDGPU test runner by @hughperkins in #306
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.3.1...v4.4.0b2
v4.4.0b1 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.4.0b1 |
| Date | 2025-12-16 |
| Commit | hp/func-base-refactorization |
This pre-release is to test the kernel_impl.py refactor.
- [Misc] Refactorizing kernel_impl.py
- [Build] Remove conda by @hughperkins in #311
- [Build] Remove pybind11 and libc++-*-dev from linux build by @hughperkins in #305
- [Bug] Fix dlpack segfault for field by @erizmr in #312
- [Build] Windows build uses clang 20 by @hughperkins in #310
- [AMDGPU] Enable AMDGPU build and AMDGPU test runner by @hughperkins in #306
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.3.1...v4.4.0b1
| Field | Value |
|---|---|
| Tag | v4.3.1 |
| Date | 2025-12-03 |
| Commit | main |
This release fixes a bug with offline cache.
- [Bug] Fix parallel cache write by @hughperkins in #309
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.3.0...v4.3.1
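These notes do not say how #309 fixes the parallel cache write. As general background only, one common way to keep concurrent writers from leaving a half-written cache file is to write to a temporary file and then rename it into place. The sketch below illustrates that pattern with the Python standard library; the helper name `atomic_write` is hypothetical and is not taken from gstaichi.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write `data` to `path` so that readers never observe a partial file.

    Hypothetical helper for illustration; not part of gstaichi.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the destination so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
        # Swap the finished file into place; on POSIX this rename is atomic.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if the write or the rename failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, two processes writing the same entry each produce a complete file, and whichever rename happens last wins.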
| Field | Value |
|---|---|
| Tag | v4.3.0 |
| Date | 2025-11-29 |
| Commit | main |
This release provides some additional kernel launch-time speedups, and fixes several fastcache bugs and several to_dlpack bugs.
- [Perf] Avoid runtime overhead due to custom data_oriented attribute getter. by @duburcqa in #279
- [Perf] Redo fastcache kernel key to reuse existing front end cache key by @hughperkins in #283
- [Perf] Speed up argument processing on Python-side. by @duburcqa in #282
- [Perf] Implement to_dlpack for ndarrays on Metal by @hughperkins in #287
- [Perf] Add fastcache test for dupe kernels and fix failure by @hughperkins in #286
- [Perf] Fastcache key contains gstaichi version by @hughperkins in #289
- [Bug] Fix dlpack zero-copy memory alignment by @erizmr in #298
- [Bug] Fix ndarray memory leak by @BernardoCovas and @hughperkins in #278
- [Build] Remove Dockerfile by @hughperkins in #276
- [Build] Pin manylinux arm to 2025.11.11-1 by @hughperkins in #285
- [Build] Change Mac CI build from Mac 15 to Mac 26 by @hughperkins in #288
- [Build] Only use clang 20 or un-versioned for linux build by @hughperkins in https://github.com/Genesis-Embodied-AI/gstaichi/pull/290
- [Build] Remove uploads of wheels to aws by @hughperkins in #292
- [Build] Remove some final clang 20 warnings about using VLAs by @hughperkins in #295
- [Build] Split linux CI gpu tests into three parallel jobs by @hughperkins in #300
- [Misc] Add ti.dump_compile_config() by @hughperkins in #281
- [Misc] Fix homepage URL in pyproject.toml by @oliver-batchelor-work in #291
- [Test] Add tests for consistency between kernel accessors, external accessors, to/from numpy by @hughperkins in #296
- [Type] Fix u1 consistency tests for vulkan and metal by @hughperkins in #302
- [Type] Enable dlpack for metal u1 by @hughperkins in #303
- [SPIRV] Add additional spir-v dump before optimization by @hughperkins in #304
- @oliver-batchelor-work made their first contribution in #291
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.2.0...v4.3.0
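To illustrate why a cache key might include the library version (as in #289), the sketch below hashes a version string together with the kernel source, so that entries written by one version are never looked up by another. This is an assumption-laden illustration, not the actual fastcache key: the function name `cache_key`, its inputs, and the hashing scheme are all hypothetical.

```python
import hashlib


def cache_key(kernel_source: str, library_version: str) -> str:
    """Derive a cache key from the kernel source and the library version.

    Hypothetical sketch; the real fastcache key is not described in these notes.
    """
    hasher = hashlib.sha256()
    for part in (library_version, kernel_source):
        encoded = part.encode("utf-8")
        # Length-prefix each part so that different splits of the same
        # characters cannot produce the same hash input.
        hasher.update(len(encoded).to_bytes(8, "little"))
        hasher.update(encoded)
    return hasher.hexdigest()


source = "def add(a, b): return a + b"
# Same source, different versions: the keys differ, so an upgrade
# automatically misses entries written by the older version.
print(cache_key(source, "4.2.0") != cache_key(source, "4.3.0"))
```

The design choice is that upgrading the library invalidates old entries implicitly, without any explicit cleanup step.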
| Field | Value |
|---|---|
| Tag | v4.2.0 |
| Date | 2025-11-17 |
| Commit | main |
This release upgrades gstaichi to use LLVM 20, enabling use of compute capability up to and including sm_120.
- [Build] Remove libjpg by @hughperkins in #277
- [Build] Llvm 20 by @johnnynunez and @hughperkins in #275
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.1.0...v4.2.0
v4.2.0b1 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.2.0b1 |
| Date | 2025-11-17 |
| Commit | hp/llvm-20 |
This pre-release tests upgrading to LLVM 20.
- [Build] Upgrade to LLVM-20 by @hughperkins in #275
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.1.0...v4.2.0b1
| Field | Value |
|---|---|
| Tag | v4.1.0 |
| Date | 2025-11-16 |
| Commit | main |
This release adds to_dlpack, which provides zero-copy usage of gstaichi tensors in torch; and upgrades LLVM from 15.0.7 to 18.1.8.
- [Type] Add to_dlpack to ndarray tensors by @hughperkins in #270
- [Type] Add to_dlpack for dense fields by @hughperkins in #272
- [Build] LLVM-18 by @johnnynunez and @hughperkins in #274
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.0.0...v4.1.0
v4.1.0b6 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.1.0b6 |
| Date | 2025-11-16 |
| Commit | hp/llvm-18-v2 |
This pre-release is in order to test migrating to LLVM-18.
- [Build] Migrate to LLVM-18 by @hughperkins in #274
- [Type] Add to_dlpack to ndarray tensors by @hughperkins in #270
- [Type] Add to_dlpack for dense fields by @hughperkins in #272
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.0.0...v4.1.0b6
v4.1.0b2 (pre-release)
| Field | Value |
|---|---|
| Tag | v4.1.0b2 |
| Date | 2025-11-15 |
| Commit | hp/llvm-18-v2 |
This pre-release is in order to test migrating to LLVM-18.
- [Build] Migrate to LLVM-18 by @hughperkins in #273
- [Type] Add to_dlpack to ndarray tensors by @hughperkins in #270
- [Type] Add to_dlpack for dense fields by @hughperkins in #272
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v4.0.0...v4.1.0b2
| Field | Value |
|---|---|
| Tag | v4.0.0 |
| Date | 2025-11-12 |
| Commit | main |
This release increases the speed of non-batched ndarray on CPU by 4.5x in Genesis benchmarks. We are removing support for textures, hence the major version bump.
- [cpu] Move from 'nehalem' to 'x86-64-v3' as x86 micro-architecture baseline. by @duburcqa in #265
- [Misc] Remove textures by @hughperkins in #268
- [Perf] Prune unused dataclass fields from being passed to kernels by @hughperkins in #259
- [Perf] Add template mapper key caching. by @duburcqa in #264
- [Perf] Avoid dynamic cast if possible. by @duburcqa in #263
- [Perf] Replace dynamic-size std::vector by fixed-size std::array. by @duburcqa in #266
- [Perf] Do not copy kernel parameters. by @duburcqa in #267
- [Perf] Minor launch kernel python overhead optimisation. by @duburcqa in #269
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v3.3.0b7...v4.0.0
| Field | Value |
|---|---|
| Tag | v3.3.0 |
| Date | 2025-11-07 |
| Commit | main |
This release adds support for ARM on Linux, and a further ndarray performance improvement.
- [Perf] Minor performance improvement. by @duburcqa in #258
- [Build] Add build for ARM by @johnnynunez and @hughperkins in #261
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v3.2.1...v3.3.0
| Field | Value |
|---|---|
| Tag | v3.2.1 |
| Date | 2025-11-02 |
| Commit | main |
This release fixes a bug with ndarray optimizations.
- [Perf] Store template mapper key at instance level in #260
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v3.2.0...v3.2.1
| Field | Value |
|---|---|
| Tag | v3.2.0 |
| Date | 2025-11-02 |
| Commit | main |
This release adds many optimizations so that ndarrays run much faster: from 11x slower than fields before this release to 1.8x slower than fields with this release (on a specific Genesis test, using a 5090 GPU).
- Optimize kernel launch overhead. in #250
- Cleanup launch kernel logics in #251
- Further optimization of kernel launch overhead. in #252
- Optimize of kernel launch overhead on C++ side. in #253
- Diagnose launch context buffer cache-ability. in #254
- Add python caching of launch context buffer. in #255
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v3.1.1...v3.2.0
| Field | Value |
|---|---|
| Tag | v3.1.1 |
| Date | 2025-10-24 |
| Commit | main |
This release fixes a critical bug in fastcache source code reader.
- [Bug] Improve robustness of fastcache source code file reader in #249
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v3.1.0...v3.1.1
| Field | Value |
|---|---|
| Tag | v3.1.0 |
| Date | 2025-10-22 |
| Commit | main |
This release fixes various fast cache bugs; in particular, it aims to fix the fast cache corruption that was occurring.
- [Build] Free up CI runner disk space in #244
- [Perf] Add fastcache= to ti.kernel and mark other approaches as deprecated in #243
- [Perf] Reduce fast cache spam for data oriented members in #242
- [Perf] Avoid fast cache corruption and recover from errors in #239
- [Perf] Enable NamedTuple data oriented classes for fastcache in #248
- [Perf] Fast cache works with derived torch tensors now. in #241
- [Lang] Add option to raise exception if use templated floats in #247
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v3.0.0...v3.1.0
| Field | Value |
|---|---|
| Tag | v3.0.0 |
| Date | 2025-10-14 |
| Commit | main |
This release adds additional validation for pure kernels, and for AD-compatible kernels. It automatically adds primitive values in data oriented objects to the fast cache key. The major version upgrade is because adstack is no longer automatically enabled, and needs to be activated explicitly using the new option ti.init(ad_stack_experimental_enable=True).
- [Test] Completely skip test_matrix_ndarray_oob on windows in #238
- [Misc] Remove apparently unused fp16 includes in #199
- [Perf] Add primitive values in data oriented to fastcache key in #237
- [Perf] Detect pure violation in #230
- [Autodiff] Fail on non-static range in backwards in #229
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.6.1...v3.0.0
| Field | Value |
|---|---|
| Tag | v2.6.1 |
| Date | 2025-10-10 |
| Commit | main |
This patch release fixes an OOM issue we encountered in Genesis CI for push_differentiable.py example.
- [Bug] Fix memory OOM in Genesis push_differentiable example in #228
- [Bug] Fix dataclass expansion with kwargs, and add test for this in #235
- [Doc] 95pct CI build success to 80pct in #224
- [Build] Rename linux_x86 folder and file to linux in #236
- [Test] Disable flaky OOB test on windows in #234
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.6.0...v2.6.1
| Field | Value |
|---|---|
| Tag | v2.6.0 |
| Date | 2025-10-07 |
| Commit | main |
This release doesn't change anything user-facing. Under the hood, we are migrating to our in-house build of LLVM, as the first step of upgrading LLVM to a newer version.
- [Type] Add tests for () indexing, and ndarray ndim == 0 in #233
- [Build] Hopefully fix autoapi for docs build in #232
- [Build] Make build no longer download pybind11 in #231
- [Build] Build using freshly built LLVM 15.0.7 in #216
- [Test] Improve error message for TI_LIB_DIR when running c++ unit tests in #223
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.5.0...v2.6.0
| Field | Value |
|---|---|
| Tag | v2.5.0 |
| Date | 2025-10-04 |
| Commit | main |
This release fixes two bugs on Mac Metal that were causing some Genesis CI tests to fail.
- [build] Try moving more things into pyproject.toml in #189
- [Build] Fix error function in #214
- [Build] Split mac build into build and test in #217
- [Build] Remove git fetch tags, and make changelog, and some other files from misc in #215
- [Build] Windows pypi publish waits for test to finish first in #221
- [Build] Linux pypi publish depends also on test gpu in #222
- [Type] Remove type: ignore from some files in gstaichi folder in #183
- [Misc] Add capabilities to facilitate fastcache debugging in #206
- [Misc] Move taskgen class declaration to header in #210
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.4.0...v2.5.0
| Field | Value |
|---|---|
| Tag | v2.4.0 |
| Date | 2025-09-16 |
| Commit | main |
This release:
- adds a new `pure: bool` parameter to `@ti.kernel`, which marks a kernel as only accessing data passed in as kernel parameters, and therefore eligible for fast src-ll cache. It is equivalent to adding `@ti.pure` in front of the `@ti.kernel` annotation, but easier to parametrize.
- upgrades many external dependencies, and removes unused ones
- [Build] Split windows build into build job and test job, so reruns are faster in #191
- [Lang] Add `pure` parameter to `@ti.kernel` in #190
- [Misc] Update googletest to 1.17.0 in #185
- [Misc] Remove GLFW library in #201
- [Misc] Remove GLM library in #198
- [Misc] Upgrade all Vulkan external libraries to 1.4.321 in #196
- [Misc] Upgrade vulkan sdk to 1.4.321.1 in #204
- [Misc] Upgrade Eigen to commit 70d8d9 in #192
- [Cuda] Remove misleading cuda_version() function, and add link to posts about slim libdevice.10.bc in #202
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.3.1...v2.4.0
| Field | Value |
|---|---|
| Tag | v2.3.1 |
| Date | 2025-09-12 |
| Commit | main |
This patch release fixes two bugs in fast cache:
- templated primitive kernel parameters are now handled correctly
- return values from a kernel no longer cause a crash
- [Perf] Cache template values for fast cache in #177
- [Perf] Fix issue with return type with fastcache, and add unit test for this in #187
- [Build] Ruff now checks for unused imports in #178
- [Build] Migrate build metadata to pyproject.toml in #179
- [Misc] Delete non-working version check in #182
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.3.0...v2.3.1
| Field | Value |
|---|---|
| Tag | v2.3.0 |
| Date | 2025-09-09 |
| Commit | main |
This release adds the possibility of using a static inline if as the top level expression in a for-loop iterator.
- [Build] Remove directx headers in #175
- [Lang] Allow static if expression as top level in for loop iterator in #176
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.2.1...v2.3.0
| Field | Value |
|---|---|
| Tag | v2.2.1 |
| Date | 2025-09-08 |
| Commit | main |
This patch release fixes an issue with fast cache that meant one had to run the same script 3 times to be fully cached; and a crash bug after ti.reset for ndarrays.
- [Type] Fix NotImplementedError in #165
- [Perf] Ensure consistent module name when using src-ll cache in #164
- [Bug] Fix ndarray crash after ti reset in #172
- [Misc] Better error reporting in #167
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.2.0...v2.2.1
| Field | Value |
|---|---|
| Tag | v2.2.0 |
| Date | 2025-09-04 |
| Commit | main |
This release focuses on enabling src-ll cache for Genesis.
- [Misc] Add logging for invalid params to pure kernels in #144
- [Misc] Add ti init option print_non_pure in #145
- [Build] Pin pytest-rerunfailures to < 16 in #162
- [Perf] Disable fast cache for fields in #163
- [Test] Instrument fe-ll-cache so we can test when we get a cache hit in #161
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.1.1...v2.2.0
| Field | Value |
|---|---|
| Tag | v2.1.1 |
| Date | 2025-08-27 |
| Commit | main |
Patch release to fix a bug in SRC-LL-Cache that caused repeated calls to a cached function to fail.
- [Type] Revise the ndarray annotations on test_ndarray.py in #134
- [Type] Clean typing on misc.py in #142
- [Build] Wheel built on macosx 15 runs on lower mac versions in #154
- [Build] Docs are built with correct version number displayed now in #157
- [Misc] Add src_ll_cache flag to ti.init to disable src-ll-cache in #143
- [Misc] Remove superfluous self.compiled-kernel_data = None in #160
- [Perf] Improve testing of pure functions to call function repeatedly after cache load; and fix failure in #159
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.1.0...v2.1.1
| Field | Value |
|---|---|
| Tag | v2.1.0 |
| Date | 2025-08-26 |
| Commit | main |
This release removes spam associated with the new PTX cache, and removes the incorrect warning about the wheel being 'restricted'. We also start to add some initial documentation.
- [Build] Remove restricted warning in #149
- [Misc] Reduce ptx cache spam in #156
- [Doc] Add new doc in #89
- [Test] Fix broken merge of test ndarray max num args skip in #140
- [Doc] Remove copyright from header in #150
- [Doc] Fix readme links to docs. in #148
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v2.0.0...v2.1.0
| Field | Value |
|---|---|
| Tag | v2.0.0 |
| Date | 2025-08-22 |
| Commit | main |
Py dataclasses can now be nested. We added faster cache load time [*1] for kernels annotated with @ti.pure, and running on CUDA. We removed paddle and argpack.
Note that we follow semver.org: removing things is a backwards-incompatible change, hence the major version bump.
[*1] Concretely, on the Genesis simulator, running on an Ubuntu 24.04 box with an NVIDIA 5090 GPU, kernel cache load time for single_franka_envs.py has changed as follows:
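As a rough illustration of that rule, the sketch below computes the next version number from the kind of change. It assumes plain MAJOR.MINOR.PATCH versions as described on semver.org; the function name `next_version` and the change labels are hypothetical and are not part of gstaichi or its release tooling.

```python
def next_version(current: str, change: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a given kind of change.

    `change` is a hypothetical label: "breaking", "feature", or "fix".
    """
    major, minor, patch = (int(part) for part in current.lstrip("v").split("."))
    if change == "breaking":  # removals or other backwards-incompatible changes
        return f"{major + 1}.0.0"
    if change == "feature":  # backwards-compatible additions
        return f"{major}.{minor + 1}.0"
    if change == "fix":  # backwards-compatible bug fixes
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


# Removing features is a breaking change, so v1.0.1 is followed by 2.0.0.
print(next_version("v1.0.1", "breaking"))
```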
- baseline: 7.2s
- with SRC-LL cache added: 2.9s
- with PTX cache added: 4.6s
- with both SRC-LL cache and PTX cache added: 0.3s
- [Type] Add nested py dataclasses and enable .shape for members in #91
- [Perf] Add SRC-LL caching to accelerate cache load time in #131
- [Perf] Add ptx caching to accelerate cache load time in #130 #131
- [Misc] Remove Paddle, argpack in #132 #127
- [vulkan] Fix test_print on Vulkan in #118
- [type] Fix pyright warnings in #136
- [Misc] Remove paddle in #132
- [Misc] Remove argpack in #127
- [Misc] Fix bug with np.bool in args hasher and construct path for debugging in #138
- [Test] Rename test_py_dataclass.py, and change ti.template() to ti.Template in #135
- [Test] Use temporary_module to avoid leaving modules behind in namespace in #146
- [Build] Enable building cpp tests in CI in #120
- [Build] Add gpu runner in #133
- [Build] Run c++ tests on gpu ci runner in #139
Full Changelog: https://github.com/Genesis-Embodied-AI/gstaichi/compare/v1.0.1...v2.0.0
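For the v2.0.0 cache load timings quoted above, the relative speedups can be worked out with a few lines of arithmetic. The sketch below only restates the four published numbers; the variable names are illustrative.

```python
# Kernel cache load times in seconds, copied from the v2.0.0 notes above.
load_times_s = {
    "baseline": 7.2,
    "SRC-LL cache": 2.9,
    "PTX cache": 4.6,
    "SRC-LL cache + PTX cache": 0.3,
}

baseline = load_times_s["baseline"]
# Speedup factor of each configuration relative to the baseline.
speedups = {name: baseline / t for name, t in load_times_s.items()}

for name, factor in speedups.items():
    print(f"{name}: {load_times_s[name]:.1f}s ({factor:.1f}x vs baseline)")
```

This gives roughly 2.5x for the SRC-LL cache alone, 1.6x for the PTX cache alone, and 24x with both enabled, so the combined effect is much larger than either cache on its own.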
| Field | Value |
|---|---|
| Tag | v1.0.1 |
| Date | 2025-08-07 |
| Commit | main |
This initial release is mostly to ensure that our own CI build system is working and can publish wheels correctly. We provide support for passing heterogeneous Python dataclasses into kernels and sub-functions. We made some initial typing improvements. We removed functionality we won't be using.
All contributions in this release are from the Genesis team (@hughperkins).
- Remove C API, AOT, DX11, DX12, Android, IOS, OpenGL, GLES, UI, CLI (#123, #115, #27, #124)
- Increase max kernel args to 512 (#114)
- Add TI_SHOW_COMPILING flag to show when a kernel is compiled (#92)
- Fixed a broad family of debug mode crashes (#8705) (technically part of upstream 1.7.4, since we contributed this to upstream)
- Add fields to python dataclasses support (#76)
- Add python dataclasses support (#73)
- Allow ti.types.NDArray with square brackets as type (#42)
- Allow ti.template and ti.Template for typing (#41)
- Allow none return typing, for kernels and funcs (#18)
- Add pyi stubs to wheel build (#49)
- [Misc] Remove CLI (#124)
- [Misc] Remove shaders (#126)
- [Misc] Remove C api, AOT, DX11, DX12, Android, IOS (#123)
- [Misc] Increase max kernel args to 512 (#114)
- [Misc] Remove opengl and gles (#115)
- [Misc] Add TI_SHOW_COMPILING (#92)
- [Misc] Remove ui, and related doc and examples (#27)
- [Misc] Migrate to TI_DUMP_IR and TI_LOAD_IR and compare with 1 (#53)
- [Misc] Add TI_DUMP_AST (#52)
- [Misc] Dump struct uses file sequencer (#40)
- [Misc] Add TAICHI_DUMP_IR and TAICHI_LOAD_IR (#15)
- [Build] Publish to pypi (#113)
- [Build] Migrate from environment to repo secrets (#106)
- [Build] Merge from upstream (#112)
- [Build] reduce concurrency, and only test python 3.10 (#111)
- [Build] Update git config --global --add safe.directory for gs-taichi (#109)
- [Build] Update gitignore for whl, so, stubs, CHANGELOG.md (#102)
- [Build] Try to fix manylinux build issue (#105)
- [Build] Migrate pyright to run from manylinux (#80)
- [Build] Add gc before each unit test, to prevent ndarray issues (#74)
- [Build] Fix broken builds, hopefully (#77)
- [Build] Reduce concurrency (#75)
- [Build] Run on merge to main (#70)
- [Build] Remove a remaining .github old script (#72)
- [Build] Add stub postprocessing (#57)
- [Build] Remove the old build scripts that we aren't using currently (#36)
- [Build] Win build on matrix, and upload to s3 (#51)
- [Build] re-add scikit-build (#55)
- [Build] Build and publish api doc (#46)
- [Build] Add pyi stubs to wheel build (#49)
- [Build] Grid of macos python versions and upload to s3 (#50)
- [Build] Explicitly start sccache (#45)
- [Build] Manylinux wheel (#37)
- [Build] Remove conditional guard on -Wno-unused-but-set-variable (#48)
- [Build] Make apple definitions conditional on apple (#47)
- [Build] Enable pyright on all files (that don't have `# type: ignore` at the top) (#44)
- [Build] Add codeowners (#38)
- [Build] Add mac build for Mac OS 14 and 15 (#5)
- [Build] Enable ruff check (#32)
- [Build] Increase sccache timeout (#35)
- [Build] Bulk mark # type: ignore (#31)
- [Build] Add name to windows 2025 build, for status check registration (#28)
- [Build] Fix build issue with filesystem (#29)
- [Build] Run ruff check --select I --fix (#20)
- [Build] Fix status check names (#21)
- [Build] Blanket 3 retries (#26)
- [Build] Add Windows github runner (#4)
- [Build] Add linters (#6)
- [Build] Fix tools test on wheel (#17)
- [Build] Add clang-tidy linter, and fix lint errors (#11)
- [Build] Improve link checker (#9)
- [Build] Linux x86 runner (#3)
- [Type] Factorization for nested structs (#99)
- [Type] Update _graph.py to use GraphBuilderCxx (#119)
- [Type] Typing 5: Rename imported cpp classes to end in Cxx (#69)
- [Type] Typing batch 4, including handling kernel/func/real_func wrapper (#67)
- [Type] Add fields to dataclasses.dataclass support (#76)
- [Type] Fix ti.Template on ti.funcs (#94)
- [Type] Typing additions batch 2 c (#61)
- [Type] Add ndarray struct (#73)
- [Type] Bunch of typing added for kernel_impl.py, impl.py, and related (#60)
- [Type] Add lots of typing to ast_transformer.py (#54)
- [Type] Allow ti.types.NDArray with square brackets as type (#42)
- [Type] Allow ti.template and ti.Template for typing (#41)
- [Type] Allow none return typing, for kernels and funcs (#18)
- [Test] Add a test to cover calling class method (#96)
- [Test] test_api sorts the names (needed for renaming taichi => gstaichi) (#108)
- [Test] Unit tests print full traceback on exception (#101)
- [Test] Make test_print no longer break if stdout output (#68)
- [Test] Don't use print to test quant (#62)
- [Test] Fold py38_only.py into appropriate other test scripts (#39)
- [Doc] Update issue template for gstaichi (#121)
- [Doc] The readme is ready to be made publicly visible (#82)
- [Doc] Migrate to sphinx (#90)
- [Doc] Nuke doc and examples (#88)
- [Doc] Migrate docs links to point into repo, rather than to docs server (#33)
- [Doc] Remove rfcs (#34)
- [Doc] Check markup links (#7)
- [Cuda] Move implementation from jit_cuda.h into jit_cuda.cpp (#23)
- [Mac] Fix metal device build (#12)
- [Rhi] [bug] Fix the Unified Allocator to no longer return first two allocations as dupes (#8705) (technically part of 1.7.4, since we contributed this to upstream)
- [Vulkan] Fix exception for max ndarray args test on mac vulkan (#122)