Thanks for your interest in contributing to nnsight! This guide covers everything you need to get started.
- Python 3.10+
- PyTorch >= 2.4.0
- Git
- Fork and clone the repository:

  ```bash
  git clone https://github.com/<your-username>/nnsight.git
  cd nnsight
  ```

- Install in development mode:

  ```bash
  pip install -e ".[test]"
  ```

  This installs nnsight in editable mode along with the test dependencies (pytest, pytest-cov).

- Verify your setup:

  ```bash
  pytest tests/test_tiny.py --device cpu -x
  ```

Some features depend on optional packages:

- diffusers: Required for `DiffusionModel` support. Install with `pip install diffusers`.
- vllm: Required for vLLM integration. Install with `pip install "vllm>=0.12"`.

These are not required for core development or running the main test suite.
```
src/nnsight/
├── _c/                  # C extensions (pymount)
├── intervention/
│   ├── batching.py      # Batchable interface
│   ├── envoy.py         # Core Envoy wrapper (module proxy)
│   ├── interleaver.py   # Hook dispatch, thread management
│   ├── backends/        # Execution backends
│   ├── tracing/
│   │   ├── base.py      # Tracer base class, source extraction
│   │   ├── tracer.py    # InterleavingTracer, ScanningTracer
│   │   ├── invoker.py   # Invoker for batched execution
│   │   ├── iterator.py  # Step iteration (tracer.iter)
│   │   ├── globals.py   # Global state, pymount lifecycle
│   │   ├── backwards.py # Gradient tracing
│   │   └── util.py      # Exception handling, frame utils
│   └── serialization.py # NDIF serialization
├── modeling/
│   ├── base.py          # NNsight base class
│   ├── language.py      # LanguageModel
│   ├── vlm.py           # VisionLanguageModel
│   ├── diffusion.py     # DiffusionModel
│   ├── huggingface.py   # HuggingFace model mixin
│   ├── transformers.py  # Transformers integration
│   └── mixins/          # Meta loading, dispatch, remote
├── schema/              # Config schema
├── ndif.py              # Remote execution (NDIF)
└── util.py              # Utilities
```
NNsight uses deferred execution with thread-based synchronization:
- Code inside `with model.trace(...)` is extracted via AST, compiled, and run in a worker thread.
- When the thread accesses `.output` or `.input`, it blocks until the model's forward pass provides the value via a PyTorch hook.
- Each invoke runs in its own thread, executing serially in definition order.
Understanding this architecture is important for working on the intervention system. See CLAUDE.md and NNsight.md for deep technical documentation.
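The blocking behavior described above can be sketched with plain threading primitives. This is an illustrative model of the mechanism only, not nnsight's actual implementation; the `LazyValue` class and its methods are hypothetical stand-ins for the Envoy proxy and hook machinery.

```python
import threading

# Sketch of the synchronization described above: a worker thread (the
# traced user code) blocks on a value until the "forward pass" (here,
# the main thread) provides it, as a PyTorch hook would.
class LazyValue:
    """Hypothetical stand-in for an Envoy's .output proxy."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def get(self):
        # Blocks until the forward pass fills in the value.
        self._ready.wait()
        return self._value

    def set(self, value):
        self._value = value
        self._ready.set()


output = LazyValue()
results = []

def traced_code():
    # Worker thread: runs the trace body, blocking inside .get().
    results.append(output.get() * 2)

worker = threading.Thread(target=traced_code)
worker.start()

# The "forward pass" on the main thread provides the value via the hook.
output.set(21)
worker.join()
print(results)  # [42]
```

This is why module access order matters (see Common Pitfalls below): if the worker waits on a value the forward pass has already moved past, the `wait()` never returns.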
```bash
pytest tests/test_lm.py tests/test_tiny.py tests/test_0516_features.py \
    tests/test_debug.py tests/test_memory_cleanup.py tests/test_multiple_wrappers.py \
    --device cpu
```

For a quick check:

```bash
pytest tests/test_tiny.py --device cpu -x
```

The diffusion tests require diffusers to be installed and use a tiny test model (~1.4M params) that runs on CPU:

```bash
pytest tests/test_diffusion.py --device cpu
```

If you have a CUDA GPU available:

```bash
pytest tests/test_lm.py --device cuda:0
```

Tests are organized with pytest markers: `scan`, `source`, `iter`, `cache`, `rename`, `skips`, `order`. You can run a specific category:

```bash
pytest tests/test_lm.py -m iter --device cpu
```

- Check existing issues to see if someone is already working on what you have in mind.
- For larger changes, open an issue first to discuss the approach.
- For bug fixes, include a minimal reproduction if possible.
- Follow existing patterns in the codebase. NNsight doesn't use a formatter, but consistency with surrounding code is expected.
- Keep changes focused. A bug fix shouldn't include unrelated refactoring.
- Don't add docstrings, comments, or type annotations to code you didn't change.
- Only add comments where the logic isn't self-evident.
- Tests go in the `tests/` directory.
- Use the existing fixtures in `tests/conftest.py` where possible.
- All tests should pass on CPU (`--device cpu`). GPU-specific tests should be skipped when no GPU is available.
- Use `@torch.no_grad()` for inference-only tests.
- For optional dependencies (like diffusers), use `pytest.importorskip()` at the top of the test file.
- Module access order matters. Within a single invoke, modules must be accessed in forward-pass execution order. Accessing layer 5 then layer 2 will deadlock.
- `.save()` is required to persist values outside a trace context. Values without `.save()` are garbage collected.
- Source cache issues. If you hit `AttributeError: 'Info' object has no attribute 'pull'`, clear the cache with `Globals.cache.clear()` from `nnsight.intervention.tracing.globals`.
- Benchmarking. Define trace functions at module level (not inside loops). Warm up 2-3 iterations before timing. Use `torch.cuda.synchronize()` for GPU timing.
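The benchmarking advice above can be sketched as follows. The workload here is a pure-Python placeholder; in real benchmarks it would be a module-level trace function, and for GPU timing you would call `torch.cuda.synchronize()` before each clock read.

```python
import time

def workload():
    # Placeholder for a module-level trace function; defining it once at
    # module level (not inside the timing loop) avoids repeated setup cost
    # such as source re-extraction.
    return sum(i * i for i in range(10_000))

# Warm-up iterations: let caches and allocators settle before timing.
for _ in range(3):
    workload()

# Timed iterations. On GPU, call torch.cuda.synchronize() before reading
# perf_counter() so pending kernels are included in the measurement.
n_iters = 10
start = time.perf_counter()
for _ in range(n_iters):
    workload()
elapsed = time.perf_counter() - start
per_iter_ms = elapsed / n_iters * 1e3
print(f"{per_iter_ms:.3f} ms/iter")
```

Averaging over several iterations after warm-up smooths out one-off costs that would otherwise dominate a single measurement.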
- Create a branch from the latest `main`:

  ```bash
  git checkout -b my-feature main
  ```

- Make your changes and add tests.

- Run the test suite:

  ```bash
  pytest tests/test_tiny.py tests/test_lm.py --device cpu -x
  ```

- Push your branch and open a pull request against `main`.

- In your PR description:
  - Explain what the change does and why.
  - Reference any related issues.
  - Note any breaking changes.
If your PR isn't getting the attention it deserves, come bug us on Discord!
Open an issue at github.com/ndif-team/nnsight/issues with:
- NNsight version (`python -c "import nnsight; print(nnsight.__version__)"`)
- Python and PyTorch versions
- A minimal code example that reproduces the issue
- The full error traceback
- Documentation: nnsight.net
- Forum: discuss.ndif.us
- Discord: discord.gg/6uFJmCSwW7
By contributing to nnsight, you agree that your contributions will be licensed under the MIT License.