Skip to content

Latest commit

 

History

History
250 lines (186 loc) · 7.88 KB

File metadata and controls

250 lines (186 loc) · 7.88 KB

CLAUDE.md

FAIRChem is Meta FAIR Chemistry's ML framework for atomistic simulations. Core abstractions: foundation models (UMA) with backbone+heads architecture, ASE calculator integration, Hydra-based config, and multi-task training via TorchTNT.

Development Commands

# Install
pip install -e packages/fairchem-core[dev]

# Tests (always pass -c flag)
pytest tests -c packages/fairchem-core/pyproject.toml
pytest tests/core/models/test_uma.py -vv
pytest tests/core -m "not gpu"

# Lint & format — REQUIRED for every modified file before committing
pre-commit run --files path/to/modified_file.py

# CLI
fairchem -c config.yaml [overrides...]

Code Style

IMPORTANT: You MUST run pre-commit run --files /path/to/modified_file.py on every file you modify, before considering the task complete. No exceptions.

Every file must start with:

"""
Copyright (c) Meta Platforms, Inc. and affiliates.

This source code is licensed under the MIT license found in the
LICENSE file in the root directory of this source tree.
"""

from __future__ import annotations

Line length: 88 characters. Linter: Ruff (config in ruff.toml).

Docstrings use Google convention. No text on opening/closing quote lines:

# WRONG
"""This is wrong."""

# RIGHT
"""
Short description.
"""

# RIGHT (with args)
"""
Short description.

Args:
    x: The input tensor.

Returns:
    The processed tensor.
"""

Imports: isort enforced via Ruff. fairchem.core is known-first-party. Use TYPE_CHECKING for type-only imports:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from fairchem.core.datasets.atomic_data import AtomicData

Testing Conventions

All tests go in tests/ (mirrors src/fairchem/ structure). Always run with:

pytest tests -c packages/fairchem-core/pyproject.toml

Test Markers

  • @pytest.mark.gpu: GPU-only (auto-skipped when CUDA unavailable)
  • @pytest.mark.cpu_and_gpu: Runs on both CPU and GPU
  • @pytest.mark.dgl: Requires fairchem_cpp
  • @pytest.mark.inference_check: Inference validation (skipped by default)

Key Fixtures

Root conftest (tests/conftest.py):

  • seed_fixture (function): Seeds all RNGs to 42
  • water_xyz_file (session): Path to a minimal 3-atom water XYZ file
  • compile_reset_state (function): Resets torch.compiler before/after test
  • setup_before_each_test (autouse): Cleans up Ray, GPU memory, distributed state

Core conftest (tests/core/conftest.py):

  • dummy_binary_dataset (session, parametrized): ASE dataset in both LMDB and CIF formats
  • fake_uma_dataset (session): Full UMA training dataset path + config
  • direct_checkpoint (session): Trained model checkpoint (inference + resume)
  • direct_mole_checkpoint (session): Trained MOLE checkpoint
  • torch_deterministic (function): Enables deterministic algorithms
  • snapshot (function): Syrupy snapshot with approximate numpy comparison (Approx)

Dataset conftest (tests/core/datasets/conftest.py):

  • structures (module): List of test atoms [H2O molecule, Cu bulk, Pt slab]

Test Patterns

GPU/CPU dual tests:

@pytest.mark.gpu
def test_something_gpu():
    _test_something("cuda")


def test_something_cpu():
    _test_something("cpu")


def _test_something(device):
    # shared implementation
    ...

Snapshot testing with approximate comparison:

def test_values(snapshot):
    result = compute_something()
    assert pytest.approx(result.numpy(), abs=1e-3) == snapshot

Integration tests using the CLI:

from tests.core.testing_utils import launch_main


def test_training(fake_uma_dataset):
    sys_args = [
        "--config",
        "tests/core/units/mlip_unit/test_mlip_train.yaml",
        f"datasets.data_root_dir={fake_uma_dataset}",
        "job.device_type=CPU",
        "max_steps=2",
    ]
    launch_main(sys_args)

Architecture

Model System (Backbone + Heads)

Models use HydraModel (registered as "hydra"): one backbone extracts features, multiple heads predict properties.

BackboneInterface.forward(data: AtomicData) -> dict[str, Tensor]   # features
HeadInterface.forward(data: AtomicData, emb: dict) -> dict[str, Tensor]  # predictions

Primary backbone: escnmd_backbone (SO(3)-equivariant eSCN with MD modifications). Heads: MLP_Energy_Head, Linear_Force_Head, DatasetSpecificSingleHeadWrapper.

Registry Pattern

Components are registered for dynamic Hydra instantiation:

@registry.register_model("my_backbone")
class MyBackbone(nn.Module, BackboneInterface): ...

Available decorators: register_model, register_dataset, register_loss, register_task, register_logger, register_trainer.

Lookup: registry.get_model_class("my_backbone") or by full import path "fairchem.core.models.my_module.MyBackbone".

Data Flow

ASE Atoms -> AtomicData.from_ase() -> graph generation -> backbone -> heads -> predictions

AtomicData required fields: pos, atomic_numbers, cell, pbc, natoms, edge_index, cell_offsets, nedges, charge, spin, fixed, tags. Optional targets: energy, forces, stress.

Batching via atomicdata_list_to_batch(). Multi-task collation via MTCollater (fills missing targets with inf for loss masking).

Configuration (Hydra)

YAML configs use _target_ keys for component instantiation:

runner:
  _target_: fairchem.core.components.train.train_runner.TrainEvalRunner
  train_eval_unit:
    _target_: fairchem.core.units.mlip_unit.mlip_unit.MLIPTrainEvalUnit
    model:
      _target_: fairchem.core.models.base.HydraModel
      backbone: ${backbone}

Config sections: job, runner, datasets, tasks, backbone, optimizer. Default configs in configs/. Overrides via CLI: fairchem -c config.yaml key=value.

Task Names

  • oc20: Catalysis (Open Catalyst)
  • omat: Inorganic materials
  • omol: Molecules
  • odac: Metal-organic frameworks
  • omc: Molecular crystals

Training Flow

TrainEvalRunner orchestrates training via TorchTNT's fit(). Core unit: MLIPTrainEvalUnit (handles forward pass, loss, metrics, EMA, gradient clipping). Checkpoints use DCP (Distributed Checkpoint Protocol) with dcp_to_torch_save() for inference export.

Model Loading and Inference

from fairchem.core import pretrained_mlip, FAIRChemCalculator

predictor = pretrained_mlip.get_predict_unit("uma-s-1p1", device="cuda")
calc = FAIRChemCalculator(predictor, task_name="oc20")
atoms.calc = calc

Repository Structure

src/fairchem/core/
├── models/              # Backbones and heads (UMA, eSCN-MD, GemNet)
├── datasets/            # Data loading (LMDB, ASE), collaters, samplers
├── components/          # Runner components (train, evaluate, calculate)
├── units/               # TorchTNT train/eval/predict units
├── modules/             # Loss, schedulers, normalizers, evaluators
├── launchers/           # Local, Ray, SLURM job launchers
├── common/              # Registry, distributed utils, logging
├── graph/               # Graph generation, neighbor finding with PBC
└── _cli.py              # CLI entry point

tests/                   # All tests (mirrors src structure)
packages/                # Installable packages (fairchem-core, fairchem-data-*)
configs/                 # Hydra YAML configs (datasets, tasks, backbone, optimizer)

Key Dependencies

  • torch~=2.8.0, e3nn>=0.5 - PyTorch + equivariant neural networks
  • ase>=3.26.0 - Atomic Simulation Environment
  • torchtnt - PyTorch training framework (TrainUnit/EvalUnit)
  • hydra-core + omegaconf - Configuration management
  • lmdb - Dataset storage format
  • ray[serve]>=2.53.0 - Distributed computing

Anytime we learn something that could be beneficial in future coding sessions, automatically add it to CLAUDE.md.

This includes:

  • Gotchas that are not obvious
  • Subtle bugs that manifest under specific conditions
  • Repeat corrections I make to the output of coding agents