Overview
The ElasticWeightConsolidator implements Elastic Weight Consolidation (EWC) and related regularization-based continual learning methods. It computes Fisher Information matrices to identify task-critical parameters and applies quadratic penalties during subsequent task learning, preventing catastrophic forgetting while allowing plasticity on less important weights.
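The quadratic penalty described above can be sketched in a few lines. This is a minimal illustration, assuming a diagonal Fisher stored as a dict of tensors keyed by parameter name; the function name and signature are illustrative, not the module's actual API:

```python
import torch
import torch.nn as nn

def ewc_penalty(model: nn.Module, fisher: dict, anchor: dict,
                lambda_ewc: float = 5000.0) -> torch.Tensor:
    """EWC penalty: (lambda/2) * sum_i F_i * (theta_i - theta*_i)^2,
    where theta*_i are the parameter values saved after the previous task."""
    loss = torch.zeros(())
    for name, param in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (param - anchor[name]) ** 2).sum()
    return 0.5 * lambda_ewc * loss
```

Adding this term to the new task's loss anchors parameters with high Fisher values near their previous-task optima while leaving low-importance parameters free to move.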
Key Responsibilities
- Fisher Information computation — Diagonal and block-diagonal Fisher approximation from task-specific data
- Parameter importance mapping — Online importance estimation using both EWC (Kirkpatrick et al., 2017) and Synaptic Intelligence (Zenke et al., 2017)
- Task-specific regularization — Quadratic penalty terms anchoring important parameters to their post-training values
- Online Fisher updates — Incremental Fisher matrix accumulation across tasks without storing full task datasets
- Memory Aware Synapses (MAS) — Unsupervised importance estimation via gradient magnitude (Aljundi et al., 2018)
- Multi-task consolidation — Merging importance maps across a task sequence with configurable decay
- Importance visualization — Heatmaps of parameter importance across layers and tasks
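As a concrete illustration of the Fisher-computation responsibility, a diagonal Fisher estimate can be accumulated from gradients of the log-likelihood over task data. This is a hedged sketch using the empirical-Fisher variant, which squares batch-averaged gradients rather than per-sample gradients for brevity:

```python
import torch
import torch.nn.functional as F

def diagonal_fisher(model, data_loader, n_batches=None):
    """Diagonal empirical Fisher: average of squared log-likelihood gradients.
    Note: squaring the batch-mean gradient is a coarser approximation than
    averaging per-sample squared gradients."""
    fisher = {n: torch.zeros_like(p)
              for n, p in model.named_parameters() if p.requires_grad}
    model.eval()
    count = 0
    for i, (x, y) in enumerate(data_loader):
        if n_batches is not None and i >= n_batches:
            break
        model.zero_grad()
        log_probs = F.log_softmax(model(x), dim=-1)
        # negative log-likelihood of the observed labels, averaged over the batch
        F.nll_loss(log_probs, y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        count += 1
    return {n: f / max(count, 1) for n, f in fisher.items()}
```

The resulting dict maps each parameter name to a non-negative importance tensor of the same shape, which is exactly the form the penalty term consumes.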
Interfaces
Inputs
model: nn.Module — Neural network whose parameters are being consolidated
task_data: DataLoader — Task-specific data for Fisher computation
task_id: str — Unique task identifier
lambda_ewc: float — Regularization strength (default: 5000.0)
fisher_method: Literal["diagonal", "block_diagonal", "kfac"] — Fisher approximation type
online: bool — Whether to use online (running) Fisher estimation
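When `online` is set, per-task Fisher matrices need not be stored separately: a single running estimate can be maintained across tasks with a configurable decay. A sketch under that assumption (the decay parameter name `gamma` is illustrative):

```python
import torch

def merge_fisher_online(fisher_old, fisher_new, gamma: float = 0.9):
    """Running Fisher accumulation: F <- gamma * F_old + F_new.
    Old-task importance decays geometrically, so only one Fisher
    matrix is kept regardless of how many tasks have been seen."""
    if fisher_old is None:
        return dict(fisher_new)
    return {n: gamma * fisher_old[n] + fisher_new[n] for n in fisher_new}
```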
Outputs
ConsolidationResult — Contains penalty loss, per-layer importance stats, forgetting risk score
ImportanceMap — Dict[str, Tensor] mapping parameter names to importance scores
FisherSnapshot — Serializable Fisher matrix for checkpointing
Acceptance Criteria
References
- Kirkpatrick, J., et al. (2017). Overcoming catastrophic forgetting in neural networks. PNAS, 114(13), 3521–3526.
- Zenke, F., Poole, B., & Ganguli, S. (2017). Continual Learning Through Synaptic Intelligence. ICML.
- Aljundi, R., et al. (2018). Memory Aware Synapses: Learning What (Not) to Forget. ECCV.