
Add from_logits support to FocalLoss and eps for API consistency with DiceLoss#1268

Merged
qubvel merged 7 commits into qubvel-org:main from Harsh-2005d:add-from-logits-focalloss
Mar 19, 2026

Conversation

@Harsh-2005d
Contributor

Summary

This PR adds from_logits and eps parameters to FocalLoss to align its API with DiceLoss.

Currently, DiceLoss supports both logits and probability inputs via the from_logits flag, while FocalLoss always assumes raw logits. This creates inconsistency when using models configured with an activation function (e.g., activation="softmax"), requiring users to manually remove activations or wrap the loss.

This change introduces a from_logits flag to FocalLoss to support both logits and probability inputs in a consistent manner.


Motivation

Addresses issue #1263.
Example of current behavior:

model = smp.create_model(..., activation="softmax")
outputs = model(x)  # probabilities

dice_loss = smp.losses.DiceLoss(mode="multiclass", from_logits=False)  # works
focal_loss = smp.losses.FocalLoss(mode="multiclass")  # expects logits → incorrect

Because FocalLoss always assumes logits, using it with probability outputs results in incorrect loss computation.

This PR resolves that inconsistency.


Implementation Details

  • Added a from_logits: bool = True parameter to FocalLoss.

  • When from_logits=False:

    • Binary/multilabel modes: probabilities are converted to logits using the inverse sigmoid.
    • Multiclass mode: softmax probabilities are converted to log-space before applying the existing focal computation.

  • No changes were made to the underlying focal formulation.

  • Backward compatibility is preserved (from_logits=True by default).
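The conversion described above can be sketched as follows. This is an illustrative helper, not the PR's actual code; the function name probs_to_logits and the eps default are assumptions for this sketch.

```python
import torch

def probs_to_logits(probs: torch.Tensor, mode: str, eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical sketch of the probability-to-logit conversion."""
    # clamp away from 0 and 1 so the logs below stay finite
    probs = probs.clamp(min=eps, max=1.0 - eps)
    if mode in ("binary", "multilabel"):
        # inverse sigmoid: logit(p) = log(p / (1 - p))
        return torch.log(probs / (1.0 - probs))
    # multiclass: softmax probabilities -> log-probabilities
    return probs.log()

# round trip: sigmoid(logit(p)) should recover p
p = torch.tensor([0.1, 0.5, 0.9])
recovered = torch.sigmoid(probs_to_logits(p, "binary"))
```

The eps clamp is what makes the conversion safe for probabilities at exactly 0 or 1, where the raw inverse sigmoid would produce infinities.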


Tests

Added tests to verify:

  • FocalLoss works correctly with from_logits=False.
  • Probability-based input produces finite and consistent results.
  • Good predictions yield lower loss than bad predictions.

All existing tests pass.
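The consistency property being tested can be exercised with a simplified, self-contained binary focal loss. This sketch is not the library's implementation; it only illustrates that the logits path and the from_logits=False path should agree, and that good predictions score lower than bad ones.

```python
import torch
import torch.nn.functional as F

def focal_loss_binary(inputs, targets, gamma=2.0, from_logits=True, eps=1e-6):
    # simplified binary focal loss, for illustration only
    if not from_logits:
        p = inputs.clamp(eps, 1.0 - eps)
        inputs = torch.log(p / (1.0 - p))  # reconstruct logits via inverse sigmoid
    logpt = -F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
    pt = logpt.exp()  # probability of the true class
    return ((1.0 - pt) ** gamma * -logpt).mean()

logits = torch.tensor([2.0, -1.5, 0.3])
targets = torch.tensor([1.0, 0.0, 1.0])
loss_from_logits = focal_loss_binary(logits, targets, from_logits=True)
loss_from_probs = focal_loss_binary(torch.sigmoid(logits), targets, from_logits=False)

good = focal_loss_binary(torch.tensor([5.0, -5.0]), torch.tensor([1.0, 0.0]))
bad = focal_loss_binary(torch.tensor([-5.0, 5.0]), torch.tensor([1.0, 0.0]))
```

For non-saturated probabilities, loss_from_logits and loss_from_probs should match to within the rounding introduced by the sigmoid round trip.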


Backward Compatibility

  • Default behavior remains unchanged.
  • Existing code using logits is unaffected.
  • New functionality only activates when from_logits=False.

@Harsh-2005d
Contributor Author

@qubvel also, a better implementation of focal loss might be something like this one by OpenMMLab; if you want, I can do that as well.

Collaborator

@qubvel qubvel left a comment


Thanks for your contribution! Please see the comments, looks good otherwise.

Re the OpenMMLab loss: if you can provide some details on why it would be a better option, I would be happy to consider adding it, thanks!

Four resolved comment threads on segmentation_models_pytorch/losses/focal.py (outdated)
Harsh-2005d and others added 4 commits March 11, 2026 19:18
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
@Harsh-2005d Harsh-2005d requested a review from qubvel March 11, 2026 13:53
@Harsh-2005d
Contributor Author

The current fix keeps the existing implementation unchanged, but it still reconstructs logits from probabilities, which is not strictly equivalent to the original logits formulation. The OpenMMLab implementation handles logits and activated probabilities separately and also provides an optimized CUDA implementation. I kept this PR minimal, but I can open a separate issue to discuss the differences between the two implementations if that would be helpful.
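The non-equivalence mentioned above shows up at saturated probabilities: once sigmoid rounds to 1.0 in float32, the eps clamp caps the reconstructed logit well below the original. A small sketch (not from either implementation; the eps value is an assumption):

```python
import torch

eps = 1e-6
x = torch.tensor([20.0])            # extreme logit
p = torch.sigmoid(x)                # saturates to 1.0 in float32
p = p.clamp(eps, 1.0 - eps)         # clamp needed before the log below
x_rec = torch.log(p / (1.0 - p))    # capped near log((1 - eps) / eps) ~ 13.8
```

The focal term (1 - pt)^gamma is tiny in this regime, so the practical impact on the loss value is small, but the paths are not bit-for-bit equivalent.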

@qubvel
Collaborator

qubvel commented Mar 12, 2026

Please run make fixup to fix the style check; otherwise looks good.

> also provides an optimized CUDA implementation. I kept this PR minimal

Thanks for keeping it minimal and easy to review. I would prefer to merge it as is; I don't want to add optimized CUDA kernels at the moment.

@qubvel qubvel enabled auto-merge (squash) March 19, 2026 13:14
@Harsh-2005d
Contributor Author

what's the issue here?

@qubvel
Collaborator

qubvel commented Mar 19, 2026

No issues, merging!

@qubvel qubvel merged commit 4bf6ec0 into qubvel-org:main Mar 19, 2026
17 checks passed