Merged
35 commits
9506a8f
draft structure of DataDriftAnalyzer
alelavml3 Jun 24, 2025
d1e5ef0
first ks implementation
alelavml3 Jun 24, 2025
028c872
ks after test
alelavml3 Jun 24, 2025
a026027
just test rule accept testWorkers parameter
alelavml3 Jun 24, 2025
3d5106e
new drift info
alelavml3 Jun 24, 2025
ea30f43
testing for analyzer
alelavml3 Jun 25, 2025
70cb647
formatting with bonferroni, new monitoring specs
alelavml3 Jun 25, 2025
8023a91
abstract class for data batch analyzer
alelavml3 Jun 25, 2025
945127b
detection is performed in online or offline according to the monitori…
alelavml3 Jun 25, 2025
27c4733
fix test import but still wrong because it is streaming
alelavml3 Jun 25, 2025
5fca60f
batch drift analyzer
alelavml3 Jun 25, 2025
df80ef2
doc strings
alelavml3 Jun 25, 2025
768b681
test support with polars and pandas
alelavml3 Jun 25, 2025
549a515
Handle new extras in tests; linting according to previous py versions
GiovanniGiacometti Jun 26, 2025
7d1e5db
Parametrize for tests done in a loop
GiovanniGiacometti Jun 26, 2025
fce49d1
Refactor scan method and tests of batch-analyzer
GiovanniGiacometti Jun 30, 2025
fddd086
Improvements to Monitoring Algorithm base class
GiovanniGiacometti Jun 30, 2025
72dd083
Default algorithms builders
GiovanniGiacometti Jun 30, 2025
0b84b97
Refactor classes to accept an instance of algorithms rather than buil…
GiovanniGiacometti Jun 30, 2025
765eb4f
HuggingFace integration uses monitoring modules
GiovanniGiacometti Jun 30, 2025
1eaad45
wip in sklearn general detector
alelavml3 Jul 15, 2025
25869a0
fix tests
alelavml3 Jul 15, 2025
7cbaf46
sklearn detector uses standard monitoring algorithms
alelavml3 Jul 16, 2025
c69b39d
better comment
alelavml3 Jul 16, 2025
468ab57
Merge pull request #7 from ml-cube/dev-drift-analysis
alelavml3 Jul 16, 2025
c662ed1
mypy check on py3.10
alelavml3 Jul 16, 2025
713ed19
remove comma in action extras list
alelavml3 Jul 16, 2025
94c3e90
remove legacy detector
alelavml3 Jul 16, 2025
121a4be
online without bonf
gloriadesideri Jul 24, 2025
656c983
added online monitoring algorithms tests, fixed errors
gloriadesideri Jul 29, 2025
9b4c82d
check for no columns in datasets, added tests
gloriadesideri Jul 30, 2025
daae074
fixed pr comments: added river as optional dependency
gloriadesideri Jul 31, 2025
6e92d34
minor fixes
gloriadesideri Jul 31, 2025
e3e6628
added comment to clarify comparison size = 1 in online algorithm, fix…
gloriadesideri Jul 31, 2025
79d922a
Merge pull request #9 from gloriadesideri/main
Jul 31, 2025
3 changes: 1 addition & 2 deletions .github/actions/validation/action.yml
@@ -3,7 +3,6 @@ description: format, lint, test code
runs:
using: "composite"
steps:

- name: 📦 Install uv
uses: astral-sh/setup-uv@v6

@@ -17,7 +16,7 @@ runs:

- name: 🦾 💅 🧪 Install and validate extras
run: |
-extras=("sklearn" "huggingface")
+extras=("sklearn" "huggingface" "polars" "pandas")

for extra in "${extras[@]}"; do
echo "🦾 Installing extra: $extra"
41 changes: 36 additions & 5 deletions Justfile
@@ -7,33 +7,63 @@ set quiet
default:
just --list --unsorted

# --------------------------------------------------
# Developer Setup

# Synchronize the environment by installing all the dependencies
dev-sync:
uv sync --cache-dir .uv_cache --all-extras

# Synchronize the environment by installing the specified extra dependency
# Currently used within the CI to install extra dependencies and test them.
dev-sync-extra extra:
uv sync --cache-dir .uv_cache --extra {{extra}}

# Synchronize the environment by installing all the dependencies except the dev ones
prod-sync:
-uv sync --cache-dir .uv_cache --all-extras --no-dev
+uv sync --cache-dir .uv_cache --all-extras --no-default-groups

# Synchronize the environment by installing the extra dependency
# specified. Doesn't install the dev dependencies.
prod-sync-extra extra:
uv sync --cache-dir .uv_cache --extra {{extra}} --no-default-groups

# Install the pre-commit hooks
install-hooks:
uv run pre-commit install

# --------------------------------------------------
# Validation

# Run ruff formatting
format:
uv run ruff format

# Run ruff linting and mypy type checking
lint:
uv run ruff check --fix
-uv run mypy --ignore-missing-imports --install-types --non-interactive --package ml3_drift
+uv run mypy --ignore-missing-imports --install-types --non-interactive --package ml3_drift --python-version 3.10


# Default value for testWorkers is auto (meaning all workers available)
# If you want to pass a custom value (such as 4): `just testWorkers=4 test`
# We also run ruff on tests files (it's so fast that it's worth it)

# Little caveat: when running tests with only an extra installed, you'd like
# to avoid having docs dependencies installed (since, for instance, a mkdocs plugin
# requires Pandas, which is one of our extra dependencies). This happens by default
# since docs dependencies are not installed as default dependencies by uv (see pyproject.toml).
# They are only installed when building / serving the documentation. However, if you first
# build the documentation, then run the tests, you will have the docs dependencies installed.
# Should not be a practical problem (especially since in CI environments we don't install docs dependencies),
# but it's worth noting.

# Run the tests with pytest
testWorkers := "auto"
test:
-uv run pytest --verbose --color=yes -n auto --exitfirst tests
+uv run ruff format tests
+uv run ruff check tests --fix
+uv run pytest --verbose --color=yes -n {{testWorkers}} --exitfirst tests

# Run linters, formatters and tests
validate: format lint test
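
As an illustration of how the `testWorkers` variable feeds the `test` recipe, the mapping can be sketched in Python. This is only a sketch: `build_pytest_command` is a hypothetical helper that is not part of the repository, and the validation it performs (accepting `"auto"` or a positive integer) is an assumption about what values are sensible for pytest-xdist's `-n` flag.

```python
def build_pytest_command(test_workers: str = "auto") -> list[str]:
    """Build the command line that the `test` recipe is expected to run.

    Hypothetical helper mirroring the Justfile recipe; `test_workers`
    plays the role of the `testWorkers` variable.
    """
    # Assumption: only "auto" or a positive integer is a meaningful value here.
    is_positive_int = test_workers.isdigit() and int(test_workers) > 0
    if test_workers != "auto" and not is_positive_int:
        raise ValueError(f"unsupported worker count: {test_workers!r}")

    return [
        "uv", "run", "pytest",
        "--verbose", "--color=yes",
        "-n", test_workers,  # passed through to pytest-xdist
        "--exitfirst", "tests",
    ]


# Equivalent of `just test` (default) and `just testWorkers=4 test`.
default_cmd = build_pytest_command()
custom_cmd = build_pytest_command("4")
```

Keeping `"auto"` as the default preserves the previous behaviour of the recipe, while the override is useful on shared machines where using every available worker is undesirable.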
@@ -43,11 +73,12 @@ validate: format lint test

# Generate the documentation
build-docs:
-uv run mkdocs build
+# Make sure mkdocs is installed
+uv run --group docs mkdocs build

# Serve the documentation locally
serve-docs:
-uv run mkdocs serve
+uv run --group docs mkdocs serve

# --------------------------------------------------
# Publishing
8 changes: 6 additions & 2 deletions examples/huggingface/text_embedding_monitoring.py
@@ -3,7 +3,10 @@
from ml3_drift.huggingface.drift_detection_pipeline import (
HuggingFaceDriftDetectionPipeline,
)
-from ml3_drift.huggingface.univariate.ks import KSDriftDetector
+from ml3_drift.monitoring.algorithms.batch.bonferroni import (
+BonferroniCorrectionAlgorithm,
+)
+from ml3_drift.monitoring.algorithms.batch.ks import KSAlgorithm
from ml3_drift.callbacks.base import logger_callback


@@ -37,7 +40,8 @@
# to monitor the drift in the embeddings.

hf_pipe = HuggingFaceDriftDetectionPipeline(
-drift_detector=KSDriftDetector(
+drift_detector=BonferroniCorrectionAlgorithm(
+algorithm=KSAlgorithm(p_value=0.05),
callbacks=[
partial(
logger_callback,
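
For readers unfamiliar with the combination used in this example, the idea behind wrapping a per-dimension KS test in a Bonferroni correction can be sketched with SciPy alone. This is a minimal sketch and not the implementation of `BonferroniCorrectionAlgorithm` or `KSAlgorithm`, whose internals are not shown in this diff; `ks_bonferroni_drift` is a hypothetical name, and the choice to test each embedding dimension separately is an assumption.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_bonferroni_drift(reference, production, p_value=0.05):
    """Flag drift per dimension with a two-sample KS test.

    Hypothetical helper: the significance level is divided by the number
    of dimensions tested (Bonferroni correction) to limit false alarms.
    """
    reference = np.asarray(reference, dtype=float)
    production = np.asarray(production, dtype=float)
    n_dims = reference.shape[1]

    # Bonferroni: each individual test uses p_value / number_of_tests.
    corrected_threshold = p_value / n_dims

    flags = []
    for dim in range(n_dims):
        result = ks_2samp(reference[:, dim], production[:, dim])
        flags.append(bool(result.pvalue < corrected_threshold))
    return flags


rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 3))
production = rng.normal(size=(500, 3))
production[:, 1] += 3.0  # shift only the second dimension

flags = ks_bonferroni_drift(reference, production)
no_drift_flags = ks_bonferroni_drift(reference, reference)
```

Without the correction, testing many embedding dimensions at `p_value=0.05` each would raise the chance of at least one false positive well above 5%, which is presumably why the example wraps the KS algorithm rather than using it directly.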
45 changes: 34 additions & 11 deletions pyproject.toml
@@ -7,19 +7,28 @@ dynamic = ["version"]
license = { text = "Apache-2.0" }
readme = "README.md"

-dependencies = []
+dependencies = [
+"scipy>=1.15.3",
+]

# -------------------------------------------------
# Extra dependencies. This package is designed to be
-# used within one extra at a time, hence we check each
-# extra separately. Remember to update the list of extras
-# in the validation action to ensure tests are run
+# used with different libraries, which means that our code
+# should work only when not all extras are installed.
+# Remember to update the list of extras
+# in the validation CICD to ensure tests are run
# for your new extra
[project.optional-dependencies]

sklearn = ["scikit-learn>=1.6.1"]

-huggingface = ["scipy>=1.15.2", "transformers[torch]>=4.52.3"]
+huggingface = ["transformers[torch]>=4.52.3"]

polars = ["polars>=1.31.0"]

pandas = ["pandas>=2.2.3"]

river = ["river>=0.22.0"]


# -------------------------------------------------
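
Since each extra is validated on its own in CI, code that touches `polars` or `pandas` has to tolerate the other one being absent. One common way to do this is to probe for the module before importing it. The following is a sketch under that assumption, not code from this PR; `is_extra_available` and `to_rows` are hypothetical names.

```python
import importlib.util


def is_extra_available(module_name: str) -> bool:
    """Return True if the top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


def to_rows(data):
    """Convert a polars/pandas DataFrame or a plain sequence of rows to tuples.

    Hypothetical helper: each optional library is imported only if installed,
    so the function works with any subset of the extras.
    """
    if is_extra_available("polars"):
        import polars as pl

        if isinstance(data, pl.DataFrame):
            return data.rows()

    if is_extra_available("pandas"):
        import pandas as pd

        if isinstance(data, pd.DataFrame):
            return list(data.itertuples(index=False, name=None))

    # Fallback: assume an iterable of row-like iterables.
    return [tuple(row) for row in data]


rows = to_rows([[1, 2], [3, 4]])
```

Deferring the import to the point of use keeps module import time free of optional dependencies, which is what allows the test suite to run with a single extra installed.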
@@ -28,31 +37,45 @@ huggingface = ["scipy>=1.15.2", "transformers[torch]>=4.52.3"]
dev = [
"ipykernel>=6.29.5",
"mypy>=1.15.0",
-"pillow>=11.2.1", # for image support in tests
"pre-commit>=4.1.0",
"pytest>=8.3.4",
"pytest-xdist>=3.6.1",
"ruff>=0.9.5",
-# for docs
+# for image support in tests
+"pillow>=11.2.1",
+]
+
+docs = [
-"mkdocs-minify-plugin>=0.7.1",
-"mkdocs-glightbox>=0.3.4",
-"mkdocs-table-reader-plugin>=2.0.1",
-"mkdocs-macros-plugin",
"mkdocs>=1.5.0",
"mkdocs-material>=9.5.0",
"mkdocs-material-extensions>=1.1",
"pygments>=2.14",
"pymdown-extensions>=9.9.1",
"jinja2>=3.0",
"markdown>=3.2",
+"mkdocs-minify-plugin>=0.7.1",
+"mkdocs-glightbox>=0.3.4",
+"mkdocs-table-reader-plugin>=2.0.1",
+"mkdocs-macros-plugin",
"openpyxl",
]

# -------------------------------------------------

# Default groups for uv
[tool.uv]
default-groups = ["dev"]

# -------------------------------------------------

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.version]
path = "src/ml3_drift/__init__.py"


# Set pytest folder
[tool.pytest.ini_options]
testpaths = ["tests"]
Comment on lines +80 to +81

Copilot AI Sep 9, 2025

[nitpick] The pytest configuration section is placed after the uv configuration, but it should be grouped with other tool configurations earlier in the file for better organization. Consider moving this section near other [tool.*] sections.

3 changes: 0 additions & 3 deletions ruff.toml
@@ -33,9 +33,6 @@ exclude = [
line-length = 88
indent-width = 4

-# Assume Python 3.9
-target-version = "py39"

[lint]
# Enable Pyflakes (`F`) and a subset of the pycodestyle (`E`) codes by default.
# Unlike Flake8, Ruff doesn't enable pycodestyle warnings (`W`) or