[FEATURE] created_by Provenance Tag Support in ML Commons #4752

@dbwiddis

Is your feature request related to a problem?

The ML Commons stats framework (MLStatsJobProcessor) publishes adoption metrics for models, agents, and connectors as OTel counters with rich tags describing what was created: service provider, model type, deployment mode, etc. However, there is no way to attribute which plugin or caller provisioned a given resource. This makes it impossible to distinguish, in the metrics, between resources created by an automated plugin provisioning flow (e.g., Flow Framework plugin) vs. resources created directly by users via the API, or other plugins.

The MachineLearningClient interface (used by all plugins integrating with ML Commons) provides no mechanism to pass caller provenance. The underlying input objects, MLCreateConnectorInput, MLRegisterModelInput, and MLAgent, have no created_by field. The transport actions that persist these objects (TransportCreateConnectorAction, TransportRegisterModelAction, TransportRegisterAgentAction) never record provenance. And MLModel.getTags() / MLAgent.getTags() have no such dimension to emit.

As a concrete example: a plugin (like Flow Framework) that automates ML resource provisioning (connectors, models, agents) as a "one-and-done" setup step wants to measure how many users continue to actively use the resources it provisioned, as distinct from resources provisioned by other means. The existing stats framework cannot answer this.

What solution would you like?

Add an optional created_by field as first-class metadata across the ML resource creation path, surfaced as a tag in the adoption metrics framework. The changes required span four areas:

  1. Domain objects and input classes (common module)

Add String createdBy to MLCreateConnectorInput, MLRegisterModelInput, MLAgent, and MLModel. (Given that Connectors are currently not used in stats and have a tight relationship to models, we can leave them out.) Implement toXContent, parse, writeTo, and StreamInput constructors in each class, version-gated on a new VERSION_X_Y_Z constant following the existing pattern.

  2. Transport actions (plugin module)

    • TransportRegisterModelAction: copy createdBy from MLRegisterModelInput onto MLModel before indexing
    • TransportRegisterAgentAction: MLAgent is indexed directly, so no additional propagation is needed beyond Step 1

  3. Tag emission (common module)

    • MLModel.getTags() / getTags(Connector): add created_by tag to all three tag-building paths (remote, pre-trained, custom)
    • MLAgent.getTags(): add created_by tag

  4. Connector metrics in MLStatsJobProcessor (plugin module)

AdoptionMetric.CONNECTOR_COUNT is currently defined but never incremented. As part of this work, add connector collection to MLStatsJobProcessor parallel to the existing model collection, reading created_by from the stored connector document and emitting it as a tag. This completes coverage for all three resource types.
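
To make the transport-action and tag-emission changes above concrete, here is a minimal, self-contained sketch. The classes `RegisterModelInput` and `Model` are hypothetical stand-ins for MLRegisterModelInput and MLModel, not the real ML Commons classes, and omitting the tag when createdBy is unset is an assumption rather than something this issue specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CreatedByTagSketch {

    /** Hypothetical stand-in for MLRegisterModelInput; only the fields needed here. */
    static final class RegisterModelInput {
        final String modelName;
        final String createdBy; // optional provenance, may be null

        RegisterModelInput(String modelName, String createdBy) {
            this.modelName = modelName;
            this.createdBy = createdBy;
        }
    }

    /** Hypothetical stand-in for MLModel. */
    static final class Model {
        final String name;
        final String deployment;
        final String createdBy;

        Model(String name, String deployment, String createdBy) {
            this.name = name;
            this.deployment = deployment;
            this.createdBy = createdBy;
        }

        /** Builds the metric tags; created_by is added only when it was set (assumption). */
        Map<String, String> getTags() {
            Map<String, String> tags = new LinkedHashMap<>();
            tags.put("deployment", deployment);
            if (createdBy != null) {
                tags.put("created_by", createdBy);
            }
            return tags;
        }
    }

    /** Mirrors the transport step: copy createdBy from the input onto the model before indexing. */
    static Model toModel(RegisterModelInput input, String deployment) {
        return new Model(input.modelName, deployment, input.createdBy);
    }

    public static void main(String[] args) {
        Model fromPlugin = toModel(new RegisterModelInput("m1", "my-plugin"), "remote");
        Model fromUser = toModel(new RegisterModelInput("m2", null), "remote");
        System.out.println(fromPlugin.getTags()); // {deployment=remote, created_by=my-plugin}
        System.out.println(fromUser.getTags());   // {deployment=remote}
    }
}
```

Whether an unset value should be omitted or emitted as a placeholder such as "unknown" is a design choice left open here; omitting it keeps existing metric series unchanged for resources created without the field.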

With these changes, a plugin provisioning ML resources via the ML Client simply sets the field on the input builder:

MLCreateConnectorInput.builder()
    // ... existing fields ...
    .createdBy("my-plugin")
    .build();

MLRegisterModelInput.builder()
    // ... existing fields ...
    .createdBy("my-plugin")
    .build();

MLAgent.builder()
    // ... existing fields ...
    .createdBy("my-plugin")
    .build();

The stats framework then emits metrics like:

ml.commons.MODEL_COUNT{created_by="my-plugin", deployment="remote", service_provider="bedrock", type="llm", ...}
ml.commons.AGENT_COUNT{created_by="my-plugin", type="conversational", ...}
ml.commons.CONNECTOR_COUNT{created_by="my-plugin", service_provider="bedrock", ...}
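
As a rough illustration of how such tagged counts separate plugin-provisioned resources from the rest, the sketch below groups resources by their full tag set and counts each group. It is a plain-Java approximation under stated assumptions: `countByTags` is a hypothetical helper, and the real MLStatsJobProcessor and its OTel counters are not modeled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class AdoptionCountSketch {

    /**
     * Counts resources per distinct tag set. Each tag set is copied into a
     * TreeMap so that key order does not affect grouping or printing.
     */
    static Map<Map<String, String>, Long> countByTags(List<Map<String, String>> tagSets) {
        Map<Map<String, String>, Long> counts = new LinkedHashMap<>();
        for (Map<String, String> tags : tagSets) {
            counts.merge(new TreeMap<>(tags), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> models = List.of(
            Map.of("created_by", "my-plugin", "deployment", "remote"),
            Map.of("deployment", "remote", "created_by", "my-plugin"),
            Map.of("deployment", "remote") // created directly, no provenance tag
        );
        // Two models share the "my-plugin" tag set; one has no created_by tag.
        countByTags(models).forEach((tags, n) -> System.out.println(tags + " -> " + n));
    }
}
```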

What alternatives have you considered?

  1. Using the existing app_type field on MLAgent: MLAgent already has an appType field, but it is a user-facing classification of the agent's functional purpose (e.g. "chatbot"), not a record of which plugin provisioned it. Overloading it for provenance would conflate two distinct concepts and would not cover connectors or models, which have no equivalent field.

  2. Tagging via connector/model parameters: A plugin could embed a created_by key in the parameters map of a connector or model. However, this is an undocumented convention with no guarantee of surviving updates, no first-class support in getTags(), and no way to filter it out of functional parameters passed to the remote endpoint.

  3. Tracking provenance outside ML Commons: The calling plugin could maintain its own index of resource IDs it provisioned and join that against ML Commons data at query time. This is fragile, requires the plugin to manage additional state, and produces metrics that are disconnected from the rich tag context (service provider, model type, etc.) that MLStatsJobProcessor already computes.

Do you have any additional context?

  • created_by is purely informational metadata — a free-form string with no validation or enforcement by ML Commons. The framework does not need to know or care about the value.
  • The field follows the exact version-gating pattern already used, ensuring backward compatibility in mixed-version clusters where older nodes simply ignore the field.
  • created_by will be visible in GET model/agent/connector API responses, which is desirable for operator visibility into resource provenance.
  • This is not a security boundary. Any caller can set any value. It is not intended to replace or interact with the existing owner/user access control fields.
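
The version-gating point above can be sketched as follows. This is an illustration only: it swaps in `java.io` data streams for the OpenSearch stream classes so that it stays self-contained, and `VERSION_WITH_CREATED_BY` with its integer value is a hypothetical placeholder for the new version constant. The idea is that the optional field is written and read only when the peer's version is at least the version that introduced it, so older nodes never see bytes they do not understand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class VersionGatedFieldSketch {

    // Hypothetical numeric id of the first version that understands createdBy.
    static final int VERSION_WITH_CREATED_BY = 2;

    /** Serializes name and, only for new-enough peers, the optional createdBy. */
    static byte[] write(String name, String createdBy, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(name);
            if (peerVersion >= VERSION_WITH_CREATED_BY) {
                out.writeBoolean(createdBy != null); // presence flag for the optional field
                if (createdBy != null) {
                    out.writeUTF(createdBy);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads back {name, createdBy}; createdBy is null for older peers or when unset. */
    static String[] read(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            String createdBy = null;
            if (peerVersion >= VERSION_WITH_CREATED_BY && in.readBoolean()) {
                createdBy = in.readUTF();
            }
            return new String[] { name, createdBy };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] newPeer = read(write("m1", "my-plugin", 2), 2);
        String[] oldPeer = read(write("m1", "my-plugin", 1), 1);
        System.out.println(newPeer[1]); // my-plugin
        System.out.println(oldPeer[1]); // null
    }
}
```

Both sides must gate on the same version for the stream to stay aligned, which is why the writer and the reader here check the same constant.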

Labels: 3.7 (items marked for 3.7 release), enhancement (new feature or request)
