Skip to content

Add created_by field to ML resources for adoption metrics attribution#4754

Open
dbwiddis wants to merge 9 commits intoopensearch-project:mainfrom
dbwiddis:created-by
Open

Add created_by field to ML resources for adoption metrics attribution#4754
dbwiddis wants to merge 9 commits intoopensearch-project:mainfrom
dbwiddis:created-by

Conversation

@dbwiddis
Copy link
Copy Markdown
Member

@dbwiddis dbwiddis commented Mar 31, 2026

Description

Adds a created_by field to ML agents, connectors, and models to enable
attribution of adoption metrics by the client that created the resource
(e.g. flow-framework, opensearch-dashboards).

This is distinct from the authenticated user (User/FGAC) — created_by
identifies the calling client, which may differ from the user when a
workflow engine like Flow Framework creates resources on behalf of a user.

Data model

  • MLAgent: new created_by field, version-gated serialization at
    VERSION_3_7_0, emitted in getTags() (defaults to "unknown")
  • Connector / AbstractConnector / HttpConnector:
    new created_by field, version-gated serialization at VERSION_3_7_0
  • MLCreateConnectorInput: new created_by field, flows through to
    connector via XContent round-trip in TransportCreateConnectorAction
  • MLModel: new created_by field, version-gated serialization at
    VERSION_3_7_0, emitted in all three getTags() methods
    (getRemoteModelTags, getPreTrainedModelTags, getCustomModelTags),
    defaults to "unknown"
  • MLRegisterModelInput: new created_by field, wired through to
    MLModel in MLModelManager (all three builder sites, including
    the previously missing tenantId fix in registerModelFromUrl)

Validation

  • MLRegisterAgentRequest: validates created_by as optional safe-text
    field; restructured validate() to seed exception from validateFields
    and eliminate manual error-merging loop
  • MLCreateConnectorRequest: validates created_by as optional safe-text field
  • MLRegisterModelRequest: validates created_by as optional safe-text field
  • MLUpdateConnectorRequest: created_by is intentionally not validated
    here — it is set at creation time only and is not merged by
    HttpConnector.update(); also fixed a pre-existing bug where
    validateFields result overwrote any prior connectorId null error

Metrics

  • MLStatsJobProcessor requires no changes — created_by flows through
    automatically via model.getTags() and agent.getTags()

Versioning

  • CommonValue: added VERSION_3_7_0 constant; VERSION_3_6_0 was
    considered but removed as unused in production code — tests use the
    existing VERSION_3_5_0 constant to represent a pre-created_by node
  • All new fields are version-gated: nodes on older versions will
    deserialize created_by as null
  • CREATED_BY_FIELD constant centralized in CommonValue; duplicate
    local definitions removed from MLAgent, MLModel, HttpConnector,
    MLCreateConnectorInput, and MLRegisterModelInput

Testing

  • Unit tests added for serialization (stream in/out at VERSION_3_7_0
    and VERSION_3_5_0), XContent parse/write, and getTags() emission
    for all three resource types
  • Happy-path createdBy tests added to MachineLearningNodeClientTest
    verifying the field is passed through for registerAgent, register (Model),
    and createConnector
  • Validation tests added to MLRegisterAgentRequestTest,
    MLCreateConnectorRequestTests, and MLRegisterModelRequestTest
  • All existing tests pass

Related Issues

Resolves #4752

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:36 — with GitHub Actions Waiting
@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:36 — with GitHub Actions Waiting
@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:36 — with GitHub Actions Waiting
@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:36 — with GitHub Actions Waiting
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 31, 2026

PR Reviewer Guide 🔍

(Review updated until commit e85bb6f)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add created_by field to MLAgent

Relevant files:

  • common/src/main/java/org/opensearch/ml/common/agent/MLAgent.java
  • common/src/main/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequest.java
  • common/src/test/java/org/opensearch/ml/common/agent/MLAgentTest.java
  • common/src/test/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequestTest.java
  • client/src/test/java/org/opensearch/ml/client/MachineLearningNodeClientTest.java
  • plugin/src/test/java/org/opensearch/ml/action/agents/DeleteAgentTransportActionTests.java
  • plugin/src/test/java/org/opensearch/ml/action/agents/GetAgentTransportActionTests.java

Sub-PR theme: Add created_by field to Connector

Relevant files:

  • common/src/main/java/org/opensearch/ml/common/connector/AbstractConnector.java
  • common/src/main/java/org/opensearch/ml/common/connector/AwsConnector.java
  • common/src/main/java/org/opensearch/ml/common/connector/Connector.java
  • common/src/main/java/org/opensearch/ml/common/connector/HttpConnector.java
  • common/src/main/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInput.java
  • common/src/main/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorRequest.java
  • common/src/main/java/org/opensearch/ml/common/transport/connector/MLUpdateConnectorRequest.java
  • common/src/test/java/org/opensearch/ml/common/connector/HttpConnectorTest.java
  • common/src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInputTests.java
  • common/src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorRequestTests.java
  • common/src/test/java/org/opensearch/ml/common/transport/connector/MLUpdateConnectorRequestTests.java

Sub-PR theme: Add created_by field to MLModel and registration

Relevant files:

  • common/src/main/java/org/opensearch/ml/common/MLModel.java
  • common/src/main/java/org/opensearch/ml/common/transport/register/MLRegisterModelInput.java
  • common/src/main/java/org/opensearch/ml/common/transport/register/MLRegisterModelRequest.java
  • common/src/test/java/org/opensearch/ml/common/MLModelTests.java
  • common/src/test/java/org/opensearch/ml/common/transport/register/MLRegisterModelInputTest.java
  • common/src/test/java/org/opensearch/ml/common/transport/register/MLRegisterModelRequestTest.java
  • plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java

⚡ Recommended focus areas for review

Missing Version

VERSION_3_6_0 is skipped — the sequence jumps from VERSION_3_5_0 directly to VERSION_3_7_0. If VERSION_3_6_0 is intentionally absent, this should be documented. Otherwise, the version gate may be incorrect and could cause serialization issues when nodes running 3.6.x communicate with nodes running 3.7.x.

public static final Version VERSION_3_7_0 = Version.fromString("3.7.0");
Error Merging Regression

The previous code assigned exception = validateFields(fieldsToValidate) directly, which would overwrite any prior null-connector-id error. The new code correctly merges errors, but the prior behavior was actually a bug that silently dropped the connector-id error when field validation also failed. Verify that the new merged behavior is the intended fix and that existing tests cover the combined-error case.

ActionRequestValidationException fieldException = validateFields(fieldsToValidate);
if (fieldException != null) {
    for (String error : fieldException.validationErrors()) {
        exception = addValidationError(error, exception);
    }
}
Unvalidated Input

The createdBy value from registerModelInput is passed directly into MLModel without any sanitization or length check at the manager level. While request-level validation exists, the manager is also called from internal paths (e.g., registerModelFromUrl). Confirm that all call sites that invoke these builder paths go through the validated request layer before reaching the manager.

.tenantId(registerModelInput.getTenantId())
.createdBy(registerModelInput.getCreatedBy())
Null Exception Seed

validateFields is called and its result is used as the seed exception for subsequent addValidationError calls. If validateFields returns null and no context-management errors exist, exception remains null and is returned correctly. However, if validateFields returns a non-null exception and context-management checks also fail, errors are appended correctly. Verify that validateFields always returns a compatible ActionRequestValidationException (not a subclass) so that addValidationError chaining works as expected.

Map<String, FieldDescriptor> fieldsToValidate = new HashMap<>();
fieldsToValidate.put("Agent created_by field", new FieldDescriptor(mlAgent.getCreatedBy(), false));
ActionRequestValidationException exception = validateFields(fieldsToValidate);

if (mlAgent.getContextManagementName() != null && mlAgent.getContextManagement() != null) {
    exception = addValidationError("Cannot specify both context_management_name and context_management", exception);
}
if (mlAgent.getContextManagementName() != null) {
    exception = validateContextManagementTemplateName(mlAgent.getContextManagementName(), exception);
}
if (mlAgent.getContextManagement() != null) {
    exception = validateInlineContextManagement(mlAgent.getContextManagement(), exception);
}

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 31, 2026

PR Code Suggestions ✨

Latest suggestions up to e85bb6f

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Ensure version sequence has no gaps

The createdBy field is gated behind VERSION_3_7_0, but the version sequence jumps
from VERSION_3_5_0 to VERSION_3_7_0, skipping VERSION_3_6_0. If there is an
intermediate 3.6.0 release, the serialization/deserialization logic will be
incorrect for nodes running that version. Ensure the version constant matches the
actual release where this feature is introduced, or add the missing intermediate
version constant.

common/src/main/java/org/opensearch/ml/common/CommonValue.java [107]

+public static final Version VERSION_3_6_0 = Version.fromString("3.6.0");
 public static final Version VERSION_3_7_0 = Version.fromString("3.7.0");
Suggestion importance[1-10]: 5

__

Why: The version sequence jumps from VERSION_3_5_0 to VERSION_3_7_0, skipping VERSION_3_6_0. If a 3.6.0 release exists, serialization logic gated on VERSION_3_7_0 could be incorrect for nodes on that version. However, this may be intentional if 3.6.0 is not a planned release.

Low
General
Verify all model registration paths propagate createdBy

The registerModelFromUrl path now correctly sets both tenantId and createdBy, but
it's worth verifying that the pre-trained model registration path (e.g.,
registerPrebuiltModel or similar methods) also propagates createdBy from
registerModelInput to the MLModel being indexed. If any registration path is missed,
models registered through that path will always have a null createdBy, breaking
attribution metrics.

plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java [847-857]

-                .modelState(MLModelState.REGISTERING)
-                .modelConfig(registerModelInput.getModelConfig())
-                .deploySetting(registerModelInput.getDeploySetting())
-                .createdTime(now)
-                .lastUpdateTime(now)
-                .isHidden(registerModelInput.getIsHidden())
-                .guardrails(registerModelInput.getGuardrails())
-                .modelInterface(registerModelInput.getModelInterface())
-                .tenantId(registerModelInput.getTenantId())
-                .createdBy(registerModelInput.getCreatedBy())
-                .build();
-            IndexRequest indexModelMetaRequest = new IndexRequest(ML_MODEL_INDEX);
-            if (functionName == FunctionName.METRICS_CORRELATION) {
+.modelState(MLModelState.REGISTERING)
+.modelConfig(registerModelInput.getModelConfig())
+.deploySetting(registerModelInput.getDeploySetting())
+.createdTime(now)
+.lastUpdateTime(now)
+.isHidden(registerModelInput.getIsHidden())
+.guardrails(registerModelInput.getGuardrails())
+.modelInterface(registerModelInput.getModelInterface())
+.tenantId(registerModelInput.getTenantId())
+.createdBy(registerModelInput.getCreatedBy())
+.build();
Suggestion importance[1-10]: 4

__

Why: This is a valid concern about completeness — other registration paths may not propagate createdBy. However, the suggestion only asks to verify rather than providing a concrete fix, and the existing_code matches the improved_code, limiting its actionable value.

Low
Ensure all validation errors are properly accumulated

The validateFields method may return a new ActionRequestValidationException
instance, but subsequent calls to addValidationError and
validateContextManagementTemplateName/validateInlineContextManagement may not
accumulate errors into the same exception object if those methods create a new
exception internally. Verify that all validation errors are properly accumulated
into a single exception, similar to the pattern used in MLUpdateConnectorRequest
where errors from validateFields are explicitly merged.

common/src/main/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequest.java [56-68]

 Map<String, FieldDescriptor> fieldsToValidate = new HashMap<>();
 fieldsToValidate.put("Agent created_by field", new FieldDescriptor(mlAgent.getCreatedBy(), false));
 ActionRequestValidationException exception = validateFields(fieldsToValidate);
 
 if (mlAgent.getContextManagementName() != null && mlAgent.getContextManagement() != null) {
     exception = addValidationError("Cannot specify both context_management_name and context_management", exception);
 }
 if (mlAgent.getContextManagementName() != null) {
-    exception = validateContextManagementTemplateName(mlAgent.getContextManagementName(), exception);
+    ActionRequestValidationException ctxException = validateContextManagementTemplateName(mlAgent.getContextManagementName(), exception);
+    if (ctxException != null) {
+        for (String error : ctxException.validationErrors()) {
+            exception = addValidationError(error, exception);
+        }
+    }
 }
 if (mlAgent.getContextManagement() != null) {
-    exception = validateInlineContextManagement(mlAgent.getContextManagement(), exception);
+    ActionRequestValidationException inlineException = validateInlineContextManagement(mlAgent.getContextManagement(), exception);
+    if (inlineException != null) {
+        for (String error : inlineException.validationErrors()) {
+            exception = addValidationError(error, exception);
+        }
+    }
 }
Suggestion importance[1-10]: 3

__

Why: The validateContextManagementTemplateName and validateInlineContextManagement methods already accept and return the accumulated exception object (following the addValidationError pattern), so errors are already properly chained. The suggestion's concern about error accumulation is not actually a problem in this code.

Low

Previous suggestions

Suggestions up to commit 31fb021
CategorySuggestion                                                                                                                                    Impact
Possible issue
Missing intermediate version constant in sequence

The PR introduces VERSION_3_7_0 but skips VERSION_3_6_0. If the versioning sequence
is intentional, it should be documented. Otherwise, if VERSION_3_6_0 is expected to
exist (e.g., for other features), this gap could cause serialization compatibility
issues when nodes running 3.6.x communicate with nodes running 3.7.x, as the
created_by field would be missing from the stream.

common/src/main/java/org/opensearch/ml/common/CommonValue.java [107]

+public static final Version VERSION_3_6_0 = Version.fromString("3.6.0");
 public static final Version VERSION_3_7_0 = Version.fromString("3.7.0");
Suggestion importance[1-10]: 3

__

Why: The suggestion raises a valid concern about the version gap between VERSION_3_5_0 and VERSION_3_7_0, but adding VERSION_3_6_0 may not be necessary if that version doesn't exist in the project's release plan. This is more of a documentation/verification concern than a code bug.

Low
Validation errors may be lost due to incorrect chaining

The refactored validate() method no longer accumulates errors from validateFields
into the same exception variable used by subsequent addValidationError calls. If
validateFields returns a non-null exception and subsequent validations also fail,
the errors from validateFields will be lost because addValidationError creates a new
exception chain starting from the passed-in exception. The exception variable
returned by validateFields should be passed into subsequent addValidationError
calls.

common/src/main/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequest.java [56-68]

+Map<String, FieldDescriptor> fieldsToValidate = new HashMap<>();
+fieldsToValidate.put("Agent created_by field", new FieldDescriptor(mlAgent.getCreatedBy(), false));
+ActionRequestValidationException exception = validateFields(fieldsToValidate);
 
+if (mlAgent.getContextManagementName() != null && mlAgent.getContextManagement() != null) {
+    exception = addValidationError("Cannot specify both context_management_name and context_management", exception);
+}
+if (mlAgent.getContextManagementName() != null) {
+    exception = validateContextManagementTemplateName(mlAgent.getContextManagementName(), exception);
+}
+if (mlAgent.getContextManagement() != null) {
+    exception = validateInlineContextManagement(mlAgent.getContextManagement(), exception);
+}
Suggestion importance[1-10]: 2

__

Why: The 'existing_code' and 'improved_code' are identical, meaning the suggestion doesn't actually propose any code change. Looking at the actual code, the exception variable from validateFields is correctly passed into subsequent addValidationError calls, so the described bug doesn't exist in the PR code.

Low
General
Inconsistent field population across registration paths

The registerModelFromUrl method now sets tenantId and createdBy, but looking at the
diff, the modelName, modelGroupId, algorithm, and other fields that are set in the
other two performIndexRemoteModel methods appear to be missing from this builder
call. While this may be pre-existing, the addition of tenantId here (not present
before) suggests this was a bug fix bundled in. Verify that all required fields are
consistently set across all three model registration paths.

plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java [847-857]

+// Ensure all required fields are set consistently with other registration paths
 .modelState(MLModelState.REGISTERING)
 .modelConfig(registerModelInput.getModelConfig())
 .deploySetting(registerModelInput.getDeploySetting())
 .createdTime(now)
 .lastUpdateTime(now)
 .isHidden(registerModelInput.getIsHidden())
 .guardrails(registerModelInput.getGuardrails())
 .modelInterface(registerModelInput.getModelInterface())
 .tenantId(registerModelInput.getTenantId())
 .createdBy(registerModelInput.getCreatedBy())
 .build();
Suggestion importance[1-10]: 3

__

Why: The suggestion asks to verify consistency across registration paths but the 'improved_code' is identical to the 'existing_code', providing no concrete fix. It's a valid concern to verify but doesn't constitute an actionable code change.

Low
Suggestions up to commit 6b67eec
CategorySuggestion                                                                                                                                    Impact
Possible issue
Accumulate all validation errors into single exception

The validateFields method returns a new ActionRequestValidationException instance,
but subsequent calls to addValidationError and
validateContextManagementTemplateName/validateInlineContextManagement may create
separate exception instances rather than accumulating errors into the same one. This
could result in only the last validation error being returned. Ensure all validation
errors are accumulated into a single exception instance.

common/src/main/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequest.java [56-68]

+ActionRequestValidationException exception = null;
 Map<String, FieldDescriptor> fieldsToValidate = new HashMap<>();
 fieldsToValidate.put("Agent created_by field", new FieldDescriptor(mlAgent.getCreatedBy(), false));
-ActionRequestValidationException exception = validateFields(fieldsToValidate);
+ActionRequestValidationException fieldException = validateFields(fieldsToValidate);
+if (fieldException != null) {
+    for (String error : fieldException.validationErrors()) {
+        exception = addValidationError(error, exception);
+    }
+}
 
 if (mlAgent.getContextManagementName() != null && mlAgent.getContextManagement() != null) {
     exception = addValidationError("Cannot specify both context_management_name and context_management", exception);
 }
 if (mlAgent.getContextManagementName() != null) {
     exception = validateContextManagementTemplateName(mlAgent.getContextManagementName(), exception);
 }
 if (mlAgent.getContextManagement() != null) {
     exception = validateInlineContextManagement(mlAgent.getContextManagement(), exception);
 }
Suggestion importance[1-10]: 7

__

Why: This is a real bug: validateFields returns a new exception instance, and subsequent addValidationError calls pass exception (which may be null) rather than the field exception, potentially losing field validation errors. The improved code correctly merges all errors into a single exception instance, similar to the pattern used in MLUpdateConnectorRequest.

Medium
Verify version constant doesn't skip a release

The VERSION_3_7_0 constant is used as the version gate for the new created_by field
in serialization/deserialization. However, the versions jump from VERSION_3_5_0 to
VERSION_3_7_0, skipping VERSION_3_6_0. If 3.6.0 is a valid release version, nodes
running that version will incorrectly skip reading/writing the created_by field,
causing deserialization errors in mixed-version clusters. Verify whether
VERSION_3_6_0 should be defined and used instead, or confirm that 3.6.0 is not a
planned release.

common/src/main/java/org/opensearch/ml/common/CommonValue.java [107]

+public static final Version VERSION_3_6_0 = Version.fromString("3.6.0");
 public static final Version VERSION_3_7_0 = Version.fromString("3.7.0");
Suggestion importance[1-10]: 5

__

Why: The concern about skipping VERSION_3_6_0 is valid if that version is a planned release, as it could cause deserialization issues in mixed-version clusters. However, this is more of a verification/confirmation request rather than a definitive bug, and the improved code only adds a new constant without changing the existing one.

Low
General
Ensure createdBy is set in all registration paths

The createdBy field is now propagated from registerModelInput to MLModel in
performIndexRemoteModel, but it appears the pre-trained model registration path
(e.g., registerModelFromUrl) also needs to propagate createdBy to the MLModel being
indexed. Check all model registration paths to ensure createdBy is consistently set,
otherwise some registration flows will silently drop the attribution data.

plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java [644-653]

+.modelConfig(registerModelInput.getModelConfig())
+.deploySetting(registerModelInput.getDeploySetting())
+.createdTime(now)
+.lastUpdateTime(now)
+.isHidden(registerModelInput.getIsHidden())
+.guardrails(registerModelInput.getGuardrails())
+.modelInterface(registerModelInput.getModelInterface())
+.tenantId(registerModelInput.getTenantId())
+.createdBy(registerModelInput.getCreatedBy())
+.build();
 
-
Suggestion importance[1-10]: 3

__

Why: The suggestion asks to verify that createdBy is propagated in all registration paths, but looking at the PR diff, registerModelFromUrl at line 856 already includes .createdBy(registerModelInput.getCreatedBy()). The existing_code and improved_code are identical, making this a verification request with no actual code change needed.

Low
Suggestions up to commit ae4ea18
CategorySuggestion                                                                                                                                    Impact
General
Ensure version constants are sequentially consistent

The createdBy field is gated behind VERSION_3_7_0, but the version sequence jumps
from VERSION_3_5_0 to VERSION_3_7_0, skipping VERSION_3_6_0. If VERSION_3_6_0 is
ever introduced later, the ordering could cause subtle serialization bugs. Ensure
the version used for this feature matches the actual planned release version.

common/src/main/java/org/opensearch/ml/common/CommonValue.java [107]

+public static final Version VERSION_3_6_0 = Version.fromString("3.6.0");
 public static final Version VERSION_3_7_0 = Version.fromString("3.7.0");
Suggestion importance[1-10]: 4

__

Why: The suggestion raises a valid concern about skipping VERSION_3_6_0, but this is a design/planning issue rather than a code bug. The improved_code just adds a new constant without removing or changing the existing one, which doesn't directly fix the stated concern about serialization ordering.

Low
Verify intentional addition of tenantId in URL model path

The registerModelFromUrl method now correctly propagates both tenantId and
createdBy, but the two other performIndexRemoteModel overloads already had tenantId
set before this PR. Verify that the registerModelFromUrl path previously missing
tenantId was intentional or a pre-existing bug, and confirm that adding it here
doesn't break existing behavior for URL-based model registration.

plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java [847-857]

+.modelState(MLModelState.REGISTERING)
+.modelConfig(registerModelInput.getModelConfig())
+.deploySetting(registerModelInput.getDeploySetting())
+.createdTime(now)
+.lastUpdateTime(now)
+.isHidden(registerModelInput.getIsHidden())
+.guardrails(registerModelInput.getGuardrails())
+.modelInterface(registerModelInput.getModelInterface())
+.tenantId(registerModelInput.getTenantId())
+.createdBy(registerModelInput.getCreatedBy())
+.build();
 
-
Suggestion importance[1-10]: 3

__

Why: The existing_code and improved_code are identical, so no actual change is proposed. The suggestion only asks the reviewer to verify behavior, which is a valid observation about a pre-existing missing tenantId in registerModelFromUrl, but it's just a verification request.

Low
Possible issue
Ensure validation errors are properly chained

The refactored validation no longer accumulates errors from
validateContextManagementTemplateName and validateInlineContextManagement into the
same exception returned by validateFields. The result of validateFields is
overwritten by addValidationError calls, but validateContextManagementTemplateName
and validateInlineContextManagement may return a new exception object that doesn't
include the createdBy validation errors. Ensure all validation errors are properly
chained by passing the existing exception into each subsequent validation call.

common/src/main/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequest.java [56-68]

+Map<String, FieldDescriptor> fieldsToValidate = new HashMap<>();
+fieldsToValidate.put("Agent created_by field", new FieldDescriptor(mlAgent.getCreatedBy(), false));
+ActionRequestValidationException exception = validateFields(fieldsToValidate);
 
+if (mlAgent.getContextManagementName() != null && mlAgent.getContextManagement() != null) {
+    exception = addValidationError("Cannot specify both context_management_name and context_management", exception);
+}
+if (mlAgent.getContextManagementName() != null) {
+    exception = validateContextManagementTemplateName(mlAgent.getContextManagementName(), exception);
+}
+if (mlAgent.getContextManagement() != null) {
+    exception = validateInlineContextManagement(mlAgent.getContextManagement(), exception);
+}
Suggestion importance[1-10]: 1

__

Why: The existing_code and improved_code are identical, meaning no actual change is proposed. The suggestion text describes a concern about error chaining, but looking at the code, validateContextManagementTemplateName and validateInlineContextManagement already receive the exception parameter and chain errors properly.

Low
Suggestions up to commit a6d4005
CategorySuggestion                                                                                                                                    Impact
Possible issue
Validation regression removes required field checks

The refactored validate() method no longer validates the agent's name and type
fields, which were previously validated in the original code. This is a regression —
the agent name and type should still be validated to ensure they are not null or
empty before proceeding.

common/src/main/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequest.java [56-68]

 Map<String, FieldDescriptor> fieldsToValidate = new HashMap<>();
+fieldsToValidate.put("Agent name", new FieldDescriptor(mlAgent.getName(), true));
+fieldsToValidate.put("Agent type", new FieldDescriptor(mlAgent.getType(), true));
 fieldsToValidate.put("Agent created_by field", new FieldDescriptor(mlAgent.getCreatedBy(), false));
 ActionRequestValidationException exception = validateFields(fieldsToValidate);
 
 if (mlAgent.getContextManagementName() != null && mlAgent.getContextManagement() != null) {
     exception = addValidationError("Cannot specify both context_management_name and context_management", exception);
 }
 if (mlAgent.getContextManagementName() != null) {
     exception = validateContextManagementTemplateName(mlAgent.getContextManagementName(), exception);
 }
 if (mlAgent.getContextManagement() != null) {
     exception = validateInlineContextManagement(mlAgent.getContextManagement(), exception);
 }
Suggestion importance[1-10]: 7

__

Why: The refactored validate() method in the PR only validates createdBy but the old code also validated other fields. However, looking at the old hunk, the original code didn't explicitly validate name or type via validateFields either — it only checked context management. The suggestion may be pointing to a real gap but the original code didn't validate name/type in this method either, so the regression claim needs verification.

Medium
General
Missing intermediate version constant definition

The VERSION_3_7_0 constant is being used as the version gate for the new created_by
field, but the versions jump from VERSION_3_5_0 to VERSION_3_7_0, skipping
VERSION_3_6_0. If 3.6.0 is a valid release version, this gap could cause issues with
backward compatibility for nodes running that version. Verify that VERSION_3_6_0 is
not needed or add it if required.

common/src/main/java/org/opensearch/ml/common/CommonValue.java [107]

+public static final Version VERSION_3_6_0 = Version.fromString("3.6.0");
 public static final Version VERSION_3_7_0 = Version.fromString("3.7.0");
Suggestion importance[1-10]: 4

__

Why: The suggestion points out a potential gap between VERSION_3_5_0 and VERSION_3_7_0, but this is a design decision that may be intentional if 3.6.0 is not a planned release. The suggestion asks to verify rather than fix a confirmed bug, and the improved code just adds a constant that may not be needed.

Low
Verify complete model metadata propagation in new code path

The registerModelFromUrl method now correctly propagates tenantId and createdBy, but
looking at the diff, tenantId was not previously set in this code path. This is a
good fix, but it should be verified that the modelGroupId and other fields like
modelName are also being set in this builder call, as they appear to be missing from
the visible diff hunk, which could result in incomplete model metadata being
indexed.

plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java [847-857]

+// Verify all required fields are set in the builder, including modelName, modelGroupId, etc.
 .modelState(MLModelState.REGISTERING)
 .modelConfig(registerModelInput.getModelConfig())
 .deploySetting(registerModelInput.getDeploySetting())
 .createdTime(now)
 .lastUpdateTime(now)
 .isHidden(registerModelInput.getIsHidden())
 .guardrails(registerModelInput.getGuardrails())
 .modelInterface(registerModelInput.getModelInterface())
 .tenantId(registerModelInput.getTenantId())
 .createdBy(registerModelInput.getCreatedBy())
 .build();
Suggestion importance[1-10]: 2

__

Why: This suggestion asks to verify that other fields are set in the builder, but the improved_code is essentially the same as existing_code with just a comment added. It doesn't provide a concrete fix and the concern about missing fields is speculative without seeing the full builder chain.

Low
Suggestions up to commit 7bf11f7
CategorySuggestion                                                                                                                                    Impact
Possible issue
Missing tenantId propagation in URL model registration

The registerModelFromUrl method sets createdBy but is missing tenantId, while the
other two performIndexRemoteModel methods set both. This inconsistency means
tenantId will be null for URL-based model registrations, which could cause
authorization or multi-tenancy issues.

plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java [853-856]

 .guardrails(registerModelInput.getGuardrails())
 .modelInterface(registerModelInput.getModelInterface())
+.tenantId(registerModelInput.getTenantId())
 .createdBy(registerModelInput.getCreatedBy())
 .build();
Suggestion importance[1-10]: 8

__

Why: The registerModelFromUrl method sets createdBy but omits tenantId, while both performIndexRemoteModel methods set both fields. This inconsistency could cause multi-tenancy issues for URL-based model registrations where tenantId would be null.

Medium
General
Unused version constant added to production code

VERSION_3_6_0 is introduced solely to serve as a "version before created_by support"
in tests, but it is not used in any production code. If this version constant is
only needed for tests, it should either be defined in the test scope or documented
clearly. Adding unused production constants can cause confusion about which version
introduced which feature.

common/src/main/java/org/opensearch/ml/common/CommonValue.java [106-107]

 public static final Version VERSION_3_6_0 = Version.fromString("3.6.0");
+// VERSION_3_7_0 introduces the created_by field for ML resources
 public static final Version VERSION_3_7_0 = Version.fromString("3.7.0");
Suggestion importance[1-10]: 2

__

Why: The improved_code only adds a comment and doesn't actually change the code meaningfully. While VERSION_3_6_0 is only used in tests, the suggestion doesn't propose moving it to test scope, making it a low-impact style suggestion.

Low
Inconsistent validation pattern across request classes

The validation for created_by in MLRegisterAgentRequest manually iterates over
errors and re-adds them, while MLCreateConnectorRequest and MLRegisterModelRequest
directly return the result of validateFields. This inconsistency is because the
agent request has additional validation logic before and after. However, the manual
error merging approach is fragile — if validateFields returns an exception with
multiple errors, they are all correctly merged, but the pattern diverges from the
rest of the codebase and could be simplified by restructuring the validation order.

common/src/main/java/org/opensearch/ml/common/transport/agent/MLRegisterAgentRequest.java [61-69]

+// Validate created_by as an optional safe-text field
+Map<String, FieldDescriptor> fieldsToValidate = new HashMap<>();
+fieldsToValidate.put(MLAgent.CREATED_BY_FIELD, new FieldDescriptor(mlAgent.getCreatedBy(), false));
+ActionRequestValidationException fieldException = validateFields(fieldsToValidate);
+if (fieldException != null) {
+    for (String error : fieldException.validationErrors()) {
+        exception = addValidationError(error, exception);
+    }
+}
 
-
Suggestion importance[1-10]: 2

__

Why: The existing_code and improved_code are identical, so no actual change is proposed. The suggestion points out a style inconsistency but doesn't provide a concrete fix, making it a low-value observation.

Low

@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:49 — with GitHub Actions Waiting
@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:49 — with GitHub Actions Waiting
@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:49 — with GitHub Actions Waiting
@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:49 — with GitHub Actions Waiting
@github-actions
Copy link
Copy Markdown

Persistent review updated to latest commit 7bf11f7

@dbwiddis dbwiddis marked this pull request as draft March 31, 2026 15:51
@github-actions
Copy link
Copy Markdown

Persistent review updated to latest commit a6d4005

@dbwiddis dbwiddis requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 17:18 — with GitHub Actions Waiting
@dbwiddis dbwiddis had a problem deploying to ml-commons-cicd-env-require-approval March 31, 2026 17:38 — with GitHub Actions Failure
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env-require-approval March 31, 2026 17:38 — with GitHub Actions Inactive
@dbwiddis dbwiddis had a problem deploying to ml-commons-cicd-env-require-approval March 31, 2026 17:38 — with GitHub Actions Error
@dbwiddis dbwiddis had a problem deploying to ml-commons-cicd-env-require-approval March 31, 2026 17:38 — with GitHub Actions Failure
@github-actions
Copy link
Copy Markdown

Persistent review updated to latest commit 31fb021

@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env-require-approval March 31, 2026 18:00 — with GitHub Actions Inactive
@dbwiddis dbwiddis had a problem deploying to ml-commons-cicd-env-require-approval March 31, 2026 18:00 — with GitHub Actions Failure
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env-require-approval March 31, 2026 18:00 — with GitHub Actions Inactive
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env-require-approval March 31, 2026 18:00 — with GitHub Actions Inactive
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 86.07595% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.38%. Comparing base (9585180) to head (e85bb6f).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...ommon/transport/register/MLRegisterModelInput.java 50.00% 5 Missing and 1 partial ⚠️
...rc/main/java/org/opensearch/ml/common/MLModel.java 87.50% 2 Missing ⚠️
...common/transport/agent/MLRegisterAgentRequest.java 80.00% 1 Missing and 1 partial ⚠️
...n/java/org/opensearch/ml/common/agent/MLAgent.java 91.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4754      +/-   ##
============================================
+ Coverage     77.36%   77.38%   +0.02%     
- Complexity    11895    11922      +27     
============================================
  Files           963      963              
  Lines         53304    53370      +66     
  Branches       6500     6520      +20     
============================================
+ Hits          41240    41303      +63     
- Misses         9304     9311       +7     
+ Partials       2760     2756       -4     
Flag Coverage Δ
ml-commons 77.38% <86.07%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Adds an optional `created_by` field to MLAgent to allow plugins using
the ML Client API (e.g. flow-framework) to attribute provisioned agents
for adoption metrics tracking.

- Add VERSION_3_6_0 and VERSION_3_7_0 constants to CommonValue
- Add `created_by` field to MLAgent with version-gated serialization
- Emit `created_by` tag in getTags() defaulting to "unknown" when null
- Validate `created_by` in MLRegisterAgentRequest using SAFE_INPUT_PATTERN
- Update all affected test files

Signed-off-by: Daniel Widdis <widdis@gmail.com>
Adds an optional `created_by` field to HttpConnector, AwsConnector,
AbstractConnector, the Connector interface, and MLCreateConnectorInput
to allow plugins using the ML Client API to attribute provisioned
connectors for adoption metrics tracking.

- Add `created_by` to Connector interface (getter/setter)
- Add `created_by` field to AbstractConnector via @Setter
- Add `created_by` to HttpConnector builder, XContent parse/write,
  and version-gated stream serialization (VERSION_3_7_0)
- Add `created_by` to AwsConnector builder, delegating to HttpConnector
- Add `created_by` to MLCreateConnectorInput builder, XContent
  parse/write, and version-gated stream serialization (VERSION_3_7_0)

Signed-off-by: Daniel Widdis <widdis@gmail.com>
Adds an optional `created_by` field to MLModel and MLRegisterModelInput
to allow plugins using the ML Client API to attribute provisioned models
for adoption metrics tracking.

- Add CREATED_BY_FIELD constant and field to MLModel with toXContent,
  parse, and version-gated stream serialization (VERSION_3_7_0)
- Add CREATED_BY_FIELD constant and field to MLRegisterModelInput with
  toXContent, both parse overloads, and version-gated stream
  serialization (VERSION_3_7_0)

Signed-off-by: Daniel Widdis <widdis@gmail.com>
Add created_by to the Tags returned by getRemoteModelTags,
getPreTrainedModelTags, and getCustomModelTags. Defaults to
'unknown' when createdBy is null, consistent with MLAgent.

Signed-off-by: Daniel Widdis <widdis@gmail.com>
Consistent with MLRegisterAgentRequest, validate created_by as an
optional safe-text field in MLCreateConnectorRequest and
MLRegisterModelRequest.

Signed-off-by: Daniel Widdis <widdis@gmail.com>
…n and tenantId

- Centralize CREATED_BY_FIELD constant in CommonValue; remove duplicate
  definitions from MLAgent, MLModel, HttpConnector, MLCreateConnectorInput,
  and MLRegisterModelInput
- Add created_by validation to MLUpdateConnectorRequest (was missing
  alongside MLCreateConnectorRequest)
- Use human-readable labels for created_by validation across all four
  request classes
- Fix missing tenantId in registerModelFromUrl MLModel builder in
  MLModelManager

Signed-off-by: Daniel Widdis <widdis@gmail.com>
- MachineLearningNodeClientTest: add happy path tests verifying
  createdBy is passed through for registerAgent, register, and
  createConnector
- MLRegisterAgentRequestTest: add validate_ValidCreatedBy and
  validate_InvalidCreatedBy
- MLCreateConnectorRequestTests: add validateWithValidCreatedBy and
  validateWithInvalidCreatedBy
- MLRegisterModelRequestTest: add validate_ValidCreatedBy and
  validate_InvalidCreatedBy

Signed-off-by: Daniel Widdis <widdis@gmail.com>
…ror merge bug

created_by is set at creation time only and is not merged by
HttpConnector.update(), so validating it on update was misleading.

Also fix a pre-existing bug where exception = validateFields(...) would
silently discard any prior connectorId null error; now merges field
errors into the existing exception via addValidationError.

Signed-off-by: Daniel Widdis <widdis@gmail.com>
…d field invalid

Signed-off-by: Daniel Widdis <widdis@gmail.com>
@dbwiddis dbwiddis marked this pull request as ready for review March 31, 2026 18:47
@dbwiddis dbwiddis had a problem deploying to ml-commons-cicd-env-require-approval March 31, 2026 18:49 — with GitHub Actions Error
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env-require-approval March 31, 2026 18:49 — with GitHub Actions Inactive
@dbwiddis dbwiddis had a problem deploying to ml-commons-cicd-env-require-approval March 31, 2026 18:49 — with GitHub Actions Failure
@dbwiddis dbwiddis temporarily deployed to ml-commons-cicd-env-require-approval March 31, 2026 18:49 — with GitHub Actions Inactive
@github-actions
Copy link
Copy Markdown

Persistent review updated to latest commit e85bb6f

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] created_by Provenance Tag Support in ML Commons

1 participant