Is your feature request related to a problem? Please describe
Proposal: Priority-Based Ordering for System-Generated Search Pipeline Processors
Summary
Lift the current one-processor-per-stage-per-type limit in the system-generated search pipeline framework by introducing a getPriority() method on SystemGeneratedProcessor. This enables multiple plugins to register system-generated processors at the same execution stage without conflicts, while guaranteeing deterministic execution order.
Motivation
The system-generated search pipeline framework (introduced in OpenSearch 3.3) currently enforces a hard limit of one system-generated processor per processor type per execution stage. This limit was introduced to avoid non-deterministic ordering, because factory iteration uses HashMap.entrySet(), which provides no ordering guarantees.
As the framework is adopted by more plugins, this limit is becoming a blocker. Concretely:
- The mmr_rerank processor (k-NN plugin) occupies the SearchResponse PRE slot
- A proposed hybrid-explain processor (neural-search plugin) also needs the SearchResponse PRE slot
- Both slots in the SearchResponse type (PRE and POST) are now occupied, with no room for future processors
The limit must be lifted in a way that:
- Preserves deterministic execution order
- Does not require plugin authors to modify OpenSearch core to register a new processor
- Fails fast and clearly when two processors conflict
- Scales to future processors without renumbering
Describe the solution you'd like
Proposed Solution
1. Add getPriority() to SystemGeneratedProcessor (Core Change)
Add a single abstract method to the SystemGeneratedProcessor interface:
public interface SystemGeneratedProcessor extends Processor {
    /**
     * Execution order within the same processor type and execution stage.
     * Lower value runs first. Must be a unique positive integer among all
     * system-generated processors of the same type at the same execution stage
     * for a given search request.
     *
     * Plugin authors must choose a value that reflects when in the pipeline
     * lifecycle their processor needs to run relative to others.
     * See the OpenSearch documentation for recommended priority bands.
     *
     * @return a positive integer representing execution priority
     */
    int getPriority();

    // existing methods unchanged
    default ExecutionStage getExecutionStage() { return ExecutionStage.POST_USER_DEFINED; }
    default void evaluateConflicts(ProcessorConflictEvaluationContext ctx) {}
}
getPriority() is abstract (no default) to force every implementor to make a conscious, explicit decision about execution timing. A silent default would mask ordering bugs.
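For illustration, here is a minimal sketch of what an implementor might look like under this proposal. The `Processor` and `SystemGeneratedProcessor` types below are simplified stand-ins for the real OpenSearch interfaces (which declare more methods), and `ExampleRerankProcessor`, its type name, and its priority value are hypothetical.

```java
final class GetPriorityExample {
    // Simplified stand-ins for the OpenSearch interfaces (assumption: the real ones have more methods).
    interface Processor {
        String getType();
    }

    interface SystemGeneratedProcessor extends Processor {
        enum ExecutionStage { PRE_USER_DEFINED, POST_USER_DEFINED }

        // Proposed addition: abstract, so every implementor must choose a value.
        int getPriority();

        default ExecutionStage getExecutionStage() {
            return ExecutionStage.POST_USER_DEFINED;
        }
    }

    // Hypothetical plugin processor; the name and the value 350 are illustrative only.
    static final class ExampleRerankProcessor implements SystemGeneratedProcessor {
        @Override
        public String getType() {
            return "example_rerank";
        }

        @Override
        public int getPriority() {
            return 350;
        }

        @Override
        public ExecutionStage getExecutionStage() {
            return ExecutionStage.PRE_USER_DEFINED;
        }
    }

    public static void main(String[] args) {
        SystemGeneratedProcessor p = new ExampleRerankProcessor();
        System.out.println(p.getType() + " priority=" + p.getPriority() + " stage=" + p.getExecutionStage());
    }
}
```

Omitting the `getPriority()` override in `ExampleRerankProcessor` makes the class fail to compile, which is the behavior the proposal relies on.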
2. Replace ensureSingleProcessor with ensureNoDuplicatePriority (Core Change)
In SystemGeneratedPipelineWithMetrics, replace the current hard size check:
// BEFORE (current):
private static <T extends Processor> void ensureSingleProcessor(...) {
    if (processors.size() > 1) {
        throw new IllegalArgumentException("Cannot support more than one...");
    }
}

// AFTER (proposed):
private static <T extends Processor> void ensureNoDuplicatePriority(
    String typeName,
    SystemGeneratedProcessor.ExecutionStage stage,
    List<T> processors
) {
    // Group processor type names by their declared priority.
    Map<Integer, List<String>> byPriority = new LinkedHashMap<>();
    for (T processor : processors) {
        int priority = ((SystemGeneratedProcessor) processor).getPriority();
        byPriority.computeIfAbsent(priority, k -> new ArrayList<>()).add(processor.getType());
    }
    // Fail fast if any priority is claimed by more than one processor.
    for (Map.Entry<Integer, List<String>> entry : byPriority.entrySet()) {
        if (entry.getValue().size() > 1) {
            throw new IllegalArgumentException(String.format(
                Locale.ROOT,
                "System-generated %s processors %s at stage %s share the same priority %d. " +
                "Each processor must declare a unique priority. " +
                "See the OpenSearch documentation for recommended priority bands.",
                typeName, entry.getValue(), stage, entry.getKey()
            ));
        }
    }
}
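
To show how the check behaves, the following is a self-contained sketch of the same grouping logic, reduced to a hypothetical `Proc` stand-in that carries only a type name and a priority. The stage is passed as a plain string here for brevity, and `other_processor` is an invented name used to trigger the conflict.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class DuplicatePriorityDemo {
    // Hypothetical stand-in for a system-generated processor: type name and priority only.
    static final class Proc {
        final String type;
        final int priority;

        Proc(String type, int priority) {
            this.type = type;
            this.priority = priority;
        }
    }

    // Same grouping idea as the proposed ensureNoDuplicatePriority, applied to the stand-in type.
    static void ensureNoDuplicatePriority(String typeName, String stage, List<Proc> processors) {
        Map<Integer, List<String>> byPriority = new LinkedHashMap<>();
        for (Proc p : processors) {
            byPriority.computeIfAbsent(p.priority, k -> new ArrayList<>()).add(p.type);
        }
        for (Map.Entry<Integer, List<String>> e : byPriority.entrySet()) {
            if (e.getValue().size() > 1) {
                throw new IllegalArgumentException(String.format(
                    Locale.ROOT,
                    "System-generated %s processors %s at stage %s share the same priority %d.",
                    typeName, e.getValue(), stage, e.getKey()
                ));
            }
        }
    }

    public static void main(String[] args) {
        // Unique priorities: the check passes silently.
        ensureNoDuplicatePriority("SearchResponse", "PRE_USER_DEFINED",
            List.of(new Proc("hybrid-explain", 150), new Proc("mmr_rerank", 350)));

        // Duplicate priority: rejected with a message that names both processors.
        try {
            ensureNoDuplicatePriority("SearchResponse", "PRE_USER_DEFINED",
                List.of(new Proc("hybrid-explain", 150), new Proc("other_processor", 150)));
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Collecting all names per priority before throwing means the error message lists every processor involved in the conflict, not only the second one encountered.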
3. Sort Processors by Priority Before Building the Pipeline (Core Change)
In generateProcessors(), after collecting processors into lists.pre and lists.post, sort them:
lists.pre.sort(Comparator.comparingInt(p -> ((SystemGeneratedProcessor) p).getPriority()));
lists.post.sort(Comparator.comparingInt(p -> ((SystemGeneratedProcessor) p).getPriority()));
This makes execution order deterministic and explicit — lower priority number always runs first, regardless of factory registration order.
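
As a rough sketch of why sorting removes the dependence on registration order, the example below sorts two differently ordered input lists and gets the same execution order for both. `Proc` and `executionOrder` are hypothetical names introduced for this illustration and are not part of the framework.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class PrioritySortDemo {
    // Hypothetical stand-in for a system-generated processor.
    static final class Proc {
        final String type;
        final int priority;

        Proc(String type, int priority) {
            this.type = type;
            this.priority = priority;
        }
    }

    // Returns processor type names in execution order (lowest priority value first).
    static List<String> executionOrder(List<Proc> registered) {
        List<Proc> sorted = new ArrayList<>(registered);
        sorted.sort(Comparator.comparingInt(p -> p.priority));
        List<String> names = new ArrayList<>();
        for (Proc p : sorted) {
            names.add(p.type);
        }
        return names;
    }

    public static void main(String[] args) {
        Proc explain = new Proc("hybrid-explain", 150);
        Proc rerank = new Proc("mmr_rerank", 350);

        // Two different registration orders produce the same execution order.
        System.out.println(executionOrder(List.of(rerank, explain))); // [hybrid-explain, mmr_rerank]
        System.out.println(executionOrder(List.of(explain, rerank))); // [hybrid-explain, mmr_rerank]
    }
}
```

Because the uniqueness check rejects ties, the result does not depend on the stability of the sort.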
Priority Band Convention (Documentation, Not Code)
Priority bands are a documentation convention only — no constants are defined in core. This ensures plugin authors never need to modify core to introduce a new processor. The bands are defined per processor type, since each type operates on different data at a different point in the search lifecycle.
SearchRequest Processor Bands
| Band | Name | Meaning |
|---|---|---|
| 1–99 | Query Rewriting | Fundamentally rewrites the query structure (expansion, neural translation) |
| 100–299 | Query Parameter Injection | Injects/modifies parameters (size, k) without changing query structure |
| 300–499 | Query Validation/Routing | Validates query or adjusts routing |
| 500–699 | General Request Enrichment | Adds context/metadata; no strong ordering dependency |
| 700–999 | Request Finalization | Last-chance modifications before dispatch |
SearchPhaseResults Processor Bands
| Band | Name | Meaning |
|---|---|---|
| 1–99 | Score Normalization | Normalizes raw shard scores to a common scale |
| 100–299 | Score Combination | Combines normalized scores from multiple sub-queries |
| 300–499 | Candidate Selection | Selects which candidates proceed to FETCH phase |
| 500–699 | Phase Result Enrichment | Attaches metadata/context for downstream response processors |
| 700–999 | Phase Result Finalization | Final adjustments before FETCH phase |
SearchResponse Processor Bands
| Band | Name | Meaning |
|---|---|---|
| 1–99 | Score Modification | Changes numeric score values in the response |
| 100–299 | Score-Derived Computation | Reads finalized scores to produce derived data (explanation, stats) |
| 300–499 | Result Set Shaping | Changes which hits are in the result set or their order (reranking, deduplication, truncation) |
| 500–699 | General Response Enrichment | Adds metadata/annotations; no strong ordering dependency |
| 700–899 | Content Transformation | Transforms hit content/fields (highlighting, field rewriting) |
| 900–999 | Output Formatting | Final response structure shaping |
Scoping: Priority uniqueness is enforced within (processorType, executionStage). A SearchRequest processor with priority 150 and a SearchResponse processor with priority 150 are completely independent and never conflict.
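
The scoping rule can be sketched as a uniqueness check keyed on the full (processorType, executionStage, priority) triple. This is only an illustration of the rule: `Proc`, `findConflicts`, and `other_processor` are hypothetical names, and the proposal itself runs the check per type and stage list, not over one flat list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PriorityScopeDemo {
    // Hypothetical stand-in carrying the three attributes that define the uniqueness scope.
    static final class Proc {
        final String name;
        final String processorType;
        final String stage;
        final int priority;

        Proc(String name, String processorType, String stage, int priority) {
            this.name = name;
            this.processorType = processorType;
            this.stage = stage;
            this.priority = priority;
        }
    }

    // Returns one description per conflict; an empty list means no conflicts.
    static List<String> findConflicts(List<Proc> processors) {
        Map<String, String> seen = new HashMap<>(); // scope key -> first processor name
        List<String> conflicts = new ArrayList<>();
        for (Proc p : processors) {
            String key = p.processorType + "/" + p.stage + "/" + p.priority;
            String previous = seen.putIfAbsent(key, p.name);
            if (previous != null) {
                conflicts.add(previous + " and " + p.name + " collide at " + key);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Proc overSample = new Proc("mmr_over_sample", "SearchRequest", "POST", 150);
        Proc explain = new Proc("hybrid-explain", "SearchResponse", "PRE", 150);
        Proc other = new Proc("other_processor", "SearchResponse", "PRE", 150);

        // Same priority, different type and stage: independent.
        System.out.println(findConflicts(List.of(overSample, explain)));
        // Same priority, same type and stage: conflict.
        System.out.println(findConflicts(List.of(overSample, explain, other)));
    }
}
```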
Impact on Existing Processors
All existing SystemGeneratedProcessor implementations must be updated to implement getPriority(). Proposed assignments:
| Processor | Plugin | Type | Stage | Priority | Band |
|---|---|---|---|---|---|
| mmr_over_sample | k-NN | SearchRequest | POST | 150 | Query Parameter Injection |
| hybrid-explain (proposed) | neural-search | SearchResponse | PRE | 150 | Score-Derived Computation |
| mmr_rerank | k-NN | SearchResponse | PRE | 350 | Result Set Shaping |
| semantic-highlighter | neural-search | SearchResponse | POST | 750 | Content Transformation |
With this ordering, the Response PRE stage executes as:
priority 150: hybrid-explain → enriches explanation before any hits are removed
priority 350: mmr_rerank → reorders and trims the result set
This order is intentional: explanation enrichment runs while every hit is still in the response, so explanation data is attached before MMR reranking reorders and trims the result set.
Changes Required
OpenSearch Core (opensearch-project/OpenSearch):
- SystemGeneratedProcessor.java: add abstract getPriority() method
- SystemGeneratedPipelineWithMetrics.java: replace ensureSingleProcessor with ensureNoDuplicatePriority, add sort by priority
- Documentation: add priority band tables to the system-generated search processors page
neural-search plugin (opensearch-project/neural-search):
- SemanticHighlightingProcessor.java: implement getPriority() returning 750
- New SystemExplanationProcessor.java: implement getPriority() returning 150, getExecutionStage() returning PRE_USER_DEFINED
k-NN plugin (opensearch-project/k-NN):
- mmr_rerank processor: implement getPriority() returning 350
- mmr_over_sample processor: implement getPriority() returning 150
Backward Compatibility
Since getPriority() is abstract, this is a compile-time breaking change for any existing SystemGeneratedProcessor implementation. All known implementations are in first-party OpenSearch plugins and will be updated in the same release. Third-party plugin authors will receive a compile error that directs them to add getPriority() — this is intentional, as it forces explicit ordering decisions.
The runtime behavior for a single processor per stage is unchanged: a list of one element sorted by priority is still a list of one element.
Alternatives Considered
Keep the 1-per-stage limit and use a different execution stage: Not viable — both Response PRE and POST slots are now occupied by existing processors.
Define ExecutionPriority constants in core: Rejected — this would require plugin authors to modify core to add new bands, defeating the purpose of a plugin-extensible system.
Use a default priority value: Rejected — a default silently masks ordering bugs. Making getPriority() abstract forces explicit, documented decisions.
Use Integer.MAX_VALUE as default: Rejected for the same reason as above, and additionally because two processors both defaulting to MAX_VALUE would conflict at runtime rather than compile time.
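
To make the last point concrete, here is a small sketch of the rejected design. `WithDefault` is a hypothetical variant of the interface that supplies `Integer.MAX_VALUE` as a default, and the two processor names are invented. Both implementations compile without mentioning priority at all, and the collision only becomes visible when the values are compared at runtime.

```java
final class DefaultPriorityPitfall {
    // Hypothetical variant of the interface with the rejected default priority.
    interface WithDefault {
        String getType();

        default int getPriority() {
            return Integer.MAX_VALUE;
        }
    }

    // True when both processors would claim the same priority.
    static boolean samePriority(WithDefault a, WithDefault b) {
        return a.getPriority() == b.getPriority();
    }

    public static void main(String[] args) {
        // Neither plugin author made an ordering decision, yet both compile.
        WithDefault first = () -> "plugin_a_processor";
        WithDefault second = () -> "plugin_b_processor";

        System.out.println(first.getType() + " and " + second.getType()
            + " share a priority: " + samePriority(first, second));
    }
}
```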
Related component
Search