Adding translator for filter query by nagarajg17 · Pull Request #2667 · opensearch-project/opensearch-migrations

nagarajg17 · 2026-04-09T06:05:10Z

Description

This translates Solr filter query to OpenSearch equivalent

Testing

UTs and Integ

Check List

New functionality includes testing
Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

codecov · 2026-04-09T06:06:52Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.54%. Comparing base (b3bb603) to head (fbbf0b4).

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2667      +/-   ##
============================================
+ Coverage     72.52%   72.54%   +0.01%     
  Complexity       90       90              
============================================
  Files           709      711       +2     
  Lines         32988    33011      +23     
  Branches       2823     2831       +8     
============================================
+ Hits          23925    23948      +23     
  Misses         7784     7784              
  Partials       1279     1279

Flag	Coverage Δ
gradle	`68.74% <ø> (ø)`
node	`92.71% <100.00%> (+0.07%)`	⬆️
python	`76.56% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

k-rooot · 2026-04-09T11:05:09Z

Please look into the code coverage and integration test failures.

k-rooot · 2026-04-09T12:58:06Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+  match: (ctx) => ctx.params.has('fq'),
+  apply: (ctx) => {
+    // Get all fq values (can be multiple)
+    const fqValues = ctx.params.getAll('fq');


Should we need to filter out empty or whitespace-only values before processing? If this is handled at the parser please make sure we have sufficient test coverage to confirm the handling.

Solr allows filter(condition) syntax inside queries to cache sub-clauses independently. Should we detect and normalize filter(...) constructs?

Added validations to filter out empty or whitespace-only values and tests

Actually it is a separate syntax or query itself from fq. This needs PEG changes, separate translate transformation rule etc, so had excluded in this PR scope. Anyway have added it now

k-rooot · 2026-04-09T13:00:23Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+    if (fqValues.length === 0) return;
+
+    // Parse each fq into OpenSearch DSL
+    const filters = fqValues.map(parseFq);


Solr allows complex boolean expressions within a single fq (e.g., +A +B) which are semantically equivalent to multiple fq params. The current implementation does not explicitly account for intra-fq boolean structure. If translateQ does not normalize +A +B into must, behavior may diverge. Please ensure translateQ preserves boolean semantics consistently between multiple fq and single fq with boolean clauses.
eg: fq=+A +B should behave same as fq=A&fq=B

Callout: Currently, if one fq fails to parse, the entire transformation will fail. We will need to consider the translation modes for later phases to handle these failures.

Good point. The parser does handle +/- prefix syntax (see prefixExpr in solr.pegjs). This applies to not just fq even to q param. When +/- is used and there are 2 or more terms, it is implicit AND instead of OR. Currently PEG works with implicit OR which is true for rest of all other usecases. We can fix at PEG level or before calling PEG itself to send q.op=AND for such cases. I'm tracking the fix in fine tune task. Once we add the fix it will work for both q and fq, no explicit changes will be needed for fq

Ack, it supports fail fast now

Once the fix is implemented, please ensure we have explicit test coverage validating equivalence between multiple fq params and single fq with + (required) clauses. Please add a code level TODO to ensure this is not slipped through the crack.

k-rooot · 2026-04-09T13:03:22Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+    if (existingQuery) {
+      boolQuery.set('must', [existingQuery]);
+    }


Wrapping existingQuery in an array is correct, but if existingQuery is already a bool query, this may lead to unnecessary nesting. Can we flatten when possible to avoid deep nesting and improve query performance/readability.

k-rooot · 2026-04-09T13:06:16Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+    }
+
+    // Add filter clauses
+    boolQuery.set('filter', filters);


This overwrites any existing filter clause if the original query already had one (e.g., from previous transforms like query-q ). Should we merge with existing filters instead of overwriting?

const existingFilter = existingQuery?.get?.('bool')?.get?.('filter') || []; boolQuery.set('filter', [...existingFilter, ...filters]);

We are not overriding it. We are wrapping existing query and adding filter on top of it. For merging filters is handled as part of above comment of nesting

const boolQuery = new Map<string, any>(); if (existingQuery) { boolQuery.set('must', [existingQuery]); } boolQuery.set('filter', filters);

k-rooot · 2026-04-09T13:07:59Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+    boolQuery.set('filter', filters);
+
+    // Replace query with the new bool wrapper
+    ctx.body.set('query', new Map([['bool', boolQuery]]));


I would like to understand why we are doing this replacement? Will it impact any transformations from query-q?

This replaces the entire query with a new bool structure, which may discard other clauses such as should or must_not that exist in the original query. I would recommend merging into the existing bool query when present instead of replacing it.

Answered in above comments. We are wrapping existing query and adding filter on it. It won't impact anything. If existing query from query-q is bool query, have merged filter to that now

k-rooot · 2026-04-09T13:14:51Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+  return result.dsl;
+}
+
+export const request: MicroTransform<RequestContext> = {


Solr supports negation in fq (e.g., -domainStatus:deleted). These should be mapped to must_not instead of filter.

Filter will have any kind of query, must_not is inside that. Translation produces

{ "query": { "bool": { "must": [{ "match_all": {} }], "filter": [ { "bool": { "must_not": [{ "match": { "category": { "query": "software" } } }] } } ] } } }

This introduces an extra level of nesting, whereas OpenSearch allows must_not directly at the top-level bool, which is more natural and avoids unnecessary wrapping. Consider lifting must_not to the top-level bool.must_not when the fq is a pure negation, for a flatter and more idiomatic query structure.

k-rooot · 2026-04-09T17:06:51Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+ * Parse a single fq value using the same query engine as q.
+ * Creates a params map with the fq value as 'q' for translateQ.
+ */
+function parseFq(fq: string): Map<string, any> {


Solr fq supports local params (e.g., {!frange}, {!geofilt}, {!cache=false}), which are not standard query syntax. Passing raw fq into translateQ may fail or misinterpret {!frange l=10 u=100}field, {!cache=false}field:value.

Introduce a preprocessing step to parse/extract local params and route to appropriate DSL constructs will help.

const { localParams, query } = parseLocalParams(fq); // Example mapping // {!frange l=10 u=100}price → range query

Local params is not P0 and tracked separately. It needs separate translation mechanism. It is out of scope for this PR. Just like we are calling supporting all query types of fq by calling translateQ, it would be the same for local params as well when we add support for it. It isn't strictly coupled to this task

k-rooot · 2026-04-09T17:07:39Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

@@ -0,0 +1,65 @@
+/**


A key feature of fq is that filters are cached independently for performance. OpenSearch does not have identical semantics, but filter context is cache-friendly and must is not. Should we support cache hints and map them appropriately (if possible).

// fq={!cache=false}status:active // → may require bypassing filter context or marking differently

Solr supports cost for ordering filter execution, especially for expensive filters and post-filters. At minimum we must parse and ignore explicitly with documentation, or optionally reorder filters based on cost.

Solr distinguishes between normal filters (pre-query) and post filters (executed after main query when cost >= 100). All filters are currently treated equally in OpenSearch. We must either document or support mapping high-cost filters → post_filter (OpenSearch equivalent concept)

{ "query": { ... }, "post_filter": { ... } }

Solr fq semantics are strictly:multiple fq → intersection (AND) but union (OR) can be achieved via boolean queries inside fq or filter(condition) tricks. The current implementation assumes all fq → AND filters only. Should we ensure OR logic inside fq is preserved via should?

Solr explicitly recommends single fq with multiple clauses when reused together and multiple fq when independent (for caching efficiency). The transform loses this distinction where everything becomes a flat filter array. Should we optionally preserve grouping information for better optimization?

Solr allows function queries inside fq (e.g., mul(popularity,price)). These require special handling in OpenSearch (e.g., script queries). Please ensure translateQ supports function queries in filter context.

{ "script": { "script": "doc['popularity'] * doc['price'] > 10" } }

Similar to other comment, local params out of scope for this PR

Cost is a local parameter used with non-cached filter queries (fq) to hint at the order they should be evaluated. This also requires local params. Using fq for non caching scenarios will be very rare and not a general usecase. Not much research I've done but I know OpenSearch doesn't have same equivalent, could be for same reason. This IMO we must implement only if we ever receive a customer request

Above point would suffice I think

We are processing each fq param and the adding result/transformation query in filter array. If fq is using boolean or range or any query inside it, it is processed appropriately

Good observation. OpenSearch does not cache the final combined result of the entire filter array. It typically caches each filter clause (or reusable filter bitset) independently, and then combines them at query time

Functional queries as discussed with product as part of other tasks, its not P0. More over it is independent task. When we add functional queries support, fq also supports automatically just like we are supporting all other query types now

On 1, 2 and 3, please update the fq rule transform to flag unsupported constructs, aligned with validations introduced in PR #2688. Additionally document all unsupported fq features (cache, cost., etc) in the LIMITATIONS file.

Thanks for the confirmation. Can we add a targeted test for plain OR within a single fq to explicitly validate the contract?

Solr’s distinction between single fq with multiple clauses (treated as one logical unit) and multiple fq params (independent filters) is not just about caching, but also about preserving logical grouping and intent. I'd still see treat this as a limitation. Any thoughts on documenting that this distinction is not preserved, or optionally preserving grouping for closer semantic parity/debuggability.

Same comment as 1 for function queries in fq.

k-rooot · 2026-04-09T17:12:08Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+
+export const request: MicroTransform<RequestContext> = {
+  name: 'filter-query-fq',
+  match: (ctx) => ctx.params.has('fq'),


Solr requires proper escaping of special characters in fq. If ctx.params does not already decode values, queries may break or mis-parse. Please make sure decoding is applied before translateQ. Please add sufficient test coverage for this as well

URL decoding is handled by URLSearchParams which is used to construct ctx.params. This applies equally to q, fq, and all other params - it's not specific to this PR's changes. If there's a specific escaping scenario you're concerned about which only applies to fq, happy to add a test case, but the decoding layer is upstream of the transform pipeline and not scope of this task

The intent of the comment was to ensure that escaped/special characters in fq are correctly preserved and parsed through translateQ, since fq often contains ranges, phrases, and special syntax. Adding test coverage for fq values with escaped/special characters to validate end to end behavior through the transform will future proof the transform rule for fq. Thoughts?

k-rooot · 2026-04-09T17:14:18Z

...pture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.test.ts

@@ -0,0 +1,218 @@
+import { describe, it, expect } from 'vitest';


I'll review the UTs in the next revision.

Signed-off-by: Nagaraj G <narajg@amazon.com>

k-rooot · 2026-04-14T16:49:40Z

...pture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.test.ts

+  return obj;
+}
+
+describe('filter-query-fq request transform', () => {


Any thoughts on adding below test coverage

Multiple merge scenario. The current test covers scenario of merging with a single fq, but not with multiple fq values when existing filters are present.

No op when match passes but apply skips

k-rooot · 2026-04-14T16:50:58Z

...pture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.test.ts

+      const result = mapToObject(ctx.body.get('query'));
+      expect(result.bool.filter).toHaveLength(1);
+      // The fq itself becomes a bool query
+      expect(result.bool.filter[0].bool).toBeDefined();


This only checks presence of bool, not correctness of structure (must, should, etc.). Any thoughts on adding assertions to validate actual semantics?

k-rooot · 2026-04-14T16:51:13Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

@@ -0,0 +1,65 @@
+/**


On 1, 2 and 3, please update the fq rule transform to flag unsupported constructs, aligned with validations introduced in PR #2688. Additionally document all unsupported fq features (cache, cost., etc) in the LIMITATIONS file.

Thanks for the confirmation. Can we add a targeted test for plain OR within a single fq to explicitly validate the contract?

Solr’s distinction between single fq with multiple clauses (treated as one logical unit) and multiple fq params (independent filters) is not just about caching, but also about preserving logical grouping and intent. I'd still see treat this as a limitation. Any thoughts on documenting that this distinction is not preserved, or optionally preserving grouping for closer semantic parity/debuggability.

Same comment as 1 for function queries in fq.

k-rooot · 2026-04-14T16:51:55Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+  return result.dsl;
+}
+
+export const request: MicroTransform<RequestContext> = {


This introduces an extra level of nesting, whereas OpenSearch allows must_not directly at the top-level bool, which is more natural and avoids unnecessary wrapping. Consider lifting must_not to the top-level bool.must_not when the fq is a pure negation, for a flatter and more idiomatic query structure.

k-rooot · 2026-04-14T16:52:04Z

...ficCapture/SolrTransformations/transforms/src/solr-to-opensearch/features/filter-query-fq.ts

+
+export const request: MicroTransform<RequestContext> = {
+  name: 'filter-query-fq',
+  match: (ctx) => ctx.params.has('fq'),


The intent of the comment was to ensure that escaped/special characters in fq are correctly preserved and parsed through translateQ, since fq often contains ranges, phrases, and special syntax. Adding test coverage for fq values with escaped/special characters to validate end to end behavior through the transform will future proof the transform rule for fq. Thoughts?

k-rooot · 2026-04-14T17:22:59Z

...Capture/SolrTransformations/transforms/src/solr-to-opensearch/query-engine/parser/solr.pegjs

+filterFunc
+  = "filter(" _ expr:query _ ")" boost:boost? {
+      const node = { type: 'filter', child: expr };
+      if (boost !== null) return { type: 'boost', child: node, value: boost };


Wrapping filter() with a boost node is syntactically correct, but semantically misleading since filter() is a non-scoring construct. Boost on a filter wrapper is effectively a no-op in Solr/OpenSearch filter context. Can we normalise or explicitly documenting that boost on filter() has no effect.

k-rooot · 2026-04-14T17:23:18Z

...Capture/SolrTransformations/transforms/src/solr-to-opensearch/query-engine/parser/solr.pegjs

+// bool.filter for equivalent non-scoring behavior.
+// See: https://solr.apache.org/guide/solr/latest/query-guide/standard-query-parser.html
+filterFunc
+  = "filter(" _ expr:query _ ")" boost:boost? {


The grammar allows any query inside filter() without validation. While flexible, this may allow unsupported or unintended constructs (e.g., nested filter, scoring queries).

The rule assumes query always produces a valid expression. Should we ensure grammar rejects empty expressions

k-rooot · 2026-04-14T17:26:03Z

...nsformations/transforms/src/solr-to-opensearch/query-engine/parser/filterNode.parser.test.ts

+
+const emptyParams = new Map<string, string>();
+
+describe('FilterNode parsing', () => {


Any thoughts on addong below test coverage

nested filter() coverage

boost + boolean

Precedence validations

prefix + boost

k-rooot · 2026-04-14T17:29:39Z

...ansformations/transforms/src/solr-to-opensearch/query-engine/transformer/rules/filterRule.ts

+  transformChild: TransformChild,
+): Map<string, any> => {
+  // Transform the child node
+  const childResult = transformChild(node.child);


If childResult is already a bool query with a filter clause, this creates unnecessary nesting.

In filter-query-fq, filters are merged into an existing bool.filter. Here, we always create a new bool. This results in results in slightly different shapes vs fq. If this is intended, good to document this. Similar comment for boost on filter nodes.

The rule assumes transformChild(node.child) always returns a valid query. Is this intentional?

k-rooot · 2026-04-14T17:32:40Z

...rmations/transforms/src/solr-to-opensearch/query-engine/transformer/rules/filterRule.test.ts

+
+/** Helper to convert nested Maps to plain objects for easier assertion. */
+function mapToObject(map: Map<string, any>): any {


Should we add test coverage for following aspects to future proof the rule - nested filters, negation and boost.

nagarajg17 requested review from AndreKurait, gregschohn, jugal-chauhan and sumobrian as code owners April 9, 2026 06:05

nagarajg17 had a problem deploying to migrations-cicd-require-approval April 9, 2026 06:05 — with GitHub Actions Error

nagarajg17 force-pushed the feature/fq branch from 807095f to 60cce85 Compare April 9, 2026 15:38

nagarajg17 had a problem deploying to migrations-cicd-require-approval April 9, 2026 15:38 — with GitHub Actions Error

k-rooot requested changes Apr 9, 2026

View reviewed changes

sumobrian mentioned this pull request Apr 10, 2026

Solr to OpenSearch Shim to assess migration viability #2504

Open

nagarajg17 force-pushed the feature/fq branch from 60cce85 to 10cf238 Compare April 11, 2026 12:27

nagarajg17 had a problem deploying to migrations-cicd-require-approval April 11, 2026 12:27 — with GitHub Actions Error

Adding translation for filter query

fbbf0b4

Signed-off-by: Nagaraj G <narajg@amazon.com>

nagarajg17 force-pushed the feature/fq branch from 10cf238 to fbbf0b4 Compare April 12, 2026 10:21

nagarajg17 requested a deployment to migrations-cicd-require-approval April 12, 2026 10:21 — with GitHub Actions Waiting

k-rooot reviewed Apr 14, 2026

View reviewed changes

		@@ -0,0 +1,218 @@
		import { describe, it, expect } from 'vitest';


		const emptyParams = new Map<string, string>();

		describe('FilterNode parsing', () => {


		/** Helper to convert nested Maps to plain objects for easier assertion. */
		function mapToObject(map: Map<string, any>): any {

Conversation

nagarajg17 commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Testing

Check List

Uh oh!

codecov bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

k-rooot commented Apr 9, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagarajg17 Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagarajg17 Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagarajg17 Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagarajg17 Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagarajg17 Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagarajg17 Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nagarajg17 Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

nagarajg17 commented Apr 9, 2026 •

edited

Loading

codecov bot commented Apr 9, 2026 •

edited

Loading

nagarajg17 Apr 11, 2026 •

edited

Loading

nagarajg17 Apr 12, 2026 •

edited

Loading

nagarajg17 Apr 11, 2026 •

edited

Loading

nagarajg17 Apr 11, 2026 •

edited

Loading

nagarajg17 Apr 11, 2026 •

edited

Loading

nagarajg17 Apr 11, 2026 •

edited

Loading

nagarajg17 Apr 11, 2026 •

edited

Loading