Fix APM metrics accuracy: server-side filtering, chart totals, and throughput normalization #2623

Merged
ps48 merged 6 commits into opensearch-project:main from ps48:apm-ui-updates
Mar 24, 2026

Conversation

ps48 (Member) commented Mar 24, 2026

Description

  • Server vs. client metric filtering: Added a remoteService="" filter to all node-level PromQL queries so that they show incoming (server-side) metrics instead of outgoing (client-side) metrics. Service dashboards now display the service's own performance, not what it observes as a client of other services.
  • Chart-total consistency: Wrapped the count chart queries (request, fault, error) in sum_over_time over the chart step ([step]) so that chart data points are consistent with the health donut totals in the Application Map flyout.
  • Throughput as req/s: Added a configurable Window Duration field to APM Settings (default 60s, matching the Data Prepper window_duration). Throughput is now displayed as "req/s" instead of "req/int" by dividing gauge values by the configured window duration. This applies to the Service Overview metric tile and the Services Home table.
  • PPL timestamp formatting: Updated PPL queries to use the date format from OSD settings, and added time utility functions for consistent timestamp handling in the correlations flyout.
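
The throughput normalization described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names and the fallback behavior for invalid input are assumptions. Only the 60-second default and the "divide the gauge value by the window duration" rule come from the description.

```typescript
// Hypothetical helpers: names and fallback behavior are illustrative assumptions.
const DEFAULT_WINDOW_DURATION_SECONDS = 60; // default stated in the PR description

/** Parse the Window Duration setting (stored as a string such as '60') into seconds. */
function parseWindowDuration(raw?: string): number {
  const parsed = Number(raw);
  // Fall back to the default for empty, non-numeric, zero, or negative input.
  return isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_WINDOW_DURATION_SECONDS;
}

/** Convert a per-window request count (the gauge value) into requests per second. */
function toRequestsPerSecond(requestsPerWindow: number, windowDuration?: string): number {
  return requestsPerWindow / parseWindowDuration(windowDuration);
}
```

For example, a gauge value of 120 requests in a 60-second window corresponds to 2 req/s.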

Issues Resolved

#2545

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

ps48 added 4 commits March 23, 2026 23:14
Signed-off-by: ps48 <pshenoy36@gmail.com>
Signed-off-by: ps48 <pshenoy36@gmail.com>
…umbers

Signed-off-by: ps48 <pshenoy36@gmail.com>
Signed-off-by: ps48 <pshenoy36@gmail.com>
ps48 added the bug label Mar 24, 2026
Signed-off-by: ps48 <pshenoy36@gmail.com>
tracesDatasetId: '',
serviceMapDatasetId: '',
prometheusDataSourceId: '',
windowDuration: '60',
Member

Do any of the saved object attribute types need to change, such as ApmConfigEntity? It would be better to enforce a type on this formData.

Member Author

Yes, updated here: 2d2876b
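
One way to enforce a type on the form data, as the reviewer suggests, might look like the sketch below. The field names and default values are taken from the diff excerpt above; the interface name, the constant, and the helper are hypothetical and are not the types introduced in commit 2d2876b.

```typescript
// Hypothetical shape: the field names come from the diff excerpt,
// but ApmSettingsFormData and withFormDefaults are illustrative names.
interface ApmSettingsFormData {
  tracesDatasetId: string;
  serviceMapDatasetId: string;
  prometheusDataSourceId: string;
  windowDuration: string; // seconds, stored as a string as in the excerpt
}

const DEFAULT_APM_FORM_DATA: ApmSettingsFormData = {
  tracesDatasetId: '',
  serviceMapDatasetId: '',
  prometheusDataSourceId: '',
  windowDuration: '60',
};

/** Merge partially loaded settings over the defaults so every field is present. */
function withFormDefaults(partial: Partial<ApmSettingsFormData>): ApmSettingsFormData {
  return { ...DEFAULT_APM_FORM_DATA, ...partial };
}
```

With a declared type, a misspelled or missing field becomes a compile-time error instead of an undefined value at runtime.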

} catch {
return undefined;
}
}, [timeRange]);
Member

Nit: this logic occurs multiple times; it would be better to extract it into a function.

Member Author

Moved it to a common place as a hook, useChartStepWindow.
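
The body of useChartStepWindow is not shown in this thread. As a rough sketch of the kind of logic such a hook could wrap, the pure function below derives a PromQL range window from a time range and returns undefined when the range is unusable, mirroring the `return undefined` fallback in the excerpt. The function name, the `maxPoints` parameter, and the rounding rule are all assumptions.

```typescript
// Hypothetical helper: not the implementation from the PR.
// Derives a step window such as "60s" so that sum_over_time(...[window])
// covers exactly one chart step.
function computeStepWindow(startMs: number, endMs: number, maxPoints: number = 100): string | undefined {
  const rangeSeconds = (endMs - startMs) / 1000;
  // Reject inverted, empty, or non-numeric ranges instead of throwing.
  if (!isFinite(rangeSeconds) || rangeSeconds <= 0 || maxPoints <= 0) {
    return undefined;
  }
  // Never go below one second, and round up so the steps span the whole range.
  const stepSeconds = Math.max(1, Math.ceil(rangeSeconds / maxPoints));
  return `${stepSeconds}s`;
}
```

A React hook could memoize this value on the time range, which matches the `[timeRange]` dependency array visible in the excerpt.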

window?: string
): string =>
window
? `sum(sum_over_time(request{environment="${environment}",service="${serviceName}",remoteService="${remoteService}",remoteOperation="${remoteOperation}",namespace="span_derived",remoteService!=""}[${window}]))`
Member

Is remoteService!="" unnecessary here?

Member Author

Yes, this is unnecessary; removed it.
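
With the redundant matcher removed, the query builder might look like the sketch below. The selector labels are copied from the excerpt above. The function name is hypothetical, and the branch used when no window is given is an assumption, because the excerpt is cut off before that branch.

```typescript
// Hypothetical name; the label selector matches the diff excerpt.
function buildDependencyRequestQuery(
  environment: string,
  serviceName: string,
  remoteService: string,
  remoteOperation: string,
  window?: string
): string {
  // The exact match on remoteService is the only remoteService matcher here;
  // the extra remoteService!="" matcher from the excerpt is dropped.
  const selector =
    `request{environment="${environment}",service="${serviceName}",` +
    `remoteService="${remoteService}",remoteOperation="${remoteOperation}",` +
    `namespace="span_derived"}`;
  return window
    ? `sum(sum_over_time(${selector}[${window}]))`
    : `sum(${selector})`; // assumed fallback: the original else branch is not shown
}
```

When `window` equals the chart step, each data point sums the samples in its own step, which is what keeps the chart consistent with the donut totals.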

':' +
('0' + date.getUTCSeconds()).slice(-2) +
'.' +
('00' + date.getUTCMilliseconds()).slice(-3)
Member

Is there existing code to do this?

Member Author (ps48, Mar 24, 2026)

OSD core has this in query_enhancements/common/utils.ts. Its formatDate is similar but uses local time, not UTC.
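
A UTC formatter in the style of the excerpt above could be sketched as follows. Only the seconds and milliseconds portion is visible in the diff, so the function name, the date portion, and the space between date and time are assumptions.

```typescript
// Hypothetical helper: only the ":ss.SSS" tail is confirmed by the excerpt.
function formatUtcTimestamp(date: Date): string {
  // Left-pad with zeros using the same slice technique as the excerpt.
  const pad = (value: number, width: number = 2): string => ('000' + value).slice(-width);
  return (
    date.getUTCFullYear() +
    '-' + pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    '-' + pad(date.getUTCDate()) +
    ' ' + pad(date.getUTCHours()) +
    ':' + pad(date.getUTCMinutes()) +
    ':' + pad(date.getUTCSeconds()) +
    '.' + pad(date.getUTCMilliseconds(), 3)
  );
}
```

Using the getUTC* accessors throughout keeps the output independent of the browser's time zone, which is the difference from a local-time formatter noted in the reply.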

Signed-off-by: ps48 <pshenoy36@gmail.com>
ps48 merged commit 014bc47 into opensearch-project:main Mar 24, 2026
12 of 17 checks passed
Labels

bug Something isn't working

3 participants