Add Prometheus scrape/pull source to prometheus plugin [Adding in the same Prometheus Remote write source] #6743

Open
srikanthpadakanti wants to merge 2 commits into opensearch-project:main from srikanthpadakanti:prometheus-scrape-source

Conversation

@srikanthpadakanti
Contributor

Description

Add Prometheus scrape/pull source to the existing prometheus plugin. The source periodically HTTP GETs target /metrics endpoints, parses the Prometheus text exposition format, and converts metrics into Data Prepper events (Gauge, Sum, Histogram, Summary). Supports static target URLs, configurable scrape interval/timeout, HTTP Basic Auth, Bearer Token, and TLS certificates. Both scrape and remote write modes can run simultaneously under the same plugin.

Three usage modes:

Mode 1: Remote write only (existing behavior, unchanged)
  source:
    prometheus:
      port: 9090

Mode 2: Scrape only
  source:
    prometheus:
      scrape:
        targets:
          - url: "http://host:9090/metrics"

Mode 3: Both simultaneously
  source:
    prometheus:
      port: 9090
      scrape:
        targets:
          - url: "http://host:9090/metrics"

Issues Resolved

Resolves #1997

Check List

  • [x] New functionality includes testing.
  • [ ] New functionality has a documentation issue. Please link to it in this PR.
    • [x] New functionality has javadoc added.
  • [x] Commits are signed with a real name per the DCO.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Member

@dlvenable dlvenable left a comment

Thanks @srikanthpadakanti .

At a high-level would a scraper be expected to get data from the same Prometheus nodes pushing? I'm wondering if it makes sense to have a separate source or not.

@JsonProperty("http_basic")
private HttpBasicConfig httpBasic;

@JsonProperty("bearer_token")
Member

Maybe this should be an oauth2 group with only bearer_token for consistency.

scrape:
  oauth2:
    bearer_token:

Contributor Author

Bearer token here is a static token passed as an HTTP Authorization header, not an OAuth2 flow. Nesting it under oauth2 would imply token endpoint and refresh support that we do not have. Kept it flat under authentication to match the actual behavior.

Happy to add an oauth2 group later when we support the full flow with token exchange.
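For illustration, a minimal sketch of what the static bearer token amounts to (the class and method names here are hypothetical, not from the PR):

```java
import java.util.Map;

// Hypothetical sketch: a static bearer token is only an Authorization
// header value; there is no token endpoint or refresh logic involved,
// which is why it is not modeled as an oauth2 group.
public class BearerTokenHeaderSketch {
    static Map<String, String> authHeader(final String token) {
        return Map.of("Authorization", "Bearer " + token);
    }

    public static void main(String[] args) {
        System.out.println(authHeader("my-static-token").get("Authorization"));
    }
}
```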

public class ScrapeTargetScraper {

private static final Logger LOG = LoggerFactory.getLogger(ScrapeTargetScraper.class);
private static final String ACCEPT_HEADER_VALUE = "text/plain;version=0.0.4";
Member

Does the version direct Prometheus to provide the scrape data in a certain format? Will it return Not Acceptable if it doesn't support the requested version?

Contributor Author

Yes. The version=0.0.4 in the Accept header identifies the Prometheus text exposition format per the spec. If the version parameter is missing, the server falls back to the most recent text format version, which is still 0.0.4.

Prometheus servers do not return 406 for this; they serve the default format instead. Content negotiation between text, OpenMetrics, and protobuf is handled via distinct content types (text/plain, application/openmetrics-text, application/vnd.google.protobuf). We can add OpenMetrics support as a follow-up.

*/
public String scrape(final String url) {
final URI uri = URI.create(url);
final String baseUri = uri.getScheme() + "://" + uri.getAuthority();
Member

Let's use Spring's UriComponentsBuilder instead of string manipulation. If Armeria has something like this it would be ideal to use that instead.

Contributor Author

Agreed on avoiding string manipulation. Will switch to java.net.URI constructor to extract the base URI cleanly rather than adding a Spring dependency. Armeria's WebClient already handles URI parsing internally so we just need the base extraction to be safe.
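A rough sketch of that base-URI extraction via the java.net.URI multi-argument constructor (class and method names are illustrative, not from the PR):

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch of base-URI extraction via java.net.URI rather
// than string concatenation of scheme and authority.
public class BaseUriSketch {
    static URI baseUri(final String url) {
        final URI uri = URI.create(url);
        try {
            // Keep only scheme and authority; drop path, query, fragment.
            return new URI(uri.getScheme(), uri.getAuthority(), null, null, null);
        } catch (final URISyntaxException e) {
            throw new IllegalArgumentException("Invalid scrape target URL: " + url, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(baseUri("http://host:9090/metrics?debug=1")); // http://host:9090
    }
}
```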

final HttpHeadersBuilder builder = HttpHeaders.builder()
.add(HttpHeaderNames.ACCEPT, ACCEPT_HEADER_VALUE);

if (config.getAuthentication() != null) {
Member

It would probably be ideal to have small classes and a common interface for updating the request based on the authentication. This could allow more complicated auth schemes like SigV4 in the future.

I had some similar comments on an http sink PR that is in progress currently. #6747 (comment). Though my comments went the other way. Have the code more amenable to basic auth instead of SigV4. 😄

Contributor Author

Extracted into a ScrapeRequestAuthenticator interface with BasicAuthenticator and BearerTokenAuthenticator as separate classes. Factory method handles selection. Adding SigV4 or other schemes in the future is just a new implementation without touching the scraper.
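Roughly, the abstraction described above could look like this. The class names mirror the comment; the apply(...) signature and the plain Map (instead of Armeria's HttpHeadersBuilder) are assumptions to keep the sketch self-contained:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Common interface: each auth scheme mutates the outgoing request headers.
interface ScrapeRequestAuthenticator {
    void apply(Map<String, String> headers);
}

class BasicAuthenticator implements ScrapeRequestAuthenticator {
    private final String username;
    private final String password;

    BasicAuthenticator(final String username, final String password) {
        this.username = username;
        this.password = password;
    }

    @Override
    public void apply(final Map<String, String> headers) {
        final String credentials = username + ":" + password;
        final String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        headers.put("Authorization", "Basic " + encoded);
    }
}

class BearerTokenAuthenticator implements ScrapeRequestAuthenticator {
    private final String token;

    BearerTokenAuthenticator(final String token) {
        this.token = token;
    }

    @Override
    public void apply(final Map<String, String> headers) {
        headers.put("Authorization", "Bearer " + token);
    }
}

public class AuthenticatorSketch {
    public static void main(String[] args) {
        final Map<String, String> headers = new HashMap<>();
        new BasicAuthenticator("user", "pass").apply(headers);
        System.out.println(headers.get("Authorization")); // Basic dXNlcjpwYXNz
    }
}
```

A future SigV4 implementation would be one more class implementing the interface, leaving the scraper untouched.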

import java.util.Map;
import java.util.TreeMap;

public class TextExpositionParser {
Member

What is the relation between this and RemoteWriteProtobufParser. Is there opportunity for shared code for parsing?

Contributor Author

They handle different input formats with no parsing overlap: the text parser handles the line-oriented exposition format, while the protobuf parser handles the Snappy-compressed binary payload.

Both already share the same output builders (JacksonGauge, JacksonSum, etc.) from data-prepper-api. No practical code to extract.
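To make the line-oriented shape concrete, here is a minimal, hypothetical sketch of parsing one sample line of the text exposition format; the real TextExpositionParser also handles # HELP / # TYPE lines, label escaping, timestamps, and histogram buckets:

```java
// Hypothetical sketch, not the PR's parser: extracts the metric name and
// value from a single sample line such as: name{labels} value
public class ExpositionLineSketch {
    static String parseSample(final String line) {
        final int brace = line.indexOf('{');
        final String name = brace >= 0 ? line.substring(0, brace)
                                       : line.substring(0, line.indexOf(' '));
        final double value = Double.parseDouble(line.substring(line.lastIndexOf(' ') + 1));
        return name + " = " + value;
    }

    public static void main(String[] args) {
        System.out.println(parseSample("http_requests_total{method=\"GET\"} 1027"));
        // http_requests_total = 1027.0
    }
}
```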

@JsonProperty("flatten_labels")
private boolean flattenLabels = false;

@JsonProperty("insecure")
Member

If any of the URLs in targets has a non-HTTPS scheme we should throw a validation error unless insecure is true.

You can add a JSR @AssertTrue to handle this in this class.

Contributor Author

Done.
Added @AssertTrue validation.
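The validation predicate might look roughly like this; in the real config class it would be a getter annotated with JSR-380 @AssertTrue, and the class and method names here are assumptions:

```java
import java.net.URI;
import java.util.List;

// Hypothetical sketch of the scheme check: every target must use https
// unless the user has explicitly set insecure: true.
public class ScrapeSchemeValidationSketch {
    static boolean isSchemeValid(final List<String> targetUrls, final boolean insecure) {
        if (insecure) {
            return true; // plain-HTTP targets explicitly allowed
        }
        return targetUrls.stream()
                .allMatch(url -> "https".equalsIgnoreCase(URI.create(url).getScheme()));
    }

    public static void main(String[] args) {
        System.out.println(isSchemeValid(List.of("http://host:9090/metrics"), false));  // false
        System.out.println(isSchemeValid(List.of("https://host:9090/metrics"), false)); // true
    }
}
```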

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
@srikanthpadakanti srikanthpadakanti force-pushed the prometheus-scrape-source branch from d118aa1 to 713b2d2 Compare April 10, 2026 17:29
@srikanthpadakanti
Contributor Author

Thanks @srikanthpadakanti .

At a high-level would a scraper be expected to get data from the same Prometheus nodes pushing? I'm wondering if it makes sense to have a separate source or not.

A scraper would not typically target the same nodes pushing via remote write. They serve different use cases: remote write is Prometheus servers configured to push, while scrape pulls from any endpoint exposing the Prometheus text exposition format, including application instances that are not Prometheus servers themselves.

The reason they share a plugin is the precedent from kkondaka's feedback on PR #6627 where the recommendation was to keep it as a single prometheus source with config options for different modes, similar to how s3 source handles s3-sqs and s3-scan under one plugin. Both modes can also run simultaneously which simplifies pipeline configuration when users need both ingestion paths.

That said, if you prefer a separate source I am open to splitting it out.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Development

Successfully merging this pull request may close these issues.

Ingest data from Prometheus as Source

2 participants