Add Prometheus scrape/pull source to prometheus plugin [Adding in the same Prometheus Remote write source] #6743
Conversation
dlvenable left a comment
Thanks @srikanthpadakanti.
At a high-level would a scraper be expected to get data from the same Prometheus nodes pushing? I'm wondering if it makes sense to have a separate source or not.
    @JsonProperty("http_basic")
    private HttpBasicConfig httpBasic;

    @JsonProperty("bearer_token")
Maybe this should be an oauth2 group with only bearer_token for consistency.
scrape:
  oauth2:
    bearer_token:
Bearer token here is a static token passed as an HTTP Authorization header, not an OAuth2 flow. Nesting it under oauth2 would imply token endpoint and refresh support that we do not have. Kept it flat under authentication to match the actual behavior.
Happy to add an oauth2 group later when we support the full flow with token exchange.
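For illustration, the flat shape being kept would look like the following. Only `bearer_token` and `http_basic` appear in the diff; the nested `username`/`password` field names are assumptions:

```yaml
scrape:
  authentication:
    # static token sent as "Authorization: Bearer <token>"
    bearer_token: "my-static-token"
    # or, alternatively, HTTP Basic Auth (field names assumed):
    http_basic:
      username: "user"
      password: "pass"
```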
    public class ScrapeTargetScraper {

        private static final Logger LOG = LoggerFactory.getLogger(ScrapeTargetScraper.class);
        private static final String ACCEPT_HEADER_VALUE = "text/plain;version=0.0.4";
Does the version direct Prometheus to provide the scrape data in a certain format? Will it return Not Acceptable if it doesn't support the requested version?
Yes. The version=0.0.4 in the Accept header identifies the Prometheus text exposition format per the spec. If the version parameter is missing, the server falls back to the most recent text format version, which is still 0.0.4.
Prometheus servers do not return 406 Not Acceptable for this; they serve the default format. Content negotiation between text, OpenMetrics, and protobuf is handled via distinct content types (text/plain, application/openmetrics-text, application/vnd.google.protobuf). We can add OpenMetrics support as a follow-up.
     */
    public String scrape(final String url) {
        final URI uri = URI.create(url);
        final String baseUri = uri.getScheme() + "://" + uri.getAuthority();
Let's use Spring's UriComponentsBuilder instead of string manipulation. If Armeria has something like this it would be ideal to use that instead.
Agreed on avoiding string manipulation. Will switch to java.net.URI constructor to extract the base URI cleanly rather than adding a Spring dependency. Armeria's WebClient already handles URI parsing internally so we just need the base extraction to be safe.
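A minimal sketch of that base-URI extraction using the multi-argument java.net.URI constructor instead of string concatenation. The class and method names here are illustrative, not from the PR:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch: rebuild scheme://authority with java.net.URI rather than
// concatenating strings. Names are hypothetical.
public class BaseUriExtractor {

    static String baseUri(final String url) {
        final URI uri = URI.create(url);
        try {
            // Keep only scheme and authority; path, query, and fragment are dropped.
            return new URI(uri.getScheme(), uri.getAuthority(), null, null, null).toString();
        } catch (final URISyntaxException e) {
            throw new IllegalArgumentException("Invalid scrape target URL: " + url, e);
        }
    }

    public static void main(final String[] args) {
        // Prints "https://node-1.example.com:9100"
        System.out.println(baseUri("https://node-1.example.com:9100/metrics?format=text"));
    }
}
```

This also rejects malformed authorities instead of silently producing a bad string.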
        final HttpHeadersBuilder builder = HttpHeaders.builder()
                .add(HttpHeaderNames.ACCEPT, ACCEPT_HEADER_VALUE);

        if (config.getAuthentication() != null) {
It would probably be ideal to have small classes and a common interface for updating the request based on the authentication. This could allow more complicated auth schemes like SigV4 in the future.
I had some similar comments on an http sink PR that is in progress currently. #6747 (comment). Though my comments went the other way. Have the code more amenable to basic auth instead of SigV4. 😄
Extracted into a ScrapeRequestAuthenticator interface with BasicAuthenticator and BearerTokenAuthenticator as separate classes. Factory method handles selection. Adding SigV4 or other schemes in the future is just a new implementation without touching the scraper.
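A self-contained sketch of that split. The real implementations mutate Armeria's HttpHeadersBuilder; a plain Map stands in for the headers here so the example runs standalone, and class names beyond those mentioned above are assumptions:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Sketch of the ScrapeRequestAuthenticator extraction. A Map<String,String>
// stands in for Armeria's HttpHeadersBuilder to keep the example runnable.
public class AuthenticatorSketch {

    interface ScrapeRequestAuthenticator {
        void apply(Map<String, String> headers);
    }

    static class BasicAuthenticator implements ScrapeRequestAuthenticator {
        private final String username;
        private final String password;

        BasicAuthenticator(final String username, final String password) {
            this.username = username;
            this.password = password;
        }

        @Override
        public void apply(final Map<String, String> headers) {
            // RFC 7617: base64(username ":" password)
            final String token = Base64.getEncoder().encodeToString(
                    (username + ":" + password).getBytes(StandardCharsets.UTF_8));
            headers.put("Authorization", "Basic " + token);
        }
    }

    static class BearerTokenAuthenticator implements ScrapeRequestAuthenticator {
        private final String token;

        BearerTokenAuthenticator(final String token) {
            this.token = token;
        }

        @Override
        public void apply(final Map<String, String> headers) {
            headers.put("Authorization", "Bearer " + token);
        }
    }

    public static void main(final String[] args) {
        final Map<String, String> headers = new HashMap<>();
        new BasicAuthenticator("user", "pass").apply(headers);
        // Prints "Basic dXNlcjpwYXNz"
        System.out.println(headers.get("Authorization"));
    }
}
```

A SigV4 implementation would then be one more class implementing the same interface, with no change to the scraper.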
    import java.util.Map;
    import java.util.TreeMap;

    public class TextExpositionParser {
What is the relationship between this and RemoteWriteProtobufParser? Is there an opportunity to share parsing code?
Different input formats with no parsing overlap. Text parser handles line-oriented exposition, protobuf parser handles Snappy-compressed binary.
Both already share the same output builders (JacksonGauge, JacksonSum, etc.) from data-prepper-api. No practical code to extract.
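To illustrate why the two paths do not overlap, here is a heavily simplified sketch of the line-oriented parsing the text path performs (the real TextExpositionParser also handles labels, HELP/TYPE comments, and histogram/summary series; names here are hypothetical):

```java
// Minimal sketch of Prometheus text exposition parsing for a single
// sample line such as: http_requests_total{method="get"} 1027
// Far simpler than the real TextExpositionParser; for illustration only.
public class ExpositionLineSketch {

    static String parseMetricName(final String line) {
        final String trimmed = line.trim();
        int end = trimmed.indexOf('{');      // labels start here, if present
        if (end < 0) {
            end = trimmed.indexOf(' ');      // otherwise a bare metric name
        }
        return trimmed.substring(0, end);
    }

    static double parseValue(final String line) {
        final String trimmed = line.trim();
        // The sample value is the final whitespace-separated token.
        final int lastSpace = trimmed.lastIndexOf(' ');
        return Double.parseDouble(trimmed.substring(lastSpace + 1));
    }

    public static void main(final String[] args) {
        final String line = "http_requests_total{method=\"get\"} 1027";
        System.out.println(parseMetricName(line) + " = " + parseValue(line));
    }
}
```

The remote-write path instead Snappy-decompresses a protobuf WriteRequest, so nothing above applies to it; only the downstream event builders are shared.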
    @JsonProperty("flatten_labels")
    private boolean flattenLabels = false;

    @JsonProperty("insecure")
If any of the URLs in targets has a non-HTTPS scheme we should throw a validation error unless insecure is true.
You can add a JSR @AssertTrue to handle this in this class.
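A sketch of the predicate such an @AssertTrue method could delegate to. It is written as a plain static method so it runs without a Bean Validation runtime; the method and field names are hypothetical:

```java
import java.net.URI;
import java.util.List;

// Sketch of the suggested validation. In the config class this predicate
// would back a JSR-380 method, e.g.:
//   @AssertTrue(message = "Non-HTTPS targets require insecure: true")
//   boolean isSchemeValid() { return allTargetsSecure(targets, insecure); }
// All names here are hypothetical.
public class TargetSchemeCheck {

    static boolean allTargetsSecure(final List<String> targets, final boolean insecure) {
        if (insecure) {
            return true;  // user explicitly opted out of TLS enforcement
        }
        return targets.stream()
                .allMatch(t -> "https".equalsIgnoreCase(URI.create(t).getScheme()));
    }

    public static void main(final String[] args) {
        // Prints "false": one target is plain HTTP and insecure is not set.
        System.out.println(allTargetsSecure(
                List.of("https://a:9100/metrics", "http://b:9100/metrics"), false));
    }
}
```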
.../main/java/org/opensearch/dataprepper/plugins/source/prometheus/PrometheusScrapeService.java
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
A scraper would not typically target the same nodes pushing via remote write. They serve different use cases: remote write is for Prometheus servers configured to push, while scrape pulls from any endpoint exposing the Prometheus text exposition format, including application instances that are not Prometheus servers themselves. The reason they share a plugin is the precedent from kkondaka's feedback on PR #6627, where the recommendation was to keep it as a single prometheus source with config options for the different modes. That said, if you prefer a separate source I am open to splitting it out.
Description
Add a Prometheus scrape/pull source to the existing prometheus plugin. The source periodically HTTP GETs target /metrics endpoints, parses the Prometheus text exposition format, and converts metrics into Data Prepper events (Gauge, Sum, Histogram, Summary). It supports static target URLs, configurable scrape interval/timeout, HTTP Basic Auth, Bearer Token, and TLS certificates. Both scrape and remote write modes can run simultaneously under the same plugin.

Three usage modes:
Issues Resolved
Resolves #1997
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.