908 Commits

Author SHA1 Message Date
Julien
6e83b49dd6 Merge pull request #17685 from akshatsinha0/fix-aws-sd-setdirectory
Fix(discovery/aws): Added SetDirectory method to EC2SDConfig.
2026-04-15 14:35:46 +02:00
Julien
12de1243c0 Merge pull request #18518 from prometheus/release-3.11
Merge back release 3.11.2
2026-04-14 10:26:38 +02:00
Julien Pivotto
4cc50803ff discovery/consul: fix catalog watch trigger and improve filter tests
When health_filter is set without explicit services, the catalog needs
to be watched to enumerate services. Add watchedFilter to the condition
that triggers catalog watching.

Improve the filter test suite:
- Replace defer with t.Cleanup for stub servers.
- Rewrite TestFilterOption to assert that the catalog receives the filter
  and the health endpoint does not.
- Rewrite TestHealthFilterOption to assert that health_filter is routed
  correctly to the health endpoint only.
- Add TestBothFiltersOption to verify both filters are routed to their
  respective endpoints when both are configured.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-10 10:26:40 +02:00
Arthur Silva Sens
72293ff1d2 Merge pull request #18433 from codeboten/codeboten/remove-docker-monopkg-dep
Some checks failed
buf.build / lint and publish (push) Has been cancelled
CI / Go tests (push) Has been cancelled
CI / More Go tests (push) Has been cancelled
CI / Go tests for Prometheus upgrades and downgrades (push) Has been cancelled
CI / Go tests with previous Go version (push) Has been cancelled
CI / UI tests (push) Has been cancelled
CI / Go tests on Windows (push) Has been cancelled
CI / Mixins tests (push) Has been cancelled
CI / Compliance testing (push) Has been cancelled
CI / Build Prometheus for common architectures (0) (push) Has been cancelled
CI / Build Prometheus for common architectures (1) (push) Has been cancelled
CI / Build Prometheus for common architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (0) (push) Has been cancelled
CI / Build Prometheus for all architectures (1) (push) Has been cancelled
CI / Build Prometheus for all architectures (10) (push) Has been cancelled
CI / Build Prometheus for all architectures (11) (push) Has been cancelled
CI / Build Prometheus for all architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (3) (push) Has been cancelled
CI / Build Prometheus for all architectures (4) (push) Has been cancelled
CI / Build Prometheus for all architectures (5) (push) Has been cancelled
CI / Build Prometheus for all architectures (6) (push) Has been cancelled
CI / Build Prometheus for all architectures (7) (push) Has been cancelled
CI / Build Prometheus for all architectures (8) (push) Has been cancelled
CI / Build Prometheus for all architectures (9) (push) Has been cancelled
CI / Report status of build Prometheus for all architectures (push) Has been cancelled
CI / Check generated parser (push) Has been cancelled
CI / golangci-lint (push) Has been cancelled
CI / fuzzing (push) Has been cancelled
CI / codeql (push) Has been cancelled
CI / Publish main branch artifacts (push) Has been cancelled
CI / Publish release artefacts (push) Has been cancelled
CI / Publish UI on npm Registry (push) Has been cancelled
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled
Stale Check / stale (push) Has been cancelled
Lock Threads / action (push) Has been cancelled
chore: remove dependency on github.com/docker/docker
2026-04-09 11:58:10 -03:00
Julien Pivotto
1e73d2fcde discovery/consul: add health_filter for Health API filtering
The filter field was documented as targeting the Catalog API but since
PR #17349 it was also passed to the Health API. This broke existing
configs using Catalog-only fields like ServiceTags, which the Health API
rejects (it uses Service.Tags instead).

Introduce a separate health_filter field that is passed exclusively to
the Health API, while filter remains catalog-only. Update the docs to
explain the two-phase discovery (Catalog for service listing, Health for
instances) and the field name differences between the two APIs.

Fixes #18479

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-09 16:03:16 +02:00
avilevy18
eb220862e5 discovery, scrape: Use backoff interval for throttling discovery updates; add DiscoveryReloadOnStartup option for short-lived environments (#18187)
* Adding scape on shutdown

Signed-off-by: avilevy <avilevy@google.com>

* scrape: replace skipOffsetting to make the test offset deterministic instead of skipping it entirely

Signed-off-by: avilevy <avilevy@google.com>

* renamed calculateScrapeOffset to getScrapeOffset

Signed-off-by: avilevy <avilevy@google.com>

* discovery: Add skipStartupWait to bypass initial discovery delay

In short-lived environments like agent mode or serverless, the
Prometheus process may only execute for a few seconds. Waiting for
the default 5-second `updatert` ticker before sending the first
target groups means the process could terminate before collecting
any metrics at all.

This commit adds a `skipStartupWait` option to the Discovery Manager
to bypass this initial delay. When enabled, the sender uses an
unthrottled startup loop that instantly forwards all triggers. This
ensures both the initial empty update from `ApplyConfig` and the
first real targets from discoverers are passed downstream immediately.

After the first ticker interval elapses, the sender cleanly breaks out
of the startup phase, resets the ticker, and resumes standard
operations.

Signed-off-by: avilevy <avilevy@google.com>

* scrape: Bypass initial reload delay for ScrapeOnShutdown

In short-lived environments like agent mode or serverless, the default
5-second `DiscoveryReloadInterval` can cause the process to terminate
before the scrape manager has a chance to process targets and collect
any metrics.

Because the discovery manager sends an initial empty update upon
configuration followed rapidly by the actual targets, simply waiting
for a single reload trigger is insufficient—the real targets would
still get trapped behind the ticker delay.

This commit introduces an unthrottled startup loop in the `reloader`
when `ScrapeOnShutdown` is enabled. It processes all incoming
`triggerReload` signals immediately during the first interval. Once
the initial tick fires, the `reloader` resets the ticker and falls
back into its standard throttled loop, ensuring short-lived processes
can discover and scrape targets instantly.

Signed-off-by: avilevy <avilevy@google.com>

* test(scrape): refactor time-based manager tests to use synctest

Addresses PR feedback to remove flaky, time-based sleeping in the scrape manager tests.

Add TestManager_InitialScrapeOffset and TestManager_ScrapeOnShutdown to use the testing/synctest package, completely eliminating real-world time.Sleep delays and making the assertions 100% deterministic.

- Replaced httptest.Server with net.Pipe and a custom startFakeHTTPServer helper to ensure all network I/O remains durably blocked inside the synctest bubble.
- Leveraged the skipOffsetting option to eliminate random scrape jitter, making the time-travel math exact and predictable.
- Using skipOffsetting also safely bypasses the global singleflight DNS lookup in setOffsetSeed, which previously caused cross-bubble panics in synctest.
- Extracted shared boilerplate into a setupSynctestManager helper to keep the test cases highly readable and data-driven.

Signed-off-by: avilevy <avilevy@google.com>

* Clarify use cases in InitialScrapeOffset comment

Signed-off-by: avilevy <avilevy@google.com>

* test(scrape): use httptest for mock server to respect context cancellation

- Replaced manual HTTP string formatting over `net.Pipe` with `httptest.NewUnstartedServer`.
- Implemented an in-memory `pipeListener` to allow the server to handle `net.Pipe` connections directly. This preserves `synctest` time isolation without opening real OS ports.
- Added explicit `r.Context().Done()` handling in the mock HTTP handler to properly simulate aborted requests and scrape timeouts.
- Validates that the request context remains active and is not prematurely cancelled during `ScrapeOnShutdown` scenarios.
- Renamed `skipOffsetting` to `skipJitterOffsetting`.
- Addressed other PR comments.

Signed-off-by: avilevy <avilevy@google.com>

* tmp

Signed-off-by: bwplotka <bwplotka@gmail.com>

* exp2

Signed-off-by: bwplotka <bwplotka@gmail.com>

* fix

Signed-off-by: bwplotka <bwplotka@gmail.com>

* scrape: fix scrapeOnShutdown context bug and refactor test helpers
The scrapeOnShutdown feature was failing during manager shutdown because
the scrape pool context was being cancelled before the final shutdown
scrapes could execute. Fix this by delaying context cancellation
in scrapePool.stop() until after all scrape loops have stopped.
In addition:
- Added test cases to verify scrapeOnShutdown works with InitialScrapeOffset.
- Refactored network test helper functions from manager_test.go to
  helpers_test.go.
- Addressed other comments.

Signed-off-by: avilevy <avilevy@google.com>

* Update scrape/scrape.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com>

* feat(discovery): add SkipInitialWait to bypass initial startup delay

This adds a SkipInitialWait option to the discovery Manager, allowing consumers sensitive to startup latency to receive the first batch of discovered targets immediately instead of waiting for the updatert ticker.

To support this without breaking the immediate dropped target notifications introduced in #13147, ApplyConfig now uses a keep flag to only trigger immediate downstream syncs for obsolete or updated providers. This prevents sending premature empty target groups for brand-new providers on initial startup.

Additionally, the scrape manager's reloader loop is updated to process the initial triggerReload immediately, ensuring the end-to-end pipeline processes initial targets without artificial delays.

Signed-off-by: avilevy <avilevy@google.com>

* scrape: Add TestManagerReloader and refactor discovery triggerSync

Adds a new TestManagerReloader test suite using synctest to assert
behavior of target updates, discovery reload ticker intervals, and
ScrapeOnShutdown flags.

Updates setupSynctestManager to allow skipping initial config setup by
passing an interval of 0.

Also renames the 'keep' variable to 'triggerSync' in ApplyConfig inside
discovery/manager.go for clarity, and adds a descriptive comment.

Signed-off-by: avilevy <avilevy@google.com>

* feat(discovery,scrape): rename startup wait options and add DiscoveryReloadOnStartup

- discovery: Rename `SkipInitialWait` to `SkipStartupWait` for clarity.
- discovery: Pass `context.Context` to `flushUpdates` to handle cancellation and avoid leaks.
- scrape: Add `DiscoveryReloadOnStartup` to `Options` to decouple startup discovery from `ScrapeOnShutdown`.
- tests: Refactor `TestTargetSetTargetGroupsPresentOnStartup` and `TestManagerReloader` to use table-driven tests and `synctest` for better stability and coverage.

Signed-off-by: avilevy <avilevy@google.com>

* feat(discovery,scrape): importing changes proposed in 043d710

- Refactor sender to use exponential backoff
- Replaces `time.NewTicker` in `sender()` with an exponential backoff
  to prevent panics on non-positive intervals and better throttle updates.
- Removes obsolete `skipStartupWait` logic.
- Refactors `setupSynctestManager` to use an explicit `initConfig` argument

Signed-off-by: avilevy <avilevy@google.com>

* fix: updating go mod

Signed-off-by: avilevy <avilevy@google.com>

* fixing merge

Signed-off-by: avilevy <avilevy@google.com>

* fixing issue: 2 variables but NewTestMetrics returns 1 value

Signed-off-by: avilevy <avilevy@google.com>

* Update discovery/manager.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com>

* Refactor setupSynctestManager initConfig into a separate function

Signed-off-by: avilevy <avilevy@google.com>

---------

Signed-off-by: avilevy <avilevy@google.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com>
Co-authored-by: bwplotka <bwplotka@gmail.com>
2026-04-03 11:01:49 +01:00
alex boten
ab63eeb0d2 update deprecated calls
Signed-off-by: alex boten <223565+codeboten@users.noreply.github.com>
2026-04-02 15:54:32 -07:00
alex boten
39672984b5 chore: remove dependency on github.com/docker/docker
The package makes vulnerability scanners unhappy, and the functionality is available in the smaller moby/moby packages.

Signed-off-by: alex boten <223565+codeboten@users.noreply.github.com>
2026-04-02 15:11:24 -07:00
George Krajcsovits
f1b226a2f3 discovery: fix build error in TestGaugeLastUpdateTimestamp (#18430)
Some checks failed
buf.build / lint and publish (push) Has been cancelled
CI / Go tests (push) Has been cancelled
CI / More Go tests (push) Has been cancelled
CI / Go tests for Prometheus upgrades and downgrades (push) Has been cancelled
CI / Go tests with previous Go version (push) Has been cancelled
CI / UI tests (push) Has been cancelled
CI / Go tests on Windows (push) Has been cancelled
CI / Mixins tests (push) Has been cancelled
CI / Compliance testing (push) Has been cancelled
CI / Build Prometheus for common architectures (0) (push) Has been cancelled
CI / Build Prometheus for common architectures (1) (push) Has been cancelled
CI / Build Prometheus for common architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (0) (push) Has been cancelled
CI / Build Prometheus for all architectures (1) (push) Has been cancelled
CI / Build Prometheus for all architectures (10) (push) Has been cancelled
CI / Build Prometheus for all architectures (11) (push) Has been cancelled
CI / Build Prometheus for all architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (3) (push) Has been cancelled
CI / Build Prometheus for all architectures (4) (push) Has been cancelled
CI / Build Prometheus for all architectures (5) (push) Has been cancelled
CI / Build Prometheus for all architectures (6) (push) Has been cancelled
CI / Build Prometheus for all architectures (7) (push) Has been cancelled
CI / Build Prometheus for all architectures (8) (push) Has been cancelled
CI / Build Prometheus for all architectures (9) (push) Has been cancelled
CI / Report status of build Prometheus for all architectures (push) Has been cancelled
CI / Check generated parser (push) Has been cancelled
CI / golangci-lint (push) Has been cancelled
CI / fuzzing (push) Has been cancelled
CI / codeql (push) Has been cancelled
CI / Publish main branch artifacts (push) Has been cancelled
CI / Publish release artefacts (push) Has been cancelled
CI / Publish UI on npm Registry (push) Has been cancelled
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled
NewTestMetrics returns a single value but the test was
assigning it to two variables.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2026-04-02 20:42:03 +02:00
Will Bollock
b70a871988 fix(discovery): delete expired refresh metrics on reload (#17614)
Building off config-specific Prometheus refresh metrics from an earlier
PR (https://github.com/prometheus/prometheus/pull/17138), this deletes
refresh metrics like `prometheus_sd_refresh_duration_seconds` and
`prometheus_sd_refresh_failures_total` when the underlying scrape job
configuration is removed on reload. This reduces un-needed cardinality
from scrape job specific metrics while still preserving metrics that
indicate overall health of a service discovery engine.

For example,
`prometheus_sd_refresh_failures_total{config="linode-servers",mechanism="linode"} 1`
will no longer be exported by Prometheus when the `linode-servers`
scrape job for the Linode service provider is removed. The generic,
service discovery specific `prometheus_sd_linode_failures_total` metric
will persist however.

* fix: add targetsMtx lock for targets access

* test: validate refresh/discover metrics are gone

* ref: combine sdMetrics and refreshMetrics

Good idea from @bboreham to combine sdMetrics and refreshMetrics!
They're always passed around together and don't have much of a
reason not to be combined. mechanismMetrics makes it clear what kind of
metrics this is used for (service discovery mechanisms).

---------

Signed-off-by: Will Bollock <wbollock@linode.com>
2026-04-02 13:43:35 +01:00
Julien Pivotto
a2d741cbf7 discovery/kubernetes: fix pod_test.go after makeNode signature change
PR #17601 extended makeNode with annotations and conditions parameters
but missed updating two call sites in pod_test.go.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-02 10:03:00 +02:00
Frederic Branczyk
b014aa101d Merge pull request #17601 from joelkbiju12/ft-kubernetes-nodeready-label
Some checks failed
buf.build / lint and publish (push) Has been cancelled
CI / Go tests (push) Has been cancelled
CI / More Go tests (push) Has been cancelled
CI / Go tests for Prometheus upgrades and downgrades (push) Has been cancelled
CI / Go tests with previous Go version (push) Has been cancelled
CI / UI tests (push) Has been cancelled
CI / Go tests on Windows (push) Has been cancelled
CI / Mixins tests (push) Has been cancelled
CI / Compliance testing (push) Has been cancelled
CI / Build Prometheus for common architectures (0) (push) Has been cancelled
CI / Build Prometheus for common architectures (1) (push) Has been cancelled
CI / Build Prometheus for common architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (0) (push) Has been cancelled
CI / Build Prometheus for all architectures (1) (push) Has been cancelled
CI / Build Prometheus for all architectures (10) (push) Has been cancelled
CI / Build Prometheus for all architectures (11) (push) Has been cancelled
CI / Build Prometheus for all architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (3) (push) Has been cancelled
CI / Build Prometheus for all architectures (4) (push) Has been cancelled
CI / Build Prometheus for all architectures (5) (push) Has been cancelled
CI / Build Prometheus for all architectures (6) (push) Has been cancelled
CI / Build Prometheus for all architectures (7) (push) Has been cancelled
CI / Build Prometheus for all architectures (8) (push) Has been cancelled
CI / Build Prometheus for all architectures (9) (push) Has been cancelled
CI / Report status of build Prometheus for all architectures (push) Has been cancelled
CI / Check generated parser (push) Has been cancelled
CI / golangci-lint (push) Has been cancelled
CI / fuzzing (push) Has been cancelled
CI / codeql (push) Has been cancelled
CI / Publish main branch artifacts (push) Has been cancelled
CI / Publish release artefacts (push) Has been cancelled
CI / Publish UI on npm Registry (push) Has been cancelled
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled
discovery: adding kubernetes node condition labels
2026-04-01 18:05:27 +02:00
Aurélien Duboc
fa5960bf08 feat(discovery): add Outscale VM service discovery (#18139)
Add Outscale VM service discovery using osc-sdk-go, including optional secret_key_file support, metrics, docs, and configuration examples. Document the default region (eu-west-2).

Signed-off-by: Aurelien Duboc <aurelienduboc96@gmail.com>
2026-03-31 18:03:51 +02:00
Vladimir Skesov
20ff771593 discovery: add DigitalOcean Managed Databases service discovery
This adds 'databases' role to digitalocean_sd_config to discover DigitalOcean
Managed Database clusters. It follows the multi-role design pattern by
introducing a 'role' parameter (default: 'droplets').

Includes:
- Support for Managed Databases API.
- Pagination handling for Databases API.
- Comprehensive meta labels for database targets.
- Updated documentation and tests.

Signed-off-by: Vladimir Skesov <skesov@gmail.com>
2026-03-30 11:31:17 +03:00
Pierluigi Lenoci
73902efbd0 discovery/vultr: upgrade govultr from v2 to v3 (#18347)
* discovery/vultr: upgrade govultr from v2 to v3

The govultr/v2 library is no longer actively maintained. Upgrade to
govultr/v3 (v3.28.1) which receives regular updates and security
patches.

The v3 library is API-compatible with v2 for the Instance.List
method used by the Vultr SD, with the only change being an
additional *http.Response return value.

Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com>

* discovery/vultr: check HTTP response status code

Validate that the Vultr API returns a 2xx status code after listing
instances, as the *http.Response from govultr v3 is now available.

Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com>

* discovery/vultr: fix linter error in error string capitalization

Error strings should not be capitalized per Go conventions (ST1005).

Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com>

---------

Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com>
2026-03-26 09:42:25 +01:00
Ogulcan Aydogan
7bbff490a3 discovery/azure: fix system managed identity when client_id is empty
When using ManagedIdentity authentication with system-assigned identity,
the client_id field is intentionally left empty. However, the current code
unconditionally sets options.ID = azidentity.ClientID(cfg.ClientID), which
passes an empty string instead of nil. The Azure SDK treats an empty
ClientID as a request for a user-assigned identity with an empty client ID,
rather than falling back to system-assigned identity.

Fix by only setting options.ID when cfg.ClientID is non-empty, matching the
pattern already used in storage/remote/azuread/azuread.go.

Fixes #16634

Signed-off-by: Ogulcan Aydogan <ogulcanaydogan@hotmail.com>
2026-03-19 10:49:17 +00:00
Jonas L.
a9d90952ba Deprecate Hetzner Cloud server datacenter labels (#17850)
[hcloud.Server.Datacenter] is deprecated and will be removed after 1 July 2026. Use [hcloud.Server.Location] instead.

See https://docs.hetzner.cloud/changelog#2025-12-16-phasing-out-datacenters

Changes to Hetzner meta labels:

- `__meta_hetzner_datacenter`
    - is deprecated for the role `robot` but kept for backward compatibility. Using `__meta_hetzner_robot_datacenter` is preferred.
    - is deprecated for the role `hcloud` and will stop working after the 1 July 2026.

-  `__meta_hetzner_hcloud_datacenter_location` label
    - is deprecated but kept for backward compatibility, the same data is available in the [`hcloud.Server.Location`](https://pkg.go.dev/github.com/hetznercloud/hcloud-go/v2/hcloud#Server) struct.
    - using `__meta_hetzner_hcloud_location` is preferred.

-  `__meta_hetzner_hcloud_datacenter_location_network_zone`
    - is deprecated but kept for backward compatibility, the same data is available in the [`hcloud.Server.Location`](https://pkg.go.dev/github.com/hetznercloud/hcloud-go/v2/hcloud#Server) struct.
    - using `__meta_hetzner_hcloud_location_network_zone` is preferred.

- `__meta_hetzner_hcloud_location`
    - replacement label for `__meta_hetzner_hcloud_datacenter_location`

- `__meta_hetzner_hcloud_location_network_zone`
    - replacement label for `__meta_hetzner_hcloud_datacenter_location_network_zone`

- `__meta_hetzner_robot_datacenter`
    - replacement label for `__meta_hetzner_datacenter` with the role `robot`.

Signed-off-by: Jonas Lammler <jonas.lammler@hetzner-cloud.de>
2026-03-19 11:25:01 +01:00
Munem Hashmi
89b3ad45a8 discovery/file: restore atomic file writes in tests (#18259)
PR #17269 replaced atomic os.Rename-based file writes with
os.WriteFile to fix a Windows flake. However, os.WriteFile is not
atomic (it truncates then writes), and fsnotify can fire between
the truncate and write, causing the watcher to read an empty file
and replace valid targets with empty ones.

Restore atomicity by writing to a temporary file and renaming.
On Windows, retry the rename with a short backoff to handle
transient "Access is denied" errors when the file watcher or
readFile holds an open handle to the destination.

Fixes #18237

Signed-off-by: Munem Hashmi <munem.hashmi@gmail.com>
2026-03-11 08:12:56 +01:00
Matt
5a02b92c0e AWS SD: RDS Role (#18206)
Some checks failed
buf.build / lint and publish (push) Has been cancelled
CI / Go tests (push) Has been cancelled
CI / More Go tests (push) Has been cancelled
CI / Go tests with previous Go version (push) Has been cancelled
CI / UI tests (push) Has been cancelled
CI / Go tests on Windows (push) Has been cancelled
CI / Mixins tests (push) Has been cancelled
CI / Compliance testing (push) Has been cancelled
CI / Build Prometheus for common architectures (0) (push) Has been cancelled
CI / Build Prometheus for common architectures (1) (push) Has been cancelled
CI / Build Prometheus for common architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (0) (push) Has been cancelled
CI / Build Prometheus for all architectures (1) (push) Has been cancelled
CI / Build Prometheus for all architectures (10) (push) Has been cancelled
CI / Build Prometheus for all architectures (11) (push) Has been cancelled
CI / Build Prometheus for all architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (3) (push) Has been cancelled
CI / Build Prometheus for all architectures (4) (push) Has been cancelled
CI / Build Prometheus for all architectures (5) (push) Has been cancelled
CI / Build Prometheus for all architectures (6) (push) Has been cancelled
CI / Build Prometheus for all architectures (7) (push) Has been cancelled
CI / Build Prometheus for all architectures (8) (push) Has been cancelled
CI / Build Prometheus for all architectures (9) (push) Has been cancelled
CI / Report status of build Prometheus for all architectures (push) Has been cancelled
CI / Check generated parser (push) Has been cancelled
CI / golangci-lint (push) Has been cancelled
CI / fuzzing (push) Has been cancelled
CI / codeql (push) Has been cancelled
CI / Publish main branch artifacts (push) Has been cancelled
CI / Publish release artefacts (push) Has been cancelled
CI / Publish UI on npm Registry (push) Has been cancelled
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled
Sync repo files / repo_sync (push) Has been cancelled
Stale Check / stale (push) Has been cancelled
Lock Threads / action (push) Has been cancelled
2026-03-04 12:17:38 +01:00
Frederic Branczyk
b9ce7f3be0 Merge pull request #18192 from rexagod/17193
discovery/k8s: Dedup EPS for `*DualStack` policies
2026-03-04 11:46:44 +01:00
Frederic Branczyk
44699107d2 Merge pull request #17774 from rexagod/16747
discovery/kubernetes: Support linked pod controllers
2026-03-04 11:07:14 +01:00
Matthieu MOREL
45b9329e68 chore: fix emptyStringTest issues from gocritic (#18226)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2026-03-04 08:24:50 +01:00
Pranshu Srivastava
ebacf4a084 fixup! fixup! discovery/kubernetes: Support linked pod controllers
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2026-03-03 18:30:52 +05:30
Pranshu Srivastava
f21785f5a9 fixup! discovery/kubernetes: Support linked pod controllers
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2026-03-03 18:30:52 +05:30
Pranshu Srivastava
228b94f6eb discovery/kubernetes: Support linked pod controllers
Extended Kubernetes SD to support the following pod-based labels:
* `__meta_kubernetes_pod_deployment_name`
* `__meta_kubernetes_pod_cronjob_name`
* `__meta_kubernetes_pod_job_name`

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2026-03-03 18:30:48 +05:30
Pranshu Srivastava
2684af0ca8 fixup! discovery/k8s: Dedup EPS for *DualStack policies
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2026-03-03 18:03:37 +05:30
George Krajcsovits
318980a5c2 Merge pull request #17207 from thomas-gouveia/feat/16634/add-support-for-workload-identity-azure-discovery
feat: add support for Azure Workload Identity authentication method for Azure discovery [#16634]
2026-03-03 12:59:58 +01:00
George Krajcsovits
feb741e470 Merge pull request #18215 from mmorel-35/gocritic
chore: fix httpNoBody issues from gocritic
2026-03-03 12:50:19 +01:00
Hank (Yong-Han) Chen
264be9aa74 fix(discovery/file): Fix flaky test on Windows by replacing os.rename with os.WriteFile (#17269)
Signed-off-by: Yong-Han Chen <hank96015@gmail.com>
2026-03-03 12:43:50 +01:00
Matthieu MOREL
026d284c43 chore: fix httpNoBody issues from gocritic
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2026-03-02 20:06:30 +01:00
Frederic Branczyk
1751685dd4 Merge pull request #18194 from rexagod/11726
Some checks failed
buf.build / lint and publish (push) Has been cancelled
CI / Go tests (push) Has been cancelled
CI / More Go tests (push) Has been cancelled
CI / Go tests with previous Go version (push) Has been cancelled
CI / UI tests (push) Has been cancelled
CI / Go tests on Windows (push) Has been cancelled
CI / Mixins tests (push) Has been cancelled
CI / Compliance testing (push) Has been cancelled
CI / Build Prometheus for common architectures (0) (push) Has been cancelled
CI / Build Prometheus for common architectures (1) (push) Has been cancelled
CI / Build Prometheus for common architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (0) (push) Has been cancelled
CI / Build Prometheus for all architectures (1) (push) Has been cancelled
CI / Build Prometheus for all architectures (10) (push) Has been cancelled
CI / Build Prometheus for all architectures (11) (push) Has been cancelled
CI / Build Prometheus for all architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (3) (push) Has been cancelled
CI / Build Prometheus for all architectures (4) (push) Has been cancelled
CI / Build Prometheus for all architectures (5) (push) Has been cancelled
CI / Build Prometheus for all architectures (6) (push) Has been cancelled
CI / Build Prometheus for all architectures (7) (push) Has been cancelled
CI / Build Prometheus for all architectures (8) (push) Has been cancelled
CI / Build Prometheus for all architectures (9) (push) Has been cancelled
CI / Report status of build Prometheus for all architectures (push) Has been cancelled
CI / Check generated parser (push) Has been cancelled
CI / golangci-lint (push) Has been cancelled
CI / fuzzing (push) Has been cancelled
CI / codeql (push) Has been cancelled
CI / Publish main branch artifacts (push) Has been cancelled
CI / Publish release artefacts (push) Has been cancelled
CI / Publish UI on npm Registry (push) Has been cancelled
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled
Stale Check / stale (push) Has been cancelled
Lock Threads / action (push) Has been cancelled
discovery: Introduce `prometheus_sd_last_update_timestamp_seconds`
2026-03-02 17:00:42 +01:00
Frederic Branczyk
bf139a370c Merge pull request #18006 from rexagod/13525
discovery/kubernetes: Support node role selectors for pod roles
2026-03-02 10:03:26 +01:00
Pranshu Srivastava
c023066ca9 discovery: Introduce prometheus_sd_last_update_timestamp_seconds
The metric tracks the last update sent to SD consumers, and includes the
manager name. This allows for monitoring SD state based on far ago its
last heartbeat was.

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2026-02-25 18:45:24 +05:30
Pranshu Srivastava
a87b603464 discovery/k8s: Dedup EPS for *DualStack policies
In case of {Prefer,Require}DualStack policies in Services, K8s will
create two `EndpointSlices` resources for each IP family address type
specified. This created duplicate targets.

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2026-02-25 16:00:19 +05:30
Bryan Boreham
d28d33afe0 Merge branch 'main' into fix/consul-filter-health-endpoint
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2026-02-24 11:25:35 +00:00
Aurélien Duboc
1a5da4fbe0 fix(discovery): apply EC2 SD endpoint and guard refreshAZIDs nils (#18133)
Signed-off-by: Aurelien Duboc <aurelienduboc96@gmail.com>
2026-02-24 12:12:57 +01:00
Matt
ce30ae49f3 [FEATURE] AWS SD: Add Elasticache Role (#18099)
* AWS SD: Elasticache

This change adds Elasticache to the AWS SD.

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: Matt <small_minority@hotmail.com>

---------

Signed-off-by: Matt <small_minority@hotmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
2026-02-22 15:56:03 +01:00
Julien Pivotto
7d0a39ac93 chore(lint): enable wg.Go
Since our minimum supported go version is now go 1.25, we can use wg.Go.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-02-17 15:21:51 +01:00
Joe Adams
68df59ba0c Merge pull request #18029 from matt-gp/ecs-sd-bug-17920
Some checks failed
buf.build / lint and publish (push) Has been cancelled
CI / Go tests (push) Has been cancelled
CI / More Go tests (push) Has been cancelled
CI / Go tests with previous Go version (push) Has been cancelled
CI / UI tests (push) Has been cancelled
CI / Go tests on Windows (push) Has been cancelled
CI / Mixins tests (push) Has been cancelled
CI / Build Prometheus for common architectures (0) (push) Has been cancelled
CI / Build Prometheus for common architectures (1) (push) Has been cancelled
CI / Build Prometheus for common architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (0) (push) Has been cancelled
CI / Build Prometheus for all architectures (1) (push) Has been cancelled
CI / Build Prometheus for all architectures (10) (push) Has been cancelled
CI / Build Prometheus for all architectures (11) (push) Has been cancelled
CI / Build Prometheus for all architectures (2) (push) Has been cancelled
CI / Build Prometheus for all architectures (3) (push) Has been cancelled
CI / Build Prometheus for all architectures (4) (push) Has been cancelled
CI / Build Prometheus for all architectures (5) (push) Has been cancelled
CI / Build Prometheus for all architectures (6) (push) Has been cancelled
CI / Build Prometheus for all architectures (7) (push) Has been cancelled
CI / Build Prometheus for all architectures (8) (push) Has been cancelled
CI / Build Prometheus for all architectures (9) (push) Has been cancelled
CI / Report status of build Prometheus for all architectures (push) Has been cancelled
CI / Check generated parser (push) Has been cancelled
CI / golangci-lint (push) Has been cancelled
CI / fuzzing (push) Has been cancelled
CI / codeql (push) Has been cancelled
CI / Publish main branch artifacts (push) Has been cancelled
CI / Publish release artefacts (push) Has been cancelled
CI / Publish UI on npm Registry (push) Has been cancelled
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled
AWS SD: ECS Discover Standalone Tasks
2026-02-10 22:03:28 -05:00
Ganesh Vernekar
847e474bf4 Kubernetes SD: Disable WatchListClient in tests
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2026-02-10 12:39:34 -08:00
Ganesh Vernekar
1698aada1e Update Go dependencies for v3.10 release
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
2026-02-10 12:39:33 -08:00
George Krajcsovits
246341b169 Merge pull request #18041 from matt-gp/aws-sd-msk-optimisations
AWS SD: Optimise MSK Role
2026-02-10 12:17:09 +01:00
Matt
e27bcdf03f AWS SD: Load Region Fallback (#18019)
* AWS SD: Load Region Fallback

---------

Signed-off-by: matt-gp <small_minority@hotmail.com>
2026-02-10 12:02:24 +01:00
matt-gp
96a87a06ea AWS SD: Optmise MSK Role
Signed-off-by: matt-gp <small_minority@hotmail.com>
2026-02-08 19:42:12 +00:00
matt-gp
5329355ffb AWS SD: ECS Standalone Tasks
The current ECS role in AWS SD assumes that a task is part of a service.
This means that tasks that are started as part of AWS Batch will get
missed and not be discovered. This changed fixes this so that standalone
tasks can be discovered as well.

Signed-off-by: matt-gp <small_minority@hotmail.com>
2026-02-08 16:04:17 +00:00
Matt
cf9d093f8f prw2: Move Remote Write 2.0 CT to be per Sample; Rename to ST (start timestamp) (#17411) (#17600)
Relates to
https://github.com/prometheus/prometheus/issues/16944#issuecomment-3164760343

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: matt-gp <small_minority@hotmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2026-02-05 10:40:45 +01:00
Pranshu Srivastava
99e464c239 discovery/kubernetes: Support node role selectors for pod roles
The node informers for pod roles respected filtering based on selectors
([1]). However, node selectors for pod roles were not allowed. This
patch address that, and additionally adds pod filtering logic, to not
populate their target groups when they belong to a filtered node.

[1]:https://github.com/prometheus/prometheus/pull/10080/changes#diff-e9ca22962ad7d9a7bb4cb82d209dc2dd5301f457d8242da5535557866d2ea1eaR667

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2026-02-04 02:26:50 +05:30
Ben Kochie
9b549fa118 Update Go yaml v3 library (#17913)
Replace archived `gopkg.in/yaml.v3` with supported `go.yaml.in/yaml/v3`.

Fixes: https://github.com/prometheus/prometheus/issues/16415

Signed-off-by: SuperQ <superq@gmail.com>
2026-01-22 16:36:30 +01:00
Matt
39e524088c Fix(discovery/aws): Create Copies of Default Config (#17769)
Signed-off-by: matt-gp <small_minority@hotmail.com>
2026-01-22 11:34:01 +01:00
dependabot[bot]
99c8351d0e chore(deps): bump github.com/hetznercloud/hcloud-go/v2 from 2.32.0 to 2.33.0 (#17762)
* chore(deps): bump github.com/hetznercloud/hcloud-go/v2

Bumps [github.com/hetznercloud/hcloud-go/v2](https://github.com/hetznercloud/hcloud-go) from 2.32.0 to 2.33.0.
- [Release notes](https://github.com/hetznercloud/hcloud-go/releases)
- [Changelog](https://github.com/hetznercloud/hcloud-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hetznercloud/hcloud-go/compare/v2.32.0...v2.33.0)

---
updated-dependencies:
- dependency-name: github.com/hetznercloud/hcloud-go/v2
  dependency-version: 2.33.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

* Use `server.Datacenter` until next minor release - disable linting of it in the meantime

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-01-07 13:21:56 +01:00