Commit graph

81 commits

Author SHA1 Message Date
bwplotka
c133a969af Merge branch 'main' into start-time-main-sync 2026-03-12 08:28:15 +00:00
Bartlomiej Plotka
a73202012b
tsdb/wlog[PERF]: optimize WAL watcher reads (up to 540x less B/op; 13000x less allocs/op) (#18250)
See the detailed analysis https://docs.google.com/document/d/1efVAMcEw7-R_KatHHcobcFBlNsre-DoThVHI8AO2SDQ/edit?tab=t.0

I ran extensive benchmarks using synthetic data as well as real WAL segments pulled from the prombench runs.

All benchmarks are here https://github.com/prometheus/prometheus/compare/bwplotka/wal-reuse?expand=1

* optimization(tsdb/wlog): reuse Ref* buffers across WAL watchers' reads

Signed-off-by: bwplotka <bwplotka@gmail.com>

* optimization(tsdb/wlog): avoid expensive error wraps

Signed-off-by: bwplotka <bwplotka@gmail.com>

* optimization(tsdb/wlog): reuse array for filtering

Signed-off-by: bwplotka <bwplotka@gmail.com>

* fmt

Signed-off-by: bwplotka <bwplotka@gmail.com>

* lint fix

Signed-off-by: bwplotka <bwplotka@gmail.com>

* tsdb/record: add test for clear() on histograms

Signed-off-by: bwplotka <bwplotka@gmail.com>

* updated WriteTo with what's currently expected

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
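
The buffer-reuse and filter-array-reuse items above can be illustrated with a minimal Go sketch. It is not the code from this commit: the `sample` type and the `decodeInto` and `filterInPlace` helpers are hypothetical names used only to show the general pattern of truncating a slice to zero length and appending into the same backing array.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for a decoded WAL record entry.
type sample struct {
	Ref uint64
	T   int64
	V   float64
}

// decodeInto fills buf with n synthetic samples. It truncates buf to length
// zero first, so the backing array is reused when it is already large enough.
func decodeInto(buf []sample, n int, startT int64) []sample {
	buf = buf[:0]
	for i := 0; i < n; i++ {
		buf = append(buf, sample{Ref: uint64(i), T: startT + int64(i), V: float64(i)})
	}
	return buf
}

// filterInPlace keeps samples with T >= minT. The result shares the backing
// array of in, so no new slice is allocated; in must not be used afterwards.
func filterInPlace(in []sample, minT int64) []sample {
	out := in[:0]
	for _, s := range in {
		if s.T >= minT {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	var buf []sample
	for read := 0; read < 3; read++ {
		// The same buffer is passed back in on every read.
		buf = decodeInto(buf, 4, int64(read*10))
		kept := filterInPlace(buf, int64(read*10+2))
		fmt.Println("kept:", len(kept), "capacity:", cap(buf))
	}
}
```

The trade-off of this pattern is aliasing: the filtered result overwrites the input, so callers have to finish with one read before starting the next.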
2026-03-11 09:17:13 +00:00
bwplotka
3678ff9042 tests(tsdb/wlog): Tighten watcher tail tests
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-03-03 13:06:16 +00:00
George Krajcsovits
223f016c44
feat(tsdb): allow using ST capable XOR chunks - retain format on read (#18013)
* feat(tsdb): allow appending to ST capable XOR chunk optionally

Only for float samples as of now. Supports in-order and out-of-order
samples.

Make sure that on readout the ST capable chunks are returned automatically.
When the chunks are returned as is, this is trivially true.
When a chunk needs to be re-coded due to deletion (tombstone) markers,
we take the encoding of the original chunk.
When a chunk needs to be created from overlapping chunks, we observe
whether ST is zero or not and create the new chunk based on that.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2026-02-20 09:15:51 +01:00
bwplotka
5ac1080a60 refactor: sed enableStStorage/enableSTStorage
Signed-off-by: bwplotka <bwplotka@gmail.com>
2026-02-17 11:11:46 +00:00
Owen Williams
b57f5b59b3
tsdb: ST-in-WAL: Counter implementation and benchmarks (#17671)
Initial implementation of https://github.com/prometheus/prometheus/issues/17790.
Only implements ST-per-sample for Counters. Tests and benchmarks updated.

Note: This increases the size of the RefSample object for all users, whether st-per-sample is turned on or not.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-02-12 13:17:50 -05:00
Ben Kochie
e14795bbf4
Remove copyright date from headers (#17785)
Remove copyright dates from various files as part of [PROM-50].

[PROM-50]: https://github.com/prometheus/proposals/blob/main/proposals/0050-remove-copyright-dates.md

Signed-off-by: SuperQ <superq@gmail.com>
2026-01-05 13:46:21 +01:00
Björn Rabenstein
b8d19543b8
Add histogram validation in remote-read and during reducing resolution (#17561)
ReduceResolution is currently called before validation during
ingestion. This will cause a panic if there are not enough buckets in
the histogram. If there are too many buckets, the spurious buckets are
ignored, and therefore the error in the input histogram is masked.

Furthermore, invalid negative offsets might cause problems, too.

Therefore, we need to do some minimal validation in reduceResolution.
Fortunately, it is easy and shouldn't slow things down. Sadly, it
requires returning errors, which triggers a bunch of code changes.
Even here there is a bright side: we can get rid of a few panics.
(Remember: Don't panic!)

In different news, we haven't done a full validation of histograms
read via remote-read. This is not so much a security concern (as you
can throw off Prometheus easily by feeding it bogus data via
remote-read) but more that remote-read sources might be makeshift and
could accidentally create invalid histograms. We really don't want to
panic in that case. So this commit does not only add a check of the
spans and buckets as needed for resolution reduction but also a full
validation during remote-read.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-11-21 00:22:24 +01:00
Ben Kochie
204249fcb5
Update golangci-lint (#17478)
* Update golangci-lint to v2.6.0
* Fixup various linting issues.
* Fixup deprecations.
* Add exception for `labels.MetricName` deprecation.

Signed-off-by: SuperQ <superq@gmail.com>
2025-11-05 13:47:34 +01:00
Ben Kochie
48956f60d7
Update modernize (#17471)
Apply additional Go modernize tool improvements.

Signed-off-by: SuperQ <superq@gmail.com>
2025-11-04 05:13:49 +00:00
Bartlomiej Plotka
a4da440dad
fix: Fix slicelabels corruption when used with proto decoding (#17150)
* fix: Fix slicelabels corruption when used with proto decoding

Alternative to https://github.com/prometheus/prometheus/pull/16957/

Signed-off-by: bwplotka <bwplotka@gmail.com>

* addressed comments

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
2025-10-07 12:06:48 +01:00
George Krajcsovits
35d9f28c87
Update tsdb/record/record.go
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2025-09-24 14:27:37 +02:00
György Krajcsovits
30f941c57c
fix(wal): ignore invalid native histogram schemas on load
Reduce the resolution of histograms as needed and ignore invalid
schemas while emitting a warning log.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-24 11:41:25 +02:00
beorn7
747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency about
what was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Matthieu MOREL
cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
Matthieu MOREL
5fa1146e21
chore: enable gci linter (#16245)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-03-22 15:46:13 +00:00
pudongair
308c8c48c1
chore: fix some comments (#16237)
Signed-off-by: pudongair <744355276@qq.com>
2025-03-19 16:28:34 +01:00
Matthieu MOREL
c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
piguagua
a82f2b8168
chore: fix function name and struct name in comment (#15827)
Signed-off-by: piguagua <piguagua@aliyun.com>
2025-01-17 21:26:08 +01:00
György Krajcsovits
a7ccc8e091 record_test.go: avoid captures, simply return test refs
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-01-02 12:45:20 +01:00
Carrie Edwards
1508149184 Update benchmark test and comment 2024-12-27 09:09:13 -08:00
György Krajcsovits
df88de5800 Fix lint for real
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 12:52:01 +01:00
György Krajcsovits
cf36792e14 Fix unused import
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 12:49:28 +01:00
György Krajcsovits
fdb1516af1 Fix lint errors
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 12:47:43 +01:00
György Krajcsovits
d64d1c4c0a Benchmark encoding classic and nhcb
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-12 10:59:06 +01:00
György Krajcsovits
8f572fe905 fix(lint): linter errors
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-12-10 16:25:20 +01:00
Carrie Edwards
a046417bc0 Use new record type only for NHCB 2024-12-06 13:46:20 -08:00
Carrie Edwards
6b44c1437f Fix comment and histogram record string 2024-12-05 09:21:47 -08:00
Carrie Edwards
6684344026 Rename old histogram record type, use old names for new records 2024-12-05 09:21:47 -08:00
Carrie Edwards
454f6d39ca Add separate handling for histograms and custom bucket histograms 2024-12-05 09:21:47 -08:00
Carrie Edwards
37df50adb9 Attempt for record type 2024-12-05 09:21:47 -08:00
Carrie Edwards
cfcd51538d Remove references to custom values record 2024-12-05 09:21:47 -08:00
Carrie Edwards
6d413fad36 Use histogram records for custom value handling 2024-12-05 09:21:47 -08:00
Carrie Edwards
aa144b7263 Handle custom buckets in WAL and WBL 2024-12-05 09:21:47 -08:00
Nathan Baulch
50cd453c8f
chore: Fix typos (#14868)
* Fix typos

---------

Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com>
2024-09-10 22:32:03 +02:00
Arve Knudsen
e410a215fb Fix a couple of comments
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-07-05 15:25:42 +02:00
Arve Knudsen
d699dc3c77
Fix language in docs and comments (#14041)
Fix language in docs and comments

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-05-08 17:57:09 +02:00
Bryan Boreham
925134e6de tsdb tests: make work with labels SymbolTable
Need to initialize decoders with SymbolTable.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham
93b72ec5dd tsdb: create SymbolTables for labels as required
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham
c0e36e6bb3 Standardise exemplar label as "trace_id"
This is consistent with the OpenTelemetry standard, and an example in OpenMetrics.

https://github.com/open-telemetry/opentelemetry-specification/blob/89aa01348139/specification/metrics/data-model.md#exemplars
https://github.com/OpenObservability/OpenMetrics/blob/138654493130/specification/OpenMetrics.md#exemplars-1

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-15 14:20:08 +00:00
Bryan Boreham
39af788dbd Tests: use replacement DeepEquals using go-cmp
Use DeepEqual replacement using go-cmp, which is more flexible.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-08 19:30:20 +00:00
Bryan Boreham
8065bef172 Move metric type definitions to common/model
They are used in multiple repos, so common is a better place for them.
Several packages now don't depend on `model/textparse`, e.g.
`storage/remote`.

Also remove `metadata` struct from `api.go`, since it was identical to
a struct in the `metadata` package.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-12-19 18:56:54 +00:00
Filip Petkovski
10a82f87fd
Enable reusing memory when converting between histogram types
The 'ToFloat' method on integer histograms currently allocates new memory
each time it is called.

This commit adds an optional *FloatHistogram parameter that can be used
to reuse span and bucket slices. It is up to the caller to make sure the
input float histogram is not used anymore after the call.
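
A minimal sketch of the optional-reuse pattern described above, using hypothetical simplified types rather than the real histogram structs. The `intHist` and `floatHist` types and the `toFloat` method are illustrative assumptions; only the idea of passing an optional target whose slices are reused comes from the commit message.

```go
package main

import "fmt"

// intHist and floatHist are hypothetical, stripped-down histogram types.
type intHist struct {
	Count   uint64
	Buckets []int64
}

type floatHist struct {
	Count   float64
	Buckets []float64
}

// toFloat converts h to a float histogram. If fh is nil, a new value is
// allocated. Otherwise fh is overwritten and its Buckets slice is reused
// when it has enough capacity; the caller must not rely on fh's old contents.
func (h *intHist) toFloat(fh *floatHist) *floatHist {
	if fh == nil {
		fh = &floatHist{}
	}
	fh.Count = float64(h.Count)
	if cap(fh.Buckets) >= len(h.Buckets) {
		fh.Buckets = fh.Buckets[:len(h.Buckets)]
	} else {
		fh.Buckets = make([]float64, len(h.Buckets))
	}
	for i, b := range h.Buckets {
		fh.Buckets[i] = float64(b)
	}
	return fh
}

func main() {
	a := &intHist{Count: 3, Buckets: []int64{1, 2}}
	b := &intHist{Count: 1, Buckets: []int64{1}}

	fh := a.toFloat(nil) // first call allocates
	fmt.Println(fh.Count, fh.Buckets)

	fh = b.toFloat(fh) // second call reuses fh and its bucket slice
	fmt.Println(fh.Count, fh.Buckets)
}
```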

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2023-12-08 10:22:59 +01:00
Matthieu MOREL
9c4782f1cc
golangci-lint: enable testifylint linter (#13254)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-07 11:35:01 +00:00
Bryan Boreham
1bfb3ed062
Labels: reduce allocations when creating from TSDB WAL (#13044)
* Labels: reduce allocations when creating from TSDB

When reading the WAL, by passing references into the buffer we can avoid
copying strings under `-tags stringlabels`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-11-14 11:36:35 +00:00
Matthieu MOREL
469e415d09
Update record.go
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-11-11 21:01:24 +01:00
Matthieu MOREL
69c07ec6ae
Update record_test.go
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-11-11 20:57:42 +01:00
Matthieu MOREL
63691d82a5
tsdb/record: use Go standard errors package
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-11-11 20:52:49 +01:00
Bryan Boreham
26fa2e8356 TSDB: Pre-size buffer to read samples from WAL
When reading the WAL this method is called with buffers from a pool, on
multiple goroutines. Pre-allocating sufficient size avoids slow growth
and many reallocations in `append`.
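
The pre-sizing idea can be sketched as below, assuming the decoder can estimate the number of samples before it starts appending. The `refSample` type and `decodeSamples` function are hypothetical and only illustrate growing the buffer once up front instead of repeatedly inside `append`.

```go
package main

import "fmt"

// refSample is a hypothetical stand-in for a decoded WAL sample.
type refSample struct {
	T int64
	V float64
}

// decodeSamples appends n synthetic samples into buf. If buf (for example
// one taken from a pool) is too small, it is replaced by a slice with
// capacity n in a single allocation, so append never has to grow it.
func decodeSamples(n int, buf []refSample) []refSample {
	buf = buf[:0]
	if cap(buf) < n {
		buf = make([]refSample, 0, n)
	}
	for i := 0; i < n; i++ {
		buf = append(buf, refSample{T: int64(i), V: float64(i)})
	}
	return buf
}

func main() {
	out := decodeSamples(1000, nil)
	fmt.Println(len(out), cap(out))

	// A smaller record decoded into the same buffer needs no allocation.
	out = decodeSamples(10, out)
	fmt.Println(len(out), cap(out))
}
```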

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-10-17 17:31:26 +00:00
Marc Tudurí
4851ced266
tsdb: Support native histograms in snapshot on shutdown (#12258)
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
2023-07-05 11:44:13 +02:00