Commit graph

13054 commits

Vitah Lin
2432f55278
Fix CI Codecov v6 coverage upload configuration (#15147)
PR https://github.com/redis/redis/pull/14937 updates the Codecov
workflow configuration for `codecov/codecov-action` v6.

The action no longer accepts the singular `file` input, so this switches
to `files` to ensure `./src/redis.info` is uploaded correctly.
2026-04-30 21:38:25 +08:00
Vitah Lin
417cc6e4fc
test: stabilize HOTKEYS MULTI/EXEC test by increasing iteration count (#15129)
## Problem

The test `HOTKEYS - commands inside MULTI/EXEC` in
`tests/unit/hotkeys.tcl` is flaky on fast hardware. This PR raises its
inner loop count from 7 to 30 to make `key2` reliably appear in the CPU
top-K.

Failed CI:
https://github.com/redis/redis/actions/runs/25051455424/job/73380034469?pr=15128

Inside `MULTI`/`EXEC`, each queued command's per-command CPU time is
recorded as `c->duration = ustime() - call_timer` (microseconds,
integer). Very fast commands such as `SET` against a small value can
complete in less than 1 µs and therefore be measured as `0`.

`hotkeyStatsUpdateCurrentCmd` then forwards that zero duration as the
weight to `chkTopKUpdate`, which has an explicit early return on `weight
== 0`:

```c
sds chkTopKUpdate(chkTopK *topk, char *item, int itemlen, counter_t weight) {
    if (weight == 0) return NULL;
    ...
}
```

In the original test, `key2` is `SET` only 7 times inside the
transaction. On fast hosts (the failure was observed on an ARM box with
`ustime()` ticking at 1 µs resolution) it is possible for all 7 calls to
be measured as 0 µs, which means `key2` is never inserted into the CPU
top-K and the assertion

```tcl
assert [dict exists $cpu_result $key2]
```

fails. `key1` has 21 calls and is statistically safe.

The author already anticipated this and left a comment ("Send multiple
commands to avoid <1us cpu for $key2"), but 7 iterations turned out to
be insufficient.

## Changes

Bump the iteration count from 7 to 30. With `key2` now `SET` 30 times
the probability of every single call being measured as 0 µs becomes
negligible on any realistic hardware.
2026-04-30 16:15:11 +03:00
Shubham S Taple
0bbb196c46
Fix sharded pubsub unsubscribe lookup using cached command slot (#15094)
Fixes #15085 

## Problem
getKeySlot() may return `server.current_client->slot` while a command is
executing instead of computing the slot from the provided string.

The unsubscribe can be triggered by another client, in which case `server.current_client` is not the client being unsubscribed, so `getKeySlot()` would return that other client's cached slot. Using this wrong slot makes the lookup in `type.serverPubSubChannels` miss the channel and ultimately triggers the assertion below.

## Fix
Always use keyHashSlot() instead of getKeySlot() on unsubscribe.
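Unlike `getKeySlot()`, `keyHashSlot()` always computes the slot from the channel string itself. A minimal sketch of its hashtag rule (the hash function here is a stand-in; the real implementation uses CRC16-CCITT modulo 16384):

```c
#include <string.h>

/* Stand-in hash; Redis uses CRC16-CCITT. The hashtag logic is the
 * point of this sketch, not the hash itself. */
static unsigned hash_stub(const char *s, size_t len) {
    unsigned h = 5381;
    while (len--) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Sketch of keyHashSlot()'s rule: if the key contains a non-empty
 * "{...}" section, only that section is hashed, so "foo{user}bar" and
 * "{user}" land in the same slot. */
unsigned key_hash_slot(const char *key, size_t keylen) {
    const char *open = memchr(key, '{', keylen);
    if (open) {
        size_t start = (size_t)(open - key) + 1;
        const char *close = memchr(key + start, '}', keylen - start);
        size_t taglen = close ? (size_t)(close - (key + start)) : 0;
        if (taglen > 0)                 /* empty "{}" hashes the whole key */
            return hash_stub(key + start, taglen) & 16383;
    }
    return hash_stub(key, keylen) & 16383;
}
```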

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2026-04-29 22:04:06 +08:00
Filipe Oliveira (Redis)
48eaa75257
Add batched prefetch for HGETALL on hashtable encoding (#14988)
## Summary

Add a batched prefetch fast path for `HGETALL` on hashtable-encoded
hashes. When iterating large hash tables, pointer chasing through
scattered heap allocations (`dictEntry` → `Entry` → value SDS) causes
cache misses that dominate CPU time (~10% flat in `dictNext`).

The new path collects dict entries in batches of configured batch size,
issues software prefetches for the `Entry` structs and their value SDS
data, then emits replies while the data is cache-warm. This hides memory
latency by overlapping prefetch with reply generation.
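The two-pass batch pattern described above can be sketched as follows (illustrative names and types, not the actual Redis symbols — the real path walks `dictEntry` → `Entry` → value SDS):

```c
#include <stddef.h>

#define PREFETCH_BATCH 16

typedef struct { long field_len; long value_len; } hentry;

/* Sketch of the batched-prefetch pattern: pass 1 issues software
 * prefetches for every entry in the batch, pass 2 reads the entries
 * while they are cache-warm, overlapping memory latency with work. */
long sum_entries_batched(hentry **entries, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i += PREFETCH_BATCH) {
        size_t end = i + PREFETCH_BATCH < n ? i + PREFETCH_BATCH : n;
        for (size_t j = i; j < end; j++)        /* pass 1: warm the cache */
            __builtin_prefetch(entries[j], 0, 1);
        for (size_t j = i; j < end; j++)        /* pass 2: touch warm data */
            total += entries[j]->field_len + entries[j]->value_len;
    }
    return total;
}
```

The payoff comes from issuing many independent prefetches before any dependent load, so the memory system can overlap the misses instead of serializing them through pointer chasing.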
2026-04-29 16:09:51 +08:00
Shubham S Taple
247307de96
Pass context to RM_GetUserUsername() to support auto memory management (#15042)
Following #14890

## Problem
RM_GetUserUsername() documents that the returned RedisModuleString can
be freed via automatic memory management, but it always creates the
string with ctx=NULL so it cannot be tracked by RedisModule_AutoMemory.
Modules following the documentation may leak memory.

## Fix
Fixes `RedisModule_GetUserUsername` to accept a `RedisModuleCtx *` and create the returned `RedisModuleString` with that context, allowing RedisModule auto-memory management to track/free it as documented.
2026-04-28 16:45:31 +08:00
Moti Cohen
5a05863e97
t_string: rewrite SET GET propagation in place (#15114)
Optimize SET key value GET propagation rewriting in setGenericCommand() 
by removing GET arguments in-place with rewriteClientCommandArgument(). 
This avoids the overhead of allocating a new argv vector and 
incrementing reference counts for every retained argument.

The optimization is scoped to the no-expire SET ... GET rewrite path. 
It also adds test coverage for cases with repeated GET tokens to 
ensure robust string semantics and consistent replication behavior.

Changes:
- Use rewriteClientCommandArgument(c, j, NULL) for in-place removal.
- Eliminate redundant argv allocations and refcount increments.
- Improve performance of SET GET in high-throughput write streams.
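The in-place removal can be sketched generically: instead of allocating a fresh argv and re-incrementing every retained refcount, shift the tail of the vector down one slot (the decref of the removed element is elided here; names are illustrative, not the actual client structure):

```c
#include <stddef.h>
#include <string.h>

/* Sketch of in-place argv removal in the spirit of
 * rewriteClientCommandArgument(c, j, NULL): drop slot j by shifting
 * the tail down, returning the new argument count. The real code also
 * decrements the removed object's refcount. */
size_t remove_arg(const char **argv, size_t argc, size_t j) {
    if (j >= argc) return argc;
    memmove(&argv[j], &argv[j + 1], (argc - j - 1) * sizeof(argv[0]));
    return argc - 1;
}
```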
2026-04-28 10:05:59 +03:00
Raj Kripal Danday
625b6f58f6
tracking: fix self-overlap returning non-zero loop index (#15073)
Fixes checkPrefixCollisionsOrReply() to return 0 (failure) on any provided-prefix self-overlap, instead of accidentally returning a non-zero loop index for overlaps found after the first prefix.

Signed-off-by: Raj Danday <rajkripal.danday@gmail.com>
2026-04-28 09:24:47 +08:00
Curtis Means
861917603e
Update SECURITY.md vulnerability reporting instructions (#15089)
Made-with: Cursor

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Documentation-only change; no code or runtime behavior is affected,
but it changes the official intake channels for vulnerability reports.
> 
> **Overview**
> Updates `SECURITY.md` to redirect vulnerability reporters from
emailing the core team to using the **Redis Vulnerability Disclosure
Program** link, with GitHub’s *Report a Vulnerability* as an
alternative.
> 
> Adds a dedicated security contact email (`security@redis.com`) for
questions and includes brief rationale for the new reporting path.
> 
<!-- /CURSOR_SUMMARY -->
2026-04-27 19:03:31 +03:00
Filipe Oliveira (Redis)
77628f370a
perf+security: drop SCAN vector pre-allocation; rely on grow-by-doubling (#15118)
## Summary
Follow-up to #15065. The merged code calls `vecReserve(&keys, count)`
where `count` is user-supplied. A client can pass a giant `COUNT` (e.g.
`HSCAN k 0 COUNT 10000000000000`) and the server pre-allocates the
corresponding pointer slots before any work happens — ~80 TB on a 64-bit
build. Pre-reserve DoS surface flagged in code review.

## Fix
Drop the pre-reserve entirely. The vec already starts on a 256-pointer
stack buffer and grows-by-doubling driven by **actual cardinality** of
the dictionary, not by user-supplied `COUNT`.

## Why drop the pre-reserve (vs cap it)
The pre-reserve doesn't buy measurable performance — `vecPush()`'s grow-by-doubling path is amortized O(1), and the dominant cost on SCAN workloads is the per-entry callback work, not vector growth.
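The growth discipline being relied on here can be sketched as a stack-backed, grow-by-doubling pointer vector (illustrative names, not the actual `vec` API): capacity is driven only by the number of pushes, never by a user-supplied hint.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    void **data;
    size_t len, cap;
    int heap;            /* 0 while still using the caller's stack buffer */
} pvec;

void pvec_init(pvec *v, void **stackbuf, size_t stackcap) {
    v->data = stackbuf; v->len = 0; v->cap = stackcap; v->heap = 0;
}

void pvec_push(pvec *v, void *p) {
    if (v->len == v->cap) {                    /* amortized O(1) growth */
        size_t ncap = v->cap ? v->cap * 2 : 8;
        if (v->heap) {
            v->data = realloc(v->data, ncap * sizeof(void *));
        } else {                               /* first spill off the stack */
            void **h = malloc(ncap * sizeof(void *));
            memcpy(h, v->data, v->len * sizeof(void *));
            v->data = h; v->heap = 1;
        }
        v->cap = ncap;
    }
    v->data[v->len++] = p;
}

void pvec_release(pvec *v) { if (v->heap) free(v->data); }
```

Because no capacity is reserved up front, a giant `COUNT` costs nothing until actual entries are collected, which is exactly why dropping the pre-reserve closes the DoS surface.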
2026-04-27 23:13:14 +08:00
Filipe Oliveira (Redis)
7a80aade96
perf: replace list with vec in scanGenericCommand key collection (#15065)
## Motivation

With the append-only pointer vector (`vec`) introduced in #15039, the
SCAN keys collection path is a natural consumer: `scanCallback` pushes
each key to a `list` via `listAddNodeTail`, which allocates a `listNode`
(~48 bytes + jemalloc overhead) per key. `scanGenericCommand` then
iterates the list once to emit replies and frees every node plus the
list itself. For `SCAN COUNT 500`, that is ~500 node allocations + frees
on the hot path of a single command.

This PR replaces that with `vec` — a 256-element stack buffer covers
typical `COUNT` values without any heap allocation, and larger scans
grow to a single heap allocation via `vecPush`.

## Change

Single-file diff in `src/db.c` — ~30 touched lines, net +6 LOC:

- `scanData.keys`: `list *` → `vec *`
- `scanCallback`: `listAddNodeTail(keys, key)` → `vecPush(keys, key)`
- `scanGenericCommand`:
  - `listCreate()` → stack-backed `vec` with 256-element `keys_stack[]`
  - Reply loop: `listFirst / listNodeValue / listDelNode` → `vecGet` index loop
- The old `listSetFreeMethod(keys, sdsfreegeneric)` was only active for the
  ZSET path (which allocates temporary sds for scores) and listpack paths; we
  track that via a `free_collected` flag and do an explicit `sdsfree` loop
  before `vecRelease`. The listpack early-return paths (OBJ_SET, listpack,
  listpack_ex) call `vecRelease(&keys)` directly since they never called the
  callback.
- `#include "vector.h"` added

No algorithmic changes — SCAN cursor iteration, pattern matching, expiry
filtering, type filtering and reply formatting are unchanged.

## Benchmarks

Run via `redis-benchmarks-specification` on `x86-aws-m7i.metal-24xl`
(Intel Sapphire Rapids) and `arm-aws-m8g.metal-24xl` (Neoverse-V2
Graviton4). `unstable` baseline is `n=5`; PR is `n=2-3` on commit
`56458ce42` (the first push of this branch — the rebased commit
`6e4aff26f` is a no-op rebase over upstream, identical tree).

### x86-aws-m7i.metal-24xl

| Test | unstable | PR | Δ |
|------|---------:|---:|--:|
| `memtier_benchmark-1Mkeys-generic-scan-count-10-incremental-iteration` | 176,929 (n=5) | 181,185 (n=3) | **+2.4%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-10-incremental-iteration-high-cursor-count` | 157,405 (n=5) | 164,025 (n=3) | **+4.2%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-50-incremental-iteration` | 99,770 (n=5) | 110,862 (n=2) | **+11.1%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-100-incremental-iteration` | 61,722 (n=5) | 71,445 (n=3) | **+15.8%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-500-incremental-iteration` | 18,994 (n=5) | 22,594 (n=2) | **+19.0%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-500-pipeline-10` | 25,677 (n=5) | 35,442 (n=2) | **+38.0%** |
| `memtier_benchmark-1Mkeys-generic-scan-pipeline-10` | 824,033 (n=5) | 920,415 (n=2) | **+11.7%** |
| `memtier_benchmark-1Mkeys-generic-scan-type-pipeline-10` | 764,420 (n=5) | 852,255 (n=2) | **+11.5%** |
| `memtier_benchmark-1Mkeys-generic-scan-cursor-count-500-pipeline-10` | 15,264 (n=5) | 19,688 (n=2) | **+29.0%** |
| `memtier_benchmark-1Mkeys-generic-scan-cursor-pipeline-10` | 491,250 (n=5) | 564,721 (n=2) | **+15.0%** |

### arm-aws-m8g.metal-24xl

| Test | unstable | PR | Δ |
|------|---------:|---:|--:|
| `memtier_benchmark-1Mkeys-generic-scan-count-10-incremental-iteration` | 195,917 (n=5) | 204,520 (n=3) | **+4.4%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-10-incremental-iteration-high-cursor-count` | 177,644 (n=5) | 182,682 (n=3) | **+2.8%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-50-incremental-iteration` | 103,337 (n=5) | 118,119 (n=2) | **+14.3%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-100-incremental-iteration` | 66,199 (n=5) | 77,436 (n=3) | **+17.0%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-500-incremental-iteration` | 18,869 (n=5) | 21,790 (n=2) | **+15.5%** |
| `memtier_benchmark-1Mkeys-generic-scan-count-500-pipeline-10` | 27,621 (n=5) | 38,585 (n=2) | **+39.7%** |
| `memtier_benchmark-1Mkeys-generic-scan-pipeline-10` | 789,621 (n=5) | 893,041 (n=2) | **+13.1%** |
| `memtier_benchmark-1Mkeys-generic-scan-type-pipeline-10` | 725,833 (n=5) | 878,881 (n=2) | **+21.1%** |
| `memtier_benchmark-1Mkeys-generic-scan-cursor-count-500-pipeline-10` | 11,061 (n=5) | 13,996 (n=2) | **+26.5%** |
| `memtier_benchmark-1Mkeys-generic-scan-cursor-pipeline-10` | 411,119 (n=5) | 483,889 (n=2) | **+17.7%** |

Pattern is consistent across both architectures: gains scale with
`COUNT` (more keys collected per call → more `listNode` allocations
avoided). The ~+40% peak on `count-500-pipeline-10` is where the
per-call allocator overhead dominated the previous implementation.

No test regresses. Every delta is positive.

## Tests

- `./runtest --single unit/scan` — 0 exceptions
- `./runtest --single unit/type/hash` — 0 exceptions (exercises HSCAN
path)

## References

- **#15039** — @moticless introduced `vec` (append-only pointer vector
with optional stack-backed storage).
- **#14958** (Subkey notification for hash fields, @ShooterIT) is the
first in-tree consumer of `vec`; this PR is a second, small consumer.
2026-04-27 14:16:36 +03:00
sggeorgiev
c61099eaa0
Replace recursive rax tree freeing with iterative traversal (#15103)
`raxRecursiveFree` and `raxRecursiveFreeWithCtx` used C call-stack
recursion to walk the entire radix tree during `raxFree`. On trees with
pathologically deep paths (long keys with no shared prefixes) this could
overflow the thread stack and crash the process.

This PR replaces both recursive functions with a single unified
iterative helper (`raxFreeNodesWithCallback`) that maintains an explicit
heap-allocated `raxStack` — the same stack structure already used
elsewhere in the rax code (e.g. `raxIterator`). The helper accepts both
callback variants (with and without a user-supplied context) so the two
public entry points `raxFreeWithCallback` and `raxFreeWithCbAndContext`
now both delegate to it. Child pointers are now enumerated forward from
`raxNodeFirstChildPtr` instead of backward from `raxNodeLastChildPtr`,
which is simpler and consistent with how the rest of the codebase
traverses children. No functional change: every node is still visited
exactly once, its optional data callback is still invoked before the
node is freed, and `rax->numnodes` is decremented identically.
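The recursion-to-iteration rewrite can be sketched on a simple n-ary tree (illustrative node layout, not the actual `raxNode`): an explicit heap-allocated stack replaces the C call stack, so arbitrarily deep trees cannot overflow the thread stack.

```c
#include <stdlib.h>

typedef struct tnode {
    struct tnode **child;
    size_t nchild;
} tnode;

tnode *tnode_new(size_t nchild) {
    tnode *n = malloc(sizeof(*n));
    n->nchild = nchild;
    n->child = nchild ? malloc(nchild * sizeof(tnode *)) : NULL;
    return n;
}

/* Free every node exactly once using an explicit stack instead of
 * recursion; returns the number of nodes freed. The real helper also
 * invokes an optional data callback just before each free. */
size_t tree_free_iterative(tnode *root) {
    if (!root) return 0;
    size_t cap = 64, top = 0, freed = 0;
    tnode **stack = malloc(cap * sizeof(tnode *));
    stack[top++] = root;
    while (top) {
        tnode *n = stack[--top];
        for (size_t i = 0; i < n->nchild; i++) {
            if (top == cap)
                stack = realloc(stack, (cap *= 2) * sizeof(tnode *));
            stack[top++] = n->child[i];
        }
        free(n->child);
        free(n);
        freed++;
    }
    free(stack);
    return freed;
}
```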
2026-04-27 08:47:43 +03:00
sggeorgiev
47c51369ee
Reject corrupt stream RDB with shared NACK across consumers (#15081)
**Summary**

Detects and rejects corrupt stream RDB payloads where the same NACK
(pending entry) is referenced by more than one consumer, which violates
a stream data-structure invariant.

**Changes**

- **`rdbLoadObject` (stream consumer PEL loading)**: Added a guard that
checks `nack->consumer != NULL` before assigning the consumer pointer.
When a second consumer's PEL references a NACK that was already claimed
by a prior consumer, the loader now reports a corrupt RDB error and
aborts instead of silently overwriting the pointer. Without this check,
two consumers share the same `streamNACK`, and freeing the first
consumer's PEL leaves the second with a dangling pointer.
- **`corrupt-dump.tcl`**: Added a regression test that crafts a stream
with two consumers (`consumerA`, `consumerB`) whose PELs both reference
the same entry (`1-0`). The `RESTORE` command is expected to fail with
`"Bad data format"`, and the server must remain responsive (`PING`
succeeds).

**Benefits**

- **Fail-fast on corrupt data**: The invariant violation is caught at
load time with a clear diagnostic message rather than manifesting as a
crash later during normal operation.
- **Regression coverage**: The crafted payload in the test ensures this
class of corruption is permanently guarded against.
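The guard's logic can be sketched with illustrative types (the real check lives in `rdbLoadObject` against `streamNACK`): once a NACK has an owner, a second claim is a corruption, not an overwrite.

```c
#include <stddef.h>

typedef struct { const char *name; } consumer;
typedef struct { consumer *owner; } stream_nack;

/* Sketch of the new load-time guard: return 0 on success, -1 when the
 * NACK is already claimed. Silently overwriting `owner` is what
 * previously left one consumer's PEL with a dangling pointer after the
 * other consumer's PEL was freed. */
int claim_nack(stream_nack *nack, consumer *c) {
    if (nack->owner != NULL) return -1;   /* corrupt: shared NACK */
    nack->owner = c;
    return 0;
}
```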
2026-04-23 15:46:48 +03:00
Vitah Lin
fafc47251a
Fix signed integer overflow in scan count parameter (#14982)
### Problem 
In `scanGenericCommand`, `maxiterations = count * 10` overflows when
`count > LONG_MAX / 10`, causing undefined behavior.

### Changed 
1. Use saturating arithmetic to prevent overflow.
2. Added a test to trigger the overflow path, detectable by UBSan.
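The saturating computation can be sketched as follows (a sketch of the idea, assuming the clamp target is `LONG_MAX`; the real code may differ in detail):

```c
#include <limits.h>

/* Compute maxiterations = count * 10 without signed overflow: check
 * the bound *before* multiplying and saturate at LONG_MAX instead of
 * invoking undefined behavior. */
long sat_mul10(long count) {
    return count > LONG_MAX / 10 ? LONG_MAX : count * 10;
}
```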
2026-04-23 17:38:42 +08:00
Darsheel Rathore
303667a40c
Fix use-after-free in RM_RegisterClusterMessageReceiver() (#15059)
RM_RegisterClusterMessageReceiver() unlinks a receiver node from the
clusterReceivers[type] linked list when the callback is set to NULL, but
when removing the head node (prev == NULL), the code updates
clusterReceivers[type]->next instead of clusterReceivers[type] itself.

This leaves clusterReceivers[type] pointing to the freed node, so any
later traversal through clusterReceivers[type] dereferences a dangling
pointer.

Fix by updating clusterReceivers[type] directly when prev == NULL.
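The corrected unlink can be sketched on a minimal list type (illustrative, not the actual module-receiver structure): removing the head must update the head pointer itself, not `head->next`.

```c
#include <stddef.h>

typedef struct recv { struct recv *next; } recv;

/* Unlink `node` from the singly linked list rooted at *head.
 * When prev == NULL the node *is* the head, so the head pointer
 * itself is updated — the bug was writing (*head)->next instead,
 * leaving *head pointing at the freed node. */
void unlink_node(recv **head, recv *prev, recv *node) {
    if (prev == NULL)
        *head = node->next;
    else
        prev->next = node->next;
    /* caller frees `node` here */
}
```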

Fixes #15057

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2026-04-23 16:44:36 +08:00
sggeorgiev
63f02e7876
Fix double ERR prefix in XNACK error replies (#15091)
Several `addReplyError` and `addReplyErrorFormat` calls in
`xnackCommand` included a redundant `"ERR "` prefix in the message
string. Since `addReplyErrorLength` already prepends `-ERR ` to the RESP
reply, clients received `ERR ERR ...` for these error paths.

This PR removes the redundant prefix from all five affected calls and
tightens the corresponding test patterns to match from the beginning of
the error message (`"ERR ..."` instead of `"*...*"`), so any future
double-prefix regression will be caught.
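A minimal illustration of why the redundant prefix doubled up (simplified reply writer in the spirit of `addReplyErrorLength`, not the actual implementation):

```c
#include <stdio.h>

/* Simplified error-reply writer: if the message does not already carry
 * a "-CODE" prefix, "-ERR " is prepended unconditionally. A message
 * string that itself begins with "ERR " therefore yields "ERR ERR". */
void format_error(char *out, size_t cap, const char *msg) {
    if (msg[0] == '-')
        snprintf(out, cap, "%s\r\n", msg);
    else
        snprintf(out, cap, "-ERR %s\r\n", msg);
}
```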
2026-04-22 09:12:04 +03:00
Filipe Oliveira (Redis)
0fa78fd8fd
perf: widen fast_float_strtod fast path to 17-19 digit mantissas (#15061)
## Root cause

Roughly 50% of random double scores generated by the ZADD listpack
workload have 17-19 significant digits, which exceed
`MAX_MANTISSA_FAST_PATH` (`2^53`). These inputs fall through to the
`strtod()` fallback:

```c
char static_buf[128];
memcpy(static_buf, nptr, len);            /* copy the input back! */
static_buf[len] = '\0';                   /* null-terminate */
double result = strtod(static_buf, ...);  /* glibc strtod — ~10× slower on ARM */
```

The original C++ `fast_float` library handled the same 17-19 digit
inputs with Eisel-Lemire / bigint arithmetic without falling back to
`strtod()`. That is what the pure-C replacement lost.

## Fix

Compute `mantissa * 10^exponent` in 128-bit integer arithmetic using
`__uint128_t`, then convert to double with a single IEEE
round-to-nearest-even cast. Supported for `|exp| in [0, 19]` where
`10^|exp|` fits in `uint64`; cases outside that range (or otherwise
outside the fast path's preconditions) still fall through to `strtod()`.
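The core of the widened path can be sketched for the non-negative exponent case (a sketch under the stated preconditions — `0 <= exp <= 19` so `10^exp` fits in `uint64` — not the full implementation, which also handles negative exponents):

```c
#include <stdint.h>

static const uint64_t pow10_u64[20] = {
    1ULL, 10ULL, 100ULL, 1000ULL, 10000ULL, 100000ULL, 1000000ULL,
    10000000ULL, 100000000ULL, 1000000000ULL, 10000000000ULL,
    100000000000ULL, 1000000000000ULL, 10000000000000ULL,
    100000000000000ULL, 1000000000000000ULL, 10000000000000000ULL,
    100000000000000000ULL, 1000000000000000000ULL,
    10000000000000000000ULL};

/* mantissa * 10^exp computed exactly in 128-bit integer arithmetic;
 * the final cast is a single IEEE round-to-nearest-even conversion,
 * so no double rounding occurs. Requires 0 <= exp <= 19. */
double mul_pow10(uint64_t mantissa, int exp) {
    __uint128_t v = (__uint128_t)mantissa * pow10_u64[exp];
    return (double)v;
}
```

The point of the single cast is correctness: multiplying two doubles would round twice, while the 128-bit product is exact and rounds once at conversion.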

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2026-04-20 20:45:49 +08:00
Omer Shadmi
58dc4f3c85
Update RediSearch to 8.8 RC1 (v8.7.90) (#15072)
Update RediSearch module version to 8.8 RC1 (v8.7.90)


Made with [Cursor](https://cursor.com)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk: a single version bump that changes which RediSearch git tag
is cloned/built; main risk is build/runtime incompatibility from the
upstream RC update.
> 
> **Overview**
> Updates the RediSearch module build configuration to fetch and build
upstream `redisearch` tag `v8.7.90` (8.8 RC1) instead of `v8.5.90`.
> 
<!-- /CURSOR_SUMMARY -->
2026-04-19 10:49:34 +03:00
charsyam
8677971360
Remove unnecessary -ERR and \r\n for addReplyErrorFormat in extractLongLatOrReply() (#14995)
In addReplyErrorLength and addReplyErrorFormatInternal, `-ERR` is
automatically prepended if the message doesn’t start with `-`, so the
initial `-ERR` is unnecessary. Also, trailing `\r\n` will be trimmed, so
it doesn’t need to be included.

---------

Signed-off-by: charsyam <charsyam@naver.com>
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
2026-04-17 17:28:13 +08:00
Vitah Lin
8aeea8c210
Increase threshold for HPEXPIRETIME persists after RDB reload test (#15047) 2026-04-17 16:36:02 +08:00
Vitah Lin
15cb40dac2
Fix command-docs and corrupt-dump-fuzzer of OBJ_GCRA type (#15055)
### Problem 

While the new type `OBJ_GCRA` was added, several related code paths were
not updated accordingly, leading to failures in the
`reply-schemas-validator` CI job and `corrupt-dump-fuzzer.tcl`

##### reply-schemas-validator

Failed CI:
https://github.com/redis/redis/actions/runs/24485248057/job/71558533290#step:10:903
```shell
Traceback (most recent call last):
  File "/home/runner/work/redis/redis/./utils/req-res-log-validator.py", line 238, in process_file
    jsonschema.validate(instance=res.json, schema=req.schema, cls=schema_validator)
  File "/home/runner/.local/lib/python3.12/site-packages/jsonschema/validators.py", line 1121, in validate
    raise error
jsonschema.exceptions.ValidationError: 'rate_limit' is not valid under any of the given schemas

Failed validating 'oneOf' in schema['patternProperties']['^.*$']['properties']['group']:
    {'description': 'the functional group to which the command belongs',
     'oneOf': [{'const': 'bitmap'},
               {'const': 'cluster'},
               {'const': 'connection'},
               {'const': 'generic'},
               {'const': 'geo'},
               {'const': 'hash'},
               {'const': 'hyperloglog'},
               {'const': 'list'},
               {'const': 'module'},
               {'const': 'pubsub'},
               {'const': 'scripting'},
               {'const': 'sentinel'},
               {'const': 'server'},
               {'const': 'set'},
               {'const': 'sorted-set'},
               {'const': 'stream'},
               {'const': 'string'},
               {'const': 'transactions'}]}

On instance['gcrasetvalue']['group']:
    'rate_limit'
```


##### `corrupt-dump-fuzzer.tcl`

Also fixed `: Fuzzer corrupt restore payloads - sanitize_dump: yes in
tests/integration/corrupt-dump-fuzzer.tcl`

Failed daily test :
https://github.com/redis/redis/actions/runs/24485248057/job/71558533312#step:6:8652
```shell
Server crashed (by signal: 0, err: key "gcra" not known in dictionary), with payload: "\x1C\x0A\x02\x5F\x37\xC0\x06\xC0\x00\x02\x5F\x39\xC0\x08\x02\x5F\x33\x02\x5F\x35\x02\x5F\x31\xC0\x02\xC0\x04\x0E\x00\xA9\x71\xBF\xEE\x6F\x46\xEF\xA6"
violating commands:
Done 1434 cycles in 600 seconds.
RESTORE: successful: 601, rejected: 833
Total commands sent in traffic: 1194776, crashes during traffic: 1 (0 by signal).
[: Fuzzer corrupt restore payloads - sanitize_dump: yes in tests/integration/corrupt-dump-fuzzer.tcl
Expected '1' to be equal to '0' (context: type eval line 155 cmd {assert_equal $stat_terminated_in_traffic 0} proc ::test)
[147/147 done]: integration/corrupt-dump-fuzzer (1201 seconds)
```

### Changed

This change completes the necessary updates across all relevant
components to ensure consistent handling of the `rate_limit` group and
restores CI stability.
2026-04-17 10:30:43 +03:00
Yuan Wang
4757561861
Subkey notification for hash fields (#14958)
## Motivation

Redis's existing keyspace notification system operates at the **key
level** only — when a hash field is modified via `HSET`, `HDEL`, or
`HEXPIRE`, the subscriber receives the key name and the event type, but
not **which fields** were affected. As a result, these notifications
have very little practical value.
This PR introduces a subkey notification system that extends keyspace
events to include field-level (subkey) details for hash operations,
through both Pub/Sub channels and the Module API.

## New Pub/Sub Notification Channels

Four new channels are added:

| Channel Format | Payload |
|---------------|---------|
| `__subkeyspace@<db>__:<key>` | `<event>\|<len>:<subkey>[,...]` |
| `__subkeyevent@<db>__:<event>` | `<key_len>:<key>\|<len>:<subkey>[,...]` |
| `__subkeyspaceitem@<db>__:<key>\n<subkey>` | `<event>` |
| `__subkeyspaceevent@<db>__:<event>\|<key>` | `<len>:<subkey>[,...]` |

**Design rationale for 4 channels:**
- **Subkeyspace**: Subscribe to a specific key, receive all field
changes in a single message — efficient for key-centric consumers.
- **Subkeyevent**: Subscribe to a specific event type, receive
key+fields — efficient for event-centric consumers.
- **Subkeyspaceitem**: Subscribe to a specific key+field combination —
the most selective, one message per field, no parsing needed.
- **Subkeyspaceevent**: Subscribe to event+key combination, receiving
only the affected fields — server-side filtering on both dimensions.

Subkeys are encoded in a length-prefixed format (`<len>:<subkey>`) to
support binary-safe field names containing delimiters.
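The length-prefixed encoding can be sketched as follows (buffer handling is illustrative; the real code builds an sds string):

```c
#include <stdio.h>
#include <string.h>

/* Build "<len>:<subkey>[,<len>:<subkey>...]" into `out`. The subkey
 * bytes are copied raw after the length prefix, so field names may
 * contain delimiters like ',' or ':' without ambiguity. Returns the
 * encoded length. Capacity checking is elided for brevity. */
size_t encode_subkeys(char *out, size_t outcap,
                      const char **subkeys, const size_t *lens, int count) {
    size_t off = 0;
    for (int i = 0; i < count; i++) {
        off += (size_t)snprintf(out + off, outcap - off, "%s%zu:",
                                i ? "," : "", lens[i]);
        memcpy(out + off, subkeys[i], lens[i]);   /* raw bytes: binary-safe */
        off += lens[i];
    }
    out[off] = '\0';
    return off;
}
```

A reader parses each element by consuming the decimal length, the `:`, then exactly that many bytes — no scanning for delimiters inside field names.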

**Safety guards:**
- Events containing `|` are skipped for the `__subkeyspace` and
`__subkeyspaceevent` channels (to avoid parsing ambiguity).
- Keys containing `\n` are skipped for the `__subkeyspaceitem` channel
(newline is the key/subkey separator).
- Subkeys channels are only published when `subkeys != NULL && count >
0`.

## Hash Command Integration

The following hash operations now emit subkey level notifications with
the affected field names:

| Command | Event | Subkeys |
|---------|-------|---------|
| `HSET` / `HMSET` | `hset` | All fields being set |
| `HSETNX` | `hset` | The field (if set) |
| `HDEL` | `hdel` | All fields deleted |
| `HGETDEL` | `hdel` / `hexpired` | Deleted or lazily expired fields |
| `HGETEX` | `hexpire` / `hpersist` / `hdel` / `hexpired` | Affected fields per event |
| `HINCRBY` | `hincrby` | The field |
| `HINCRBYFLOAT` | `hincrbyfloat` | The field |
| `HEXPIRE` / `HPEXPIRE` / `HEXPIREAT` / `HPEXPIREAT` | `hexpire` | Updated fields |
| `HPERSIST` | `hpersist` | Persisted fields |
| `HSETEX` | `hset` / `hdel` / `hexpire` / `hexpired` | Affected fields per event |
| Field expiration (active/lazy) | `hexpired` | All expired fields (batched) |

For field expiration, expired fields are collected into a dynamic array
and sent as a single batched notification after the expiration loop,
rather than one notification per field.

## Module API

Three new APIs and one new callback type:

```c
/* Function pointer type for keyspace event notifications with subkeys from modules. */
typedef void (*RedisModuleNotificationWithSubkeysFunc)(
    RedisModuleCtx *ctx, int type, const char *event,
    RedisModuleString *key, RedisModuleString **subkeys, int count);

/* Subscribe to keyspace notifications with subkey information.
 *
 * This is the extended version of RM_SubscribeToKeyspaceEvents. When subkeys
 * are available, the `subkeys` array and `count` are passed to the callback.
 * `subkeys` contains only the names of affected subkeys (values are not included),
 * and `count` is the number of elements. The array may contain duplicates when
 * the same subkey appears more than once in a command (e.g. HSET key f1 v1 f1 v2
 * produces subkeys=["f1","f1"], count=2). When no subkeys are present, `subkeys`
 * will be NULL and `count` will be 0. Whether events without subkeys are delivered
 * depends on the `flags` parameter (see below).
 *
 * `types` is a bit mask of event types the module is interested in
 * (using the same REDISMODULE_NOTIFY_* flags as RM_SubscribeToKeyspaceEvents).
 *
 * `flags` controls delivery filtering:
 *  - REDISMODULE_NOTIFY_FLAG_NONE: The callback is invoked for all matching
 *    events regardless of whether subkeys are present, so a separate
 *    RM_SubscribeToKeyspaceEvents registration can be omitted.
 *  - REDISMODULE_NOTIFY_FLAG_SUBKEYS_REQUIRED: The callback is only invoked
 *    when subkeys are not empty. Events without subkey information (e.g. SET,
 *    EXPIRE, DEL) are skipped.
 *
 * The callback signature is:
 *   void callback(RedisModuleCtx *ctx, int type, const char *event,
 *                 RedisModuleString *key, RedisModuleString **subkeys, int count);
 *
 * The subkeys array and its contents are only valid during the callback.
 * The underlying objects may be stack-allocated or temporary, so
 * RM_RetainString must NOT be used on them. To keep a subkey beyond
 * the callback (e.g. in a RM_AddPostNotificationJob callback), use
 * RM_HoldString (which handles static objects by copying) or
 * RM_CreateStringFromString to make a deep copy before returning.
 */
int RM_SubscribeToKeyspaceEventsWithSubkeys(RedisModuleCtx *ctx, int types, int flags, RedisModuleNotificationWithSubkeysFunc callback);

/* Unregister a module's callback from keyspace notifications with subkeys
 * for specific event types.
 *
 * This function removes a previously registered subscription identified by
 * the event mask, delivery flags, and the callback function.
 *
 * Parameters:
 *  - ctx: The RedisModuleCtx associated with the calling module.
 *  - types: The event mask representing the notification types to unsubscribe from.
 *  - flags: The delivery flags that were used during registration.
 *  - callback: The callback function pointer that was originally registered.
 *
 * Returns:
 *  - REDISMODULE_OK on successful removal of the subscription.
 *  - REDISMODULE_ERR if no matching subscription was found. */ 
int RM_UnsubscribeFromKeyspaceEventsWithSubkeys(
    RedisModuleCtx *ctx, int types, int flags,
    RedisModuleNotificationWithSubkeysFunc cb);

/* Like RM_NotifyKeyspaceEvent, but also triggers subkey-level notifications
 * when subkeys are provided. Both key-level (keyspace/keyevent) and
 * subkey-level (subkeyspace/subkeyevent/subkeyspaceitem/subkeyspaceevent)
 * channels are published to, depending on the server configuration.
 *
 * This is the extended version of RM_NotifyKeyspaceEvent and can actually
 * replace it. When called with subkeys=NULL and count=0, it behaves
 * identically to RM_NotifyKeyspaceEvent. */
int RM_NotifyKeyspaceEventWithSubkeys(
    RedisModuleCtx *ctx, int type, const char *event,
    RedisModuleString *key, RedisModuleString **subkeys, int count);
```

## Configuration

Subkey notifications are controlled via the existing
`notify-keyspace-events` configuration string with four new characters:

```
notify-keyspace-events "STIV"
```

- **S** -> Subkeyspace events, published with the `__subkeyspace@<db>__:<key>` prefix.
- **T** -> Subkeyevent events, published with the `__subkeyevent@<db>__:<event>` prefix.
- **I** -> Subkeyspaceitem events, published per subkey with the `__subkeyspaceitem@<db>__:<key>\n<subkey>` prefix.
- **V** -> Subkeyspaceevent events, published with the `__subkeyspaceevent@<db>__:<event>|<key>` prefix.

These flags are **independent** from the existing key-level flags (`K`,
`E`, etc.). Enabling subkey notifications does **not** implicitly enable
or depend on keyspace/keyevent notifications, and vice versa.

## Known Limitations

- **Duplicate fields in subkey notifications**: Subkey notification
payloads may contain duplicate field names when the same field is
affected more than once within a single command. Since duplicate fields
are not the common case and deduplication would introduce significant
overhead on every notification, we chose not to deduplicate at this
time.
- **Subkeys must be sds-encoded objects**: We assume each subkey is an
sds-encoded object and access it via `subkey->ptr`; an assertion enforces
this, so Redis will crash if a subkey uses a different encoding.
2026-04-17 13:39:04 +08:00
debing.sun
ca6e471a3f
Fix decrRefCount on NULL robj on corrupt KEY_META payload (#15034)
## Summary

This PR fixes two issues when processing corrupt data in
rdbLoadCheckModuleValue():

1. When handling `RDB_MODULE_OPCODE_STRING` opcode,
rdbGenericLoadStringObject() can return NULL on a corrupt payload. The
code called decrRefCount(o) unconditionally without a NULL check,
resulting in a NULL pointer dereference crash.

2. The while loop condition was `!= RDB_MODULE_OPCODE_EOF`, which means
a truncated payload (causing rdbLoadLen to return RDB_LENERR) would
never exit the loop, since `RDB_LENERR != RDB_MODULE_OPCODE_EOF` is
always true, potentially causing an infinite hang.
2026-04-16 21:50:49 +08:00
Aviv David
6339fd739e
DataTypes update 8.8 RC1 (#15036) 2026-04-16 13:43:10 +03:00
Moti Cohen
fa6d4c3d63
Fix SIGABRT in HSETEX when a field appears twice in the FIELDS list (#14956)
HSETEX crashed on assert() with a SIGABRT when the same field appeared
more than once in the FIELDS list and an expiry time was given
(EX/PX/EXAT/PXAT).

Root cause: hfieldPersist() and the KEEP_TTL path in hashTypeSet() both
asserted that dictExpireMeta->expireMeta.trash == 0, meaning the hash
must be globally registered in the HFE DS. This is incorrect during
HSETEX execution because hashTypeSetExDone(), which registers the hash
globally and clears trash, called only at the end of flow. The private
per-field ebuckets are fully valid regardless of the global registration state.

Fix: Remove both incorrect assertions. The operations on the private
ebuckets (ebRemove in hfieldPersist, ebAdd in the KEEP_TTL path) are
correct and do not require the hash to be globally registered.

Tests: Added two regression tests covering the crash scenarios:
- HSETEX EX with a duplicate field (existing field, expiry given)
- HSETEX FNX EX with a duplicate field (no prior field, FNX condition
passes)
2026-04-16 13:16:52 +03:00
Ozan Tezcan
eb74450fca
Log node address when ASM starts (#15056)
Log source/destination address on import/migrate start events for easier
debugging.
2026-04-16 12:13:01 +03:00
Mincho Paskalev
3bcfbbe92a
Add new OBJ_GCRA type (#14905)
[PR #14826](https://github.com/redis/redis/pull/14826) introduced a new rate
limiting command which stores its internal implementation-detail data
into a string key.

Since storing it in a string prevents a client from detecting type errors,
accidental overwrites, or value invalidation (e.g. via SET or INCR), this
PR introduces a new data type - OBJ_GCRA - created specifically for that
new command.

Furthermore, a new RATE_LIMIT KSN type was introduced for emitting "gcra" events on such keys.

GCRASETTAT was renamed to GCRASETVALUE.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2026-04-15 17:46:22 +03:00
Ozan Tezcan
b89bc044a3
Reduce overhead in command propagation (#15003)
Refactor command propagation code to reduce overhead on master

Currently, the main bottleneck is `feedReplicationBuffer()`. It is
called for each argument in the command and has bookkeeping overhead on
every call (e.g. checking whether to attach replicas to the replication
backlog). It is also not inlined by the compiler. These costs become
more visible with pipelining and commands with many arguments (e.g. HSET
with many fields).

Changes:

- Defer all bookkeeping to be done once per command instead of once per
command argument.
- Refactor the hot path so the compiler can inline
`replBufWriterAppend()`.
- Add `replBufWriterAppendBulkLen()` that uses shared RESP headers for
small values, avoiding formatting overhead.

These changes should not introduce any behavioral change.

**TODO:** In a follow-up PR, explore forwarding the exact command from
the client querybuf to avoid re-serialization. Many commands are
propagated without modification and can benefit from this.

--


| Benchmark | Before (ops/s) | After (ops/s) | Improvement |
|---|---|---|---|
| SET | 256,048 | 265,131 | **+3%** |
| SET (pipeline) | 1,477,310 | 1,671,272 | **+13%** |
| HSET 10 fields | 145,000 | 158,000 | **+9%** |
| HSET 10 fields (pipeline) | 363,483 | 430,855 | **+18%** |
| HSET 10 fields, 15B values (pipeline) | 387,443 | 487,135 | **+26%** |
| ZADD 5 members | 180,700 | 193,519 | **+7%** |
| ZADD 5 members (pipeline) | 466,453 | 564,872 | **+21%** |

------
Co-authored-by: Yuan Wang <yuan.wang@redis.com>
2026-04-15 17:08:36 +03:00
Yuan Wang
2f1a8b2bad
Dismiss dict bucket arrays in fork child to reduce CoW (#14979)
During RDB saving and AOF rewriting, the fork child already dismisses
(madvise(MADV_DONTNEED)) individual key-value objects after serializing them.
However, the hash table bucket arrays of each dict were never dismissed,
leaving large contiguous allocations subject to CoW when the parent
modifies them.

This PR extends the dismiss mechanism to cover dict bucket arrays,
reducing CoW memory overhead.

- **Expires kvstore** — dismissed upfront before saving starts, since the
child never accesses expires directly after embedding the expire time in the key object.
- **Slot dicts** (cluster mode) — dismissed per-slot as the iterator moves
   to the next slot during RDB saving or AOF rewriting.
- **DB keys kvstore** (standalone mode) — dismissed per-DB after each DB is
   fully serialized during RDB saving or AOF rewriting.
2026-04-15 20:34:36 +08:00
Salvatore Sanfilippo
670993a89d
Replace fast_float C++ library with pure C implementation (#14661)
The fast_float dependency required C++ (libstdc++) to build Redis. This
commit replaces the 3800-line C++ template library with a minimal pure C
implementation (~360 lines) that provides the same functionality needed
by Redis.

This is **very important** because Redis build process would fail
without g++ installed, a common situation in Linux distributions even
after installing the basic build tools: we want the build process of
Redis to be the simplest possible. Also Redis sometimes is compiled in
embedded systems lacking the g++ toolchain. There is no reason to depend
on C++ in a project written in C.

## The C implementation uses
1. Fast path (Clinger's algorithm) for numbers with mantissa <= 2^53 and
exponent in [-22, 22], covering ~99% of real-world cases.
2. Fallback to strtod() for complex cases to ensure correctly-rounded
results.

## Changes
- Move new fast_float_strtod.c(C implementation) from deps into Redis
core since it is now a single file and no longer needs a separate
directory.
- Remove all C++ dependencies

The implementation was tested against both strtod and the original C++
implementation with 10,000+ test cases including edge cases, special
values (inf/nan), and random inputs.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Mincho Paskalev <minchopaskal@gmail.com>
Co-authored-by: Moti Cohen <moti.cohen@redis.com>
2026-04-15 20:33:55 +08:00
Vitah Lin
3cd464263b
Fix gen_write_load error on MOVED/ASK during atomic-slot-migration tests (#15016)
2026-04-15 08:34:40 +08:00
Moti Cohen
3f810d35bf
Introduce internal append-only pointer vector DS (#15039)
Refactoring work for follow-ups (e.g. subkey notifications
#14958), splitting reusable infrastructure from feature logic.

Optimized for stack allocation with optional growth to heap. Usage:

Start on stack (grow to heap):
  vec v;
  void *vstack[8];
  vecInit(&v, vstack, 8);

Start embedded (grow to heap):
  typedef struct {
    vec v;
    void *vembedded[8];
  } obj;
  vecInit(&obj.v, obj.vembedded, 8);

Heap only (capacity 8 or 0):
  vecInit(&v, NULL, 8);
  vecInit(&v, NULL, 0);

Reserve based on size:
  vecInit(&v, vstack, 8);
  vecReserve(&v, varsize); // <=8 uses stack, else heap
2026-04-14 18:45:48 +03:00
debing.sun
2049c7fe32
Fix wrong argv index in xinfoReplyWithStreamInfo for slot alloc size tracking (#15037)
`xinfoReplyWithStreamInfo` passed the wrong key (`c->argv[1]`) instead of
`c->argv[2]` to `updateSlotAllocSize` when updating per-slot memory
tracking.

Fix by passing the key explicitly to `xinfoReplyWithStreamInfo` instead
of relying on a hardcoded argv index.
Also, add the `-DDEBUG_ASSERTIONS` flag to the test-ubuntu-jemalloc CI
to cover this debug assertion.
2026-04-14 19:26:42 +08:00
Sergei Georgiev
80f1ebda88
Add AGGREGATE COUNT option to ZUNION, ZINTER, ZUNIONSTORE, and ZINTERSTORE (#14892)
### Overview

This PR adds a new `COUNT` aggregation mode to the `ZUNIONSTORE`,
`ZINTERSTORE`, `ZUNION`, and `ZINTER` sorted set commands. When
`AGGREGATE COUNT` is specified, the resulting score for each element
reflects how many input sets contain it (optionally scaled by
`WEIGHTS`), rather than combining the actual scores of the elements.
This enables a common use case — counting set membership frequency —
directly at the command level, without application-side workarounds.

### Problem Statement

For developers who need to know **how many input sorted sets contain
each element**, there is no single-command solution today.

**Example:** given several game leaderboards, find how many leaderboards
each player appears in.

The existing aggregation modes (`SUM`, `MIN`, `MAX`) all operate on the
elements' scores. To ignore scores and just count set membership, you'd
currently need to copy each sorted set with all scores set to 1, then
run `ZUNIONSTORE`/`ZINTERSTORE` with `SUM` — requiring multiple round
trips, temporary keys, and application-level locking to avoid races.

A `COUNT` aggregation mode solves this directly.

### Solution

Introduces `AGGREGATE COUNT` as a fourth aggregation mode:

- `ZINTER numkeys key [key ...] [WEIGHTS weight [weight ...]] [AGGREGATE
<SUM | MIN | MAX | COUNT>] [WITHSCORES]`
- `ZINTERSTORE destination numkeys key [key ...] [WEIGHTS weight [weight
...]] [AGGREGATE <SUM | MIN | MAX | COUNT>]`
- `ZUNION numkeys key [key ...] [WEIGHTS weight [weight ...]] [AGGREGATE
<SUM | MIN | MAX | COUNT>] [WITHSCORES]`
- `ZUNIONSTORE destination numkeys key [key ...] [WEIGHTS weight [weight
...]] [AGGREGATE <SUM | MIN | MAX | COUNT>]`

When `COUNT` is specified, **the scores in the input sets are ignored**.
Note that `WEIGHTS` is **not** ignored — each set contributes its weight
(default 1) per element, and the contributions are summed.

**Implementation details:**

A new helper function `zuiWeightedScore()` computes the per-set
contribution:

```c
inline static double zuiWeightedScore(double score, double weight, int aggregate) {
    return (aggregate == REDIS_AGGR_COUNT) ? weight : weight * score;
}
```

The `zunionInterAggregate()` function treats `COUNT` identically to
`SUM` — it adds the per-set contributions. All four call sites where
`weight * score` was previously computed inline are updated to use
`zuiWeightedScore()`.

### Examples

```
> ZADD s1 1 foo 1 bar
> ZADD s2 2 foo 2 bar
> ZADD s3 3 foo
```

**With `SUM` (existing behavior, for comparison):**

```
> ZINTERSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE SUM
(integer) 1
> ZRANGE t1 0 -1 WITHSCORES
1) "foo"
2) "29"

> ZUNIONSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE SUM
(integer) 2
> ZRANGE t1 0 -1 WITHSCORES
1) "bar"
2) "20"
3) "foo"
4) "29"
```

**With `COUNT` and `WEIGHTS`:**

```
> ZINTERSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE COUNT
(integer) 1
> ZRANGE t1 0 -1 WITHSCORES
1) "foo"
2) "18"

> ZUNIONSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE COUNT
(integer) 2
> ZRANGE t1 0 -1 WITHSCORES
1) "bar"
2) "15"
3) "foo"
4) "18"
```

**With `COUNT` and no specified `WEIGHTS`** — resulting score equals the
number of input sorted sets containing the element:

```
> ZINTERSTORE t1 3 s1 s2 s3 AGGREGATE COUNT
(integer) 1
> ZRANGE t1 0 -1 WITHSCORES
1) "foo"
2) "3"

> ZUNIONSTORE t1 3 s1 s2 s3 AGGREGATE COUNT
(integer) 2
> ZRANGE t1 0 -1 WITHSCORES
1) "bar"
2) "2"
3) "foo"
4) "3"
```

### Backward Compatibility

This is a fully additive change. The new `COUNT` keyword is only
recognized after the `AGGREGATE` token in the four affected commands.
Existing commands, arguments, and default behavior (`AGGREGATE SUM`) are
completely unchanged. No new command is introduced, and no existing
response format is modified.
2026-04-14 09:21:53 +03:00
Moti Cohen
e1d35aca01
Fix HEXPIRE numfields overflow (#15021)
- Validate HEXPIRE-family field counts without parser overflow.
- Keep flexible option order; only require that the fields fit in argv.
- Add tests for INT_MAX numfields across HEXPIRE/HPEXPIRE/HEXPIREAT/HPEXPIREAT.
2026-04-13 09:46:46 +03:00
h.o.t. neglected
e8da0e5b47
Fix brittle assert_match patterns for unexpected slowlog fields (#14948) 2026-04-13 14:45:14 +08:00
ShubhamTaple
0d85627bf0
Use no_value dict type for stream_idmp_keys to explicitly mark it as a key-only set (#14987)
Fixes #14985

### Problem

The stream_idmp_keys dict was using objectKeyPointerValueDictType. Dicts of
this type are expected to have robj keys and pointer values, but
stream_idmp_keys was not using the value field at all.

### Solution
This PR fixes the above issue by implementing new dict type
(objectKeyNoValueDictType) for stream_idmp_keys

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2026-04-10 23:25:56 +08:00
Momchil Marinov
ae9552663d
RED-183356: Automate tarball creation (#14911)
This PR implements the tarball creation job by reusing the 01 script.
It splits the original job into smaller jobs and moves the gate and test
jobs before the upload job.
The job outputs the SHA of the tar and the size.  
Link to a run:
https://github.com/m-marinov/redis/actions/runs/23437802059
2026-04-09 17:58:37 +03:00
dagecko
e97fe246aa
Pin third-party action to commit SHA and move secrets to step env (#14937)
2026-04-09 10:17:39 +08:00
Sergei Georgiev
0be39e5032
Fix missing consumer propagation on empty XREADGROUP (#14963)
## Summary

Fixes consumer replication inconsistency when `XREADGROUP` is called for
a new consumer but no `XCLAIM` commands are propagated to the replica.
Previously, consumer creation was only propagated to replicas when
`noack=true`, relying on `XCLAIM` propagation to implicitly create the
consumer in the non-NOACK path. However, if no messages exist to read,
no `XCLAIM` is generated, and the consumer is silently lost on the
replica.

This is a follow-up to the original fix in
[redis/redis#7140](https://github.com/redis/redis/issues/7140) /
[redis/redis#7526](https://github.com/redis/redis/pull/7526), which
introduced `XGROUP CREATECONSUMER` propagation but only for the `NOACK`
case.

## Changes

- **`xreadgroupCommand` (src/t_stream.c):** Replaced the `if (noack)`
guard around the `streamPropagateConsumerCreation()` call with a
deferred check after `streamReplyWithRange()`. Consumer creation is now
propagated when `noack || propCount == 0` — that is, only when no
`XCLAIM` commands were generated. This avoids redundant propagation in
the common case where `XCLAIM` already implicitly creates the consumer
on the replica, while correctly handling both the NOACK path (where
PEL/XCLAIM is skipped entirely) and the no-messages path (where there is
nothing to XCLAIM).
- **Test (tests/unit/type/stream-cgroups.tcl):** Added replication test
`"XREADGROUP propagates new consumer to replica"` that sets up a
master-replica pair and verifies consumer propagation in two cases: (1)
without NOACK when no messages are available to deliver, and (2) with
NOACK when messages are delivered but XCLAIM is skipped.

## Benefits

- **Master-replica consistency:** Consumers created by `XREADGROUP` are
now visible on replicas whenever no `XCLAIM` would otherwise create them
— covering both the NOACK path and the empty-stream path.
- **No redundant propagation:** The `noack || propCount == 0` condition
avoids emitting a superfluous `XGROUP CREATECONSUMER` when `XCLAIM` commands
are already propagated and would implicitly create the consumer on the
replica.
2026-04-08 14:59:22 +03:00
charsyam
c77d60d6b8
fix trivial double-free issue in rdbLoadObject (#15011)
2026-04-08 09:50:51 +08:00
Sergei Georgiev
747dfe578e
Add XNACK command for releasing stream messages back to the group (#14797)
### Overview

This PR enhances Redis Streams consumer groups by adding a new `XNACK`
command that allows consumers to explicitly release pending messages
back to the group without acknowledging them. Released (NACKed) entries
become immediately available for re-delivery to other consumers,
eliminating the idle-timeout delay currently required for message
recovery. The command supports three modes — SILENT, FAIL, and FATAL —
giving consumers fine-grained control over delivery counter semantics to
handle graceful shutdowns, transient failures, and poison messages
respectively.

### Problem Statement

For developers using Redis Streams with consumer groups, there are
several common scenarios where a consumer needs to release a message it
has claimed without acknowledging it:

1. **Transient internal failures**: A consumer may fail to process a
message because of problems unrelated to the message itself — for
example, it cannot connect to an external service to fetch required
context. The message is perfectly valid and should be retried promptly
by another consumer.

2. **Resource pressure**: A consumer under resource stress (low CPU, low
memory) may be unable to handle a specific message (e.g., a complex or
large message) within acceptable QoS. It should leave the opportunity to
other consumers in the group, with minimal delay.

3. **Graceful shutdown**: A consumer about to shut down would like to
immediately release all unprocessed messages it has claimed, so they can
be picked up by remaining consumers without waiting for idle timeouts.

4. **Poison / malicious messages**: A consumer may detect or suspect
that a claimed message is invalid or malicious and wants to mark it as
permanently failed (for dead-letter queue processing when available).

**Currently, a consumer cannot NACK a message.** It can either:

- **XACK** it — marks it as "processed" and removes it from the PEL
entirely, losing the ability to redeliver it
- **Leave it pending** — requires other consumers to discover it via
`XPENDING` and claim it via `XCLAIM`/`XAUTOCLAIM` or `XREADGROUP CLAIM`
after the idle timeout expires, introducing a long, unnecessary delay

In all these cases, the logic that applications must implement
introduces **message handling delays**, **implementation complexity**,
and **code duplication** across consumer implementations.

### Solution

Introduces a new `XNACK` (Negative ACKnowledge) command that explicitly
releases pending messages from their owning consumer back to the group's
PEL, making them immediately claimable via `XCLAIM` and `XAUTOCLAIM`,
and prioritized for re-delivery in `XREADGROUP CLAIM`:

```
XNACK key group <SILENT|FAIL|FATAL> IDS numids id [id ...] [RETRYCOUNT count] [FORCE]
```

When executed, the command:

1. **Disassociates** the entry from its owning consumer (`consumer =
NULL`)
2. **Repositions** the entry to the head of the PEL time-ordered list
(`delivery_time = 0`), making it immediately claimable with any
`min-idle-time` threshold
3. **Adjusts the delivery counter** based on the specified mode, giving
consumers fine-grained control over retry semantics
4. **Returns** the count of successfully NACKed entries

**Mode** controls the delivery counter adjustment and communicates the
reason for the NACK:

| Mode | Delivery Counter Behavior | Use Case |
|----------|---------------------------|----------|
| `SILENT` | Decrement by 1 (undo the delivery increment) | Consumer shutdown / transient internal error — the delivery "didn't count" |
| `FAIL` | No change (keep the incremented value) | Message too complex for this consumer, but may work for others — count this as an attempt |
| `FATAL` | Set to `LLONG_MAX` | Invalid / suspected malicious message — mark as permanently failed |

The three modes map directly to the real-world scenarios above:

- **SILENT** for graceful shutdown or transient failures unrelated to
the message
- **FAIL** for resource-constrained consumers that cannot handle a
specific message
- **FATAL** for poison message detection and dead-letter queue
integration

**Optional parameters:**

- **`RETRYCOUNT count`**: Directly sets `delivery_count` to the
specified value, overriding the mode-based adjustment
- **`FORCE`**: Creates new unowned PEL entries for IDs that are not
already in the group PEL (the entry must exist in the stream). When
`FORCE` creates an entry, the delivery counter is set to `0` (or to
`RETRYCOUNT` if specified, or to `LLONG_MAX` if mode is `FATAL`). This
is used internally for AOF rewrite and replication.

### Response Format

The command returns an integer — the number of messages successfully
NACKed (released back to the group PEL):

```
127.0.0.1:6379> XADD mystream 1-0 f v1
"1-0"
127.0.0.1:6379> XADD mystream 2-0 f v2
"2-0"
127.0.0.1:6379> XGROUP CREATE mystream grp 0
OK
127.0.0.1:6379> XREADGROUP GROUP grp c1 STREAMS mystream >
1) 1) "mystream"
   2) 1) 1) "1-0"
         2) 1) "f"
            2) "v1"
      2) 1) "2-0"
         2) 1) "f"
            2) "v2"
127.0.0.1:6379> XNACK mystream grp FAIL IDS 2 1-0 2-0
(integer) 2
```

After XNACK, the entries appear with an empty consumer in XPENDING
output:

```
127.0.0.1:6379> XPENDING mystream grp - + 10
1) 1) "1-0"
   2) ""
   3) (integer) -1
   4) (integer) 1
2) 1) "2-0"
   2) ""
   3) (integer) -1
   4) (integer) 1
```

### NACK Zone: Data Structure Extension

To support unowned PEL entries and ensure they are prioritized for
re-delivery, a **NACK zone** is introduced at the head of the existing
PEL time-ordered doubly-linked list. A new `pel_nack_tail` pointer is
added to the `streamCG` structure:

**PEL ordering:**

```
[pel_time_head] <-> ... <-> [pel_nack_tail] <-> [owned entries...] <-> [pel_time_tail]
|_____________ NACK zone ______________|   |_______ normal PEL ________|
```

The head of the PEL contains all NACKed messages (FIFO-ordered),
followed by all delivered messages that were not NACKed (same order as
today). This ensures NACKed messages are always prioritized over idle
pending messages.

The delivery order for `XREADGROUP` is therefore:
1. If `CLAIM` was specified: first deliver NACKed messages, then deliver
due pending messages (current behavior)
2. Deliver new entries after the group's last-delivered-id (current
behavior)

**Structure Design:**

- NACKed entries occupy positions from `pel_time_head` to
`pel_nack_tail` in the time-ordered list
- Their `delivery_time` is set to `0`, ensuring they always appear
"oldest" and are immediately claimable
- Their `consumer` pointer is set to `NULL`, marking them as unowned
- `pel_nack_tail` is `NULL` when no NACKed entries exist

**Key Properties:**

- **O(1) insertion**: New NACKed entries are inserted right after
`pel_nack_tail` (or at the list head if the zone is empty)
- **FIFO ordering** among NACKed entries: entries are NACKed in the
order they are released
- **Immediate claimability**: Since `delivery_time = 0`, NACKed entries
have maximum idle time and satisfy any `min-idle-time` threshold in
`XCLAIM` and `XAUTOCLAIM`. In `XREADGROUP CLAIM`, NACKed entries are
also prioritized over other pending entries due to their position at the
head of the PEL.
- **Zone integrity**: The `pelListInsertSorted` function is updated to
stop scanning at the `pel_nack_tail` boundary, ensuring owned entries
are never placed inside the NACK zone
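The O(1) insertion property can be modeled with a minimal doubly-linked list sketch. The `pelNode`/`pelList` types below are illustrative stand-ins (not the actual `streamNACK`/`streamCG` layout), and the `pelListInsertNacked` body is an assumption about how splicing after `pel_nack_tail` (or at the list head when the zone is empty) might look:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of the NACK zone: a time-ordered doubly-linked list
 * whose head segment (up to pel_nack_tail) holds unowned entries. */
typedef struct pelNode {
    struct pelNode *pel_prev, *pel_next;
    long long delivery_time;   /* 0 for NACKed (unowned) entries */
    void *consumer;            /* NULL marks the entry as unowned */
} pelNode;

typedef struct pelList {
    pelNode *pel_time_head, *pel_time_tail;
    pelNode *pel_nack_tail;    /* NULL when the NACK zone is empty */
} pelList;

/* O(1): append 'n' at the tail of the NACK zone, preserving FIFO
 * order among NACKed entries and keeping owned entries after them. */
void pelListInsertNacked(pelList *l, pelNode *n) {
    n->delivery_time = 0;                 /* always "oldest", claimable */
    n->consumer = NULL;                   /* unowned */
    pelNode *after = l->pel_nack_tail;    /* may be NULL (empty zone) */
    n->pel_prev = after;
    n->pel_next = after ? after->pel_next : l->pel_time_head;
    if (n->pel_next) n->pel_next->pel_prev = n; else l->pel_time_tail = n;
    if (after) after->pel_next = n; else l->pel_time_head = n;
    l->pel_nack_tail = n;                 /* zone grows by one */
}
```

Two consecutive NACKs against a list holding one owned entry yield the order `n1, n2, owned`, matching the FIFO-within-zone property above.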

### Impact on Existing Commands

All commands that interact with the PEL are updated to handle unowned
(`consumer = NULL`) entries:

- **XPENDING**: Shows NACKed entries with an empty consumer name
- **XCLAIM / XAUTOCLAIM**: Can claim NACKed entries (they satisfy any
min-idle-time since `delivery_time = 0`)
- **XREADGROUP CLAIM**: NACKed entries are picked up by the claim phase
- **XACK**: Works correctly on NACKed entries (removes from group PEL)
- **XINFO STREAM FULL**: Displays NACKed entries with an empty consumer
name
- **XGROUP DELCONSUMER**: Unaffected — NACKed entries are not in any
consumer's PEL

Propagation is also updated: when `XCLAIM` or `XAUTOCLAIM` encounters a
deleted stream entry for an unowned NACK, it propagates `XACK` (instead
of `XCLAIM`) to replicas and AOF, since there is no source consumer to
reference.

### Persistence

**RDB:**

- A new RDB type `RDB_TYPE_STREAM_LISTPACKS_5` (type 27) is introduced
- After saving consumer PEL entries, the NACK zone stream IDs are saved
separately (count + encoded IDs)
- On load, NACK zone entries are reconstructed by looking them up in the
group PEL, unlinking from their sorted position, and re-inserting into
the NACK zone via `pelListInsertNacked`
- Backward compatibility is preserved: old RDB types continue to load
with the existing validation (all entries must have consumers)

**AOF:**

- AOF rewrite emits `XNACK <key> <group> FAIL IDS <n> <id...> RETRYCOUNT
<cnt> FORCE` commands for entries in the NACK zone
- Consecutive entries with the same `delivery_count` are batched into a
single command (up to `AOF_REWRITE_ITEMS_PER_CMD` IDs per command)
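The batching rule can be sketched as a pass over the NACK zone's delivery counts, starting a new command whenever the count changes or the per-command ID limit is hit. The `ITEMS_PER_CMD` constant mirrors `AOF_REWRITE_ITEMS_PER_CMD` and `countNackRewriteCommands` is a hypothetical helper for illustration only:

```c
#include <assert.h>
#include <stddef.h>

#define ITEMS_PER_CMD 64  /* stands in for AOF_REWRITE_ITEMS_PER_CMD */

/* Walk the NACK zone in order and count how many XNACK ... RETRYCOUNT
 * ... FORCE commands an AOF rewrite would emit: a new command starts
 * when the delivery count differs from the previous entry's, or when
 * the current command already holds ITEMS_PER_CMD IDs. */
size_t countNackRewriteCommands(const long long *delivery_counts, size_t n) {
    size_t cmds = 0, in_cmd = 0;
    for (size_t i = 0; i < n; i++) {
        int new_cmd = (in_cmd == 0) ||
                      (delivery_counts[i] != delivery_counts[i - 1]) ||
                      (in_cmd == ITEMS_PER_CMD);
        if (new_cmd) { cmds++; in_cmd = 0; }
        in_cmd++;
    }
    return cmds;
}
```

For example, counts `{1,1,1,2,2,1}` batch into three commands: `[1,1,1]`, `[2,2]`, `[1]`.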

### Defragmentation

The defragmentation logic is restructured to handle unowned entries:

- **`defragStreamCGPendingEntry`** (new): Walks the group-level PEL rax,
defragments each NACK, updates the doubly-linked list pointers
(`pel_prev`, `pel_next`), `pel_time_head`, `pel_time_tail`,
`pel_nack_tail`, and the consumer PEL back-pointer for owned entries
- **`defragStreamConsumerPendingEntry`** (simplified): Only fixes up
back-pointers to the possibly-relocated consumer and CG, since actual
defragmentation is now done in the group-level walk. Unowned (NACK
zone) entries are not reached by any consumer PEL walk, so the
group-level pass is the only place they get defragmented

### Key Benefits

- **Immediate re-delivery**: NACKed entries are instantly claimable by
other consumers via `XCLAIM` and `XAUTOCLAIM` (since `delivery_time = 0`
satisfies any `min-idle-time`), and prioritized for re-delivery in
`XREADGROUP CLAIM`, eliminating idle-time delays that can range from
seconds to minutes
- **Explicit release semantics**: Consumers can release messages
intentionally, with fine-grained control over retry behavior — a
capability that exists in competing systems like RabbitMQ
- **Flexible retry control**: Three modes (SILENT, FAIL, FATAL) plus
RETRYCOUNT cover the full spectrum of failure handling strategies, from
graceful shutdown to poison message detection
- **Reduced application complexity**: Eliminates the need for
application-level workarounds involving XPENDING polling, arbitrary idle
timeouts, and manual XCLAIM orchestration
- **Dead-letter queue readiness**: FATAL mode + delivery count enables
straightforward poison message detection and future DLQ integration
- **Backward compatibility**: Fully optional new command with zero
breaking changes to existing behavior
2026-04-07 14:17:53 +03:00
Moti Cohen
153b79a290
keymeta: add DEBUG flag for runtime keymeta class registration (#14968)
M_CreateKeyMetaClass() allows registration only on:
- 'DEBUG enable-module-keymeta-runtime-registration 1' (replaces server.enable_debug_cmd)
- REDISMODULE_CTX_FLAGS_SERVER_STARTUP, in addition to module->onload
2026-04-07 12:31:53 +03:00
Moti Cohen
d22c68f904
Partial support set keymeta on ksn (#15004)
As part of KSN, modules must not modify keys. However, RediSearch
modifies key metadata in some flows, which may invalidate the local
kvobj pointer.

Introduce KSN_INVALIDATE_KVOBJ() to explicitly invalidate kvobj after
notifications, preventing further access by Redis core. Currently
relevant for hash keys without HFE.

Changes:
- Add KSN_INVALIDATE_KVOBJ() to guard unsafe flows
- Apply invalidation beyond hash-specific paths
- Extend KSN side-effect coverage for DELEX and MOVE
- Rearrange flows to avoid kvobj access after notification
- Include additional tests from @JoanFM (#14939)

Behavior:
No intended behavior change and no reordering of notifications.
2026-04-07 12:15:26 +03:00
Huihui Huang
fe3ae0ccac
Fix memory leak in clusterManagerFixSlotsCoverage() in src/redis-cli.c (#14863)
2026-04-03 22:34:32 +08:00
Yuan Wang
5aa6440cbc
Disable memory tracking in child processes (#14928)
Disable memory tracking in child processes (forked for RDB operations)
to reduce unnecessary overhead.
2026-04-03 15:10:13 +08:00
Sergei Georgiev
8d2b548f84
Fix IDMP tracking not cleared when resetting stream IDMP config (#14955)
## Summary

Unregisters stream keys from `db->stream_idmp_keys` when IDMP
configuration is changed via `XCFGSET`. Previously, changing
`IDMP-DURATION` or `IDMP-MAXSIZE` would clear all IDMP producers but
leave the key registered in the tracking dictionary, causing the cron
job `handleExpiredIdmpEntries()` to needlessly iterate over streams with
no IDMP data.

## Changes

- **`xcfgsetCommand` (src/t_stream.c):** Added
`dictDelete(c->db->stream_idmp_keys, key)` in the `if (changed)` block
to immediately untrack the key when IDMP configuration changes clear all
producers.

## Benefits

- **Immediate cleanup of tracking state:** The stream key is removed
from `stream_idmp_keys` when configuration changes, rather than relying
on the cron to detect and clean up the stale entry on a subsequent pass.
- **Reduced unnecessary cron work:** The cron no longer wastes cycles
inspecting streams that have no IDMP producers.
2026-04-03 09:17:11 +03:00
Gagan Dhakrey
c5287c0b0d
Fix memory leak in helloacl module test (#14974)
2026-04-03 11:09:40 +08:00
Lior Kogan
a0bad9a048
Update descriptions in gcra.json (#14969)
Low risk documentation-only change updating `src/commands/gcra.json`
reply field descriptions with no functional or behavioral impact.
2026-04-01 12:58:51 +03:00
debing.sun
effcb5a03c
Fix flaky cluster pubsubshard test in 26-pubsubshard (#14962)
In the "PUBSUB channels/shardchannels" test, we call sunsubscribe
without channels, but the number of iterations in
consume_subscribe_messages() is determined by the size of channels.
When channels is empty, the loop runs 0 times and never reads the
sunsubscribe response message returned by the server.
This means that when verifying the channel length, the previous command
might not have completed yet, so this PR adds a read after
sunsubscribe.
2026-04-01 06:50:58 +08:00
Moti Cohen
8f3b6990dd
keymeta: rename void *value to void *reserved in rdb_save/aof_rewrite callbacks (#14964)
Rename the `void *value` parameter to `void *reserved` in keymeta
`rdb_save` and `aof_rewrite` module callbacks, and pass `NULL` at call
sites.

Originally the `value` parameter was planned to pass the internal kvobj
for core use of key metadata, but since modules cannot use it in any
meaningful way, it should not be exposed. The parameter is kept as a
reserved slot for potential future use.
2026-03-31 15:09:06 +03:00