redis

mirror of https://github.com/redis/redis.git synced 2026-02-03 20:39:54 -05:00

Author	SHA1	Message	Date
Filipe Oliveira (Redis)	3c96680cfb	Enable hardware clock by default on ARM AArch64. (#14676) Redis can already use a processor-provided hardware counter as a high-performance monotonic clock. On some architectures this must be enabled carefully, but on ARM AArch64 the situation is different: - The ARM Generic Timer is architecturally mandatory for all processors that implement the AArch64 execution state. - The system counter (`CNTVCT_EL0`) and its frequency (`CNTFRQ_EL0`) are guaranteed to exist and provide a monotonic time source (per the “The Generic Timer in AArch64 state” section of the Arm® Architecture Reference Manual for Armv8-A, https://developer.arm.com/documentation/ddi0487/latest). Because of this architectural guarantee, it is safe to enable the hardware clock by default on ARM AArch64. This gives around a 5% boost on io-thread deployments for a simple strings benchmark.	2026-01-13 20:12:04 +08:00
Salvatore Sanfilippo	60a4fa2e4b	Vsets: Remove stale note about replication from README. (#14528 )	2026-01-13 16:13:59 +08:00
Moti Cohen	cc1660abdd	Refactor dict key encoding and fix defrag tag bit bug (#14682) Introduce encodeEntryKey() helper to centralize key encoding logic for no_value dicts, replacing 4 instances of duplicated code. This also fixes a bug in dictDefragBucket() where: - Before: bucketref = newkey (loses ENTRY_PTR_IS_EVEN_KEY tag) - After: bucketref = encodeEntryKey(d, newkey) (preserves tag bits) The bug affects dicts with no_value=1 and keys_are_odd=0 when defragKey callback returns a relocated pointer. Currently theoretical as main DB dict uses defragKey=NULL.	2026-01-12 13:19:03 +02:00
Salvatore Sanfilippo	391530cd15	[Vector sets]: redis-cli recall testing abilities (#14408) Vector sets can also return ground truth by performing an O(N) scan. This makes it possible to run a recall test against any key holding a vector set, so users can verify which EF value is best and how HNSW performs on the data set stored at a given key (the level of clustering significantly changes how vectors near to or far from a cluster behave). --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2026-01-12 12:40:39 +08:00
Vitah Lin	e396dd3385	Fix flaky stream LRM test due to timing precision (#14674)	2026-01-09 10:14:44 +08:00
Yuan Wang	858a8800e2	Propagate migrate task info to replicas (#14672) - Allow replicas to track the master's migrate task state. Previously, we only propagated import task info to replicas; now we also propagate migrate task info, so the new master can initiate slot trimming again if needed after a failover, which avoids data redundancy. - Prevent replicas from actively initiating slot trimming. Because the source side lacked a data cleaning mechanism, we allowed replicas to continue pending slot trimming, but it is not a good idea to let replicas trim actively. With the feature above in place, we can delete this logic.	2026-01-08 19:06:57 +08:00
Moti Cohen	29f733484a	Optimize ZRANK by avoiding string comparisons during skiplist traversal (#14636 ) This optimization is based on Valkey valkey-io/valkey#1389 ZRANK no longer performs per-level string comparisons when walking the skiplist. Instead, it retrieves the skiplist node directly from the hash table entry via pointer arithmetic. Rank is computed by walking upward from the node and summing spans using stored node heights, eliminating costly byte-wise comparisons. This improves ZRANK throughput by 2–14% depending on score distribution. --------- Co-authored-by: Ran Shidlansik <ranshid@amazon.com>	2026-01-08 11:20:52 +02:00
Slavomir Kaslev	5aa47347e7	Fix CLUSTER SLOT-STATS test Lua scripts (#14671) Fix hard-coded keys in test Lua scripts, which are incompatible with cluster mode. Reported-by: Oran Agra <oran@redis.com>	2026-01-08 11:16:50 +02:00
Stav-Levi	73249497d4	Fix ACL key-pattern bypass in MSETEX command (#14659) MSETEX doesn't properly check ACL key permissions for all keys: only the first key is validated. MSETEX arguments look like: MSETEX <numkeys> key1 val1 key2 val2 ... EX seconds. Keys are at every 2nd position (step=2). When Redis extracts keys for ACL checking, it calculates where the last key is. Before: last = first + numkeys - 1; (the calculation ignores step). After: last = first + (numkeys-1) * step; With 2 keys starting at position 2: Bug: last = 2 + 2 - 1 = 3 → only checks position 2. Fix: last = 2 + (2-1)*2 = 4 → checks positions 2 and 4. Fixes #14657	2026-01-08 08:41:55 +02:00
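The arithmetic in the MSETEX fix above can be checked with a small standalone sketch. The helper names (`last_key_buggy`, `last_key_fixed`, `keys_in_range`) are hypothetical and are not the Redis key-extraction functions; they only reproduce the two formulas quoted in the commit message.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not Redis APIs. */

/* Pre-fix formula: ignores the distance between consecutive keys. */
int last_key_buggy(int first, int numkeys) {
    return first + numkeys - 1;
}

/* Post-fix formula: consecutive keys are `step` positions apart. */
int last_key_fixed(int first, int numkeys, int step) {
    assert(numkeys >= 1 && step >= 1);
    return first + (numkeys - 1) * step;
}

/* Number of key positions visited when walking first..last by step. */
int keys_in_range(int first, int last, int step) {
    int count = 0;
    for (int pos = first; pos <= last; pos += step) count++;
    return count;
}
```

With first=2, numkeys=2 and step=2, the pre-fix formula stops at position 3, so a walk by step visits only one key, while the post-fix formula reaches position 4 and covers both.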
debing.sun	85ab4cab58	Fix UBSan error in stream trim when processing last entry (#14669) ## Summary This bug was introduced by https://github.com/redis/redis/pull/14623 Before PR #14623, when a stream node was going to be fully removed, we would just delete the whole node directly instead of iterating through and deleting each entry. Now, with the XTRIM/XADD flags, we have to iterate and delete entries one by one. However, the implementation in issue #8169 didn’t consider the case where all entries are removed, so `p` can end up being NULL. Fixes an UndefinedBehaviorSanitizer error in `streamTrim()` when marking the last entry in a listpack as deleted. The issue occurs when performing pointer arithmetic on a NULL pointer after `lpNext()` reaches the end of the listpack. ## Solution If `p` is NULL, we skip the delta calculation and the calculation of the new `p`.	2026-01-07 20:51:41 +08:00
Salvatore Sanfilippo	154fdcee01	Test tcp deadlock fixes (#14667) Disclaimer: this patch was created with the help of AI. My experience with the Redis test not passing on older hardware didn't stop with the other PR opened for the same problem. There was another deadlock happening when the test was writing a lot of commands without reading the replies back, and the cause seems related to the fact that such tests have something in common: they create a deferred client (that does not read replies at all unless asked to) and flood the server with 1 million requests without reading anything back. This results in a networking issue where the TCP socket stops accepting more data, and the test hangs forever. Reading those replies from time to time allows the test to run on such older hardware. Ping oranagra, who introduced at least one of the bulk write tests. AFAIK there is no problem in the test if we change it in this way, since the slave buffer is going to be filled anyway. But better to be sure that it was not intentional to write all that data without reading back for some reason I can't see. 
IMPORTANT NOTE: I am NOT sure at all that the TCP socket senses congestion on one side and also stops the other side, but anyway this fix works well and is likely a good idea in general. At the same time, I doubt there is a pending bug in Redis that makes it hang if the output buffer is too large, or if we are flooding the system with too many commands without reading anything back. So the actual cause remains cloudy. I remember that Redis, when the output limit is reached, could kill the client, and not lower the priority of command processing. Maybe Oran knows more about this. ## LLM commit message. The test "slave buffer are counted correctly" was hanging indefinitely on slow machines. The test sends 1M pipelined commands without reading responses, which triggers a TCP-level deadlock. Root cause: When the test client sends commands without reading responses: 1. Server processes commands and sends responses 2. Client's TCP receive buffer fills (client not reading) 3. Server's TCP send buffer fills 4. Packets get dropped due to buffer pressure 5. TCP congestion control interprets this as network congestion 6. cwnd (congestion window) drops to 1, RTO increases exponentially 7. After multiple backoffs, RTO reaches ~100 seconds 8. Connection becomes effectively frozen This was confirmed by examining TCP socket state showing cwnd:1, backoff:9, rto:102912ms, and rwnd_limited:100% on the client side. The fix interleaves reads with writes by processing responses every 10,000 commands. This prevents TCP buffers from filling to the point where congestion control triggers the pathological backoff behavior. The test still validates the same functionality (slave buffer memory accounting) since the measurement happens after all commands complete. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 14:26:22 +08:00
Moti Cohen	da4c5eec82	Replace fragile dict stored-key API with getKeyId callback (#14646) This change simplifies the dictionary API for handling stored keys by replacing the previous dict stored-key mechanism with a cleaner `keyFromStoredKey` callback approach.	2026-01-06 18:57:28 +02:00
debing.sun	0cb1ee0dc1	New eviction policies - least recently modified (#14624 ) ### Summary This PR introduces two new maxmemory eviction policies: `volatile-lrm` and `allkeys-lrm`. LRM (Least Recently Modified) is similar to LRU but only updates the timestamp on write operations, not read operations. This makes it useful for evicting keys that haven't been modified recently, regardless of how frequently they are read. ### Core Implementation The LRM implementation reuses the existing LRU infrastructure but with a key difference in when timestamps are updated: - LRU: Updates timestamp on both read and write operations - LRM: Updates timestamp only on write operations via `updateLRM()` ### Key changes: Add `keyModified()` to accept an optional `robj *val` parameter and call `updateLRM()` when a value is provided. Since `keyModified()` serves as the unified entry point for all key modifications, placing the LRM update here ensures timestamps are consistently updated across all write operations --------- Co-authored-by: oranagra <oran@redislabs.com> Co-authored-by: Yuan Wang <yuan.wang@redis.com>	2026-01-06 20:57:31 +08:00
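The difference between LRU and LRM described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Redis implementation: the `entry` struct, the `policy` enum, and the `on_read`/`on_write` helpers are hypothetical names, and only the rule "LRU stamps on reads and writes, LRM stamps on writes only" is taken from the commit message.

```c
/* Hypothetical types and helpers for illustration only; not Redis APIs. */
typedef enum { POLICY_LRU, POLICY_LRM } policy;

typedef struct {
    unsigned long stamp; /* last time the eviction clock was refreshed */
} entry;

/* A read refreshes the timestamp under LRU but leaves it alone under LRM. */
void on_read(entry *e, policy p, unsigned long now) {
    if (p == POLICY_LRU) e->stamp = now;
}

/* A write refreshes the timestamp under both policies. */
void on_write(entry *e, policy p, unsigned long now) {
    (void)p;
    e->stamp = now;
}
```

Under this sketch, a key written at time 5 and read at time 9 keeps stamp 5 with LRM and gets stamp 9 with LRU, so a frequently read but rarely modified key becomes an eviction candidate sooner under LRM.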
debing.sun	9ca860be9e	Fix XTRIM/XADD with approx not deleting entries for DELREF/ACKED strategies (#14623) This bug was introduced by #14130 and found by guybe7. When using XTRIM/XADD with approx mode (~) and DELREF/ACKED delete strategies, if a node was eligible for removal but couldn't be removed directly (because consumer group references need to be checked), the code would incorrectly break out of the loop instead of continuing to process entries within the node. This fix allows the per-entry deletion logic to execute for eligible nodes when using non-KEEPREF strategies.	2026-01-05 21:17:36 +08:00
debing.sun	4eda670de9	Fix infinite loop during reverse iteration due to invalid numfields of corrupted stream (#14472 ) Follow https://github.com/redis/redis/pull/14423 In https://github.com/redis/redis/pull/14423, I thought the last lpNext operation of the iterator occurred at the end of streamIteratorGetID. However, I overlooked the fact that after calling `streamIteratorGetID()`, we might still use `streamIteratorGetField()` to continue moving within the current entry. This means that during reverse iteration, the iterator could move back to a previous entry position. To fix this, in this PR I record the current position at the beginning of streamIteratorGetID(). When we enter it again next time, we ensure that the entry position does not exceed the previous one, that is, during forward iteration the entry must be greater than the last entry position, and during reverse iteration it must be smaller than the last entry position. Note that the fix for https://github.com/redis/redis/pull/14423 has been replaced by this fix.	2026-01-05 21:16:53 +08:00
Andy Pan	7511a1919b	Sanitize TCP_KEEPINTVL and simplify TCP_KEEPALIVE_ABORT_THRESHOLD on Solaris (#13142) This PR fixes a bug on Solaris where setsockopt() fails with EINVAL when TCP keepalive parameters fall below the kernel's 10-second minimum. When the user-configured interval is divided by 3 to calculate TCP_KEEPINTVL, values below 30 seconds result in intervals less than 10 seconds (e.g., interval=25 → intvl=8), causing connection failures. The fix adds if (intvl < 10) intvl = 10; to enforce the minimum and simplifies the TCP_KEEPALIVE_ABORT_THRESHOLD calculation for older Solaris versions. This behavior is documented in the Oracle Solaris TCP manual.	2026-01-05 09:57:33 +02:00
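The clamp described in the Solaris keepalive fix can be sketched as a pure function. The name `keepalive_probe_interval` is hypothetical; the division by 3 and the 10-second floor come from the commit message, and the surrounding setsockopt() calls are deliberately left out.

```c
/* Hypothetical helper for illustration only; not the Redis function.
 * Derives the probe interval from the configured keepalive interval and
 * enforces the 10-second minimum mentioned in the commit message. */
int keepalive_probe_interval(int interval) {
    int intvl = interval / 3;      /* e.g. interval=25 gives 8 */
    if (intvl < 10) intvl = 10;    /* clamp to the kernel minimum */
    return intvl;
}
```

For interval=25 the unclamped value would be 8, which the commit message reports as failing with EINVAL; with the clamp the function returns 10, and larger settings such as interval=60 pass through unchanged as 20.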
Moti Cohen	16068d6b63	Fix: Use dictSetKeyAtLink in activeDefragHfieldDictCallback (#14654) Problem: The activeDefragHfieldDictCallback was wrongly using dictSetKey(), which sets both key and value. However, hash field dictionaries use no_value=1 (since PR #14595), causing the assertion `assert(!d->type->no_value)` to fail. Solution: * Replace `dictSetKey(d, (dictEntry *)de, newEntry)` with `dictSetKeyAtLink(d, newEntry, &plink, 0)`, which properly handles both regular dictEntry and the optimized `no_value=1` case where keys are stored directly in the hash table. The callback already receives the plink parameter pointing to the exact location that needs updating. Following PR #14595, the value can now optionally be embedded in `entry`. As a result, `activeDefragEntry()` refines and defragments an entry’s value only when `entryGetValuePtrRef(entry) != NULL`.	2026-01-04 14:38:08 +02:00
igalperelman	ea72406275	Updated readme: added a Redis Cloud paragraph (#14651) Making Redis Cloud a little more visible in the readme file.	2026-01-04 09:09:14 +02:00
Andy Pan	eb2661a46d	Detect accept4() on specific versions of various platforms (#14558 ) This PR has mainly done three things: 1. Enable `accept4()` on DragonFlyBSD 4.3+ 2. Fix the failures of determining the presence of `accept4()` due to the missing <sys/param.h> on two OSs: NetBSD, OpenBSD 3. Drop the support of FreeBSD <10.0 for `redis`, FreeBSD 10 is past EOL, as are the two major versions following it, so defined(__FreeBSD__) is sufficient. - [param.h in DragonFlyBSD](`7485684fa5/sys/sys/param.h (L129-L257)`) - [param.h in FreeBSD](https://github.com/freebsd/freebsd-src/blob/main/sys/sys/param.h#L46-L76) - [param.h in NetBSD](`b5f8d2f930/sys/sys/param.h (L53-L70)`) - [param.h in OpenBSD](`d9c286e032/sys/sys/param.h (L40-L45)`) --------- Signed-off-by: Andy Pan <i@andypan.me>	2026-01-04 15:05:07 +08:00
zzj	0ef4a4e7e3	Fix some comment spelling typos (#14648)	2026-01-04 10:38:11 +08:00
RoyBenMoshe	29346eb7dd	Hide PII from ACL log (#14645 ) This PR continues the work from [#13400](https://github.com/redis/redis/pull/13400), following the discussion in [#11747](https://github.com/redis/redis/pull/11747#discussion_r1094418111), to further ensure sensitive user data is not exposed in logs when hide_user_data_from_log is enabled. - Introduce redactLogCstr() helper for safe, centralized log redaction. - Update ACL and networking log messages to use redacted values where appropriate. - Prevent leaking raw query buffer contents.	2026-01-04 10:35:30 +08:00
Yueyang (Terry) Tao	174307530b	Expand hash dicts using original length when rdb loading (#14635) During hash RDB loading (`rdbLoadObject`), the element count `len` is consumed as entries are read. In the listpack -> hashtable (HT) spillover path, we later used the remaining `len` for `dictTryExpand`. By that point `len` may no longer represent the original cardinality (and can be 0), which can skip/undersize the pre-sizing and lead to extra rehash/expansion work while loading large hashes. The same issue existed in the hash-with-metadata (field expire) load path.	2025-12-26 14:22:40 +08:00
Moti Cohen	e4b69f9a13	Remove dead code leftover (#14640) Flags defined (mutually exclusive): plainFlag = flags & RDB_LOAD_PLAIN sdsFlag = flags & RDB_LOAD_SDS robjFlag = !(plainFlag || sdsFlag) If robjFlag is true, the function returns early. Otherwise we are in the plain/sds path: plainFlag → allocate with ztrymalloc_usable() sdsFlag → allocate with sdstrynewlen() Thus, in error handling only two cases exist: plainFlag → zfree(buf) else → sdsFlag → sdsfree(buf) The hfldFlag branch assumed a third allocation path that no longer exists after PR #14595, making entryFree(buf, NULL) unreachable.	2025-12-25 14:14:24 +02:00
Stav-Levi	860b8c772a	Add TLS certificate-based automatic client authentication (#14610) This PR implements support for automatic client authentication based on a field in the client's TLS certificate. We adopt Valkey’s PR: https://github.com/valkey-io/valkey/pull/1920 API Changes: New configuration tls-auth-clients-user - Allowed values: `off` (default), `CN`. - `off`: disable TLS certificate-based auto-authentication. - `CN`: derive the ACL username from the Common Name (CN) field of the client certificate. New INFO stat - `acl_access_denied_tls_cert` - Counts failed TLS certificate-based authentication attempts, i.e. TLS connections where a client certificate was presented, a username was derived from it, but no matching ACL user was found. New ACL LOG reason - Reason string: `"tls-cert"` - Emitted when a client certificate’s Common Name fails to match any existing ACL user. Implementation Details: - Added getCertFieldByName() utility to extract fields from peer certificates. - Added autoAuthenticateClientFromCert() to handle automatic login logic post-handshake. - Integrated automatic authentication into the TLSAccept function after handshake completion. - Updated test suite (tests/integration/tls.tcl) to validate the feature.	2025-12-25 14:07:58 +02:00
itayTziv	877c09f662	incrRefCount off-by-one error (#14647 ) The condition for blocking `o->refcount++` in `incrRefCount` is `if (o->refcount < OBJ_FIRST_SPECIAL_REFCOUNT)`, meaning refcount can accidentally reach the first special refcount (`OBJ_STATIC_REFCOUNT` currently). Fixed the condition to be `if (o->refcount < OBJ_FIRST_SPECIAL_REFCOUNT - 1)`	2025-12-25 18:51:57 +08:00
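The off-by-one in the refcount guard can be reproduced with plain integers. This is a sketch under stated assumptions: `FIRST_SPECIAL_REFCOUNT` is a small illustrative stand-in for `OBJ_FIRST_SPECIAL_REFCOUNT` (its real value is not given here), and the two helpers are hypothetical functions that model only the increment condition, not the rest of `incrRefCount`.

```c
/* Illustrative stand-in; the real OBJ_FIRST_SPECIAL_REFCOUNT differs. */
#define FIRST_SPECIAL_REFCOUNT 100

/* Pre-fix condition: a refcount of FIRST_SPECIAL_REFCOUNT - 1 is still
 * incremented, so it lands exactly on the first special value. */
int incr_refcount_buggy(int refcount) {
    if (refcount < FIRST_SPECIAL_REFCOUNT) refcount++;
    return refcount;
}

/* Post-fix condition: the increment stops one step earlier, so an
 * ordinary refcount can never reach the first special value. */
int incr_refcount_fixed(int refcount) {
    if (refcount < FIRST_SPECIAL_REFCOUNT - 1) refcount++;
    return refcount;
}
```

Starting from 99, the pre-fix version returns 100 (the special value), while the fixed version stays at 99; values already at or above the special threshold are left untouched by both.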
Moti Cohen	238a626859	Hash - Unify Field-Value into a single struct along with dict no_value=1 (#14595) Unifies field–value pairs and optional expiration into a single allocation, removing the generic mstr abstraction (mstr was discarded because its overly generic design added complexity and runtime overhead without clear benefits for hash workloads). The new Entry layout supports embedded values (≤128B) and pointer-based values, with expiration metadata integrated for per-field hash TTLs. Update hash dictionaries to no_value=1 and apply optimizations to avoid regressions. This significantly reduces hash memory usage (~30–50%) with minimal performance impact.	2025-12-23 12:19:00 +02:00
Ozan Tezcan	fde3576f88	Fix adjacent slot range behavior in ASM operations (#14637) This PR contains a few changes for ASM: Bug fix: - Fixes an issue in ASM when adjacent slot ranges are provided in the CLUSTER MIGRATION IMPORT command (e.g. 0-10 11-100). The ASM task keeps the original slot ranges as given, but later the source node reconstructs the slot ranges from the config update as a single range (e.g. 0-100). This causes asmLookupTaskBySlotRangeArray() to fail to match the task, and the source node incorrectly marks the ASM task as failed. Although the migration completes successfully, the source node performs a blocking trim operation for these keys, assuming the slot ownership changed outside of an ASM operation. With this PR, Redis merges adjacent slot ranges in a slot range array to avoid this problem. Other improvements: - Indicates the imported/migrated key count in the log once the ASM operation is completed. - Use an error return value instead of assert in parseSlotRangesOrReply() - Validate the slot range array that is given by the cluster implementation on ASM_EVENT_IMPORT_START. --------- Co-authored-by: Yuan Wang <yuan.wang@redis.com>	2025-12-23 11:54:12 +03:00
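Merging adjacent slot ranges, as described in the bug fix above, can be sketched as a single in-place pass. This is an illustration only: the `slot_range` struct and `merge_adjacent_ranges` function are hypothetical names, and the sketch assumes the input is already sorted by start and free of overlaps, which the commit message does not state.

```c
/* Hypothetical type for illustration only; not the Redis slot range type. */
typedef struct {
    int start;
    int end; /* inclusive */
} slot_range;

/* Merge ranges where one begins right after the previous one ends
 * (e.g. 0-10 followed by 11-100 becomes 0-100). Works in place and
 * returns the new number of ranges. Assumes sorted, non-overlapping input. */
int merge_adjacent_ranges(slot_range *r, int n) {
    if (n <= 0) return 0;
    int out = 0; /* index of the last merged range written so far */
    for (int i = 1; i < n; i++) {
        if (r[i].start == r[out].end + 1) {
            r[out].end = r[i].end;   /* extend the current merged range */
        } else {
            r[++out] = r[i];         /* gap found: start a new range */
        }
    }
    return out + 1;
}
```

With the input 0-10, 11-100, 200-300 this yields two ranges, 0-100 and 200-300, which matches the single range the source node reconstructs from the config update in the example from the commit message.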
h.o.t. neglected	c5f3d3e11c	Fix use-after-free in hnsw_cursor_free (#14627 ) Closes https://github.com/redis/redis/issues/14626. Note that this function is not currently called anywhere.	2025-12-22 10:34:50 +08:00
John	0d5d75e04d	Fix incorrect comment about LRU clock resolution in initObjectLRUOrLFU (#14582 ) Reason: the LRU_CLOCK_RESOLUTION macro is 1000, and its own comment says it is the LRU clock resolution in ms.	2025-12-20 19:30:15 +08:00
debing.sun	1e974e6311	Fix kvstoreGetFirstNonEmptyDictIndex() and kvstoreIteratorReset() for empty kvstore (#14625 ) These bugs were found by @rantidhar. This PR fixes two related issues in kvstore iterator handling when dealing with empty kvstores: 1. If the kvstore is empty, kvstoreGetFirstNonEmptyDictIndex() may return 0. For example, during defragmentation, the invalid slot may only be detected when kvstoreGetNextNonEmptyDictIndex() is called. This fix ensures that kvstoreGetFirstNonEmptyDictIndex() will eventually return -1 and terminate the defragmentation process. However, currently, when the kvstore is created, the number of dictionary arrays is at least 1, so this is just a defensive fix. 2. If a kvstoreIterator is initialized but released without ever calling kvstoreIteratorNextDict(), then kvstoreIteratorReset() uses didx(-1) to access the dictionary array, which could lead to an out-of-bounds access. However, in the current code, there is never a situation where kvstoreIteratorNextDict() is not called, so this is just a defensive fix. --------- Co-authored-by: rantidhar <ran.tidhar@redis.com>	2025-12-19 11:40:01 +08:00
Andy Pan	a9f0f07b7c	Merge kqueue events to reduce system calls (#14557 ) `kqueue` has the capability of batch applying events: > The kevent(), kevent64() and kevent_qos() system calls are used to register events with the queue, and return any pending events to the user. The changelist argument is a pointer to an array of kevent, kevent64_s or kevent_qos_s structures, as defined in <sys/event.h>. All changes contained in the changelist are applied before any pending events are read from the queue. The nchanges argument gives the size of changelist. This PR implements this functionality for `kqueue`, which lets us eliminate many `kevent(2)` system calls. ## References [FreeBSD - kqueue](https://man.freebsd.org/cgi/man.cgi?kqueue)	2025-12-18 19:51:02 +08:00
Yuan Wang	dd67275033	Unify slot migration logs across cluster implementations (#14628 ) Using a different cluster implementation instead of the legacy one may result in inconsistent slot migration logs, which can cause confusion. Therefore, we should centralize these logs within the slot migration process itself rather than relying on the specific cluster implementation.	2025-12-18 18:22:25 +08:00
Moti Cohen	2e69130ea3	Improve dict pointer tagging doc (#14616 ) Clarifies the pointer tagging scheme used in Redis dicts, particularly for the no_value=1 optimization introduced in #11595.	2025-12-18 09:24:45 +02:00
John	081693f32e	Fix incorrect comment about STATS_METRIC_* Macro in server.h (#14620 )	2025-12-18 14:43:05 +08:00
fanpei91	e6e0cf5764	Fix incorrect stream ID comparison in streamReplyWithRangeFromConsumerPEL() (#14619 ) All commands that invoke streamReplyWithRange with a group argument always pass end as NULL, so they never trigger the incorrect stream ID comparison. In other words, even if this bug remained unfixed, no incident would occur.	2025-12-16 17:00:22 +08:00
Yuan Wang	33391a7b61	Support delay trimming slots after finishing migrating slots (#14567 ) This PR introduces a mechanism that allows a module to temporarily disable trimming after an ASM migration operation so it can safely finish ongoing asynchronous jobs that depend on keys in migrating (and about to be trimmed) slots. 1. ClusterDisableTrim/ClusterEnableTrim We introduce `ClusterDisableTrim/ClusterEnableTrim` Module APIs to allow modules to disable/enable slot trimming ``` /* Disable automatic slot trimming. */ int RM_ClusterDisableTrim(RedisModuleCtx *ctx) /* Enable automatic slot trimming */ int RM_ClusterEnableTrim(RedisModuleCtx *ctx) ``` Please note: Redis will not start any subsequent import or migrate ASM operations while slot trimming is disabled, so modules must re-enable trimming immediately after completing their pending work. The only valid and meaningful time for a module to disable trimming appears to be after the MIGRATE_COMPLETED event. 2. REDISMODULE_OPEN_KEY_ACCESS_TRIMMED Added REDISMODULE_OPEN_KEY_ACCESS_TRIMMED to RM_OpenKey() so that a module can operate on keys in unowned slots after trimming is paused. We also no longer delete a key that is part of a trim job when it is accessed: `expireIfNeeded` returns `KEY_VALID` if `EXPIRE_ALLOW_ACCESS_TRIMMED` is set; otherwise it returns `KEY_TRIMMED` without deleting the key. 3. REDISMODULE_CTX_FLAGS_TRIM_IN_PROGRESS We also extend RM_GetContextFlags() to include a flag REDISMODULE_CTX_FLAGS_TRIM_IN_PROGRESS indicating whether a trimming job is pending (due to trim pause) or in progress. Modules could periodically poll this flag to synchronize their internal state, e.g., if a trim job was delayed or if the module incorrectly assumed trimming was still active. Bugfix: RM_SetClusterFlags could not clear a flag after enabling it first. --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>	2025-12-16 16:30:56 +08:00
Rushabh Mehta	ddbd96d8ae	Add `--name` flag to redis-cli for setting client name (#14588 ) This PR introduces a new flag `--name <client-name>` to `redis-cli`. This allows users to specify a persistent client name that remains associated with the connection. Implementation Details: - Configuration: Added `client_name` field to the global config struct. - Argument Parsing: Updated `parseOptions` to handle the `--name` flag. - Unified Logic (`cliSetName`): Introduced a helper function cliSetName that sends `CLIENT SETNAME <name>` immediately after the connection is established. This ensures the name is set consistently for both RESP2 and RESP3 modes. - Documentation: Updated `redis-cli --help` output to include the new flag. This PR can close #14585	2025-12-15 21:43:51 +08:00
Yuan Wang	f3316c3a1a	Introduce flushdb option for repl-diskless-load (#14596 ) The `repl-diskless-load` feature can effectively reduce the time of full synchronization, but it may not be widely used. The `swapdb` option needs double `maxmemory`, and `on-empty-db` only works on the first full sync (the replica must have no data). This PR introduces a new option: `flushdb` - Always flush the entire dataset before diskless load. If the diskless load fails, the replica will lose all existing data. This brings a risk of data loss, but it provides a choice if you want to reduce full sync time and accept this risk.	2025-12-15 11:25:53 +08:00
Stav-Levi	23aca15c8c	Fix the flexibility of argument positions in the Redis API's (#14416 ) This PR implements flexible keyword-based argument parsing for all 12 hash field expiration commands, allowing users to specify arguments in any logical order rather than being constrained by rigid positional requirements. This enhancement follows Redis's modern design of keyword-based flexible argument ordering and significantly improves user experience. Commands with flexible parsing: HEXPIRE, HPEXPIRE, HEXPIREAT, HPEXPIREAT, HGETEX, HSETEX. Some examples: HEXPIRE: * All these are equivalent and valid: HEXPIRE key EX 60 NX FIELDS 2 f1 f2 HEXPIRE key NX EX 60 FIELDS 2 f1 f2 HEXPIRE key FIELDS 2 f1 f2 EX 60 NX HEXPIRE key FIELDS 2 f1 f2 NX EX 60 HEXPIRE key NX FIELDS 2 f1 f2 EX 60 HGETEX: * All these are equivalent and valid: HGETEX key EX 60 FIELDS 2 f1 f2 HGETEX key FIELDS 2 f1 f2 EX 60 HSETEX: * All these are equivalent and valid: HSETEX key FNX EX 60 FIELDS 2 f1 v1 f2 v2 HSETEX key EX 60 FNX FIELDS 2 f1 v1 f2 v2 HSETEX key FIELDS 2 f1 v1 f2 v2 FNX EX 60 HSETEX key FIELDS 2 f1 v1 f2 v2 EX 60 FNX HSETEX key FNX FIELDS 2 f1 v1 f2 v2 EX 60	2025-12-14 09:35:12 +02:00
Lior Kogan	9b7254c810	Clarify that `BUILD_WITH_MODULES=yes` is not supported on 32 bit systems. (#14606 ) Following #14618, this PR updates the README file.	2025-12-11 11:07:47 +02:00
YaacovHazan	ec84bd6143	Prevent building with modules on 32-bit systems (#14618 ) Redis modules do not support 32-bit architectures. The build now fails early when modules are enabled on such systems.	2025-12-11 11:04:30 +02:00
debing.sun	679e009b73	Add daily CI for vectorset (#14302 )	2025-12-10 08:52:43 +08:00
Vitah Lin	4499d68748	Cleanup redundant declaration of getSlotOrReply() (#14576 )	2025-12-09 17:58:19 +08:00
Vitah Lin	3bcacd8a21	Upgrade GitHub Actions macOS runner (#14613 ) 1. GitHub has deprecated older macOS runners, and macos-13 is no longer supported. Updating to macos-26 ensures that CI workflows continue to run without interruption. 2. Previously, cross-platform-actions/action@v0.22.0 used runs-on: macos-13. I checked the latest version of cross-platform-actions, and the official examples now use runs-on: ubuntu. I think we can switch from macOS to Ubuntu. --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2025-12-09 15:01:58 +08:00
Slavomir Kaslev	5299ccf2a9	Add kvstore type and decouple kvstore from its metadata (#14543 ) Decouple kvstore from its metadata by introducing `kvstoreType` structure of callbacks. This resolves the abstraction layer violation of having kvstore include `server.h` directly. Move (again) cluster slot statistics to per slot dicts' metadata. The callback `canFreeDict` is used to prevent freeing empty per slot dicts from losing per slot statistics. Co-authored-by: Ran Tidhar <ran.tidhar@redis.com>	2025-12-08 21:12:33 +02:00
debing.sun	dd57b141b9	Clean up lookahead-related code (#14562 ) ## Summary Clean up lookahead-related (https://github.com/redis/redis/issues/14440) code by consolidating slot extraction logic. ## Changes * Replace `GETSLOT_NOKEYS` with `INVALID_CLUSTER_SLOT` * Refactor `getSlotFromCommand()` to reuse `extractSlotFromKeysResult()` * Make the behavior of extractSlotFromKeysResult() more unified and more readable * Fix comment alignment --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>	2025-12-08 14:47:39 +08:00
Yuan Wang	cb71dec0c3	Disable RDB compression when diskless replication is used (#14575 ) Fixes #14538 If the master uses diskless synchronization and the replica uses diskless load, we can disable RDB compression to reduce full sync time. I tested on AWS and found we could reduce time by 20-40%. In terms of implementation, when the replica can use diskless load, it sends `replconf rdb-no-compress 1` to the master to request an RDB without compression. If your network is slow, please disable repl-diskless-load, and maybe even repl-diskless-sync --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>	2025-12-04 09:24:23 +08:00
Ozan Tezcan	08b63b6ceb	Fix flaky ASM tests (#14604 ) 1. Fix "Simple slot migration with write load" by introducing an artificial delay in the traffic generator to slow it down for tsan builds. Failed test: https://github.com/redis/redis/actions/runs/19720942981/job/56503213650 2. Fix "Test RM_ClusterCanAccessKeysInSlot returns false for unowned slots" by waiting for config propagation before checking it on a replica. Failed test: https://github.com/redis/redis/actions/runs/19841852142/job/56851802772	2025-12-03 12:12:48 +03:00
Ozan Tezcan	3c57a8fc92	Retry an ASM import step when the source node is temporarily not ready (#14599 ) The cluster implementation may be temporarily unavailable and return an error to the `ASM_EVENT_MIGRATE_PREP` event to prevent starting a new migration. Although this is most likely a transient condition, the source node has no way to distinguish it from a real error, so it must fail the import attempt and start a new one. In Redis, failing an attempt is cheap, but in other cluster implementations it may require cleaning up resources and can cause unnecessary disruption. This PR introduces a new `-NOTREADY` error reply for the `CLUSTER SYNCSLOTS SYNC` command. When the source replies with `-NOTREADY`, the destination can recognize the condition as transient and retry sending `CLUSTER SYNCSLOTS SYNC` step periodically instead of failing the attempt.	2025-12-02 13:38:22 +03:00
Ozan Tezcan	86c63588b0	Refactor some of ASM and slot-stats functions (#14587 ) This PR does not introduce any behavioral changes. - Refactored and moved verifyClusterConfigWithData() into cluster.c. - Refactored and centralized ASM and slot-stats initialization functions. These changes place shared logic in a common location so it can be reused by different cluster implementations.	2025-11-29 22:41:58 +03:00
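
Note on fde3576f88 (adjacent slot ranges): the merge that commit describes, where ranges such as 0-10 and 11-100 collapse into 0-100 so both nodes see the same range array, can be sketched roughly as below. This is only an illustration under assumptions: `slot_range`, `cmp_range_start` and `merge_slot_ranges` are hypothetical names, not the types or functions Redis uses, and the actual implementation may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type for illustration; not the struct Redis uses. */
typedef struct {
    int start; /* first slot, inclusive */
    int end;   /* last slot, inclusive */
} slot_range;

/* qsort comparator: order ranges by their first slot. */
static int cmp_range_start(const void *a, const void *b) {
    const slot_range *ra = a, *rb = b;
    return (ra->start > rb->start) - (ra->start < rb->start);
}

/* Sort the ranges in place, then merge any that overlap or are
 * adjacent (e.g. 0-10 and 11-100 become 0-100).
 * Returns the number of ranges left at the front of the array. */
static int merge_slot_ranges(slot_range *r, int n) {
    assert(n >= 0);
    if (n == 0) return 0;
    qsort(r, (size_t)n, sizeof(*r), cmp_range_start);
    int out = 0; /* index of the last merged range */
    for (int i = 1; i < n; i++) {
        if (r[i].start <= r[out].end + 1) {
            /* Overlapping or adjacent: extend the current range. */
            if (r[i].end > r[out].end) r[out].end = r[i].end;
        } else {
            /* Gap of at least one slot: start a new merged range. */
            r[++out] = r[i];
        }
    }
    return out + 1;
}
```

Normalizing the array once, before it is stored or compared, means a later lookup by range array does not depend on how the caller happened to split the ranges.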

12870 commits