redis/tests/unit/moduleapi/propagate.tcl
Mincho Paskalev e3c38aab66
Handle primary/replica clients in IO threads (#14335)
# Problem

While introducing async IO threads
(https://github.com/redis/redis/pull/13695), primary and replica
clients were left to be handled inside the main thread due to data
races and synchronization issues. This PR solves that issue, with the
additional hope that it improves replication performance.

# Overview

## Moving the clients to IO threads

Since clients first participate in a handshake and an RDB replication
phase, it was decided to move them to an IO thread only after RDB
replication is done. For the primary client this was trivial, as the
master client is created only after the RDB sync (plus some additional
checks one can see in `isClientMustHandledByMainThread`). Replica
clients, though, are moved to IO threads immediately after connecting
(as are all clients), so currently in `unstable` replication happens
while the client is in an IO thread. In this PR the client is moved to
the main thread after receiving the first `REPLCONF` message from the
replica; this is a bit hacky and we could remove it. I didn't find
issues with either version.

## Primary client (replica node)

We have a few issues here:
- During `serverCron`, `replicationCron` runs; it periodically sends a
`REPLCONF ACK` message to the master and also checks for a timed-out
master. To prevent data races we utilize `IOThreadClientsCron`: the
client is periodically sent to the main thread, and during
`processClientsFromIOThread` it's checked whether it needs to run the
replication cron behaviour.

- Data races with the main thread: specifically, the `lastinteraction`
and `read_reploff` members of the primary client, which are written to
in `readQueryFromClient`, could be accessed at the same time from the
main thread during execution of `INFO REPLICATION` (`genRedisInfoString`).
To solve this, the members were duplicated: while the client is in an
IO thread it writes to the duplicates, and they are synced with the
original variables each time the client is sent to the main thread
(which means `INFO REPLICATION` could potentially return stale values).
See the sketch after this list.

- During `freeClient` the primary client is fetched to the main thread,
but when caching it (`replicationCacheMaster`) the thread id would
remain the id of the IO thread it came from. This creates problems when
resurrecting the master client. Here the call to
`unbindClientFromIOThreadEventLoop` in `freeClient` was rewritten to
call `keepClientInMainThread`, which automatically fixes the problem.

- During `exitScriptTimedoutMode` the master is queued for reprocessing
(specifically, to process any pending commands ASAP after it's
unblocked). We do that by putting it in the `server.unblocked_clients`
list, which is processed in the next `beforeSleep` cycle in the main
thread. Since this would create contention between the main and IO
threads, we skip the queueing in `unblocked_clients` and just queue the
client to the main thread; `processClientsFromIOThread` will process
the pending commands just as the main thread would have.
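
As a rough illustration of the duplicated-members idea from the second
bullet above, here is a minimal C sketch; the field and function names
are assumptions based on the description, not the PR's actual
declarations:

```c
#include <time.h>

typedef struct client {
    /* Canonical values, read by the main thread (e.g. for INFO REPLICATION). */
    time_t lastinteraction;
    long long read_reploff;
    /* Shadow copies, written only by the IO thread that currently owns
     * the client inside readQueryFromClient(). */
    time_t io_lastinteraction;
    long long io_read_reploff;
} client;

/* Runs in the main thread each time the client is handed back from an
 * IO thread; no locking is needed because ownership has transferred. */
static void syncPrimaryClientStats(client *c) {
    c->lastinteraction = c->io_lastinteraction;
    c->read_reploff = c->io_read_reploff;
}
```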

## Replica clients (primary node)

We move the client after RDB replication is done and after the
replication backlog is fed its first message.
We do that so that the client's reference to the first replication
backlog node is initialized before it's read from an IO thread, hence
no contention with the main thread on it.

### Shared replication buffer

Currently in `unstable` the replication buffer is shared among clients.
This is done via clients holding references to the nodes inside the
buffer. A node from the buffer can be trimmed once each replica client
has read it and sent its contents. The reference is
`client->ref_repl_buf_node`. The replication buffer is written to by
the main thread in `feedReplicationBuffer`, and the refcounting is
intrusive: it lives inside the replication-buffer nodes themselves.

Since the replica client changes refcounts (decreases the refcount of
the node it has just read, and increases the refcount of the next node
it starts to read) during `writeToClient`, we have a data race with the
main thread when it feeds the replication buffer. Moreover, the main
thread also updates the `used` size of each node (how much has been
written to it, compared to its capacity), which the replica client
relies on to know how much to read. Obviously the replica being in an
IO thread creates another data race here. To mitigate these issues, a
few new variables were added to the client struct:

- `io_curr_repl_node` - the node this replica starts reading from
inside the IO thread
- `io_bound_repl_node` - the last node in the replication buffer the
replica sees before being sent to the IO thread

These values are only allowed to be updated in the main thread. The
client keeps track of how far it has read into the buffer via the old
`ref_repl_buf_node`. Generally, while in an IO thread the replica
client will now keep a refcount on `io_curr_repl_node` until it has
processed all the nodes up to `io_bound_repl_node`; at that point it's
returned to the main thread, which can safely update the refcounts.
The `io_bound_repl_node` reference is there so the replica knows when to
stop reading from the repl buffer: imagine that the replica reads from
the last node of the replication buffer while the main thread feeds data
to it; we would create a data race on the `used` value
(`_writeToClientSlave` (IO thread) vs `feedReplicationBuffer` (main)).
That's why this value is updated just before the replica is sent to an
IO thread.
*NOTE*: this means that while replicas are handled by IO threads they
will hold more than one node at a time (i.e. `io_curr_repl_node` up to
`io_bound_repl_node`), so trimming will happen a bit less frequently.
Tests show no significant problems with that.
(Thanks to @ShooterIT for the `io_curr_repl_node` and `io_bound_repl_node`
mechanism; my initial implementation had similar semantics but was far
less clear.)

Example of how this works:

* Replication buffer state at time N:
   | node 0 | ... | node M, used_size K |
* The replica caches `io_curr_repl_node` = node 0, `io_bound_repl_node` = node M
and `io_bound_block_pos` = K.
* The replica moves to an IO thread and processes all the data it sees.
* Replication buffer state at time N + 1:
   | node 0 | ... | node M, used_size Full | | node M+1 | | node M+2, used_size L |,
where Full > K
* The replica moves to the main thread at time N + 1; at this point the
following happens:
   - the refcount of node 0 (`io_curr_repl_node`) is decreased
   - `ref_repl_buf_node` becomes node M (`io_bound_repl_node`); we still
have Full - K bytes to process from there
   - the refcount of node M is increased (now all nodes from 0 up to M-1
inclusive can be trimmed, unless some other replica holds a reference to
them)
   - and just before the replica is sent back to an IO thread, the
following are updated:
      - `io_bound_repl_node` becomes node M+2
      - `io_bound_block_pos` becomes L

Note that the replica client is only moved to the main thread once it
has processed all the data it knows about (i.e. up to
`io_bound_repl_node` + `io_bound_block_pos`).
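
To make the hand-off concrete, here is a self-contained toy model in C
of the two ownership transitions described above; the struct layouts
and helper names are illustrative assumptions, not the PR's actual
code:

```c
#include <stddef.h>

/* Simplified stand-in for a replication buffer node with its
 * intrusive refcount. */
typedef struct replBufBlock {
    int refcount;                 /* replicas referencing this node    */
    size_t size, used;            /* capacity and bytes written so far */
    struct replBufBlock *next;
} replBufBlock;

typedef struct replicaClient {
    replBufBlock *ref_repl_buf_node;  /* refcounted position (main thread) */
    replBufBlock *io_curr_repl_node;  /* start node for the IO thread      */
    replBufBlock *io_bound_repl_node; /* last node the IO thread may read  */
    size_t io_bound_block_pos;        /* safe snapshot of the bound's used */
} replicaClient;

/* Main-thread hand-back: the IO thread has written everything up to
 * io_bound_repl_node/io_bound_block_pos, so the refcount can be moved
 * forward safely and older nodes become trimmable. */
static void onReplicaReturnedToMain(replicaClient *r) {
    r->io_curr_repl_node->refcount--;
    r->io_bound_repl_node->refcount++;
    r->ref_repl_buf_node = r->io_curr_repl_node = r->io_bound_repl_node;
}

/* Main-thread hand-off: snapshot the current tail and its used size
 * just before sending the replica to an IO thread, so the IO thread
 * never races with feedReplicationBuffer() on `used`. */
static void onReplicaSentToIOThread(replicaClient *r, replBufBlock *tail) {
    r->io_bound_repl_node = tail;
    r->io_bound_block_pos = tail->used;
}
```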

### Replica clients kept in main as much as possible

During implementation an issue arose: how quickly can the replica
client learn about new data in the replication buffer, and how quickly
can the buffer be trimmed. For that to happen ASAP, whenever a replica
is moved to the main thread it remains there until the replication
buffer is fed new data. At that point it's put in the pending-write
queue and special-cased in `handleClientsWithPendingWrites` so that
it's sent to an IO thread ASAP to write the new data to the replica.
Also, since the replica always writes out all the repl data it knows
about before returning, once it's back in the main thread
`processClientsFromIOThread` can immediately update the refcounts and
trim whatever it can. A rough sketch of this flow follows.
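
A hypothetical sketch of this "park in main until new data" flow (the
struct layout and helper names are illustrative, not the PR's actual
code):

```c
#include <stdbool.h>

typedef struct replica {
    bool in_main_thread;    /* parked in the main thread, waiting for data */
    bool has_pending_write; /* queued in the pending-write list            */
} replica;

/* Stand-in for the server's real pending-write queueing. */
static void queueForPendingWrite(replica *r) { r->has_pending_write = true; }

/* Conceptually runs after feedReplicationBuffer() appends new data:
 * parked replicas are queued for write, and the special case in
 * handleClientsWithPendingWrites() then ships them back to an IO
 * thread ASAP so the new data reaches the replica quickly. */
static void wakeParkedReplicas(replica **parked, int n) {
    for (int i = 0; i < n; i++)
        if (parked[i]->in_main_thread) queueForPendingWrite(parked[i]);
}
```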

### ACK messages from replicas

Replica clients periodically send `REPLCONF ACK` messages, which the
primary must read. Since a replica client can remain in the main thread
indefinitely if no DB change occurs, a new atomic `pending_read` flag
was added, set during `readQueryFromClient`. If a replica client has a
pending read, it's returned to an IO thread in order to process the
read even if there is no pending repl data to write.
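
A minimal sketch of how such an atomic flag could be consulted (field
and function names are assumptions based on the description above):

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct replicaIOState {
    atomic_bool pending_read;   /* set when a read event arrives */
} replicaIOState;

/* Main thread: decide whether a replica must go back to an IO thread
 * even though there is no new replication data to write to it. */
static bool replicaNeedsIOThread(replicaIOState *s, bool has_pending_repl_data) {
    return has_pending_repl_data ||
           atomic_load_explicit(&s->pending_read, memory_order_acquire);
}
```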

### Replicas during shutdown

During shutdown the main thread pauses write actions and periodically
checks whether all replicas have reached the same replication offset as
the primary node. During `finishShutdown` that may or may not be the
case. Either way, client data may be read from the replicas, and we may
even try to write any pending data to them inside
`flushSlavesOutputBuffers`. To prevent races, all the replicas are
moved from IO threads to the main thread via `fetchClientFromIOThread`.
Cancelling the shutdown should be fine, since the mechanism employed by
`handleClientsWithPendingWrites` returns the client to an IO thread
when needed.

## Notes

While adding new tests, timing issues with TSan tests were found and
fixed.

There is also a data race, caught by TSan, on the `last_error` member
of the `client` struct. It happens when both an IO thread and the main
thread make a syscall using the same `client` instance; this can happen
only for primary and replica clients, since their data can be accessed
by commands sent from other clients. A specific example is the `INFO
REPLICATION` command.
Although other such races were fixed, as described above, this one is
insignificant and it was decided to suppress it in `tsan.sup`.

---------

Co-authored-by: Yuan Wang <wangyuancode@163.com>
Co-authored-by: Yuan Wang <yuan.wang@redis.com>
2026-01-21 16:19:12 +02:00


set testmodule [file normalize tests/modules/propagate.so]
set miscmodule [file normalize tests/modules/misc.so]
set keyspace_events [file normalize tests/modules/keyspace_events.so]
tags "modules external:skip" {
test {Modules can propagate in async and threaded contexts} {
start_server [list overrides [list loadmodule "$testmodule"]] {
set replica [srv 0 client]
set replica_host [srv 0 host]
set replica_port [srv 0 port]
$replica module load $keyspace_events
start_server [list overrides [list loadmodule "$testmodule"]] {
set master [srv 0 client]
set master_host [srv 0 host]
set master_port [srv 0 port]
$master module load $keyspace_events
# Start the replication process...
$replica replicaof $master_host $master_port
wait_for_sync $replica
after 1000
test {module propagates from timer} {
set repl [attach_to_replication_stream]
$master propagate-test.timer
wait_for_condition 500 10 {
[$replica get timer] eq "3"
} else {
fail "The two counters don't match the expected value."
}
assert_replication_stream $repl {
{select *}
{incr timer}
{incr timer}
{incr timer}
}
close_replication_stream $repl
}
test {module propagation with notifications} {
set repl [attach_to_replication_stream]
$master set x y
assert_replication_stream $repl {
{multi}
{select *}
{incr notifications}
{set x y}
{exec}
}
close_replication_stream $repl
}
test {module propagation with notifications with multi} {
set repl [attach_to_replication_stream]
$master multi
$master set x1 y1
$master set x2 y2
$master exec
assert_replication_stream $repl {
{multi}
{select *}
{incr notifications}
{set x1 y1}
{incr notifications}
{set x2 y2}
{exec}
}
close_replication_stream $repl
}
test {module propagation with notifications with active-expire} {
$master debug set-active-expire 1
set repl [attach_to_replication_stream]
$master set asdf1 1 PX 300
$master set asdf2 2 PX 300
$master set asdf3 3 PX 300
wait_for_condition 500 10 {
[$replica keys asdf*] eq {}
} else {
fail "Not all keys have expired"
}
# Note the double notifications: SET with PX issues two separate
# notifications, one for "set" and one for "expire"
assert_replication_stream $repl {
{multi}
{select *}
{incr notifications}
{incr notifications}
{set asdf1 1 PXAT *}
{exec}
{multi}
{incr notifications}
{incr notifications}
{set asdf2 2 PXAT *}
{exec}
{multi}
{incr notifications}
{incr notifications}
{set asdf3 3 PXAT *}
{exec}
{multi}
{incr notifications}
{incr notifications}
{incr testkeyspace:expired}
{del asdf*}
{exec}
{multi}
{incr notifications}
{incr notifications}
{incr testkeyspace:expired}
{del asdf*}
{exec}
{multi}
{incr notifications}
{incr notifications}
{incr testkeyspace:expired}
{del asdf*}
{exec}
}
close_replication_stream $repl
$master debug set-active-expire 0
}
test {module propagation with notifications with eviction case 1} {
$master flushall
$master set asdf1 1
$master set asdf2 2
$master set asdf3 3
$master config set maxmemory-policy allkeys-random
$master config set maxmemory 1
# Please note the following loop:
# We evict a key and send a notification, which does INCR on the "notifications" key, so
# every time we evict any key, the "notifications" key exists (this happens inside the
# performEvictions loop). So even evicting "notifications" causes an INCR on "notifications".
# If maxmemory-eviction-tenacity were set to 100 this would be an endless loop, but
# since the default is 10, at some point the performEvictions loop ends.
# Bottom line: "notifications" always exists and we can't really determine the order of evictions.
# This test is here only for sanity.
# The replica will get the notification with multi exec and we have a generic notification handler
# that performs `RedisModule_Call(ctx, "INCR", "c", "multi");` if the notification is inside multi exec,
# so we will have 2 keys, "notifications" and "multi".
wait_for_condition 500 10 {
[$replica dbsize] eq 2
} else {
fail "Not all keys have been evicted"
}
$master config set maxmemory 0
$master config set maxmemory-policy noeviction
}
test {module propagation with notifications with eviction case 2} {
$master flushall
set repl [attach_to_replication_stream]
$master set asdf1 1 EX 300
$master set asdf2 2 EX 300
$master set asdf3 3 EX 300
# Please note we use volatile eviction to prevent the loop described in the test above.
# "notifications" is not volatile so it always remains
$master config resetstat
$master config set maxmemory-policy volatile-ttl
$master config set maxmemory 1
wait_for_condition 500 10 {
[s evicted_keys] eq 3
} else {
fail "Not all keys have been evicted"
}
$master config set maxmemory 0
$master config set maxmemory-policy noeviction
$master set asdf4 4
# Note the double notifications: SET with EX issues two separate
# notifications, one for "set" and one for "expire"
# Note that although CONFIG SET maxmemory is called in this flow (see issue #10014),
# eviction will happen and will not induce propagation of the CONFIG command (see #10019).
assert_replication_stream $repl {
{multi}
{select *}
{incr notifications}
{incr notifications}
{set asdf1 1 PXAT *}
{exec}
{multi}
{incr notifications}
{incr notifications}
{set asdf2 2 PXAT *}
{exec}
{multi}
{incr notifications}
{incr notifications}
{set asdf3 3 PXAT *}
{exec}
{multi}
{incr notifications}
{del asdf*}
{exec}
{multi}
{incr notifications}
{del asdf*}
{exec}
{multi}
{incr notifications}
{del asdf*}
{exec}
{multi}
{incr notifications}
{set asdf4 4}
{exec}
}
close_replication_stream $repl
}
test {module propagation with timer and CONFIG SET maxmemory} {
set repl [attach_to_replication_stream]
$master config resetstat
$master config set maxmemory-policy volatile-random
$master propagate-test.timer-maxmemory
# Wait until the volatile keys are evicted
wait_for_condition 500 10 {
[s evicted_keys] eq 2
} else {
fail "Not all keys have been evicted"
}
assert_replication_stream $repl {
{multi}
{select *}
{incr notifications}
{incr notifications}
{set timer-maxmemory-volatile-start 1 PXAT *}
{incr timer-maxmemory-middle}
{incr notifications}
{incr notifications}
{set timer-maxmemory-volatile-end 1 PXAT *}
{exec}
{multi}
{incr notifications}
{del timer-maxmemory-volatile-*}
{exec}
{multi}
{incr notifications}
{del timer-maxmemory-volatile-*}
{exec}
}
close_replication_stream $repl
$master config set maxmemory 0
$master config set maxmemory-policy noeviction
}
test {module propagation with timer and EVAL} {
set repl [attach_to_replication_stream]
$master propagate-test.timer-eval
assert_replication_stream $repl {
{multi}
{select *}
{incr notifications}
{incrby timer-eval-start 1}
{incr notifications}
{set foo bar}
{incr timer-eval-middle}
{incr notifications}
{incrby timer-eval-end 1}
{exec}
}
close_replication_stream $repl
}
test {module propagates nested ctx case1} {
set repl [attach_to_replication_stream]
$master propagate-test.timer-nested
wait_for_condition 500 10 {
[$replica get timer-nested-end] eq "1"
} else {
fail "The two counters don't match the expected value."
}
assert_replication_stream $repl {
{multi}
{select *}
{incrby timer-nested-start 1}
{incrby timer-nested-end 1}
{exec}
}
close_replication_stream $repl
# Note propagate-test.timer-nested just propagates INCRBY, causing an
# inconsistency, so we flush
$master flushall
}
test {module propagates nested ctx case2} {
set repl [attach_to_replication_stream]
$master propagate-test.timer-nested-repl
wait_for_condition 500 10 {
[$replica get timer-nested-end] eq "1"
} else {
fail "The two counters don't match the expected value."
}
assert_replication_stream $repl {
{multi}
{select *}
{incrby timer-nested-start 1}
{incr notifications}
{incr using-call}
{incr counter-1}
{incr counter-2}
{incr counter-3}
{incr counter-4}
{incr notifications}
{incr after-call}
{incr notifications}
{incr before-call-2}
{incr notifications}
{incr asdf}
{incr notifications}
{del asdf}
{incr notifications}
{incr after-call-2}
{incr notifications}
{incr timer-nested-middle}
{incrby timer-nested-end 1}
{exec}
}
close_replication_stream $repl
# Note propagate-test.timer-nested-repl just propagates INCRBY, causing an
# inconsistency, so we flush
$master flushall
}
test {module propagates from thread} {
set repl [attach_to_replication_stream]
$master propagate-test.thread
wait_for_condition 500 10 {
[$replica get a-from-thread] eq "3"
} else {
fail "The two counters don't match the expected value."
}
assert_replication_stream $repl {
{multi}
{select *}
{incr a-from-thread}
{incr notifications}
{incr thread-call}
{incr b-from-thread}
{exec}
{multi}
{incr a-from-thread}
{incr notifications}
{incr thread-call}
{incr b-from-thread}
{exec}
{multi}
{incr a-from-thread}
{incr notifications}
{incr thread-call}
{incr b-from-thread}
{exec}
}
close_replication_stream $repl
}
test {module propagates from thread with detached ctx} {
set repl [attach_to_replication_stream]
$master propagate-test.detached-thread
wait_for_condition 500 10 {
[$replica get thread-detached-after] eq "1"
} else {
fail "The key doesn't match the expected value."
}
assert_replication_stream $repl {
{multi}
{select *}
{incr thread-detached-before}
{incr notifications}
{incr thread-detached-1}
{incr notifications}
{incr thread-detached-2}
{incr thread-detached-after}
{exec}
}
close_replication_stream $repl
}
test {module propagates from command} {
set repl [attach_to_replication_stream]
$master propagate-test.simple
$master propagate-test.mixed
assert_replication_stream $repl {
{multi}
{select *}
{incr counter-1}
{incr counter-2}
{exec}
{multi}
{incr notifications}
{incr using-call}
{incr counter-1}
{incr counter-2}
{incr notifications}
{incr after-call}
{exec}
}
close_replication_stream $repl
}
test {module propagates from EVAL} {
set repl [attach_to_replication_stream]
assert_equal [ $master eval { \
redis.call("propagate-test.simple"); \
redis.call("set", "x", "y"); \
redis.call("propagate-test.mixed"); return "OK" } 0 ] {OK}
assert_replication_stream $repl {
{multi}
{select *}
{incr counter-1}
{incr counter-2}
{incr notifications}
{set x y}
{incr notifications}
{incr using-call}
{incr counter-1}
{incr counter-2}
{incr notifications}
{incr after-call}
{exec}
}
close_replication_stream $repl
}
test {module propagates from command after good EVAL} {
set repl [attach_to_replication_stream]
assert_equal [ $master eval { return "hello" } 0 ] {hello}
$master propagate-test.simple
$master propagate-test.mixed
assert_replication_stream $repl {
{multi}
{select *}
{incr counter-1}
{incr counter-2}
{exec}
{multi}
{incr notifications}
{incr using-call}
{incr counter-1}
{incr counter-2}
{incr notifications}
{incr after-call}
{exec}
}
close_replication_stream $repl
}
test {module propagates from command after bad EVAL} {
set repl [attach_to_replication_stream]
catch { $master eval { return "hello" } -12 } e
assert_equal $e {ERR Number of keys can't be negative}
$master propagate-test.simple
$master propagate-test.mixed
assert_replication_stream $repl {
{multi}
{select *}
{incr counter-1}
{incr counter-2}
{exec}
{multi}
{incr notifications}
{incr using-call}
{incr counter-1}
{incr counter-2}
{incr notifications}
{incr after-call}
{exec}
}
close_replication_stream $repl
}
test {module propagates from multi-exec} {
set repl [attach_to_replication_stream]
$master multi
$master propagate-test.simple
$master propagate-test.mixed
$master propagate-test.timer-nested-repl
$master exec
wait_for_condition 500 10 {
[$replica get timer-nested-end] eq "1"
} else {
fail "The two counters don't match the expected value."
}
assert_replication_stream $repl {
{multi}
{select *}
{incr counter-1}
{incr counter-2}
{incr notifications}
{incr using-call}
{incr counter-1}
{incr counter-2}
{incr notifications}
{incr after-call}
{exec}
{multi}
{incrby timer-nested-start 1}
{incr notifications}
{incr using-call}
{incr counter-1}
{incr counter-2}
{incr counter-3}
{incr counter-4}
{incr notifications}
{incr after-call}
{incr notifications}
{incr before-call-2}
{incr notifications}
{incr asdf}
{incr notifications}
{del asdf}
{incr notifications}
{incr after-call-2}
{incr notifications}
{incr timer-nested-middle}
{incrby timer-nested-end 1}
{exec}
}
close_replication_stream $repl
# Note propagate-test.timer-nested just propagates INCRBY, causing an
# inconsistency, so we flush
$master flushall
}
test {module RM_Call of expired key propagation} {
$master debug set-active-expire 0
$master set k1 900 px 100
after 110
set repl [attach_to_replication_stream]
$master propagate-test.incr k1
assert_replication_stream $repl {
{multi}
{select *}
{del k1}
{propagate-test.incr k1}
{exec}
}
close_replication_stream $repl
assert_equal [$master get k1] 1
assert_equal [$master ttl k1] -1
wait_for_condition 50 100 {
[$replica get k1] eq 1 &&
[$replica ttl k1] eq -1
} else {
fail "failed RM_Call of expired key propagation"
}
}
test {module notification on set} {
set repl [attach_to_replication_stream]
$master SADD s foo
wait_for_condition 500 10 {
[$replica SCARD s] eq "1"
} else {
fail "Failed to wait for set to be replicated"
}
$master SPOP s 1
wait_for_condition 500 10 {
[$replica SCARD s] eq "0"
} else {
fail "Failed to wait for set to be replicated"
}
# Currently the `del` command comes after the notification.
# When we fix SPOP to fire the notification at the end (like all other commands),
# the `del` will come first.
assert_replication_stream $repl {
{multi}
{select *}
{incr notifications}
{sadd s foo}
{exec}
{multi}
{incr notifications}
{incr notifications}
{del s}
{exec}
}
close_replication_stream $repl
}
test {module key miss notification does not cause read command to be replicated} {
set repl [attach_to_replication_stream]
$master flushall
$master get unexisting_key
wait_for_condition 500 10 {
[$replica get missed] eq "1"
} else {
fail "Failed to wait for set to be replicated"
}
# This test checks a wrong (!) behavior that causes a read command to be replicated to the replica/AOF.
# We keep the test to verify that such wrong behavior does not cause any crashes.
assert_replication_stream $repl {
{select *}
{flushall}
{multi}
{incr notifications}
{incr missed}
{get unexisting_key}
{exec}
}
close_replication_stream $repl
}
test "Unload the module - propagate-test/testkeyspace" {
assert_equal {OK} [r module unload propagate-test]
assert_equal {OK} [r module unload testkeyspace]
}
assert_equal [s -1 unexpected_error_replies] 0
}
}
}
}
tags "modules aof external:skip" {
foreach aofload_type {debug_cmd startup} {
test "Modules RM_Replicate replicates MULTI/EXEC correctly: AOF-load type $aofload_type" {
start_server [list overrides [list loadmodule "$testmodule"]] {
# Enable the AOF
r config set appendonly yes
r config set auto-aof-rewrite-percentage 0 ; # Disable auto-rewrite.
waitForBgrewriteaof r
r propagate-test.simple
r propagate-test.mixed
r multi
r propagate-test.simple
r propagate-test.mixed
r exec
assert_equal [r get counter-1] {}
assert_equal [r get counter-2] {}
assert_equal [r get using-call] 2
assert_equal [r get after-call] 2
assert_equal [r get notifications] 4
# Load the AOF
if {$aofload_type == "debug_cmd"} {
r debug loadaof
} else {
r config rewrite
restart_server 0 true false
wait_done_loading r
}
# This module misbehaves on purpose: it only calls
# RM_Replicate for counter-1 and counter-2, so values
# after AOF load are different
assert_equal [r get counter-1] 4
assert_equal [r get counter-2] 4
assert_equal [r get using-call] 2
assert_equal [r get after-call] 2
# 4+4+2+2 commands from AOF (just above) + 4 "INCR notifications" from AOF + 4 notifications for these INCRs
assert_equal [r get notifications] 20
assert_equal {OK} [r module unload propagate-test]
assert_equal [s 0 unexpected_error_replies] 0
}
}
test "Modules RM_Call does not update stats during aof load: AOF-load type $aofload_type" {
start_server [list overrides [list loadmodule "$miscmodule"]] {
# Enable the AOF
r config set appendonly yes
r config set auto-aof-rewrite-percentage 0 ; # Disable auto-rewrite.
waitForBgrewriteaof r
r config resetstat
r set foo bar
r EVAL {return redis.call('SET', KEYS[1], ARGV[1])} 1 foo bar2
r test.rm_call_replicate set foo bar3
r EVAL {return redis.call('test.rm_call_replicate',ARGV[1],KEYS[1],ARGV[2])} 1 foo set bar4
r multi
r set foo bar5
r EVAL {return redis.call('SET', KEYS[1], ARGV[1])} 1 foo bar6
r test.rm_call_replicate set foo bar7
r EVAL {return redis.call('test.rm_call_replicate',ARGV[1],KEYS[1],ARGV[2])} 1 foo set bar8
r exec
assert_match {*calls=8,*,rejected_calls=0,failed_calls=0} [cmdrstat set r]
# Load the AOF
if {$aofload_type == "debug_cmd"} {
r config resetstat
r debug loadaof
} else {
r config rewrite
restart_server 0 true false
wait_done_loading r
}
assert_no_match {*calls=*} [cmdrstat set r]
}
}
}
}
# This test does not really test module functionality, but rather uses a module
# command to test Redis replication mechanisms.
test {Replicas that were marked as CLIENT_CLOSE_ASAP should not keep the replication backlog from being trimmed} {
start_server [list overrides [list loadmodule "$testmodule"] tags {"external:skip"}] {
set replica [srv 0 client]
start_server [list overrides [list loadmodule "$testmodule"] tags {"external:skip"}] {
set master [srv 0 client]
set master_host [srv 0 host]
set master_port [srv 0 port]
$master config set client-output-buffer-limit "replica 10mb 5mb 0"
# Start the replication process...
$replica replicaof $master_host $master_port
wait_for_sync $replica
test {replication backlog is trimmed after replica is scheduled to be closed} {
# Replicate large commands to make the replica disconnected.
$master write [format_command propagate-test.verbatim 100000 [string repeat "a" 1000]] ;# almost 100mb
# Execute this command together with module commands within the same
# event loop to prevent periodic cleanup of replication backlog.
$master write [format_command info memory]
$master flush
$master read ;# propagate-test.verbatim
set res [$master read] ;# info memory
# Wait for the replica to be disconnected.
wait_for_log_messages 0 {"*flags=S*scheduled to be closed ASAP for overcoming of output buffer limits*"} 0 1500 10
# Due to the replica reaching the soft limit (5MB), memory peaks should not significantly
# exceed the replica soft limit. Furthermore, as the replica releases its reference to the
# replication backlog, the backlog should be properly trimmed, and its memory usage
# should not significantly exceed repl-backlog-size (default 1MB).
assert_lessthan [getInfoProperty $res used_memory_peak] 10000000;# less than 10mb
assert_lessthan [getInfoProperty $res mem_replication_backlog] 2000000;# less than 2mb
}
}
}
}