# Problem

While introducing async IO threads (https://github.com/redis/redis/pull/13695), primary and replica clients were left to be handled inside the main thread due to data-race and synchronization issues. This PR solves that, with the additional hope that it improves replication performance.

# Overview

## Moving the clients to IO threads

Since these clients first go through a handshake and an RDB replication phase, they are moved to an IO thread only after RDB replication is done. For the primary client this was trivial, as the master client is created only after the RDB sync (plus some additional checks, see `isClientMustHandledByMainThread`). Replica clients, however, are moved to IO threads immediately after connection (as all clients are), so currently in `unstable` replication happens while such a client is in an IO thread. In this PR the client is moved back to the main thread after receiving the first `REPLCONF` message from the replica; this is a bit hacky and we can remove it. I didn't find issues between the two versions.

## Primary client (replica node)

We have a few issues here:
- During `serverCron`, `replicationCron` runs; it periodically sends a `REPLCONF ACK` message to the master and checks for a timed-out master. To prevent data races we utilize `IOThreadClientsCron`: the client is periodically sent to the main thread, and during `processClientsFromIOThread` we check whether it needs to run the replication-cron behaviour.
- Data races with the main thread: the `lastinteraction` and `read_reploff` members of the primary client, which are written in `readQueryFromClient`, could be accessed concurrently by the main thread during execution of `INFO REPLICATION` (`genRedisInfoString`). To solve this the members were duplicated: while the client is in an IO thread it writes to the duplicates, and they are synced with the original variables each time the client is sent to the main thread (which means `INFO REPLICATION` could potentially return stale values). See the sketch after this list.
- During `freeClient` the primary client is fetched to the main thread, but when caching it (`replicationCacheMaster`) its thread id would remain that of the IO thread it came from, which creates problems when resurrecting the master client. The call to `unbindClientFromIOThreadEventLoop` in `freeClient` was therefore rewritten to call `keepClientInMainThread`, which automatically fixes the problem.
- During `exitScriptTimedoutMode` the master is queued for reprocessing (specifically, to process any pending commands ASAP after it's unblocked) by putting it in the `server.unblocked_clients` list, which is processed in the next `beforeSleep` cycle in the main thread. Since this would create contention between the main and IO threads, we skip the `unblocked_clients` queueing and just queue the client to the main thread: `processClientsFromIOThread` will process the pending commands just as the main thread would have.
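A minimal sketch of the duplicated-fields idea, assuming shadow fields and a sync helper (the names `io_lastinteraction`, `io_read_reploff` and `syncPrimaryClientStats` are illustrative, not the patch's actual identifiers):

```c
#include <time.h>

/* Illustrative only: shadow copies let the IO thread write freely while
 * the main thread keeps reading the originals without a race. */
typedef struct client {
    time_t lastinteraction;     /* Read by main thread (INFO REPLICATION). */
    long long read_reploff;     /* Read by main thread.                    */
    time_t io_lastinteraction;  /* Written by the owning IO thread.        */
    long long io_read_reploff;  /* Written by the owning IO thread.        */
} client;

/* Main thread, each time the client is handed back (e.g. from
 * processClientsFromIOThread): publish the IO thread's progress.
 * Until this runs, INFO REPLICATION may report slightly stale values. */
static void syncPrimaryClientStats(client *c) {
    c->lastinteraction = c->io_lastinteraction;
    c->read_reploff = c->io_read_reploff;
}
```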
## Replica clients (primary node)

We move a replica client to an IO thread after RDB replication is done and after the replication backlog has been fed its first message. This way the client's reference to the first replication-backlog node is initialized before it's read from the IO thread, so there is no contention with the main thread on it.

### Shared replication buffer

Currently in `unstable` the replication buffer is shared among clients: clients hold references to the nodes inside the buffer, and a node can be trimmed once every replica client has read it and sent its contents. The reference is `client->ref_repl_buf_node`.

The replication buffer is written by the main thread in `feedReplicationBuffer`, and the refcounting is intrusive - it lives inside the replication-buffer nodes themselves. Since the replica client changes refcounts during `writeToClient` (it decreases the refcount of the node it has just read and increases the refcount of the next node it starts to read), we have a data race with the main thread when it feeds the replication buffer. Moreover, the main thread also updates a node's `used` size - how much has been written into it, compared to its capacity - which the replica client relies on to know how much to read. A replica sitting in an IO thread obviously creates another data race here. To mitigate these issues, a few new fields were added to the client struct (see the sketch below):

- `io_curr_repl_node` - the node this replica starts reading from inside the IO thread.
- `io_bound_repl_node` - the last node in the replication buffer the replica sees before being sent to an IO thread.

These values may only be updated in the main thread. The client keeps track of how far it has read into the buffer via the old `ref_repl_buf_node`. While in an IO thread, the replica client now keeps a refcount on `io_curr_repl_node` until it has processed all the nodes up to `io_bound_repl_node`; at that point it is returned to the main thread, which can safely update the refcounts. The `io_bound_repl_node` reference exists so the replica knows when to stop reading from the repl buffer: imagine the replica reading from the last node of the replication buffer while the main thread feeds data into it - that would be a data race on the `used` value (`_writeToClientSlave`, IO thread, vs. `feedReplicationBuffer`, main thread). That's why this value is updated just before the replica is sent to an IO thread.

*NOTE*: this means that while replicas are handled by IO threads they hold more than one node at a time (`io_curr_repl_node` up to `io_bound_repl_node`), so trimming happens a bit less frequently. Tests show no significant problems with that. (Thanks to @ShooterIT for the `io_curr_repl_node`/`io_bound_repl_node` mechanism; my initial implementation had similar semantics but was far less clear.)

Example of how this works:
* Replication buffer state at time N: | node 0 | ... | node M, used_size K |
* The replica caches `io_curr_repl_node` = node 0, `io_bound_repl_node` = node M and `io_bound_block_pos` = K.
* The replica moves to an IO thread and processes all the data it sees.
* Replication buffer state at time N+1: | node 0 | ... | node M, used_size Full | node M+1 | node M+2, used_size L |, where Full > K.
* The replica moves to the main thread at time N+1, at which point the following happens:
  - The refcount of node 0 (`io_curr_repl_node`) is decreased.
  - `ref_repl_buf_node` becomes node M (`io_bound_repl_node`); we still have the bytes from position K onward to process there.
  - The refcount of node M is increased (now all nodes from 0 up to and including M-1 can be trimmed, unless some other replica holds a reference to them).
  - Just before the replica is sent back to an IO thread, the following are updated:
    - `io_bound_repl_node` becomes node M+2.
    - `io_bound_block_pos` becomes L.

Note that a replica client is only moved to the main thread once it has processed all the data it knows about (i.e. up to `io_bound_repl_node` + `io_bound_block_pos`).
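In rough C, the main-thread handoff described above might look like this - a sketch only, assuming a `replBufBlock`-style node with an intrusive refcount; the function names and exact field set are illustrative, not the patch's actual code:

```c
#include <stddef.h>

/* Simplified replication-buffer node; the real node in replication.c
 * carries more bookkeeping (id, repl_offset, list links, ...). */
typedef struct bufNode {
    long refcount;          /* Intrusive refcount, lives in the node itself.  */
    long long size, used;   /* Capacity vs. bytes written (main thread only). */
    char *buf;
} bufNode;

typedef struct replica {
    bufNode *ref_repl_buf_node;   /* Read progress, as in unstable.           */
    bufNode *io_curr_repl_node;   /* Start node pinned while in an IO thread. */
    bufNode *io_bound_repl_node;  /* Last node the IO thread may touch.       */
    long long io_bound_block_pos; /* 'used' of the bound node at capture time. */
} replica;

/* Main thread: the replica consumed everything up to the bound node/pos
 * inside the IO thread and was handed back; refcounts can now move safely. */
void onReplicaBackInMain(replica *r) {
    r->io_curr_repl_node->refcount--;         /* release the old start node */
    r->ref_repl_buf_node = r->io_bound_repl_node;
    r->ref_repl_buf_node->refcount++;         /* pin the new start node */
    /* Every node before the new start node is now trim-eligible
     * once no other replica holds a reference to it. */
}

/* Main thread: capture a stable bound just before handing the replica to
 * an IO thread; 'tail' is the current tail of the shared buffer, and the
 * IO thread will never read past tail/io_bound_block_pos. */
void onReplicaSentToIO(replica *r, bufNode *tail) {
    r->io_curr_repl_node = r->ref_repl_buf_node;
    r->io_bound_repl_node = tail;
    r->io_bound_block_pos = tail->used;
}
```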
### Replica clients kept in main as much as possible

During implementation an issue arose: how quickly can the replica client learn about new data in the replication buffer, and how quickly can that buffer be trimmed? For this to happen ASAP, whenever a replica is moved to the main thread it remains there until the replication buffer is fed new data. At that point it's put in the pending-write queue and special-cased in `handleClientsWithPendingWrites` so that it's sent to an IO thread ASAP to write the new data to the replica. Also, since the replica writes out all the repl data it knows about each time, `processClientsFromIOThread` can immediately update the refcounts and trim whatever it can once the client is back in the main thread.

### ACK messages from replicas

The primary needs to periodically read the `REPLCONF ACK` messages its replica clients send. Since a replica client can remain in the main thread indefinitely if no DB change occurs, a new atomic flag, `pending_read`, was added; it is set during `readQueryFromClient`. If a replica client has a pending read, it's returned to an IO thread to process the read even if there is no pending repl data to write (see the sketch at the end of this description).

### Replicas during shutdown

During shutdown the main thread pauses writes and periodically checks whether all replicas have reached the same replication offset as the primary node. By `finishShutdown` that may or may not be the case; either way, client data may be read from the replicas, and we may even try to write pending data to them inside `flushSlavesOutputBuffers`. To prevent races, all replicas are moved from IO threads to the main thread via `fetchClientFromIOThread`. Cancelling the shutdown should be fine, since the mechanism employed by `handleClientsWithPendingWrites` returns the client to an IO thread when needed.

## Notes

While adding new tests, timing issues in the TSan tests were found and fixed. There is also a data race caught by TSan on the `last_error` member of the `client` struct. It happens when both an IO thread and the main thread make a syscall using the same `client` instance; this can happen only for primary and replica clients, since their data can be accessed by commands sent from other clients (a specific example is `INFO REPLICATION`). Although other such races were fixed, as described above, this one is insignificant and it was decided to ignore it in `tsan.sup`.

---------

Co-authored-by: Yuan Wang <wangyuancode@163.com>
Co-authored-by: Yuan Wang <yuan.wang@redis.com>
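To make the `pending_read` handoff described in the ACK section above concrete, a minimal sketch - the field name comes from the description, but the helper names and memory-order choices here are assumptions, not the patch's actual API:

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct replicaClient {
    atomic_bool pending_read;  /* Set when input arrived (e.g. REPLCONF ACK)
                                * while the client was parked in main. */
} replicaClient;

/* Read path: note that the replica sent us something. */
void markPendingRead(replicaClient *r) {
    atomic_store_explicit(&r->pending_read, true, memory_order_release);
}

/* Main thread: even with no pending repl data to write, a replica with a
 * pending read must go back to an IO thread so the read gets processed. */
bool mustReturnToIOThread(replicaClient *r) {
    return atomic_exchange_explicit(&r->pending_read, false,
                                    memory_order_acq_rel);
}
```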
set testmodule [file normalize tests/modules/propagate.so]
set miscmodule [file normalize tests/modules/misc.so]
set keyspace_events [file normalize tests/modules/keyspace_events.so]

tags "modules external:skip" {
|
|
test {Modules can propagate in async and threaded contexts} {
|
|
start_server [list overrides [list loadmodule "$testmodule"]] {
|
|
set replica [srv 0 client]
|
|
set replica_host [srv 0 host]
|
|
set replica_port [srv 0 port]
|
|
$replica module load $keyspace_events
|
|
start_server [list overrides [list loadmodule "$testmodule"]] {
|
|
set master [srv 0 client]
|
|
set master_host [srv 0 host]
|
|
set master_port [srv 0 port]
|
|
$master module load $keyspace_events
|
|
|
|
# Start the replication process...
|
|
$replica replicaof $master_host $master_port
|
|
wait_for_sync $replica
|
|
after 1000
|
|
|
|
                test {module propagates from timer} {
                    set repl [attach_to_replication_stream]

                    $master propagate-test.timer

                    wait_for_condition 500 10 {
                        [$replica get timer] eq "3"
                    } else {
                        fail "The two counters don't match the expected value."
                    }

                    assert_replication_stream $repl {
                        {select *}
                        {incr timer}
                        {incr timer}
                        {incr timer}
                    }
                    close_replication_stream $repl
                }

                test {module propagation with notifications} {
                    set repl [attach_to_replication_stream]

                    $master set x y

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr notifications}
                        {set x y}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagation with notifications with multi} {
                    set repl [attach_to_replication_stream]

                    $master multi
                    $master set x1 y1
                    $master set x2 y2
                    $master exec

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr notifications}
                        {set x1 y1}
                        {incr notifications}
                        {set x2 y2}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagation with notifications with active-expire} {
                    $master debug set-active-expire 1
                    set repl [attach_to_replication_stream]

                    $master set asdf1 1 PX 300
                    $master set asdf2 2 PX 300
                    $master set asdf3 3 PX 300

                    wait_for_condition 500 10 {
                        [$replica keys asdf*] eq {}
                    } else {
                        fail "Not all keys have expired"
                    }

                    # Note the double notifications: SET with PX issues two separate
                    # notifications, one for "set" and one for "expire".
                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr notifications}
                        {incr notifications}
                        {set asdf1 1 PXAT *}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {set asdf2 2 PXAT *}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {set asdf3 3 PXAT *}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {incr testkeyspace:expired}
                        {del asdf*}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {incr testkeyspace:expired}
                        {del asdf*}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {incr testkeyspace:expired}
                        {del asdf*}
                        {exec}
                    }
                    close_replication_stream $repl

                    $master debug set-active-expire 0
                }

                test {module propagation with notifications with eviction case 1} {
                    $master flushall
                    $master set asdf1 1
                    $master set asdf2 2
                    $master set asdf3 3

                    $master config set maxmemory-policy allkeys-random
                    $master config set maxmemory 1

                    # Please note the following loop:
                    # We evict a key and send a notification, which does an INCR on the
                    # "notifications" key, so every time we evict any key the "notifications"
                    # key exists (this happens inside the performEvictions loop). So even
                    # evicting "notifications" causes an INCR on "notifications".
                    # If maxmemory-eviction-tenacity were set to 100 this would be an endless
                    # loop, but since the default is 10, the performEvictions loop ends at
                    # some point.
                    # Bottom line: "notifications" always exists and we can't really determine
                    # the order of evictions. This test is here only for sanity.

                    # The replica will get the notifications within MULTI/EXEC, and we have a
                    # generic notification handler that performs
                    # `RedisModule_Call(ctx, "INCR", "c", "multi");` if the notification is
                    # inside MULTI/EXEC, so we will have 2 keys, "notifications" and "multi".
                    wait_for_condition 500 10 {
                        [$replica dbsize] eq 2
                    } else {
                        fail "Not all keys have been evicted"
                    }

                    $master config set maxmemory 0
                    $master config set maxmemory-policy noeviction
                }

                test {module propagation with notifications with eviction case 2} {
                    $master flushall
                    set repl [attach_to_replication_stream]

                    $master set asdf1 1 EX 300
                    $master set asdf2 2 EX 300
                    $master set asdf3 3 EX 300

                    # Please note we use volatile eviction to prevent the loop described in
                    # the test above; "notifications" is not volatile, so it always remains.
                    $master config resetstat
                    $master config set maxmemory-policy volatile-ttl
                    $master config set maxmemory 1

                    wait_for_condition 500 10 {
                        [s evicted_keys] eq 3
                    } else {
                        fail "Not all keys have been evicted"
                    }

                    $master config set maxmemory 0
                    $master config set maxmemory-policy noeviction

                    $master set asdf4 4

                    # Note the double notifications: SET with EX issues two separate
                    # notifications, one for "set" and one for "expire".
                    # Note that although CONFIG SET maxmemory is called in this flow (see issue #10014),
                    # eviction will happen and will not induce propagation of the CONFIG command (see #10019).
                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr notifications}
                        {incr notifications}
                        {set asdf1 1 PXAT *}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {set asdf2 2 PXAT *}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {set asdf3 3 PXAT *}
                        {exec}
                        {multi}
                        {incr notifications}
                        {del asdf*}
                        {exec}
                        {multi}
                        {incr notifications}
                        {del asdf*}
                        {exec}
                        {multi}
                        {incr notifications}
                        {del asdf*}
                        {exec}
                        {multi}
                        {incr notifications}
                        {set asdf4 4}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagation with timer and CONFIG SET maxmemory} {
                    set repl [attach_to_replication_stream]

                    $master config resetstat
                    $master config set maxmemory-policy volatile-random

                    $master propagate-test.timer-maxmemory

                    # Wait until the volatile keys are evicted
                    wait_for_condition 500 10 {
                        [s evicted_keys] eq 2
                    } else {
                        fail "Not all keys have been evicted"
                    }

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr notifications}
                        {incr notifications}
                        {set timer-maxmemory-volatile-start 1 PXAT *}
                        {incr timer-maxmemory-middle}
                        {incr notifications}
                        {incr notifications}
                        {set timer-maxmemory-volatile-end 1 PXAT *}
                        {exec}
                        {multi}
                        {incr notifications}
                        {del timer-maxmemory-volatile-*}
                        {exec}
                        {multi}
                        {incr notifications}
                        {del timer-maxmemory-volatile-*}
                        {exec}
                    }
                    close_replication_stream $repl

                    $master config set maxmemory 0
                    $master config set maxmemory-policy noeviction
                }

                test {module propagation with timer and EVAL} {
                    set repl [attach_to_replication_stream]

                    $master propagate-test.timer-eval

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr notifications}
                        {incrby timer-eval-start 1}
                        {incr notifications}
                        {set foo bar}
                        {incr timer-eval-middle}
                        {incr notifications}
                        {incrby timer-eval-end 1}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagates nested ctx case1} {
                    set repl [attach_to_replication_stream]

                    $master propagate-test.timer-nested

                    wait_for_condition 500 10 {
                        [$replica get timer-nested-end] eq "1"
                    } else {
                        fail "The two counters don't match the expected value."
                    }

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incrby timer-nested-start 1}
                        {incrby timer-nested-end 1}
                        {exec}
                    }
                    close_replication_stream $repl

                    # Note propagate-test.timer-nested just propagates INCRBY, causing an
                    # inconsistency, so we flush
                    $master flushall
                }

                test {module propagates nested ctx case2} {
                    set repl [attach_to_replication_stream]

                    $master propagate-test.timer-nested-repl

                    wait_for_condition 500 10 {
                        [$replica get timer-nested-end] eq "1"
                    } else {
                        fail "The two counters don't match the expected value."
                    }

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incrby timer-nested-start 1}
                        {incr notifications}
                        {incr using-call}
                        {incr counter-1}
                        {incr counter-2}
                        {incr counter-3}
                        {incr counter-4}
                        {incr notifications}
                        {incr after-call}
                        {incr notifications}
                        {incr before-call-2}
                        {incr notifications}
                        {incr asdf}
                        {incr notifications}
                        {del asdf}
                        {incr notifications}
                        {incr after-call-2}
                        {incr notifications}
                        {incr timer-nested-middle}
                        {incrby timer-nested-end 1}
                        {exec}
                    }
                    close_replication_stream $repl

                    # Note propagate-test.timer-nested-repl just propagates INCRBY, causing an
                    # inconsistency, so we flush
                    $master flushall
                }

                test {module propagates from thread} {
                    set repl [attach_to_replication_stream]

                    $master propagate-test.thread

                    wait_for_condition 500 10 {
                        [$replica get a-from-thread] eq "3"
                    } else {
                        fail "The two counters don't match the expected value."
                    }

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr a-from-thread}
                        {incr notifications}
                        {incr thread-call}
                        {incr b-from-thread}
                        {exec}
                        {multi}
                        {incr a-from-thread}
                        {incr notifications}
                        {incr thread-call}
                        {incr b-from-thread}
                        {exec}
                        {multi}
                        {incr a-from-thread}
                        {incr notifications}
                        {incr thread-call}
                        {incr b-from-thread}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagates from thread with detached ctx} {
                    set repl [attach_to_replication_stream]

                    $master propagate-test.detached-thread

                    wait_for_condition 500 10 {
                        [$replica get thread-detached-after] eq "1"
                    } else {
                        fail "The key doesn't match the expected value."
                    }

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr thread-detached-before}
                        {incr notifications}
                        {incr thread-detached-1}
                        {incr notifications}
                        {incr thread-detached-2}
                        {incr thread-detached-after}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagates from command} {
                    set repl [attach_to_replication_stream]

                    $master propagate-test.simple
                    $master propagate-test.mixed

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr counter-1}
                        {incr counter-2}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr using-call}
                        {incr counter-1}
                        {incr counter-2}
                        {incr notifications}
                        {incr after-call}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagates from EVAL} {
                    set repl [attach_to_replication_stream]

                    assert_equal [ $master eval { \
                        redis.call("propagate-test.simple"); \
                        redis.call("set", "x", "y"); \
                        redis.call("propagate-test.mixed"); return "OK" } 0 ] {OK}

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr counter-1}
                        {incr counter-2}
                        {incr notifications}
                        {set x y}
                        {incr notifications}
                        {incr using-call}
                        {incr counter-1}
                        {incr counter-2}
                        {incr notifications}
                        {incr after-call}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagates from command after good EVAL} {
                    set repl [attach_to_replication_stream]

                    assert_equal [ $master eval { return "hello" } 0 ] {hello}
                    $master propagate-test.simple
                    $master propagate-test.mixed

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr counter-1}
                        {incr counter-2}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr using-call}
                        {incr counter-1}
                        {incr counter-2}
                        {incr notifications}
                        {incr after-call}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagates from command after bad EVAL} {
                    set repl [attach_to_replication_stream]

                    catch { $master eval { return "hello" } -12 } e
                    assert_equal $e {ERR Number of keys can't be negative}
                    $master propagate-test.simple
                    $master propagate-test.mixed

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr counter-1}
                        {incr counter-2}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr using-call}
                        {incr counter-1}
                        {incr counter-2}
                        {incr notifications}
                        {incr after-call}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module propagates from multi-exec} {
                    set repl [attach_to_replication_stream]

                    $master multi
                    $master propagate-test.simple
                    $master propagate-test.mixed
                    $master propagate-test.timer-nested-repl
                    $master exec

                    wait_for_condition 500 10 {
                        [$replica get timer-nested-end] eq "1"
                    } else {
                        fail "The two counters don't match the expected value."
                    }

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr counter-1}
                        {incr counter-2}
                        {incr notifications}
                        {incr using-call}
                        {incr counter-1}
                        {incr counter-2}
                        {incr notifications}
                        {incr after-call}
                        {exec}
                        {multi}
                        {incrby timer-nested-start 1}
                        {incr notifications}
                        {incr using-call}
                        {incr counter-1}
                        {incr counter-2}
                        {incr counter-3}
                        {incr counter-4}
                        {incr notifications}
                        {incr after-call}
                        {incr notifications}
                        {incr before-call-2}
                        {incr notifications}
                        {incr asdf}
                        {incr notifications}
                        {del asdf}
                        {incr notifications}
                        {incr after-call-2}
                        {incr notifications}
                        {incr timer-nested-middle}
                        {incrby timer-nested-end 1}
                        {exec}
                    }
                    close_replication_stream $repl

                    # Note propagate-test.timer-nested-repl just propagates INCRBY, causing an
                    # inconsistency, so we flush
                    $master flushall
                }

                test {module RM_Call of expired key propagation} {
                    $master debug set-active-expire 0

                    $master set k1 900 px 100
                    after 110

                    set repl [attach_to_replication_stream]
                    $master propagate-test.incr k1

                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {del k1}
                        {propagate-test.incr k1}
                        {exec}
                    }
                    close_replication_stream $repl

                    assert_equal [$master get k1] 1
                    assert_equal [$master ttl k1] -1

                    wait_for_condition 50 100 {
                        [$replica get k1] eq 1 &&
                        [$replica ttl k1] eq -1
                    } else {
                        fail "failed RM_Call of expired key propagation"
                    }
                }

                test {module notification on set} {
                    set repl [attach_to_replication_stream]

                    $master SADD s foo

                    wait_for_condition 500 10 {
                        [$replica SCARD s] eq "1"
                    } else {
                        fail "Failed to wait for set to be replicated"
                    }

                    $master SPOP s 1

                    wait_for_condition 500 10 {
                        [$replica SCARD s] eq "0"
                    } else {
                        fail "Failed to wait for set to be replicated"
                    }

                    # Currently the `del` command comes after the notification.
                    # When we fix spop to fire notification at the end (like all other commands),
                    # the `del` will come first.
                    assert_replication_stream $repl {
                        {multi}
                        {select *}
                        {incr notifications}
                        {sadd s foo}
                        {exec}
                        {multi}
                        {incr notifications}
                        {incr notifications}
                        {del s}
                        {exec}
                    }
                    close_replication_stream $repl
                }

                test {module key miss notification does not cause read command to be replicated} {
                    set repl [attach_to_replication_stream]

                    $master flushall

                    $master get unexisting_key

                    wait_for_condition 500 10 {
                        [$replica get missed] eq "1"
                    } else {
                        fail "Failed to wait for the 'missed' key to be replicated"
                    }

                    # This test checks a wrong (!) behavior that causes a read command to be
                    # replicated to the replica/AOF. We keep the test to verify that such a
                    # wrong behavior does not cause any crashes.
                    assert_replication_stream $repl {
                        {select *}
                        {flushall}
                        {multi}
                        {incr notifications}
                        {incr missed}
                        {get unexisting_key}
                        {exec}
                    }

                    close_replication_stream $repl
                }

test "Unload the module - propagate-test/testkeyspace" {
|
|
assert_equal {OK} [r module unload propagate-test]
|
|
assert_equal {OK} [r module unload testkeyspace]
|
|
}
|
|
|
|
assert_equal [s -1 unexpected_error_replies] 0
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
|
|
tags "modules aof external:skip" {
|
|
foreach aofload_type {debug_cmd startup} {
|
|
test "Modules RM_Replicate replicates MULTI/EXEC correctly: AOF-load type $aofload_type" {
|
|
start_server [list overrides [list loadmodule "$testmodule"]] {
|
|
# Enable the AOF
|
|
r config set appendonly yes
|
|
r config set auto-aof-rewrite-percentage 0 ; # Disable auto-rewrite.
|
|
waitForBgrewriteaof r
|
|
|
|
r propagate-test.simple
|
|
r propagate-test.mixed
|
|
r multi
|
|
r propagate-test.simple
|
|
r propagate-test.mixed
|
|
r exec
|
|
|
|
assert_equal [r get counter-1] {}
|
|
assert_equal [r get counter-2] {}
|
|
assert_equal [r get using-call] 2
|
|
assert_equal [r get after-call] 2
|
|
assert_equal [r get notifications] 4
|
|
|
|
# Load the AOF
|
|
if {$aofload_type == "debug_cmd"} {
|
|
r debug loadaof
|
|
} else {
|
|
r config rewrite
|
|
restart_server 0 true false
|
|
wait_done_loading r
|
|
}
|
|
|
|
# This module behaves bad on purpose, it only calls
|
|
# RM_Replicate for counter-1 and counter-2 so values
|
|
# after AOF-load are different
|
|
assert_equal [r get counter-1] 4
|
|
assert_equal [r get counter-2] 4
|
|
assert_equal [r get using-call] 2
|
|
assert_equal [r get after-call] 2
|
|
# 4+4+2+2 commands from AOF (just above) + 4 "INCR notifications" from AOF + 4 notifications for these INCRs
|
|
assert_equal [r get notifications] 20
|
|
|
|
assert_equal {OK} [r module unload propagate-test]
|
|
assert_equal [s 0 unexpected_error_replies] 0
|
|
}
|
|
}
|
|
test "Modules RM_Call does not update stats during aof load: AOF-load type $aofload_type" {
|
|
start_server [list overrides [list loadmodule "$miscmodule"]] {
|
|
# Enable the AOF
|
|
r config set appendonly yes
|
|
r config set auto-aof-rewrite-percentage 0 ; # Disable auto-rewrite.
|
|
waitForBgrewriteaof r
|
|
|
|
r config resetstat
|
|
r set foo bar
|
|
r EVAL {return redis.call('SET', KEYS[1], ARGV[1])} 1 foo bar2
|
|
r test.rm_call_replicate set foo bar3
|
|
r EVAL {return redis.call('test.rm_call_replicate',ARGV[1],KEYS[1],ARGV[2])} 1 foo set bar4
|
|
|
|
r multi
|
|
r set foo bar5
|
|
r EVAL {return redis.call('SET', KEYS[1], ARGV[1])} 1 foo bar6
|
|
r test.rm_call_replicate set foo bar7
|
|
r EVAL {return redis.call('test.rm_call_replicate',ARGV[1],KEYS[1],ARGV[2])} 1 foo set bar8
|
|
r exec
|
|
|
|
assert_match {*calls=8,*,rejected_calls=0,failed_calls=0} [cmdrstat set r]
|
|
|
|
|
|
# Load the AOF
|
|
if {$aofload_type == "debug_cmd"} {
|
|
r config resetstat
|
|
r debug loadaof
|
|
} else {
|
|
r config rewrite
|
|
restart_server 0 true false
|
|
wait_done_loading r
|
|
}
|
|
|
|
assert_no_match {*calls=*} [cmdrstat set r]
|
|
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
# This test does not really test module functionality, but rather uses a module
# command to test Redis replication mechanisms.
test {Replicas that were marked as CLIENT_CLOSE_ASAP should not keep the replication backlog from being trimmed} {
    start_server [list overrides [list loadmodule "$testmodule"] tags {"external:skip"}] {
        set replica [srv 0 client]
        start_server [list overrides [list loadmodule "$testmodule"] tags {"external:skip"}] {
            set master [srv 0 client]
            set master_host [srv 0 host]
            set master_port [srv 0 port]
            $master config set client-output-buffer-limit "replica 10mb 5mb 0"

            # Start the replication process...
            $replica replicaof $master_host $master_port
            wait_for_sync $replica

            test {replication backlog is trimmed after the replica is closed ASAP} {
                # Replicate large commands to get the replica disconnected.
                $master write [format_command propagate-test.verbatim 100000 [string repeat "a" 1000]] ;# almost 100mb
                # Execute this command together with the module command within the same
                # event loop to prevent periodic cleanup of the replication backlog.
                $master write [format_command info memory]
                $master flush
                $master read ;# propagate-test.verbatim
                set res [$master read] ;# info memory

                # Wait for the replica to be disconnected.
                wait_for_log_messages 0 {"*flags=S*scheduled to be closed ASAP for overcoming of output buffer limits*"} 0 1500 10
                # Since the replica reached the soft limit (5MB), the memory peak should not
                # significantly exceed that limit. Furthermore, once the replica releases its
                # reference to the replication backlog, the backlog should be properly trimmed,
                # so its memory usage should not significantly exceed repl-backlog-size
                # (default 1MB).
                assert_lessthan [getInfoProperty $res used_memory_peak] 10000000 ;# less than 10mb
                assert_lessthan [getInfoProperty $res mem_replication_backlog] 2000000 ;# less than 2mb
            }
        }
    }
}