redis/tests/support/util.tcl
Ozan Tezcan 73a9b916c9
Rdb channel replication (#13732)
This PR is based on:

https://github.com/redis/redis/pull/12109
https://github.com/valkey-io/valkey/pull/60

Closes: https://github.com/redis/redis/issues/11678

**Motivation**

During a full sync, when master is delivering RDB to the replica,
incoming write commands are kept in a replication buffer in order to be
sent to the replica once RDB delivery is completed. If RDB delivery
takes a long time, it might create memory pressure on master. Also, once
a replica connection accumulates replication data which is larger than
output buffer limits, master will kill replica connection. This may
cause a replication failure.

The main benefit of the rdb channel replication is streaming incoming
commands in parallel to the RDB delivery. This approach shifts
replication stream buffering to the replica and reduces load on master.
We do this by opening another connection for RDB delivery. The main
channel on replica will be receiving replication stream while rdb
channel is receiving the RDB.

This feature also helps to reduce master's main process CPU load. By
opening a dedicated connection for the RDB transfer, the bgsave process
has access to the new connection and it will stream RDB directly to the
replicas. Before this change, due to TLS connection restriction, the
bgsave process was writing RDB bytes to a pipe and the main process was
forwarding
it to the replica. This is no longer necessary, the main process can
avoid these expensive socket read/write syscalls. It also means RDB
delivery to replica will be faster as it avoids this step.

In summary, replication will be faster and master's performance during
full syncs will improve.


**Implementation steps**

1. When replica connects to the master, it sends 'rdb-channel-repl' as
part of capability exchange to let master to know replica supports rdb
channel.
2. When replica lacks sufficient data for PSYNC, master sends
+RDBCHANNELSYNC reply with replica's client id. As the next step, the
replica opens a new connection (rdb-channel) and configures it against
the master with the appropriate capabilities and requirements. It also
sends given client id back to master over rdbchannel, so that master can
associate these channels. (initial replica connection will be referred
as main-channel) Then, replica requests fullsync using the RDB channel.
3. Prior to forking, master attaches the replica's main channel to the
replication backlog to deliver replication stream starting at the
snapshot end offset.
4. The master main process sends replication stream via the main
channel, while the bgsave process sends the RDB directly to the replica
via the rdb-channel. Replica accumulates replication stream in a local
buffer, while the RDB is being loaded into the memory.
5. Once the replica completes loading the rdb, it drops the rdb channel
and streams the accumulated replication stream into the db. Sync is
completed.

**Some details**
- Currently, rdbchannel replication is supported only if
`repl-diskless-sync` is enabled on master. Otherwise, replication will
happen over a single connection as in before.
- On replica, there is a limit to replication stream buffering. Replica
uses a new config `replica-full-sync-buffer-limit` to limit number of
bytes to accumulate. If it is not set, replica inherits
`client-output-buffer-limit <replica>` hard limit config. If we reach
this limit, replica stops accumulating. This is not a failure scenario
though. Further accumulation will happen on master side. Depending on
the configured limits on master, master may kill the replica connection.

**API changes in INFO output:**

1. New replica state: `send_bulk_and_stream`. Indicates full sync is
still in progress for this replica. It is receiving replication stream
and rdb in parallel.
```
slave0:ip=127.0.0.1,port=5002,state=send_bulk_and_stream,offset=0,lag=0
```
Replica state changes in steps:
- First, replica sends psync and receives +RDBCHANNELSYNC
:`state=wait_bgsave`
- After replica connects with rdbchannel and delivery starts:
`state=send_bulk_and_stream`
 - After full sync: `state=online`

2. On replica side, replication stream buffering metrics:
- replica_full_sync_buffer_size: Currently accumulated replication
stream data in bytes.
- replica_full_sync_buffer_peak: Peak number of bytes that this instance
accumulated in the lifetime of the process.

```
replica_full_sync_buffer_size:20485             
replica_full_sync_buffer_peak:1048560
```

**API changes in CLIENT LIST**

In `client list` output, rdbchannel clients will have 'C' flag in
addition to 'S' replica flag:
```
id=11 addr=127.0.0.1:39108 laddr=127.0.0.1:5001 fd=14 name= age=5 idle=5 flags=SC db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=1920 events=r cmd=psync user=default redir=-1 resp=2 lib-name= lib-ver= io-thread=0
```

**Config changes:**
- `replica-full-sync-buffer-limit`: Controls how much replication data
replica can accumulate during rdbchannel replication. If it is not set,
a value of 0 means replica will inherit `client-output-buffer-limit
<replica>` hard limit config to limit accumulated data.
- `repl-rdb-channel` config is added as a hidden config. This is mostly
for testing as we need to support both rdbchannel replication and the
older single connection replication (to keep compatibility with older
versions and rdbchannel replication will not be enabled if
repl-diskless-sync is not enabled). it affects both the master (not to
respond to rdb channel requests), and the replica (not to declare
capability)

**Internal API changes:**
Changes that were introduced to Redis replication:
- New replication capability is added to replconf command: `capa
rdb-channel-repl`. Indicates replica is capable of rdb channel
replication. Replica sends it when it connects to master along with
other capabilities.
- If replica needs fullsync, master replies `+RDBCHANNELSYNC
<client-id>` to the replica's PSYNC request.
- When replica opens rdbchannel connection, as part of replconf command,
it sends `rdb-channel 1` to let master know this is rdb channel. Also,
it sends `main-ch-client-id <client-id>` as part of replconf command so
master can associate channels.
  
**Testing:**
As rdbchannel replication is enabled by default, we run whole test suite
with it. Though, as we need to support both rdbchannel and single
connection replication, we'll be running some tests twice with
`repl-rdb-channel yes/no` config.

**Replica state diagram**
```
* * Replica state machine *
 *
 * Main channel state
 * ┌───────────────────┐
 * │RECEIVE_PING_REPLY │
 * └────────┬──────────┘
 *          │ +PONG
 * ┌────────▼──────────┐
 * │SEND_HANDSHAKE     │                     RDB channel state
 * └────────┬──────────┘            ┌───────────────────────────────┐
 *          │+OK                ┌───► RDB_CH_SEND_HANDSHAKE         │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_AUTH_REPLY │        │    REPLCONF main-ch-client-id <clientid>
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_AUTH_REPLY     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_PORT_REPLY │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │  RDB_CH_RECEIVE_REPLCONF_REPLY│
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_IP_REPLY   │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_FULLRESYNC     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_CAPA_REPLY │        │                  │+FULLRESYNC
 * └────────┬──────────┘        │                  │Rdb delivery
 *          │                   │   ┌──────────────▼────────────────┐
 * ┌────────▼──────────┐        │   │ RDB_CH_RDB_LOADING            │
 * │SEND_PSYNC         │        │   └──────────────┬────────────────┘
 * └─┬─────────────────┘        │                  │ Done loading
 *   │PSYNC (use cached-master) │                  │
 * ┌─▼─────────────────┐        │                  │
 * │RECEIVE_PSYNC_REPLY│        │    ┌────────────►│ Replica streams replication
 * └─┬─────────────────┘        │    │             │ buffer into memory
 *   │                          │    │             │
 *   │+RDBCHANNELSYNC client-id │    │             │
 *   ├──────┬───────────────────┘    │             │
 *   │      │ Main channel           │             │
 *   │      │ accumulates repl data  │             │
 *   │   ┌──▼────────────────┐       │     ┌───────▼───────────┐
 *   │   │ REPL_TRANSFER     ├───────┘     │    CONNECTED      │
 *   │   └───────────────────┘             └────▲───▲──────────┘
 *   │                                          │   │
 *   │                                          │   │
 *   │  +FULLRESYNC    ┌───────────────────┐    │   │
 *   ├────────────────► REPL_TRANSFER      ├────┘   │
 *   │                 └───────────────────┘        │
 *   │  +CONTINUE                                   │
 *   └──────────────────────────────────────────────┘
 */
 ```
 -----
 This PR also contains changes and ideas from: 
https://github.com/valkey-io/valkey/pull/837
https://github.com/valkey-io/valkey/pull/1173
https://github.com/valkey-io/valkey/pull/804
https://github.com/valkey-io/valkey/pull/945
https://github.com/valkey-io/valkey/pull/989
---------

Co-authored-by: Yuan Wang <wangyuancode@163.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Moti Cohen <moticless@gmail.com>
Co-authored-by: naglera <anagler123@gmail.com>
Co-authored-by: Amit Nagler <58042354+naglera@users.noreply.github.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Ping Xie <pingxie@outlook.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: ranshid <88133677+ranshid@users.noreply.github.com>
Co-authored-by: xbasel <103044017+xbasel@users.noreply.github.com>
2025-01-13 15:09:52 +03:00

1200 lines
35 KiB
Tcl

#
# Copyright (c) 2009-Present, Redis Ltd.
# All rights reserved.
#
# Copyright (c) 2024-present, Valkey contributors.
# All rights reserved.
#
# Licensed under your choice of the Redis Source Available License 2.0
# (RSALv2) or the Server Side Public License v1 (SSPLv1).
#
# Portions of this file are available under BSD3 terms; see REDISCONTRIBUTIONS for more information.
#
proc randstring {min max {type binary}} {
set len [expr {$min+int(rand()*($max-$min+1))}]
set output {}
if {$type eq {binary}} {
set minval 0
set maxval 255
} elseif {$type eq {alpha} || $type eq {simplealpha}} {
set minval 48
set maxval 122
} elseif {$type eq {compr}} {
set minval 48
set maxval 52
}
while {$len} {
set num [expr {$minval+int(rand()*($maxval-$minval+1))}]
set rr [format "%c" $num]
if {$type eq {simplealpha} && ![string is alnum $rr]} {continue}
if {$type eq {alpha} && $num eq 92} {continue} ;# avoid putting '\' char in the string, it can mess up TCL processing
append output $rr
incr len -1
}
return $output
}
# Useful for some test
proc zlistAlikeSort {a b} {
if {[lindex $a 0] > [lindex $b 0]} {return 1}
if {[lindex $a 0] < [lindex $b 0]} {return -1}
string compare [lindex $a 1] [lindex $b 1]
}
# Return all log lines starting with the first line that contains a warning.
# Generally, this will be an assertion error with a stack trace.
proc crashlog_from_file {filename} {
set lines [split [exec cat $filename] "\n"]
set matched 0
set logall 0
set result {}
foreach line $lines {
if {[string match {*REDIS BUG REPORT START*} $line]} {
set logall 1
}
if {[regexp {^\[\d+\]\s+\d+\s+\w+\s+\d{2}:\d{2}:\d{2} \#} $line]} {
set matched 1
}
if {$logall || $matched} {
lappend result $line
}
}
join $result "\n"
}
# Return sanitizer log lines
proc sanitizer_errors_from_file {filename} {
set log [exec cat $filename]
set lines [split [exec cat $filename] "\n"]
foreach line $lines {
# Ignore huge allocation warnings
if ([string match {*WARNING: AddressSanitizer failed to allocate*} $line]) {
continue
}
# GCC UBSAN output does not contain 'Sanitizer' but 'runtime error'.
if {[string match {*runtime error*} $line] ||
[string match {*Sanitizer*} $line]} {
return $log
}
}
return ""
}
proc getInfoProperty {infostr property} {
if {[regexp -lineanchor "^$property:(.*?)\r\n" $infostr _ value]} {
return $value
}
}
# Return value for INFO property
proc status {r property} {
set _ [getInfoProperty [{*}$r info] $property]
}
proc waitForBgsave r {
while 1 {
if {[status $r rdb_bgsave_in_progress] eq 1} {
if {$::verbose} {
puts -nonewline "\nWaiting for background save to finish... "
flush stdout
}
after 50
} else {
break
}
}
}
proc waitForBgrewriteaof r {
while 1 {
if {[status $r aof_rewrite_in_progress] eq 1} {
if {$::verbose} {
puts -nonewline "\nWaiting for background AOF rewrite to finish... "
flush stdout
}
after 50
} else {
break
}
}
}
proc wait_for_sync r {
wait_for_condition 50 100 {
[status $r master_link_status] eq "up"
} else {
fail "replica didn't sync in time"
}
}
proc wait_replica_online {r {replica_id 0} {maxtries 50} {delay 100}} {
wait_for_condition $maxtries $delay {
[string match "*slave$replica_id:*,state=online*" [$r info replication]]
} else {
fail "replica $replica_id did not become online in time"
}
}
proc wait_for_ofs_sync {r1 r2} {
wait_for_condition 50 100 {
[status $r1 master_repl_offset] eq [status $r2 master_repl_offset]
} else {
fail "replica offset didn't match in time"
}
}
proc wait_done_loading r {
wait_for_condition 50 100 {
[catch {$r ping} e] == 0
} else {
fail "Loading DB is taking too much time."
}
}
proc wait_lazyfree_done r {
wait_for_condition 50 100 {
[status $r lazyfree_pending_objects] == 0
} else {
fail "lazyfree isn't done"
}
}
# count current log lines in server's stdout
proc count_log_lines {srv_idx} {
set _ [string trim [exec wc -l < [srv $srv_idx stdout]]]
}
# returns the number of times a line with that pattern appears in a file
proc count_message_lines {file pattern} {
set res 0
# exec fails when grep exists with status other than 0 (when the pattern wasn't found)
catch {
set res [string trim [exec grep $pattern $file 2> /dev/null | wc -l]]
}
return $res
}
# returns the number of times a line with that pattern appears in the log
proc count_log_message {srv_idx pattern} {
set stdout [srv $srv_idx stdout]
return [count_message_lines $stdout $pattern]
}
# verify pattern exists in server's sdtout after a certain line number
proc verify_log_message {srv_idx pattern from_line} {
incr from_line
set result [exec tail -n +$from_line < [srv $srv_idx stdout]]
if {![string match $pattern $result]} {
error "assertion:expected message not found in log file: $pattern"
}
}
# wait for pattern to be found in server's stdout after certain line number
# return value is a list containing the line that matched the pattern and the line number
proc wait_for_log_messages {srv_idx patterns from_line maxtries delay} {
set retry $maxtries
set next_line [expr $from_line + 1] ;# searching form the line after
set stdout [srv $srv_idx stdout]
while {$retry} {
# re-read the last line (unless it's before to our first), last time we read it, it might have been incomplete
set next_line [expr $next_line - 1 > $from_line + 1 ? $next_line - 1 : $from_line + 1]
set result [exec tail -n +$next_line < $stdout]
set result [split $result "\n"]
foreach line $result {
foreach pattern $patterns {
if {[string match $pattern $line]} {
return [list $line $next_line]
}
}
incr next_line
}
incr retry -1
after $delay
}
if {$retry == 0} {
if {$::verbose} {
puts "content of $stdout from line: $from_line:"
puts [exec tail -n +$from_line < $stdout]
}
fail "log message of '$patterns' not found in $stdout after line: $from_line till line: [expr $next_line -1]"
}
}
# write line to server log file
proc write_log_line {srv_idx msg} {
set logfile [srv $srv_idx stdout]
set fd [open $logfile "a+"]
puts $fd "### $msg"
close $fd
}
# Random integer between 0 and max (excluded).
proc randomInt {max} {
expr {int(rand()*$max)}
}
# Random integer between min and max (excluded).
proc randomRange {min max} {
expr {int(rand()*[expr $max - $min]) + $min}
}
# Random signed integer between -max and max (both extremes excluded).
proc randomSignedInt {max} {
set i [randomInt $max]
if {rand() > 0.5} {
set i -$i
}
return $i
}
proc randpath args {
set path [expr {int(rand()*[llength $args])}]
uplevel 1 [lindex $args $path]
}
proc randomValue {} {
randpath {
# Small enough to likely collide
randomSignedInt 1000
} {
# 32 bit compressible signed/unsigned
randpath {randomSignedInt 2000000000} {randomSignedInt 4000000000}
} {
# 64 bit
randpath {randomSignedInt 1000000000000}
} {
# Random string
randpath {randstring 0 256 alpha} \
{randstring 0 256 compr} \
{randstring 0 256 binary}
}
}
proc randomKey {} {
randpath {
# Small enough to likely collide
randomInt 1000
} {
# 32 bit compressible signed/unsigned
randpath {randomInt 2000000000} {randomInt 4000000000}
} {
# 64 bit
randpath {randomInt 1000000000000}
} {
# Random string
randpath {randstring 1 256 alpha} \
{randstring 1 256 compr}
}
}
proc findKeyWithType {r type} {
for {set j 0} {$j < 20} {incr j} {
set k [{*}$r randomkey]
if {$k eq {}} {
return {}
}
if {[{*}$r type $k] eq $type} {
return $k
}
}
return {}
}
proc createComplexDataset {r ops {opt {}}} {
set useexpire [expr {[lsearch -exact $opt useexpire] != -1}]
set usehexpire [expr {[lsearch -exact $opt usehexpire] != -1}]
if {[lsearch -exact $opt usetag] != -1} {
set tag "{t}"
} else {
set tag ""
}
for {set j 0} {$j < $ops} {incr j} {
set k [randomKey]$tag
set k2 [randomKey]$tag
set f [randomValue]
set v [randomValue]
if {$useexpire} {
if {rand() < 0.1} {
{*}$r expire [randomKey] [randomInt 2]
}
}
randpath {
set d [expr {rand()}]
} {
set d [expr {rand()}]
} {
set d [expr {rand()}]
} {
set d [expr {rand()}]
} {
set d [expr {rand()}]
} {
randpath {set d +inf} {set d -inf}
}
set t [{*}$r type $k]
if {$t eq {none}} {
randpath {
{*}$r set $k $v
} {
{*}$r lpush $k $v
} {
{*}$r sadd $k $v
} {
{*}$r zadd $k $d $v
} {
{*}$r hset $k $f $v
} {
{*}$r del $k
}
set t [{*}$r type $k]
}
switch $t {
{string} {
# Nothing to do
}
{list} {
randpath {{*}$r lpush $k $v} \
{{*}$r rpush $k $v} \
{{*}$r lrem $k 0 $v} \
{{*}$r rpop $k} \
{{*}$r lpop $k}
}
{set} {
randpath {{*}$r sadd $k $v} \
{{*}$r srem $k $v} \
{
set otherset [findKeyWithType {*}$r set]
if {$otherset ne {}} {
randpath {
{*}$r sunionstore $k2 $k $otherset
} {
{*}$r sinterstore $k2 $k $otherset
} {
{*}$r sdiffstore $k2 $k $otherset
}
}
}
}
{zset} {
randpath {{*}$r zadd $k $d $v} \
{{*}$r zrem $k $v} \
{
set otherzset [findKeyWithType {*}$r zset]
if {$otherzset ne {}} {
randpath {
{*}$r zunionstore $k2 2 $k $otherzset
} {
{*}$r zinterstore $k2 2 $k $otherzset
}
}
}
}
{hash} {
randpath {{*}$r hset $k $f $v} \
{{*}$r hdel $k $f}
if { [{*}$r hexists $k $f] && $usehexpire && rand() < 0.5} {
{*}$r hexpire $k 1000 FIELDS 1 $f
}
}
}
}
}
proc formatCommand {args} {
set cmd "*[llength $args]\r\n"
foreach a $args {
append cmd "$[string length $a]\r\n$a\r\n"
}
set _ $cmd
}
proc csvdump r {
set o {}
if {$::singledb} {
set maxdb 1
} else {
set maxdb 16
}
for {set db 0} {$db < $maxdb} {incr db} {
if {!$::singledb} {
{*}$r select $db
}
foreach k [lsort [{*}$r keys *]] {
set type [{*}$r type $k]
append o [csvstring $db] , [csvstring $k] , [csvstring $type] ,
switch $type {
string {
append o [csvstring [{*}$r get $k]] "\n"
}
list {
foreach e [{*}$r lrange $k 0 -1] {
append o [csvstring $e] ,
}
append o "\n"
}
set {
foreach e [lsort [{*}$r smembers $k]] {
append o [csvstring $e] ,
}
append o "\n"
}
zset {
foreach e [{*}$r zrange $k 0 -1 withscores] {
append o [csvstring $e] ,
}
append o "\n"
}
hash {
set fields [{*}$r hgetall $k]
set newfields {}
foreach {f v} $fields {
set expirylist [{*}$r hexpiretime $k FIELDS 1 $f]
if {$expirylist eq (-1)} {
lappend newfields [list $f $v]
} else {
set e [lindex $expirylist 0]
lappend newfields [list $f $e $v] # TODO: extract the actual ttl value from the list in $e
}
}
set fields [lsort -index 0 $newfields]
foreach kv $fields {
append o [csvstring [lindex $kv 0]] ,
append o [csvstring [lindex $kv 1]] ,
}
append o "\n"
}
}
}
}
if {!$::singledb} {
{*}$r select 9
}
return $o
}
proc csvstring s {
return "\"$s\""
}
proc roundFloat f {
format "%.10g" $f
}
set ::last_port_attempted 0
proc find_available_port {start count} {
set port [expr $::last_port_attempted + 1]
for {set attempts 0} {$attempts < $count} {incr attempts} {
if {$port < $start || $port >= $start+$count} {
set port $start
}
set fd1 -1
proc dummy_accept {chan addr port} {}
if {[catch {set fd1 [socket -server dummy_accept -myaddr 127.0.0.1 $port]}] ||
[catch {set fd2 [socket -server dummy_accept -myaddr 127.0.0.1 [expr $port+10000]]}]} {
if {$fd1 != -1} {
close $fd1
}
} else {
close $fd1
close $fd2
set ::last_port_attempted $port
return $port
}
incr port
}
error "Can't find a non busy port in the $start-[expr {$start+$count-1}] range."
}
# Test if TERM looks like to support colors
proc color_term {} {
expr {[info exists ::env(TERM)] && [string match *xterm* $::env(TERM)]}
}
proc colorstr {color str} {
if {[color_term]} {
set b 0
if {[string range $color 0 4] eq {bold-}} {
set b 1
set color [string range $color 5 end]
}
switch $color {
red {set colorcode {31}}
green {set colorcode {32}}
yellow {set colorcode {33}}
blue {set colorcode {34}}
magenta {set colorcode {35}}
cyan {set colorcode {36}}
white {set colorcode {37}}
default {set colorcode {37}}
}
if {$colorcode ne {}} {
return "\033\[$b;${colorcode};49m$str\033\[0m"
}
} else {
return $str
}
}
proc find_valgrind_errors {stderr on_termination} {
set fd [open $stderr]
set buf [read $fd]
close $fd
# Look for stack trace (" at 0x") and other errors (Invalid, Mismatched, etc).
# Look for "Warnings", but not the "set address range perms". These don't indicate any real concern.
# corrupt-dump unit, not sure why but it seems they don't indicate any real concern.
if {[regexp -- { at 0x} $buf] ||
[regexp -- {^(?=.*Warning)(?:(?!set address range perms).)*$} $buf] ||
[regexp -- {Invalid} $buf] ||
[regexp -- {Mismatched} $buf] ||
[regexp -- {uninitialized} $buf] ||
[regexp -- {has a fishy} $buf] ||
[regexp -- {overlap} $buf]} {
return $buf
}
# If the process didn't terminate yet, we can't look for the summary report
if {!$on_termination} {
return ""
}
# Look for the absence of a leak free summary (happens when redis isn't terminated properly).
if {(![regexp -- {definitely lost: 0 bytes} $buf] &&
![regexp -- {no leaks are possible} $buf])} {
return $buf
}
return ""
}
# Execute a background process writing random data for the specified number
# of seconds to the specified Redis instance. If key is omitted, a random key
# is used for every SET command.
proc start_write_load {host port seconds {key ""}} {
set tclsh [info nameofexecutable]
exec $tclsh tests/helpers/gen_write_load.tcl $host $port $seconds $::tls $key &
}
# Stop a process generating write load executed with start_write_load.
proc stop_write_load {handle} {
catch {exec /bin/kill -9 $handle}
}
proc wait_load_handlers_disconnected {{level 0}} {
wait_for_condition 50 100 {
![string match {*name=LOAD_HANDLER*} [r $level client list]]
} else {
fail "load_handler(s) still connected after too long time."
}
}
proc K { x y } { set x }
# Shuffle a list with Fisher-Yates algorithm.
proc lshuffle {list} {
set n [llength $list]
while {$n>1} {
set j [expr {int(rand()*$n)}]
incr n -1
if {$n==$j} continue
set v [lindex $list $j]
lset list $j [lindex $list $n]
lset list $n $v
}
return $list
}
# Execute a background process writing complex data for the specified number
# of ops to the specified Redis instance.
proc start_bg_complex_data {host port db ops} {
set tclsh [info nameofexecutable]
exec $tclsh tests/helpers/bg_complex_data.tcl $host $port $db $ops $::tls &
}
# Stop a process generating write load executed with start_bg_complex_data.
proc stop_bg_complex_data {handle} {
catch {exec /bin/kill -9 $handle}
}
# Write num keys with the given key prefix and value size (in bytes). If idx is
# given, it's the index (AKA level) used with the srv procedure and it specifies
# to which Redis instance to write the keys.
proc populate {num {prefix key:} {size 3} {idx 0} {prints false} {expires 0}} {
r $idx deferred 1
if {$num > 16} {set pipeline 16} else {set pipeline $num}
set val [string repeat A $size]
for {set j 0} {$j < $pipeline} {incr j} {
if {$expires > 0} {
r $idx set $prefix$j $val ex $expires
} else {
r $idx set $prefix$j $val
}
if {$prints} {puts $j}
}
for {} {$j < $num} {incr j} {
if {$expires > 0} {
r $idx set $prefix$j $val ex $expires
} else {
r $idx set $prefix$j $val
}
r $idx read
if {$prints} {puts $j}
}
for {set j 0} {$j < $pipeline} {incr j} {
r $idx read
if {$prints} {puts $j}
}
r $idx deferred 0
}
proc get_child_pid {idx} {
set pid [srv $idx pid]
if {[file exists "/usr/bin/pgrep"]} {
set fd [open "|pgrep -P $pid" "r"]
set child_pid [string trim [lindex [split [read $fd] \n] 0]]
} else {
set fd [open "|ps --ppid $pid -o pid" "r"]
set child_pid [string trim [lindex [split [read $fd] \n] 1]]
}
close $fd
return $child_pid
}
proc process_is_alive pid {
if {[catch {exec ps -p $pid -f} err]} {
return 0
} else {
if {[string match "*<defunct>*" $err]} { return 0 }
return 1
}
}
proc pause_process pid {
exec kill -SIGSTOP $pid
wait_for_condition 50 100 {
[string match {*T*} [lindex [exec ps j $pid] 16]]
} else {
puts [exec ps j $pid]
fail "process didn't stop"
}
}
proc resume_process pid {
wait_for_condition 50 1000 {
[string match "T*" [exec ps -o state= -p $pid]]
} else {
puts [exec ps j $pid]
fail "process was not stopped"
}
exec kill -SIGCONT $pid
}
proc cmdrstat {cmd r} {
if {[regexp "\r\ncmdstat_$cmd:(.*?)\r\n" [$r info commandstats] _ value]} {
set _ $value
}
}
proc errorrstat {cmd r} {
if {[regexp "\r\nerrorstat_$cmd:(.*?)\r\n" [$r info errorstats] _ value]} {
set _ $value
}
}
proc latencyrstat_percentiles {cmd r} {
if {[regexp "\r\nlatency_percentiles_usec_$cmd:(.*?)\r\n" [$r info latencystats] _ value]} {
set _ $value
}
}
proc get_io_thread_clients {id {client r}} {
set pattern "io_thread_$id:clients=(\[0-9\]+)"
set info [$client info threads]
if {[regexp $pattern $info _ value]} {
return $value
} else {
return -1
}
}
proc generate_fuzzy_traffic_on_key {key type duration} {
# Commands per type, blocking commands removed
# TODO: extract these from COMMAND DOCS, and improve to include other types
set string_commands {APPEND BITCOUNT BITFIELD BITOP BITPOS DECR DECRBY GET GETBIT GETRANGE GETSET INCR INCRBY INCRBYFLOAT MGET MSET MSETNX PSETEX SET SETBIT SETEX SETNX SETRANGE LCS STRLEN}
set hash_commands {HDEL HEXISTS HGET HGETALL HINCRBY HINCRBYFLOAT HKEYS HLEN HMGET HMSET HSCAN HSET HSETNX HSTRLEN HVALS HRANDFIELD}
set zset_commands {ZADD ZCARD ZCOUNT ZINCRBY ZINTERSTORE ZLEXCOUNT ZPOPMAX ZPOPMIN ZRANGE ZRANGEBYLEX ZRANGEBYSCORE ZRANK ZREM ZREMRANGEBYLEX ZREMRANGEBYRANK ZREMRANGEBYSCORE ZREVRANGE ZREVRANGEBYLEX ZREVRANGEBYSCORE ZREVRANK ZSCAN ZSCORE ZUNIONSTORE ZRANDMEMBER}
set list_commands {LINDEX LINSERT LLEN LPOP LPOS LPUSH LPUSHX LRANGE LREM LSET LTRIM RPOP RPOPLPUSH RPUSH RPUSHX}
set set_commands {SADD SCARD SDIFF SDIFFSTORE SINTER SINTERSTORE SISMEMBER SMEMBERS SMOVE SPOP SRANDMEMBER SREM SSCAN SUNION SUNIONSTORE}
set stream_commands {XACK XADD XCLAIM XDEL XGROUP XINFO XLEN XPENDING XRANGE XREAD XREADGROUP XREVRANGE XTRIM}
set commands [dict create string $string_commands hash $hash_commands zset $zset_commands list $list_commands set $set_commands stream $stream_commands]
set cmds [dict get $commands $type]
set start_time [clock seconds]
set sent {}
set succeeded 0
while {([clock seconds]-$start_time) < $duration} {
# find a random command for our key type
set cmd_idx [expr {int(rand()*[llength $cmds])}]
set cmd [lindex $cmds $cmd_idx]
# get the command details from redis
if { [ catch {
set cmd_info [lindex [r command info $cmd] 0]
} err ] } {
# if we failed, it means redis crashed after the previous command
return $sent
}
# try to build a valid command argument
set arity [lindex $cmd_info 1]
set arity [expr $arity < 0 ? - $arity: $arity]
set firstkey [lindex $cmd_info 3]
set lastkey [lindex $cmd_info 4]
set i 1
if {$cmd == "XINFO"} {
lappend cmd "STREAM"
lappend cmd $key
lappend cmd "FULL"
incr i 3
}
if {$cmd == "XREAD"} {
lappend cmd "STREAMS"
lappend cmd $key
randpath {
lappend cmd \$
} {
lappend cmd [randomValue]
}
incr i 3
}
if {$cmd == "XADD"} {
lappend cmd $key
randpath {
lappend cmd "*"
} {
lappend cmd [randomValue]
}
lappend cmd [randomValue]
lappend cmd [randomValue]
incr i 4
}
for {} {$i < $arity} {incr i} {
if {$i == $firstkey || $i == $lastkey} {
lappend cmd $key
} else {
lappend cmd [randomValue]
}
}
# execute the command, we expect commands to fail on syntax errors
lappend sent $cmd
if { ! [ catch {
r {*}$cmd
} err ] } {
incr succeeded
} else {
set err [format "%s" $err] ;# convert to string for pattern matching
if {[string match "*SIGTERM*" $err]} {
puts "commands caused test to hang:"
foreach cmd $sent {
foreach arg $cmd {
puts -nonewline "[string2printable $arg] "
}
puts ""
}
# Re-raise, let handler up the stack take care of this.
error $err $::errorInfo
}
}
}
# print stats so that we know if we managed to generate commands that actually made sense
#if {$::verbose} {
# set count [llength $sent]
# puts "Fuzzy traffic sent: $count, succeeded: $succeeded"
#}
# return the list of commands we sent
return $sent
}
proc string2printable s {
set res {}
set has_special_chars false
foreach i [split $s {}] {
scan $i %c int
# non printable characters, including space and excluding: " \ $ { }
if {$int < 32 || $int > 122 || $int == 34 || $int == 36 || $int == 92} {
set has_special_chars true
}
# TCL8.5 has issues mixing \x notation and normal chars in the same
# source code string, so we'll convert the entire string.
append res \\x[format %02X $int]
}
if {!$has_special_chars} {
return $s
}
set res "\"$res\""
return $res
}
# Calculation value of Chi-Square Distribution. By this value
# we can verify the random distribution sample confidence.
# Based on the following wiki:
# https://en.wikipedia.org/wiki/Chi-square_distribution
#
# param res Random sample list
# return Value of Chi-Square Distribution
#
# x2_value: return of chi_square_value function
# df: Degrees of freedom, Number of independent values minus 1
#
# By using x2_value and df to back check the cardinality table,
# we can know the confidence of the random sample.
proc chi_square_value {res} {
unset -nocomplain mydict
foreach key $res {
dict incr mydict $key 1
}
set x2_value 0
set p [expr [llength $res] / [dict size $mydict]]
foreach key [dict keys $mydict] {
set value [dict get $mydict $key]
# Aggregate the chi-square value of each element
set v [expr {pow($value - $p, 2) / $p}]
set x2_value [expr {$x2_value + $v}]
}
return $x2_value
}
#subscribe to Pub/Sub channels
proc consume_subscribe_messages {client type channels} {
set numsub -1
set counts {}
for {set i [llength $channels]} {$i > 0} {incr i -1} {
set msg [$client read]
assert_equal $type [lindex $msg 0]
# when receiving subscribe messages the channels names
# are ordered. when receiving unsubscribe messages
# they are unordered
set idx [lsearch -exact $channels [lindex $msg 1]]
if {[string match "*unsubscribe" $type]} {
assert {$idx >= 0}
} else {
assert {$idx == 0}
}
set channels [lreplace $channels $idx $idx]
# aggregate the subscription count to return to the caller
lappend counts [lindex $msg 2]
}
# we should have received messages for channels
assert {[llength $channels] == 0}
return $counts
}
proc subscribe {client channels} {
$client subscribe {*}$channels
consume_subscribe_messages $client subscribe $channels
}
proc ssubscribe {client channels} {
$client ssubscribe {*}$channels
consume_subscribe_messages $client ssubscribe $channels
}
proc unsubscribe {client {channels {}}} {
$client unsubscribe {*}$channels
consume_subscribe_messages $client unsubscribe $channels
}
proc sunsubscribe {client {channels {}}} {
$client sunsubscribe {*}$channels
consume_subscribe_messages $client sunsubscribe $channels
}
proc psubscribe {client channels} {
$client psubscribe {*}$channels
consume_subscribe_messages $client psubscribe $channels
}
proc punsubscribe {client {channels {}}} {
$client punsubscribe {*}$channels
consume_subscribe_messages $client punsubscribe $channels
}
proc debug_digest_value {key} {
if {[lsearch $::denytags "needs:debug"] >= 0 || $::ignoredigest} {
return "dummy-digest-value"
}
r debug digest-value $key
}
proc debug_digest {{level 0}} {
if {[lsearch $::denytags "needs:debug"] >= 0 || $::ignoredigest} {
return "dummy-digest"
}
r $level debug digest
}
proc wait_for_blocked_client {{idx 0}} {
wait_for_condition 50 100 {
[s $idx blocked_clients] ne 0
} else {
fail "no blocked clients"
}
}
proc wait_for_blocked_clients_count {count {maxtries 100} {delay 10} {idx 0}} {
wait_for_condition $maxtries $delay {
[s $idx blocked_clients] == $count
} else {
fail "Timeout waiting for blocked clients"
}
}
proc wait_for_watched_clients_count {count {maxtries 100} {delay 10} {idx 0}} {
wait_for_condition $maxtries $delay {
[s $idx watching_clients] == $count
} else {
fail "Timeout waiting for watched clients"
}
}
proc read_from_aof {fp} {
# Input fp is a blocking binary file descriptor of an opened AOF file.
if {[gets $fp count] == -1} return ""
set count [string range $count 1 end]
# Return a list of arguments for the command.
set res {}
for {set j 0} {$j < $count} {incr j} {
read $fp 1
set arg [::redis::redis_bulk_read $fp]
if {$j == 0} {set arg [string tolower $arg]}
lappend res $arg
}
return $res
}
proc assert_aof_content {aof_path patterns} {
set fp [open $aof_path r]
fconfigure $fp -translation binary
fconfigure $fp -blocking 1
for {set j 0} {$j < [llength $patterns]} {incr j} {
assert_match [lindex $patterns $j] [read_from_aof $fp]
}
}
proc config_set {param value {options {}}} {
set mayfail 0
foreach option $options {
switch $option {
"mayfail" {
set mayfail 1
}
default {
error "Unknown option $option"
}
}
}
if {[catch {r config set $param $value} err]} {
if {!$mayfail} {
error $err
} else {
if {$::verbose} {
puts "Ignoring CONFIG SET $param $value failure: $err"
}
}
}
}
proc config_get_set {param value {options {}}} {
set config [lindex [r config get $param] 1]
config_set $param $value $options
return $config
}
proc delete_lines_with_pattern {filename tmpfilename pattern} {
set fh_in [open $filename r]
set fh_out [open $tmpfilename w]
while {[gets $fh_in line] != -1} {
if {![regexp $pattern $line]} {
puts $fh_out $line
}
}
close $fh_in
close $fh_out
file rename -force $tmpfilename $filename
}
proc get_nonloopback_addr {} {
set addrlist [list {}]
catch { set addrlist [exec hostname -I] }
return [lindex $addrlist 0]
}
proc get_nonloopback_client {} {
return [redis [get_nonloopback_addr] [srv 0 "port"] 0 $::tls]
}
# The following functions and variables are used only when running large-memory
# tests. We avoid defining them when not running large-memory tests because the
# global variables takes up lots of memory.
proc init_large_mem_vars {} {
if {![info exists ::str500]} {
set ::str500 [string repeat x 500000000] ;# 500mb
set ::str500_len [string length $::str500]
}
}
# Utility function to write big argument into redis client connection
proc write_big_bulk {size {prefix ""} {skip_read no}} {
init_large_mem_vars
assert {[string length prefix] <= $size}
r write "\$$size\r\n"
r write $prefix
incr size -[string length $prefix]
while {$size >= 500000000} {
r write $::str500
incr size -500000000
}
if {$size > 0} {
r write [string repeat x $size]
}
r write "\r\n"
if {!$skip_read} {
r flush
r read
}
}
# Utility to read big bulk response (work around Tcl limitations)
proc read_big_bulk {code {compare no} {prefix ""}} {
init_large_mem_vars
r readraw 1
set resp_len [uplevel 1 $code] ;# get the first line of the RESP response
assert_equal [string range $resp_len 0 0] "$"
set resp_len [string range $resp_len 1 end]
set prefix_len [string length $prefix]
if {$compare} {
assert {$prefix_len <= $resp_len}
assert {$prefix_len <= $::str500_len}
}
set remaining $resp_len
while {$remaining > 0} {
set l $remaining
if {$l > $::str500_len} {set l $::str500_len} ; # can't read more than 2gb at a time, so read 500mb so we can easily verify read data
set read_data [r rawread $l]
set nbytes [string length $read_data]
if {$compare} {
set comp_len $nbytes
# Compare prefix part
if {$remaining == $resp_len} {
assert_equal $prefix [string range $read_data 0 [expr $prefix_len - 1]]
set read_data [string range $read_data $prefix_len $nbytes]
incr comp_len -$prefix_len
}
# Compare rest of data, evaluate and then assert to avoid huge print in case of failure
set data_equal [expr {$read_data == [string range $::str500 0 [expr $comp_len - 1]]}]
assert $data_equal
}
incr remaining -$nbytes
}
assert_equal [r rawread 2] "\r\n"
r readraw 0
return $resp_len
}
proc prepare_value {size} {
set _v "c"
for {set i 1} {$i < $size} {incr i} {
append _v 0
}
return $_v
}
proc memory_usage {key} {
set usage [r memory usage $key]
if {![string match {*jemalloc*} [s mem_allocator]]} {
# libc allocator can sometimes return a different size allocation for the same requested size
# this makes tests that rely on MEMORY USAGE unreliable, so instead we return a constant 1
set usage 1
}
return $usage
}
# forward compatibility, lmap missing in TCL 8.5
proc lmap args {
set body [lindex $args end]
set args [lrange $args 0 end-1]
set n 0
set pairs [list]
foreach {varnames listval} $args {
set varlist [list]
foreach varname $varnames {
upvar 1 $varname var$n
lappend varlist var$n
incr n
}
lappend pairs $varlist $listval
}
set temp [list]
foreach {*}$pairs {
lappend temp [uplevel 1 $body]
}
set temp
}
proc format_command {args} {
set cmd "*[llength $args]\r\n"
foreach a $args {
append cmd "$[string length $a]\r\n$a\r\n"
}
set _ $cmd
}
# Returns whether or not the system supports stack traces
proc system_backtrace_supported {} {
set system_name [string tolower [exec uname -s]]
if {$system_name eq {darwin}} {
return 1
} elseif {$system_name ne {linux}} {
return 0
}
# libmusl does not support backtrace. Also return 0 on
# static binaries (ldd exit code 1) where we can't detect libmusl
if {![catch {set ldd [exec ldd src/redis-server]}]} {
if {![string match {*libc.*musl*} $ldd]} {
return 1
}
}
return 0
}
proc generate_largevalue_test_array {} {
array set largevalue {}
set largevalue(listpack) "hello"
set largevalue(quicklist) [string repeat "x" 8192]
return [array get largevalue]
}