bind9/bin/tests/system/cacheclean/tests.sh
Ondřej Surý 50f357cb36
Refactor the dns_adb unit
The dns_adb unit has been refactored to be much simpler.  Following
changes have been made:

1. Simplify the ADB to always allow GLUE and hints

   There were only two places where dns_adb_createfind() was used - in
   the dns_resolver unit where hints and GLUE addresses were ok, and in
   the dns_zone where dns_adb_createfind() would be called without
   DNS_ADBFIND_HINTOK and DNS_ADBFIND_GLUEOK set.

   Simplify the logic by allowing hint and GLUE addresses when looking
   up the nameserver addresses to notify.  The difference is negligible
   and would cause a difference in the notified addresses only when
   there's mismatch between the parent and child addresses and we
   haven't cached the child addresses yet.

2. Drop the namebuckets and entrybuckets

   Formerly, the namebuckets and entrybuckets were used to reduced the
   lock contention when accessing the double-linked lists stored in each
   bucket.  In the previous refactoring, the custom hashtable for the
   buckets has been replaced with isc_ht/isc_hashmap, so only a single
   item (mostly, see below) would end up in each bucket.

   Removing the entrybuckets has been straightforward, the only matching
   was done on the isc_sockaddr_t member of the dns_adbentry.

   Removing the zonebuckets required GLUEOK and HINTOK bits to be
   removed because the find could match entries with-or-without the bits
   set, and creating a custom key that stores the
   DNS_ADBFIND_STARTATZONE in the first byte of the key, so we can do a
   straightforward lookup into the hashtable without traversing a list
   that contains items with different flags.

3. Remove unassociated entries from ADB database

   Previously, the adbentries could live in the ADB database even after
   unlinking them from dns_adbnames.  Such entries would show up as
   "Unassociated entries" in the ADB dump.  The benefit of keeping such
   entries is little - the chance that we link such entry to a adbname
   is small, and it's simpler to evict unlinked entries from the ADB
   cache (and the hashtable) than create second LRU cleaning mechanism.

   Unlinked ADB entries are now directly deleted from the hash
   table (hashmap) upon destruction.

4. Cleanup expired entries from the hash table

   When buckets were still in place, the code would keep the buckets
   always allocated and never shrink the hash table (hashmap).  With
   proper reference counting in place, we can delete the adbnames from
   the hash table and the LRU list.

5. Stop purging the names early when we hit the time limit

   Because the LRU list is now time ordered, we can stop purging the
   names when we find a first entry that doesn't fullfil our time-based
   eviction criteria because no further entry on the LRU list will meet
   the criteria.

Future work:

1. Lock contention

   In this commit, the focus was on correctness of the data structure,
   but in the future, the lock contention in the ADB database needs to
   be addressed.  Currently, we use simple mutex to lock the hash
   tables, because we almost always need to use a write lock for
   properly purging the hashtables.  The ADB database needs to be
   sharded (similar to the effect that buckets had in the past).  Each
   shard would contain own hashmap and own LRU list.

2. Time-based purging

   The ADB names and entries stay intact when there are no lookups.
   When we add separate shards, a timer needs to be added for time-based
   cleaning in case there's no traffic hashing to the inactive shard.

3. Revisit the 30 minutes limit

   The ADB cache is capped at 30 minutes.  This needs to be revisited,
   and at least the limit should be configurable (in both directions).
2022-11-30 10:03:24 +01:00

265 lines
8.7 KiB
Bash
Executable file

#!/bin/sh
# Copyright (C) Internet Systems Consortium, Inc. ("ISC")
#
# SPDX-License-Identifier: MPL-2.0
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, you can obtain one at https://mozilla.org/MPL/2.0/.
#
# See the COPYRIGHT file distributed with this work for additional
# information regarding copyright ownership.
. ../conf.sh
status=0
n=0
RNDCOPTS="-c ../common/rndc.conf -s 10.53.0.2 -p ${CONTROLPORT}"
DIGOPTS="+nosea +nocomm +nocmd +noquest +noadd +noauth +nocomm \
+nostat @10.53.0.2 -p ${PORT}"
# fill the cache with nodes from flushtest.example zone
load_cache () {
# empty all existing cache data
$RNDC $RNDCOPTS flush
# load the positive cache entries
$DIG $DIGOPTS -f - << EOF > /dev/null 2>&1
txt top1.flushtest.example
txt second1.top1.flushtest.example
txt third1.second1.top1.flushtest.example
txt third2.second1.top1.flushtest.example
txt second2.top1.flushtest.example
txt second3.top1.flushtest.example
txt second1.top2.flushtest.example
txt second2.top2.flushtest.example
txt second3.top2.flushtest.example
txt top3.flushtest.example
txt second1.top3.flushtest.example
txt third1.second1.top3.flushtest.example
txt third2.second1.top3.flushtest.example
txt third1.second2.top3.flushtest.example
txt third2.second2.top3.flushtest.example
txt second3.top3.flushtest.example
EOF
# load the negative cache entries
# nxrrset:
$DIG $DIGOPTS a third1.second1.top1.flushtest.example > /dev/null
# nxdomain:
$DIG $DIGOPTS txt top4.flushtest.example > /dev/null
# empty nonterminal:
$DIG $DIGOPTS txt second2.top3.flushtest.example > /dev/null
# sleep 2 seconds ensure the TTLs will be lower on cached data
sleep 2
}
dump_cache () {
rndc_dumpdb ns2 -cache _default
}
clear_cache () {
$RNDC $RNDCOPTS flush
}
in_cache () {
ttl=`$DIG $DIGOPTS "$@" | awk '{print $2}'`
[ -z "$ttl" ] && {
ttl=`$DIG $DIGOPTS +noanswer +auth "$@" | awk '{print $2}'`
[ "$ttl" -ge 3599 ] && return 1
return 0
}
[ "$ttl" -ge 3599 ] && return 1
return 0
}
# Extract records at and below name "$1" from the cache dump in file "$2".
filter_tree () {
tree="$1"
file="$2"
perl -n -e '
next if /^;/;
if (/'"$tree"'/ || (/^\t/ && $print)) {
$print = 1;
} else {
$print = 0;
}
print if $print;
' "$file"
}
n=`expr $n + 1`
echo_i "check correctness of routine cache cleaning ($n)"
$DIG $DIGOPTS +tcp +keepopen -b 10.53.0.7 -f dig.batch > dig.out.ns2 || status=1
digcomp --lc dig.out.ns2 knowngood.dig.out || status=1
n=`expr $n + 1`
echo_i "only one tcp socket was used ($n)"
tcpclients=`awk '$3 == "client" && $5 ~ /10.53.0.7#[0-9]*:/ {print $5}' ns2/named.run | sort | uniq -c | wc -l`
test $tcpclients -eq 1 || { status=1; echo_i "failed"; }
n=`expr $n + 1`
echo_i "reset and check that records are correctly cached initially ($n)"
ret=0
load_cache
dump_cache
nrecords=`filter_tree flushtest.example ns2/named_dump.db.test$n | grep -E '(TXT|ANY)' | wc -l`
[ $nrecords -eq 18 ] || { ret=1; echo_i "found $nrecords records expected 18"; }
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushing of the full cache ($n)"
ret=0
clear_cache
dump_cache
nrecords=`filter_tree flushtest.example ns2/named_dump.db.test$n | wc -l`
[ $nrecords -eq 0 ] || ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushing of individual nodes (interior node) ($n)"
ret=0
clear_cache
load_cache
# interior node
in_cache txt top1.flushtest.example || ret=1
$RNDC $RNDCOPTS flushname top1.flushtest.example
in_cache txt top1.flushtest.example && ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushing of individual nodes (leaf node, under the interior node) ($n)"
ret=0
# leaf node, under the interior node (should still exist)
in_cache txt third2.second1.top1.flushtest.example || ret=1
$RNDC $RNDCOPTS flushname third2.second1.top1.flushtest.example
in_cache txt third2.second1.top1.flushtest.example && ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushing of individual nodes (another leaf node, with both positive and negative cache entries) ($n)"
ret=0
# another leaf node, with both positive and negative cache entries
in_cache a third1.second1.top1.flushtest.example || ret=1
in_cache txt third1.second1.top1.flushtest.example || ret=1
$RNDC $RNDCOPTS flushname third1.second1.top1.flushtest.example
in_cache a third1.second1.top1.flushtest.example && ret=1
in_cache txt third1.second1.top1.flushtest.example && ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushing a nonexistent name ($n)"
ret=0
$RNDC $RNDCOPTS flushname fake.flushtest.example || ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushing of namespaces ($n)"
ret=0
clear_cache
load_cache
# flushing leaf node should leave the interior node:
in_cache txt third1.second1.top1.flushtest.example || ret=1
in_cache txt top1.flushtest.example || ret=1
$RNDC $RNDCOPTS flushtree third1.second1.top1.flushtest.example
in_cache txt third1.second1.top1.flushtest.example && ret=1
in_cache txt top1.flushtest.example || ret=1
in_cache txt second1.top1.flushtest.example || ret=1
in_cache txt third2.second1.top1.flushtest.example || ret=1
$RNDC $RNDCOPTS flushtree second1.top1.flushtest.example
in_cache txt top1.flushtest.example || ret=1
in_cache txt second1.top1.flushtest.example && ret=1
in_cache txt third2.second1.top1.flushtest.example && ret=1
# flushing from an empty node should still remove all its children
in_cache txt second1.top2.flushtest.example || ret=1
$RNDC $RNDCOPTS flushtree top2.flushtest.example
in_cache txt second1.top2.flushtest.example && ret=1
in_cache txt second2.top2.flushtest.example && ret=1
in_cache txt second3.top2.flushtest.example && ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushing a nonexistent namespace ($n)"
ret=0
$RNDC $RNDCOPTS flushtree fake.flushtest.example || ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check the number of cached records remaining ($n)"
ret=0
dump_cache
nrecords=`filter_tree flushtest.example ns2/named_dump.db.test$n | grep -v '^;' | grep -E '(TXT|ANY)' | wc -l`
[ $nrecords -eq 17 ] || { ret=1; echo_i "found $nrecords records expected 17"; }
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check the check that flushname of a partial match works ($n)"
ret=0
in_cache txt second2.top1.flushtest.example || ret=1
$RNDC $RNDCOPTS flushtree example
in_cache txt second2.top1.flushtest.example && ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check the number of cached records remaining ($n)"
ret=0
dump_cache
nrecords=`filter_tree flushtest.example ns2/named_dump.db.test$n | grep -E '(TXT|ANY)' | wc -l`
[ $nrecords -eq 1 ] || { ret=1; echo_i "found $nrecords records expected 1"; }
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check flushtree clears adb correctly ($n)"
ret=0
load_cache
dump_cache
mv ns2/named_dump.db.test$n ns2/named_dump.db.test$n.a
sed -n '/plain success\/timeout/,/Unassociated entries/p' \
ns2/named_dump.db.test$n.a > sed.out.$n.a
grep 'plain success/timeout' sed.out.$n.a > /dev/null 2>&1 || ret=1
grep 'ns.flushtest.example' sed.out.$n.a > /dev/null 2>&1 || ret=1
$RNDC $RNDCOPTS flushtree flushtest.example || ret=1
dump_cache
mv ns2/named_dump.db.test$n ns2/named_dump.db.test$n.b
sed -n '/plain success\/timeout/,/Unassociated entries/p' \
ns2/named_dump.db.test$n.b > sed.out.$n.b
grep 'plain success/timeout' sed.out.$n.b > /dev/null 2>&1 || ret=1
grep 'ns.flushtest.example' sed.out.$n.b > /dev/null 2>&1 && ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check expire option returned from primary zone ($n)"
ret=0
$DIG @10.53.0.1 -p ${PORT} +expire soa expire-test > dig.out.expire
grep EXPIRE: dig.out.expire > /dev/null || ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
n=`expr $n + 1`
echo_i "check expire option returned from secondary zone ($n)"
ret=0
$DIG @10.53.0.2 -p ${PORT} +expire soa expire-test > dig.out.expire
grep EXPIRE: dig.out.expire > /dev/null || ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=`expr $status + $ret`
echo_i "exit status: $status"
[ $status -eq 0 ] || exit 1