Commit graph

2281 commits

Author SHA1 Message Date
Evan Hunt
1170a52f48 remove isc_task from xfrin
since the network manager is now handling timeouts, xfrin doesn't
need an isc_task object.

it may be necessary to revert this later if we find that it's
important for zone_xfrdone() to be executed in the zone task context.
currently things seem to be working well without that, though.
2020-11-09 13:45:43 +01:00
Evan Hunt
a8d28881d1 remove isc_timer from xfrin
the network manager can now handle timeouts, so it isn't
necessary for xfrin to use isc_timer for the purpose any
longer.
2020-11-09 13:45:43 +01:00
Evan Hunt
49d53a4aa9 use netmgr for xfrin
Use isc_nm_tcpdnsconnect() in xfrin.c for zone transfers.
2020-11-09 13:45:43 +01:00
Michal Nowak
1f6f7ccad6
Drop unused portlist code 2020-10-22 13:11:16 +02:00
Michal Nowak
e67737aa75
Drop unused dbtable code 2020-10-22 13:11:16 +02:00
Ondřej Surý
bb990030d3 Simplify the EDNS buffer size logic for DNS Flag Day 2020
The DNS Flag Day 2020 aims to remove the IP fragmentation problem from
the UDP DNS communication.  In this commit, we implement the required
changes and simplify the logic for picking the EDNS Buffer Size.

1. The defaults for `edns-udp-size`, `max-udp-size` and
   `nocookie-udp-size` have been changed to `1232` (the value picked by
   DNS Flag Day 2020).

2. The probing heuristics that would try 512->4096->1432->1232 buffer
   sizes has been removed and the resolver will always use just the
   `edns-udp-size` value.

3. Instead of just disabling the PMTUD mechanism on the UDP sockets, we
   now set IP_DONTFRAG (IPV6_DONTFRAG) flag.  That means that the UDP
   packets won't get ever fragmented.  If the ICMP packets are lost the
   UDP will just timeout and eventually be retried over TCP.
2020-10-05 16:21:21 +02:00
Matthijs Mekking
70d1ec432f Use explicit result codes for 'rndc dnssec' cmd
It is better to add new result codes than to overload existing codes.
2020-10-05 10:53:46 +02:00
Matthijs Mekking
edc53fc416 Various rndc dnssec -checkds fixes
While working on 'rndc dnssec -rollover' I noticed the following
(small) issues:

- The key files where updated with hints set to "-when" and that
  should always be "now.
- The kasp system test did not properly update the test number when
  calling 'rndc dnssec -checkds' (and ensuring that works).
- There was a missing ']' in the rndc.c help output.
2020-10-05 10:53:46 +02:00
Matthijs Mekking
df8276aef0 Add manual key rollover logic
Add to the keymgr a function that will schedule a rollover. This
basically means setting the time when the key needs to retire,
and updating the key lifetime, then update the state file. The next
time that named runs the keymgr the new lifetime will be taken into
account.
2020-10-05 10:52:19 +02:00
Ondřej Surý
33eefe9f85 The dns_message_create() cannot fail, change the return to void
The dns_message_create() function cannot soft fail (as all memory
allocations either succeed or cause abort), so we change the function to
return void and cleanup the calls.
2020-09-29 08:22:08 +02:00
Diego Fronza
12d6d13100 Refactored dns_message_t for using attach/detach semantics
This commit will be used as a base for the next code updates in order
to have a better control of dns_message_t objects' lifetime.
2020-09-29 08:22:08 +02:00
Matthijs Mekking
8beda7d2ea Add -expired flag to rndc dumpdb command
This flag is the same as -cache, but will use a different style format
that will also print expired entries (awaiting cleanup) from the cache.
2020-09-23 16:08:29 +02:00
Mark Andrews
f0d9bf7c30 Clone the saved / query message buffers
The message buffer passed to ns__client_request is only valid for
the life of the the ns__client_request call.  Save a copy of it
when we recurse or process a update as ns__client_request will
return before those operations complete.
2020-09-23 10:37:42 +10:00
Evan Hunt
dcee985b7f update all copyright headers to eliminate the typo 2020-09-14 16:20:40 -07:00
Mark Andrews
0b2555e8cf Address use after free between view, resolver and nta.
Hold a weak reference to the view so that it can't go away while
nta is performing its lookups.  Cancel nta timers once all external
references to the view have gone to prevent them triggering new work.
2020-08-11 11:00:49 +10:00
Matthijs Mekking
46fcd927e7 rndc dnssec -checkds set algorithm
In the rare case that you have multiple keys acting as KSK and that
have the same keytag, you can now set the algorithm when calling
'-checkds'.
2020-08-07 11:26:09 +02:00
Matthijs Mekking
a25f49f153 Make 'parent-registration-delay' obsolete
With the introduction of 'checkds', the 'parent-registration-delay'
option becomes obsolete.
2020-08-07 11:26:09 +02:00
Matthijs Mekking
04d8fc0143 Implement 'rndc dnssec -checkds'
Add a new 'rndc' command 'dnssec -checkds' that allows the user to
signal named that a new DS record has been seen published in the
parent, or that an existing DS record has been withdrawn from the
parent.

Upon the 'checkds' request, 'named' will write out the new state for
the key, updating the 'DSPublish' or 'DSRemoved' timing metadata.

This replaces the "parent-registration-delay" configuration option,
this was unreliable because it was purely time based (if the user
did not actually submit the new DS to the parent for example, this
could result in an invalid DNSSEC state).

Because we cannot rely on the parent registration delay for state
transition, we need to replace it with a different guard. Instead,
if a key wants its DS state to be moved to RUMOURED, the "DSPublish"
time must be set and must not be in the future. If a key wants its
DS state to be moved to UNRETENTIVE, the "DSRemoved" time must be set
and must not be in the future.

By default, with '-checkds' you set the time that the DS has been
published or withdrawn to now, but you can set a different time with
'-when'. If there is only one KSK for the zone, that key has its
DS state moved to RUMOURED. If there are multiple keys for the zone,
specify the right key with '-key'.
2020-08-07 11:26:09 +02:00
Ondřej Surý
e24bc324b4 Fix the rbt hashtable and grow it when setting max-cache-size
There were several problems with rbt hashtable implementation:

1. Our internal hashing function returns uint64_t value, but it was
   silently truncated to unsigned int in dns_name_hash() and
   dns_name_fullhash() functions.  As the SipHash 2-4 higher bits are
   more random, we need to use the upper half of the return value.

2. The hashtable implementation in rbt.c was using modulo to pick the
   slot number for the hash table.  This has several problems because
   modulo is: a) slow, b) oblivious to patterns in the input data.  This
   could lead to very uneven distribution of the hashed data in the
   hashtable.  Combined with the single-linked lists we use, it could
   really hog-down the lookup and removal of the nodes from the rbt
   tree[a].  The Fibonacci Hashing is much better fit for the hashtable
   function here.  For longer description, read "Fibonacci Hashing: The
   Optimization that the World Forgot"[b] or just look at the Linux
   kernel.  Also this will make Diego very happy :).

3. The hashtable would rehash every time the number of nodes in the rbt
   tree would exceed 3 * (hashtable size).  The overcommit will make the
   uneven distribution in the hashtable even worse, but the main problem
   lies in the rehashing - every time the database grows beyond the
   limit, each subsequent rehashing will be much slower.  The mitigation
   here is letting the rbt know how big the cache can grown and
   pre-allocate the hashtable to be big enough to actually never need to
   rehash.  This will consume more memory at the start, but since the
   size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
   will only consume maximum of 32GB of memory for hashtable in the
   worst case (and max-cache-size would need to be set to more than
   4TB).  Calling the dns_db_adjusthashsize() will also cap the maximum
   size of the hashtable to the pre-computed number of bits, so it won't
   try to consume more gigabytes of memory than available for the
   database.

   FIXME: What is the average size of the rbt node that gets hashed?  I
   chose the pagesize (4k) as initial value to precompute the size of
   the hashtable, but the value is based on feeling and not any real
   data.

For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.

Notes:
a. A doubly linked list should be used here to speedup the removal of
   the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
2020-07-21 08:44:26 +02:00
Michał Kępień
53120279b5 Fix locking for LMDB 0.9.26
When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views.  Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:

 1. mdb_env_open(envA) is called when the view is first created.
 2. "rndc reconfig" is called.
 3. mdb_env_open(envB) is called for the new instance of the view.
 4. mdb_env_close(envA) is called for the old instance of the view.

This seems to have worked so far.  However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).

Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.

To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead.  Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps).  Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.

[1] 2fd44e3251
2020-07-10 11:29:18 +02:00
Evan Hunt
16e14353b1 add "primaries" as a synonym for "masters" in named.conf
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.

added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
2020-07-01 11:11:34 -07:00
Matthijs Mekking
19ce9ec1d4 Output rndc dnssec -status
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.

Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.
2020-06-30 09:51:04 +02:00
Mark Andrews
ff4fc3f8dc val->keynode is no longer needed 2020-06-11 16:03:11 +10:00
Mark Andrews
e5b2eca1d3 The dsset returned by dns_keynode_dsset needs to be thread safe.
- clone keynode->dsset rather than return a pointer so that thread
  use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
  while in use.
- create a new keynode when removing DS records so that dangling
  pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
  when DS records are added.
2020-06-11 16:02:09 +10:00
Mark Andrews
35a58d30c9 Reject primary zones with an DS record at the zone apex.
DS records only belong at delegation points and if present
at the zone apex are invariably the result of administrative
errors.  Additionally they can't be queried for with modern
resolvers as the parent servers will be queried.
2020-06-04 16:00:33 +02:00
Mark Andrews
3ee5ea2fdb Reduce the number of fetches we make when looking up addresses
If there are more that 5 NS record for a zone only perform a
maximum of 4 address lookups for all the name servers.  This
limits the amount of remote lookup performed for server
addresses at each level for a given query.
2020-05-19 12:30:29 +02:00
Mark Andrews
919a9ece25 enforce record count maximums 2020-05-13 15:35:28 +10:00
Mark Andrews
79de6edde8 allow grant rules to be retrieved 2020-05-13 15:35:28 +10:00
Mark Andrews
361ec726cb allow per type record counts to be specified 2020-05-13 15:35:28 +10:00
Mark Andrews
b144ae1bb0 Report Extended DNS Error codes 2020-05-12 22:01:54 +10:00
Diego Fronza
f2bf7beeb6 Added new logging category rpz-passthru
It is now possible to use the new logging category "rpz-passthru"
to redirect RPZ passthru activity to a dedicate logging channel.
2020-05-07 11:44:48 -03:00
Ondřej Surý
6494665f08 Remove 'ephemeral' database implementation
The 'ephemeral' database implementation was used to provide a
lightweight database implemenation that doesn't cache results, and the
only place where it was really use is "samples" because delv is
overriding this to use "rbtdb" instead. Otherwise it was completely
unused.

 * The 'ephemeral' cache DB (ecdb) implementation.  An ecdb just provides
 * temporary storage for ongoing name resolution with the common DB interfaces.
 * It actually doesn't cache anything.  The implementation expects any stored
 * data is released within a short period, and does not care about the
 * scalability in terms of the number of nodes.
2020-04-23 18:05:53 +02:00
Ondřej Surý
978c7b2e89 Complete rewrite the BIND 9 build system
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests.  Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.

This squashed commit contains following changes:

- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
  by using automake

- the libtool is now properly integrated with automake (the way we used it
  was rather hackish as the only official way how to use libtool is via
  automake

- the dynamic module loading was rewritten from a custom patchwork to libtool's
  libltdl (which includes the patchwork to support module loading on different
  systems internally)

- conversion of the unit test executor from kyua to automake parallel driver

- conversion of the system test executor from custom make/shell to automake
  parallel driver

- The GSSAPI has been refactored, the custom SPNEGO on the basis that
  all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
  support SPNEGO mechanism.

- The various defunct tests from bin/tests have been removed:
  bin/tests/optional and bin/tests/pkcs11

- The text files generated from the MD files have been removed, the
  MarkDown has been designed to be readable by both humans and computers

- The xsl header is now generated by a simple sed command instead of
  perl helper

- The <irs/platform.h> header has been removed

- cleanups of configure.ac script to make it more simpler, addition of multiple
  macros (there's still work to be done though)

- the tarball can now be prepared with `make dist`

- the system tests are partially able to run in oot build

Here's a list of unfinished work that needs to be completed in subsequent merge
requests:

- `make distcheck` doesn't yet work (because of system tests oot run is not yet
  finished)

- documentation is not yet built, there's a different merge request with docbook
  to sphinx-build rst conversion that needs to be rebased and adapted on top of
  the automake

- msvc build is non functional yet and we need to decide whether we will just
  cross-compile bind9 using mingw-w64 or fix the msvc build

- contributed dlz modules are not included neither in the autoconf nor automake
2020-04-21 14:19:48 +02:00
Ondřej Surý
4df5a5832c Remove files generated by autotools 2020-04-21 14:19:30 +02:00
Mark Andrews
eeeaf9dbd4 Move structure declarations from dns/peer.h into peer.c 2020-04-20 08:59:09 +00:00
Matthijs Mekking
44b49955e1 Replace sign operation bool with enum 2020-04-03 09:27:15 +02:00
Matthijs Mekking
b2028e26da Embed algorithm in key tag counter
Key tags are not unique across algorithms.
2020-04-03 09:27:15 +02:00
Matthijs Mekking
705810d577 Redesign dnssec sign statistics
The first attempt to add DNSSEC sign statistics was naive: for each
zone we allocated 64K counters, twice.  In reality each zone has at
most four keys, so the new approach only has room for four keys per
zone. If after a rollover more keys have signed the zone, existing
keys are rotated out.

The DNSSEC sign statistics has three counters per key, so twelve
counters per zone. First counter is actually a key id, so it is
clear what key contributed to the metrics.  The second counter
tracks the number of generated signatures, and the third tracks
how many of those are refreshes.

This means that in the zone structure we no longer need two separate
references to DNSSEC sign metrics: both the resign and refresh stats
are kept in a single dns_stats structure.

Incrementing dnssecsignstats:

Whenever a dnssecsignstat is incremented, we look up the key id
to see if we already are counting metrics for this key.  If so,
we update the corresponding operation counter (resign or
refresh).

If the key is new, store the value in a new counter and increment
corresponding counter.

If all slots are full, we rotate the keys and overwrite the last
slot with the new key.

Dumping dnssecsignstats:

Dumping dnssecsignstats is no longer a simple wrapper around
isc_stats_dump, but uses the same principle.  The difference is that
rather than dumping the index (key tag) and counter, we have to look
up the corresponding counter.
2020-04-03 09:27:11 +02:00
Diego Fronza
c786c578d7 Added RPZ configuration option "nsdname-wait-recurse"
This new option was added to fill a gap in RPZ configuration
options.

It was possible to instruct BIND wheter NSIP rewritting rules would
apply or not, as long as the required data was already in cache or not,
respectively, by means of the option nsip-wait-recurse.

A value of yes (default) could incur a little processing cost, since
BIND would need to recurse to find NS addresses in case they were not in
the cache.

This behavior could be changed by setting nsip-wait-recurse value to no,
in which case BIND would promptly return some error code if the NS IP addresses
data were not in cache, then BIND would start a recursive query
in background, so future similar requests would have the required data
(NS IPs) in cache, allowing BIND to apply NSIP rules accordingly.

A similar feature wasn't available for NSDNAME triggers, so this commit
adds the option nsdname-wait-recurse to fill this gap, as it was
expected by couple BIND users.
2020-03-16 15:18:46 -03:00
Evan Hunt
98b55eb442 improve calculation of database transfer size
- change name of 'bytes' to 'xfrsize' in dns_db_getsize() parameter list
  and related variables; this is a more accurate representation of what
  the function is doing
- change the size calculations in dns_db_getsize() to more accurately
  represent the space needed for a *XFR message or journal file to contain
  the data in the database. previously we returned the sizes of all
  rdataslabs, including header overhead and offset tables, which
  resulted in the database size being reported as much larger than the
  equivalent *XFR or journal.
- map files caused a particular problem here: the fullname can't be
  determined from the node while a file is being deserialized, because
  the uppernode pointers aren't set yet. so we store "full name length"
  in the dns_rbtnode structure while serializing, and clear it after
  deserialization is complete.
2020-03-05 17:20:16 -08:00
Evan Hunt
52a31a9883 dns_journal_iter_init() can now return the size of the delta
the call initailizing a journal iterator can now optionally return
to the caller the size in bytes of an IXFR message (not including
DNS header overhead, signatures etc) containing the differences from
the beginning to the ending serial number.

this is calculated by scanning the journal transaction headers to
calculate the transfer size. since journal file records contain a length
field that is not included in IXFR messages, we subtract out the length
of those fields from the overall transaction length.

this necessitated adding an "RR count" field to the journal transaction
header, so we know how many length fields to subract. NOTE: this will
make existing journal files stop working!
2020-03-05 17:20:16 -08:00
Evan Hunt
aeef4719e9 add syntax and setter/getter functions to configure max-ixfr-ratio 2020-03-05 17:20:16 -08:00
Michał Kępień
b675d30f09 Fix lists of installed header files 2020-03-05 23:09:51 +00:00
Evan Hunt
7a3fa9f593 list "validate-except" entries in "rndc nta -d" and "rndc secroots"
- no longer exclude these entries when dumping the NTA table
- indicate "validate-except" entries with the keyword "permanent" in
  place of an expiry date
- add a test for this feature, and update other tests to account for
  the presence of extra lines in some rndc outputs
- incidentally removed the unused function dns_ntatable_dump()
- CHANGES, release note
2020-03-04 00:44:32 -08:00
Witold Kręcicki
3a3b5f557a Add an arena to compressctx 2020-02-26 07:57:44 +00:00
Evan Hunt
ba0313e649 fix spelling errors reported by Fossies. 2020-02-21 15:05:08 +11:00
Diego Fronza
63c88f4a04 Enable named-checkzone and named-compilezone to take input from stdin
If a filename (the last argument) is not provided for named-checkzone or
named-compilezone, or if it is a single dash "-" character,
zone data will be read from stdin.

Example of invocation:
cat /etc/zone_name.db | named-compilezone -f text -F raw \
    -o zone_name.raw zone_name
2020-02-20 11:19:13 -03:00
Ondřej Surý
654927c871 Add separate .clang-format files for headers 2020-02-14 09:31:05 +01:00
Evan Hunt
e851ed0bb5 apply the modified style 2020-02-13 15:05:06 -08:00
Ondřej Surý
056e133c4c Use clang-tidy to add curly braces around one-line statements
The command used to reformat the files in this commit was:

./util/run-clang-tidy \
	-clang-tidy-binary clang-tidy-11
	-clang-apply-replacements-binary clang-apply-replacements-11 \
	-checks=-*,readability-braces-around-statements \
	-j 9 \
	-fix \
	-format \
	-style=file \
	-quiet
clang-format -i --style=format $(git ls-files '*.c' '*.h')
uncrustify -c .uncrustify.cfg --replace --no-backup $(git ls-files '*.c' '*.h')
clang-format -i --style=format $(git ls-files '*.c' '*.h')
2020-02-13 22:07:21 +01:00