Commit graph

1236 commits

Author SHA1 Message Date
Ondřej Surý
36ddefacb4 Change the isc_nm_(get|set)timeouts() to work with milliseconds
The RFC7828 specifies the keepalive interval to be 16-bit, specified in
units of 100 milliseconds and the configuration options tcp-*-timeouts
are following the suit.  The units of 100 milliseconds are very
unintuitive and while we can't change the configuration and presentation
format, we should not follow this weird unit in the API.

This commit changes the isc_nm_(get|set)timeouts() functions to work
with milliseconds and convert the values to milliseconds before passing
them to the function, not just internally.
2021-03-18 16:37:57 +01:00
Ondřej Surý
0f44139145 Bump the maximum number of hazard pointers in tests
On 24-core machine, the tests would crash because we would run out of
the hazard pointers.  We now adjust the number of hazard pointers to be
in the <128,256> interval based on the number of available cores.

Note: This is just a band-aid and needs a proper fix.
2021-02-18 19:32:55 +01:00
Evan Hunt
2b2e1a02bd allow configuration of "default" http endpoint
specifying "http default" in a listen-on statement sets up
the default "/dns-query" endpoint. tests and documentation
have been updated.
2021-02-16 16:24:35 -08:00
Evan Hunt
957052eea5 move listen-on correctness checks into check.c
errors in listen-on and listen-on-v6 can now be detected
by named-checkconf.
2021-02-16 16:24:35 -08:00
Evan Hunt
fd763d7223 enable listen-on parameters to be specified in any order
updated the parser to allow the "port", "tls" and "http"
paramters to "listen-on" and "listen-on-v6" to be specified in any
order. previously the parser would throw an error if any other order
was used than port, tls, http.
2021-02-16 16:24:35 -08:00
Evan Hunt
07f525bae5 require "tls none" for unencrypted HTTP listeners
unencrypted DoH connections may be used in some operational
environments where encryption is handled by a reverse proxy,
but it's going to be relatively rare, so we shouldn't make it
easy to do by mistake.  this commit changes the syntax for
listen-on and listen-on-v6 so that if "http" is specified, "tls"
must also be specified; for unencrypted listeners, "tls none"
can be used.
2021-02-16 16:24:35 -08:00
Ondřej Surý
23c3bcc711 Stop including dnstap headers from <dns/dnstap.h>
The <fstrm.h> and <protobuf-c/protobuf-c.h> headers are only directly
included where used and we stopped exposing those headers from libdns
headers.
2021-02-16 01:04:46 +00:00
Diego Fronza
30729c7013 Fix dangling references to outdated views after reconfig
This commit fix a leak which was happening every time an inline-signed
zone was added to the configuration, followed by a rndc reconfig.

During the reconfig process, the secure version of every inline-signed
zone was "moved" to a new view upon a reconfig and it "took the raw
version along", but only once the secure version was freed (at shutdown)
was prev_view for the raw version detached from, causing the old view to
be released as well.

This caused dangling references to be kept for the previous view, thus
keeping all resources used by that view in memory.
2021-02-15 11:15:20 -03:00
Evan Hunt
fe99484e14 support "tls ephemeral" with https 2021-02-03 12:06:17 +01:00
Evan Hunt
aa9d51c494 tls and http configuration code was unnecessarily complex
removed the isc_cfg_http_t and isc_cfg_tls_t structures
and the functions that loaded and accessed them; this can
be done using normal config parser functions.
2021-02-03 12:06:17 +01:00
Artem Boldariev
08da09bc76 Initial support for DNS-over-HTTP(S)
This commit completes the support for DNS-over-HTTP(S) built on top of
nghttp2 and plugs it into the BIND. Support for both GET and POST
requests is present, as required by RFC8484.

Both encrypted (via TLS) and unencrypted HTTP/2 connections are
supported. The latter are mostly there for debugging/troubleshooting
purposes and for the means of encryption offloading to third-party
software (as might be desirable in some environments to simplify TLS
certificates management).
2021-02-03 12:06:17 +01:00
Ondřej Surý
e488309da7 implement xfrin via XoT
Add support for a "tls" key/value pair for zone primaries, referencing
either a "tls" configuration statement or "ephemeral". If set to use
TLS, zones will send SOA and AXFR/IXFR queries over a TLS channel.
2021-01-29 12:07:38 +01:00
Mark Andrews
2b3fcd7156 Pass an afg_aclconfctx_t structure to cfg_acl_fromconfig
in named_zone_inlinesigning.  A NULL pointer does not work.
2021-01-28 01:54:59 +00:00
Mark Andrews
dd3520ae41 Improve the diagnostic 'rndc retransfer' error message 2021-01-28 08:43:03 +11:00
Diego Fronza
0ad6f594f6 Added option for disabling stale-answer-client-timeout
This commit allows to specify "disabled" or "off" in
stale-answer-client-timeout statement. The logic to support this
behavior will be added in the subsequent commits.

This commit also ensures an upper bound to stale-answer-client-timeout
which equals to one second less than 'resolver-query-timeout'.
2021-01-25 10:47:14 -03:00
Diego Fronza
171a5b7542 Add stale-answer-client-timeout option
The general logic behind the addition of this new feature works as
folows:

When a client query arrives, the basic path (query.c / ns_query_recurse)
was to create a fetch, waiting for completion in fetch_callback.

With the introduction of stale-answer-client-timeout, a new event of
type DNS_EVENT_TRYSTALE may invoke fetch_callback, whenever stale
answers are enabled and the fetch took longer than
stale-answer-client-timeout to complete.

When an event of type DNS_EVENT_TRYSTALE triggers fetch_callback, we
must ensure that the folowing happens:

1. Setup a new query context with the sole purpose of looking up for
   stale RRset only data, for that matters a new flag was added
   'DNS_DBFIND_STALEONLY' used in database lookups.

    . If a stale RRset is found, mark the original client query as
      answered (with a new query attribute named NS_QUERYATTR_ANSWERED),
      so when the fetch completion event is received later, we avoid
      answering the client twice.

    . If a stale RRset is not found, cleanup and wait for the normal
      fetch completion event.

2. In ns_query_done, we must change this part:
	/*
	 * If we're recursing then just return; the query will
	 * resume when recursion ends.
	 */
	if (RECURSING(qctx->client)) {
		return (qctx->result);
	}

   To this:

	if (RECURSING(qctx->client) && !QUERY_STALEONLY(qctx->client)) {
		return (qctx->result);
	}

   Otherwise we would not proceed to answer the client if it happened
   that a stale answer was found when looking up for stale only data.

When an event of type DNS_EVENT_FETCHDONE triggers fetch_callback, we
proceed as before, resuming query, updating stats, etc, but a few
exceptions had to be added, most important of which are two:

1. Before answering the client (ns_client_send), check if the query
   wasn't already answered before.

2. Before detaching a client, e.g.
   isc_nmhandle_detach(&client->reqhandle), ensure that this is the
   fetch completion event, and not the one triggered due to
   stale-answer-client-timeout, so a correct call would be:
   if (!QUERY_STALEONLY(client)) {
        isc_nmhandle_detach(&client->reqhandle);
   }

Other than these notes, comments were added in code in attempt to make
these updates easier to follow.
2021-01-25 10:47:14 -03:00
Matthijs Mekking
cf420b2af0 Treat dnssec-policy "none" as a builtin zone
Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig'
api so that the 'name' will determine what policy needs to be
configured.

When transitioning a zone from secure to insecure, there will be
cases when a zone with no DNSSEC policy (dnssec-policy none) should
be using KASP. When there are key state files available, this is an
indication that the zone once was DNSSEC signed but is reconfigured
to become insecure.

If we would not run the keymgr, named would abruptly remove the
DNSSEC records from the zone, making the zone bogus. Therefore,
change the code such that a zone will use kasp if there is a valid
dnssec-policy configured, or if there are state files available.
2020-12-23 09:02:11 +01:00
Mark Andrews
c51ef23c22 Implement ipv4only.arpa forward and reverse zones as per RFC 8880. 2020-12-11 14:16:40 +11:00
Ondřej Surý
7ba18870dc Reformat sources using clang-format-11 2020-12-08 18:36:23 +01:00
Ondřej Surý
634bdfb16d Refactor netmgr and add more unit tests
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested.  It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.

NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.

The changes that are included in this commit are listed here
(extensively, but not exclusively):

* The netmgr_test unit test was split into individual tests (udp_test,
  tcp_test, tcpdns_test and newly added tcp_quota_test)

* The udp_test and tcp_test has been extended to allow programatic
  failures from the libuv API.  Unfortunately, we can't use cmocka
  mock() and will_return(), so we emulate the behaviour with #define and
  including the netmgr/{udp,tcp}.c source file directly.

* The netievents that we put on the nm queue have variable number of
  members, out of these the isc_nmsocket_t and isc_nmhandle_t always
  needs to be attached before enqueueing the netievent_<foo> and
  detached after we have called the isc_nm_async_<foo> to ensure that
  the socket (handle) doesn't disappear between scheduling the event and
  actually executing the event.

* Cancelling the in-flight TCP connection using libuv requires to call
  uv_close() on the original uv_tcp_t handle which just breaks too many
  assumptions we have in the netmgr code.  Instead of using uv_timer for
  TCP connection timeouts, we use platform specific socket option.

* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}

  When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
  waiting for socket to either end up with error (that path was fine) or
  to be listening or connected using condition variable and mutex.

  Several things could happen:

    0. everything is ok

    1. the waiting thread would miss the SIGNAL() - because the enqueued
       event would be processed faster than we could start WAIT()ing.
       In case the operation would end up with error, it would be ok, as
       the error variable would be unchanged.

    2. the waiting thread miss the sock->{connected,listening} = `true`
       would be set to `false` in the tcp_{listen,connect}close_cb() as
       the connection would be so short lived that the socket would be
       closed before we could even start WAIT()ing

* The tcpdns has been converted to using libuv directly.  Previously,
  the tcpdns protocol used tcp protocol from netmgr, this proved to be
  very complicated to understand, fix and make changes to.  The new
  tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
  Closes: #2194, #2283, #2318, #2266, #2034, #1920

* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
  pass accepted TCP sockets between netthreads, but instead (similar to
  UDP) uses per netthread uv_loop listener.  This greatly reduces the
  complexity as the socket is always run in the associated nm and uv
  loops, and we are also not touching the libuv internals.

  There's an unfortunate side effect though, the new code requires
  support for load-balanced sockets from the operating system for both
  UDP and TCP (see #2137).  If the operating system doesn't support the
  load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
  on FreeBSD 12+), the number of netthreads is limited to 1.

* The netmgr has now two debugging #ifdefs:

  1. Already existing NETMGR_TRACE prints any dangling nmsockets and
     nmhandles before triggering assertion failure.  This options would
     reduce performance when enabled, but in theory, it could be enabled
     on low-performance systems.

  2. New NETMGR_TRACE_VERBOSE option has been added that enables
     extensive netmgr logging that allows the software engineer to
     precisely track any attach/detach operations on the nmsockets and
     nmhandles.  This is not suitable for any kind of production
     machine, only for debugging.

* The tlsdns netmgr protocol has been split from the tcpdns and it still
  uses the old method of stacking the netmgr boxes on top of each other.
  We will have to refactor the tlsdns netmgr protocol to use the same
  approach - build the stack using only libuv and openssl.

* Limit but not assert the tcp buffer size in tcp_alloc_cb
  Closes: #2061
2020-12-01 16:47:07 +01:00
Matthijs Mekking
6b5d7357df Detect NSEC3 salt collisions
When generating a new salt, compare it with the previous NSEC3
paremeters to ensure the new parameters are different from the
previous ones.

This moves the salt generation call from 'bin/named/*.s' to
'lib/dns/zone.c'. When setting new NSEC3 parameters, you can set a new
function parameter 'resalt' to enforce a new salt to be generated. A
new salt will also be generated if 'salt' is set to NULL.

Logging salt with zone context can now be done with 'dnssec_log',
removing the need for 'dns_nsec3_log_salt'.
2020-11-26 10:43:59 +01:00
Matthijs Mekking
3b4c764b43 Add zone context to "generated salt" logs 2020-11-26 10:43:59 +01:00
Matthijs Mekking
7878f300ff Move logging of salt in separate function
There may be a desire to log the salt without losing the context
of log module, level, and category.
2020-11-26 10:43:59 +01:00
Matthijs Mekking
eae9a6d297 Don't use 'rndc signing' with kasp
The 'rndc signing' command allows you to manipulate the private
records that are used to store signing state. Don't use these with
'dnssec-policy' as such manipulations may violate the policy (if you
want to change the NSEC3 parameters, change the policy and reconfig).
2020-11-26 10:43:27 +01:00
Matthijs Mekking
ba8128ea00 Fix a reconfig bug wrt inline-signing
When doing 'rndc reconfig', named may complain about a zone not being
reusable because it has a raw version of the zone, and the new
configuration has not set 'inline-signing'. However, 'inline-signing'
may be implicitly true if a 'dnssec-policy' is used for the zone, and
the zone is not dynamic.

Improve the check in 'named_zone_reusable'.  Create a new function for
checking 'inline-signing' configuration that matches existing code in
'bin/named/server.c'.
2020-11-26 10:43:27 +01:00
Matthijs Mekking
114af58ee2 Support for NSEC3 in dnssec-policy
Implement support for NSEC3 in dnssec-policy.  Store the configuration
in kasp objects. When configuring a zone, call 'dns_zone_setnsec3param'
to queue an nsec3param event. This will ensure that any previous
chains will be removed and a chain according to the dnssec-policy is
created.

Add tests for dnssec-policy zones that uses the new 'nsec3param'
option, as well as changing to new values, changing to NSEC, and
changing from NSEC.
2020-11-26 10:43:27 +01:00
Matthijs Mekking
84a4273074 Move generate_salt function to lib/dns/nsec3
We will be using this function also on reconfig, so it should have
a wider availability than just bin/named/server.
2020-11-26 10:43:27 +01:00
Diego Fronza
d4142d2bed Output 'stale-refresh-time' value on rndc serve-stale status 2020-11-11 12:53:24 -03:00
Diego Fronza
581e2a8f28 Check 'stale-refresh-time' when sharing cache between views
This commit ensures that, along with previous restrictions, a cache is
shareable between views only if their 'stale-refresh-time' value are
equal.
2020-11-11 12:53:24 -03:00
Diego Fronza
4827ad0ec4 Add stale-refresh-time option
Before this update, BIND would attempt to do a full recursive resolution
process for each query received if the requested rrset had its ttl
expired. If the resolution fails for any reason, only then BIND would
check for stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' is on).

The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed in the next seconds.

A better approach to improve performance in those cases, is to mark the
moment in which a resolution failed, and if new queries arrive for that
same rrset, try to respond directly from the stale cache, and do that
for a window of time configured via 'stale-refresh-time'.

Only when this interval expires we then try to do a normal refresh of
the rrset.

The logic behind this commit is as following:

- In query.c / query_gotanswer(), if the test of 'result' variable falls
  to the default case, an error is assumed to have happened, and a call
  to 'query_usestale()' is made to check if serving of stale rrset is
  enabled in configuration.

- If serving of stale answers is enabled, a flag will be turned on in
  the query context to look for stale records:
  query.c:6839
  qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;

- A call to query_lookup() will be made again, inside it a call to
  'dns_db_findext()' is made, which in turn will invoke rbdb.c /
  cache_find().

- In rbtdb.c / cache_find() the important bits of this change is the
  call to 'check_stale_header()', which is a function that yields true
  if we should skip the stale entry, or false if we should consider it.

- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
  is set, if that is the case we know that this new search for stale
  records was made due to a failure in a normal resolution, so we keep
  track of the time in which the failured occured in rbtdb.c:4559:
  header->last_refresh_fail_ts = search->now;

- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
  know this is a normal lookup, if the record is stale and the query
  time is between last failure time + stale-refresh-time window, then
  we return false so cache_find() knows it can consider this stale
  rrset entry to return as a response.

The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh

Those were added so rbtdb can be aware of the value set in configuration
option, since in that level we have no access to the view object.
2020-11-11 12:53:23 -03:00
Ondřej Surý
fa424225af netmgr: Add additional safeguards to netmgr/tls.c
This commit adds couple of additional safeguards against running
sends/reads on inactive sockets.  The changes was modeled after the
changes we made to netmgr/tcpdns.c
2020-11-10 14:17:20 +01:00
Witold Kręcicki
d2a2804069 DoT test
Preliminary test for DNSoverTLS - add the dot-port template to system
tests, test a simple query to an authoritative.
2020-11-10 14:17:18 +01:00
Witold Kręcicki
e94afa5bc0 Add 'ephemeral' keyword to 'tls' option in listen-on directive.
listen-on tls ephemeral will cause named to create an ephemeral
TLS self-signed certificate and key, stored only in memory.
2020-11-10 14:17:14 +01:00
Witold Kręcicki
38b78f59a0 Add DoT support to bind
Parse the configuration of tls objects into SSL_CTX* objects.  Listen on
DoT if 'tls' option is setup in listen-on directive.  Use DoT/DoH ports
for DoT/DoH.
2020-11-10 14:16:55 +01:00
Evan Hunt
e011521ef1 address some possible shutdown races in xfrin
there were two failures during observed in testing, both occurring
when 'rndc halt' was run rather than 'rndc stop' - the latter dumps
zone contents to disk and presumably introduced enough delay to
prevent the races:

- a failure when the zone was shut down and called dns_xfrin_detach()
  before the xfrin had finished connecting; the connect timeout
  terminated without detaching its handle
- a failure when the tcpdns socket timer fired after the outerhandle
  had already been cleared.

this commit incidentally addresses a failure observed in mutexatomic
due to a variable having been initialized incorrectly.
2020-11-09 12:33:37 -08:00
Evan Hunt
49d53a4aa9 use netmgr for xfrin
Use isc_nm_tcpdnsconnect() in xfrin.c for zone transfers.
2020-11-09 13:45:43 +01:00
Michal Nowak
1f6f7ccad6
Drop unused portlist code 2020-10-22 13:11:16 +02:00
Matthijs Mekking
70d1ec432f Use explicit result codes for 'rndc dnssec' cmd
It is better to add new result codes than to overload existing codes.
2020-10-05 10:53:46 +02:00
Matthijs Mekking
e826facadb Add rndc dnssec -rollover command
This command is similar in arguments as -checkds so refactor the
'named_server_dnssec' function accordingly.  The only difference
are that:

- It does not take a "publish" or "withdrawn" argument.
- It requires the key id to be set (add a check to make sure).

Add tests that will trigger rollover immediately and one that
schedules a test in the future.
2020-10-05 10:53:45 +02:00
Mark Andrews
b00ba7ac94 make (named_server_t).reload_status atomic
WARNING: ThreadSanitizer: data race
    Write of size 4 at 0x000000000001 by thread T1:
    #0 view_loaded bin/named/server.c:9678:25
    #1 call_loaddone lib/dns/zt.c:308:3
    #2 doneloading lib/dns/zt.c:582:3
    #3 zone_asyncload lib/dns/zone.c:2322:3
    #4 dispatch lib/isc/task.c:1152:7
    #5 run lib/isc/task.c:1344:2

    Previous read of size 4 at 0x000000000001 by thread T2:
    #0 named_server_status bin/named/server.c:11903:14
    #1 named_control_docommand bin/named/control.c:272:12
    #2 control_command bin/named/controlconf.c:390:17
    #3 dispatch lib/isc/task.c:1152:7
    #4 run lib/isc/task.c:1344:2

    Location is heap block of size 409 at 0x000000000011 allocated by main thread:
    #0 malloc <null>
    #1 default_memalloc lib/isc/mem.c:713:8
    #2 mem_get lib/isc/mem.c:622:8
    #3 mem_allocateunlocked lib/isc/mem.c:1268:8
    #4 isc___mem_allocate lib/isc/mem.c:1288:7
    #5 isc__mem_allocate lib/isc/mem.c:2453:10
    #6 isc___mem_get lib/isc/mem.c:1037:11
    #7 isc__mem_get lib/isc/mem.c:2432:10
    #8 named_server_create bin/named/server.c:9978:27
    #9 setup bin/named/main.c:1256:2
    #10 main bin/named/main.c:1523:2

    Thread T1 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    Thread T2 (running) created by main thread at:
    #0 pthread_create <null>
    #1 isc_thread_create lib/isc/pthreads/thread.c:73:8
    #2 isc_taskmgr_create lib/isc/task.c:1434:3
    #3 create_managers bin/named/main.c:915:11
    #4 setup bin/named/main.c:1223:11
    #5 main bin/named/main.c:1523:2

    SUMMARY: ThreadSanitizer: data race bin/named/server.c:9678:25 in view_loaded
2020-09-30 14:19:09 +00:00
Matthijs Mekking
8beda7d2ea Add -expired flag to rndc dumpdb command
This flag is the same as -cache, but will use a different style format
that will also print expired entries (awaiting cleanup) from the cache.
2020-09-23 16:08:29 +02:00
Mark Andrews
3c4b68af7c Break lock order loop by sending TAT in an event
The dotat() function has been changed to send the TAT
query asynchronously, so there's no lock order loop
because we initialize the data first and then we schedule
the TAT send to happen asynchronously.

This breaks following lock-order loops:

zone->lock (dns_zone_setviewcommit) while holding view->lock
(dns_view_setviewcommit)

keytable->lock (dns_keytable_find) while holding zone->lock
(zone_asyncload)

view->lock (dns_view_findzonecut) while holding keytable->lock
(dns_keytable_forall)
2020-09-22 12:33:58 +00:00
Michał Kępień
5ae33351f2 Deprecate the "glue-cache" option
No issues with the glue cache feature have been reported since its
introduction in BIND 9.12.  As the rationale for introducing the
"glue-cache" option was to have a safety switch readily available in
case the glue cache turns out to cause problems, it is time to deprecate
the option.  Glue cache will be permanently enabled in a future release,
at which point the "glue-cache" option will be made obsolete.
2020-09-16 11:18:07 +02:00
Evan Hunt
dcee985b7f update all copyright headers to eliminate the typo 2020-09-14 16:20:40 -07:00
Evan Hunt
57b4dde974 change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.

A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:

        - reqhandle:    held while waiting for a request callback (query,
                        notify, update)
        - sendhandle:   held while waiting for a send callback
        - fetchhandle:  held while waiting for a recursive fetch to
                        complete
        - updatehandle: held while waiting for an update-forwarding
                        task to complete

control channel connection objects now contain:

        - readhandle: held while waiting for a read callback
        - sendhandle: held while waiting for a send callback
        - cmdhandle:  held while an rndc command is running

httpd connections contain:

        - readhandle: held while waiting for a read callback
        - sendhandle: held while waiting for a send callback
2020-09-11 12:17:57 -07:00
Mark Andrews
9b445f33e2 Defer read of zl->server and zl->reconfig until
the reference counter has gone to zero and there is
no longer a possibility of changes in other threads.
2020-09-09 13:58:31 +10:00
Michał Kępień
9ac1f6a9bc Add "-T maxcachesize=..." command line option
An implicit default of "max-cache-size 90%;" may cause memory use issues
on hosts which run numerous named instances in parallel (e.g. GitLab CI
runners) due to the cache RBT hash table now being pre-allocated [1] at
startup.  Add a new command line option, "-T maxcachesize=...", to allow
the default value of "max-cache-size" to be overridden at runtime.  When
this new option is in effect, it overrides any other "max-cache-size"
setting in the configuration, either implicit or explicit.  This
approach was chosen because it is arguably the simplest one to
implement.

The following alternative approaches to solving this problem were
considered and ultimately rejected (after it was decided they were not
worth the extra code complexity):

  - adding the same command line option, but making explicit
    configuration statements have priority over it,

  - adding a build-time option that allows the implicit default of
    "max-cache-size 90%;" to be overridden.

[1] see commit e24bc324b4
2020-08-31 13:15:33 +02:00
Matthijs Mekking
46fcd927e7 rndc dnssec -checkds set algorithm
In the rare case that you have multiple keys acting as KSK and that
have the same keytag, you can now set the algorithm when calling
'-checkds'.
2020-08-07 11:26:09 +02:00
Matthijs Mekking
04d8fc0143 Implement 'rndc dnssec -checkds'
Add a new 'rndc' command 'dnssec -checkds' that allows the user to
signal named that a new DS record has been seen published in the
parent, or that an existing DS record has been withdrawn from the
parent.

Upon the 'checkds' request, 'named' will write out the new state for
the key, updating the 'DSPublish' or 'DSRemoved' timing metadata.

This replaces the "parent-registration-delay" configuration option,
this was unreliable because it was purely time based (if the user
did not actually submit the new DS to the parent for example, this
could result in an invalid DNSSEC state).

Because we cannot rely on the parent registration delay for state
transition, we need to replace it with a different guard. Instead,
if a key wants its DS state to be moved to RUMOURED, the "DSPublish"
time must be set and must not be in the future. If a key wants its
DS state to be moved to UNRETENTIVE, the "DSRemoved" time must be set
and must not be in the future.

By default, with '-checkds' you set the time that the DS has been
published or withdrawn to now, but you can set a different time with
'-when'. If there is only one KSK for the zone, that key has its
DS state moved to RUMOURED. If there are multiple keys for the zone,
specify the right key with '-key'.
2020-08-07 11:26:09 +02:00
Ondřej Surý
dd62275152 Add CHANGES and release notes for GL #1712 and GL #1829 2020-08-04 10:51:09 +02:00