bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-02-27 03:51:16 -05:00

Author	SHA1	Message	Date
Ondřej Surý	c8eddf4f33	Refactor zone dumping code to use netmgr async threadpools Previously, dumping the zones to files was quantized so that it wouldn't slow down network IO processing. With the introduction of network manager asynchronous threadpools, we can move the IO-intensive work to that API, and we no longer have to quantize the work because the file IO won't block anything except other zone dumping processes. (cherry picked from commit `8a5c62de83`)	2021-05-31 16:57:19 +02:00
Matthijs Mekking	96be6473fc	Lock kasp when looking for zone keys We should also lock kasp when reading key files, because at the same time the zone in another view may be updating the key file. (cherry picked from commit `252a1ae0a1`)	2021-05-20 09:52:53 +02:00
Evan Hunt	ef1d909fa9	backport of netmgr/taskmgr to 9.16 this rolls up numerous changes that have been applied to the main branch, including moving isc_task operations into the netmgr event loops, and other general stabilization.	2021-05-14 12:52:48 +02:00
Matthijs Mekking	ff4930951c	rndc dnssec -status should include offline keys The rndc command 'dnssec -status' only considered keys from 'dns_dnssec_findmatchingkeys' which only includes keys with accessible private keys. Change it so that offline keys are also listed in the status. (cherry picked from commit `b3a5859a9b`)	2021-05-05 12:49:38 +02:00
Matthijs Mekking	375112a623	Add built-in dnssec-policy "insecure" Add a new built-in policy "insecure", to be used to gracefully unsign a zone. Previously you could just remove the 'dnssec-policy' configuration from your zone statement, or remove it. The built-in policy "none" (or not configured) now actually means no DNSSEC maintenance for the corresponding zone. So if you immediately reconfigure your zone from whatever policy to "none", your zone will temporarily be seen as bogus by validating resolvers. This means we can remove the functions 'dns_zone_use_kasp()' and 'dns_zone_secure_to_insecure()' again. We also no longer have to check for the existence of key state files to figure out if a zone is transitioning to insecure. (cherry picked from commit `2710d9a11d`)	2021-04-30 13:58:22 +02:00
Mark Andrews	9324d2d295	Reduce nsec3 max iterations to 150 (cherry picked from commit `29126500d2`)	2021-04-29 17:44:46 +10:00
Diego Fronza	942b83d392	Fix deadlock between rndc addzone/delzone/modzone The steps that were leading to the deadlock are described below: 1. `do_addzone` calls `isc_task_beginexclusive`. 2. `isc_task_beginexclusive` waits for (N_WORKERS - 1) halted tasks; this blocks until those (no. workers - 1) workers halt: `... isc_task_beginexclusive(isc_task_t *task0) { ... while (manager->halted + 1 < manager->workers) { wake_all_queues(manager); WAIT(&manager->halt_cond, &manager->halt_lock); }` 3. It is possible that in `task.c / dispatch()` a worker is running a task event; if that event blocks, it will not allow this worker to halt. 4. `do_addzone` acquires `LOCK(&view->new_zone_lock);`. 5. `rmzone` event is called from some worker's `dispatch()`; `rmzone` blocks waiting for the same lock. 6. `do_addzone` calls `isc_task_beginexclusive`. 7. Deadlock triggered, since: - `rmzone` is waiting for the lock. - `isc_task_beginexclusive` is waiting for (no. workers - 1) to be halted. - since the `rmzone` event is blocked, it won't allow the worker to halt. To fix this, the do_addzone code now calls isc_task_beginexclusive before the lock is acquired; locking is postponed to the nearest required place, and the same goes for isc_task_beginexclusive. The same could happen with rndc modzone, so that was addressed as well.	2021-04-26 11:35:18 -03:00
Matthijs Mekking	d12b40f6fb	Rekey immediately after rndc checkds/rollover Call 'dns_zone_rekey' after a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command is received, because such a command may influence the next key event. Updating the keys immediately avoids unnecessary rollover delays. The kasp system test no longer needs to call 'rndc loadkeys' after a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command. (cherry picked from commit `82f72ae249`)	2021-03-22 15:35:22 +01:00
Ondřej Surý	db49ffca20	Change the isc_nm_(get|set)timeouts() to work with milliseconds RFC 7828 specifies the keepalive interval as a 16-bit value in units of 100 milliseconds, and the tcp-*-timeouts configuration options follow suit. Units of 100 milliseconds are very unintuitive, and while we can't change the configuration and presentation format, we should not follow this weird unit in the API. This commit changes the isc_nm_(get|set)timeouts() functions to work with milliseconds; the values are now converted to milliseconds before being passed to the function, not just internally.	2021-03-18 15:16:13 +01:00
Ondřej Surý	4bbe3e75de	Stop including dnstap headers from <dns/dnstap.h> The <fstrm.h> and <protobuf-c/protobuf-c.h> headers are only directly included where used and we stopped exposing those headers from libdns headers.	2021-02-16 12:08:21 +11:00
Mark Andrews	bf5aac225b	Stop including <lmdb.h> from <dns/lmdb.h> The lmdb.h header doesn't have to be included from the dns/lmdb.h header as it can be separately included where used. This stops exposing the inclusion of lmdb.h from the libdns headers.	2021-02-16 12:08:21 +11:00
Diego Fronza	d89a8bf696	Fix dangling references to outdated views after reconfig This commit fixes a leak that occurred every time an inline-signed zone was added to the configuration, followed by an rndc reconfig. During the reconfig process, the secure version of every inline-signed zone was "moved" to a new view and it "took the raw version along", but prev_view for the raw version was only detached from once the secure version was freed (at shutdown), and only then was the old view released as well. This caused dangling references to be kept for the previous view, thus keeping all resources used by that view in memory.	2021-02-15 11:52:50 -03:00
Diego Fronza	0aebad96b5	Added option for disabling stale-answer-client-timeout This commit allows "disabled" or "off" to be specified in the stale-answer-client-timeout statement. The logic to support this behavior will be added in subsequent commits. This commit also enforces an upper bound on stale-answer-client-timeout, equal to one second less than 'resolver-query-timeout'. (cherry picked from commit `0ad6f594f6`)	2021-01-29 10:38:58 +01:00
Diego Fronza	3478794a5d	Add stale-answer-client-timeout option The general logic behind this new feature works as follows: When a client query arrives, the basic path (query.c / ns_query_recurse) was to create a fetch and wait for completion in fetch_callback. With the introduction of stale-answer-client-timeout, a new event of type DNS_EVENT_TRYSTALE may invoke fetch_callback whenever stale answers are enabled and the fetch took longer than stale-answer-client-timeout to complete. When an event of type DNS_EVENT_TRYSTALE triggers fetch_callback, we must ensure that the following happens: 1. Set up a new query context with the sole purpose of looking up stale-only RRset data; for that purpose a new flag, 'DNS_DBFIND_STALEONLY', was added for use in database lookups. - If a stale RRset is found, mark the original client query as answered (with a new query attribute named NS_QUERYATTR_ANSWERED), so that when the fetch completion event is received later, we avoid answering the client twice. - If a stale RRset is not found, clean up and wait for the normal fetch completion event. 2. In ns_query_done, we must change this part: /* * If we're recursing then just return; the query will * resume when recursion ends. */ if (RECURSING(qctx->client)) { return (qctx->result); } To this: if (RECURSING(qctx->client) && !QUERY_STALEONLY(qctx->client)) { return (qctx->result); } Otherwise we would not proceed to answer the client if a stale answer was found when looking up stale-only data. When an event of type DNS_EVENT_FETCHDONE triggers fetch_callback, we proceed as before, resuming the query, updating stats, etc., but a few exceptions had to be added, the two most important of which are: 1. Before answering the client (ns_client_send), check whether the query was already answered. 2. Before detaching a client, e.g. isc_nmhandle_detach(&client->reqhandle), ensure that this is the fetch completion event, and not the one triggered by stale-answer-client-timeout, so a correct call would be: if (!QUERY_STALEONLY(client)) { isc_nmhandle_detach(&client->reqhandle); } Other than these notes, comments were added in the code in an attempt to make these updates easier to follow. (cherry picked from commit `171a5b7542`)	2021-01-29 10:38:32 +01:00
Mark Andrews	85318b521d	Pass a cfg_aclconfctx_t structure to cfg_acl_fromconfig in named_zone_inlinesigning. A NULL pointer does not work. (cherry picked from commit `2b3fcd7156`)	2021-01-28 13:43:47 +11:00
Mark Andrews	b416d8fcdf	Improve the diagnostic 'rndc retransfer' error message (cherry picked from commit `dd3520ae41`)	2021-01-28 09:44:26 +11:00
Evan Hunt	85530bdd23	use primary/secondary terminology in 'rndc zonestatus' (cherry picked from commit `68c384e118`)	2021-01-12 15:21:14 +01:00
Mark Andrews	72fa03a1e9	Use atomic_init when initialising server->reload_status	2021-01-04 05:16:16 +00:00
Matthijs Mekking	cf0439cd5f	Treat dnssec-policy "none" as a builtin zone Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig' api so that the 'name' will determine what policy needs to be configured. When transitioning a zone from secure to insecure, there will be cases when a zone with no DNSSEC policy (dnssec-policy none) should be using KASP. When there are key state files available, this is an indication that the zone once was DNSSEC signed but is reconfigured to become insecure. If we would not run the keymgr, named would abruptly remove the DNSSEC records from the zone, making the zone bogus. Therefore, change the code such that a zone will use kasp if there is a valid dnssec-policy configured, or if there are state files available. (cherry picked from commit `cf420b2af0`)	2020-12-23 11:56:33 +01:00
Ondřej Surý	7b9c8b9781	Refactor netmgr and add more unit tests This is part of the work that intends to make the netmgr stable, testable, maintainable and tested. It contains numerous changes to the netmgr code and, unfortunately, it was not possible to split this into smaller chunks as the work here needs to be committed as a complete unit. NOTE: There's quite a lot of duplicated code between udp.c, tcp.c and tcpdns.c and it should be a subject of refactoring in the future. The changes that are included in this commit are listed here (extensively, but not exclusively): * The netmgr_test unit test was split into individual tests (udp_test, tcp_test, tcpdns_test and the newly added tcp_quota_test) * The udp_test and tcp_test have been extended to allow programmatic failures from the libuv API. Unfortunately, we can't use cmocka mock() and will_return(), so we emulate the behaviour with #define and by including the netmgr/{udp,tcp}.c source file directly. * The netievents that we put on the nm queue have a variable number of members; of these, the isc_nmsocket_t and isc_nmhandle_t always need to be attached before enqueueing the netievent_<foo> and detached after we have called the isc_nm_async_<foo>, to ensure that the socket (handle) doesn't disappear between scheduling the event and actually executing the event. * Cancelling the in-flight TCP connection using libuv requires calling uv_close() on the original uv_tcp_t handle, which just breaks too many assumptions we have in the netmgr code. Instead of using uv_timer for TCP connection timeouts, we use a platform-specific socket option. * Fix the synchronization between {nm,async}_{listentcp,tcpconnect} When isc_nm_listentcp() or isc_nm_tcpconnect() was called, it waited, using a condition variable and mutex, for the socket either to end up with an error (that path was fine) or to be listening or connected. Several things could happen: 0. everything is ok 1. the waiting thread would miss the SIGNAL() - because the enqueued event would be processed faster than we could start WAIT()ing. In case the operation would end up with an error, it would be ok, as the error variable would be unchanged. 2. the waiting thread would miss sock->{connected,listening} = `true`, as it would be set to `false` in the tcp_{listen,connect}close_cb() because the connection would be so short-lived that the socket would be closed before we could even start WAIT()ing * The tcpdns has been converted to using libuv directly. Previously, the tcpdns protocol used the tcp protocol from netmgr; this proved to be very complicated to understand, fix and make changes to. The new tcpdns protocol is modeled in a similar way to the tcp netmgr protocol. Closes: #2194, #2283, #2318, #2266, #2034, #1920 * The tcp and tcpdns are now not using isc_uv_import/isc_uv_export to pass accepted TCP sockets between netthreads, but instead (similar to UDP) use a per-netthread uv_loop listener. This greatly reduces the complexity as the socket is always run in the associated nm and uv loops, and we are also not touching the libuv internals. There's an unfortunate side effect though: the new code requires support for load-balanced sockets from the operating system for both UDP and TCP (see #2137). If the operating system doesn't support load-balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB on FreeBSD 12+), the number of netthreads is limited to 1. * The netmgr now has two debugging #ifdefs: 1. The already existing NETMGR_TRACE prints any dangling nmsockets and nmhandles before triggering an assertion failure. This option would reduce performance when enabled, but in theory, it could be enabled on low-performance systems. 2. A new NETMGR_TRACE_VERBOSE option has been added that enables extensive netmgr logging that allows the software engineer to precisely track any attach/detach operations on the nmsockets and nmhandles. This is not suitable for any kind of production machine, only for debugging. * The tlsdns netmgr protocol has been split from the tcpdns and it still uses the old method of stacking the netmgr boxes on top of each other. We will have to refactor the tlsdns netmgr protocol to use the same approach - build the stack using only libuv and openssl. * Limit but not assert the tcp buffer size in tcp_alloc_cb Closes: #2061 (cherry picked from commit `634bdfb16d`)	2020-12-09 10:46:16 +01:00
Ondřej Surý	a35a666a7c	Reformat sources using clang-format-11 (cherry picked from commit `7ba18870dc`)	2020-12-08 19:34:05 +01:00
Matthijs Mekking	6db879160f	Detect NSEC3 salt collisions When generating a new salt, compare it with the previous NSEC3 parameters to ensure the new parameters are different from the previous ones. This moves the salt generation call from 'bin/named/*.s' to 'lib/dns/zone.c'. When setting new NSEC3 parameters, you can set a new function parameter 'resalt' to enforce a new salt to be generated. A new salt will also be generated if 'salt' is set to NULL. Logging salt with zone context can now be done with 'dnssec_log', removing the need for 'dns_nsec3_log_salt'. (cherry picked from commit `6b5d7357df`)	2020-11-26 14:15:05 +00:00
Matthijs Mekking	734865e110	Add zone context to "generated salt" logs (cherry picked from commit `3b4c764b43`)	2020-11-26 14:15:05 +00:00
Matthijs Mekking	93f9d3b812	Move logging of salt in separate function There may be a desire to log the salt without losing the context of log module, level, and category. (cherry picked from commit `7878f300ff`)	2020-11-26 14:15:04 +00:00
Matthijs Mekking	b6cf88333a	Don't use 'rndc signing' with kasp The 'rndc signing' command allows you to manipulate the private records that are used to store signing state. Don't use these with 'dnssec-policy' as such manipulations may violate the policy (if you want to change the NSEC3 parameters, change the policy and reconfig). (cherry picked from commit `eae9a6d297`)	2020-11-26 14:15:02 +00:00
Matthijs Mekking	d13786d583	Fix a reconfig bug wrt inline-signing When doing 'rndc reconfig', named may complain about a zone not being reusable because it has a raw version of the zone, and the new configuration has not set 'inline-signing'. However, 'inline-signing' may be implicitly true if a 'dnssec-policy' is used for the zone, and the zone is not dynamic. Improve the check in 'named_zone_reusable'. Create a new function for checking 'inline-signing' configuration that matches existing code in 'bin/named/server.c'. (cherry picked from commit `ba8128ea00`)	2020-11-26 14:15:02 +00:00
Matthijs Mekking	008e84e965	Support for NSEC3 in dnssec-policy Implement support for NSEC3 in dnssec-policy. Store the configuration in kasp objects. When configuring a zone, call 'dns_zone_setnsec3param' to queue an nsec3param event. This will ensure that any previous chains will be removed and a chain according to the dnssec-policy is created. Add tests for dnssec-policy zones that uses the new 'nsec3param' option, as well as changing to new values, changing to NSEC, and changing from NSEC. (cherry picked from commit `114af58ee2`)	2020-11-26 14:15:02 +00:00
Matthijs Mekking	9b9ac92fd0	Move generate_salt function to lib/dns/nsec3 We will be using this function also on reconfig, so it should have a wider availability than just bin/named/server. (cherry picked from commit `84a4273074`)	2020-11-26 14:14:56 +00:00
Diego Fronza	4905c2e24a	Output 'stale-refresh-time' value on rndc serve-stale status	2020-11-11 16:06:30 -03:00
Diego Fronza	73c199dec7	Check 'stale-refresh-time' when sharing cache between views This commit ensures that, along with previous restrictions, a cache is shareable between views only if their 'stale-refresh-time' value are equal.	2020-11-11 16:06:23 -03:00
Diego Fronza	8cc5abff23	Add stale-refresh-time option Before this update, BIND would attempt to do a full recursive resolution process for each query received if the requested rrset had its ttl expired. Only if the resolution failed for any reason would BIND then check for a stale rrset in the cache (if 'stale-cache-enable' and 'stale-answer-enable' are on). The problem with this approach is that if an authoritative server is unreachable or is failing to respond, it is very unlikely that the problem will be fixed in the next few seconds. A better approach to improve performance in those cases is to mark the moment at which a resolution failed, and if new queries arrive for that same rrset, try to respond directly from the stale cache, and do that for a window of time configured via 'stale-refresh-time'. Only when this interval expires do we try a normal refresh of the rrset. The logic behind this commit is as follows: - In query.c / query_gotanswer(), if the test of the 'result' variable falls to the default case, an error is assumed to have happened, and a call to 'query_usestale()' is made to check if serving of stale rrsets is enabled in configuration. - If serving of stale answers is enabled, a flag will be turned on in the query context to look for stale records: query.c:6839 qctx->client->query.dboptions |= DNS_DBFIND_STALEOK; - A call to query_lookup() will be made again; inside it a call to 'dns_db_findext()' is made, which in turn will invoke rbtdb.c / cache_find(). - In rbtdb.c / cache_find() the important bit of this change is the call to 'check_stale_header()', which is a function that yields true if we should skip the stale entry, or false if we should consider it. - In check_stale_header() we now check if the DNS_DBFIND_STALEOK option is set; if that is the case we know that this new search for stale records was made due to a failure in a normal resolution, so we keep track of the time at which the failure occurred in rbtdb.c:4559: header->last_refresh_fail_ts = search->now; - In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we know this is a normal lookup; if the record is stale and the query time falls within the window of last failure time + stale-refresh-time, then we return false so cache_find() knows it can consider this stale rrset entry to return as a response. The last additions are two new methods to the database interface: - setservestale_refresh - getservestale_refresh Those were added so rbtdb can be aware of the value set in the configuration option, since at that level we have no access to the view object.	2020-11-11 15:59:56 -03:00
Matthijs Mekking	63652ca58f	Use explicit result codes for 'rndc dnssec' cmd It is better to add new result codes than to overload existing codes. (cherry picked from commit `70d1ec432f`)	2020-10-05 11:20:35 +02:00
Matthijs Mekking	4d0dc466b5	Add rndc dnssec -rollover command This command takes similar arguments to -checkds, so refactor the 'named_server_dnssec' function accordingly. The only differences are that: - It does not take a "publish" or "withdrawn" argument. - It requires the key id to be set (add a check to make sure). Add tests that will trigger rollover immediately and one that schedules a test in the future. (cherry picked from commit `e826facadb`)	2020-10-05 11:20:35 +02:00
Evan Hunt	ba2e9dfb99	change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach Attaching and detaching handle pointers will make it easier to determine where and why reference counting errors have occurred. A handle needs to be referenced more than once when multiple asynchronous operations are in flight, so callers must now maintain multiple handle pointers for each pending operation. For example, ns_client objects now contain: - reqhandle: held while waiting for a request callback (query, notify, update) - sendhandle: held while waiting for a send callback - fetchhandle: held while waiting for a recursive fetch to complete - updatehandle: held while waiting for an update-forwarding task to complete (cherry picked from commit `57b4dde974`)	2020-10-01 18:09:35 +02:00
Evan Hunt	1263201732	don't use exclusive mode for rndc commands that don't need it "showzone" and "tsig-list" both used exclusive mode unnecessarily; changing this will simplify future refactoring a bit. (cherry picked from commit `002c328437`)	2020-10-01 16:44:43 +02:00
Mark Andrews	c99cf25ac9	make (named_server_t).reload_status atomic WARNING: ThreadSanitizer: data race Write of size 4 at 0x000000000001 by thread T1: #0 view_loaded bin/named/server.c:9678:25 #1 call_loaddone lib/dns/zt.c:308:3 #2 doneloading lib/dns/zt.c:582:3 #3 zone_asyncload lib/dns/zone.c:2322:3 #4 dispatch lib/isc/task.c:1152:7 #5 run lib/isc/task.c:1344:2 Previous read of size 4 at 0x000000000001 by thread T2: #0 named_server_status bin/named/server.c:11903:14 #1 named_control_docommand bin/named/control.c:272:12 #2 control_command bin/named/controlconf.c:390:17 #3 dispatch lib/isc/task.c:1152:7 #4 run lib/isc/task.c:1344:2 Location is heap block of size 409 at 0x000000000011 allocated by main thread: #0 malloc <null> #1 default_memalloc lib/isc/mem.c:713:8 #2 mem_get lib/isc/mem.c:622:8 #3 mem_allocateunlocked lib/isc/mem.c:1268:8 #4 isc___mem_allocate lib/isc/mem.c:1288:7 #5 isc__mem_allocate lib/isc/mem.c:2453:10 #6 isc___mem_get lib/isc/mem.c:1037:11 #7 isc__mem_get lib/isc/mem.c:2432:10 #8 named_server_create bin/named/server.c:9978:27 #9 setup bin/named/main.c:1256:2 #10 main bin/named/main.c:1523:2 Thread T1 (running) created by main thread at: #0 pthread_create <null> #1 isc_thread_create lib/isc/pthreads/thread.c:73:8 #2 isc_taskmgr_create lib/isc/task.c:1434:3 #3 create_managers bin/named/main.c:915:11 #4 setup bin/named/main.c:1223:11 #5 main bin/named/main.c:1523:2 Thread T2 (running) created by main thread at: #0 pthread_create <null> #1 isc_thread_create lib/isc/pthreads/thread.c:73:8 #2 isc_taskmgr_create lib/isc/task.c:1434:3 #3 create_managers bin/named/main.c:915:11 #4 setup bin/named/main.c:1223:11 #5 main bin/named/main.c:1523:2 SUMMARY: ThreadSanitizer: data race bin/named/server.c:9678:25 in view_loaded (cherry picked from commit `b00ba7ac94`)	2020-10-01 00:47:53 +10:00
Matthijs Mekking	d77283ff63	Add -expired flag to rndc dumpdb command This flag is the same as -cache, but will use a different style format that will also print expired entries (awaiting cleanup) from the cache. (cherry picked from commit `8beda7d2ea`)	2020-09-25 08:20:02 +02:00
Mark Andrews	f2c0aa1dfe	Break lock order loop by sending TAT in an event The dotat() function has been changed to send the TAT query asynchronously, so there's no lock order loop because we initialize the data first and then we schedule the TAT send to happen asynchronously. This breaks following lock-order loops: zone->lock (dns_zone_setviewcommit) while holding view->lock (dns_view_setviewcommit) keytable->lock (dns_keytable_find) while holding zone->lock (zone_asyncload) view->lock (dns_view_findzonecut) while holding keytable->lock (dns_keytable_forall) (cherry picked from commit `3c4b68af7c`)	2020-09-22 23:04:44 +10:00
Evan Hunt	df698d73f4	update all copyright headers to eliminate the typo	2020-09-14 16:50:58 -07:00
Mark Andrews	4881207780	Defer read of zl->server and zl->reconfig until the reference counter has gone to zero and there is no longer a possibility of changes in other threads. (cherry picked from commit `9b445f33e2`)	2020-09-09 16:22:38 +10:00
Michał Kępień	3f25b8e608	Add "-T maxcachesize=..." command line option An implicit default of "max-cache-size 90%;" may cause memory use issues on hosts which run numerous named instances in parallel (e.g. GitLab CI runners) due to the cache RBT hash table now being pre-allocated [1] at startup. Add a new command line option, "-T maxcachesize=...", to allow the default value of "max-cache-size" to be overridden at runtime. When this new option is in effect, it overrides any other "max-cache-size" setting in the configuration, either implicit or explicit. This approach was chosen because it is arguably the simplest one to implement. The following alternative approaches to solving this problem were considered and ultimately rejected (after it was decided they were not worth the extra code complexity): - adding the same command line option, but making explicit configuration statements have priority over it, - adding a build-time option that allows the implicit default of "max-cache-size 90%;" to be overridden. [1] see commit `aa72c31422` (cherry picked from commit `9ac1f6a9bc`)	2020-08-31 23:41:24 +02:00
Matthijs Mekking	624f1b9531	rndc dnssec -checkds set algorithm In the rare case that you have multiple keys acting as KSK and that have the same keytag, you can now set the algorithm when calling '-checkds'. (cherry picked from commit `46fcd927e7`)	2020-08-07 13:34:10 +02:00
Matthijs Mekking	81d0c63ecb	Implement 'rndc dnssec -checkds' Add a new 'rndc' command 'dnssec -checkds' that allows the user to signal named that a new DS record has been seen published in the parent, or that an existing DS record has been withdrawn from the parent. Upon the 'checkds' request, 'named' will write out the new state for the key, updating the 'DSPublish' or 'DSRemoved' timing metadata. This replaces the "parent-registration-delay" configuration option, which was unreliable because it was purely time based (if the user did not actually submit the new DS to the parent, for example, this could result in an invalid DNSSEC state). Because we cannot rely on the parent registration delay for state transition, we need to replace it with a different guard. Instead, if a key wants its DS state to be moved to RUMOURED, the "DSPublish" time must be set and must not be in the future. If a key wants its DS state to be moved to UNRETENTIVE, the "DSRemoved" time must be set and must not be in the future. By default, with '-checkds' you set the time that the DS has been published or withdrawn to now, but you can set a different time with '-when'. If there is only one KSK for the zone, that key has its DS state moved to RUMOURED. If there are multiple keys for the zone, specify the right key with '-key'. (cherry picked from commit `04d8fc0143`)	2020-08-07 13:30:19 +02:00
Ondřej Surý	f3a7ee87ef	Add CHANGES and release notes for GL #1712 and GL #1829 (cherry picked from commit `dd62275152`)	2020-08-05 09:09:16 +02:00
Ondřej Surý	b48e9ab201	Add stale-cache-enable option and disable serve-stale by default The current serve-stale implementation in BIND 9 stores all received records in the cache for a max-stale-ttl interval (default 12 hours). This allows DNS operators to turn on serve-stale answers in the event of a large authoritative DNS outage. The caching of the stale answers needs to be enabled before the outage happens or the feature would otherwise be useless. The negative consequence of the default setting is the inevitable cache bloat that happens for each and every DNS operator running named. In this MR, a new configuration option `stale-cache-enable` is introduced that allows operators to selectively enable or disable the serve-stale feature of BIND 9 based on their decision. The newly introduced option is disabled by default, i.e. serve-stale is disabled in the default configuration and has to be enabled if required. (cherry picked from commit `ce53db34d6`)	2020-08-05 09:09:16 +02:00
Mark Andrews	14fe6e77a7	Always check the return from isc_refcount_decrement. Created isc_refcount_decrement_expect macro to test conditionally the return value to ensure it is in expected range. Converted unchecked isc_refcount_decrement to use isc_refcount_decrement_expect. Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect. (cherry picked from commit `bde5c7632a`)	2020-07-31 12:54:47 +10:00
Petr Menšík	fade143531	Prevent crash on dst initialization failure The server might be created, but not yet fully initialized, when the fatal function is called. Check both server and task before attaching exclusive task. (cherry picked from commit `c5e7152cf0`)	2020-07-23 11:28:11 +10:00
Evan Hunt	fc73dbdc7d	make sure new_zone_lock is locked before unlocking it it was possible for the count_newzones() function to try to unlock view->new_zone_lock on return before locking it, which caused a crash on shutdown. (cherry picked from commit `ed37c63e2b`)	2020-07-13 23:53:14 +00:00
Mark Andrews	0265bd17d5	Fallback to built in trust-anchors, managed-keys, or trusted-keys if the bind.keys file cannot be parsed. (cherry picked from commit `d02a14c795`)	2020-07-13 15:13:50 +10:00
Michał Kępień	0bc4d6cc7a	Fix locking for LMDB 0.9.26 When "rndc reconfig" is run, named first configures a fresh set of views and then tears down the old views. Consider what happens for a single view with LMDB enabled; "envA" is the pointer to the LMDB environment used by the original/old version of the view, "envB" is the pointer to the same LMDB environment used by the new version of that view: 1. mdb_env_open(envA) is called when the view is first created. 2. "rndc reconfig" is called. 3. mdb_env_open(envB) is called for the new instance of the view. 4. mdb_env_close(envA) is called for the old instance of the view. This seems to have worked so far. However, an upstream change [1] in LMDB which will be part of its 0.9.26 release prevents the above sequence of calls from working as intended because the locktable mutexes will now get destroyed by the mdb_env_close() call in step 4 above, causing any subsequent mdb_txn_begin() calls to fail (because all of the above steps are happening within a single named process). Preventing the above scenario from happening would require either redesigning the way we use LMDB in BIND, which is not something we can easily backport, or redesigning the way BIND carries out its reconfiguration process, which would be an even more severe change. To work around the problem, set MDB_NOLOCK when calling mdb_env_open() to stop LMDB from controlling concurrent access to the database and do the necessary locking in named instead. Reuse the view->new_zone_lock mutex for this purpose to prevent the need for modifying struct dns_view (which would necessitate library API version bumps). Drop use of MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects where LMDB reader locktable slots are stored while MDB_NOLOCK prevents the reader locktable from being used altogether. [1] `2fd44e3251` (cherry picked from commit `53120279b5`)	2020-07-10 11:30:31 +02:00


1225 commits