bind9

Author	SHA1	Message	Date
Evan Hunt	220ada9422	reset taskmgr mode immediately after returning from zone load all privileged tasks are complete by the time we return from isc_task_endexclusive(), so it makes sense to reset the taskmgr mode to non-privileged right then.	2021-05-10 12:26:27 -07:00
Ondřej Surý	365c6a9851	ensure interlocked netmgr events run on worker[0] Network manager events that require interlock (pause, resume, listen) are now always executed in the same worker thread, mgr->workers[0], to prevent races. "stoplistening" events no longer require interlock.	2021-05-07 14:28:32 -07:00
Evan Hunt	5c08f97791	only run tasks as privileged if taskmgr is in privileged mode all zone loading tasks have the privileged flag, but we only want them to run as privileged tasks when the server is being initialized; if we privilege them the rest of the time, the server may hang for a long time after a reload/reconfig. so now we call isc_taskmgr_setmode() to turn privileged execution mode on or off in the task manager. isc_task_privileged() returns true if the task's privilege flag is set and the taskmgr is in privileged execution mode. this is used to determine in which netmgr event queue the task should be run.	2021-05-07 14:28:30 -07:00
Ondřej Surý	0b491913df	Don't clear dig lookup if it was already cleared This works around a couple of races where the current_lookup would already be detached while dig was shutting down but still processing pending reads.	2021-05-07 14:28:30 -07:00
Ondřej Surý	2836bc1854	Fix wrong query accounting in the connect function in dighost.c The start_udp() function didn't properly attach to the query and thus a callback with ISC_R_CANCELED would end with wrong accounting on the query object. Usually, this doesn't happen because underlying libuv API uv_udp_connect() is synchronous, but isc_nm_udpconnect() could return ISC_R_CANCELED in case it's called while the netmgr is shutting down.	2021-05-07 14:28:30 -07:00
Ondřej Surý	b5bf58b419	Destroy netmgr before destroying taskmgr With taskmgr running on top of netmgr, the ordering of task and netmgr shutdown was wrong, because previously isc_taskmgr_destroy() waited until all tasks were properly shut down and detached. This responsibility was moved to netmgr, so we now need to do the following: 1. shut down all the tasks - this schedules all shutdown events onto the netmgr queue 2. shut down the netmgr - this also makes sure all the tasks and events are properly executed 3. shut down the taskmgr - this now waits for all the tasks to finish running before returning 4. shut down the netmgr - this call waits for all the netmgr netievents to finish before returning This solves the race where the taskmgr object would be destroyed before all the tasks had finished running in the netmgr loops.	2021-05-07 14:28:30 -07:00
Ondřej Surý	a011d42211	Add new isc_managers API to simplify <*>mgr create/destroy Previously, netmgr, taskmgr, timermgr and socketmgr all had their own isc_<*>mgr_create() and isc_<*>mgr_destroy() functions. The new isc_managers_create() and isc_managers_destroy() fold all four into a single function and make sure the objects are created and destroyed in the correct order. Especially now, when taskmgr runs on top of netmgr, the correct order is important, and when the code was duplicated in many places it was easy to make a mistake. The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were made private, and a single call to isc_managers_create() and isc_managers_destroy() is required at program startup / shutdown.	2021-05-07 10:19:05 -07:00
Matthijs Mekking	66f2cd228d	Use isdigit instead of checking character range When looking for key files, we could use isdigit rather than checking if the character is within the range [0-9]. Use (unsigned char) cast to ensure the value is representable in the unsigned char type (as suggested by the isdigit manpage). Change " & 0xff" occurrences to the recommended (unsigned char) type cast.	2021-05-05 19:15:33 +02:00
Matthijs Mekking	511bc1b882	Check for filename clashes with dnssec-policy zones Just like with dynamic and/or inline-signing zones, check that no two zone configurations set the same filename. In these cases, the zone files are not read-only and named-checkconf should catch a configuration where multiple zone statements write to the same file. Add some bad configuration tests where KASP zones reference the same zone file. Update the good-kasp test to allow two zones to configure the same file name with dnssec-policy none.	2021-05-05 19:13:55 +02:00
Matthijs Mekking	2d1b3a9899	Check zonefile is untouched if dnssec-policy none Make sure no DNSSEC contents are added to the zonefile if dnssec-policy is set to "none" (and no .state files exist for the zone).	2021-05-05 19:13:55 +02:00
Mark Andrews	ae1ae07b03	Check journal compaction	2021-05-05 23:12:37 +10:00
Mark Andrews	4a8e33b9f0	Always perform a re-write when processing a version 1 journal version 1 journals may have a mix of type 1 and type 2 transaction headers so always use the recovery code.	2021-05-05 23:12:37 +10:00
Mark Andrews	71df4fb84c	Allow named-journalprint to compact journals at a given serial	2021-05-05 23:12:37 +10:00
Matthijs Mekking	4a8ad0a77f	Add kasp tests for offline keys Add a test for default.kasp that if we remove the private key file, no successor key is created for it. We need to update the kasp script to deal with a missing private key. If this is the case, skip checks for private key files. Add a test with a zone for which the private key of the ZSK is missing. Add a test with a zone for which the private key of the KSK is missing.	2021-05-05 11:14:02 +02:00
Matthijs Mekking	b3a5859a9b	rndc dnssec -status should include offline keys The rndc command 'dnssec -status' only considered keys from 'dns_dnssec_findmatchingkeys' which only includes keys with accessible private keys. Change it so that offline keys are also listed in the status.	2021-05-05 11:13:19 +02:00
Mark Andrews	dba13d280a	named-checkconf now detects redefinition of dnssec-policy 'insecure'	2021-05-05 16:23:19 +10:00
Matthijs Mekking	a548a450b3	checkconf tests for inline-signing at options/view	2021-05-04 23:35:59 +00:00
Mark Andrews	b3301da262	inline-signing should have been in zone_only_clauses	2021-05-04 23:35:59 +00:00
Matthijs Mekking	572f421df4	Fix intermittent kasp test failure The kasp system test performs a couple of checks for each zone to make sure the zone is signed correctly. To avoid test failures caused by timing issues, there is first a check to ensure the zone is done signing, 'wait_for_done_signing'. This function holds off the DNSSEC checks until a "zone_rekey done" log message is seen for a specific key. Unfortunately this is not sufficient to avoid test failures due to timing issues, because there is a small amount of time between this log message and the newly signed zone actually being served. Therefore, in 'check_apex', retry the DNSKEY query check for three seconds. After that, additional checks should pass without retries, because at that point we know for sure the zone has been resigned with the expected keys. Also reduce the number of redundant 'check_signatures'	2021-05-04 04:50:01 +00:00
Mark Andrews	205d1bb762	Remove spurious $ and \ in addzone example	2021-05-04 02:18:34 +00:00
Ondřej Surý	dfd56b84f5	Add support for generating backtraces on Windows This commit adds support for generating backtraces on Windows and refactors the isc_backtrace API to match the Linux/BSD API (without the isc_ prefix) * isc_backtrace_gettrace() was renamed to isc_backtrace(), the third argument was removed and the return type was changed to int * isc_backtrace_symbols() was added * isc_backtrace_symbols_fd() was added and used as appropriate	2021-05-03 20:31:52 +02:00
Ondřej Surý	c37ff5d188	Add nanosleep and usleep Windows shims This commit adds POSIX nanosleep() and usleep() shim implementation for Windows to help implementors use less #ifdef _WIN32 in the code.	2021-05-03 20:22:54 +02:00
Matthijs Mekking	5b31811b5f	Update nsupdate test The nsupdate system test did not record failures from the 'update_test.pl' Perl script. This was because the 'ret' value was not being saved outside the '{ $PERL ... || ret=1 } | cat_i' scope. Change this piece to store the output in a separate file and then cat its contents. Now the 'ret' value is being saved. Also record failures in 'update_test.pl' if sending the update failed. Add missing 'n' increments to 'nsupdate/test.sh' to keep track of test numbers.	2021-04-30 12:25:25 +00:00
Matthijs Mekking	287428e0aa	Add kasp test policy goes straight to "none" Add a test case when a dnssec-policy is reconfigured to "none", without setting it to "insecure" first. This is unsupported behavior, but we want to make sure the behavior is somewhat expected. The zone should remain signed (but will go bogus once the signatures expire).	2021-04-30 11:20:41 +02:00
Matthijs Mekking	9c6ff463fd	Add test for "insecure" policy While it is meant to be used for transitioning a zone to insecure, add a test case where a zone uses the "insecure" policy immediately. The zone will go through DNSSEC maintenance, but the outcome should be the same as 'dnssec-policy none;', that is the zone should be unsigned.	2021-04-30 11:18:38 +02:00
Matthijs Mekking	17e3b056c8	Update kasp tests to "insecure" policy The tests for going insecure should be changed to use the built-in "insecure" policy. The function that checks dnssec status output should again check for the special case "none".	2021-04-30 11:18:38 +02:00
Matthijs Mekking	2710d9a11d	Add built-in dnssec-policy "insecure" Add a new built-in policy "insecure", to be used to gracefully unsign a zone. Previously you could just remove the 'dnssec-policy' configuration from your zone statement, or set it to "none". The built-in policy "none" (or not configured) now actually means no DNSSEC maintenance for the corresponding zone. So if you immediately reconfigure your zone from whatever policy to "none", your zone will temporarily be seen as bogus by validating resolvers. This means we can remove the functions 'dns_zone_use_kasp()' and 'dns_zone_secure_to_insecure()' again. We also no longer have to check for the existence of key state files to figure out if a zone is transitioning to insecure.	2021-04-30 11:18:38 +02:00
Mark Andrews	044933756a	NSEC3PARAM support was added to Net::DNS in 1.00_06 Require 1.01 or later when adding NSEC3PARAM records.	2021-04-30 15:59:30 +10:00
Mark Andrews	8510ccaa54	Update ZONEMD to match RFC 8976 * The location of the digest type field has changed to where the reserved field was. * The reserved field is now called scheme and is where the digest type field was. * Digest type 2 has been defined (SHA-512).	2021-04-30 10:43:37 +10:00
Michal Nowak	e1c3034107	Disable pytest cacheprovider plugin in CI The pytest "cacheprovider" plugin produces a .cache/v/cache/lastfailed file, which holds a Python dictionary structure with failed tests. However, on Ubuntu 16.04 (Xenial) the file is created even though the test passed and the file contains just an empty dictionary ("{}"). Given that we are not interested in this feature, disabling the "cacheprovider" plugin globally and removing per-test removals of the .cache directory seems like the best course of action.	2021-04-29 15:29:18 +02:00
Mark Andrews	e6e0e29fbb	Check insecure responses returned with too many NSEC3 iterations	2021-04-29 13:43:40 +02:00
Ondřej Surý	861a236937	Use SIGABRT instead of SIGKILL to produce cores on failed start When `named` hung on startup, it was killed with SIGKILL, leaving us with no information about the state the process was in. This commit changes the start.pl script to send SIGABRT instead, so we can properly collect and process the coredump from the hung named process.	2021-04-29 12:03:50 +02:00
Matthijs Mekking	efa5d84dcf	dnssec-policy: reduce NSEC3 iterations to 150 When reducing the number of NSEC3 iterations to 150, commit `aa26cde2ae` added tests for dnssec-policy to check that a too high iteration count is a configuration failure. The test is not sufficient because 151 was always too high for ECDSAP256SHA256. The test should check for a different algorithm. There was an existing test case that checks for NSEC3 iterations. Update the test with the new maximum values. Update the code in 'kaspconf.c' to allow at most 150 iterations.	2021-04-29 10:41:16 +02:00
Mark Andrews	46eb21c546	Check that excessive iterations are logged by named when loading an existing zone or transferring from the primary.	2021-04-29 17:18:26 +10:00
Mark Andrews	8ec16c378d	Check NSEC3 iterations with dnssec-signzone	2021-04-29 17:18:26 +10:00
Mark Andrews	4ce8437a6e	Check that named rejects excessive iterations via UPDATE	2021-04-29 17:18:26 +10:00
Mark Andrews	3fe75d9809	nsupdate: reject attempts to add NSEC3PARAM with excessive iterations	2021-04-29 17:18:26 +10:00
Mark Andrews	aa26cde2ae	Check dnssec-policy nsec3param iterations limit	2021-04-29 17:18:26 +10:00
Mark Andrews	29126500d2	Reduce nsec3 max iterations to 150	2021-04-29 17:18:26 +10:00
Matthijs Mekking	104b676235	Serve-stale nit fixes While working on the serve-stale backports, I noticed the following oddities: 1. In the serve-stale system test, in one case we keep track of how long it took for dig to complete. In commit `aaed7f9d8c`, the code removed the exception to check for result == ISC_R_SUCCESS on stale found answers, and adjusted the test accordingly. This failed to update the time tracking accordingly. Move the t1/t2 time track variables back around the two dig commands to ensure the lookups resolved faster than the resolver-query-timeout. 2. We can remove the setting of NS_QUERYATTR_STALEOK and DNS_RDATASETATTR_STALE_ADDED on the "else if (stale_timeout)" code path, because they are added later when we know we have actually found a stale answer on a stale timeout lookup. 3. We should clear the NS_QUERYATTR_STALEOK flag from the client query attributes instead of DNS_RDATASETATTR_STALE_ADDED (that flag is set on the rdataset attributes). 4. In 'bin/named/config.c' we should set the configuration options in alphabetical order. 5. In the ARM, in the backports we have added "(stale)" between "cached" and "RRset" to make it clearer that a stale RRset may be returned in this scenario.	2021-04-28 12:24:24 +02:00
Michał Kępień	241e85ef0c	Warn when log files grow too big in system tests Exerting excessive I/O load on the host running system tests should be avoided in order to limit the number of false positives reported by the system test suite. In some cases, running named with "-d 99" (which is the default for system tests) results in a massive amount of logs being generated, most of which are useless. Implement a log file size check to draw developers' attention to overly verbose named instances used in system tests. The warning threshold of 200,000 lines was chosen arbitrarily.	2021-04-28 07:56:47 +02:00
Michał Kępień	17e5c2a50e	Prevent useless logging in the "tcp" system test The regression test for CVE-2020-8620 causes a lot of useless messages to be logged. However, globally decreasing the log level for the affected named instance would be a step too far as debugging information may be useful for troubleshooting other checks in the "tcp" system test. Starting a separate named instance for a single check should be avoided when possible and thus is also not a good solution. As a compromise, run "rndc trace 1" for the affected named instance before starting the regression test for CVE-2020-8620.	2021-04-28 07:56:47 +02:00
Michał Kępień	4a8d404876	Limit logging for verbose system tests The system test framework starts all named instances with the "-d 99" command line option (unless it is overridden by a named.args file in a given instance's working directory). This causes a lot of log messages to be written to named.run files - currently over 5 million lines for a single test suite run. While debugging information preserved in the log files is essential for troubleshooting intermittent test failures, some system tests involve sending hundreds or even thousands of queries, which causes the relevant log files to explode in size. When multiple tests (or even multiple test suites) are run in parallel, excessive logging contributes considerably to the I/O load on the test host, increasing the odds of intermittent test failures getting triggered. Decrease the debug level for the seven most verbose named instances: - use "-d 3" for ns2 in the "cacheclean" system test (it is the lowest logging level at which the test still passes without the need to apply any changes to tests.sh), - use "-d 1" for the other six named instances. This roughly halves the number of lines logged by each test suite run while still leaving enough information in the logs to allow at least basic troubleshooting in case of test failures. This approach was chosen as it results in a greater decrease in the number of lines logged than running all named instances with "-d 3", without causing any test failures.	2021-04-28 07:56:47 +02:00
Diego Fronza	4d6408b823	Fix following up lookup failure if more resolvers are available The _query_detach function was incorrectly unlinking the query object from the lookup->q query list. This made it impossible to follow a query lookup failure with the next one in the list (possibly using a separate resolver), as the link to the next query in the list was dissolved. Fix by unlinking the node only when the query object is about to be destroyed, i.e. there are no more references to the object.	2021-04-26 11:14:14 -03:00
Michał Kępień	24bf4b946a	Test handling of non-apex RRSIG(SOA) RRsets Add a check to the "dnssec" system test which ensures that RRSIG(SOA) RRsets present anywhere other than at the zone apex are automatically removed after a zone containing such RRsets is loaded.	2021-04-23 14:26:48 +02:00
Michał Kępień	6feac68b50	Test "tkey-gssapi-credential" conditionally If "tkey-gssapi-credential" is set in the configuration and GSSAPI support is not available, named will refuse to start. As the test system framework does not support starting named instances conditionally, ensure that "tkey-gssapi-credential" is only present in named.conf if GSSAPI support is available.	2021-04-26 07:16:38 +02:00
Mark Andrews	8d5870f9df	Wait for named to start If we don't wait for named to finish starting, 'rndc stop' may fail due to the listen limit being reached in named leading to a false negative test	2021-04-24 01:19:47 +00:00
Diego Fronza	d6224035d8	Add system test for the deadlock fix The test spawns 4 parallel workers that keep adding, modifying and deleting zones, while the main thread repeatedly checks whether rndc status responds within a reasonable period. While environment and timing issues may affect the test, in most test cases the deadlock that was taking place before the fix used to trigger in less than 7 seconds on a machine with at least 2 cores.	2021-04-22 15:45:55 +00:00
Diego Fronza	9298dcebbd	Fix deadlock between rndc addzone/delzone/modzone What follows is a description of the steps that led to the deadlock: 1. `do_addzone` calls `isc_task_beginexclusive`. 2. `isc_task_beginexclusive` waits for (N_WORKERS - 1) halted tasks; this blocks until those (no. of workers - 1) workers halt: `isc_task_beginexclusive(isc_task_t *task0) { ... while (manager->halted + 1 < manager->workers) { wake_all_queues(manager); WAIT(&manager->halt_cond, &manager->halt_lock); } ... }` 3. It is possible that in `task.c / dispatch()` a worker is running a task event; if that event blocks, the worker cannot halt. 4. `do_addzone` acquires `LOCK(&view->new_zone_lock);`. 5. The `rmzone` event is called from some worker's `dispatch()`; `rmzone` blocks waiting for the same lock. 6. `do_addzone` calls `isc_task_beginexclusive`. 7. Deadlock triggered, since: - `rmzone` is waiting for the lock. - `isc_task_beginexclusive` is waiting for (no. of workers - 1) workers to be halted. - Since the `rmzone` event is blocked, it won't allow its worker to halt. To fix this, we updated the do_addzone code to call isc_task_beginexclusive before the lock is acquired; we postpone locking to the nearest required place, and do the same for isc_task_beginexclusive. The same could happen with rndc modzone, so that was addressed as well.	2021-04-22 15:45:55 +00:00
Petr Špaček	1746d2e84a	Add tests for the "tkey-gssapi-credential" option Four named instances in the "nsupdate" system test have GSS-TSIG support enabled. All of them currently use "tkey-gssapi-keytab". Configure two of them with "tkey-gssapi-credential" to test that option. As "tkey-gssapi-keytab" and "tkey-gssapi-credential" both provide the same functionality, no test modifications are required. The difference between the two options is that the value of "tkey-gssapi-keytab" is an explicit path to the keytab file to acquire credentials from, while the value of "tkey-gssapi-credential" is the name of the principal whose credentials should be used; those credentials are looked up in the keytab file expected by the Kerberos library, i.e. /etc/krb5.keytab by default. The path to the default keytab file can be overridden by setting the KRB5_KTNAME environment variable. Utilize that variable to use existing keytab files with the "tkey-gssapi-credential" option. The KRB5_KTNAME environment variable should not interfere with the "tkey-gssapi-keytab" option. Nevertheless, rename one of the keytab files used with "tkey-gssapi-keytab" to something other than the contents of the KRB5_KTNAME environment variable in order to make sure that both "tkey-gssapi-keytab" and "tkey-gssapi-credential" are actually tested.	2021-04-22 16:15:22 +02:00
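
Commit 66f2cd228d in the log above replaces character range checks with isdigit() and an (unsigned char) cast. A minimal sketch of that idiom follows, assuming a hypothetical helper named all_digits() that is not taken from the BIND source. It only illustrates why the cast matters: a plain char may be negative, and passing a negative value other than EOF to isdigit() is undefined.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the BIND source): returns true if `s`
 * consists of exactly `len` decimal digits followed by the terminator.
 */
static bool
all_digits(const char *s, size_t len) {
	assert(s != NULL);

	for (size_t i = 0; i < len; i++) {
		/*
		 * The cast keeps the argument within the range isdigit()
		 * accepts. A '\0' inside the first `len` bytes fails the
		 * test, so we never read past the end of a short string.
		 */
		if (!isdigit((unsigned char)s[i])) {
			return false;
		}
	}
	/* Reject strings that continue beyond `len` characters. */
	return s[len] == '\0';
}
```

For example, all_digits("12345", 5) is true, while a string containing a byte with the high bit set is rejected without ever handing isdigit() a negative value.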

10208 Commits
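
Commit 8510ccaa54 in the log above reorders the ZONEMD fields to match RFC 8976, where the RDATA is laid out as a 32-bit serial, an 8-bit scheme, an 8-bit hash algorithm (the "digest type" in the commit message) and then the digest. A rough sketch of reading that layout follows. The zonemd_view_t type and zonemd_parse() function are hypothetical names chosen for illustration and are not BIND's internal API; the 12-octet minimum digest length is taken from RFC 8976.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a ZONEMD RDATA; not BIND's internal type. */
typedef struct {
	uint32_t serial;       /* zone serial the digest covers */
	uint8_t scheme;        /* occupies the old "digest type" position */
	uint8_t hash_alg;      /* occupies the old "reserved" position */
	const uint8_t *digest; /* points into the caller's buffer */
	size_t digest_len;
} zonemd_view_t;

/*
 * Split a wire-format ZONEMD RDATA into its fields. Returns false if
 * the buffer is too short for the fixed fields plus a minimal digest.
 */
static bool
zonemd_parse(const uint8_t *rdata, size_t len, zonemd_view_t *out) {
	assert(rdata != NULL && out != NULL);

	/* 4 (serial) + 1 (scheme) + 1 (hash algorithm) + >= 12 digest octets */
	if (len < 6 + 12) {
		return false;
	}
	/* The serial is a 32-bit integer in network byte order. */
	out->serial = ((uint32_t)rdata[0] << 24) | ((uint32_t)rdata[1] << 16) |
		      ((uint32_t)rdata[2] << 8) | (uint32_t)rdata[3];
	out->scheme = rdata[4];
	out->hash_alg = rdata[5];
	out->digest = rdata + 6;
	out->digest_len = len - 6;
	return true;
}
```

The sketch deliberately does not validate the digest length against the hash algorithm; a real implementation would also check that step.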