bind9

Author	SHA1	Message	Date
Mark Andrews	38449de93b	Update named's usage description	2021-04-12 12:07:44 +10:00
Michał Kępień	c3718b926b	Use the same port selection method on all systems When system tests are run on Windows, they are assigned port ranges that are 100 ports wide and start from port number 5000. This is a different port assignment method than the one used on Unix systems. Drop the "-p" command line option from bin/tests/system/run.sh invocations used for starting system tests on Windows to unify the port assignment method used across all operating systems.	2021-04-08 11:12:37 +02:00
Michał Kępień	31e5ca4bd9	Rework get_ports.sh to make it not use a lock file The get_ports.sh script is used for determining the range of ports a given system test should use. It first determines the start of the port range to return (the base port); it can either be specified explicitly by the caller or chosen randomly. Subsequent ports are picked sequentially, starting from the base port. To ensure no single port is used by multiple tests, a state file (get_ports.state) containing the last assigned port is maintained by the script. Concurrent access to the state file is protected by a lock file (get_ports.lock); if one instance of the script holds the lock file while another instance tries to acquire it, the latter retries its attempt to acquire the lock file after sleeping for 1 second; this retry process can be repeated up to 10 times before the script returns an error. There are some problems with this approach: - the sleep period in case of failure to acquire the lock file is fixed, which leads to a "thundering herd" type of problem, where (depending on how processes are scheduled by the operating system) multiple system tests try to acquire the lock file at the same time and subsequently sleep for 1 second, only for the same situation to likely happen the next time around, - the lock file is being locked and then unlocked for every single port assignment made, not just once for the entire range of ports a system test should use; in other words, the lock file is currently locked and unlocked 13 times per system test; this increases the odds of the "thundering herd" problem described above preventing a system test from getting one or more ports assigned before the maximum retry count is reached (assuming multiple system tests are run in parallel); it also enables the range of ports used by a given system test to be non-sequential (which is a rather cosmetic issue, but one that can make log interpretation harder than necessary when test failures are diagnosed), - both issues described above cause unnecessary delays when multiple system tests are started in parallel (due to high lock file contention among the system tests being started), - maintaining a state file requires ensuring proper locking, which complicates the script's source code. Rework the get_ports.sh script so that it assigns non-overlapping port ranges to its callers without using a state file or a lock file: - add a new command line switch, "-t", which takes the name of the system test to assign ports for, - ensure every instance of get_ports.sh knows how many ports all system tests which form the test suite are going to need in total (based on the number of subdirectories found in bin/tests/system/), - in order to ensure all instances of get_ports.sh work on the same global port range (so that no port range collisions happen), a stable (throughout the expected run time of a single system test suite) base port selection method is used instead of the random one; specifically, the base port, unless specified explicitly using the "-p" command line switch, is derived from the number of hours which passed since the Unix Epoch time, - use the name of the system test to assign ports for (passed via the new "-t" command line switch) as a unique index into the global system test range, to ensure all system tests use disjoint port ranges.	2021-04-08 11:12:37 +02:00
Michal Nowak	cd0a34df1b	Move fromhex.pl script to bin/tests/system/ The fromhex.pl script needs to be copied from the source directory to the build directory before any test is run, otherwise the out-of-tree fails to find it. Given that the script is used only in system test, move it to bin/tests/system/.	2021-04-08 11:04:26 +02:00
Mark Andrews	bb6f0faeed	Check that upgrade of managed-keys.bind.jnl succeeded Update the system to include a recoverable managed.keys journal created with <size,serial0,serial1,0> transactions and test that it has been updated as part of the start up process.	2021-04-07 20:27:22 +02:00
Evan Hunt	d2ea8f4245	Ensure dig lookup is detached on UDP connect failure dig could hang when UDP connect failed due to a dangling lookup object.	2021-04-07 15:36:59 +02:00
Ondřej Surý	86f4872dd6	isc_nm_connect() always return via callback The isc_nm_connect() functions were refactored to always return the connection status via the connect callback instead of sometimes returning the hard failure directly (for example, when the socket could not be created, or when the network manager was shutting down). This commit changes the connect functions in all the network manager modules, and also makes the necessary refactoring changes in places where the connect functions are called.	2021-04-07 15:36:59 +02:00
Evan Hunt	a70cd026df	move UDP connect retries from dig into isc_nm_udpconnect() dig previously ran isc_nm_udpconnect() three times before giving up, to work around a freebsd bug that caused connect() to return a spurious transient EADDRINUSE. this commit moves the retry code into the network manager itself, so that isc_nm_udpconnect() no longer needs to return a result code.	2021-04-07 15:36:59 +02:00
Matthijs Mekking	e443279bbf	Change default stale-answer-client-timeout to off Using "stale-answer-client-timeout" turns out to have unforeseen negative consequences, and thus it is better to disable the feature by default for the time being.	2021-04-07 14:10:31 +02:00
Matthijs Mekking	aaed7f9d8c	Remove result exception on staleonly lookup When implementing "stale-answer-client-timeout", we decided that we should only return positive answers prematurely to clients. A negative response is not useful, and in that case it is better to wait for the recursion to complete. To do so, we check the result and if it is not ISC_R_SUCCESS, we decide that it is not good enough. However, there are more return codes that could lead to a positive answer (e.g. CNAME chains). This commit removes the exception and now uses the same logic that other stale lookups use to determine if we found a useful stale answer (stale_found == true). This means we can simplify two test cases in the serve-stale system test: nodata.example is no longer treated differently than data.example.	2021-04-02 10:02:40 +02:00
Mark Andrews	35e8f56b49	Test dynamic libraries should not be installed Tag the libraries with check_ to prevent them being installed by "make install". Additionally make check requires .so to be created which requires .lai files to be constructed which, in turn, requires -rpath <dir> as part of "linking" the .la file.	2021-04-01 19:11:54 +11:00
Diego Fronza	3b98c4d311	Update dig's man page Adjusted man page entries for +tries and +retry options to reflect the fact that now those options apply to TCP as well.	2021-03-25 14:08:40 -03:00
Diego Fronza	4f82cc41cc	Added tests for tries=1 and retry=0 on TCP EOF Added tests to ensure that dig won't retry sending a query over tcp (+tcp) when a TCP connection is closed prematurely (EOF is read) if either +tries=1 or retry=0 is specified on the command line.	2021-03-25 14:08:40 -03:00
Diego Fronza	e680896003	Adjusted dig system tests Now that premature EOF on tcp connections takes +tries and +retry into account, the dig system tests handling TCP EOF with +tries=1 were expecting dig to do a second attempt in handling the tcp query, which doesn't happen anymore. To make the test work as expected +tries value was adjusted to 2, to make it behave as before after the new update on dig.	2021-03-25 14:08:40 -03:00
Diego Fronza	78f6ead480	Don't retry +tcp queries on failure if tries=1 or retries=0 Before this commit, a premature EOF (connection closed) on tcp queries was causing dig to automatically attempt to send the query again, even if +tries=1 or +retries=0 was provided on command line. This commit fixes the problem by taking into account the no. of retries specified by the user when processing a premature EOF on tcp connections.	2021-03-25 14:08:39 -03:00
Matthijs Mekking	93ed215065	Add kasp.sh to run.sh.in script Add kasp.sh to the list of scripts copied from the source directory to the build directory before any test is run. This will fix the out-of-tree test failures introduced in commit `ecb073bdd6` on the 'main' branch.	2021-03-24 08:55:24 +01:00
Matthijs Mekking	82d667e1d5	Fix some intermittent kasp failures When calling "rndc dnssec -checkds", it may take some milliseconds before the appropriate changes have been written to the state file. Add retry_quiet mechanisms to allow the write operation to finish. Also retry_quiet the check for the next key event. A "rndc dnssec" command may trigger a zone_rekey event and this will write out a new "next key event" log line, but it may take a bit longer than expected in the tests.	2021-03-22 11:58:26 +01:00
Matthijs Mekking	82f72ae249	Rekey immediately after rndc checkds/rollover Call 'dns_zone_rekey' after a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command is received, because such a command may influence the next key event. Updating the keys immediately avoids unnecessary rollover delays. The kasp system test no longer needs to call 'rndc loadkeys' after a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command.	2021-03-22 11:58:26 +01:00
Matthijs Mekking	6f31f62d69	Delete CDS/CDNSKEY records when zone is unsigned CDS/CDNSKEY DELETE records are only useful if they are signed, otherwise the parent cannot verify these RRsets anyway. So once the DS has been removed (and signaled to BIND), we can remove the DNSKEY and RRSIG records, and at this point we can also remove the CDS/CDNSKEY records.	2021-03-22 10:30:59 +01:00
Matthijs Mekking	f211c7c2a1	Allow CDS/CDNSKEY DELETE records in unsigned zone While not useful, having a CDS/CDNSKEY DELETE record in an unsigned zone is not an error and "named-checkzone" should not complain.	2021-03-22 10:25:30 +01:00
Matthijs Mekking	d5531df79a	Retry quiet check keys Change the 'check_keys' function to try three times. Some intermittent kasp test failures are because we are inspecting the key files before the actual change has happened. The 'retry_quiet' approach allows for a bit more time to let the write operation finish.	2021-03-22 09:50:05 +01:00
Matthijs Mekking	c40c1ebcb1	Test keymgr2kasp state from timing metadata Add two test zones that migrate to dnssec-policy. Test if the key states are set accordingly given the timing metadata. The rumoured.kasp zone has its Publish/Active/SyncPublish times set not too long ago so the key states should be set to RUMOURED. The omnipresent.kasp zone has its Publish/Active/SyncPublish times set long enough to set the key states to OMNIPRESENT. Slightly change the init_migration_keys function to set the key lifetime to "none" (legacy keys don't have lifetime). Then in the test case set the expected key lifetime explicitly.	2021-03-22 09:50:05 +01:00
Matthijs Mekking	f6fa254256	Editorial commit keymgr2kasp test This commit is somewhat editorial as it does not introduce something new nor fixes anything. The layout in keymgr2kasp/tests.sh has been changed, with the intention to make more clear where a test scenario ends and begins. The publication time of some ZSKs has been changed. It makes a more clear distinction between publication time and activation time.	2021-03-22 09:50:05 +01:00
Matthijs Mekking	ecb073bdd6	Introduce kasp.sh Add a script similar to conf.sh to include common functions and variables for testing KASP. Currently used in kasp, keymgr2kasp, and nsec3.	2021-03-22 09:50:05 +01:00
Matthijs Mekking	5389172111	Move kasp migration tests to different directory The kasp system test was getting pretty large, and more tests are on the way. Time to split up. Move tests that are related to migrating to dnssec-policy to a separate directory 'keymgr2kasp'.	2021-03-22 09:50:05 +01:00
Michał Kępień	185a1a5643	Install man page for named-compilezone The named-checkzone tool can also be invoked as named-compilezone. Make sure a man page is installed for that alias. Move and rename the "man_named-checkzone" label to prevent a Sphinx duplicate label warning from being raised (see commit `84862e96c1` for more information).	2021-03-22 09:36:48 +01:00
Patrick McLean	56cef1495f	dig: Use high resolution clocks when microsecond accuracy is requested The TIME_NOW macro calls isc_time_now which uses CLOCK_REALTIME_COARSE for getting the current time. This is perfectly fine for millisecond resolution; however, when the user requests microsecond resolution, they are going to get very inaccurate results. This is especially true on a server class machine where the clock ticks may be set to 100HZ. This changes dig to use the new TIME_NOW_HIRES macro that uses CLOCK_MONOTONIC_RAW, which is more expensive, but gets the actual current time rather than the time at the last kernel time tick.	2021-03-20 11:25:55 -07:00
treysis	6b2ea00621	Add filter-a plugin for IPv6-dominant environments (cherry picked from commit 78f6cd57e1cc166823415438fe2d19a324cf7a67)	2021-03-19 08:06:55 +01:00
Ondřej Surý	36ddefacb4	Change the isc_nm_(get|set)timeouts() to work with milliseconds The RFC7828 specifies the keepalive interval to be 16-bit, specified in units of 100 milliseconds and the configuration options tcp-*-timeouts follow suit. The units of 100 milliseconds are very unintuitive and while we can't change the configuration and presentation format, we should not follow this weird unit in the API. This commit changes the isc_nm_(get|set)timeouts() functions to work with milliseconds and convert the values to milliseconds before passing them to the function, not just internally.	2021-03-18 16:37:57 +01:00
Ondřej Surý	64cff61c02	Add TCP timeouts system test The system tests were missing a test that would test tcp-initial-timeout and tcp-idle-timeout. This commit adds new "timeouts" system test that adds: * Test that waits longer than tcp-initial-timeout and then checks whether the socket was closed * Test that sends and receives a DNS message, then waits longer than tcp-initial-timeout but shorter than tcp-idle-timeout, then sends a DNS message again, then waits longer than tcp-idle-timeout and checks whether the socket was closed * Similar test, but bursting 25 DNS messages, then waiting longer than tcp-initial-timeout and shorter than tcp-idle-timeout, then doing a second 25 DNS message burst * Check whether transfer longer than tcp-initial-timeout succeeds	2021-03-18 16:37:57 +01:00
Matthijs Mekking	0cae3249e3	Add test for thaw dynamic kasp zone Add a test for freezing, manually updating, and then thawing a dynamic zone with "dnssec-policy". In the kasp system test we add parameters to the "update_is_signed" check to signal the indicated IP addresses for the labels "a" and "d". If set to '-', the test is skipped. After nsupdating the dynamic.kasp zone, we revert the update (with nsupdate) and update the zone again, but now with the freeze/thaw approach.	2021-03-17 08:24:17 +01:00
Matthijs Mekking	ee0835d977	Fix a XoT crash The transport should also be detached when we skip a master, otherwise named will crash when sending a SOA query to the next master over TLS, because the transport must be NULL when we enter 'dns_view_gettransport'.	2021-03-16 10:11:12 +01:00
Mark Andrews	25d1276170	Ignore the actual error code returned by getaddrinfo when testing if interactive mode continues or not on invalid hostname. We only need to detect that getaddrinfo failed and that we continued or not.	2021-03-16 10:20:28 +11:00
Matthijs Mekking	87591de6f7	Fix servestale fetchlimits crash When we query the resolver for a domain name that is in the same zone for which there are already one or more fetches outstanding, we could potentially hit the fetch limits. If so, recursion fails immediately for the incoming query and if serve-stale is enabled, we may try to return a stale answer. If the resolver is also authoritative for the parent zone (for example the root zone), first a delegation is found, but we first check the cache for a better response. Nothing is found in the cache, so we try to recurse to find the answer to the query. Because of fetch-limits 'dns_resolver_createfetch()' returns an error, which 'ns_query_recurse()' propagates to the caller, 'query_delegation_recurse()'. Because serve-stale is enabled, 'query_usestale()' is called, setting 'qctx->db' to the cache db, but leaving 'qctx->version' untouched. Now 'query_lookup()' is called to search for stale data in the cache database with a non-NULL 'qctx->version' (which is set to a zone db version), and thus we hit an assertion in rbtdb. This crash was introduced in 'main' by commit `8bcd7fe69e`.	2021-03-11 12:16:14 +01:00
Mark Andrews	af0ee2c718	Rename 'yield' to 'waitforsignal' due to namespace clash	2021-03-11 11:34:15 +11:00
Mark Andrews	926b9056b7	add journal to conf.sh.common	2021-03-08 11:36:00 +11:00
Evan Hunt	dbffb212ce	add basic DoH system tests - rename dot to doth, as it now covers both dot and doh. - merge xot into doth as it's closely related. - added long-lived key and cert files (expiring 2121). - add tests with https-get, https-post, http-plain, alternate endpoints, and both static and ephemeral TLS configuration. - incidentally fixed a memory leak in dig that occurred if +https was specified more than once.	2021-03-05 18:09:42 +02:00
Artem Boldariev	ca9a15e3bc	DoH: call send callbacks after data was actually sent	2021-03-05 13:29:32 +02:00
Evan Hunt	88752b1121	refactor outgoing HTTP connection support - style, cleanup, and removal of unnecessary code. - combined isc_nm_http_add_endpoint() and isc_nm_http_add_doh_endpoint() into one function, renamed isc_http_endpoint(). - moved isc_nm_http_connect_send_request() into doh_test.c as a helper function; remove it from the public API. - renamed isc_http2 and isc_nm_http2 types and functions to just isc_http and isc_nm_http, for consistency with other existing names. - shortened a number of long names. - the caller is now responsible for determining the peer address in isc_nm_httpconnect(); this eliminates the need to parse the URI and the dependency on an external resolver. - the caller is also now responsible for creating the SSL client context, for consistency with isc_nm_tlsdnsconnect(). - added setter functions for HTTP/2 ALPN. instead of setting up ALPN in isc_tlsctx_createclient(), we now have a function isc_tlsctx_enable_http2client_alpn() that can be run from isc_nm_httpconnect(). - refactored isc_nm_httprequest() into separate read and send functions. when isc_nm_send() or isc_nm_read() is called on an http socket, it will be stored until a corresponding isc_nm_read() or _send() arrives; when we have both halves of the pair the HTTP request will be initiated. - isc_nm_httprequest() is renamed isc__nm_http_request() for use as an internal helper function by the DoH unit test. (eventually doh_test should be rewritten to use read and send, and this function should be removed.) - added implementations of isc__nm_tls_settimeout() and isc__nm_http_settimeout(). - increased NGHTTP2 header block length for client connections to 128K. - use isc_mem_t for internal memory allocations inside nghttp2, to help track memory leaks. - send "Cache-Control" header in requests and responses. (note: currently we try to bypass HTTP caching proxies, but ideally we should interact with them: https://tools.ietf.org/html/rfc8484#section-5.1)	2021-03-05 13:29:26 +02:00
Ondřej Surý	9c8b7a5c45	add preliminary DoH client support to dig add options "+https", "+https-get" and "+http-plain" to allow dig to connect over HTTP/2 channels.	2021-03-05 13:28:17 +02:00
Mark Andrews	4015af02d8	Move cleanup of queries to later in the shutdown sequence to avoid TSAN report WARNING: ThreadSanitizer: data race Write of size 8 at 0x000000000001 by main thread: #0 free <null> #1 default_memfree lib/isc/mem.c:440 #2 mem_put lib/isc/mem.c:363 #3 isc__mem_free lib/isc/mem.c:1012 #4 main bin/tools/mdig.c:2231 Previous read of size 1 at 0x000000000005 by thread T1: #0 dns_name_fromtext lib/dns/name.c:1121 #1 sendquery bin/tools/mdig.c:596 #2 sendqueries bin/tools/mdig.c:779 #3 dispatch lib/isc/task.c:1153 #4 run lib/isc/task.c:1345 #5 isc__trampoline_run lib/isc/trampoline.c:184 #6 <null> <null> Thread T1 (running) created by main thread at: #0 pthread_create <null> #1 isc_thread_create pthreads/thread.c:79 #2 isc_taskmgr_create lib/isc/task.c:1435 #3 main bin/tools/mdig.c:2148 SUMMARY: ThreadSanitizer: data race in __interceptor_free	2021-03-04 13:21:56 +01:00
Ondřej Surý	8153729d3a	Use int type to store result from isc_commandline_parse() The C standard actually doesn't define char as signed or unsigned, and it could be either according to underlying architecture. It turns out that while it's usually signed type, it isn't on arm64 where it's unsigned. isc_commandline_parse() returns int, just use that instead of the char.	2021-03-04 10:43:00 +01:00
Evan Hunt	a0aefa1de6	create 'journal' system test tests that version 1 journal files containing version 1 transaction headers are rolled forward correctly on server startup, then updated into version 2 journals. also checks journal file consistency and 'max-journal-size' behavior.	2021-03-03 17:54:47 -08:00
Mark Andrews	fb2d0e2897	extend named-journalprint to be able to force the journal version named-journalprint can now upgrade or downgrade a journal file in place; the '-u' option upgrades and the '-d' option downgrades.	2021-03-03 17:54:47 -08:00
Evan Hunt	ee19966326	allow dns_journal_rollforward() to read old journal files when the 'max-ixfr-ratio' option was added, journal transaction headers were revised to include a count of RR's in each transaction. this made it impossible to read old journal files after an upgrade. this branch restores the ability to read version 1 transaction headers. when rolling forward, printing journal contents, if the wrong transaction header format is found, we can switch. when dns_journal_rollforward() detects a version 1 transaction header, it returns DNS_R_RECOVERABLE. this triggers zone_postload() to force a rewrite of the journal file in the new format, and also to schedule a dump of the zone database with minimal delay. journal repair is done by dns_journal_compact(), which rewrites the entire journal, ignoring 'max-journal-size'. journal size is corrected later. newly created journal files now have "BIND LOG V9.2" in their headers instead of "BIND LOG V9". files with the new version string cannot be read using the old transaction header format. note that this means newly created journal files will be rejected by older versions of named. named-journalprint now takes a "-x" option, causing it to print transaction header information before each delta, including its format version.	2021-03-03 17:54:47 -08:00
Matthijs Mekking	f8b7b597e9	Don't servfail on staleonly lookups When a staleonly lookup doesn't find a satisfying answer, it should not try to respond to the client. This is not true when the initial lookup is staleonly (that is when 'stale-answer-client-timeout' is set to 0), because no resolver fetch has been created at this point. In this case continue with the lookup normally.	2021-02-25 11:32:17 +01:00
Matthijs Mekking	9e061faaae	Don't allow recursion on staleonly lookups Fix a crash that can happen in the following scenario: A client request is received. There is no data for it in the cache, (not even stale data). A resolver fetch is created as part of recursion. Some time later, the fetch still hasn't completed, and stale-answer-client-timeout is triggered. A staleonly lookup is started. It will also find no data in the cache. So 'query_lookup()' will call 'query_gotanswer()' with ISC_R_NOTFOUND, so this will call 'query_notfound()' and this will start recursion. We will eventually end up in 'ns_query_recurse()' and that requires the client query fetch to be NULL: REQUIRE(client->query.fetch == NULL); If the previously started fetch is still running this assertion fails. The crash is easily prevented by not requiring recursion for staleonly lookups. Also remove a redundant setting of the staleonly flag at the end of 'query_lookup_staleonly()' before destroying the query context. Add a system test to catch this case.	2021-02-25 11:32:17 +01:00
Matthijs Mekking	0c0f10b53f	Add tests for NSEC3 on dynamic zones GitLab issue #2498 is a bug report on NSEC3 with dynamic zones. Tests for it in the nsec3 system test directory were missing.	2021-02-25 17:21:17 +11:00
Mark Andrews	658c950d7b	Silence CID 320481: Null pointer dereferences *** CID 320481: Null pointer dereferences (REVERSE_INULL) /bin/tests/wire_test.c: 261 in main() 255 process_message(input); 256 } 257 } else { 258 process_message(input); 259 } 260 CID 320481: Null pointer dereferences (REVERSE_INULL) Null-checking "input" suggests that it may be null, but it has already been dereferenced on all paths leading to the check. 261 if (input != NULL) { 262 isc_buffer_free(&input); 263 } 264 265 if (printmemstats) { 266 isc_mem_stats(mctx, stdout);	2021-02-23 12:45:45 +00:00
Mark Andrews	5fb168fab3	Silence CID 281450: Dereference before null check remove redundant 'inst != NULL' test 162cleanup: CID 281450 (#1 of 1): Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking inst suggests that it may be null, but it has already been dereferenced on all paths leading to the check. 163 if (result != ISC_R_SUCCESS && inst != NULL) { 164 plugin_destroy((void **)&inst); 165 }	2021-02-23 11:58:40 +00:00
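
The lock-free port assignment described in commit 31e5ca4bd9 above can be illustrated with a minimal Python sketch. It is not the get_ports.sh implementation: the range width (PORTS_PER_TEST), the port bounds (PORT_MIN, PORT_MAX), the multiplier applied to the hour count, and the use of the sorted position of the test name as its index are all assumptions made for illustration. Only the overall idea comes from the commit message: a base port derived from the number of hours since the Unix Epoch, plus a per-test offset derived from the test name.

```python
import time

# All three constants are hypothetical; the real script may use other values.
PORTS_PER_TEST = 20
PORT_MIN, PORT_MAX = 5001, 32767


def base_port(total_tests, now=None, explicit=None):
    """Return a base port that stays stable for one clock hour.

    An explicit base port (the "-p" switch in the commit message) wins.
    """
    if explicit is not None:
        return explicit
    hours = int(time.time() if now is None else now) // 3600
    # Leave room above the base port for every test's range.
    pool = PORT_MAX - PORT_MIN - total_tests * PORTS_PER_TEST
    if pool <= 0:
        raise ValueError("too many tests for the configured port bounds")
    # The factor 100 is an assumption; it just spreads consecutive hours apart.
    return PORT_MIN + (hours * 100) % pool


def port_range(test_name, all_tests, now=None, explicit=None):
    """Return the disjoint port range for one named test.

    The sorted position of the name serves as the unique index (assumed).
    """
    names = sorted(all_tests)
    index = names.index(test_name)
    start = base_port(len(names), now, explicit) + index * PORTS_PER_TEST
    return range(start, start + PORTS_PER_TEST)
```

Because every caller computes the same base port and a different index, no state file or lock file is needed, as long as all callers run within the same hour and see the same list of tests.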

10140 Commits