bind9

Author	SHA1	Message	Date
Tony Finch	afae41aa40	Check the return value from uv_async_send() An omission pointed out by the following report from Coverity: /lib/isc/loop.c: 483 in isc_loopmgr_pause() >>> CID 455002: Error handling issues (CHECKED_RETURN) >>> Calling "uv_async_send" without checking return value (as is done elsewhere 5 out of 6 times). 483 uv_async_send(&loop->pause_trigger);	2023-05-15 18:52:04 +01:00
Evan Hunt	b4ac7faee9	allow streamdns read to resume after timeout when reading on a streamdns socket failed due to timeout, but the dispatch was still waiting for other responses, it would resume reading by calling isc_nm_read() again. this caused an assertion because the socket was already reading. we now check that either the socket is reading, or that it was already reading on the same handle.	2023-05-13 23:31:45 -07:00
Tony Finch	fc770a8bd0	Remove the now-unused ISC_STACK We are using the liburcu concurrent data structures instead.	2023-05-12 20:49:43 +01:00
Tony Finch	f11cc83142	Use per-CPU RCU helper threads Create and free per-CPU helper threads from the main thread and tell thread sanitizer to suppress leaking threads. (We are not leaking threads ourselves and we can safely ignore the Userspace-RCU thread leaks.)	2023-05-12 20:48:31 +01:00
Tony Finch	c377e0a9e3	Help thread sanitizer to cope with liburcu All the places the qp-trie code was using `call_rcu()` needed `__tsan_release()` and `__tsan_acquire()` annotations, so add a couple of wrappers to encapsulate this pattern. With these wrappers, the tests run almost clean under thread sanitizer. The remaining problems are due to `rcu_barrier()` which can be suppressed using `.tsan-suppress`. It does not suppress the whole of `liburcu`, because we would like thread sanitizer to detect problems in `call_rcu()` callbacks, which are called from `liburcu`. The CI jobs have been updated to use `.tsan-suppress` by default, except for a special-case job that needs the additional suppressions in `.tsan-suppress-extra`. We might be able to get rid of some of this after liburcu gains support for thread sanitizer. Note: the `rcu_barrier()` suppression is not entirely effective: tsan sometimes reports races that originate inside `rcu_barrier()` but tsan has discarded the stack so it does not have the information required to suppress the report. These "races" can be made much easier to reproduce by adding `atexit_sleep_ms=1000` to `TSAN_OPTIONS`. The problem with tsan's short memory can be addressed by increasing `history_size`: when it is large enough (6 or 7) the `rcu_barrier()` stack usually survives long enough for suppression to work.	2023-05-12 20:48:31 +01:00
Tony Finch	2bce998b2b	Avoid using the zone timer after its loop has gone Shutdown and cleanup of zones is more asynchronous with the qp-trie zone table. As a result it's possible that some activity is delayed until after a zone has been released from its zonemanager. Previously, the dns_zone code was not very strict in the way it refers to the loop it is running on: The loop pointer was stashed when dns_zonemgr_managezone() was called and never cleared. Now, zones properly attach to and detach from their loops. The zone timer depends on its loop. The shutdown crashes occurred when asynchronous calls tried to modify the zone timer after dns_zonemgr_releasezone() has been called and the loop was invalidated. In these cases the attempt to set the timer is now ignored, with a debug log message.	2023-05-12 20:48:31 +01:00
Tony Finch	9882a6ef90	The zone table no longer depends on the loop manager This reverts some of the changes in commit `b171cacf4f` because now it isn't necessary to pass the loopmgr around.	2023-05-12 20:48:31 +01:00
Tony Finch	6217e434b5	Refactor the core qp-trie code to use liburcu A `dns_qpmulti_t` no longer needs to know about its loopmgr. We no longer keep a linked list of `dns_qpmulti_t` that have reclamation work, and we no longer mark chunks with the phase in which they are to be reclaimed. Instead, empty chunks are listed in an array in a `qp_rcu_t`, which is passed to call_rcu().	2023-05-12 20:48:31 +01:00
Tony Finch	05ca11e122	Remove isc_qsbr (we are using liburcu instead) This commit breaks the qp-trie code.	2023-05-12 20:48:31 +01:00
Tony Finch	cd0795beea	Slightly more sanitary thread dispatch Tell thread sanitizer that the thread wrapper is released before passing it to a new thread.	2023-05-12 20:48:31 +01:00
Tony Finch	2e0c954806	Wait for RCU to finish before destroying a memory context Memory reclamation by `call_rcu()` is asynchronous, so during shutdown it can lose a race with the destruction of its memory context. When we defer memory reclamation, we need to attach to the memory context to indicate that it is still in use, but that is not enough to delay its destruction. So, call `rcu_barrier()` in `isc_mem_destroy()` to wait for pending RCU work to finish before proceeding to destroy the memory context.	2023-05-12 20:48:31 +01:00
Tony Finch	4f97a679f0	A macro for the size of a struct with a flexible array member It can be fairly long-winded to allocate space for a struct with a flexible array member: in general we need the size of the struct, the size of the member, and the number of elements. Wrap them all up in a STRUCT_FLEX_SIZE() macro, and use the new macro for the flexible arrays in isc_ht and dns_qp.	2023-05-12 20:48:31 +01:00
Aram Sargsyan	fae0930eb8	Check whether zone->db is a valid pointer before attaching The zone_resigninc() function does not check the validity of 'zone->db', which can crash named if the zone was unloaded earlier, for example with "rndc delete". Check that 'zone->db' is not 'NULL' before attaching to it, like it is done in zone_sign() and zone_nsec3chain() functions, which can similarly be called by zone maintenance.	2023-05-12 13:37:27 +00:00
Ondřej Surý	fd3522c37b	Add Userspace-RCU to global CFLAGS and LIBS The Userspace-RCU headers are now needed for more parts of the libisc and libdns, thus we need to add it globally to prevent compilation failures on systems with non-standard Userspace-RCU installation path.	2023-05-12 14:16:25 +02:00
Ondřej Surý	00f1823366	Change the isc_quota API to use cds_wfcqueue internally The isc_quota API was using locked list of isc_job_t objects to keep the waiting TCP accepts. Change the isc_quota implementation to use cds_wfcqueue internally - the enqueue is wait-free and only dequeue needs to be locked.	2023-05-12 14:16:25 +02:00
Ondřej Surý	7b1d985de2	Change the isc_async API to use cds_wfcqueue internally The isc_async API was using lock-free stack (where enqueue operation was not wait-free). Change the isc_async to use cds_wfcqueue internally - enqueue and splice (move the queue members from one list to another) is nonblocking and wait-free.	2023-05-12 14:16:25 +02:00
Ondřej Surý	7220851f67	Replace glue_cache hashtable with direct link in rdatasetheader Instead of having a global hashtable with a global rwlock for the GLUE cache, move the glue_list directly into rdatasetheader and use Userspace-RCU to update the pointer when the glue_list is empty. Additionally, the cached glue_lists need to be stored in the RBTDB version for early cleaning, otherwise the circular dependencies between nodes and glue_lists will prevent nodes from ever being cleaned up.	2023-05-12 13:25:39 +02:00
Matthijs Mekking	2c7d93d431	Read from kasp whether to publish CDNSKEY Check the policy and feed 'dns_dnssec_syncupdate()' the right value to enable/disable CDNSKEY publication.	2023-05-11 17:07:51 +02:00
Matthijs Mekking	8be61d1845	Add configuration option 'cdnskey' Add the 'cdnskey' configuration option to 'dnssec-policy'.	2023-05-11 17:07:51 +02:00
Matthijs Mekking	7960afcc0f	Add functions to set CDNSKEY publication Add kasp API functions to enable/disable publication of CDNSKEY records.	2023-05-11 17:07:51 +02:00
Michal Nowak	31935a3537	Disable ASAN in nsupdate for fatal cases Clang 16 LeakSanitizer reports a memory leak when dns_request_create() returned a TLS error in the nsupdate system test. While technically a memory leak on error handling, it's not a problem because the program is immediately terminated; nsupdate is not expected to run for a prolonged time.	2023-05-11 13:39:51 +02:00
Mark Andrews	f3b24ba789	Handle FORMERR on unknown EDNS options that are echoed If the resolver received a FORMERR response to a request with a DNS COOKIE option present that echoes the option back, resend the request without a DNS COOKIE option present.	2023-05-11 09:32:02 +10:00
Ondřej Surý	b3c6ee7b9a	Fix a logical flaw that would skip logging notify success The notify_done() would never log a success as the logging part was always skipped. Fix the code flow in the function.	2023-05-03 21:51:20 +02:00
Mark Andrews	9fcd42c672	Re-write remove_old_tsversions and greatest_version Stop deliberately breaking const rules by copying file->name into dirbuf and truncating it there. Handle files located in the root directory properly. Use unlinkat() from POSIX 200809.	2023-05-03 09:12:34 +02:00
Matthijs Mekking	70629d73da	Fix purging old log files with absolute file path Removing old timestamp or increment versions of log backup files did not work when the file is an absolute path: only the entry name was provided to the file remove function. The dirname was also bogus, since the file separator was put back too soon. Fix these issues to make log file rotation work when the file is configured to be an absolute path.	2023-05-03 09:12:11 +02:00
Tony Finch	7d1ceaf35d	Move per-thread RCU setup into isc_thread All the per-loop `libuv` setup remains in `isc_loop`, but the per-thread RCU setup is moved to `isc_thread` alongside the other per-thread setup. This avoids repeating the per-thread setup for `call_rcu()` helpers, and explains a little better why some parts of the per-thread setup is missing for `call_rcu()` helpers. This also removes the per-loop `call_rcu()` helpers as we refactored the isc__random_initialize() in the previous commit.	2023-04-27 12:38:53 +02:00
Ondřej Surý	65021dbf52	Move the isc_random API initialization to the thread_local variable Instead of writing complicated wrappers for every thread, move the initialization back to isc_random unit and check whether the random seed was initialized with a thread_local variable. Ensure that isc_entropy_get() returns a non-zero seed. This avoids problems with thread sanitizer tests getting stuck in an infinite loop.	2023-04-27 12:38:53 +02:00
Tony Finch	e0248bf60f	Simplify isc_thread a little Remove the `isc_threadarg_t` and `isc_threadresult_t` typedefs which were unhelpful disguises for `void *`, and free the dummy jemalloc allocation sooner.	2023-04-27 12:38:53 +02:00
Tony Finch	06f534fa69	Avoid spurious compilation failures in liburcu headers When liburcu is not installed from a system package, its headers are not treated as system headers by the compiler, so BIND's -Werror and other warning options take effect. The liburcu headers have a lot of inline functions, some of which do not use all their arguments, which BIND's build treats as an error.	2023-04-27 12:38:53 +02:00
Ondřej Surý	c2c907d728	Improve the Userspace RCU integration This commit allows BIND 9 to be compiled with different flavours of Userspace RCU, and improves the integration between Userspace RCU and our event loop: - In the RCU QSBR, the thread is put offline when polling and online when rcu_dereference, rcu_assign_pointer (or friends) are called. - In other RCU modes, we check that we are not reading when reaching the quiescent callback in the event loop. - We register the thread before uv_work_run() callback is called and after it has finished. The rcu_(un)register_thread() has a large overhead, but that's fine in this case.	2023-04-27 12:38:53 +02:00
Ondřej Surý	58663574b9	Use server socket to log TCP accept failures The accept_connection() could detach from the child socket on a failure, so we need to keep and use the server socket for logging the accept failures.	2023-04-27 11:07:57 +02:00
Ondřej Surý	27ad3a65f9	Fix potential UAF when shutting down isc_httpd Use the ISC_LIST_FOREACH_SAFE() macro to safely walk the running https and shut them down in a manner safe from deletion.	2023-04-25 08:16:46 +02:00
Ondřej Surý	ae997d9e21	Add ISC_LIST_FOREACH(_SAFE) macros There's a recurring pattern walking the ISC_LISTs that just repeats over and over. Add two macros: * ISC_LIST_FOREACH(list, elt, link) - walk the static list * ISC_LIST_FOREACH_SAFE(list, elt, link, next) - walk the list in a manner that's safe against list member deletions	2023-04-25 08:16:46 +02:00
Mark Andrews	27160c137f	Cleanup orphaned empty-non-terminal NSEC3 When OPTOUT was in use we didn't ensure that NSEC3 records for orphaned empty-non-terminals were removed. Check if there are orphaned empty-non-terminal NSEC3 even if there wasn't an NSEC3 RRset to be removed in dns_nsec3_delnsec3.	2023-04-25 05:03:12 +01:00
Aram Sargsyan	dfaecfd752	Implement new -T options for xfer system tests '-T transferinsecs' makes named interpret the max-transfer-time-out, max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in configuration options as seconds instead of minutes. '-T transferslowly' makes named sleep for one second for every xfrout message. '-T transferstuck' makes named sleep for one minute for every xfrout message.	2023-04-21 12:53:02 +02:00
Ondřej Surý	d2377f8e04	Implement maximum global and idle time for incoming XFR After the dns_xfrin was changed to use network manager, the maximum global (max-transfer-time-in) and idle (max-transfer-idle-in) times for incoming transfers were turned inoperational because of missing implementation. Restore this functionality by implementing the timers for the incoming transfers.	2023-04-21 12:53:02 +02:00
Evan Hunt	2269a3e6fb	check for invalid protocol when dispatch fails treat ISC_R_INVALIDPROTO as a networking error when it occurs.	2023-04-21 12:42:11 +02:00
Evan Hunt	0393b54afb	add a result code for ENOPROTOOPT, EPROTONOSUPPORT there was no isc_result_t value for invalid protocol errors that could be returned from libuv.	2023-04-21 12:42:10 +02:00
Ondřej Surý	b497e90179	Add isc_spinlock unit with shim pthread_spin implementation The spinlock is a small (atomic_uint_fast32_t at most), lightweight synchronization primitive and should only be used for short-lived locking; most of the time an isc_mutex should be used. Add an isc_spinlock unit which is either (most of the time) a thin wrapper around pthread_spin API or an efficient shim implementation of the simple spinlock.	2023-04-21 12:10:02 +02:00
Ondřej Surý	3b10814569	Fix the streaming read callback shutdown logic When shutting down TCP sockets, the read callback calling logic was flawed, it would call either one less callback or one extra. Fix the logic in the way: 1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on the handle, the read callback will be called with ISC_R_CANCELED to cancel active reading from the socket/handle. 2. When isc_nm_read() has been called and isc_nm_read_stop() has been called on the handle, the read callback will be called with ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket is being shut down. 3. The .reading and .recv_read flags are a little bit tricky. The .reading flag indicates if the outer layer is reading the data (that would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream), the .recv_read flag indicates whether somebody is interested in the data read from the socket. Usually, you would expect that the .reading should be false when .recv_read is false, but it gets even more tricky with TLSStream as the TLS protocol might need to read from the socket even when sending data. Fix the usage of the .recv_read and .reading flags in the TLSStream to their true meaning - which mostly consists of using .recv_read everywhere and then wrapping isc_nm_read() and isc_nm_read_stop() with the .reading flag. 4. The TLS failed read helper has been modified to resemble the TCP code as much as possible, clearing and re-setting the .recv_read flag in the TCP timeout code has been fixed and .recv_read is now cleared when isc_nm_read_stop() has been called on the streaming socket. 5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and isc_httpd units has been greatly simplified due to the improved design. 6. More unit tests for TCP and TLS testing the shutdown conditions have been added. Co-authored-by: Ondřej Surý <ondrej@isc.org> Co-authored-by: Artem Boldariev <artem@isc.org>	2023-04-20 12:58:32 +02:00
Ondřej Surý	c8e8ccd026	Honour the source-port when retrying in dns_dispatch When retrying in the DNS dispatch, the local port would be forgotten on ISC_R_ADDRINUSE, keep the configured source-port even when retrying. Additionally, treat ISC_R_NOPERM same as ISC_R_ADDRINUSE. Closes: #3986	2023-04-20 10:57:20 +02:00
Ondřej Surý	0d48ac5a93	Handle the failure to send notify more gracefully and with log When dns_request_create() failed in notify_send_toaddr(), sending the notify would silently fail. When notify_done() failed, the error would be logged on the DEBUG(2) level. This commit remedies the situation by: * Promoting several messages related to notifies to INFO level and add a "success" log message at the INFO level * Adding a TCP fallback - when sending the notify over UDP fails, named will retry sending notify over TCP and log the information on the NOTICE level * When sending the notify over TCP fails, it will be logged on the WARNING level Closes: #4001, #4002	2023-04-20 10:09:53 +02:00
Matthijs Mekking	e752656a38	Add key state init debugging When debugging an issue it can be useful to see what BIND initially set the key states to.	2023-04-17 10:56:08 +02:00
Ondřej Surý	3df3b5efbd	Run the forward_cancel on the appropriate zone->loop If the zone forwards are canceled from dns_zonemgr_shutdown(), the forward_cancel() would get called from the main loop, which is wrong. It needs to be called from the matching zone->loop. Run the dns_request_cancel() via isc_async_run() on the loop associated with the zone instead of calling the dns_request_cancel() directly from the main loop.	2023-04-14 16:31:33 +02:00
Ondřej Surý	f677cf6b73	Remove unused netmgr->worker->sendbuf By inspecting the code, it was discovered that .sendbuf member of the isc__nm_networker_t was unused and just consuming ~64k per worker. Remove the member and the association allocation/deallocation.	2023-04-14 16:20:14 +02:00
Aram Sargsyan	d8a207bd00	Fix a use-after-free bug in dns_xfrin_create() 'xfr' is used after detaching the only reference, which would have destroyed the object. Call dns_xfrin_detach() only after the final use of 'xfr'.	2023-04-14 07:39:38 +00:00
Ondřej Surý	1715cad685	Refactor the isc_quota code and fix the quota in TCP accept code In `e185412872`, the TCP accept quota code became broken in a subtle way - the quota would get initialized on the first accept for the server socket and then deleted from the server socket, so it would never get applied again. Properly fixing this required a bigger refactoring of the isc_quota API code to make it much simpler. The new code decouples the ownership of the quota and acquiring/releasing the quota limit. After (during) the refactoring it became more clear that we need to use the callback from the child side of the accepted connection, and not the server side.	2023-04-12 14:10:37 +02:00
Ondřej Surý	1768522045	Convert tls_send() callback to use isc_job_run() The tls_send() was already using uvreq; convert this to use more direct isc_job_run() - the on-loop no-allocation method.	2023-04-12 14:10:37 +02:00
Ondřej Surý	1302345c93	Convert isc__nm_http_send() from isc_async_run() to isc_job_run() The isc__nm_http_send() was already using uvreq; convert this to use more direct isc_job_run() - the on-loop no-allocation method.	2023-04-12 14:10:37 +02:00
Ondřej Surý	3adba8ce23	Use isc_job_run() for reading from StreamDNS socket Change the reading in the StreamDNS code to use isc_job_run() instead of using isc_async_run() for less allocations and more streamlined execution.	2023-04-12 14:10:37 +02:00
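
Commit 4f97a679f0 above describes wrapping "the size of the struct, the size of the member, and the number of elements" into a single macro. The following is a minimal sketch of that idea, not the definition used in BIND: the macro name FLEX_SIZE, the intvec_t type, and the helper functions are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the STRUCT_FLEX_SIZE() macro named in the
 * commit message; the real definition in BIND may differ. sizeof does
 * not evaluate its operand, so 'ptr' may be NULL or uninitialized.
 */
#define FLEX_SIZE(ptr, member, count) \
	(sizeof(*(ptr)) + sizeof((ptr)->member[0]) * (count))

/* Illustrative type, not taken from BIND. */
typedef struct intvec {
	size_t count;
	uint32_t items[]; /* flexible array member */
} intvec_t;

/* Allocate a zeroed vector with room for 'count' items, or NULL. */
static intvec_t *
intvec_new(size_t count) {
	intvec_t *v = NULL;

	/* Refuse counts that would overflow the size computation. */
	if (count > (SIZE_MAX - sizeof(*v)) / sizeof(v->items[0])) {
		return NULL;
	}

	v = calloc(1, FLEX_SIZE(v, items, count));
	if (v != NULL) {
		v->count = count;
	}
	return v;
}

/* Store 'value' at index 'i'; the index must be in range. */
static void
intvec_set(intvec_t *v, size_t i, uint32_t value) {
	assert(i < v->count);
	v->items[i] = value;
}
```

A caller would write `intvec_t *v = intvec_new(4);`, fill it with `intvec_set()`, and release it with `free(v)`. Passing the pointer rather than the type name keeps the call site short and means the size computation follows the declaration if the element type changes.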

14779 Commits