bind9

Author	SHA1	Message	Date
Ondřej Surý	e42cb1f198	Implement incremental hash table resizing in isc_ht Previously, an incremental hash table resizing was implemented for the dns_rbt_t hash table implementation. Using that as a base, implement the incremental hash table resizing also for isc_ht API hashtables: 1. During the resize, allocate the new hash table, but keep the old table unchanged. 2. In each lookup, delete, or iterator operation, check both tables. 3. Perform insertion operations only in the new table. 4. At each insertion also move <r> elements from the old table to the new table. 5. When all elements are removed from the old table, deallocate it. To ensure that the old table is completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (<r> + 1)/<r> during resizing. In our implementation <r> is equal to 1. The downside of this approach is that the old table and the new table could stay in memory for longer when there are no new insertions into the hash table for prolonged periods of time as the incremental rehashing happens only during the insertions.	2022-03-17 08:16:24 +01:00
Ondřej Surý	bfa4b9c141	Run .closehandle_cb asynchronously in nmhandle_detach_cb() When sock->closehandle_cb is set, we need to run nmhandle_detach_cb() asynchronously to ensure the correct order of processing multiple packets in isc__nm_process_sock_buffer(). When not run asynchronously, it would cause: a) out-of-order processing of the return codes from processbuffer(); b) stack growth because the next TCP DNS message read callback will be called from within the current TCP DNS message read callback. The sock->closehandle_cb is set to isc__nm_resume_processing() for TCP sockets which calls isc__nm_process_sock_buffer(). If the read callback (called from isc__nm_process_sock_buffer()->processbuffer()) doesn't attach to the nmhandle (f.e. because it wants to drop the processing or we send the response directly via uv_try_write()), the isc__nm_resume_processing() (via .closehandle_cb) would call isc__nm_process_sock_buffer() recursively. The below shortened code path shows how the stack can grow: 1: ns__client_request(handle, ...); 2: isc_nm_tcpdns_sequential(handle); 3: ns_query_start(client, handle); 4: query_lookup(qctx); 5: query_send(qctx->client); 6: isc__nmhandle_detach(&client->reqhandle); 7: nmhandle_detach_cb(&handle); 8: sock->closehandle_cb(sock); // isc__nm_resume_processing 9: isc__nm_process_sock_buffer(sock); 10: processbuffer(sock); // isc__nm_tcpdns_processbuffer 11: isc_nmhandle_attach(req->handle, &handle); 12: isc__nm_readcb(sock, req, ISC_R_SUCCESS); 13: isc__nm_async_readcb(NULL, ...); 14: uvreq->cb.recv(...); // ns__client_request Instead, if 'sock->closehandle_cb' is set, we need to detach the handle asynchronously in 'isc__nmhandle_detach', so that line 8 in the code flow above does not start this recursion. This ensures the correct order when processing multiple packets in the function 'isc__nm_process_sock_buffer()' and prevents the stack growth. When not run asynchronously, the out-of-order processing leaves the first TCP socket open until all requests on the stream have been processed. If pipelining is disabled on TCP via the `keep-response-order` configuration option, named would keep the first socket in a lingering CLOSE_WAIT state when the client sends an incomplete packet and then closes the connection from the client side.	2022-03-16 22:11:49 +01:00
Ondřej Surý	79b5ccbf34	Implement isc_interval_t on top of isc_time_t Change the isc_interval_t implementation from a separate data type and separate implementation to be a shim implementation on top of isc_time_t. The distinction between isc_interval_t and isc_time_t has been kept because they are semantically different - isc_interval_t is relative and isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to be freely interchangeable, f.e. this: isc_time_t t1; isc_interval_t interval; isc_time_t t2; isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2)); isc_time_subtract(t1, interval, t2); isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2)); to just: isc_time_t t1; isc_interval_t interval; isc_time_t t2; isc_time_subtract(t1, t2, interval); without introducing a whole set of new functions.	2022-03-14 13:00:05 -07:00
Ondřej Surý	e6ca2a651f	Refactor isc_timer_reset() use with semantic patch Add and apply semantic patch to remove expires argument from the isc_timer_reset() calls through the codebase.	2022-03-14 13:00:05 -07:00
Ondřej Surý	6437bcc488	Remove expires argument from isc_timer API The isc_timer_reset() now works only with intervals for once timers. This makes the API almost 1:1 compatible with the libuv timers making the further refactoring possible.	2022-03-14 13:00:05 -07:00
Ondřej Surý	27850a5ad2	Change isc_timer_reset() usage to never use expires argument There were two places where the expires argument (absolute isc_time_t value) was being used. Both places have been converted to use the relative interval argument in preparation for the simplification and refactoring of the isc_timer API.	2022-03-14 13:00:05 -07:00
Ondřej Surý	c259cecc90	Refactor isc_timer_create() to just create timer The isc_timer_create() function was a bit conflated. It could have been used to create a timer and start it at the same time. As there was a single place where this was done before (see the previous commit for nta.c), this was cleaned up and the isc_timer_create() function was changed to only create new timer.	2022-03-14 13:00:05 -07:00
Ondřej Surý	8fbb42c49c	Remove "a temporary hack, 'rndc timerpoke'" In 2002, "a temporary hack, 'rndc timerpoke'" was added. It's time for it to go, so it was removed.	2022-03-14 13:00:05 -07:00
Ondřej Surý	f4751a91f7	Remove unused isc_timer_touch() function The isc_timer_touch() was unused, just remove it.	2022-03-14 13:00:05 -07:00
Ondřej Surý	bbe1c06a8b	Remove isc_timertype_limited from isc_timer API The isc_timertype_limited timer type was never used (not even in tests). Remove isc_timertype_limited timer type before planned refactoring.	2022-03-14 13:00:05 -07:00
Ondřej Surý	49c804f8b7	Clean up the nmhandle attach/detach in httpd.c In httpd.c, the send callback can directly call the read callback without calling isc_nm_resumeread(). When the per-send timeout was added, this could lead to a use-after-free when shutting down named. Clean up the way we attach to .readhandle and .sendhandle, so there's assurance that .readhandle will always be non-NULL when reading and .sendhandle will always be non-NULL when sending. Additionally, it was found that the implementation ignored the "Connection: close" header and it worked only accidentally by closing the connection after the first read from the TCP socket. This has also been fixed.	2022-03-11 09:57:10 +01:00
Ondřej Surý	6ddac2d56d	On shutdown, reset the established TCP connections Previously, the established TCP connections (both client and server) would be gracefully closed waiting for the write timeout. Don't wait for TCP connections to gracefully shutdown, but directly reset them for faster shutdown.	2022-03-11 09:56:57 +01:00
Ondřej Surý	a761aa59e3	Change single write timer to per-send timers Previously, there was a single per-socket write timer that would get restarted for every new write. This turned out to be insufficient because the other side could keep resetting the timer while never reading back the responses. Change the single write timer to a per-send timer which would in turn reset the TCP connection on the first send timeout.	2022-03-11 09:56:57 +01:00
Ondřej Surý	f251d69eba	Remove usage of deprecated ATOMIC_VAR_INIT() macro The C17 standard deprecated the ATOMIC_VAR_INIT() macro (see [1]). Follow suit and remove the ATOMIC_VAR_INIT() usage in favor of simple assignment of the value as this is what all supported stdatomic.h implementations do anyway: * MacOSX.platform: #define ATOMIC_VAR_INIT(__v) {__v} * Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE) (VALUE) 1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf	2022-03-08 23:55:10 +01:00
Ondřej Surý	8fa27365ec	Make isc_ht_init() and isc_ht_iter_create() return void Previously, the function(s) in the commit subject could fail for various reasons - mostly allocation failures, or other functions returning different return code than ISC_R_SUCCESS. Now, the aforementioned function(s) cannot ever fail and they would always return ISC_R_SUCCESS. Change the function(s) to return void and remove the extra checks in the code that uses them.	2022-03-08 14:51:55 +01:00
Ondřej Surý	bbb4cdb92d	Make isc_heap_create() and isc_heap_insert() return void Previously, the function(s) in the commit subject could fail for various reasons - mostly allocation failures, or other functions returning different return code than ISC_R_SUCCESS. Now, the aforementioned function(s) cannot ever fail and they would always return ISC_R_SUCCESS. Change the function(s) to return void and remove the extra checks in the code that uses them.	2022-03-08 11:19:34 +01:00
Ondřej Surý	8098a58581	Set TCP maximum segment size to minimum size of 1220 Previously the socket code would set the TCPv6 maximum segment size to minimum value to prevent IP fragmentation for TCP. This was not yet implemented for the network manager. Implement network manager functions to set and use minimum MTU socket option and set the TCP_MAXSEG socket option for both IPv4 and IPv6 and use those to clamp the TCP maximum segment size for TCP, TCPDNS and TLSDNS layers in the network manager to 1220 bytes, that is 1280 (IPv6 minimum link MTU) minus 40 (IPv6 fixed header) minus 20 (TCP fixed header). We already rely on a similar value for UDP to prevent IP fragmentation and it makes sense to use the same value for IPv4 and IPv6 because the modern networks are required to support IPv6 packet sizes. If there's need for small TCP segment values, the MTU on the interfaces needs to be properly configured.	2022-03-08 10:27:05 +01:00
Ondřej Surý	5d34a14f22	Set minimum MTU (1280) on IPv6 sockets The IPV6_USE_MIN_MTU socket option directs the IP layer to limit the IPv6 packet size to the minimum required supported MTU from the base IPv6 specification, i.e. 1280 bytes. Many implementations of TCP running over IPv6 neglect to check the IPV6_USE_MIN_MTU value when performing MSS negotiation and when constructing a TCP segment despite MSS being defined to be the MTU less the IP and TCP header sizes (60 bytes for IPv6). This leads to oversized IPv6 packets being sent resulting in unintended Path Maximum Transmission Unit Discovery (PMTUD) being performed and to fragmented IPv6 packets being sent. Add and use a function to set socket option to limit the MTU on IPv6 sockets to the minimum MTU (1280) both for UDP and TCP.	2022-03-08 10:27:05 +01:00
Ondřej Surý	6bd025942c	Replace netievent lock-free queue with simple locked queue The current implementation of isc_queue uses a Michael-Scott lock-free queue that in turn uses hazard pointers. It was discovered that, the way we use the isc_queue, such a complicated mechanism isn't really needed, because most of the time, we either execute the work directly when on nmthread (in case of UDP) or schedule the work from the matching nmthreads. Replace the current implementation of the isc_queue with a simple locked ISC_LIST. There's a slight improvement - since copying the whole list is very lightweight - we move the queue into a new list before we start the processing, locking just for moving the queue and not for every single item on the list. NOTE: There's room for future improvements - since we don't guarantee the order in which the netievents are processed, we could have two lists - one unlocked that would be used when scheduling the work from the matching thread and one locked that would be used from non-matching thread.	2022-03-04 13:49:51 +01:00
Aram Sargsyan	ef0d7177b6	Remove EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() shims LibreSSL 3.5.0 fails to compile with these shims. We could have just removed the LibreSSL check from the pre-processor condition, but it seems that these shims are no longer needed because all the supported versions of OpenSSL and LibreSSL have those functions. According to EVP_ENCRYPTINIT(3) manual page in LibreSSL, EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() first appeared in OpenSSL 0.9.8b, and have been available since OpenBSD 4.5.	2022-03-02 10:48:09 +00:00
Mark Andrews	4c356d2770	Grow the lex token buffer in one more place when parsing key pairs, if the '=' character fell at max_token a protective INSIST preventing buffer overrun could be triggered. Attempt to grow the buffer immediately before the INSIST. Also removed an unnecessary INSIST on the opening double quote of key buffer pair.	2022-03-01 16:05:39 -08:00
Ondřej Surý	b220fb32bd	Handle TCP sockets in isc__nmsocket_reset() The isc__nmsocket_reset() was missing a case for raw TCP sockets (used by RNDC and DoH) which would cause an assertion failure when the write timeout was triggered. TCP sockets are now also properly handled in isc__nmsocket_reset().	2022-02-28 02:06:03 -08:00
Ondřej Surý	ecf042991c	Fix typo __SANITIZE_ADDRESS -> __SANITIZE_ADDRESS__ When checking for Address Sanitizer to disable the inactivehandles caching, there was a typo in the macro.	2022-02-24 00:15:16 +01:00
Ondřej Surý	be339b3c83	Disable inactive uvreqs caching when compiled with sanitizers When isc__nm_uvreq_t gets deactivated, it could be just put onto array stack to be reused later to save some initialization time. Unfortunately, this might hide some use-after-free errors. Disable the inactive uvreqs caching when compiled with Address or Thread Sanitizer.	2022-02-24 00:15:16 +01:00
Ondřej Surý	92cce1da65	Disable inactive handles caching when compiled with sanitizers When isc_nmhandle_t gets deactivated, it could be just put onto array stack to be reused later to save some initialization time. Unfortunately, this might hide some use-after-free errors. Disable the inactive handles caching when compiled with Address or Thread Sanitizer.	2022-02-23 23:21:29 +01:00
Ondřej Surý	e2555a306f	Remove active handles tracking from isc__nmsocket_t The isc__nmsocket_t has a locked array of isc_nmhandle_t that's not used for anything. The isc__nmhandle_get() adds the isc_nmhandle_t to the locked array (resizing it if necessary), and the handle is removed from the array when isc_nmhandle_put() finally destroys it. That's all it does, so it serves no useful purpose. Remove the .ah_handles, .ah_size, and .ah_frees members of the isc__nmsocket_t and .ah_pos member of the isc_nmhandle_t struct.	2022-02-23 22:54:47 +01:00
Ondřej Surý	3268627916	Delay isc__nm_uvreq_t deallocation to connection callback When the TCP, TCPDNS or TLSDNS connection times out, the isc__nm_uvreq_t would be pushed into sock->inactivereqs before the uv_tcp_connect() callback finishes. Because the isc__nmsocket_t keeps the list of inactive isc__nm_uvreq_t, this would cause use-after-free only when the sock->inactivereqs is full (which could never happen because the failure happens in connection timeout callback) or when the sock->inactivereqs mechanism is completely removed (f.e. when running under Address or Thread Sanitizer). Delay isc__nm_uvreq_t deallocation to the connection callback and only signal the connection callback should be called by shutting down the libuv socket from the connection timeout callback.	2022-02-23 22:54:47 +01:00
Ondřej Surý	88418c3372	Properly free up enqueued netievents in nm_destroy() When the isc_netmgr is being destroyed, the normal and priority queues should be dequeued and netievents properly freed. This wasn't the case.	2022-02-23 22:51:12 +01:00
Ondřej Surý	d01562f22b	Remove the keep-response-order ACL map The keep-response-order option has been obsoleted. In this commit, remove the keep-response-order ACL map (rendering the option a no-op), the call to isc_nm_sequential(), and the now unused isc_nm_sequential() function itself.	2022-02-18 09:16:03 +01:00
Ondřej Surý	4f5b4662b6	Remove the limit on the number of simultaneous TCP queries There was an artificial limit of 23 on the number of simultaneous pipelined queries in a single TCP connection. The new network manager is capable of handling "unlimited" (limited only by the TCP read buffer size) queries, similar to the "unlimited" handling of the DNS queries received over UDP. Don't limit the number of TCP queries that we can process within a single TCP read callback.	2022-02-17 16:19:12 -08:00
Ondřej Surý	3c7b04d015	Add network manager based timer API This commit adds an API that allows creating arbitrary timers associated with the network manager handles.	2022-02-17 21:38:17 +01:00
Ondřej Surý	4716c56ebb	Reset the TCP connection when garbage is received When invalid DNS message is received, there was a handling mechanism for DoH that would be called to return proper HTTP response. Reuse this mechanism and reset the TCP connection when the client is blackholed, DNS message is completely bogus or the ns_client receives response instead of query.	2022-02-17 20:39:55 +01:00
Ondřej Surý	ee359d6ffa	Update writetimeout to be T_IDLE in netmgr_test.c Use the isc_nmhandle_setwritetimeout() function in the netmgr unit test to allow more time for writing and reading the responses because some of the intervals that are used in the unit tests are really small, leaving little room for any delays.	2022-02-17 09:06:58 +01:00
Ondřej Surý	a89d9e0fa6	Add isc_nmhandle_setwritetimeout() function In some situations (unit test and forthcoming XFR timeouts MR), we need to modify the write timeout independently of the read timeout. Add an isc_nmhandle_setwritetimeout() function that could be called before isc_nm_send() to specify a custom write timeout interval.	2022-02-17 09:06:58 +01:00
Ondřej Surý	408b362169	Add TCP, TCPDNS and TLSDNS write timer When the outgoing TCP write buffers are full because the other party is not reading the data, the uv_write() could wait indefinitely on the uv_loop and never call the callback. Add a new write timer that uses the `tcp-idle-timeout` value to interrupt the TCP connection when we are not able to send data for a defined period of time.	2022-02-17 09:06:58 +01:00
Ondřej Surý	cd3b58622c	Add uv_tcp_close_reset compat The uv_tcp_close_reset() function was added in libuv 1.32.0 and since we support older libuv releases, we have to add a shim uv_tcp_close_reset() implementation loosely based on libuv.	2022-02-17 09:06:58 +01:00
Ondřej Surý	45a73c113f	Rename sock->timer to sock->read_timer Before adding the write timer, we have to rename the generic sock->timer to sock->read_timer. We don't touch the function names to limit the impact of the refactoring.	2022-02-17 09:06:58 +01:00
Ondřej Surý	8715be1e4b	Use UV_RUNTIME_CHECK() as appropriate Replace the RUNTIME_CHECK() calls for libuv API calls with UV_RUNTIME_CHECK() to get more detailed error message when something fails and should not.	2022-02-16 11:16:57 +01:00
Ondřej Surý	62e15bb06d	Add UV_RUNTIME_CHECK() macro to print uv_strerror() When libuv functions fail, they return a meaningful return value that could be useful for more detailed debugging. Currently, we usually just check whether the return value is 0 and invoke an assertion error if it isn't, throwing away the details of why the call has failed. Unfortunately, this often happens on more exotic platforms. Add a UV_RUNTIME_CHECK() macro that can be used to print a more detailed error message (via uv_strerror()) before ending the execution of the program abruptly with the assertion.	2022-02-16 11:16:57 +01:00
Ondřej Surý	b9cb29076f	Log when starting and ending task exclusive mode The task exclusive mode stops all processing (tasks and networking IO) except the designated exclusive task events. This has impact on the operation of the server. Add log messages indicating when we start the exclusive mode, and when we end exclusive task mode.	2022-02-10 21:09:06 +01:00
Ondřej Surý	0893b5fb79	Assert if statistics counter underflows in the developer mode There are reported occurrences where the statistics counters underflow and start reporting nonsense. Add a check for the underflow when ``named`` is compiled in the developer mode.	2022-02-10 17:18:09 +01:00
Ondřej Surý	0500345513	Remove unused functions from isc_thread API The isc_thread_setaffinity call was removed in !5265 and we are not going to restore it because it was proven that the performance is better without it. Additionally, remove the already disabled cpu system test. The isc_thread_setconcurrency function is unused and also calling pthread_setconcurrency() on Linux has no meaning, formerly it was added because of Solaris in 2001 and it was removed when taskmgr was refactored to run on top of netmgr in !4918.	2022-02-09 17:22:06 +01:00
Ondřej Surý	2ae84702ad	Add log message when hard quota is reached in TCP accept When isc_quota_attach_cb() API returns ISC_R_QUOTA (meaning hard quota was reached) the accept_connection() would return without logging a message about quota reached. Change the connection callback to log the quota reached message.	2022-02-01 21:00:05 +01:00
Evan Hunt	d3fed6f400	update dlz_minimal.h the addition of support for ECS client information in DLZ modules omitted some necessary changes to build modules in contrib.	2022-01-27 15:48:50 -08:00
Petr Menšík	f00f521e9c	Use detected cache line size IBM power architecture has L1 cache line size equal to 128. Take advantage of that on that architecture, do not force more common value of 64. When it is possible to detect higher value, use that value instead. Keep the default to be 64.	2022-01-27 13:02:23 +01:00
Aram Sargsyan	81d3584116	Set the ephemeral certificate's "not before" a short time in the past TLS clients can have their clock a short time in the past, which will result in not being able to validate the certificate. Setting the "not before" property 5 minutes in the past will accommodate some possible clock skew across systems.	2022-01-25 09:09:35 +00:00
Ondřej Surý	b28327354d	Ignore the invalid L1 cache line size returned by sysconf() On some systems, the glibc can return 0 instead of cache-line size to indicate the cache line sizes cannot be determined. This is comment from glibc source code: /* In general we cannot determine these values. Therefore we return zero which indicates that no information is available. */ As the goal of the check is to determine whether the L1 cache line size is still 64 and we would use this value in case the sysconf() call is not available, we can also ignore the invalid values returned by the sysconf() call.	2022-01-22 16:59:50 +01:00
Ondřej Surý	b5e086257d	Explicitly enable IPV6_V6ONLY on the netmgr sockets Some operating systems (OpenBSD and DragonFly BSD) don't restrict the IPv6 sockets to sending and receiving IPv6 packets only. Explicitly enable the IPV6_V6ONLY socket option on the IPv6 sockets to prevent failures from using the IPv4-mapped IPv6 address.	2022-01-17 22:16:27 +01:00
Evan Hunt	be0bc24c7f	add UV_ENOTSUP to isc___nm_uverr2result() This error code is now mapped to ISC_R_FAMILYNOSUPPORT.	2022-01-17 11:45:10 +01:00
Artem Boldariev	ca9fe3559a	DoH: ensure that server_send_error_response() is used properly The server_send_error_response() function is supposed to be used only in case of failures and never in case of legitimate requests. Ensure that ISC_HTTP_ERROR_SUCCESS is never passed there by mistake.	2022-01-14 16:00:42 +02:00
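
The five resizing steps listed in commit e42cb1f198 can be sketched as a small chained hash table. This is only an illustration under stated assumptions, not the isc_ht code: every name below (ht_t, ht_init, ht_add, ht_find, migrate_one, start_resize) is hypothetical, keys are plain 32-bit integers, <r> is 1 as in the commit, the table doubles on resize (the minimum factor (<r> + 1)/<r> = 2), and deletion and iteration are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types for illustration; this is not the isc_ht API. */
typedef struct node {
	uint32_t key;
	int value;
	struct node *next;
} node_t;

typedef struct {
	node_t **buckets[2]; /* [cur] is the new table, [!cur] the old one */
	size_t size[2];      /* bucket counts; 0 means "no table allocated" */
	size_t count;        /* elements stored in both tables together */
	size_t hiter;        /* next old-table bucket to migrate from */
	int cur;
} ht_t;

static size_t slot(uint32_t key, size_t size) {
	return (key * 2654435761u) % size; /* simple multiplicative hash */
}

void ht_init(ht_t *ht) {
	*ht = (ht_t){ .cur = 0 };
	ht->size[0] = 4;
	ht->buckets[0] = calloc(ht->size[0], sizeof(node_t *));
	assert(ht->buckets[0] != NULL);
}

/* Step 4 (r = 1): move one element; step 5: free the old table once empty. */
static void migrate_one(ht_t *ht) {
	int old = !ht->cur;
	if (ht->size[old] == 0) {
		return; /* no resize in progress */
	}
	while (ht->hiter < ht->size[old] && ht->buckets[old][ht->hiter] == NULL) {
		ht->hiter++;
	}
	if (ht->hiter < ht->size[old]) {
		node_t *n = ht->buckets[old][ht->hiter];
		size_t s = slot(n->key, ht->size[ht->cur]);
		ht->buckets[old][ht->hiter] = n->next;
		n->next = ht->buckets[ht->cur][s];
		ht->buckets[ht->cur][s] = n;
		return;
	}
	free(ht->buckets[old]);
	ht->buckets[old] = NULL;
	ht->size[old] = 0;
}

/* Step 1: allocate the new table and keep the old one unchanged. */
static void start_resize(ht_t *ht) {
	while (ht->size[!ht->cur] != 0) {
		migrate_one(ht); /* make sure any previous resize has finished */
	}
	int next = !ht->cur;
	ht->size[next] = ht->size[ht->cur] * 2;
	ht->buckets[next] = calloc(ht->size[next], sizeof(node_t *));
	assert(ht->buckets[next] != NULL);
	ht->cur = next;
	ht->hiter = 0;
}

/* Step 2: a lookup checks both tables. */
int *ht_find(ht_t *ht, uint32_t key) {
	for (int t = 0; t < 2; t++) {
		if (ht->size[t] == 0) {
			continue;
		}
		for (node_t *n = ht->buckets[t][slot(key, ht->size[t])]; n; n = n->next) {
			if (n->key == key) {
				return &n->value;
			}
		}
	}
	return NULL;
}

/* Step 3: insert only into the new table, migrating one element per insert. */
bool ht_add(ht_t *ht, uint32_t key, int value) {
	if (ht_find(ht, key) != NULL) {
		return false;
	}
	if (ht->count >= ht->size[ht->cur]) {
		start_resize(ht);
	}
	migrate_one(ht);
	node_t *n = malloc(sizeof(*n));
	assert(n != NULL);
	size_t s = slot(key, ht->size[ht->cur]);
	*n = (node_t){ .key = key, .value = value, .next = ht->buckets[ht->cur][s] };
	ht->buckets[ht->cur][s] = n;
	ht->count++;
	return true;
}
```

Because nodes are relinked and not copied, a pointer returned by ht_find() stays valid across a resize in this sketch. As the commit message notes, migration only advances on insertions, so both tables stay allocated while no new elements are added.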

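The 1220-byte figure in commit 8098a58581 follows directly from the header sizes it lists. The sketch below spells out that arithmetic and shows how such a value could be applied with the TCP_MAXSEG socket option on a POSIX system. The names SKETCH_IPV6_MIN_MTU, SKETCH_IPV6_HEADER, SKETCH_TCP_HEADER, sketch_min_mss() and sketch_clamp_mss() are hypothetical and are not the network manager functions the commit adds.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical constants mirroring the numbers in the commit message. */
enum {
	SKETCH_IPV6_MIN_MTU = 1280, /* IPv6 minimum link MTU */
	SKETCH_IPV6_HEADER = 40,    /* IPv6 fixed header */
	SKETCH_TCP_HEADER = 20,     /* TCP fixed header */
};

/* 1280 - 40 - 20 = 1220 bytes of TCP payload per segment. */
int sketch_min_mss(void) {
	return SKETCH_IPV6_MIN_MTU - SKETCH_IPV6_HEADER - SKETCH_TCP_HEADER;
}

/*
 * Clamp the maximum segment size on a TCP socket.
 * Returns 0 on success and -1 on failure, like setsockopt() itself.
 */
int sketch_clamp_mss(int fd) {
	int mss = sketch_min_mss();
	return setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));
}
```

The same value is used for IPv4 and IPv6 sockets, as the commit message explains, so the helper takes no address-family argument.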

4376 Commits