Currently, the outgoing UDP sockets have SO_REUSEADDR (SO_REUSEPORT on
BSDs) enabled, which allows multiple UDP sockets to bind to the same
address+port. There's one caveat though - only a single socket (the
last one) is going to receive all the incoming traffic. This in turn
could lead to an incoming DNS message being matched to an invalid
dns_dispatch and getting dropped.
Disable SO_REUSEADDR on the outgoing UDP sockets. This
needs to be done explicitly because `uv_udp_open()` silently enables the
option on the socket.
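A minimal sketch of the explicit reset, assuming the libuv handle wraps
a plain POSIX socket; the helper name and error handling are
illustrative, not the actual BIND code:

    #include <sys/socket.h>
    #include <uv.h>

    /* Explicitly clear the option that uv_udp_open() silently enabled. */
    static int
    udp_disable_reuseaddr(uv_udp_t *udp) {
            uv_os_fd_t fd;
            int off = 0;

            if (uv_fileno((uv_handle_t *)udp, &fd) != 0) {
                    return (-1);
            }
    #if defined(SO_REUSEPORT)
            (void)setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &off,
                             sizeof(off));
    #endif
            return (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &off,
                               sizeof(off)));
    }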
(cherry picked from commit eec30c33c2)
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value seen by both
calls could be the same when they are executed at the same time.
(cherry picked from commit 88227ea665)
We currently set SO_INCOMING_CPU incorrectly, and testing by Ondrej
shows that fixing the issue and setting affinities is worse than letting
the kernel schedule threads without constraints. So we should not set
SO_INCOMING_CPU anymore.
(cherry picked from commit 8b8149cdd2)
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages. As those are not going to be
delivered anyway (as we argued when we stopped enlarging the operating
system send and receive buffers), try to send the UDP messages directly
using `uv_udp_try_send()` and if that fails, drop the outgoing UDP
message.
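A sketch of the intended pattern; the helper name and the buffer
ownership are illustrative, not the actual netmgr code:

    #include <stdlib.h>
    #include <uv.h>

    static void
    udp_send_or_drop(uv_udp_t *udp, uv_buf_t *buf,
                     const struct sockaddr *peer) {
            /* Best-effort synchronous send; never enqueue on the loop. */
            int r = uv_udp_try_send(udp, buf, 1, peer);

            if (r < 0) {
                    /* e.g. UV_EAGAIN: the OS queue is full, drop it. */
                    free(buf->base);
            }
    }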
(cherry picked from commit b576c4c977)
Administrators may wish to constrain the set of cores that BIND 9 runs
on via the 'taskset', 'cpuset' or 'numactl' programs (or equivalent on
other O/S), for example to achieve higher (or more stable) performance
by more closely associating threads with individual NIC rx queues. If
the admin has used taskset, it follows that BIND ought to
automatically use the given number of CPUs rather than the system-wide
count.
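A minimal Linux-only sketch of that idea (the helper name is an
illustration, not the actual BIND code):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <unistd.h>

    /* Prefer the size of the process affinity mask (as set by taskset,
     * cpuset or numactl) over the system-wide number of CPUs. */
    static long
    available_cpus(void) {
            cpu_set_t set;

            if (sched_getaffinity(0, sizeof(set), &set) == 0) {
                    return (CPU_COUNT(&set));
            }
            return (sysconf(_SC_NPROCESSORS_ONLN));
    }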
Co-Authored-By: Ray Bellis <ray@isc.org>
(cherry picked from commit 5a2df8caf5)
Although the manual page of malloc_usable_size says:
Although the excess bytes can be over‐written by the application
without ill effects, this is not good programming practice: the
number of excess bytes in an allocation depends on the underlying
implementation.
it looks like the premise is broken with _FORTIFY_SOURCE=3 on newer
systems and it might return a value that causes the program to stop
with "buffer overflow detected" from _FORTIFY_SOURCE. As we have our
own implementation that tracks the allocation size, we can stop relying
on this introspection function.
Also the newer manual page for malloc_usable_size changed the NOTES to:
The value returned by malloc_usable_size() may be greater than the
requested size of the allocation because of various internal
implementation details, none of which the programmer should rely on.
This function is intended to only be used for diagnostics and
statistics; writing to the excess memory without first calling
realloc(3) to resize the allocation is not supported. The returned
value is only valid at the time of the call.
Remove usage of both malloc_usable_size() and malloc_size() to be on
the safe side and only use the internal size tracking mechanism when
jemalloc is not available.
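A rough sketch of such a size-tracking fallback; this is simplified and
the names are illustrative, not the actual isc_mem implementation:

    #include <stddef.h>
    #include <stdlib.h>

    /* Keep the requested size in a header in front of the returned
     * pointer instead of asking malloc_usable_size()/malloc_size(). */
    typedef union {
            size_t size;
            max_align_t alignment; /* keep the payload suitably aligned */
    } size_header_t;

    static void *
    sized_allocate(size_t size) {
            size_header_t *hdr = malloc(sizeof(*hdr) + size);

            if (hdr == NULL) {
                    return (NULL);
            }
            hdr->size = size;
            return (hdr + 1);
    }

    static void
    sized_free(void *ptr) {
            if (ptr != NULL) {
                    free((size_header_t *)ptr - 1);
            }
    }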
(cherry picked from commit d61712d14e)
The ISC_ATTR_UNUSED macro was missing in BIND 9.18, which
complicated things when backporting merge requests from main.
As __attribute__((__unused__)) is ubiquitous, just define the
macro.
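For reference, the definition can be as simple as the following sketch
(the exact compiler guards in BIND may differ):

    #if defined(__GNUC__) || defined(__clang__)
    #define ISC_ATTR_UNUSED __attribute__((__unused__))
    #else
    #define ISC_ATTR_UNUSED
    #endif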
A new version of clang (19) has introduced stricter checks when mixing
integer (and float) types with enums. In this case, we used enum {}
because C17 doesn't have constexpr yet. Change the time conversion
constants to #defined constants because the RHEL 8 compiler doesn't
consider static const unsigned int to be a constant.
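A sketch of the change; the constant names are illustrative:

    /* Before: enum { MS_PER_SEC = 1000, US_PER_MS = 1000, ... };
     * clang 19 warns when such enumerators are mixed with integer or
     * float types, and the RHEL 8 compiler does not accept
     * `static const unsigned int` where a constant expression is
     * required, so use plain #defines instead. */
    #define MS_PER_SEC 1000U
    #define US_PER_MS  1000U
    #define NS_PER_US  1000U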
(cherry picked from commit b03e90e0d4)
Instead of directly using the result of dirfd() in the unlinkat() call,
check whether the returned file descriptor is actually valid. That
doesn't really change the logic as the unlinkat() would fail with
invalid descriptor anyway, but this is cleaner and will report the right
error returned directly by dirfd() instead of EBADF from unlinkat().
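A minimal sketch of the checked call (illustrative, not the exact BIND
code):

    #include <dirent.h>
    #include <unistd.h>

    static int
    remove_entry(DIR *dir, const char *name) {
            int fd = dirfd(dir);

            if (fd == -1) {
                    return (-1); /* errno was set by dirfd() itself */
            }
            return (unlinkat(fd, name, 0));
    }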
(cherry picked from commit 59f4fdebc0)
The getifaddr() code has worked fine for years, so we don't have to
keep the callback that parses /proc/net/if_inet6 anymore.
(cherry picked from commit 2fbf9757b8)
clang-scan 19 has reported that we are ignoring errno after the call
to rewind(). As we don't really care about the result, just silence
the error; the whole code will be removed in the development version
anyway as it is not needed.
(cherry picked from commit dda5ba53df)
When the SSL object was destroyed, it would invalidate all SSL_SESSION
objects including the cached, but not yet used, TLS session objects.
Properly disassociate the SSL object from the SSL_SESSION before we
store it in the TLS session cache, so we can later destroy it without
invalidating the cached TLS sessions.
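A hedged sketch of the idea using the public OpenSSL API; this is an
illustration of the disassociation step, not the exact BIND change:

    #include <openssl/ssl.h>

    /* Take our own reference to the session and drop the SSL object's
     * association with it, so destroying the SSL object later cannot
     * invalidate the cached session. */
    static SSL_SESSION *
    detach_session_for_cache(SSL *ssl) {
            SSL_SESSION *sess = SSL_get1_session(ssl); /* +1 reference */

            if (sess != NULL) {
                    SSL_set_session(ssl, NULL);
            }
            return (sess); /* the cache later calls SSL_SESSION_free() */
    }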
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
Co-authored-by: Aram Sargsyan <aram@isc.org>
(cherry picked from commit c11b736e44)
When a TLS (TLSstream) connection was accepted, the child listening
socket was not attached to sock->server and thus it could have been
freed before all the accepted connections were actually closed.
In turn, this would cause us to call isc_tls_free() too soon, causing
cascading errors in pending SSL_read_ex() calls in the accepted
connections.
Properly attach and detach the child listening socket when accepting
and closing the server connections.
(cherry picked from commit 684f3eb8e6)
Don't run more events than already scheduled. If the quantum is set to
a high value, task_run() would execute the already scheduled events,
plus all new events that result from running event->ev_action().
Setting the quantum to the number of scheduled events postpones events
scheduled after we enter the loop here to the next task_run()
invocation.
Instead of randomly using -1 or 1 as a failure status, properly utilize
the EXIT_FAILURE define that's platform specific (as it should be).
(cherry picked from commit 76997983fde02d9c32aa23bda30b65f1ebd4178c)
On a 32-bit machine casting to size_t can still lead to an overflow.
Cast to uint64_t instead. Also detect all possible negative values of
pages and pagesize to silence the warning about a possible negative
value.
    39 #if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
       1. tainted_data_return: Called function sysconf(_SC_PHYS_PAGES),
          and a possible return value may be less than zero.
       2. assign: Assigning: pages = sysconf(_SC_PHYS_PAGES).
    40         long pages = sysconf(_SC_PHYS_PAGES);
    41         long pagesize = sysconf(_SC_PAGESIZE);
    42
       3. Condition pages == -1, taking false branch.
       4. Condition pagesize == -1, taking false branch.
    43         if (pages == -1 || pagesize == -1) {
    44                 return (0);
    45         }
    46
       5. overflow: The expression (size_t)pages * pagesize might be
          negative, but is used in a context that treats it as unsigned.
    CID 498034: (#1 of 1): Overflowed return value (INTEGER_OVERFLOW)
       6. return_overflow: (size_t)pages * pagesize, which might have
          underflowed, is returned from the function.
    47         return ((size_t)pages * pagesize);
    48 #endif /* if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE) */
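A sketch of the fixed function following that advice (simplified; the
function name and return type in BIND may differ):

    #include <stdint.h>
    #include <unistd.h>

    static uint64_t
    physical_memory(void) {
    #if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
            long pages = sysconf(_SC_PHYS_PAGES);
            long pagesize = sysconf(_SC_PAGESIZE);

            /* Reject every possible negative (error) value, not just -1. */
            if (pages <= 0 || pagesize <= 0) {
                    return (0);
            }
            /* Multiply in 64 bits so a 32-bit size_t cannot overflow. */
            return ((uint64_t)pages * (uint64_t)pagesize);
    #else
            return (0);
    #endif
    }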
(cherry picked from commit e8dbc5db92)
This commit ensures that we restart reading only when all DNS data in
the input buffer is processed, so that we will not get into a
situation where the buffer is overrun.
Be more aggressive when throttling the reading - when we can't send the
outgoing TCP data synchronously with uv_try_write(), we start
throttling the reading immediately instead of waiting for the send
buffers to fill up. This should not affect well-behaved clients that
read the data from the TCP connection on the other end.
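An illustrative sketch of this pattern with plain libuv calls (the real
netmgr code is more involved; the names are not the actual functions):

    #include <uv.h>

    static void
    send_and_maybe_throttle(uv_stream_t *stream, uv_write_t *req,
                            uv_buf_t *buf, uv_write_cb done_cb) {
            int r = uv_try_write(stream, buf, 1);

            if (r == (int)buf->len) {
                    return; /* sent synchronously, keep reading */
            }
            if (r > 0) {
                    buf->base += r; /* partial write, queue the rest */
                    buf->len -= r;
            }
            uv_write(req, stream, buf, 1, done_cb);
            uv_read_stop(stream); /* throttle now, resume from done_cb */
    }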
(cherry picked from commit bc3e713317)
This commit ensures that socket objects use smaller sizes for their
internal request and handle pools. That prevents the memory allocator
from thrashing.
When a peer is not reading the data we are sending, it was possible
for the TLS DNS code to end up in a situation where it would
indefinitely reschedule send requests, effectively turning the
'uv_loop' into a busy loop that would consume CPU cycles in endless
efforts to send outgoing data.
The main reason for that was that only one send buffer was dedicated
to sends: the code would re-queue sends until the buffer was empty,
which would never happen when the remote side is not reading data.
That seems like an omission from the early days of the Network
Manager, as it is quite simple to make the code use multiple buffers
for sends. That ultimately breaks the cycle of futile send request
rescheduling.
As a side effect, this commit also gets rid of one memory copy on a
hot path.
Due to an omission it was possible to un-throttle a TCP connection
previously throttled due to the peer not reading back the data we are
sending.
In particular, that affected the DoH code, but it could also affect
other transports (current or future ones) that pause/resume reading
according to their internal state.
(cherry picked from commit d228aa8bbb944fbd04baf22d151fde5c33561e26)
A single TCP read can produce as many DNS messages as 64k divided by
the minimum size of a DNS message. This can clog the processing thread
and thrash the memory allocator because we need to do as many as ~20k
allocations in a single UV loop tick.
Limit the number of DNS messages processed in a single UV loop tick to
just a single DNS message, and limit the number of outstanding DNS
messages back to 23. This effectively limits the number of pipelined
DNS messages to that number (this is the limit we already had before).
This reverts commit 780a89012d.
When a TCP client does not read the DNS messages sent to it, the TCP
sends inside named would accumulate and cause degradation of the
service. Throttle the reading from the TCP socket when we accumulate
enough DNS data to be sent. Currently, the limit is set so that a
single largest-possible DNS message can still fit into the buffer.
(cherry picked from commit 26006f7b44474819fac2a76dc6cd6f69f0d76828)
This commit ensures that an HTTP endpoints set reference is stored in
a socket object associated with an HTTP/2 stream instead of
referencing the global set stored inside a listener.
This helps to prevent an issue like the following:
1. BIND is configured to serve DoH clients;
2. A client is connected and one or more HTTP/2 streams are
   created. Internal pointers now point to the data in the
   associated HTTP endpoints set;
3. BIND is reconfigured - a new endpoints set object is created and
   promoted to all listeners;
4. The old pointers to the HTTP endpoints set data are now invalid.
Instead of referencing a global object that is updated on
re-configuration, we now store a local reference, which prevents the
endpoints set objects from going out of scope prematurely.
(cherry picked from commit b9b5d0c01a3a546c4a6a8b3bff8ae9dd31fee224)
It was reported that an HTTP/2 session might get closed or even
deleted before all asynchronous processing has been completed.
This commit addresses that: we now avoid using the object when we do
not need it, specifically check that the pointers used are not 'NULL',
and ensure that there is at least one reference to the session object
while we are processing incoming data.
This commit makes the code more resilient to such issues in the
future.
(cherry picked from commit 0cca550dff403c6100b7c0da8f252e7967765ba7)
If uv_tcp_close_reset() returns an error code, this means the
reset_shutdown callback has not been issued, so do it now.
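A minimal sketch of that fallback; the callback name follows the text
above, the rest is illustrative:

    #include <uv.h>

    static void
    reset_shutdown(uv_handle_t *handle) {
            /* the existing shutdown path, stubbed out for the sketch */
            (void)handle;
    }

    static void
    close_with_reset(uv_tcp_t *handle) {
            if (uv_tcp_close_reset(handle, reset_shutdown) != 0) {
                    /* libuv will not issue the callback, so do it now */
                    reset_shutdown((uv_handle_t *)handle);
            }
    }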
(cherry picked from commit c40e5c8653)
Failing to accept TCP/TLS connections in 9.18 detaches the quota in
isc__nm_failed_accept_cb, causing TCP4Clients and TCP6Clients statistics
to not decrease inside cleanup.
Fix this by increasing the counter after the point where accepting can
no longer fail, but before the point where handling statistics through
the client's socket is no longer valid.
When isc_task_purgeevent() is called for an 'event', the event could,
in the meantime, in theory get processed, unlinked, and freed. So when
the function then operates on the 'event', it causes a segmentation
fault.
The only place where isc_task_purgeevent() is called is from
timer_purge().
In order to resolve the data race, call isc_task_purgeevent() inside
the 'timer->lock' locked block, so that timerevent_destroy() won't
be able to destroy the event if it was processed in the meanwhile,
before isc_task_purgeevent() had a chance to purge it.
In order to be able to do that, move the responsibility of calling
isc_event_free() (upon a successful purge) out from the
isc_task_purgeevent() function to its caller instead, so that it can
be called outside of the timer->lock locked block.
Let basic_tick() of 'task1' and 'basic_quick' of 'task4' run in
different threads, and insert an artificial delay in timer_purge()
to cause an existing race condition to appear.
The statistics channel does not expose the current number of connected
TCP clients, only the highwater value. Therefore, users did not have
an easy means to collect statistics about TCP clients served over
time. This information could only be measured through a separate
mechanism, via rndc, by looking at how much of the TCP quota is
filled.
In order to expose the exact current count of connected TCP clients
(tracked by the "tcp-clients" quota) as a statistics counter, an
extra, dedicated Network Manager callback would need to be
implemented for that purpose (a counterpart of ns__client_tcpconn()
that would be run when a TCP connection is torn down), which is
inefficient. Instead, track the number of currently-connected TCP
clients separately for IPv4 and IPv6, as Network Manager statistics.
(cherry picked from commit 2690dc48d3)
The case insensitive matching in isc_ht was basically completely broken
as only the hashvalue computation was case insensitive, but the key
comparison was always case sensitive.
(cherry picked from commit ec11aa2836)
The case insensitive matching in isc_ht was basically completely broken
as only the hashvalue computation was case insensitive, but the key
comparison was always case sensitive.
(cherry picked from commit 34ae6916f115fc291865857509433f95c2bc0871)
Change the taskmgr (and thus netmgr) so that it supports fast and
slow task queues. The fast queue is used for incoming DNS traffic and
it will pass the processing to the slow queue for sending outgoing DNS
messages and processing resolver messages.
In the future, more tasks might get moved to the slow queues, so the
cached and authoritative DNS traffic can be handled without being slowed
down by operations that take longer to process.
cmocka.h and jemalloc.h/malloc_np.h have conflicting macro definitions.
While fixing them with push_macro for malloc only is done below, we
only need the non-standard mallocx interface, which is easy to just
define ourselves (see the sketch below).
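A rough sketch of both options; the jemalloc header path and which
macro conflicts are assumptions here:

    /* Option 1: shield the conflicting macro around the jemalloc header. */
    #pragma push_macro("malloc")
    #undef malloc
    #include <jemalloc/jemalloc.h>
    #pragma pop_macro("malloc")

    /* Option 2: skip the header and declare the only interface we need. */
    #include <stddef.h> /* size_t */
    void *mallocx(size_t size, int flags);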
(cherry picked from commit 197de93bdc)
Because we don't use jemalloc functions directly, but only via the
libisc library, the dynamic linker might pull in the jemalloc library
too late, when memory has already been allocated via the standard libc
allocator.
Add a workaround around isc_mem_create() that makes the dynamic linker
pull in jemalloc earlier than libc.
(cherry picked from commit 41a0ee1071)
When connecting to a remote party the TLS DNS code could process more
than one message at a time despite the fact that it is expected that
we should stop after every DNS message.
Every DNS message is handled and consumed from the input buffer by
isc__nm_process_sock_buffer(). However, as opposed to TCP DNS code, it
can be called more than once when processing incoming data from a
server (see tls_cycle_input()). That, in turn, means that we can
process more than one message at a time. Some higher-level code might
not expect that, as it breaks the contract.
In particular, in the original report this happened during the
isc__nm_async_tlsdnsshutdown() call: when shutting down, multiple
calls to tls_cycle() are possible (each possibly leading to an
isc__nm_process_sock_buffer() call). If there are any unprocessed
messages left, the read callback will be called for each of them even
when it is not expected, as there was no preceding isc_nm_read() call.
To keep the TCP DNS and TLS DNS code in sync, we make a similar change
there as well, although it should not matter.