Commit Graph

4558 Commits

Ondřej Surý
e08d3a7932 Check the result of dirfd() before calling unlinkat()
Instead of directly using the result of dirfd() in the unlinkat() call,
check whether the returned file descriptor is actually valid.  This
doesn't really change the logic, as unlinkat() would fail with an
invalid descriptor anyway, but it is cleaner and reports the right
error returned directly by dirfd() instead of EBADF from unlinkat().

(cherry picked from commit 59f4fdebc0)
2024-08-19 11:23:05 +00:00
Ondřej Surý
bd8a1abc80 Remove code to read and parse /proc/net/if_inet6 on Linux
The getifaddr() approach has worked fine for years, so we no longer
need to keep the callback that parses /proc/net/if_inet6.

(cherry picked from commit 2fbf9757b8)
2024-08-19 09:46:07 +00:00
Ondřej Surý
e707ee0946 Ignore errno returned from rewind() in the interface iterator
clang-scan 19 reported that we ignore errno after the call to
rewind().  As we don't really care about the result, just silence the
error; the whole code will be removed in the development version
anyway, as it is not needed.

(cherry picked from commit dda5ba53df)
2024-08-19 09:46:07 +00:00
Ondřej Surý
acabe271c5 Disassociate the SSL object from the cached SSL_SESSION
When the SSL object was destroyed, it would invalidate all SSL_SESSION
objects including the cached, but not yet used, TLS session objects.

Properly disassociate the SSL object from the SSL_SESSION before we
store it in the TLS session cache, so we can later destroy it without
invalidating the cached TLS sessions.

Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
Co-authored-by: Aram Sargsyan <aram@isc.org>
(cherry picked from commit c11b736e44)
2024-08-07 16:01:03 +00:00
Ondřej Surý
875755d9ea Attach/detach to the listening child socket when accepting TLS
When a TLS (TLSstream) connection was accepted, the child listening
socket was not attached to sock->server and thus it could have been
freed before all the accepted connections were actually closed.

In turn, this would cause us to call isc_tls_free() too soon, causing
cascading errors in pending SSL_read_ex() calls in the accepted
connections.

Properly attach and detach the child listening socket when accepting
and closing the server connections.

(cherry picked from commit 684f3eb8e6)
2024-08-07 17:20:03 +02:00
Ondřej Surý
9615f5b348 Don't loop indefinitely when isc_task quantum is 'unlimited'
Don't run more events than were already scheduled.  If the quantum is
set to a high value, task_run() would execute the already scheduled
events plus all new events that result from running event->ev_action().

Setting the quantum to the number of scheduled events postpones events
scheduled after we enter the loop here to the next task_run()
invocation.
2024-08-07 08:27:15 +02:00
Ondřej Surý
236de53c52 Use EXIT_SUCCESS and EXIT_FAILURE
Instead of randomly using -1 or 1 as a failure status, properly use
the EXIT_FAILURE define, which is platform specific (as it should be).

(cherry picked from commit 76997983fde02d9c32aa23bda30b65f1ebd4178c)
2024-08-06 15:19:06 +02:00
JINMEI Tatuya
b9bef2cc89 add a trivial wrapper for uv_stream_get_write_queue_size 2024-08-05 10:27:37 +00:00
Mark Andrews
2994d6d700 Properly compute the physical memory size
On a 32-bit machine, casting to size_t can still lead to an overflow.
Cast to uint64_t instead.  Also detect all possible negative values
for pages and pagesize to silence a warning about possible negative
values.

    39#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    	1. tainted_data_return: Called function sysconf(_SC_PHYS_PAGES),
           and a possible return value may be less than zero.
    	2. assign: Assigning: pages = sysconf(_SC_PHYS_PAGES).
    40        long pages = sysconf(_SC_PHYS_PAGES);
    41        long pagesize = sysconf(_SC_PAGESIZE);
    42
    	3. Condition pages == -1, taking false branch.
    	4. Condition pagesize == -1, taking false branch.
    43        if (pages == -1 || pagesize == -1) {
    44                return (0);
    45        }
    46
    	5. overflow: The expression (size_t)pages * pagesize might be negative,
           but is used in a context that treats it as unsigned.

    CID 498034: (#1 of 1): Overflowed return value (INTEGER_OVERFLOW)
    6. return_overflow: (size_t)pages * pagesize, which might have underflowed,
       is returned from the function.
    47        return ((size_t)pages * pagesize);
    48#endif /* if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE) */

(cherry picked from commit e8dbc5db92)
2024-07-31 07:30:39 +00:00
Artem Boldariev
c33b3d26f6 TCP/TLS DNS: unthrottle only when all input data is processed
This commit ensures that we restart reading only when all DNS data in
the input buffer has been processed, so that we will not get into a
situation where the buffer is overrun.
2024-07-03 15:08:01 +02:00
Ondřej Surý
4b7c61381f Throttle the reading when writes are asynchronous
Be more aggressive when throttling the reading: when we can't send the
outgoing TCP data synchronously with uv_try_write(), we start
throttling the reading immediately instead of waiting for the send
buffers to fill up.

This should not affect well-behaved clients that read the data from
the TCP connection on the other end.

(cherry picked from commit bc3e713317)
2024-07-03 09:10:20 +02:00
Artem Boldariev
d4b1f7f239 Use smaller pools of requests and handles for sockets
This commit ensures that socket objects use smaller sizes for their
internal request and handle pools.  That prevents the memory allocator
from thrashing.
2024-06-18 17:54:17 +03:00
Artem Boldariev
16c1d1eb2e Avoid indefinite send re-scheduling in TLS DNS
When a peer is not reading the data we are sending, it was possible
for the TLS DNS code to end up in a situation where it would
indefinitely reschedule send requests, effectively turning the
'uv_loop' into a busy loop that would consume CPU cycles in endless
efforts to send outgoing data.

The main reason for that was having only one send buffer dedicated to
sends: the code would re-queue sends until it was empty, which would
never happen while the remote side was not reading data.

That seems like an omission from the older days of the Network
Manager, as it is quite simple to make the code use multiple buffers
for sends.  That ultimately breaks the cycle of futile send request
rescheduling.

As a side effect, this commit also gets rid of one memory copy on a
hot path.
2024-06-18 11:58:59 +03:00
Artem Boldariev
c71a61c44b Introduce TCP throttling into TLS DNS code
Throttling functionality was omitted from c6f13f12cd.  This commit
fixes that, taking into account the latest developments in this area.
2024-06-18 11:58:59 +03:00
Artem Boldariev
eb4678e0b8 Do not un-throttle TCP connections on isc_nm_read()
Due to an omission, it was possible to un-throttle a TCP connection
previously throttled because the peer was not reading back the data we
were sending.

In particular, that affected the DoH code, but it could also affect
other transports (current or future ones) that pause/resume reading
according to their internal state.

(cherry picked from commit d228aa8bbb944fbd04baf22d151fde5c33561e26)
2024-06-18 11:58:59 +03:00
Ondřej Surý
964891a794 Limit the number of DNS message processed from a single TCP read
A single TCP read can produce as many DNS messages as 64k divided by
the minimum size of a DNS message.  This can clog the processing
thread and thrash the memory allocator, because we need to do as many
as ~20k allocations in a single UV loop tick.

Limit the number of DNS messages processed in a single UV loop tick to
a single DNS message, and limit the number of outstanding DNS messages
back to 23.  This effectively limits the number of pipelined DNS
messages to that number (the limit we already had before).

This reverts commit 780a89012d.
2024-06-10 18:43:46 +02:00
Ondřej Surý
c6f13f12cd Throttle reading from TCP if the sends are not getting through
When a TCP client would not read the DNS messages sent to it, the TCP
sends inside named would accumulate and cause degradation of the
service.  Throttle reading from the TCP socket when we accumulate
enough DNS data to be sent.  Currently the limit is set so that a
single largest possible DNS message can fit into the buffer.

(cherry picked from commit 26006f7b44474819fac2a76dc6cd6f69f0d76828)
2024-06-10 18:43:44 +02:00
Artem Boldariev
998522e68e Keep the endpoints set reference within an HTTP/2 socket
This commit ensures that an HTTP endpoints set reference is stored in
a socket object associated with an HTTP/2 stream instead of
referencing the global set stored inside a listener.

This helps to prevent an issue like the following:

1. BIND is configured to serve DoH clients;
2. A client connects and one or more HTTP/2 streams are
created. Internal pointers now point to the data in the
associated HTTP endpoints set;
3. BIND is reconfigured - a new endpoints set object is created and
promoted to all listeners;
4. The old pointers to the HTTP endpoints set data are now invalid.

Instead of referencing a global object that is updated on
re-configuration, we now store a local reference, which prevents the
endpoints set objects from going out of scope prematurely.

(cherry picked from commit b9b5d0c01a3a546c4a6a8b3bff8ae9dd31fee224)
2024-06-10 18:35:18 +02:00
Artem Boldariev
b601a5b781 DoH: avoid potential use after free for HTTP/2 session objects
It was reported that the HTTP/2 session might get closed or even
deleted before all asynchronous processing has been completed.

This commit addresses that: we now avoid using the object when we do
not need it, specifically check that the pointers used are not 'NULL',
and ensure that there is at least one reference to the session object
while we are processing incoming data.

This commit makes the code more resilient to such issues in the
future.

(cherry picked from commit 0cca550dff403c6100b7c0da8f252e7967765ba7)
2024-06-10 18:35:16 +02:00
Mark Andrews
e0af62deac Add helper function isc_sockaddr_disabled
(cherry picked from commit 9be1873ef3)
2024-06-03 13:52:37 +00:00
Matthijs Mekking
e1a49ee6d4 Call reset_shutdown if uv_tcp_close_reset failed
If uv_tcp_close_reset() returns an error code, this means the
reset_shutdown callback has not been issued, so do it now.

(cherry picked from commit c40e5c8653)
2024-06-03 08:16:32 +00:00
Matthijs Mekking
6f6d90fd51 Do not runtime check uv_tcp_close_reset
When we reset a TCP connection by sending an RST packet, do not bother
checking that the result is a success code.

(cherry picked from commit 5b94bb2129)
2024-06-03 08:16:32 +00:00
Aydın Mercan
dc9f55da5b increase TCP4Clients/TCP6Clients after point of no failure
Failing to accept TCP/TLS connections in 9.18 detaches the quota in
isc__nm_failed_accept_cb, causing the TCP4Clients and TCP6Clients
statistics not to decrease during cleanup.

Fix this by increasing the counter after the point of no failure but
before the point at which handling statistics through the client's
socket is no longer valid.
2024-05-30 13:39:23 +03:00
Mark Andrews
26b6ce9a56 Clang-format header file changes 2024-05-17 16:21:35 -07:00
Aram Sargsyan
0a48252b53 Fix a data race in isc_task_purgeevent()
When isc_task_purgeevent() is called for an 'event', the event could,
in the meanwhile, in theory get processed, unlinked, and freed.
When the function then operates on the 'event', it causes a
segmentation fault.

The only place where isc_task_purgeevent() is called is from
timer_purge().

In order to resolve the data race, call isc_task_purgeevent() inside
the 'timer->lock' locked block, so that timerevent_destroy() won't
be able to destroy the event if it was processed in the meanwhile,
before isc_task_purgeevent() had a chance to purge it.

In order to be able to do that, move the responsibility of calling
isc_event_free() (upon a successful purge) out from the
isc_task_purgeevent() function to its caller instead, so that it can
be called outside of the timer->lock locked block.
2024-05-17 12:08:27 +00:00
Aram Sargsyan
857f6adaec Test a race condition between isc_timer_purge() and isc_event_free()
Let basic_tick() of 'task1' and 'basic_quick' of 'task4' run in
different threads, and insert an artificial delay in timer_purge()
to cause an existing race condition to appear.
2024-05-17 10:49:57 +00:00
Aram Sargsyan
c7b15f1f5a Expose internal timer_purge() as isc_timer_purge()
This function is used in a unit test to check for data races.
2024-05-17 10:49:57 +00:00
Michal Nowak
ea413a6fae Update sources to Clang 18 formatting
(cherry picked from commit f454fa6dea)
2024-04-23 12:48:56 +00:00
Aydın Mercan
abc47f5ce4 Expose the TCP client count in statistics channel
The statistics channel does not expose the current number of connected
TCP clients, only the highwater mark.  Therefore, users had no easy
means to collect statistics about TCP clients served over time.  This
information could only be measured through a separate mechanism, via
rndc, by looking at how much of the TCP quota was filled.

In order to expose the exact current count of connected TCP clients
(tracked by the "tcp-clients" quota) as a statistics counter, an
extra, dedicated Network Manager callback would need to be
implemented for that purpose (a counterpart of ns__client_tcpconn()
that would be run when a TCP connection is torn down), which is
inefficient. Instead, track the number of currently-connected TCP
clients separately for IPv4 and IPv6, as Network Manager statistics.

(cherry picked from commit 2690dc48d3)
2024-02-27 11:04:28 +03:00
Michał Kępień
4ad3c694f1 Merge tag 'v9.18.24' into bind-9.18
BIND 9.18.24
2024-02-14 13:35:19 +01:00
Ondřej Surý
c462d65b2f Fix case insensitive matching in isc_ht hash table implementation
The case insensitive matching in isc_ht was basically completely broken
as only the hashvalue computation was case insensitive, but the key
comparison was always case sensitive.

(cherry picked from commit ec11aa2836)
2024-02-11 11:23:28 +01:00
Ondřej Surý
ec11aa2836 Fix case insensitive matching in isc_ht hash table implementation
The case insensitive matching in isc_ht was basically completely broken
as only the hashvalue computation was case insensitive, but the key
comparison was always case sensitive.

(cherry picked from commit 34ae6916f115fc291865857509433f95c2bc0871)
2024-02-11 09:39:19 +01:00
Ondřej Surý
1b3b0cef22 Split fast and slow task queues
Change the taskmgr (and thus netmgr) in a way that it supports fast and
slow task queues.  The fast queue is used for incoming DNS traffic and
it will pass the processing to the slow queue for sending outgoing DNS
messages and processing resolver messages.

In the future, more tasks might get moved to the slow queues, so the
cached and authoritative DNS traffic can be handled without being slowed
down by operations that take longer time to process.
2024-02-01 21:47:29 +01:00
Aydın Mercan
afb0b3971c Forward declare mallocx in isc/mem.h
cmocka.h and jemalloc.h/malloc_np.h have conflicting macro
definitions.  While fixing them with push_macro is done below for
malloc only, we only need the non-standard mallocx interface, which is
easy to define ourselves.

(cherry picked from commit 197de93bdc)
2024-01-18 10:40:46 +01:00
Ondřej Surý
f82f4d1d77 Add workaround for jemalloc linking order
Because we don't use jemalloc functions directly, but only via the
libisc library, the dynamic linker might pull in the jemalloc library
too late, after memory has already been allocated via the standard
libc allocator.

Add a workaround around isc_mem_create() that makes the dynamic linker
pull in jemalloc earlier than libc.

(cherry picked from commit 41a0ee1071)
2024-01-18 10:40:46 +01:00
Artem Boldariev
7b390a7fb6 Fix reading extra messages in TLS DNS in client mode
When connecting to a remote party, the TLS DNS code could process more
than one message at a time, despite the expectation that we should
stop after every DNS message.

Every DNS message is handled and consumed from the input buffer by
isc__nm_process_sock_buffer(). However, as opposed to the TCP DNS
code, it can be called more than once when processing incoming data
from a server (see tls_cycle_input()). That, in turn, means that we
can process more than one message at a time. Some higher-level code
might not expect that, as it breaks the contract.

In particular, in the original report, that happened during an
isc__nm_async_tlsdnsshutdown() call: when shutting down, multiple
calls to tls_cycle() are possible (each possibly leading to an
isc__nm_process_sock_buffer() call). If there are any unprocessed
messages left, the read callback will be called for each of them even
though it is not expected, as there was no preceding isc_nm_read().

To keep TCP DNS and TLS DNS code in sync, we make a similar change to
it as well, although it should not matter.
2024-01-17 22:35:25 +02:00
Aydın Mercan
a83c749115 Use <isc/atomic.h> instead of <stdatomic.h> directly in <isc/types.h> 2024-01-03 20:36:35 +03:00
Aydın Mercan
6c0ae4ef6e Move atomic statscounter next to the non-atomic definition
(cherry picked from commit 9c4dd863a6)
2024-01-03 20:36:35 +03:00
Aydın Mercan
9601763943 Use a non-atomic counter when passing to stats dumper
(cherry picked from commit bb96142a17)
2024-01-03 20:36:35 +03:00
Petr Špaček
d33b0f9ddb Avoid overflow during statistics dump
Related: !1493
Fixes: #4467
(cherry picked from commit 7b0115e331)
2024-01-03 20:10:27 +03:00
Mark Andrews
adfb365602 NetBSD has added 'hmac' to libc so rename our uses of hmac
(cherry picked from commit fd077c2661)
2023-12-14 11:14:04 +11:00
Mark Andrews
3aaf20a2dc Ineffective DbC protections
Dereference before NULL checks.  Thanks to Eric Sesterhenn from X41
D-Sec GmbH for reporting this.

(cherry picked from commit decc17d3b0)
2023-12-06 09:01:05 +11:00
Ondřej Surý
6a85e79c0b Reformat sources with up-to-date clang-format-17 2023-11-13 17:13:07 +01:00
Michał Kępień
e974f98eb4 Improve stability of the jemalloc workaround
When jemalloc is linked into BIND 9 binaries (rather than preloaded or
used as the system allocator), depending on the decisions made by the
linker, the malloc() symbol may be resolved to a non-jemalloc
implementation at runtime.  Such a scenario foils the workaround added
in commit 2da371d005 as it relies on the
jemalloc implementation of malloc() to be executed.

Handle the above scenario properly by calling mallocx() explicitly
instead of relying on the runtime resolution of the malloc() symbol.
Use trivial wrapper functions to avoid the need to copy multiple #ifdef
lines from lib/isc/mem.c to lib/isc/trampoline.c.  Using a simpler
alternative, e.g. calling isc_mem_create() & isc_mem_destroy(), was
already considered before and rejected, as described in the log message
for commit 2da371d005.

ADJUST_ZERO_ALLOCATION_SIZE() is only used in isc__mem_free_noctx() to
concisely avoid compilation warnings about its 'size' parameter not
being used when building against jemalloc < 4.0.0 (as sdallocx() is then
redefined to dallocx(), which has a different signature).
2023-11-01 18:04:07 +01:00
Mark Andrews
ebfbad29c1 Add parentheses around macro argument 'msec'
This is needed to ensure that the multiplication is done correctly.
This was reported by Jinmei Tatuya.
2023-10-20 10:30:48 +11:00
Michal Nowak
7c6632e174 Update the source code formatting using clang-format-17 2023-10-18 09:02:57 +02:00
Ondřej Surý
818f4dc3a7 Explicitly cast chars to unsigned chars for <ctype.h> functions
Apply the semantic patch to catch all the places where we pass 'char' to
the <ctype.h> family of functions (isalpha() and friends, toupper(),
tolower()).

(cherry picked from commit 29caa6d1f0)
2023-09-22 17:01:59 +02:00
Artem Boldariev
18efa454a9 TLS DNS: fix 'isc__nm_uvreq_t' send request use after free error
In TLS DNS, it was possible for an 'isc__nm_uvreq_t' object to be used
twice when that was not intended, leading to use-after-free errors
and, in turn, to abort()s and/or crashes.

In particular, in the older version of the code, the following was
happening:

* 'tlsdns_send_direct()' is eventually called from
'isc__nm_async_tlsdnssend()' when sending data;

* 'tlsdns_send_direct()' could fail. If that happens early, close to
the beginning of the function, then the 'isc__nm_uvreq_t' object is
passed to 'isc__nm_failed_send_cb()' and eventually gets destroyed
normally without any negative consequences;

* However, if 'tlsdns_send_direct()' fails due to the call to
'tls_cycle()', then the send request object is re-queued to be
processed asynchronously later by a call to 'tlsdns_send_enqueue()'.

* The error processing code in 'tlsdns_send_direct()' was written
without accounting for the possibility that the send request object
had been re-queued in 'tlsdns_send_direct()', and, thus, it was passed
to 'isc__nm_failed_send_cb()' for asynchronous error processing.

* As a result of the asynchronous processing initiated by
`tlsdns_send_enqueue()`, the 'isc__nm_uvreq_t' send request object
eventually gets properly destroyed.

* Then, as a result of the asynchronous error processing initiated by
the 'isc__nm_failed_send_cb()' call, the send request object is used a
second time, even though it was destroyed in the previous step.

So we have a (relatively) obvious logical mistake here. Why
'tls_cycle()' was failing is of no particular importance, as I can see
many ways for it to fail both on TLS and networking levels. In
particular, during testing with 'dnsperf', I have encountered the
following two cases:

1. 'result == ISC_R_TLSERROR' - the remote side was wrapping up,
causing errors on TLS level (e.g. handshake has not been completed
yet);

2. 'result == ISC_R_CONNECTIONRESET' - the remote side closed the
connection after we have initiated the asynchronous send request
operation and reached the point when its processing started.

One could ask: why hadn't we encountered the error before? Because
'send()' is, in general, a faster operation than 'read()', we were
lucky enough not to hit this; it took dnsperf to make our code fail in
the particular way that triggers the problematic code path. It
requires closing the connection shortly after we have made an
asynchronous 'send()' attempt and passed the data down for processing
by the libuv code (or, to be precise, down our technology stack).
2023-09-08 11:14:22 +02:00
Mark Andrews
432a49a7b0 Limit isccc_cc_fromwire recursion depth
Named and rndc do not need a lot of recursion so the depth is
set to 10.
2023-09-07 19:50:27 +02:00
Mark Andrews
7f89f2d6bc Call ERR_clear_error on EVP_MD_fetch or EVP_##alg error
(cherry picked from commit 28adcf1831)
2023-09-06 15:47:05 +00:00