Commit Graph

4927 Commits

Author SHA1 Message Date
Ondřej Surý
97a9e4711c Remove code to read and parse /proc/net/if_inet6 on Linux
getifaddrs() has worked fine for years, so we don't have to
keep the callback to parse /proc/net/if_inet6 anymore.

(cherry picked from commit 2fbf9757b8)
2024-08-19 11:49:56 +00:00
Ondřej Surý
2a0454f881 Ignore errno returned from rewind() in the interface iterator
The clang-scan 19 has reported that we are ignoring errno after the call
to rewind().  As we don't really care about the result, just silence the
error; the whole code will be removed in the development version anyway,
as it is not needed.

(cherry picked from commit dda5ba53df)
2024-08-19 11:49:56 +00:00
Ondřej Surý
530f1dd913 Check the result of dirfd() before calling unlinkat()
Instead of directly using the result of dirfd() in the unlinkat() call,
check whether the returned file descriptor is actually valid.  That
doesn't really change the logic as the unlinkat() would fail with an
invalid descriptor anyway, but this is cleaner and will report the right
error returned directly by dirfd() instead of EBADF from unlinkat().

(cherry picked from commit 59f4fdebc0)
2024-08-19 10:03:08 +00:00
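The dirfd() check described in 530f1dd913 can be sketched roughly as follows. This is a minimal illustration, not the BIND 9 code: `remove_entry()` is a hypothetical helper name, and only the standard POSIX calls dirfd() and unlinkat() are taken from the commit message.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from BIND 9): remove `name` from the
 * directory stream `dir`.  Returns 0 on success, -1 with errno set.
 */
int
remove_entry(DIR *dir, const char *name) {
    assert(dir != NULL && name != NULL);

    /*
     * Validate the descriptor first, so that a dirfd() failure is
     * reported with its own errno instead of EBADF from unlinkat().
     */
    int fd = dirfd(dir);
    if (fd < 0) {
        return -1;
    }

    /* flags == 0 removes a regular file, like unlink(). */
    return unlinkat(fd, name, 0);
}
```

The outcome is the same as passing dirfd() straight into unlinkat(); the difference is only in which errno the caller sees when the descriptor is invalid.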
Ondřej Surý
dc4c0397eb Use constexpr for NS_PER_SEC and friends constants
The constexpr specifier introduced in the C23 standard makes perfect sense
to use instead of preprocessor macros - the symbols are kept, etc.  Define
ISC_CONSTEXPR to be `constexpr` for C23 and `static const` for the older
C standards.  Use the newly introduced macro for the NS_PER_SEC and
friends time constants.

(cherry picked from commit 122a142241)
2024-08-19 09:10:04 +00:00
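A sketch of what the ISC_CONSTEXPR approach from dc4c0397eb could look like. The exact version test and the list of "friends" are assumptions (US_PER_SEC and MS_PER_SEC are used here for illustration), and `seconds_to_ns()` is a hypothetical helper; only the macro name, NS_PER_SEC, and the `constexpr` / `static const` split come from the commit message.

```c
#include <assert.h>

/* Assumption: C23 is detected via __STDC_VERSION__ >= 202311L. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#define ISC_CONSTEXPR constexpr
#else
#define ISC_CONSTEXPR static const
#endif

/* Typed constants; unlike macros, the symbols survive preprocessing. */
ISC_CONSTEXPR unsigned int NS_PER_SEC = 1000 * 1000 * 1000;
ISC_CONSTEXPR unsigned int US_PER_SEC = 1000 * 1000;
ISC_CONSTEXPR unsigned int MS_PER_SEC = 1000;

/* Hypothetical helper showing the constant used in arithmetic. */
unsigned long long
seconds_to_ns(unsigned int seconds) {
    return (unsigned long long)seconds * NS_PER_SEC;
}
```

Compared with the earlier enum-based constants, the values have an explicit unsigned integer type, which avoids the enum/integer mixing that newer compilers warn about.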
Ondřej Surý
27a7647559 Change the NS_PER_SEC (and friends) from enum to static const
The new version of clang (19) has introduced stricter checks when mixing
integer (and float) types with enums.  In this case, we used enum {}
as C17 doesn't have constexpr yet.  Change the time conversion constants
to be static const unsigned int instead of enum values.

(cherry picked from commit b03e90e0d4)
2024-08-19 09:10:04 +00:00
Aram Sargsyan
864d55081e Check if logconfig is NULL before using it in isc_log_doit()
Check if 'lctx->logconfig' is NULL before using it in isc_log_doit(),
because it's possible that isc_log_destroy() was already called, e.g.
when a 'call_rcu' function wants to log a message during shutdown.

(cherry picked from commit 656e04f48a)
2024-08-15 14:27:29 +00:00
Ondřej Surý
14302330f4 Skip already rehashed positions in the old hashmap table
When iterating through the old internal hashmap table, skip all the
nodes that have been already migrated to the new table.  We know that
all positions with index less than .hiter are NULL.

(cherry picked from commit 3e4d153453)
2024-08-15 12:09:28 +00:00
Ondřej Surý
61b88c56cd Fix the assertion failure in the isc_hashmap iterator
When the round robin hashing reorders the map entries on deletion, we
were adjusting the iterator table size only when the reordering was
happening at the internal table boundary.  The iterator table size had
to be reduced by one to prevent seeing the entry that resided on
position [0] twice because it migrated to [iter->size - 1] position.

However, the same thing could happen when the same entry migrates a
second time from [iter->size - 1] to [iter->size - 2] position (and so
on) because the check that we are manipulating the entry just in the [0]
position was insufficient.  Instead of checking the position [pos == 0],
we now check that the [pos % iter->size == 0], thus ignoring all the
entries that might have moved back to the end of the internal table.

(cherry picked from commit acdc57259f)
2024-08-15 12:09:28 +00:00
Ondřej Surý
bbf34c0604 Disassociate the SSL object from the cached SSL_SESSION
When the SSL object was destroyed, it would invalidate all SSL_SESSION
objects including the cached, but not yet used, TLS session objects.

Properly disassociate the SSL object from the SSL_SESSION before we
store it in the TLS session cache, so we can later destroy it without
invalidating the cached TLS sessions.

Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
Co-authored-by: Aram Sargsyan <aram@isc.org>
(cherry picked from commit c11b736e44)
2024-08-07 15:25:29 +00:00
Ondřej Surý
c6daaa4b8c Attach/detach to the listening child socket when accepting TLS
When a TLS (TLSstream) connection was accepted, the child listening
socket was not attached to sock->server and thus it could have been
freed before all the accepted connections were actually closed.

In turn, this would cause us to call isc_tls_free() too soon - causing
cascading errors in pending SSL_read_ex() in the accepted connections.

Properly attach and detach the child listening socket when accepting
and closing the server connections.

(cherry picked from commit 684f3eb8e6)
2024-08-07 15:16:50 +00:00
Ondřej Surý
b0ba2b72e6 Call rcu_barrier() in the isc_mem_destroy() just once
The previous work in this area was led by the belief that we might be
calling call_rcu() from within call_rcu() callbacks.  After carefully
checking all the current callbacks, it became evident that this is not
the case and the problem isn't too few rcu_barrier() calls, but something
else entirely.

Call the rcu_barrier() just once as that's enough and the multiple
rcu_barrier() calls will not hide the real problem anymore, so we can
find it.

(cherry picked from commit 13941c8ca7)
2024-08-05 11:39:30 +00:00
Ondřej Surý
506138ec0f Fix the assertion failure when putting 48-bit number to buffer
When putting a 48-bit number into a fixed-size buffer that's exactly 6
bytes, an assertion failure would occur as the 48-bit number is
internally represented as a 64-bit number and the code was checking if
there is enough space for `sizeof(val)`.  This causes an assertion failure
when an otherwise valid TSIG signature has bad timing information.

Specify the size of the argument explicitly, so the 48-bit number
doesn't require an 8-byte long buffer.

(cherry picked from commit 37dbd57c16)
2024-08-05 11:11:40 +00:00
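The size check from 506138ec0f can be illustrated with a small stand-alone sketch. `put_uint48()` and `UINT48_SIZE` are hypothetical names, and this version returns `false` where the BIND 9 buffer code would assert; the point is only that the space check uses the 6-byte wire size rather than `sizeof(val)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wire size of a 48-bit value; deliberately not sizeof(uint64_t). */
#define UINT48_SIZE 6

/*
 * Hypothetical helper: write the low 48 bits of `val` to `buf` in
 * network (big-endian) byte order.  Returns false if fewer than
 * 6 bytes are available.
 */
bool
put_uint48(uint8_t *buf, size_t avail, uint64_t val) {
    assert(buf != NULL);

    /* Checking against sizeof(val) == 8 would reject a 6-byte buffer. */
    if (avail < UINT48_SIZE) {
        return false;
    }

    for (size_t i = 0; i < UINT48_SIZE; i++) {
        buf[i] = (uint8_t)(val >> (8 * (UINT48_SIZE - 1 - i)));
    }
    return true;
}
```

With this check, a buffer of exactly 6 bytes is accepted, which is the case that previously triggered the assertion.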
Ondřej Surý
80738e98bd Fix PTHREAD_MUTEX_ADAPTIVE_NP and PTHREAD_MUTEX_ERRORCHECK_NP usage
The PTHREAD_MUTEX_ADAPTIVE_NP and PTHREAD_MUTEX_ERRORCHECK_NP are
usually not defines, but enum values, so a simple preprocessor check
doesn't work.

Check for PTHREAD_MUTEX_ADAPTIVE_NP from the autoconf AS_COMPILE_IFELSE
block and define HAVE_PTHREAD_MUTEX_ADAPTIVE_NP.  This should enable
adaptive mutex on Linux and FreeBSD.

As PTHREAD_MUTEX_ERRORCHECK actually comes from POSIX and Linux glibc
does define it when compatibility macros are being set, we can just use
PTHREAD_MUTEX_ERRORCHECK instead of PTHREAD_MUTEX_ERRORCHECK_NP.

(cherry picked from commit cc4f99bc6d)
2024-08-05 09:13:07 +00:00
Ondřej Surý
5d76ef21f0 Remove ISC_MUTEX_INITIALIZER
It's hard to get it right on different platforms and it's unused
in BIND 9 anyway.

(cherry picked from commit f158884344)
2024-08-05 09:13:07 +00:00
Mark Andrews
fbcdfefd2d Properly compute the physical memory size
On a 32-bit machine casting to size_t can still lead to an overflow.
Cast to uint64_t.  Also detect all possible negative values for
pages and pagesize to silence the warning about a possible negative value.

    39#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    	1. tainted_data_return: Called function sysconf(_SC_PHYS_PAGES),
           and a possible return value may be less than zero.
    	2. assign: Assigning: pages = sysconf(_SC_PHYS_PAGES).
    40        long pages = sysconf(_SC_PHYS_PAGES);
    41        long pagesize = sysconf(_SC_PAGESIZE);
    42
    	3. Condition pages == -1, taking false branch.
    	4. Condition pagesize == -1, taking false branch.
    43        if (pages == -1 || pagesize == -1) {
    44                return (0);
    45        }
    46
    	5. overflow: The expression (size_t)pages * pagesize might be negative,
           but is used in a context that treats it as unsigned.

    CID 498034: (#1 of 1): Overflowed return value (INTEGER_OVERFLOW)
    6. return_overflow: (size_t)pages * pagesize, which might have underflowed,
       is returned from the function.
    47        return ((size_t)pages * pagesize);
    48#endif /* if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE) */

(cherry picked from commit e8dbc5db92)
2024-07-31 07:30:35 +00:00
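A minimal sketch of the fix described in fbcdfefd2d, assuming the function shape shown in the Coverity report above. `pages_to_bytes()` and `physical_memory()` are hypothetical names; the two changes illustrated are rejecting every negative value (not only -1) and widening to uint64_t before multiplying.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Multiply in 64 bits so a 32-bit size_t cannot overflow. */
uint64_t
pages_to_bytes(long pages, long pagesize) {
    /* Any negative value is treated as "unknown", not only -1. */
    if (pages < 0 || pagesize < 0) {
        return 0;
    }
    return (uint64_t)pages * (uint64_t)pagesize;
}

/* Returns the physical memory size in bytes, or 0 if unknown. */
uint64_t
physical_memory(void) {
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    return pages_to_bytes(sysconf(_SC_PHYS_PAGES),
                          sysconf(_SC_PAGESIZE));
#else
    return 0;
#endif
}
```

Splitting out the arithmetic makes the overflow case easy to check: 2^20 pages of 4096 bytes is 4 GiB, which does not fit in a 32-bit size_t but is exact in uint64_t.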
Artem Boldariev
5781ff3a93 Drop expired but not accepted TCP connections
This commit ensures that we are not attempting to accept an expired
TCP connection as we are not interested in any data that could have
been accumulated in its internal buffers. Now we just drop them for
good.
2024-07-03 15:03:02 +03:00
Ondřej Surý
bc3e713317 Throttle the reading when writes are asynchronous
Be more aggressive when throttling the reading - when we can't send the
outgoing TCP synchronously with uv_try_write(), we start throttling the
reading immediately instead of waiting for the send buffers to fill up.

This should not affect well-behaved clients that read the data from the
TCP connection on the other end.
2024-07-03 08:45:39 +02:00
Artem Boldariev
55b1a093ea Do not un-throttle TCP connections on isc_nm_read()
Due to an omission, it was possible to un-throttle a TCP connection
previously throttled due to the peer not reading back data we are
sending.

In particular, that affected DoH code, but it could also affect other
transports (the current or future ones) that pause/resume reading
according to their internal state.
2024-06-12 13:44:37 +03:00
Ondřej Surý
4c2ac25a95 Limit the number of DNS message processed from a single TCP read
A single TCP read can create as many DNS messages as 64k divided by the
minimum size of a DNS message.  This can clog the processing thread and
thrash the memory allocator because we need to do as many as ~20k
allocations in a single UV loop tick.

Limit the number of the DNS messages processed in a single UV loop tick
to just a single DNS message and limit the number of the outstanding DNS
messages back to 23.  This effectively limits the number of pipelined
DNS messages to that number (this is the limit we already had before).
2024-06-10 16:48:54 +02:00
Ondřej Surý
4e7c4af17f Throttle reading from TCP if the sends are not getting through
When a TCP client would not read the DNS messages sent to it, the TCP
sends inside named would accumulate and cause degradation of the
service.  Throttle the reading from the TCP socket when we accumulate
enough DNS data to be sent.  Currently this is limited in a way that a
single largest possible DNS message can fit into the buffer.
2024-06-10 16:48:52 +02:00
Artem Boldariev
d80dfbf745 Keep the endpoints set reference within an HTTP/2 socket
This commit ensures that an HTTP endpoints set reference is stored in
a socket object associated with an HTTP/2 stream instead of
referencing the global set stored inside a listener.

This helps to prevent an issue like the following:

1. BIND is configured to serve DoH clients;
2. A client is connected and one or more HTTP/2 streams are
created. Internal pointers are now pointing to the data on the
associated HTTP endpoints set;
3. BIND is reconfigured - the new endpoints set object is created and
promoted to all listeners;
4. The old pointers to the HTTP endpoints set data are now invalid.

Instead of referencing a global object that is updated on
re-configurations, we now store a local reference which prevents the
endpoints set objects from going out of scope prematurely.
2024-06-10 16:40:12 +02:00
Artem Boldariev
c41fb499b9 DoH: avoid potential use after free for HTTP/2 session objects
It was reported that an HTTP/2 session might get closed or even deleted
before all asynchronous processing has been completed.

This commit addresses that: we now avoid using the object when we do
not need it, specifically check that the pointers used are not 'NULL',
and ensure that there is at least one reference to the session object
while we are processing incoming data.

This commit makes the code more resilient to such issues in the
future.
2024-06-10 16:40:10 +02:00
Ondřej Surý
a9b4d42346 Add isc_queue implementation on top of cds_wfcq
Add an isc_queue implementation that hides the gory details of cds_wfcq
behind a neater API.  The same caveats apply as with cds_wfcq.

TODO: Add documentation to the API.
2024-06-05 09:19:56 +02:00
Mark Andrews
9be1873ef3 Add helper function isc_sockaddr_disabled
2024-06-03 18:34:31 +10:00
Matthijs Mekking
c40e5c8653 Call reset_shutdown if uv_tcp_close_reset failed
If uv_tcp_close_reset() returns an error code, this means the
reset_shutdown callback has not been issued, so do it now.
2024-06-03 10:14:47 +02:00
Matthijs Mekking
5b94bb2129 Do not runtime check uv_tcp_close_reset
When we reset a TCP connection by sending a RST packet, do not bother
requiring that the result is a success code.
2024-06-03 10:14:47 +02:00
Aydın Mercan
49e62ee186 fix typing mistakes in trace macros
The detach function declaration in `ISC__REFCOUNT_TRACE_DECL` accidentally
had an implicit int return type. While not allowed since C99, it
became an error by default in GCC 14.

`ISC_REFCOUNT_TRACE_IMPL` and `ISC_REFCOUNT_STATIC_TRACE_IMPL` expanded
into the wrong macros, trying to declare it again with the wrong number
of parameters.
2024-05-17 18:11:23 -07:00
Mark Andrews
b7de2c7cb9 Clang-format header file changes
2024-05-17 16:03:21 -07:00
Ondřej Surý
eb862ce509 Properly attach/detach isc_httpd in case read ends earlier than send
An assertion failure would be triggered when sending the TCP data ends
after the TCP reading gets closed.  Implement proper reference counting
for the isc_httpd object.
2024-05-15 12:22:10 +02:00
Aydın Mercan
09e4fb2ffa Return the old counter value in isc_stats_increment
Returning the value allows for better high-water tracking without
running into edge cases like the following:

0. The counter is at value X
1. Increment the value (X+1)
2. The value is decreased multiple times in other threads (X+1-Y)
3. Get the value (X+1-Y)
4. Update-if-greater misses the X+1 value which should have been the
   high-water
2024-05-10 12:08:52 +03:00
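The high-water tracking that 09e4fb2ffa enables can be sketched with C11 atomics. This is an illustration under assumptions, not the isc_stats code: `counter_increment()`, `highwater_update()` and `track()` are hypothetical names, and relaxed memory ordering is assumed to be sufficient for statistics counters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Increment and return the value the counter had *before* the add. */
int_fast64_t
counter_increment(atomic_int_fast64_t *counter) {
    return atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
}

/* Update-if-greater: raise the high-water mark to `value` if needed. */
void
highwater_update(atomic_int_fast64_t *highwater, int_fast64_t value) {
    int_fast64_t cur =
        atomic_load_explicit(highwater, memory_order_relaxed);
    while (cur < value &&
           !atomic_compare_exchange_weak_explicit(
               highwater, &cur, value, memory_order_relaxed,
               memory_order_relaxed))
    {
        /* A failed exchange reloaded `cur`; the loop re-checks it. */
    }
}

/*
 * The new value is derived from the returned old value, so decrements
 * made by other threads in between cannot hide the X+1 peak.
 */
void
track(atomic_int_fast64_t *counter, atomic_int_fast64_t *highwater) {
    int_fast64_t old = counter_increment(counter);
    highwater_update(highwater, old + 1);
}
```

Because `old + 1` is computed from the increment's own return value, steps 2 and 3 of the scenario above no longer matter: the peak is recorded even if the counter has already dropped by the time anyone reads it.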
Mark Andrews
88c48dde5e Stop processing catalog zone changes when shutting down
Abandon catz_addmodzone_cb and catz_delzone_cb processing if the
loop is shutting down.
2024-05-09 08:17:44 +10:00
Evan Hunt
a5d0e6c4ba add static macros for ISC_REFCOUNT_DECL/IMPL
this commit adds a mechanism to statically declare attach/detach
and ref/unref methods, for objects that are only accessed within
a single C file.
2024-04-30 12:31:48 -07:00
Michal Nowak
f454fa6dea Update sources to Clang 18 formatting
2024-04-23 13:11:52 +02:00
Ondřej Surý
23835c4afe Use xmlMemSetup() instead of xmlGcMemSetup()
Since we don't have a specialized function for "atomic" allocations,
it's better to just use xmlMemSetup() instead of xmlGcMemSetup()
according to this:

https://mail.gnome.org/archives/xml/2007-August/msg00032.html
2024-04-18 10:53:31 +02:00
Ondřej Surý
950f828cd2 Offload the isc_http response processing to worker thread
Prepare the statistics channel data in the offloaded worker thread, so
the networking thread is not blocked by the process gathering data from
various data structures.  Only the netmgr send is then run on the
networking thread when all the data is already there.
2024-04-18 10:53:00 +02:00
Evan Hunt
63659e2e3a complete removal of isc_loop_current()
isc_loop() can now take its place.

This also requires changes to the test harness - instead of running the
setup and teardown outside of the main loop, we now schedule the setup
and teardown to run on the loop (via isc_loop_setup() and
isc_loop_teardown()) - this is needed because the new isc_loop()
call has to be run on the active event loop, but previously the
isc_loop_current() (and the variants like isc_loop_main()) would work
even outside of the loop because it needed just isc_tid() to work, but
not the full loop (which was mainly true for the main thread).
2024-04-02 10:35:56 +02:00
Evan Hunt
c47fa689d4 use a thread-local variable to get the current running loop
if we have a method to get the running loop, similar to how
isc_tid() gets the current thread ID, we can simplify loop
and loopmgr initialization.

remove most uses of isc_loop_current() in favor of isc_loop().
in some places where that was the only reason to pass loopmgr,
remove loopmgr from the function parameters.
2024-04-02 10:35:56 +02:00
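The thread-local approach from c47fa689d4 might look something like the sketch below. All names here (`struct loop`, `loop_enter()`, `loop_leave()`, `loop_current()`) are hypothetical stand-ins rather than the libisc API; the only idea taken from the commit is that the running loop is kept in a thread-local variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real event loop structure. */
struct loop {
    int tid;
};

/* One pointer per thread; NULL while no loop runs on this thread. */
static _Thread_local struct loop *current_loop = NULL;

/* Called by a thread when it starts running `loop`. */
void
loop_enter(struct loop *loop) {
    assert(loop != NULL && current_loop == NULL);
    current_loop = loop;
}

/* Called by the same thread when the loop stops. */
void
loop_leave(void) {
    current_loop = NULL;
}

/* Accessor: no loop manager argument is needed to find the loop. */
struct loop *
loop_current(void) {
    return current_loop;
}
```

The accessor returns NULL outside of a running loop, which matches the constraint mentioned in 63659e2e3a that the call has to be made on the active event loop.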
Michał Kępień
8610799317 Merge tag 'v9.19.21'
BIND 9.19.21
2024-02-14 13:24:56 +01:00
Mark Andrews
dd57db2274 Remove duplicate unreachable code block
This was accidentally left in during the development of !8299.
2024-02-12 15:18:46 +11:00
Ondřej Surý
175655b771 Fix case insensitive matching in isc_ht hash table implementation
The case insensitive matching in isc_ht was basically completely broken
as only the hashvalue computation was case insensitive, but the key
comparison was always case sensitive.
2024-02-11 09:36:56 +01:00
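The bug class fixed in 175655b771 is easiest to see when the hash and the key comparison sit side by side. The sketch below is not the isc_ht code: `key_hash()` and `key_match()` are hypothetical names, and FNV-1a is used only as a convenient stand-in hash function. What matters is that both functions honour the same case-sensitivity flag.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a (stand-in hash), optionally folding case before mixing. */
uint32_t
key_hash(const uint8_t *key, size_t len, bool case_sensitive) {
    uint32_t h = 2166136261u; /* FNV-1a 32-bit offset basis */
    for (size_t i = 0; i < len; i++) {
        uint8_t c = case_sensitive ? key[i] : (uint8_t)tolower(key[i]);
        h ^= c;
        h *= 16777619u; /* FNV-1a 32-bit prime */
    }
    return h;
}

/* The comparison must fold case exactly when the hash does. */
bool
key_match(const uint8_t *a, const uint8_t *b, size_t len,
          bool case_sensitive) {
    if (case_sensitive) {
        return memcmp(a, b, len) == 0;
    }
    for (size_t i = 0; i < len; i++) {
        if (tolower(a[i]) != tolower(b[i])) {
            return false;
        }
    }
    return true;
}
```

If only the hash folds case, "Example" and "EXAMPLE" land in the same bucket but never compare equal, so a case-insensitive lookup misses an entry that is actually present.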
Aydın Mercan
a911949ebc Convert rwlock in isc_log_t to RCU
The isc_log_t contains an isc_logconfig_t that is swapped, dereferenced,
or has its fields accessed through a mutex. Instead of protecting it
with a rwlock, use RCU.
2024-02-09 13:11:48 +03:00
Ondřej Surý
15329d471e Add memory pools for isc_nmsocket_t structures
To reduce memory pressure, we can add light per-loop (netmgr worker)
memory pools for isc_nmsocket_t structures.  This will help in
situations where there's a lot of churn creating and destroying the
nmsockets.
2024-02-08 15:13:47 +01:00
Ondřej Surý
750bd364b5 Reduce the isc_nmsocket_t size from 1840 to 1208 bytes
Embedding isc_nmsocket_h2_t directly inside isc_nmsocket_t had increased
the size of isc_nmsocket_t to 1840 bytes.  Making isc_nmsocket_h2_t
a pointer to a structure that is allocated on demand allows us to
reduce the size to 1208 bytes.  While there are still some possible
reductions in the isc_nmsocket_t (embedded tlsstream, streamdns
structures), this was by far the biggest drop in the memory usage.
2024-02-08 15:13:47 +01:00
Ondřej Surý
eada7b6e13 Reduce struct isc__nm_uvreq size from 1560 to 560 bytes
The uv_req union member of struct isc__nm_uvreq contained libuv request
types that we don't use.  Turns out that uv_getnameinfo_t is 1000 bytes
big and unnecessarily enlarged the whole structure.  Remove all the
unused members from the uv_req union.
2024-02-08 15:13:47 +01:00
Ondřej Surý
2367b6a2e1 Reduce sizeof isc_sockaddr from 152 to 48 bytes
After removing sockaddr_unix from isc_sockaddr, we can also remove
sockaddr_storage and reduce the isc_sockaddr size from 152 bytes to just
48 bytes needed to hold IPv6 addresses.
2024-02-08 15:13:47 +01:00
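The size effect described in 2367b6a2e1 comes from how a union is sized by its largest member. The sketch below assumes the address is stored in a union of socket address types; `union compact_addr` and `union generic_addr` are hypothetical names, and the exact byte counts depend on the platform.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Only the address families actually used: IPv4 and IPv6. */
union compact_addr {
    struct sockaddr     sa;
    struct sockaddr_in  sin;
    struct sockaddr_in6 sin6;
};

/* Same, plus sockaddr_storage, which is sized for any family. */
union generic_addr {
    struct sockaddr         sa;
    struct sockaddr_in      sin;
    struct sockaddr_in6     sin6;
    struct sockaddr_storage ss;
};

/* Dropping the largest member shrinks every object holding the union. */
static_assert(sizeof(union compact_addr) < sizeof(union generic_addr),
              "compact union should be smaller");
```

On common platforms sockaddr_storage is 128 bytes while sockaddr_in6 is 28 bytes, which is consistent with the drop from 152 to 48 bytes once the remaining bookkeeping fields of the structure are counted.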
Ondřej Surý
2463e5232d Use proper padding instead of using alignas()
As was pointed out, alignas() can't be used on objects larger
than `max_align_t`; otherwise the compiler might miscompile the code to
use auto-vectorization on unaligned memory.

As we were only using alignas() as a way to prevent false memory
sharing, we can use manual padding in the affected structures.
2024-02-08 10:54:35 +01:00
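Manual padding of the kind 2463e5232d describes can be sketched as follows. The 64-byte cache line size and the names `CACHELINE_SIZE`, `struct padded_counter` and `struct counters` are assumptions for illustration, not the definitions used in BIND 9.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on most x86-64 systems. */
#define CACHELINE_SIZE 64

/* Pad each counter out to a full cache line instead of alignas(). */
struct padded_counter {
    atomic_uint_fast64_t value;
    uint8_t padding[CACHELINE_SIZE - sizeof(atomic_uint_fast64_t)];
};

static_assert(sizeof(struct padded_counter) == CACHELINE_SIZE,
              "padded_counter must occupy exactly one cache line");

/* Adjacent counters are now a full cache line apart. */
struct counters {
    struct padded_counter hits;
    struct padded_counter misses;
};
```

Padding does not force the structure to start on a cache line boundary, but it does keep the two `value` fields 64 bytes apart, so they can never share a line. That is all that is needed to avoid false sharing, and the type keeps its natural alignment.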
Ondřej Surý
0c18ed7ec6 Remove isc__tls_setfatalmode() function and the calls
With _exit() instead of exit() in place, we don't need the
isc__tls_setfatalmode() mechanism, as the atexit() handlers, including
the OpenSSL atexit hooks, will not be executed.
2024-02-08 08:01:58 +01:00
Ondřej Surý
e140743e6a Improve the rcu_barrier() call when destroying the mem context
Instead of the crude 5x rcu_barrier() call in isc__mem_destroy(), change
the mechanism to call rcu_barrier() until the memory use and references
stop decreasing.  This should deal with any number of nested call_rcu()
levels.

Additionally, don't destroy the contextslock if the list of the contexts
isn't empty.  Destroying the lock could make the late threads crash.
2024-02-08 08:01:58 +01:00
Ondřej Surý
2c98ccbdba Use error checking mutex in developer mode on Linux
When developer mode is enabled, use the error-checking mutex type, so we
can discover wrong use of mutexes faster.
2024-02-07 20:54:05 +01:00
Ondřej Surý
01038d894f Always use adaptive mutexes on Linux
When adaptive mutexes are available (with glibc), always use them.
Remove the autoconf switch and also fix the static initializer.
2024-02-07 20:54:05 +01:00