Commit Graph

4397 Commits

Author SHA1 Message Date
Ondřej Surý
f128a9bcf2 Move all the unit tests to /tests/<libname>/
The unit tests are now using a common base, which means that
lib/dns/tests/ code now has to include lib/isc/include/isc/test.h and
link with lib/isc/test.c and lib/ns/tests has to include both libisc and
libdns parts.

Instead of cross-linking code between the directories, move the
/lib/<foo>/test.c to /tests/<foo>.c and /lib/<foo>/include/<foo>test.h
to /tests/include/tests/<foo>.h and create a single libtest.la
convenience library in /tests/.

At the same time, move the /lib/<foo>/tests/ to /tests/<foo>/ (but keep
it symlinked to the old location) and adjust paths accordingly.  In few
places, we are now using absolute paths instead of relative paths,
because the directory level has changed.  By moving the directories
under the /tests/ directory, the test-related code is kept in a single
place and we can avoid referencing files between libns->libdns->libisc
which is unhealthy because they live in a separate Makefile-space.

In the future, the /bin/tests/ should be merged to /tests/ and symlink
kept, and the /fuzz/ directory moved to /tests/fuzz/.

(cherry picked from commit 2c3b2dabe9)
2022-05-31 12:06:00 +02:00
Ondřej Surý
f0df0d679a Give the unit tests a big overhaul
The unit tests contain a lot of duplicated code and here's an attempt
to reduce code duplication.

This commit does several things:

1. Remove #ifdef HAVE_CMOCKA - we already solve this with automake
   conditionals.

2. Create a set of ISC_TEST_* and ISC_*_TEST_ macros to wrap the test
   implementations, test lists, and the main test routine, so we don't
   have to repeat this all over again.  The macros were modeled after
   libuv test suite but adapted to cmocka as the test driver.

   A simple example of a unit test would be:

    ISC_RUN_TEST_IMPL(test1) { assert_true(true); }

    ISC_TEST_LIST_START
    ISC_TEST_ENTRY(test1)
    ISC_TEST_LIST_END

    ISC_TEST_MAIN (Discussion: Should this be ISC_TEST_RUN ?)

   For more complicated examples including group setup and teardown
   functions, and per-test setup and teardown functions.

3. The macros prefix the test functions and cmocka entries, so the name
   of the test can now match the tested function name, and we don't have
   to append `_test` because `run_test_` is automatically prepended to
   the main test function, and `setup_test_` and `teardown_test_` is
   prepended to setup and teardown function.

4. Update all the unit tests to use the new syntax and fix a few bits
   here and there.

5. In the future, we can separate the test declarations and test
   implementations which are going to greatly help with uncluttering the
   bigger unit tests like doh_test and netmgr_test, because the test
   implementations are not declared static (see `ISC_RUN_TEST_DECLARE`
   and `ISC_RUN_TEST_IMPL` for more details.

NOTE: This heavily relies on preprocessor macros, but the result greatly
outweighs all the negatives of using the macros.  There's less
duplicated code, the tests are more uniform and the implementation can
be more flexible.

(cherry picked from commit 63fe9312ff)
2022-05-31 11:34:54 +02:00
Petr Menšík
d074386ef1 Fix failures in isc netmgr_test on big endian machines
Typing from libuv structure to isc_region_t is not possible, because
their sizes differ on 64 bit architectures. Little endian machines seems
to be lucky and still result in test passed. But big endian machine such
as s390x fails the test reliably.

Fix by directly creating the buffer as isc_region_t and skipping the
type conversion. More readable and still more correct.

(cherry picked from commit 057438cb45)
2022-05-24 20:22:57 +02:00
Ondřej Surý
eabee4d7d9 Move setting the sock->write_timeout to the async_*send
Setting the sock->write_timeout from the TCP, TCPDNS, and TLSDNS send
functions could lead to (harmless) data race when setting the value for
the first time when the isc_nm_send() function would be called from
thread not-matching the socket we are sending to.  Move the setting the
sock->write_timeout to the matching async function which is always
called from the matching thread.

(cherry picked from commit 61117840c1)
2022-05-19 22:37:52 +02:00
Ondřej Surý
b4521486ed Use C2x [[fallthrough]] when supported by LLVM/clang
Clang added support for the gcc-style fallthrough
attribute (i.e. __attribute__((fallthrough))) in version 10.  However,
__has_attribute(fallthrough) will return 1 in C mode in older versions,
even though they only support the C++11 fallthrough attribute. At best,
the unsupported attribute is simply ignored; at worst, it causes errors.

The C2x fallthrough attribute has the advantages of being supported in
the broadest range of clang versions (added in version 9) and being easy
to check for support. Use C2x [[fallthrough]] attribute if possible, and
fall back to not using an attribute for clang versions that don't have
it.

Courtesy of Joshua Root

(cherry picked from commit 14c8d43863)
2022-05-19 22:01:59 +02:00
Michal Nowak
4dde80f655 Merge tag 'v9_18_3' into v9_18
BIND 9.18.3
2022-05-19 12:07:45 +02:00
Ondřej Surý
71b0e9e5b7 Lock the trampoline when attaching
When attaching to the trampoline, the isc__trampoline_max was access
unlocked.  This would not manifest under normal circumstances because we
initialize 65 trampolines by default and that's enough for most
commodity hardware, but there are ARM machines with 128+ cores where
this would be reported by ThreadSanitizer.

Add locking around the code in isc__trampoline_attach().  This also
requires the lock to leak on exit (along with memory that we already)
because a new thread might be attaching to the trampoline while we are
running the library destructor at the same time.

(cherry picked from commit 933162ae14)
2022-05-13 13:21:49 +02:00
Artem Boldariev
bd41100295 Fix a crash by avoiding destroying TLS stream socket too early
This commit fixes a crash in generic TLS stream code, which could be
reproduced during some runs of the 'sslyze' tool.

The intention of this commit is twofold.

Firstly, it ensures that the TLS socket object cannot be destroyed too
early. Now it is being deleted alongside the underlying TCP socket
object.

Secondly, it ensures that the TLS socket object cannot be destroyed as
a result of calling 'tls_do_bio()' (the primary function which
performs encryption/decryption during the IO) as the code did not
expect that. This code path is fixed now.

(cherry picked from commit a696be6a2d)
2022-05-04 19:56:57 +02:00
Artem Boldariev
4637b72da6 Change X509_STORE_up_ref() shim return value
X509_STORE_up_ref() must return 1 on success, while the previous
implementation would return the references count. This commit fixes
that.
2022-04-28 13:39:22 +03:00
Artem Boldariev
26feac0c61 Implement shim for SSL_CTX_set1_cert_store() (affects Debian 9)
This commit implements a shim for SSL_CTX_set1_cert_store() for
OpenSSL/LibreSSL versions where it is not available.
2022-04-28 13:39:22 +03:00
Artem Boldariev
9b0eb3e5a3 Add ISC_R_TLSBADPEERCERT error code to the TLS related code
This commit adds support for ISC_R_TLSBADPEERCERT error code, which is
supposed to be used to signal for TLS peer certificates verification
in dig and other code.

The support for this error code is added to our TLS and TLS DNS
implementations.

This commit also adds isc_nm_verify_tls_peer_result_string() function
which is supposed to be used to get a textual description of the
reason for getting a ISC_R_TLSBADPEERCERT error.
2022-04-28 13:39:21 +03:00
Artem Boldariev
e2a3ec2ba5 Extend TLS context cache with CA certificates store
This commit adds support for keeping CA certificates stores associated
with TLS contexts. The intention is to keep one reusable store per a
set of related TLS contexts.
2022-04-28 13:39:21 +03:00
Artem Boldariev
6fdc03102a Add foundational functions to implement Strict/Mutual TLS
This commit adds a set of functions that can be used to implement
Strict and Mutual TLS:

* isc_tlsctx_load_client_ca_names();
* isc_tlsctx_load_certificate();
* isc_tls_verify_peer_result_string();
* isc_tlsctx_enable_peer_verification().
2022-04-28 13:39:21 +03:00
Artem Boldariev
b975ee7be4 Add utility functions to manipulate X509 certificate stores
This commit adds a set of high-level utility functions to manipulate
the certificate stores. The stores are needed to implement TLS
certificates verification efficiently.
2022-04-28 13:39:21 +03:00
Artem Boldariev
3a75b33287 Add isc_nmsocket_set_tlsctx()
This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function
that replaces the TLS context within a given TLS-enabled listener
socket object. It is based on the newly added reference counting
functionality.

The intention of adding this function is to add functionality to
replace a TLS context without recreating the whole socket object,
including the underlying TCP listener socket, as a BIND process might
not have enough permissions to re-create it fully on reconfiguration.
2022-04-27 23:58:38 +03:00
Artem Boldariev
f52c06054b Maintain a per-thread TLS ctx reference in TLS stream code
This commit changes the generic TLS stream code to maintain a
per-worker thread TLS context reference.
2022-04-27 23:58:38 +03:00
Artem Boldariev
28460151ca Use isc_tlsctx_attach() in TLS DNS code
This commit adds proper reference counting for TLS contexts into
generic TLS DNS (DoT) code.
2022-04-27 23:58:38 +03:00
Artem Boldariev
ff987957e7 Use isc_tlsctx_attach() in TLS stream code
This commit adds proper reference counting for TLS contexts into
generic TLS stream code.
2022-04-27 23:58:38 +03:00
Artem Boldariev
677819d22d Add isc_tlsctx_attach()
The implementation is done on top of the reference counting
functionality found in OpenSSL/LibreSSL, which allows for avoiding
wrapping the object.

Adding this function allows using reference counting for TLS contexts
in BIND 9's codebase.
2022-04-27 23:58:38 +03:00
Artem Boldariev
8b19f62ac5 TLSDNS: call send callbacks after only the data was sent
This commit ensures that write callbacks are getting called only after
the data has been sent via the network.

Without this fix, a situation could appear when a write callback could
get called before the actual encrypted data would have been sent to
the network. Instead, it would get called right after it would have
been passed to the OpenSSL (i.e. encrypted).

Most likely, the issue does not reveal itself often because the
callback call was asynchronous, so in most cases it should have been
called after the data has been sent, but that was not guaranteed by
the code logic.

Also, this commit removes one memory allocation (netievent) from a hot
path, as there is no need to call this callback asynchronously
anymore.
2022-04-27 17:57:11 +03:00
Ondřej Surý
2a648b9078 Abort when libuv at runtime mismatches libuv at compile time
When we compile with libuv that has some capabilities via flags passed
to f.e. uv_udp_listen() or uv_udp_bind(), the call with such flags would
fail with invalid arguments when older libuv version is linked at the
runtime that doesn't understand the flag that was available at the
compile time.

Enforce minimal libuv version when flags have been available at the
compile time, but are not available at the runtime.  This check is less
strict than enforcing the runtime libuv version to be same or higher
than compile time libuv version.
2022-04-26 12:11:51 +02:00
Michał Kępień
2da371d005 Prevent memory bloat caused by a jemalloc quirk
Since version 5.0.0, decay-based purging is the only available dirty
page cleanup mechanism in jemalloc.  It relies on so-called tickers,
which are simple data structures used for ensuring that certain actions
are taken "once every N times".  Ticker data (state) is stored in a
thread-specific data structure called tsd in jemalloc parlance.  Ticks
are triggered when extents are allocated and deallocated.  Once every
1000 ticks, jemalloc attempts to release some of the dirty pages hanging
around (if any).  This allows memory use to be kept in check over time.

This dirty page cleanup mechanism has a quirk.  If the first
allocator-related action for a given thread is a free(), a
minimally-initialized tsd is set up which does not include ticker data.
When that thread subsequently calls *alloc(), the tsd transitions to its
nominal state, but due to a certain flag being set during minimal tsd
initialization, ticker data remains unallocated.  This prevents
decay-based dirty page purging from working, effectively enabling memory
exhaustion over time. [1]

The quirk described above has been addressed (by moving ticker state to
a different structure) in jemalloc's development branch [2], but not in
any numbered jemalloc version released to date (the latest one being
5.2.1 as of this writing).

Work around the problem by ensuring that every thread spawned by
isc_thread_create() starts with a malloc() call.  Avoid immediately
calling free() for the dummy allocation to prevent an optimizing
compiler from stripping away the malloc() + free() pair altogether.

An alternative implementation of this workaround was considered that
used a pair of isc_mem_create() + isc_mem_destroy() calls instead of
malloc() + free(), enabling the change to be fully contained within
isc__trampoline_run() (i.e. to not touch struct isc__trampoline), as the
compiler is not allowed to strip away arbitrary function calls.
However, that solution was eventually dismissed as it triggered
ThreadSanitizer reports when tools like dig, nsupdate, or rndc exited
abruptly without waiting for all worker threads to finish their work.

[1] https://github.com/jemalloc/jemalloc/issues/2251
[2] c259323ab3

(cherry picked from commit 7aa7b6474b)
2022-04-21 14:22:13 +02:00
Ondřej Surý
9b78612e7d Revert "Run the RPZ update as offloaded work"
This reverts commit e128b6a951.
2022-04-06 10:30:06 +02:00
Ondřej Surý
df91d61dc7 Rename shutdown() to test_shutdown() in timer_test.c
The shutdown() is part of standard library (POSIX-1), don't use such
name in the timer_test.c, but rather rename it to test_shutdown().

(cherry picked from commit 7868d8145b)
2022-04-05 01:56:09 +02:00
Ondřej Surý
cd24556e14 Enable the load-balance-sockets configuration
Previously, HAVE_SO_REUSEPORT_LB has been defined only in the private
netmgr-int.h header file, making the configuration of load balanced
sockets inoperable.

Move the missing HAVE_SO_REUSEPORT_LB define the isc/netmgr.h and add
missing isc_nm_getloadbalancesockets() implementation.

(cherry picked from commit 142c63dda8)
2022-04-05 01:38:49 +02:00
Ondřej Surý
64265f1c0e Add option to configure load balance sockets
Previously, the option to enable kernel load balancing of the sockets
was always enabled when supported by the operating system (SO_REUSEPORT
on Linux and SO_REUSEPORT_LB on FreeBSD).

It was reported that in scenarios where the networking threads are also
responsible for processing long-running tasks (like RPZ processing, CATZ
processing or large zone transfers), this could lead to intermitten
brownouts for some clients, because the thread assigned by the operating
system might be busy.  In such scenarious, the overall performance would
be better served by threads competing over the sockets because the idle
threads can pick up the incoming traffic.

Add new configuration option (`load-balance-sockets`) to allow enabling
or disabling the load balancing of the sockets.

(cherry picked from commit 85c6e797aa)
2022-04-04 23:59:59 +02:00
Ondřej Surý
e128b6a951 Run the RPZ update as offloaded work
Previously, the RPZ updates ran quantized on the main nm_worker loops.
As the quantum was set to 1024, this might lead to service
interruptions when large RPZ update was processed.

Change the RPZ update process to run as the offloaded work.  The update
and cleanup loops were refactored to do as little locking of the
maintenance lock as possible for the shortest periods of time and the db
iterator is being paused for every iteration, so we don't hold the rbtdb
tree lock for prolonged periods of time.

(cherry picked from commit f106d0ed2b)
2022-04-04 22:59:59 +02:00
Ondřej Surý
fc500b96eb Consistenly use UNREACHABLE() instead of ISC_UNREACHABLE()
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE().  Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
2022-03-28 23:27:33 +02:00
Ondřej Surý
5e19bbb48a Remove use of the inline keyword used as suggestion to compiler
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline.  As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete.  The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.

Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler

NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.

(cherry picked from commit 20f0936cf2)
2022-03-25 08:42:18 +01:00
Ondřej Surý
07022525ff Replace ISC_NORETURN with C11's noreturn
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.

Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.

(cherry picked from commit 04d0b70ba2)
2022-03-25 08:42:18 +01:00
Ondřej Surý
128c550a95 Simplify way we tag unreachable code with only ISC_UNREACHABLE()
Previously, the unreachable code paths would have to be tagged with:

    INSIST(0);
    ISC_UNREACHABLE();

There was also older parts of the code that used comment annotation:

    /* NOTREACHED */

Unify the handling of unreachable code paths to just use:

    UNREACHABLE();

The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.

(cherry picked from commit 584f0d7a7e)
2022-03-25 08:42:16 +01:00
Ondřej Surý
c62a94363d Add FALLTHROUGH macro for __attribute__((fallthrough))
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.

Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.

In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.

(cherry picked from commit fe7ce629f4)
2022-03-25 08:41:09 +01:00
Ondřej Surý
485a2b329e Add couple missing braces around single-line statements
The clang-format-15 has new option InsertBraces that could add missing
branches around single line statements.  Use that to our advantage
without switching to not-yet-released LLVM version to add missing braces
in couple of places.
2022-03-17 18:29:57 +01:00
Ondřej Surý
6ec223a539 Run .closehandle_cb asynchrounosly in nmhandle_detach_cb()
When sock->closehandle_cb is set, we need to run nmhandle_detach_cb()
asynchronously to ensure correct order of multiple packets processing in
the isc__nm_process_sock_buffer().  When not run asynchronously, it
would cause:

  a) out-of-order processing of the return codes from processbuffer();

  b) stack growth because the next TCP DNS message read callback will
     be called from within the current TCP DNS message read callback.

The sock->closehandle_cb is set to isc__nm_resume_processing() for TCP
sockets which calls isc__nm_process_sock_buffer().  If the read callback
(called from isc__nm_process_sock_buffer()->processbuffer()) doesn't
attach to the nmhandle (f.e. because it wants to drop the processing or
we send the response directly via uv_try_write()), the
isc__nm_resume_processing() (via .closehandle_cb) would call
isc__nm_process_sock_buffer() recursively.

The below shortened code path shows how the stack can grow:

 1: ns__client_request(handle, ...);
 2: isc_nm_tcpdns_sequential(handle);
 3: ns_query_start(client, handle);
 4:   query_lookup(qctx);
 5:     query_send(qctcx->client);
 6:       isc__nmhandle_detach(&client->reqhandle);
 7:         nmhandle_detach_cb(&handle);
 8:           sock->closehandle_cb(sock); // isc__nm_resume_processing
 9:             isc__nm_process_sock_buffer(sock);
10:               processbuffer(sock); // isc__nm_tcpdns_processbuffer
11:                 isc_nmhandle_attach(req->handle, &handle);
12:                 isc__nm_readcb(sock, req, ISC_R_SUCCESS);
13:                   isc__nm_async_readcb(NULL, ...);
14:                     uvreq->cb.recv(...); // ns__client_request

Instead, if 'sock->closehandle_cb' is set, we need to run detach the
handle asynchroniously in 'isc__nmhandle_detach', so that on line 8 in
the code flow above does not start this recursion. This ensures the
correct order when processing multiple packets in the function
'isc__nm_process_sock_buffer()' and prevents the stack growth.

When not run asynchronously, the out-of-order processing leaves the
first TCP socket open until all requests on the stream have been
processed.

If the pipelining is disabled on the TCP via `keep-response-order`
configuration option, named would keep the first socket in lingering
CLOSE_WAIT state when the client sends an incomplete packet and then
closes the connection from the client side.
2022-03-16 23:18:18 +01:00
Ondřej Surý
6fbf582f18 Cleanup the nmhandle attach/detach in httpd.c
In httpd.c, the send callback can directly call read callback without
calling isc_nm_resumeread().  When per-send timeout was added, this
could lead to use-after-free when shutting down the named.

Cleanup the way how we attach to .readhandle and .sendhandle, so there's
assurance that .readhandle will be always non-NULL when reading and
.sendhandle will be always non-NULL when sending.

Additionally, it was found that the implementation ignored the
"Connection: close" header and it worked only accidentally by closing
the connection after the first read from the TCP socket.  This has been
also fixed.

(cherry picked from commit 49c804f8b7)
2022-03-11 10:52:22 +01:00
Ondřej Surý
fd351a60ff On shutdown, reset the established TCP connections
Previously, the established TCP connections (both client and server)
would be gracefully closed waiting for the write timeout.

Don't wait for TCP connections to gracefully shutdown, but directly
reset them for faster shutdown.

(cherry picked from commit 6ddac2d56d)
2022-03-11 10:52:22 +01:00
Ondřej Surý
27e47c5101 Change single write timer to per-send timers
Previously, there was a single per-socket write timer that would get
restarted for every new write.  This turned out to be insufficient
because the other side could keep reseting the timer, and never reading
back the responses.

Change the single write timer to per-send timer which would in turn
reset the TCP connection on the first send timeout.

(cherry picked from commit a761aa59e3)
2022-03-11 10:52:22 +01:00
Ondřej Surý
913e64e8e1 Remove usage of deprecated ATOMIC_VAR_INIT() macro
The C17 standard deprecated ATOMIC_VAR_INIT() macro (see [1]).  Follow
the suite and remove the ATOMIC_VAR_INIT() usage in favor of simple
assignment of the value as this is what all supported stdatomic.h
implementations do anyway:

  * MacOSX.plaform: #define ATOMIC_VAR_INIT(__v) {__v}
  * Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE)	(VALUE)

1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf

(cherry picked from commit f251d69eba)
2022-03-09 09:25:37 +01:00
Ondřej Surý
ce9908cb4e Make isc_ht_init() and isc_ht_iter_create() return void
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS.  Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.

Change the function(s) to return void and remove the extra checks in
the code that uses them.

(cherry picked from commit 8fa27365ec)
2022-03-08 20:47:06 +01:00
Ondřej Surý
b3d0c95e68 Make isc_heap_create() and isc_heap_insert() return void
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS.  Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.

Change the function(s) to return void and remove the extra checks in
the code that uses them.

(cherry picked from commit bbb4cdb92d)
2022-03-08 20:24:54 +01:00
Ondřej Surý
445ce0c165 Set TCP maximum segment size to minimum size of 1220
Previously the socket code would set the TCPv6 maximum segment size to
minimum value to prevent IP fragmentation for TCP.  This was not yet
implemented for the network manager.

Implement network manager functions to set and use minimum MTU socket
option and set the TCP_MAXSEG socket option for both IPv4 and IPv6 and
use those to clamp the TCP maximum segment size for TCP, TCPDNS and
TLSDNS layers in the network manager to 1220 bytes, that is 1280 (IPv6
minimum link MTU) minus 40 (IPv6 fixed header) minus 20 (TCP fixed
header)

We already rely on a similar value for UDP to prevent IP fragmentation
and it make sense to use the same value for IPv4 and IPv6 because the
modern networks are required to support IPv6 packet sizes.  If there's
need for small TCP segment values, the MTU on the interfaces needs to be
properly configured.

(cherry picked from commit 8098a58581)
2022-03-08 11:12:43 +01:00
Ondřej Surý
2a31f19817 Set minimum MTU (1280) on IPv6 sockets
The IPV6_USE_MIN_MTU socket option directs the IP layer to limit the
IPv6 packet size to the minimum required supported MTU from the base
IPv6 specification, i.e. 1280 bytes.  Many implementations of TCP
running over IPv6 neglect to check the IPV6_USE_MIN_MTU value when
performing MSS negotiation and when constructing a TCP segment despite
MSS being defined to be the MTU less the IP and TCP header sizes (60
bytes for IPv6).  This leads to oversized IPv6 packets being sent
resulting in unintended Path Maximum Transport Unit Discovery (PMTUD)
being performed and to fragmented IPv6 packets being sent.

Add and use a function to set socket option to limit the MTU on IPv6
sockets to the minimum MTU (1280) both for UDP and TCP.

(cherry picked from commit 5d34a14f22)
2022-03-08 11:12:43 +01:00
Ondřej Surý
555bdb9f82 Replace netievent lock-free queue with simple locked queue
The current implementation of isc_queue uses Michael-Scott lock-free
queue that in turn uses hazard pointers.  It was discovered that the way
we use the isc_queue, such complicated mechanism isn't really needed,
because most of the time, we either execute the work directly when on
nmthread (in case of UDP) or schedule the work from the matching
nmthreads.

Replace the current implementation of the isc_queue with a simple locked
ISC_LIST.  There's a slight improvement - since copying the whole list
is very lightweight - we move the queue into a new list before we start
the processing and locking just for moving the queue and not for every
single item on the list.

NOTE: There's a room for future improvements - since we don't guarantee
the order in which the netievents are processed, we could have two lists
- one unlocked that would be used when scheduling the work from the
matching thread and one locked that would be used from non-matching
thread.

(cherry picked from commit 6bd025942c)
2022-03-08 09:52:39 +01:00
Aram Sargsyan
b7e84e8a26 Remove EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() shims
LibreSSL 3.5.0 fails to compile with these shims. We could have just
removed the LibreSSL check from the pre-processor condition, but it
seems that these shims are no longer needed because all the supported
versions of OpenSSL and LibreSSL have those functions.

According to EVP_ENCRYPTINIT(3) manual page in LibreSSL,
EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() first appeared in
OpenSSL 0.9.8b, and have been available since OpenBSD 4.5.

(cherry picked from commit a3789053682b57a2031de8c544134f1923e76cf3)
2022-03-02 10:49:47 +00:00
Mark Andrews
651ef3ebb8 Grow the lex token buffer in one more place
when parsing key pairs, if the '=' character fell at max_token
a protective INSIST preventing buffer overrun could be triggered.
Attempt to grow the buffer immediately before the INSIST.

Also removed an unnecessary INSIST on the opening double quote
of key buffer pair.

(cherry picked from commit 4c356d2770)
2022-03-02 01:05:14 +00:00
Ondřej Surý
806848a440 Handle TCP sockets in isc__nmsocket_reset()
The isc__nmsocket_reset() was missing a case for raw TCP sockets (used
by RNDC and DoH) which would case a assertion failure when write timeout
would be triggered.

TCP sockets are now also properly handled in isc__nmsocket_reset().

(cherry picked from commit b220fb32bd)
2022-02-28 11:17:41 +01:00
Ondřej Surý
408b79ba24 Disable inactive uvreqs caching when compiled with sanitizers
When isc__nm_uvreq_t gets deactivated, it could be just put onto array
stack to be reused later to save some initialization time.
Unfortunately, this might hide some use-after-free errors.

Disable the inactive uvreqs caching when compiled with Address or
Thread Sanitizer.

(cherry picked from commit be339b3c83)
2022-02-24 00:16:25 +01:00
Ondřej Surý
dad941a288 Disable inactive handles caching when compiled with sanitizers
When isc_nmhandle_t gets deactivated, it could be just put onto array
stack to be reused later to safe some initialization time.
Unfortunately, this might hide some use-after-free errors.

Disable the inactive handles caching when compiled with Address or
Thread Sanitizer.

(cherry picked from commit 92cce1da65)
2022-02-24 00:11:03 +01:00
Ondřej Surý
51040c2806 Remove active handles tracking from isc__nmsocket_t
The isc__nmsocket_t has locked array of isc_nmhandle_t that's not used
for anything.  The isc__nmhandle_get() adds the isc_nmhandle_t to the
locked array (and resized if necessary) and removed when
isc_nmhandle_put() finally destroys the handle.  That's all it does, so
it serves no useful purpose.

Remove the .ah_handles, .ah_size, and .ah_frees members of the
isc__nmsocket_t and .ah_pos member of the isc_nmhandle_t struct.

(cherry picked from commit e2555a306f)
2022-02-23 23:49:13 +01:00
Ondřej Surý
afe8a60f98 Delay isc__nm_uvreq_t deallocation to connection callback
When the TCP, TCPDNS or TLSDNS connection times out, the isc__nm_uvreq_t
would be pushed into sock->inactivereqs before the uv_tcp_connect()
callback finishes.  Because the isc__nmsocket_t keeps the list of
inactive isc__nm_uvreq_t, this would cause use-after-free only when the
sock->inactivereqs is full (which could never happen because the failure
happens in connection timeout callback) or when the sock->inactivereqs
mechanism is completely removed (f.e. when running under Address or
Thread Sanitizer).

Delay isc__nm_uvreq_t deallocation to the connection callback and only
signal the connection callback should be called by shutting down the
libuv socket from the connection timeout callback.

(cherry picked from commit 3268627916)
2022-02-23 23:31:18 +01:00