Due to the added attach/detach tracing in the netmgr-v2 code, the
libns tests needs to be adjusted as the real function names have
changed from isc_nmhandle_* to isc__nmhandle_*.
The isc/util.h header redefine the DbC checks (REQUIRE, INSIST, ...) to
be cmocka "fake" assertions. However that means that cmocka.h needs to
be included after UNIT_TESTING is defined but before isc/util.h is
included. Because isc/util.h is included in most of the project headers
this means that the sequence MUST be:
#define UNIT_TESTING
#include <cmocka.h>
#include <isc/_anything_.h>
See !2204 for other header requirements for including cmocka.h.
(cherry picked from commit 0ba697fe8c)
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
After turning the users callbacks to be asynchronous, there was a
visible performance drop. This commit prevents the unnecessary
allocations while keeping the code paths same for both asynchronous and
synchronous calls.
The same change was done to the isc__nm_udp_{read,send} as those two
functions are in the hot path.
(cherry picked from commit d6d2fbe0e9)
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested. It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.
NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.
The changes that are included in this commit are listed here
(extensively, but not exclusively):
* The netmgr_test unit test was split into individual tests (udp_test,
tcp_test, tcpdns_test and newly added tcp_quota_test)
* The udp_test and tcp_test has been extended to allow programatic
failures from the libuv API. Unfortunately, we can't use cmocka
mock() and will_return(), so we emulate the behaviour with #define and
including the netmgr/{udp,tcp}.c source file directly.
* The netievents that we put on the nm queue have variable number of
members, out of these the isc_nmsocket_t and isc_nmhandle_t always
needs to be attached before enqueueing the netievent_<foo> and
detached after we have called the isc_nm_async_<foo> to ensure that
the socket (handle) doesn't disappear between scheduling the event and
actually executing the event.
* Cancelling the in-flight TCP connection using libuv requires to call
uv_close() on the original uv_tcp_t handle which just breaks too many
assumptions we have in the netmgr code. Instead of using uv_timer for
TCP connection timeouts, we use platform specific socket option.
* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}
When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
waiting for socket to either end up with error (that path was fine) or
to be listening or connected using condition variable and mutex.
Several things could happen:
0. everything is ok
1. the waiting thread would miss the SIGNAL() - because the enqueued
event would be processed faster than we could start WAIT()ing.
In case the operation would end up with error, it would be ok, as
the error variable would be unchanged.
2. the waiting thread miss the sock->{connected,listening} = `true`
would be set to `false` in the tcp_{listen,connect}close_cb() as
the connection would be so short lived that the socket would be
closed before we could even start WAIT()ing
* The tcpdns has been converted to using libuv directly. Previously,
the tcpdns protocol used tcp protocol from netmgr, this proved to be
very complicated to understand, fix and make changes to. The new
tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
Closes: #2194, #2283, #2318, #2266, #2034, #1920
* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
pass accepted TCP sockets between netthreads, but instead (similar to
UDP) uses per netthread uv_loop listener. This greatly reduces the
complexity as the socket is always run in the associated nm and uv
loops, and we are also not touching the libuv internals.
There's an unfortunate side effect though, the new code requires
support for load-balanced sockets from the operating system for both
UDP and TCP (see #2137). If the operating system doesn't support the
load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
on FreeBSD 12+), the number of netthreads is limited to 1.
* The netmgr has now two debugging #ifdefs:
1. Already existing NETMGR_TRACE prints any dangling nmsockets and
nmhandles before triggering assertion failure. This options would
reduce performance when enabled, but in theory, it could be enabled
on low-performance systems.
2. New NETMGR_TRACE_VERBOSE option has been added that enables
extensive netmgr logging that allows the software engineer to
precisely track any attach/detach operations on the nmsockets and
nmhandles. This is not suitable for any kind of production
machine, only for debugging.
* The tlsdns netmgr protocol has been split from the tcpdns and it still
uses the old method of stacking the netmgr boxes on top of each other.
We will have to refactor the tlsdns netmgr protocol to use the same
approach - build the stack using only libuv and openssl.
* Limit but not assert the tcp buffer size in tcp_alloc_cb
Closes: #2061
(cherry picked from commit 634bdfb16d)
When calling the high level netmgr functions, the callback would be
sometimes called synchronously if we catch the failure directly, or
asynchronously if it happens later. The synchronous call to the
callback could create deadlocks as the caller would not expect the
failed callback to be executed directly.
(cherry picked from commit a49d88568f)
This commit adds couple of additional safeguards against running
sends/reads on inactive sockets. The changes was modeled after the
changes we made to netmgr/tcpdns.c
(cherry picked from commit fa424225af)
Parse the configuration of tls objects into SSL_CTX* objects. Listen on
DoT if 'tls' option is setup in listen-on directive. Use DoT/DoH ports
for DoT/DoH.
(cherry picked from commit 38b78f59a0)
- peer address was not being reported correctly by "dig +tls"
- the protocol used is now reported in the dig output: UDP, TCP, or TLS.
(cherry picked from commit 8886569e9d)
Add server-side TLS support to netmgr - that includes moving some of the
isc_nm_ functions from tcp.c to a wrapper in netmgr.c calling a proper
tcp or tls function, and a new isc_nm_listentls() function.
Add DoT support to tcpdns - isc_nm_listentlsdns().
(cherry picked from commit b2ee0e9dc3)
there were two failures during observed in testing, both occurring
when 'rndc halt' was run rather than 'rndc stop' - the latter dumps
zone contents to disk and presumably introduced enough delay to
prevent the races:
- a failure when the zone was shut down and called dns_xfrin_detach()
before the xfrin had finished connecting; the connect timeout
terminated without detaching its handle
- a failure when the tcpdns socket timer fired after the outerhandle
had already been cleared.
this commit incidentally addresses a failure observed in mutexatomic
due to a variable having been initialized incorrectly.
socket() call can return an error - e.g. EMFILE, so we need to handle
this nicely and not crash.
Additionally wrap the socket() call inside a platform independent helper
function as the Socket data type on Windows is unsigned integer:
> This means, for example, that checking for errors when the socket and
> accept functions return should not be done by comparing the return
> value with –1, or seeing if the value is negative (both common and
> legal approaches in UNIX). Instead, an application should use the
> manifest constant INVALID_SOCKET as defined in the Winsock2.h header
> file.
(cherry picked from commit 8af7f81d6c)
Because we use result earlier for setting the loadbalancing on the
socket, we could be left with a ISC_R_NOTIMPLEMENTED value stored in the
variable and when the UDP connection would succeed, we would
errorneously return this value instead of ISC_R_SUCCESS.
(cherry picked from commit 050258bda4)
use isc_nmhandle_settimeout() to set read/recv timeouts, and get rid
of connect_timeout() and related functions in dighost.c.
(cherry picked from commit ea2b04c361)
this function sets the read timeout for the socket associated
with a netmgr handle and, if the timer is running, resets it.
for TCPDNS sockets it also sets the read timeout and resets the
timer on the outer TCP socket.
(cherry picked from commit 4be63c5b00)
When we are operating on the tcpdns socket, we need to double check
whether the socket or its outerhandle or its listener or its mgr is
still active and when not, bail out early.
(cherry picked from commit c14c1fdd2c)
If dnslisten_readcb gets a read callback it needs to verify that the
outer socket wasn't closed in the meantime, and issue a CANCELED callback
if it was.
(cherry picked from commit 3ab3d90de0)
When binding a TCP socket, if bind() fails with EADDRINUSE,
try again with REUSEPORT/REUSEADDR (or the equivalent options).
(cherry picked from commit 26a3a22895)
There were more races that could happen while connecting to a
socket while closing or shutting down the same socket. This
commit introduces a .closing flag to guard the socket from
being closed twice.
(cherry picked from commit ed3ab63f74)
There was a data race where a new event could be scheduled after
isc__nm_async_shutdown() had cleaned up all the dangling UDP/TCP
sockets from the loop.
(cherry picked from commit 6cfadf9db0)
- more logical code flow.
- propagate errors back to the caller.
- add a 'reading' flag and call the callback from failed_read_cb()
only when it the socket was actively reading.
(cherry picked from commit 5fcd52209a)
- don't bother closing sockets that are already closing.
- UDP read timeout timer was not stopped after reading.
- improve handling of TCP connection failures.
(cherry picked from commit cdccac4993)
- isc_nm_tcpdnsconnect() sets up up an outgoing TCP DNS connection.
- isc_nm_tcpconnect(), _udpconnect() and _tcpdnsconnect() now take a
timeout argument to ensure connections time out and are correctly
cleaned up on failure.
- isc_nm_read() now supports UDP; it reads a single datagram and then
stops until the next time it's called.
- isc_nm_cancelread() now runs asynchronously to prevent assertion
failure if reading is interrupted by a non-network thread (e.g.
a timeout).
- isc_nm_cancelread() can now apply to UDP sockets.
- added shim code to support UDP connection in versions of libuv
prior to 1.27, when uv_udp_connect() was added
all these functions will be used to support outgoing queries in dig,
xfrin, dispatch, etc.
(cherry picked from commit 5dcdc00b93)
Since the queries sent towards root and TLD servers are now included in
the count (as a result of the fix for CVE-2020-8616),
"max-recursion-queries" has a higher chance of being exceeded by
non-attack queries. Increase its default value from 75 to 100.
(cherry picked from commit ab0bf49203)
Fallback to TCP when we have already seen a DNS COOKIE response
from the given address and don't have one in this UDP response. This
could be a server that has turned off DNS COOKIE support, a
misconfigured anycast server with partial DNS COOKIE support, or a
spoofed response. Falling back to TCP is the correct behaviour in
all 3 cases.
(cherry picked from commit 0e3b1f5a25)
Return value of dns_db_getservestalerefresh() and
dns_db_getservestalettl() functions were previously unhandled.
This commit purposefully ignore those return values since there is
no side effect if those results are != ISC_R_SUCCESS, it also supress
Coverity warnings.
Add unit test to ensure the right NSEC3PARAM event is scheduled in
'dns_zone_setnsec3param()'. To avoid scheduling and managing actual
tasks, split up the 'dns_zone_setnsec3param()' function in two parts:
1. 'dns__zone_lookup_nsec3param()' that will check if the requested
NSEC3 parameters already exist, and if a new salt needs to be
generated.
2. The actual scheduling of the new NSEC3PARAM event (if needed).
(cherry picked from commit 64db30942d)
When generating a new salt, compare it with the previous NSEC3
paremeters to ensure the new parameters are different from the
previous ones.
This moves the salt generation call from 'bin/named/*.s' to
'lib/dns/zone.c'. When setting new NSEC3 parameters, you can set a new
function parameter 'resalt' to enforce a new salt to be generated. A
new salt will also be generated if 'salt' is set to NULL.
Logging salt with zone context can now be done with 'dnssec_log',
removing the need for 'dns_nsec3_log_salt'.
(cherry picked from commit 6b5d7357df)
Upon request from Mark, change the configuration of salt to salt
length.
Introduce a new function 'dns_zone_checknsec3aram' that can be used
upon reconfiguration to check if the existing NSEC3 parameters are
in sync with the configuration. If a salt is used that matches the
configured salt length, don't change the NSEC3 parameters.
(cherry picked from commit 6f97bb6b1f)
NSEC3 is not backwards compatible with key algorithms that existed
before the RFC 5155 specification was published.
(cherry picked from commit 00c5dabea3)
Check 'nsec3param' configuration for the number of iterations. The
maximum number of iterations that are allowed are based on the key
size (see https://tools.ietf.org/html/rfc5155#section-10.3).
Check 'nsec3param' configuration for correct salt. If the string is
not "-" or hex-based, this is a bad salt.
(cherry picked from commit 7039c5f805)
Implement support for NSEC3 in dnssec-policy. Store the configuration
in kasp objects. When configuring a zone, call 'dns_zone_setnsec3param'
to queue an nsec3param event. This will ensure that any previous
chains will be removed and a chain according to the dnssec-policy is
created.
Add tests for dnssec-policy zones that uses the new 'nsec3param'
option, as well as changing to new values, changing to NSEC, and
changing from NSEC.
(cherry picked from commit 114af58ee2)
We will be using this function also on reconfig, so it should have
a wider availability than just bin/named/server.
(cherry picked from commit 84a4273074)
Make sure pointer checks in unit tests use cmocka assertion macros
dedicated for use with pointers instead of those dedicated for use with
integers or booleans.
(cherry picked from commit f440600126)
cppcheck 2.2 reports the following false positive:
lib/isc/tests/quota_test.c:71:21: error: Array 'quotas[101]' accessed at index 110, which is out of bounds. [arrayIndexOutOfBounds]
isc_quota_t *quotas[110];
^
The above is not even an array access, so this report is obviously
caused by a cppcheck bug. Yet, it seems to be triggered by the presence
of the add_quota() macro, which should really be a function. Convert
the add_quota() macro to a function in order to make the code cleaner
and to prevent the above cppcheck 2.2 false positive from being
triggered.
(cherry picked from commit ea54a932d2)