Commit Graph

14387 Commits

Author SHA1 Message Date
Evan Hunt
9c577e10c3 use separate barriers for "stop" and "listen" operations
On some platforms, when a synchronizing barrier is cleared, one
thread can progress while other threads are still in the process
of releasing the barrier. If a barrier is reused by the progressing
thread during this window, it can cause a deadlock. This can occur if,
for example, we stop listening immediately after we start, because the
stop and listen functions both use socket->barrier.  This has been
addressed by using separate barrier objects for stop and listen.
2023-01-07 16:30:21 -08:00
Ondřej Surý
44135371df Deduplicate DNS_RBTDB_STRONG_RWLOCK_CHECK macros
There were couple of redundant macros on both sides of
DNS_RBTDB_STRONG_RWLOCK_CHECK #ifdef block.  Use a single set of
macros, but disable the extra REQUIRES if the #define is not set.
2023-01-06 08:56:31 +01:00
Ondřej Surý
d693c2e7a0 Extend expire_header() to check node lock type
Extend the expire_header() to accept the node lock type as one of the
arguments and check whether the the node lock is always write locked +
fix that bug.

While doing that, it was found that expire_header() invocation in
rdataset_expire() passes `false` as a type of tree lock instead of
`isc_rwlocktype_none`.

(Un)fortunately, both values mapped to 0, so no harm was done, but it
has been fixed nevertheless.
2023-01-06 08:43:16 +01:00
Ondřej Surý
20670ee22d Replace repetetive _TRYUPGRADE() with _FORCEUPGRADE() macros
There was a repetetive pattern:

    if (NODE_TRYUPGRADE(&nodelock->lock, nlocktypep) != ISC_R_SUCCESS)
    {
        NODE_UNLOCK(&nodelock->lock, nlocktypep);
        NODE_WRLOCK(&nodelock->lock, nlocktypep);
    }

Instead of doing that over again, introduce new NODE_FORCEUPGRADE()
and TREE_FORCEUPGRADE() that does exactly this code, and simplify
the aforementioned code with just:

    NODE_FORCEUPGRADE(&nodelock->lock, nlocktypep);
2023-01-05 22:18:40 +01:00
Ondřej Surý
6613f89c62 Enhance the isc_loop unit to allow reference count tracking
Use ISC_REFCOUNT_TRACE_{IMPL,DECL} to allow better isc_loop reference
tracking - use `#define ISC_LOOP_TRACE 1` in <isc/loop.h> to enable.
2023-01-05 12:33:15 +00:00
Ondřej Surý
6553927d27 Enforce strong thread-affinity on StreamDNS sockets
Add a check that the isc__nm_streamdns_read(), isc__nm_streamdns_send(),
and isc__nm_streamdns_close() are being called from the matching thread.
2023-01-05 09:43:09 +01:00
Ondřej Surý
1a999353cd Pin the dns_dispatch to threads when reusing
Previously, dns_dispatch_gettcp() could pick a TCP connection created by
different thread - this breaks our contractual promise to DNS dispatch
by using the TCP connection on a different thread than it was created.
Add .tid member to the dns_dispatch_t struct and skip the dispatches
from other threads when looking up a TCP dispatch that we can reuse in
dns_request.

NOTE: This is going to be properly refactored, but this change could be
also backported to 9.18 for better stability and thread-affinity.
2023-01-05 09:11:42 +01:00
Evan Hunt
24a81fefe6 Fix control flow issues in zone.c
Since dns_master_dump() can no longer return DNS_R_CONTINUE,
the error recovery code in synchronous calls to zone_dump() had
branches that could no longer be reached. This has been cleaned
up by using a boolean variable to determine whether the write
is asynchronous rather than depending on the result code.
2023-01-04 14:14:42 -08:00
Matthijs Mekking
abd8c1cad0 Fix CID 432259: Sizeof not portable (remote.c)
The following report suggests that the 'size' parameter in the two
calls to 'isc_mem_get()' should be set to the
'count * sizeof(*tlsnames)' and 'count * sizeof(*keynames)'
respectively.

/lib/dns/remote.c: 117 in dns_remote_init()
111		}
112	} else {
113		remote->keynames = NULL;
114	}
115
116	if (tlsnames != NULL) {

>>>     CID 432259:    (SIZEOF_MISMATCH)
>>>     Passing argument "count * 8UL /* sizeof (tlsnames) */" to
>>>	function "isc__mem_get" and then casting the return value to
>>>	"dns_name_t **" is suspicious.  In this particular case
>>>	"sizeof (dns_name_t **)" happens to be equal to
>>>	"sizeof (dns_name_t *)", but this is not a portable assumption.

117	remote->tlsnames = isc_mem_get(mctx, count * sizeof(tlsnames));
118		for (i = 0; i < count; i++) {
119			remote->tlsnames[i] = NULL;
120		}
121		for (i = 0; i < count; i++) {
122			if (tlsnames[i] != NULL) {

/lib/dns/remote.c: 99 in dns_remote_init()
93		memmove(remote->dscps, dscp, count * sizeof(isc_dscp_t));
94	} else {
95		remote->dscps = NULL;
96	}
97
98	if (keynames != NULL) {

>>>     CID 432259:    (SIZEOF_MISMATCH)
>>>     Passing argument "count * 8UL /* sizeof (keynames) */" to
>>>	function "isc__mem_get" and then casting the return value to
>>>	"dns_name_t **" is suspicious.  In this particular case
>>>	"sizeof (dns_name_t **)" happens to be equal to "sizeof
>>>	(dns_name_t *)", but this is not a portable assumption.

99	 remote->keynames = isc_mem_get(mctx, count * sizeof(keynames));
100	 for (i = 0; i < count; i++) {
101		remote->keynames[i] = NULL;
102	 }
103	 for (i = 0; i < count; i++) {
104		if (keynames[i] != NULL) {
2023-01-03 16:47:57 +01:00
Aram Sargsyan
ef4f15d2d1 Fix an error path bug in rpz.c:update_nodes()
When dns_db_createiterator() fails, 'updbit' should not be destroyed
for obvious reasons, i.e. it is NULL.
2023-01-03 14:21:17 +00:00
Aram Sargsyan
d36728e42f Fix a shutdown and error path bugs in rpz.c:update_nodes()
When shutting down, or when dns_dbiterator_current() fails, 'node'
shouldn't be detached, because it is NULL at that point.
2023-01-03 14:21:17 +00:00
Aram Sargsyan
975d16230b Fix a shutdown bug in update_rpz_cb()
When shutting down, the cleanup path should not try to destroy
'newnodes', because it is NULL at that point.

Introduce another label for the "shuttingdown" scenario.
2023-01-03 13:27:43 +00:00
Aram Sargsyan
da7c448988 Don't expire an ADB entry without holding the entries lock
The clean_namehooks() function does't hold the 'adb->entries_lock'
lock, so calling maybe_expire_entry() is not thread-safe.

Instead of adding a lock/unlock, leave the expiration to later,
e.g. by the get_attached_and_locked_entry() function.

Also fix a couple of comment typos.
2023-01-03 08:21:52 +00:00
Mark Andrews
096b280b1c Do not pass NULL pointer to memmove - undefined behaviour
Check if 'old_base' is NULL and if so skip calling memmove.
2023-01-03 14:40:30 +11:00
Artem Boldariev
fbf1546fb8 TLS: use isc_buffer_t for send requests
This commit replaces ad-hoc code for send requests buffer management
within TLS with the one based on isc_buffer_t.

Previous version of the code was trying to use pre-allocated small
buffers to avoid extra allocations. The code would allocate a larger
dynamic buffer when needed. There is no need to have ad-hoc code for
this, as isc_buffer_t now provides this functionality internally.

Additionally to the above, the old version of the code lacked any
logic to reuse the dynamically allocated buffers. Now, as we do not
manage memory buffers, but isc_buffer_t objects, we can implement this
strategy. It can be in particular helpful for longer lasting
connections, as in this case the buffer will adjust itself to the size
of the messages being transferred. That is, it is in particular useful
for XoT, as Stream DNS happen to order send requests in such a way
that the send request will get reused.
2022-12-30 19:56:25 +02:00
Aram Sargsyan
41ca9d419e Don't pass a NULL pointer to isc_sockaddr_format()
The 'localaddr' pointer can be NULL, which causes an assertion failure.

Use '&disp->local' instead when printing a debug log message.
2022-12-28 12:10:09 +00:00
Matthijs Mekking
d8e98d4bba Remove unused dns_remote_t functions
Now that setting alternate transfer sources is removed, the functions
to check whether all addresses are considered good have become obsolete.
2022-12-23 15:17:54 +01:00
Matthijs Mekking
5954ae6458 Remove setting alternate transfer source
Remove parsing the configuration options 'alt-transfer-source',
'alt-transfer-source-v6', and 'use-alt-transfer-source', and remove
the corresponding code that implements the feature.
2022-12-23 14:39:59 +01:00
Matthijs Mekking
c4bffb3e64 Use 'source[-v6]' for transfer, notify, checkds
Use the configured 'source' and 'source-v6' when initiating a zone
transfer, sending a notify, or when checking for the DS. Remove the
special code for using alternate transfer sources.

Update some system tests to use the new configuration and make sure
the tests still work.
2022-12-23 13:36:50 +00:00
Matthijs Mekking
a92b9e40ce Small comment change in remote.h header file
The documentation for 'dns_remote_addr()' was incorrect (copy paste
error).
2022-12-23 13:36:50 +00:00
Matthijs Mekking
17e16c7a34 Parse and store new 'source[-v6]' option
Parse the new 'source' and 'source-v6' options and store them with
the corresponding remote servers (parental-agents, primaries, ...).
2022-12-23 13:36:50 +00:00
Matthijs Mekking
ad248f2261 Add new 'source[-v6]' option for remote servers
Add a new way to configure the preferred source address when talking to
remote servers such as primaries and parental-agents. This will
eventually deprecate options such as 'parental-source',
'parental-source-v6', 'transfer-source', etc.

Example of the new configuration:

    parental-agents "parents" port 5353 \
        source 10.10.10.10 port 5354 dscp 54 \
        source-v6 2001:db8::10 port 5355 dscp 55 {
	10.10.10.11;
	2001:db8::11;
    };
2022-12-23 13:36:50 +00:00
Matthijs Mekking
bf1dd57242 Refactor zone.c, use dns_remote_t structure
Use the new dns_remote_t structure for remote server communication to
primaries, parental agents, etc.
2022-12-23 13:36:50 +00:00
Matthijs Mekking
0300295944 Add new files for remote server communication
The dns_remote_t structure is intended to replace the variables in
the structure that deals with remote server communication to primaries,
parental agents, forwarders, etc.
2022-12-23 13:36:50 +00:00
Aram Sargsyan
53afe1f978 Fix an ADB quota management error in the resolver
Normally, when a 'resquery_t' object is created in fctx_query(),
we call dns_adb_beginudpfetch() (which increases the ADB quota)
only if it's a UDP query. Then, in fctx_cancelquery(), we call
dns_adb_endudpfetch() to decreases back the ADB quota, again only
if it's a UDP query.

The problem is that a UDP query can become a TCP query, preventing
the quota from adjusting back in fctx_cancelquery() later.

Call dns_adb_beginudpfetch() also when switching the query type
from UDP to TCP.
2022-12-23 09:45:20 +00:00
Aram Sargsyan
c7ba26c3d6 INSIST that active quota is 0 in destroy_adbentry()
This should catch ADB quota management errors in the resolver.
2022-12-23 09:45:20 +00:00
Artem Boldariev
7962e7f575 tlsctx_client_session_cache_new() -> tlsctx_client_session_create()
Additionally to renaming, it changes the function definition so that
it accepts a pointer to pointer instead of returning a pointer to the
new object.

It is mostly done to make it in line with other functions in the
module.
2022-12-23 11:10:11 +02:00
Artem Boldariev
f102df96b8 Rename isc_tlsctx_cache_new() -> isc_tlsctx_cache_create()
Additionally to renaming, it changes the function definition so that
it accepts a pointer to pointer instead of returning a pointer to the
new object.

It is mostly done to make it in line with other functions in the
module.
2022-12-23 11:10:11 +02:00
Ondřej Surý
9dd8deaf01 Call the connected dns_dispatch callback asynchronously
The dns_request code is very sensitive about calling the connected and
deadlocks when the timing is "right" in several places.  Move the call
to the connected callback to the (udp|tcp)_connected() functions, so
they are called asynchronously instead of directly from
the (udp|tcp)_dispentry_cancel() functions.
2022-12-21 12:13:35 +01:00
Ondřej Surý
3fac4ca57e Ignore TCP dispatches that have zero references
The TCP dispatches are removed from the dispatchmgr->list in the
dispatch_destroy() and there's a brief period of time where
dns_dispatch_gettcp() can find a dispatch in connected state that's
being destroyed.

Set the dispatch state to DNS_DISPATCHSTATE_NONE in the TCP connection
callback if there are no responses waiting, and ignore TCP dispatches
with zero references in dns_dispatch_gettcp().
2022-12-21 12:13:35 +01:00
Ondřej Surý
90cd14f620 Fix assignment vs comparison typo in tcp_connected()
In tcp_connected() a typo has turned a DbC check into an assignment
breaking the state machine and making the dns_dispatch_gettcp() try to
attach to dispatch in process of destruction.
2022-12-21 12:13:20 +01:00
Ondřej Surý
7310e2b424 Don't remove dispatches in CANCELED state from the list
In dns_dispatch_gettcp(), we can't remove canceled dispatches from the
mgr->list because ISC_LIST_NEXT() would fail in the next iteration.
2022-12-21 12:13:20 +01:00
Ondřej Surý
6d22d03208 Ignore TCP dispatches in DNS_DISPATCHSTATE_NONE state
The TCP dispatches in DNS_DISPATCHSTATE_NONE could be either very
fresh or those could be dispatches that failed connecting to the
destination.  Ignore them when trying to connect to an existing
TCP dispatch via dns_dispatch_gettcp().
2022-12-21 12:13:20 +01:00
Ondřej Surý
6cb6373b5a Convert Stream DNS to use isc_buffer API
Drop the whole isc_dnsbuffer API and use new improved isc_buffer API
that provides same functionality as the isc_dnsbuffer unit now.
2022-12-20 22:13:53 +02:00
Artem Boldariev
0a7e83feea StreamDNS: Use isc__nm_senddns() to send DNS messages
This commit modifies the Stream DNS message so that it uses the
optimised code path (isc__nm_senddns()) for sending DNS messages over
the underlying transport. This way we avoid allocating any
intermediate memory buffers needed to render a DNS message with its
length pre-pended ahead of the contents (TCP DNS message format).
2022-12-20 22:13:53 +02:00
Artem Boldariev
cb6f3dc3c8 TLS: isc__nm_senddns() support
This commit adds support for isc_nm_senddns() to the generic TLS code.
2022-12-20 22:13:53 +02:00
Artem Boldariev
ad876a65af Add isc__nm_senddns()
The new internal function works in the same way as isc_nm_send()
except that it sends a DNS message size ahead of the DNS message
data (the format used in DNS over TCP).

The intention is to provide a fast path for sending DNS messages over
streams protocols - that is, without allocating any intermediate
memory buffers.
2022-12-20 22:13:53 +02:00
Artem Boldariev
56732ac2a0 TLS: try to avoid allocating send request objects
This commit optimises TLS send request object allocation to enable
send request object reuse, somewhat reducing pressure on the memory
manager. It is especially helpful in the case when Stream DNS uses the
TLS implementation as the transport.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4277eeeb9c Remove TLS DNS transport (and parts common with TCP DNS)
This commit removes TLS DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
e5649710d3 Remove TCP DNS transport
This commit removes TCP DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4524bf4083 Make isc_nm_tlssocket non-optional
This commit unties generic TLS code (isc_nm_tlssocket) from DoH, so
that it will be available regardless of the fact if BIND was built
with DNS over HTTP support or not.
2022-12-20 22:13:53 +02:00
Artem Boldariev
efe4267044 DoH: use isc_nmhandle_set_tcp_nodelay()
This commit replaces ad-hoc code for disabling Nagle's algorithm with
a call to isc_nmhandle_set_tcp_nodelay().
2022-12-20 22:13:53 +02:00
Artem Boldariev
e89575ddce StreamDNS: opportunistically disable Nagle's algorithm
This commit ensures that Stream DNS code attempts to disable Nagle's
algorithm regardless of underlying stream transport (TCP or TLS), as
we are not interested in trading latency for throughout when dealing
with DNS messages.
2022-12-20 22:13:53 +02:00
Artem Boldariev
05cfb27b80 Disable Nagle's algorithm for TLS connections by default
This commit ensures that Nagle's algorithm is disabled by default for
TLS connections on best effort basis, just like other networking
software (e.g. NGINX) does, as, in the case of TLS, we are not
interested in trading latency for throughput, rather vice versa.

We attempt to disable it as early as we can, right after TCP
connections establishment, as an attempt to speed up handshake
handling.
2022-12-20 22:13:53 +02:00
Artem Boldariev
371b02f37a TCP: make it possible to set Nagle's algorithms state via handle
This commit adds ability to turn the Nagle's algorithm on or off via
connections handle. It adds the isc_nmhandle_set_tcp_nodelay()
function as the public interface for this functionality.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4606384345 Extend isc__nm_socket_tcp_nodelay() to accept value
This makes it possible to both enable and disable Nagle's algorithm
for a TCP socket descriptor, before the change it was possible only to
disable it.
2022-12-20 22:13:53 +02:00
Artem Boldariev
cce52fa4a2 BIND: use Stream DNS for DNS over TCP connections
This commit makes BIND use the new Stream DNS transport for DNS over
TCP.
2022-12-20 22:13:53 +02:00
Artem Boldariev
03e33a014c BIND: use Stream DNS for DNS over TLS connections
This commit makes BIND use the new Stream DNS transport for DNS over
TLS.
2022-12-20 22:13:52 +02:00
Artem Boldariev
f395cd4b3e Add isc_nm_streamdnssocket (aka Stream DNS)
This commit adds an initial implementation of isc_nm_streamdnssocket
transport: a unified transport for DNS over stream protocols messages,
which is capable of replacing both TCP DNS and TLS DNS
transports. Currently, the interface it provides is a unified set of
interfaces provided by both of the transports it attempts to replace.

The transport is built around "isc_dnsbuffer_t" and
"isc_dnsstream_assembler_t" objects and attempts to minimise both the
number of memory allocations during network transfers as well as
memory usage.
2022-12-20 22:13:51 +02:00
Artem Boldariev
338cf3e467 Add isc_dnsstream_assembler_t implementation
This commit adds the implementation for an "isc_dnsstream_assembler_t"
object. The object is built on top of "isc_dnsbuffer_t" and is
intended to encapsulate the state machine used for handling DNS
messages received in the format used for messages transmitted over
TCP.

The idea is that the object accepts the input data received from a
socket, tries to assemble DNS messages from the incoming data and
calls the callback which contains the status of the incoming data as
well as a pointer to the memory region referencing the data of the
assembled message. It is capable of assembling DNS messages no matter
how torn apart they are when sent over network.

The following statuses might be passed to the callback:

* ISC_R_SUCCESS - a message has been successfully assembled;
* ISC_R_NOMORE  - not enough data has been processed to assemble a
message;
* ISC_R_RANGE - there was an attempt to process a zero-sized DNS
message (someone attempts to send us junk data).

One could say that the object replaces the implementation of
"isc__nm_*_processbuffer()" functions used by the old TCP DNS and TLS
DNS transports with a better defined state machine completely
decoupled from the networking code itself.

Such a design makes it trivial to write unit tests for it, leading to
better verification of its correctness.

Another important difference is directly related to the fact that it
is built on top of "isc_dnsbuffer_t", which tries to manage memory in
a smart way. In particular:

* It tries to use a static buffer for smaller messages, reducing
pressure on the memory manager (hot path);
* When allocating dynamic memory for larger messages, it tries to
allocate memory conservatively (generic path).

These characteristics is a significant upgrade over the older logic
where a 64KB(+2 bytes) buffer was allocated from dynamic memory
regardless of the fact if we need a buffer this large or not. That is,
lesser memory usage is expected in a generic case for DNS transports
built on top of "isc_dnsstream_assembler_t."
2022-12-20 21:24:44 +02:00