Commit Graph

4629 Commits

Author SHA1 Message Date
Ondřej Surý
5abbcdadaf Use thread_local EVP_MD in isc_iterated_hash()
Cherry-pick small fixup commit from 9.18/9.16 branches needed for
thread-safety.  This fixup commit is not needed for 9.19+ because of
reworked application setup, but it decouples isc_iterated_hash and
isc_md units and keeps all the branches in sync.
2023-01-18 23:33:43 +01:00
Ondřej Surý
f3753d591f Use thread_local EVP_MD_CTX in isc_iterated_hash()
As this code is on hot path (NSEC3) this introduces an additional
optimization of the EVP_MD API - instead of calling EVP_MD_CTX_new() on
every call to isc_iterated_hash(), we create two thread_local objects
for each thread - a basectx and mdctx, initialize basectx once and then
use EVP_MD_CTX_copy_ex() to flip the initialized state into mdctx.  This
saves us couple more valuable microseconds from the isc_iterated_hash()
call.
2023-01-18 19:36:21 +01:00
Ondřej Surý
25db8d0103 Use OpenSSL 1.x SHA_CTX API in isc_iterated_hash()
If the OpenSSL SHA1_{Init,Update,Final} API is still available, use it.
The API has been deprecated in OpenSSL 3.0, but it is significantly
faster than EVP_MD API, so make an exception here and keep using it
until we can't.
2023-01-18 19:36:17 +01:00
Ondřej Surý
36654df732 Use OpenSSL EVP_MD API directly in isc_iterated_hash()
Instead of going through another layer, use OpenSSL EVP_MD API directly
in the isc_iterated_hash() implementation.  This shaves off couple of
microseconds in the microbenchmark.
2023-01-18 18:32:57 +01:00
Ondřej Surý
e6bfb8e456 Avoid implicit algorithm fetch for OpenSSL EVP_MD family
The implicit algorithm fetch causes a lock contention and significant
slowdown for small input buffers.  For more details, see:

https://github.com/openssl/openssl/issues/19612

Instead of using EVP_DigestInit_ex() initialize empty MD_CTX objects for
each algorithm and use EVP_MD_CTX_copy_ex() to initialize MD_CTX from a
static copy.  Additionally avoid implicit algorithm fetching by using
EVP_MD_fetch() for OpenSSL 3.0.
2023-01-18 18:32:57 +01:00
Tony Finch
290899661d Fix a typo in the NS_PER_ macros
Milliseconds and microseconds were swapped.
2023-01-16 20:33:57 +00:00
Ondřej Surý
d07c4a98da Prefer the pthread_barrier implementation over uv_barrier
Prefer the pthread_barrier implementation on platforms where it is
available over uv_barrier implementation.  This also solves the problem
with thread sanitizer builds on macOS that doesn't have pthread barrier.
2023-01-11 09:51:02 +01:00
Ondřej Surý
d06602f036 Get rid of locking during UDP and TCP listen
We already have a synchronization mechanism when starting the UDP and
TCP listener children - barriers.  Change how we start the first-born
child (tid == 0), so we don't have to race for sock->parent->result and
sock->parent->fd.
2023-01-11 07:17:46 +01:00
Ondřej Surý
10f884a5b8 Remove unused isc_astack unit
The isc_astack unit is now unused, so just remove it.
2023-01-10 20:31:24 +01:00
Ondřej Surý
359faf2ff7 Convert isc_astack usage in netmgr to mempool and ISC_LIST
Change the per-socket inactive uvreq cache (implemented as isc_astack)
to per-worker memory pool.

Change the per-socket inactive nmhandle cache (implemented as
isc_astack) to unlocked per-socket ISC_LIST.
2023-01-10 20:31:24 +01:00
Ondřej Surý
5bbba0d1a1 Simplify tracing the reference counting in isc_netmgr
Always track the per-worker sockets in the .active_sockets field in the
isc__networker_t struct and always track the per-socket handles in the
.active_handles field ian the isc_nmsocket_t struct.
2023-01-10 19:57:39 +01:00
Mark Andrews
349c23dbb7 Accept 'in=NULL' with 'inlen=0' in isc_{half}siphash24
Arthimetic on NULL pointers is undefined.  Avoid arithmetic operations
when 'in' is NULL and require 'in' to be non-NULL if 'inlen' is not zero.
2023-01-10 17:52:56 +11:00
Evan Hunt
916ea26ead remove nonfunctional DSCP implementation
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.

To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
2023-01-09 12:15:21 -08:00
Evan Hunt
9c577e10c3 use separate barriers for "stop" and "listen" operations
On some platforms, when a synchronizing barrier is cleared, one
thread can progress while other threads are still in the process
of releasing the barrier. If a barrier is reused by the progressing
thread during this window, it can cause a deadlock. This can occur if,
for example, we stop listening immediately after we start, because the
stop and listen functions both use socket->barrier.  This has been
addressed by using separate barrier objects for stop and listen.
2023-01-07 16:30:21 -08:00
Ondřej Surý
6613f89c62 Enhance the isc_loop unit to allow reference count tracking
Use ISC_REFCOUNT_TRACE_{IMPL,DECL} to allow better isc_loop reference
tracking - use `#define ISC_LOOP_TRACE 1` in <isc/loop.h> to enable.
2023-01-05 12:33:15 +00:00
Ondřej Surý
6553927d27 Enforce strong thread-affinity on StreamDNS sockets
Add a check that the isc__nm_streamdns_read(), isc__nm_streamdns_send(),
and isc__nm_streamdns_close() are being called from the matching thread.
2023-01-05 09:43:09 +01:00
Mark Andrews
096b280b1c Do not pass NULL pointer to memmove - undefined behaviour
Check if 'old_base' is NULL and if so skip calling memmove.
2023-01-03 14:40:30 +11:00
Artem Boldariev
fbf1546fb8 TLS: use isc_buffer_t for send requests
This commit replaces ad-hoc code for send requests buffer management
within TLS with the one based on isc_buffer_t.

Previous version of the code was trying to use pre-allocated small
buffers to avoid extra allocations. The code would allocate a larger
dynamic buffer when needed. There is no need to have ad-hoc code for
this, as isc_buffer_t now provides this functionality internally.

Additionally to the above, the old version of the code lacked any
logic to reuse the dynamically allocated buffers. Now, as we do not
manage memory buffers, but isc_buffer_t objects, we can implement this
strategy. It can be in particular helpful for longer lasting
connections, as in this case the buffer will adjust itself to the size
of the messages being transferred. That is, it is in particular useful
for XoT, as Stream DNS happen to order send requests in such a way
that the send request will get reused.
2022-12-30 19:56:25 +02:00
Artem Boldariev
7962e7f575 tlsctx_client_session_cache_new() -> tlsctx_client_session_create()
Additionally to renaming, it changes the function definition so that
it accepts a pointer to pointer instead of returning a pointer to the
new object.

It is mostly done to make it in line with other functions in the
module.
2022-12-23 11:10:11 +02:00
Artem Boldariev
f102df96b8 Rename isc_tlsctx_cache_new() -> isc_tlsctx_cache_create()
Additionally to renaming, it changes the function definition so that
it accepts a pointer to pointer instead of returning a pointer to the
new object.

It is mostly done to make it in line with other functions in the
module.
2022-12-23 11:10:11 +02:00
Ondřej Surý
6cb6373b5a Convert Stream DNS to use isc_buffer API
Drop the whole isc_dnsbuffer API and use new improved isc_buffer API
that provides same functionality as the isc_dnsbuffer unit now.
2022-12-20 22:13:53 +02:00
Artem Boldariev
0a7e83feea StreamDNS: Use isc__nm_senddns() to send DNS messages
This commit modifies the Stream DNS message so that it uses the
optimised code path (isc__nm_senddns()) for sending DNS messages over
the underlying transport. This way we avoid allocating any
intermediate memory buffers needed to render a DNS message with its
length pre-pended ahead of the contents (TCP DNS message format).
2022-12-20 22:13:53 +02:00
Artem Boldariev
cb6f3dc3c8 TLS: isc__nm_senddns() support
This commit adds support for isc_nm_senddns() to the generic TLS code.
2022-12-20 22:13:53 +02:00
Artem Boldariev
ad876a65af Add isc__nm_senddns()
The new internal function works in the same way as isc_nm_send()
except that it sends a DNS message size ahead of the DNS message
data (the format used in DNS over TCP).

The intention is to provide a fast path for sending DNS messages over
streams protocols - that is, without allocating any intermediate
memory buffers.
2022-12-20 22:13:53 +02:00
Artem Boldariev
56732ac2a0 TLS: try to avoid allocating send request objects
This commit optimises TLS send request object allocation to enable
send request object reuse, somewhat reducing pressure on the memory
manager. It is especially helpful in the case when Stream DNS uses the
TLS implementation as the transport.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4277eeeb9c Remove TLS DNS transport (and parts common with TCP DNS)
This commit removes TLS DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
e5649710d3 Remove TCP DNS transport
This commit removes TCP DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4524bf4083 Make isc_nm_tlssocket non-optional
This commit unties generic TLS code (isc_nm_tlssocket) from DoH, so
that it will be available regardless of the fact if BIND was built
with DNS over HTTP support or not.
2022-12-20 22:13:53 +02:00
Artem Boldariev
efe4267044 DoH: use isc_nmhandle_set_tcp_nodelay()
This commit replaces ad-hoc code for disabling Nagle's algorithm with
a call to isc_nmhandle_set_tcp_nodelay().
2022-12-20 22:13:53 +02:00
Artem Boldariev
e89575ddce StreamDNS: opportunistically disable Nagle's algorithm
This commit ensures that Stream DNS code attempts to disable Nagle's
algorithm regardless of underlying stream transport (TCP or TLS), as
we are not interested in trading latency for throughout when dealing
with DNS messages.
2022-12-20 22:13:53 +02:00
Artem Boldariev
05cfb27b80 Disable Nagle's algorithm for TLS connections by default
This commit ensures that Nagle's algorithm is disabled by default for
TLS connections on best effort basis, just like other networking
software (e.g. NGINX) does, as, in the case of TLS, we are not
interested in trading latency for throughput, rather vice versa.

We attempt to disable it as early as we can, right after TCP
connections establishment, as an attempt to speed up handshake
handling.
2022-12-20 22:13:53 +02:00
Artem Boldariev
371b02f37a TCP: make it possible to set Nagle's algorithms state via handle
This commit adds ability to turn the Nagle's algorithm on or off via
connections handle. It adds the isc_nmhandle_set_tcp_nodelay()
function as the public interface for this functionality.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4606384345 Extend isc__nm_socket_tcp_nodelay() to accept value
This makes it possible to both enable and disable Nagle's algorithm
for a TCP socket descriptor, before the change it was possible only to
disable it.
2022-12-20 22:13:53 +02:00
Artem Boldariev
f395cd4b3e Add isc_nm_streamdnssocket (aka Stream DNS)
This commit adds an initial implementation of isc_nm_streamdnssocket
transport: a unified transport for DNS over stream protocols messages,
which is capable of replacing both TCP DNS and TLS DNS
transports. Currently, the interface it provides is a unified set of
interfaces provided by both of the transports it attempts to replace.

The transport is built around "isc_dnsbuffer_t" and
"isc_dnsstream_assembler_t" objects and attempts to minimise both the
number of memory allocations during network transfers as well as
memory usage.
2022-12-20 22:13:51 +02:00
Artem Boldariev
338cf3e467 Add isc_dnsstream_assembler_t implementation
This commit adds the implementation for an "isc_dnsstream_assembler_t"
object. The object is built on top of "isc_dnsbuffer_t" and is
intended to encapsulate the state machine used for handling DNS
messages received in the format used for messages transmitted over
TCP.

The idea is that the object accepts the input data received from a
socket, tries to assemble DNS messages from the incoming data and
calls the callback which contains the status of the incoming data as
well as a pointer to the memory region referencing the data of the
assembled message. It is capable of assembling DNS messages no matter
how torn apart they are when sent over network.

The following statuses might be passed to the callback:

* ISC_R_SUCCESS - a message has been successfully assembled;
* ISC_R_NOMORE  - not enough data has been processed to assemble a
message;
* ISC_R_RANGE - there was an attempt to process a zero-sized DNS
message (someone attempts to send us junk data).

One could say that the object replaces the implementation of
"isc__nm_*_processbuffer()" functions used by the old TCP DNS and TLS
DNS transports with a better defined state machine completely
decoupled from the networking code itself.

Such a design makes it trivial to write unit tests for it, leading to
better verification of its correctness.

Another important difference is directly related to the fact that it
is built on top of "isc_dnsbuffer_t", which tries to manage memory in
a smart way. In particular:

* It tries to use a static buffer for smaller messages, reducing
pressure on the memory manager (hot path);
* When allocating dynamic memory for larger messages, it tries to
allocate memory conservatively (generic path).

These characteristics is a significant upgrade over the older logic
where a 64KB(+2 bytes) buffer was allocated from dynamic memory
regardless of the fact if we need a buffer this large or not. That is,
lesser memory usage is expected in a generic case for DNS transports
built on top of "isc_dnsstream_assembler_t."
2022-12-20 21:24:44 +02:00
Artem Boldariev
cbb758abd4 Add isc_dnsbuffer_t implementation
This commit adds "isc_dnsbuffer_t" object implementation, a thin
wrapper on top of "isc_buffer_t" which has the following
characteristics:

* provides interface specifically atuned for handling/generating DNS
messages, especially in the format used for DNS messages over TCP;
* avoids allocating dynamic memory when handling small DNS messages,
while transparently switching to using dynamic memory when handling
larger messages. This approach significantly reduces pressure on the
memory allocator, as most of the DNS messages are small.
2022-12-20 21:24:44 +02:00
Artem Boldariev
c0c59b55ab TLS: add an internal function isc__nmhandle_get_selected_alpn()
The added function provides the interface for getting an ALPN tag
negotiated during TLS connection establishment.

The new function can be used by higher level transports.
2022-12-20 21:24:44 +02:00
Artem Boldariev
15e626f1ca TLS: add manual read timer control mode
This commit adds manual read timer control mode, similarly to TCP.
This way the read timer can be controlled manually using:

* isc__nmsocket_timer_start();
* isc__nmsocket_timer_stop();
* isc__nmsocket_timer_restart().

The change is required to make it possible to implement more
sophisticated read timer control policies in DNS transports, built on
top of TLS.
2022-12-20 21:24:44 +02:00
Artem Boldariev
9aabd55725 TCP: add manual read timer control mode
This commit adds a manual read timer control mode to the TCP
code (adding isc__nmhandle_set_manual_timer() as the interface to it).

Manual read timer control mode suppresses read timer restarting the
read timer when receiving any amount of data. This way the read timer
can be controlled manually using:

* isc__nmsocket_timer_start();
* isc__nmsocket_timer_stop();
* isc__nmsocket_timer_restart().

The change is required to make it possible to implement more
sophisticated read timer control policies in DNS transports, built on
top of TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
f4760358f8 TLS: expose the ability to (re)start and stop underlying read timer
This commit adds implementation of isc__nmsocket_timer_restart() and
isc__nmsocket_timer_stop() for generic TLS code in order to make its
interface more compatible with that of TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
f18a9b3743 TLS: add isc__nmsocket_timer_running() support
This commit adds isc__nmsocket_timer_running() support to the generic
TLS code in order to make it more compatible with TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
c0808532e1 TLS: isc_nm_bad_request() and isc__nmsocket_reset() support
This commit adds implementations of isc_nm_bad_request() and
isc__nmsocket_reset() to the generic TLS stream code in order to make
it more compatible with TCP code.
2022-12-20 21:24:44 +02:00
Artem Boldariev
94e650ce89 Use 'restrict' and 'const' for 'isc_buffer_t'
The purpose of this commit is to aid compiler in generating better
code when working with `isc_buffer_t` objects by using restricted
pointers (and, to a lesser extent, 'const' modifier for read-only
arguments).

This way we, basically, instruct the compiler that the members of
structured passed by pointers into the functions can be treated as
local variables in the scope of a function. That should reduce the
number of load/store operations emitted by compilers when accessing
objects (e.g. 'isc_buffer_t') via pointers.
2022-12-20 21:01:27 +02:00
Ondřej Surý
460afcda18 Add isc_buffer_trycompact() function needed for StreamDNS
Add isc_buffer_trycompact() that's an optimization; it will compact the
buffer only when the remaining length is smaller than used length.
2022-12-20 19:13:48 +01:00
Ondřej Surý
e6062ee3ae Add isc_buffer_setmctx() and isc_buffer_clearmctx() function
Add two extra functions needed by StreamDNS:

1. isc_buffer_setmctx() sets the buffer internal memory context, so we
   can use isc_buffer_reserve() on the buffer.  For this, we also need
   to track whether the .base was dynamically allocated or not.  This
   needs to be called after isc_buffer_init() and before first
   isc_buffer_reserve() call.

2. isc_buffer_clearmctx() clears the buffer internal memory context, and
   frees any dynamically allocated buffer.  This needs to be called
   after the last isc_buffer_reserve() call and before calling the
   isc_buffer_invalidate()
2022-12-20 19:13:48 +01:00
Ondřej Surý
8e3a86f6dd Make the isc_buffer unit header-only
The isc_buffer is often used in the hot-path, so make it header-only
implementation.
2022-12-20 19:13:48 +01:00
Ondřej Surý
2ddea1e41c Add a static pre-allocated buffer to isc_buffer_t
When the buffer is allocated via isc_buffer_allocate() and the size is
smaller or equal ISC_BUFFER_STATIC_SIZE (currently 512 bytes), the
buffer will be allocated as a flexible array member in the buffer
structure itself instead of allocating it on the heap.  This should help
when the buffer is used on the hot-path with small allocations.
2022-12-20 19:13:48 +01:00
Ondřej Surý
6bd2b34180 Enable auto-reallocation for all isc_buffer_allocate() buffers
When isc_buffer_t buffer is created with isc_buffer_allocate() assume
that we want it to always auto-reallocate instead of having an extra
call to enable auto-reallocation.
2022-12-20 19:13:48 +01:00
Ondřej Surý
135ec7a0f0 Remove single use isc_buffer_putdecint() function
The isc_buffer_putdecint() could be easily replaced with
isc_buffer_printf() with just a small overhead of calling vsnprintf()
twice instead once.  This is not on a hot-path (dns_catz unit), so we
can ignore the overhead and instead have less single-use code in favor
of using reusable more generic function.
2022-12-20 19:13:48 +01:00
Ondřej Surý
2a94123d5b Refactor the isc_buffer_{get,put}uintN, add isc_buffer_peekuintN
The Stream DNS implementation needs a peek methods that read the value
from the buffer, but it doesn't advance the current position.  Add
isc_buffer_peekuintX methods, refactor the isc_buffer_{get,put}uintN
methods to modern integer types, and move the isc_buffer_getuintN to the
header as static inline functions.
2022-12-20 19:13:48 +01:00