Commit Graph

13781 Commits

Author SHA1 Message Date
Ondřej Surý
8f6e4dfa15 Change xfer-out timer message log level to DEBUG(1)
When max-transfer-*-out timeouts were reintroduced, the log message
about starting the timer was errorneously left as ISC_LOG_ERROR.
Change the log level of said message to ISC_LOG_DEBUG(1).
2022-03-17 21:28:29 +01:00
Ondřej Surý
ff22498849 Add couple missing braces around single-line statements
The clang-format-15 has new option InsertBraces that could add missing
branches around single line statements.  Use that to our advantage
without switching to not-yet-released LLVM version to add missing braces
in couple of places.
2022-03-17 18:27:45 +01:00
Ondřej Surý
cd52953f8a Update the isc_ht unit test to also tesh rehashing
As incremental rehashing has been added to isc_ht implementation, we
need to test whether the rehashing works.

Update the isc_ht unit test to test:

 * preinitialized hash table large enough to hold all the elements
 * smallest hash table that fully grows to hold all the elements
 * partially preinitialized hash table that grows
 * iterating while rehashing is in progress
2022-03-17 08:16:24 +01:00
Ondřej Surý
e42cb1f198 Implement incremental hash table resizing in isc_ht
Previously, an incremental hash table resizing was implemented for the
dns_rbt_t hash table implementation.  Using that as a base, also
implement the incremental hash table resizing also for isc_ht API
hashtables:

 1. During the resize, allocate the new hash table, but keep the old
    table unchanged.
 2. In each lookup, delete, or iterator operation, check both tables.
 3. Perform insertion operations only in the new table.
 4. At each insertion also move <r> elements from the old table to
    the new table.
 5. When all elements are removed from the old table, deallocate it.

To ensure that the old table is completely copied over before the new
table itself needs to be enlarged, it is necessary to increase the
size of the table by a factor of at least (<r> + 1)/<r> during resizing.

In our implementation <r> is equal to 1.

The downside of this approach is that the old table and the new table
could stay in memory for longer when there are no new insertions into
the hash table for prolonged periods of time as the incremental
rehashing happens only during the insertions.
2022-03-17 08:16:24 +01:00
Aram Sargsyan
f0f3370e14 Check if the fetch is shutting down in resume_dslookup()
The fetch can be in the shutting down state when resume_dslookup() is
trying to operate on it.

This is also a security issue, because a malicious actor can set up a
name server which delays certain queries in such a way that the fetch
will time out and shut down, which will cause named to crash.

Add a check to see if the fetch has the shutting down attribute set,
and cancel any further operations on it in such case.

A similar bug had been fixed earlier for the resume_qmin() function,
see [GL #966].
2022-03-16 22:11:49 +01:00
Mark Andrews
9fcc028f5c Skip calling find_coveringnsec if we found a DNAME
This is an optimisation as we can skip a lot of pointless work when we
know there is a DNAME there.

When we have a partial match and a DNAME above the QNAME, the closest
encloser has the same owner as the DNAME, will have the DNAME bit set
in the type map, and we wouldn't use it as we would return the
DNAME + RRSIG(DNAME) instead.

So there is no point in looking for it nor in attempting to check that
it is valid for the QNAME.
2022-03-16 22:11:49 +01:00
Mark Andrews
5c271f91e1 Only update foundname if returning DNS_R_COVERINGNSEC
'setup_delegation' depends on 'foundname' being the value returned
by 'dns_rbt_findnode' in the cache and 'find_coveringnsec' was
modifying 'foundname' when a covering NSEC was not found.
2022-03-16 22:11:49 +01:00
Ondřej Surý
bfa4b9c141 Run .closehandle_cb asynchrounosly in nmhandle_detach_cb()
When sock->closehandle_cb is set, we need to run nmhandle_detach_cb()
asynchronously to ensure correct order of multiple packets processing in
the isc__nm_process_sock_buffer().  When not run asynchronously, it
would cause:

  a) out-of-order processing of the return codes from processbuffer();

  b) stack growth because the next TCP DNS message read callback will
     be called from within the current TCP DNS message read callback.

The sock->closehandle_cb is set to isc__nm_resume_processing() for TCP
sockets which calls isc__nm_process_sock_buffer().  If the read callback
(called from isc__nm_process_sock_buffer()->processbuffer()) doesn't
attach to the nmhandle (f.e. because it wants to drop the processing or
we send the response directly via uv_try_write()), the
isc__nm_resume_processing() (via .closehandle_cb) would call
isc__nm_process_sock_buffer() recursively.

The below shortened code path shows how the stack can grow:

 1: ns__client_request(handle, ...);
 2: isc_nm_tcpdns_sequential(handle);
 3: ns_query_start(client, handle);
 4:   query_lookup(qctx);
 5:     query_send(qctcx->client);
 6:       isc__nmhandle_detach(&client->reqhandle);
 7:         nmhandle_detach_cb(&handle);
 8:           sock->closehandle_cb(sock); // isc__nm_resume_processing
 9:             isc__nm_process_sock_buffer(sock);
10:               processbuffer(sock); // isc__nm_tcpdns_processbuffer
11:                 isc_nmhandle_attach(req->handle, &handle);
12:                 isc__nm_readcb(sock, req, ISC_R_SUCCESS);
13:                   isc__nm_async_readcb(NULL, ...);
14:                     uvreq->cb.recv(...); // ns__client_request

Instead, if 'sock->closehandle_cb' is set, we need to run detach the
handle asynchroniously in 'isc__nmhandle_detach', so that on line 8 in
the code flow above does not start this recursion. This ensures the
correct order when processing multiple packets in the function
'isc__nm_process_sock_buffer()' and prevents the stack growth.

When not run asynchronously, the out-of-order processing leaves the
first TCP socket open until all requests on the stream have been
processed.

If the pipelining is disabled on the TCP via `keep-response-order`
configuration option, named would keep the first socket in lingering
CLOSE_WAIT state when the client sends an incomplete packet and then
closes the connection from the client side.
2022-03-16 22:11:49 +01:00
Mark Andrews
fe1bbba259 Look for zones deeper than the current domain or forward name
When caching glue, we need to ensure that there is no closer
source of truth for the name. If the owner name for the glue
record would be answered by a locally configured zone, do not
cache.
2022-03-16 22:11:49 +01:00
Mark Andrews
c289913e5c Check cached names for possible "forward only" clause
When caching additional and glue data *not* from a forwarder, we must
check that there is no "forward only" clause covering the owner name
that would take precedence.  Such names would normally be allowed by
baliwick rules, but a "forward only" zone introduces a new baliwick
scope.
2022-03-16 22:11:49 +01:00
Mark Andrews
7e37b5e379 Check that the forward declaration is unchanged and not overridden
If we are using a fowarder, in addition to checking that names to
be cached are subdomains of the forwarded namespace, we must also
check that there are no subsidiary forwarded namespaces which would
take precedence. To be safe, we don't cache any responses if the
forwarding configuration has changed since the query was sent.
2022-03-16 22:11:49 +01:00
Mark Andrews
5dc3b25d03 Add additional name checks when using a forwarder
When using a forwarder, check that the owner name of response
records are within the bailiwick of the forwarded name space.
2022-03-16 22:11:49 +01:00
Ondřej Surý
79b5ccbf34 Implement isc_interval_t on top of isc_time_t
Change the isc_interval_t implementation from separate data type and
separate implementation to be shim implementation on top of isc_time_t.
The distinction between isc_interval_t and isc_time_t has been kept
because they are semantically different - isc_interval_t is relative and
isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to
be freely interchangeable, f.e. this:

    isc_time_t *t1;
    isc_interval_t *interval;
    isc_time_t *t2;

    isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2);;
    isc_time_subtract(t1, interval, t2);
    isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2));

to just:

    isc_time_t *t1;
    isc_interval_t *interval;
    isc_time_t *t2;

    isc_time_subtract(t1, t2, interval);

without introducing a whole set of new functions.
2022-03-14 13:00:05 -07:00
Ondřej Surý
e6ca2a651f Refactor isc_timer_reset() use with semantic patch
Add and apply semantic patch to remove expires argument from the
isc_timer_reset() calls through the codebase.
2022-03-14 13:00:05 -07:00
Ondřej Surý
6437bcc488 Remove expires argument from isc_timer API
The isc_timer_reset() now works only with intervals for once timers.

This makes the API almost 1:1 compatible with the libuv timers making
the further refactoring possible.
2022-03-14 13:00:05 -07:00
Ondřej Surý
27850a5ad2 Change isc_timer_reset() usage to never use expires argument
There were two places where expires argument (absolute isc_time_t value)
was being used.  Both places has been converted to use relative interval
argument in preparation of simplification and refactoring of isc_timer
API.
2022-03-14 13:00:05 -07:00
Ondřej Surý
c259cecc90 Refactor isc_timer_create() to just create timer
The isc_timer_create() function was a bit conflated.  It could have been
used to create a timer and start it at the same time.  As there was a
single place where this was done before (see the previous commit for
nta.c), this was cleaned up and the isc_timer_create() function was
changed to only create new timer.
2022-03-14 13:00:05 -07:00
Ondřej Surý
514053f244 Change lib/dns/nta.c to create inactive timer and then reset it
In nta.c, it was the only place where the active timer was created
directly instead of first creating inactive timer and then starting it
with isc_timer_reset().

Change the code to create inactive timer first, so we can refactor the
isc_timer_create() function.
2022-03-14 13:00:05 -07:00
Ondřej Surý
8fbb42c49c Remove "a temporary hack, 'rndc timerpoke'"
In 2002, "a temporary hack, 'rndc timerpoke'" was added.  It's time
for it to go, so it was removed.
2022-03-14 13:00:05 -07:00
Ondřej Surý
f4751a91f7 Remove unused isc_timer_touch() function
The isc_timer_touch() was unused, just remove it.
2022-03-14 13:00:05 -07:00
Ondřej Surý
bbe1c06a8b Remove isc_timertype_limited from isc_timer API
The isc_timertype_limited timer type was never used (not even in tests).
Remove isc_timertype_limited timer type before planned refactoring.
2022-03-14 13:00:05 -07:00
Ondřej Surý
49c804f8b7 Cleanup the nmhandle attach/detach in httpd.c
In httpd.c, the send callback can directly call read callback without
calling isc_nm_resumeread().  When per-send timeout was added, this
could lead to use-after-free when shutting down the named.

Cleanup the way how we attach to .readhandle and .sendhandle, so there's
assurance that .readhandle will be always non-NULL when reading and
.sendhandle will be always non-NULL when sending.

Additionally, it was found that the implementation ignored the
"Connection: close" header and it worked only accidentally by closing
the connection after the first read from the TCP socket.  This has been
also fixed.
2022-03-11 09:57:10 +01:00
Ondřej Surý
6ddac2d56d On shutdown, reset the established TCP connections
Previously, the established TCP connections (both client and server)
would be gracefully closed waiting for the write timeout.

Don't wait for TCP connections to gracefully shutdown, but directly
reset them for faster shutdown.
2022-03-11 09:56:57 +01:00
Ondřej Surý
a761aa59e3 Change single write timer to per-send timers
Previously, there was a single per-socket write timer that would get
restarted for every new write.  This turned out to be insufficient
because the other side could keep reseting the timer, and never reading
back the responses.

Change the single write timer to per-send timer which would in turn
reset the TCP connection on the first send timeout.
2022-03-11 09:56:57 +01:00
Ondřej Surý
f251d69eba Remove usage of deprecated ATOMIC_VAR_INIT() macro
The C17 standard deprecated ATOMIC_VAR_INIT() macro (see [1]).  Follow
the suite and remove the ATOMIC_VAR_INIT() usage in favor of simple
assignment of the value as this is what all supported stdatomic.h
implementations do anyway:

  * MacOSX.plaform: #define ATOMIC_VAR_INIT(__v) {__v}
  * Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE)	(VALUE)

1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf
2022-03-08 23:55:10 +01:00
Ondřej Surý
d128656d2e Make dns_catz_get_iterator() return void
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS.  Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.

Change the function(s) to return void and remove the extra checks in
the code that uses them.
2022-03-08 14:51:55 +01:00
Ondřej Surý
8fa27365ec Make isc_ht_init() and isc_ht_iter_create() return void
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS.  Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.

Change the function(s) to return void and remove the extra checks in
the code that uses them.
2022-03-08 14:51:55 +01:00
Ondřej Surý
bbb4cdb92d Make isc_heap_create() and isc_heap_insert() return void
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS.  Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.

Change the function(s) to return void and remove the extra checks in
the code that uses them.
2022-03-08 11:19:34 +01:00
Ondřej Surý
8098a58581 Set TCP maximum segment size to minimum size of 1220
Previously the socket code would set the TCPv6 maximum segment size to
minimum value to prevent IP fragmentation for TCP.  This was not yet
implemented for the network manager.

Implement network manager functions to set and use minimum MTU socket
option and set the TCP_MAXSEG socket option for both IPv4 and IPv6 and
use those to clamp the TCP maximum segment size for TCP, TCPDNS and
TLSDNS layers in the network manager to 1220 bytes, that is 1280 (IPv6
minimum link MTU) minus 40 (IPv6 fixed header) minus 20 (TCP fixed
header)

We already rely on a similar value for UDP to prevent IP fragmentation
and it make sense to use the same value for IPv4 and IPv6 because the
modern networks are required to support IPv6 packet sizes.  If there's
need for small TCP segment values, the MTU on the interfaces needs to be
properly configured.
2022-03-08 10:27:05 +01:00
Ondřej Surý
5d34a14f22 Set minimum MTU (1280) on IPv6 sockets
The IPV6_USE_MIN_MTU socket option directs the IP layer to limit the
IPv6 packet size to the minimum required supported MTU from the base
IPv6 specification, i.e. 1280 bytes.  Many implementations of TCP
running over IPv6 neglect to check the IPV6_USE_MIN_MTU value when
performing MSS negotiation and when constructing a TCP segment despite
MSS being defined to be the MTU less the IP and TCP header sizes (60
bytes for IPv6).  This leads to oversized IPv6 packets being sent
resulting in unintended Path Maximum Transport Unit Discovery (PMTUD)
being performed and to fragmented IPv6 packets being sent.

Add and use a function to set socket option to limit the MTU on IPv6
sockets to the minimum MTU (1280) both for UDP and TCP.
2022-03-08 10:27:05 +01:00
Mark Andrews
9bcf45f4ce Check dnssec-policy key roles for validity
For each algorithm there must be a key performing the KSK and
ZSK rolls.  After reading the keys from named.conf check that
each algorithm present has both rolls.  CSK implicitly has both
rolls.
2022-03-08 13:23:14 +11:00
Aram Sargsyan
963f6a2203 Fix a function cleanup bug in dns_request_createraw
When get_dispatch() returns an error code, the dns_request_createraw()
function jumps to the `cleanup` label, which will leave a previous
attachment to the `request` pointer unattached.

Fix the issue by jumping to the `detach` label instead.
2022-03-07 11:24:09 +00:00
Ondřej Surý
f24b26188d Merge lib/dns/gen.h contents to lib/dns/gen.c
Formerly, the gen.h header contained a compatibility layer between Win32
and POSIX platforms.  Since we have already dropped the Win32 build, we
can merged gen.h into gen.c as the header file is not used elsewhere.
2022-03-04 14:13:58 +01:00
Ondřej Surý
6bd025942c Replace netievent lock-free queue with simple locked queue
The current implementation of isc_queue uses Michael-Scott lock-free
queue that in turn uses hazard pointers.  It was discovered that the way
we use the isc_queue, such complicated mechanism isn't really needed,
because most of the time, we either execute the work directly when on
nmthread (in case of UDP) or schedule the work from the matching
nmthreads.

Replace the current implementation of the isc_queue with a simple locked
ISC_LIST.  There's a slight improvement - since copying the whole list
is very lightweight - we move the queue into a new list before we start
the processing and locking just for moving the queue and not for every
single item on the list.

NOTE: There's a room for future improvements - since we don't guarantee
the order in which the netievents are processed, we could have two lists
- one unlocked that would be used when scheduling the work from the
matching thread and one locked that would be used from non-matching
thread.
2022-03-04 13:49:51 +01:00
Ondřej Surý
f3ca90a804 Add attach/detach for the dns_dispatch_send()
The order in which the netievents are processed on the network manager
loop is not guaranteed.  Therefore the recv/read callback can come
earlier than the send/write callback.

The dns_request API wasn't ready for this reordering and it was
destroying the dns_request_t object before the send callback has been
called.

Add additional attach/detach in the req_send()/req_senddone() functions
to make sure we don't destroy the dns_request_t while it's still being
references by asynchronous call.
2022-03-04 13:47:59 +01:00
Aram Sargsyan
117dac11d1 Use autoconf check for BN_GENCB_new()
BIND unconditionally uses shims for BN_GENCB_new(), BN_GENCB_free(),
and BN_GENCB_get_arg() for all LibreSSL versions and, correctly, for
OpenSSL <1.1.0 versions.

This breaks LibreSSL compilation starting with LibreSSL 3.5.0.

Use autoconf check instead to check whether the family of the functions
are available.
2022-03-02 10:48:09 +00:00
Aram Sargsyan
ef0d7177b6 Remove EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() shims
LibreSSL 3.5.0 fails to compile with these shims. We could have just
removed the LibreSSL check from the pre-processor condition, but it
seems that these shims are no longer needed because all the supported
versions of OpenSSL and LibreSSL have those functions.

According to EVP_ENCRYPTINIT(3) manual page in LibreSSL,
EVP_CIPHER_CTX_new() and EVP_CIPHER_CTX_free() first appeared in
OpenSSL 0.9.8b, and have been available since OpenBSD 4.5.
2022-03-02 10:48:09 +00:00
Evan Hunt
4ca74eee49 document zone grammar more correctly
the "zone" clause can be documented using, for instance,
`cfg_test --zonegrammar primary", which prints only
options that are valid in primary zones. this was not
the method being used when generating the named.conf
man page; instead, "zone" was documented with all possible
options, and no zone types at all.

this commit removes "zone" from the generic documentation
and adds include statements in named.conf.rst so that
correct zone grammars will be included in the man page.
2022-03-02 01:53:24 -08:00
Mark Andrews
4c356d2770 Grow the lex token buffer in one more place
when parsing key pairs, if the '=' character fell at max_token
a protective INSIST preventing buffer overrun could be triggered.
Attempt to grow the buffer immediately before the INSIST.

Also removed an unnecessary INSIST on the opening double quote
of key buffer pair.
2022-03-01 16:05:39 -08:00
Mark Andrews
b8b99603f1 Use unsigned arithmetic when shifting by 24
By default C promotes short unsigned values to signed int which
leads to undefined behaviour when the value is shifted by too much.
Force unsigned arithmetic to be perform by explicitly casting to a
unsigned type.
2022-03-01 23:36:00 +00:00
Ondřej Surý
b220fb32bd Handle TCP sockets in isc__nmsocket_reset()
The isc__nmsocket_reset() was missing a case for raw TCP sockets (used
by RNDC and DoH) which would case a assertion failure when write timeout
would be triggered.

TCP sockets are now also properly handled in isc__nmsocket_reset().
2022-02-28 02:06:03 -08:00
Mark Andrews
26f817f574 Return ISC_R_NOTIMPLEMENTED rather than ISC_R_UNEXPECTEDEND
If the keydata rdata is shorter that 16 octets it is not out private
keydata type and we have not implemented a tostruct method for it.
2022-02-25 21:06:16 -08:00
Mark Andrews
48039fa25e Do not return ISC_R_UNEXPECTEDEND
All rdata passed to dns_rdata_tostruct is supposed to be well formed,
assert if it isn't.
2022-02-25 20:57:08 -08:00
Evan Hunt
bbaade23eb mem_maybedup() can no longer fail
mem_maybedup() calls isc_mem_allocate() if an mctx is supplied,
but that can no longer fail, so now the only way mem_maybedup()
could return NULL is if it was given a NULL source address by the
caller. this commit adds a REQUIRE to prevent that scenario, and
cleans up all the calling code that previously checked for NULL
return values.

this function is mostly used in rdata tostruct() implementations, so
the documentation for dns_rdata_tostruct() has been updated to
remove 'ISC_R_NOMEMORY' as a possible return value.
2022-02-25 20:57:08 -08:00
Evan Hunt
0bde07261b remove old zone type documentation
we now document zone type as either "primary" or "secondary",
omitting the old terms (though they are still accepted).
2022-02-25 16:33:37 -08:00
Evan Hunt
0e57fc160e add a CFG_CLAUSEFLAG_NODOC flag for use with outdated terms
"masters" and "default-masters" are now flagged so they will
not be included in the named.conf man page, despite being
accepted as valid options by the parser for backward
compatibiility.
2022-02-25 16:33:30 -08:00
Ondřej Surý
ecf042991c Fix typo __SANITIZE_ADDRESS -> __SANITIZE_ADDRESS__
When checking for Address Sanitizer to disable the inactivehandles
caching, there was a typo in the macro.
2022-02-24 00:15:16 +01:00
Ondřej Surý
be339b3c83 Disable inactive uvreqs caching when compiled with sanitizers
When isc__nm_uvreq_t gets deactivated, it could be just put onto array
stack to be reused later to save some initialization time.
Unfortunately, this might hide some use-after-free errors.

Disable the inactive uvreqs caching when compiled with Address or
Thread Sanitizer.
2022-02-24 00:15:16 +01:00
Ondřej Surý
92cce1da65 Disable inactive handles caching when compiled with sanitizers
When isc_nmhandle_t gets deactivated, it could be just put onto array
stack to be reused later to safe some initialization time.
Unfortunately, this might hide some use-after-free errors.

Disable the inactive handles caching when compiled with Address or
Thread Sanitizer.
2022-02-23 23:21:29 +01:00
Ondřej Surý
e2555a306f Remove active handles tracking from isc__nmsocket_t
The isc__nmsocket_t has locked array of isc_nmhandle_t that's not used
for anything.  The isc__nmhandle_get() adds the isc_nmhandle_t to the
locked array (and resized if necessary) and removed when
isc_nmhandle_put() finally destroys the handle.  That's all it does, so
it serves no useful purpose.

Remove the .ah_handles, .ah_size, and .ah_frees members of the
isc__nmsocket_t and .ah_pos member of the isc_nmhandle_t struct.
2022-02-23 22:54:47 +01:00