35741 Commits

Author SHA1 Message Date
Ondřej Surý
405559e4f3 Merge branch '3166-delay-isc__nm_uvreq_t-deallocation-v9_18' into 'v9_18'
Delay isc__nm_uvreq_t deallocation to connection callback

See merge request isc-projects/bind9!5892
2022-02-23 22:47:48 +00:00
Ondřej Surý
afe8a60f98 Delay isc__nm_uvreq_t deallocation to connection callback
When the TCP, TCPDNS or TLSDNS connection times out, the isc__nm_uvreq_t
would be pushed into sock->inactivereqs before the uv_tcp_connect()
callback finishes.  Because the isc__nmsocket_t keeps the list of
inactive isc__nm_uvreq_t, this would cause use-after-free only when the
sock->inactivereqs is full (which could never happen because the failure
happens in connection timeout callback) or when the sock->inactivereqs
mechanism is completely removed (e.g. when running under Address or
Thread Sanitizer).

Delay isc__nm_uvreq_t deallocation to the connection callback and only
signal the connection callback should be called by shutting down the
libuv socket from the connection timeout callback.

(cherry picked from commit 3268627916)
2022-02-23 23:31:18 +01:00
Ondřej Surý
dc750e090b Merge branch 'ondrej-cleanup-nm_destroy-dequeue-v9_18' into 'v9_18'
Properly free up enqueued netievents in nm_destroy()

See merge request isc-projects/bind9!5889
2022-02-23 22:25:42 +00:00
Ondřej Surý
74948421a6 Properly free up enqueued netievents in nm_destroy()
When the isc_netmgr is being destroyed, the normal and priority queues
should be dequeued and netievents properly freed.  This wasn't the case.

(cherry picked from commit 88418c3372)
2022-02-23 22:53:31 +01:00
Michał Kępień
9862265cc9 Merge branch '3147-fix-more-ns_statscounter_recursclients-underflows-v9_18' into 'v9_18'
[v9_18] Fix more ns_statscounter_recursclients underflows

See merge request isc-projects/bind9!5880
2022-02-23 14:02:57 +00:00
Michał Kępień
87d5dff4a3 Add CHANGES entry for GL #3147
(cherry picked from commit 600f9010d2)
2022-02-23 14:43:09 +01:00
Michał Kępień
d1f27a336a Add release note for GL #3147
(cherry picked from commit 1c462a63ec)
2022-02-23 14:43:09 +01:00
Michał Kępień
5929411f90 Fix more ns_statscounter_recursclients underflows
Commit aab691d512 did not fix all possible
scenarios in which the ns_statscounter_recursclients counter underflows.
The solution implemented therein can be ineffective e.g. when CNAME
chaining happens with prefetching enabled.

Here is an example recursive resolution scenario in which the
ns_statscounter_recursclients counter can underflow with the current
logic in effect:

 1. Query processing starts, the answer is not found in the cache, so
    recursion is started.  The NS_CLIENTATTR_RECURSING attribute is set.
    ns_statscounter_recursclients is incremented (Δ = +1).

 2. Recursion completes, returning a CNAME.  client->recursionquota is
    non-NULL, so the NS_CLIENTATTR_RECURSING attribute remains set.
    ns_statscounter_recursclients is decremented (Δ = 0).

 3. Query processing restarts.

 4. The current QNAME (the target of the CNAME from step 2) is found in
    the cache, with a TTL low enough to trigger a prefetch.

 5. query_prefetch() attaches to client->recursionquota.
    ns_statscounter_recursclients is not incremented because
    query_prefetch() does not do that (Δ = 0).

 6. Query processing restarts.

 7. The current QNAME (the target of the CNAME from step 4) is not found
    in the cache, so recursion is started.  client->recursionquota is
    already attached to (since step 5) and the NS_CLIENTATTR_RECURSING
    attribute is set (since step 1), so ns_statscounter_recursclients is
    not incremented (Δ = 0).

 8. The prefetch from step 5 completes.  client->recursionquota is
    detached from in prefetch_done().  ns_statscounter_recursclients is
    not decremented because prefetch_done() does not do that (Δ = 0).

 9. Recursion for the current QNAME completes.  client->recursionquota
    is already detached from, i.e. set to NULL (since step 8), and the
    NS_CLIENTATTR_RECURSING attribute is set (since step 1), so
    ns_statscounter_recursclients is decremented (Δ = -1).

Another possible scenario is that after step 7, recursion for the target
of the CNAME from step 4 completes before the prefetch for the CNAME
itself.  fetch_callback() then notices that client->recursionquota is
non-NULL and decrements ns_statscounter_recursclients, even though
client->recursionquota was attached to by query_prefetch() and therefore
not accompanied by an incrementation of ns_statscounter_recursclients.
The net result is also an underflow.

Instead of trying to properly handle all possible orderings of events
set into motion by normal recursion and prefetch-triggered recursion,
adjust ns_statscounter_recursclients whenever the recursive clients
quota is successfully attached to or detached from.  Remove the
NS_CLIENTATTR_RECURSING attribute altogether as its only purpose is made
obsolete by this change.

(cherry picked from commit f7482b68b9)
2022-02-23 14:43:09 +01:00
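
The approach described in commit 5929411f90 above (adjusting the counter
whenever the recursive clients quota is attached to or detached from) can
be sketched roughly as follows.  This is a minimal illustration, not the
actual BIND 9 code: the demo_* types and functions and the plain integer
counter are hypothetical stand-ins for ns_client_t, the quota object and
ns_statscounter_recursclients.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real BIND 9 types. */
typedef struct {
	int used; /* clients currently attached */
	int max;  /* hard limit */
} demo_quota_t;

typedef struct {
	demo_quota_t *recursionquota; /* NULL when not attached */
} demo_client_t;

/* Stands in for ns_statscounter_recursclients. */
static int recursclients = 0;

/* Attach to the quota; bump the counter only when a new attach succeeds. */
static bool
demo_recursionquota_attach(demo_client_t *client, demo_quota_t *quota) {
	assert(client != NULL && quota != NULL);
	if (client->recursionquota != NULL) {
		return true; /* already attached, nothing new to count */
	}
	if (quota->used >= quota->max) {
		return false; /* quota exhausted, counter untouched */
	}
	quota->used++;
	client->recursionquota = quota;
	recursclients++;
	return true;
}

/* Detach from the quota; drop the counter only if we were attached. */
static void
demo_recursionquota_detach(demo_client_t *client) {
	assert(client != NULL);
	if (client->recursionquota == NULL) {
		return; /* not attached, so there is nothing to undo */
	}
	assert(client->recursionquota->used > 0);
	client->recursionquota->used--;
	client->recursionquota = NULL;
	recursclients--;
}
```

Because every increment is paired with a successful attach and every
decrement with an actual detach, the order in which normal recursion and
a prefetch finish no longer matters in this sketch, and no separate
"recursing" flag is needed.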
Petr Špaček
e5cc4be4b4 Merge branch 'pspacek/windows-bat-removal-v9_18' into 'v9_18'
Remove leftover .bat file [v9_18]

See merge request isc-projects/bind9!5876
2022-02-22 15:09:37 +00:00
Petr Špaček
414cbdbee3 Remove last .bat file from the source tree
This fixes an omission in !5739, "Remove leftover test code for Windows".

(cherry picked from commit 653db956f0)
2022-02-22 16:05:29 +01:00
Matthijs Mekking
2c972005f6 Merge branch '3164-fix-parental-agents-documentation-v9_18' into 'v9_18'
Fix typo in DNSSEC guide parental-agents example

See merge request isc-projects/bind9!5873
2022-02-22 13:47:02 +00:00
Matthijs Mekking
7012680a11 Fix typo in DNSSEC guide parental-agents example
The example will not load because of the typo, the comma should be a
semicolon.

(cherry picked from commit fd5e39cc76)
2022-02-22 14:06:59 +01:00
Michał Kępień
01074e26f2 Merge branch 'michal/handle-fctx-in-FCTXTRACE-macro-stubs-v9_18' into 'v9_18'
[v9_18] Add "UNUSED(fctx);" to FCTXTRACE*() macro stubs

See merge request isc-projects/bind9!5871
2022-02-21 10:09:04 +00:00
Michał Kępień
08b2c1be44 Add "UNUSED(fctx);" to FCTXTRACE*() macro stubs
Commit 21ae6bb1b2 removed most uses of the
'fctx' variable from the rctx_dispfail() function: it is now only needed
by the FCTXTRACE3() macro.  However, when --enable-querytrace is not in
effect, that macro evaluates to a list of UNUSED() macros that does not
include "UNUSED(fctx);".  This triggers the following compilation
warning when building without --enable-querytrace:

    resolver.c: In function 'rctx_dispfail':
    resolver.c:7888:21: warning: unused variable 'fctx' [-Wunused-variable]
     7888 |         fetchctx_t *fctx = rctx->fctx;
          |                     ^~~~

Fix by adding "UNUSED(fctx);" lines to all FCTXTRACE*() macros.  This is
safe to do because all of those macros use the 'fctx' local variable, so
there is no danger of introducing new errors caused by use of undeclared
identifiers.

(cherry picked from commit b645e28167)
2022-02-21 11:06:28 +01:00
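
The fix described in commit 08b2c1be44 above can be illustrated with a
small sketch.  The macro bodies, the DEMO_QUERYTRACE switch (standing in
for --enable-querytrace) and the demo_* types are assumptions made for
illustration; the real FCTXTRACE*() definitions in resolver.c may differ.

```c
#include <assert.h>
#include <stdio.h>

#define UNUSED(x) (void)(x)

#ifdef DEMO_QUERYTRACE /* hypothetical stand-in for --enable-querytrace */
#define FCTXTRACE3(m1, m2, res)                                        \
	fprintf(stderr, "fctx %p: %s %s (%d)\n", (void *)fctx, (m1), \
		(m2), (res))
#else
/* The stub still mentions 'fctx', so callers get no unused warning. */
#define FCTXTRACE3(m1, m2, res) \
	do {                    \
		UNUSED(fctx);   \
		UNUSED(m1);     \
		UNUSED(m2);     \
		UNUSED(res);    \
	} while (0)
#endif

/* Hypothetical, heavily reduced versions of the resolver contexts. */
typedef struct {
	int id;
} demo_fetchctx_t;

typedef struct {
	demo_fetchctx_t *fctx;
	int result;
} demo_respctx_t;

static int
demo_dispfail(demo_respctx_t *rctx) {
	assert(rctx != NULL);
	demo_fetchctx_t *fctx = rctx->fctx; /* only the trace macro uses this */
	FCTXTRACE3("dispatcher failure", "result", rctx->result);
	return rctx->result;
}
```

With the UNUSED(fctx) line removed from the stub, building this sketch
without DEMO_QUERYTRACE and with -Wunused-variable should produce the
same kind of warning as the one quoted in the commit message.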
Evan Hunt
92fc6c2221 Merge branch '3141-remove-the-artificial-stream-clients-limit-v9_18' into 'v9_18'
Remove the limit on the number of simultaneous TCP queries

See merge request isc-projects/bind9!5865
2022-02-18 02:36:35 +00:00
Ondřej Surý
bf21c4de6a Add CHANGES and release note for [GL #3141]
(cherry picked from commit 2bcf5a5315)
2022-02-17 16:57:34 -08:00
Ondřej Surý
780a89012d Remove the limit on the number of simultaneous TCP queries
There was an artificial limit of 23 on the number of simultaneous
pipelined queries in a single TCP connection.  The new network
manager is capable of handling "unlimited" (limited only by the TCP
read buffer size) queries, similar to the "unlimited" handling of DNS
queries received over UDP.

Don't limit the number of TCP queries that we can process within a
single TCP read callback.

(cherry picked from commit 4f5b4662b6)
2022-02-17 16:57:34 -08:00
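
To illustrate what "limited only by the TCP read buffer size" means in
commit 780a89012d above, here is a minimal sketch of walking a TCP read
buffer and picking out every complete DNS message in it.  DNS over TCP
prefixes each message with a two-byte, big-endian length (RFC 1035,
section 4.2.2).  The function name and signature are hypothetical; this
is not the network manager's actual read callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the complete DNS messages in a TCP read buffer.  '*consumed'
 * receives the number of bytes covered by complete messages; anything
 * after that is a partial message to keep for the next read.
 */
static size_t
demo_count_messages(const uint8_t *buf, size_t len, size_t *consumed) {
	size_t count = 0;
	size_t off = 0;

	assert(buf != NULL || len == 0);
	assert(consumed != NULL);

	/* No artificial cap: loop until no whole message is left. */
	while (len - off >= 2) {
		size_t msglen = ((size_t)buf[off] << 8) | buf[off + 1];
		if (len - off - 2 < msglen) {
			break; /* message body not fully received yet */
		}
		off += 2 + msglen;
		count++;
	}

	*consumed = off;
	return count;
}
```

A caller would hand each complete message to query processing and keep
the trailing len - consumed bytes until more data arrives.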
Ondřej Surý
d96ac73bd4 Merge branch '1897-fix-max-transfer-timeouts-v9_18' into 'v9_18'
Reimplement the max-transfer-time-out and max-transfer-idle-out (v9.18)

See merge request isc-projects/bind9!5862
2022-02-17 22:00:20 +00:00
Ondřej Surý
c625e791af Merge branch 'ondrej/v9_18-is-not-a-main-branch' into 'v9_18'
Disable main branch checking for v9_18 branch

Closes #3120 and #1918

See merge request isc-projects/bind9!5864
2022-02-17 21:29:36 +00:00
Ondřej Surý
f66edb7ee9 Add CHANGES and release note for [GL #1897]
(cherry picked from commit 987ad32fac)
2022-02-17 22:29:29 +01:00
Ondřej Surý
9157fcdec6 Add XFR max-transfer-time-out and max-transfer-idle-out system tests
Extend the timeouts system test to ensure that the maximum outgoing
transfer time (max-transfer-time-out) and maximum outgoing transfer idle
time (max-transfer-idle-out) work as expected.  This is done by
lowering the limits to 5/1 minutes and testing that the connection has
been dropped while sleeping between the individual XFR messages.

(cherry picked from commit 8fed1b6461)
2022-02-17 22:29:29 +01:00
Ondřej Surý
0ccc14fae9 Reimplement the max-transfer-time-out and max-transfer-idle-out
While refactoring libns to use the new network manager, the
max-transfer-*-out options were not implemented and were left
non-operational.

Reimplement the max-transfer-idle-out functionality using the write
timer and max-transfer-time-out using the new isc_nm_timer API.

(cherry picked from commit 8643bbab84)
2022-02-17 22:29:29 +01:00
Ondřej Surý
8f39c9a8d7 Remove unused client->shutdown and client->shutdown_arg
While refactoring lib/ns/xfrout.c, it was discovered that the .shutdown
and .shutdown_arg members of the ns_client_t structure are unused.

Remove the unused members and the associated code that was using them
in ns_xfrout.

(cherry picked from commit 037549c405)
2022-02-17 22:29:29 +01:00
Ondřej Surý
8b2ae8cc84 Add network manager based timer API
This commit adds an API for creating arbitrary timers associated
with network manager handles.

(cherry picked from commit 3c7b04d015)
2022-02-17 22:29:29 +01:00
Ondřej Surý
7d12f734f1 Disable main branch checking for v9_18 branch
The util/check-changes script has two modes of operation - more relaxed
release branch mode and strict development branch mode.  When we forked
the v9_18 branch, the stricter mode stayed enabled.

Disable the strict CHANGES file checking suitable only for development
branch.
2022-02-17 22:24:51 +01:00
Ondřej Surý
aec1ce9f52 Merge branch '3149-drop-TCP-connection-when-garbage-is-received-v9_18' into 'v9_18'
Reset the TCP connection when garbage is received

See merge request isc-projects/bind9!5861
2022-02-17 20:24:00 +00:00
Ondřej Surý
c5f4887ee8 Add CHANGES and release note for [GL #3149]
(cherry picked from commit 9f1c439335)
2022-02-17 21:02:02 +01:00
Ondřej Surý
8a66d6d58d Add TCP garbage system test
Test whether the TCP connection gets reset when garbage is sent instead
of a DNS message.

I'm only happy when it rains
Pour some misery down on me
- Garbage

(cherry picked from commit ebfdb50ac7)
2022-02-17 21:02:02 +01:00
Ondřej Surý
2514f41ade Reset the TCP connection when garbage is received
When an invalid DNS message was received, there was a handling mechanism
for DoH that would be called to return a proper HTTP response.

Reuse this mechanism and reset the TCP connection when the client is
blackholed, the DNS message is completely bogus, or the ns_client
receives a response instead of a query.

(cherry picked from commit 4716c56ebb)
2022-02-17 21:02:02 +01:00
Ondřej Surý
d7bcb0b5b7 Merge branch '3133-tcp-error-handling-v9_18' into 'v9_18'
correct TCP error handling in dispatch and resolver

See merge request isc-projects/bind9!5857
2022-02-17 15:50:26 +00:00
Evan Hunt
4a448d09ee Add CHANGES note for [GL #3133]
(cherry picked from commit 1b25b76921)
2022-02-17 16:03:39 +01:00
Evan Hunt
21ae6bb1b2 correct TCP error handling in dispatch and resolver
- certain TCP result codes, including ISC_R_EOF and
  ISC_R_CONNECTIONRESET, were being mapped to ISC_R_SHUTTINGDOWN
  before calling the response handler in tcp_recv_cancelall().
  the result codes should be passed through to the response handler
  without being changed.

- the response handlers, resquery_response() and req_response(), had
  code to return immediately if encountering ISC_R_EOF, but this is
  not the correct behavior; that should only happen in the case of
  ISC_R_CANCELED when it was the caller that canceled the operation

- ISC_R_CONNECTIONRESET was not being caught in rctx_dispfail().

- removed code in rctx_dispfail() to retry queries without EDNS
  when receiving ISC_R_EOF; this is now treated the same as any
  other connection failure.

(cherry picked from commit b6d40b3c4e)
2022-02-17 16:03:39 +01:00
Ondřej Surý
4d912e0dec Merge branch '3132-add-send-timeout-v9_18' into 'v9_18'
Add TCP, TCPDNS and TLSDNS write timer

See merge request isc-projects/bind9!5855
2022-02-17 09:07:25 +00:00
Ondřej Surý
6a0e82b379 Add CHANGES and release note for [GL #3132]
(cherry picked from commit 0c35bda762)
2022-02-17 09:47:43 +01:00
Ondřej Surý
a0bc051782 Update writetimeout to be T_IDLE in netmgr_test.c
Use the isc_nmhandle_setwritetimeout() function in the netmgr unit test
to allow more time for writing and reading the responses, because some
of the intervals used in the unit tests are really small, leaving
little room for any delays.

(cherry picked from commit ee359d6ffa)
2022-02-17 09:47:43 +01:00
Ondřej Surý
da34d1d69c Add isc_nmhandle_setwritetimeout() function
In some situations (unit test and forthcoming XFR timeouts MR), we need
to modify the write timeout independently of the read timeout.  Add an
isc_nmhandle_setwritetimeout() function that could be called before
isc_nm_send() to specify a custom write timeout interval.

(cherry picked from commit a89d9e0fa6)
2022-02-17 09:47:43 +01:00
Ondřej Surý
531406c2b1 Add TCP write timeout system test
Extend the timeouts system test with a check that sends a burst of
queries for a large TXT record and never reads any responses back,
filling up the server's TCP write buffer.  The test should work with the
default wmem_max value on Linux (208k).

(cherry picked from commit b735182ae0)
2022-02-17 09:47:43 +01:00
Ondřej Surý
b5265eedfb Add TCP, TCPDNS and TLSDNS write timer
When the outgoing TCP write buffers are full because the other party is
not reading the data, uv_write() could wait indefinitely on the
uv_loop and never call the callback.  Add a new write timer that uses
the `tcp-idle-timeout` value to interrupt the TCP connection when we are
not able to send data for a defined period of time.

(cherry picked from commit 408b362169)
2022-02-17 09:47:43 +01:00
Ondřej Surý
e262aff29b Add uv_tcp_close_reset compat
The uv_tcp_close_reset() function was added in libuv 1.32.0 and since we
support older libuv releases, we have to add a shim uv_tcp_close_reset()
implementation loosely based on libuv.

(cherry picked from commit cd3b58622c)
2022-02-17 09:47:43 +01:00
Ondřej Surý
a532533aab Rename sock->timer to sock->read_timer
Before adding the write timer, we have to rename the generic sock->timer
to sock->read_timer.  We don't touch the function names to limit the
impact of the refactoring.

(cherry picked from commit 45a73c113f)
2022-02-17 09:47:43 +01:00
Ondřej Surý
091284936b Merge branch '3157-blackhole-request-v9_18' into 'v9_18'
negative match on the 'blackhole' ACL could be treated as positive

See merge request isc-projects/bind9!5854
2022-02-17 08:46:56 +00:00
Evan Hunt
839a17186e CHANGES and release note for [GL #3157]
(cherry picked from commit 04361b0ad5)
2022-02-16 22:20:25 -08:00
Evan Hunt
da029f10ba negative 'blackhole' ACL match could be treated as positive
There was a bug in the checking of the "blackhole" ACL in
dns_request_create*(), causing an address to be treated as included
in the ACL if it was explicitly *excluded*. Thus, leaving "blackhole"
unset had no effect, but setting it to "none" would cause any
destination addresses to be rejected for dns_request purposes. This
would cause zone transfer requests and SOA queries to fail, among
other things.

The bug has been fixed, and "blackhole { none; };" was added to the
xfer system test as a regression test.

(cherry picked from commit 4444b168db)
2022-02-16 22:20:25 -08:00
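
The difference between a positive and a negative ACL match in commit
da029f10ba above can be sketched as follows.  This is a simplified model
with hypothetical demo_* names and IPv4-only prefixes, not the
dns_request or ACL code itself.  It assumes that "none" behaves like a
negated match-everything element and that the lookup reports a negative
match as a negative number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ACL element: an IPv4 prefix, optionally negated. */
typedef struct {
	uint32_t net;
	uint32_t mask;
	bool negative;
} demo_aclelt_t;

/*
 * First-match-wins lookup.  Returns the 1-based index of the matching
 * element, negated when that element is a negative one, or 0 when
 * nothing matches.
 */
static int
demo_acl_match(uint32_t addr, const demo_aclelt_t *elts, size_t n) {
	assert(elts != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if ((addr & elts[i].mask) == elts[i].net) {
			int idx = (int)i + 1;
			return elts[i].negative ? -idx : idx;
		}
	}
	return 0;
}

/* Buggy shape: any non-zero result, including an exclusion, blackholes. */
static bool
demo_blackholed_buggy(uint32_t addr, const demo_aclelt_t *elts, size_t n) {
	return demo_acl_match(addr, elts, n) != 0;
}

/* Fixed shape: only a positive match blackholes the destination. */
static bool
demo_blackholed(uint32_t addr, const demo_aclelt_t *elts, size_t n) {
	return demo_acl_match(addr, elts, n) > 0;
}
```

With a single element { 0, 0, true } modelling "none", the buggy check
rejects every destination, while the fixed check rejects nothing, which
matches the behaviour described in the commit message.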
Michał Kępień
01529bf791 Merge branch '3139-log-the-result-of-each-resolver-priming-attempt-v9_18' into 'v9_18'
[v9_18] Log the result of each resolver priming attempt

See merge request isc-projects/bind9!5847
2022-02-16 12:54:03 +00:00
Michał Kępień
899e5a7e3f Add CHANGES entry for [GL #3139]
(cherry picked from commit 39df399d9f)
2022-02-16 13:28:00 +01:00
Michał Kępień
a74e60a325 Log the result of each resolver priming attempt
When a resolver priming attempt completes, the following message is
currently logged:

    resolver priming query complete

This message is identical for both successful and failed priming
attempts.  Consider the following log excerpts:

  - successful priming attempt:

        10-Feb-2022 11:33:11.272 all zones loaded
        10-Feb-2022 11:33:11.272 running
        10-Feb-2022 11:33:19.722 resolver priming query complete

  - failed priming attempt:

        10-Feb-2022 11:33:29.978 all zones loaded
        10-Feb-2022 11:33:29.978 running
        10-Feb-2022 11:33:38.432 timed out resolving '_.org/A/IN': 2001:500:9f::42#53
        10-Feb-2022 11:33:38.522 timed out resolving './NS/IN': 2001:500:9f::42#53
        10-Feb-2022 11:33:42.132 timed out resolving '_.org/A/IN': 2001:500:12::d0d#53
        10-Feb-2022 11:33:42.285 timed out resolving './NS/IN': 2001:500:12::d0d#53
        10-Feb-2022 11:33:44.685 resolver priming query complete

Include the result of each priming attempt in the relevant log message
to give the administrator better insight into named's resolver priming
process.

(cherry picked from commit f286c845b0)
2022-02-16 13:28:00 +01:00
Ondřej Surý
5f77292475 Merge branch 'ondrej/add-UV_RUNTIME_CHECK-macro-v9_18' into 'v9_18'
Add UV_RUNTIME_CHECK() macro to print uv_strerror()

See merge request isc-projects/bind9!5845
2022-02-16 11:19:55 +00:00
Ondřej Surý
c30735707c Add semantic patch to keep UV_RUNTIME_CHECK in sync
The UV_RUNTIME_CHECK() macro requires the function name to be kept in
sync with the call being checked, like this:

    r = func(...);
    UV_RUNTIME_CHECK(func, r);

Add semantic patch to keep the function name and return variable in sync
with the previous line.

(cherry picked from commit 62bd5cb08c)
2022-02-16 11:46:00 +01:00
Ondřej Surý
f641507022 Use UV_RUNTIME_CHECK() as appropriate
Replace the RUNTIME_CHECK() calls for libuv API calls with
UV_RUNTIME_CHECK() to get a more detailed error message when
something that should not fail does.

(cherry picked from commit 8715be1e4b)
2022-02-16 11:46:00 +01:00
Ondřej Surý
b8be8048b5 Add UV_RUNTIME_CHECK() macro to print uv_strerror()
When libuv functions fail, they return an error code that could be
useful for more detailed debugging.  Currently, we usually just check
whether the return value is 0 and invoke an assertion error if it is
not, throwing away the details of why the call has failed.
Unfortunately, this often happens on more exotic platforms.

Add a UV_RUNTIME_CHECK() macro that can be used to print a more detailed
error message (via uv_strerror()) before ending the execution of the
program abruptly with the assertion.

(cherry picked from commit 62e15bb06d)
2022-02-16 11:46:00 +01:00
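
A macro in the spirit of the UV_RUNTIME_CHECK() described in commit
b8be8048b5 above might look roughly like the sketch below.  To keep the
example free of a libuv dependency, demo_strerror() is a hypothetical
stand-in for uv_strerror(), and DEMO_RUNTIME_CHECK(), the message format
and the other demo_* helpers are assumptions, not the definitions used
in BIND 9.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for uv_strerror(); real code would call libuv. */
static const char *
demo_strerror(int r) {
	switch (r) {
	case 0:
		return "success";
	case -1:
		return "demo failure";
	default:
		return "unknown error";
	}
}

/* Format "<func> failed: <reason>" into buf; returns snprintf()'s result. */
static int
demo_format_failure(char *buf, size_t size, const char *func, int r) {
	assert(buf != NULL && func != NULL);
	return snprintf(buf, size, "%s failed: %s", func, demo_strerror(r));
}

/* Report which call failed and why, then abort like an assertion would. */
#define DEMO_RUNTIME_CHECK(func, ret)                                  \
	do {                                                           \
		if ((ret) != 0) {                                      \
			char msg_[128];                                \
			demo_format_failure(msg_, sizeof(msg_), #func, \
					    (ret));                    \
			fprintf(stderr, "%s:%d: %s\n", __FILE__,       \
				__LINE__, msg_);                       \
			abort();                                       \
		}                                                      \
	} while (0)

/* Hypothetical call that always succeeds, used to exercise the macro. */
static int
demo_call(void) {
	return 0;
}
```

Usage follows the two-line pattern quoted in commit c30735707c: assign
the return value, then check it on the next line, for example
"r = demo_call(); DEMO_RUNTIME_CHECK(demo_call, r);".  Passing the
function name as the first argument is what lets the message say which
call failed, and is also why the name has to be kept in sync.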