Check that the fix in the previous commit works and that the
'ServerQuota' counter in the statistics channel is still unset
after a SERVFAIL result in a 'forward only' zone.
(cherry picked from commit 81b3c5d908)
The 'all_spilled' local variable in resolver.c:fctx_getaddresses()
is 'true' by default, and only becomes false when there is at least
one successfully found NS address. However, when a 'forward only;'
configuration is used, the code jumps over the part where it looks
for NS addresses and doesn't reset 'all_spilled' to false, which
results in an incorrectly incremented 'serverquota' statistics counter,
and also in an invalid result code being returned from the function.
The wrong result code made no practical difference, because all codes
other than 'ISC_R_SUCCESS' or 'DNS_R_WAIT' were treated in the same
way, and the result code was never logged anywhere.
Set the default value of 'all_spilled' to 'false', and only make it
'true' before actually starting to look up NS addresses.
(cherry picked from commit e430ce7039)
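The control flow described above can be modelled in a few lines. This is a simplified sketch and not the code from resolver.c: the result codes, the 'ns_outcome_t' type, the 'stats_t' struct and the get_addresses() function are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; not the real ISC/DNS result values. */
typedef enum { GA_SUCCESS, GA_NOTFOUND, GA_QUOTA } ga_result_t;

/* Hypothetical outcome of looking up one NS address. */
typedef enum { NS_FOUND, NS_SPILLED } ns_outcome_t;

typedef struct {
	unsigned int serverquota; /* stand-in for the statistics counter */
} stats_t;

/*
 * Simplified model of the address-gathering step. When 'forward_only'
 * is set, the NS lookup loop is skipped entirely.
 */
static ga_result_t
get_addresses(bool forward_only, const ns_outcome_t *ns, size_t n,
	      stats_t *stats) {
	bool all_spilled = false; /* the fix: 'false' by default */
	bool found = false;

	assert(stats != NULL);

	if (!forward_only) {
		/* Only set it right before looking up NS addresses. */
		all_spilled = true;
		for (size_t i = 0; i < n; i++) {
			if (ns[i] == NS_FOUND) {
				found = true;
				all_spilled = false;
			}
		}
	}

	if (found) {
		return GA_SUCCESS;
	}
	if (all_spilled) {
		stats->serverquota++;
		return GA_QUOTA;
	}
	return GA_NOTFOUND;
}
```

With 'forward_only' set, the function in this model returns the "not found" code and leaves the counter untouched, which is the behavior the fix aims for.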
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages. Try a bit harder to deliver the
outgoing UDP messages synchronously and if that fails, drop the outgoing
DNS message that would otherwise get queued up and then time out on the client side.
Closes #4930
Backport of MR !9506
Merge branch 'backport-4930-limit-the-UDP-send-queue-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9512
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages. As those are not going to be
delivered anyway (as we argued when we stopped enlarging the operating
system send and receive buffers), try to send the UDP messages directly
using `uv_udp_try_send()` and if that fails, drop the outgoing UDP
message.
(cherry picked from commit b576c4c977)
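The "try to send, otherwise drop" policy can be sketched without libuv as follows. Everything here is an assumption made for illustration: the 'try_send_fn' hook stands in for a non-blocking send such as `uv_udp_try_send()`, the retry count of 3 is arbitrary, and 'fake_sock_t' is only a test double. It is not the network manager code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical non-blocking send hook: returns the number of bytes
 * written, or a negative value when the socket buffer is full.
 */
typedef int (*try_send_fn)(void *sock, const void *buf, size_t len);

enum { MAX_SEND_ATTEMPTS = 3 }; /* illustrative value, an assumption */

/*
 * Try to send synchronously a few times. If every attempt fails,
 * report failure so that the caller drops the message instead of
 * queueing it in user space.
 */
static bool
send_or_drop(try_send_fn try_send, void *sock, const void *buf, size_t len) {
	assert(try_send != NULL);

	for (int attempt = 0; attempt < MAX_SEND_ATTEMPTS; attempt++) {
		if (try_send(sock, buf, len) >= 0) {
			return true; /* handed over to the kernel */
		}
	}
	return false; /* dropped: nothing is buffered */
}

/* Test double: fails until 'failures_left' reaches zero. */
typedef struct {
	int failures_left;
	int calls;
} fake_sock_t;

static int
fake_try_send(void *sock, const void *buf, size_t len) {
	fake_sock_t *fs = sock;

	(void)buf;
	fs->calls++;
	if (fs->failures_left > 0) {
		fs->failures_left--;
		return -1;
	}
	return (int)len;
}
```

Because a failed send is never queued, memory use stays bounded in this model no matter how long the operating system queue remains full.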
Silence Coverity CID 468757 and 468767 (DATA RACE read not locked) by converting dnssec-signzone to use atomics for statistics counters rather than using a lock.
Closes #4939
Backport of MR !9496
Merge branch 'backport-4939-remove-stats-lock-from-dnssec-signzone-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9501
Silence Coverity CID 468757 and 468767 (DATA RACE read not locked)
by converting dnssec-signzone to use atomics for statistics counters
rather than using a lock. This should also be marginally faster than
using the lock when statistics are requested.
(cherry picked from commit 473cbd4e87)
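A minimal sketch of lock-free statistics counters using C11 atomics is shown below. The struct, the counter names and the helper functions are hypothetical; the actual counters in dnssec-signzone are named and organized differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter set, for illustration only. */
typedef struct {
	atomic_uint_fast64_t nsigned;
	atomic_uint_fast64_t nverified;
} sign_stats_t;

static sign_stats_t stats; /* static storage: counters start at zero */

/* Called from worker threads; no mutex is needed. */
static void
count_signed(void) {
	atomic_fetch_add_explicit(&stats.nsigned, 1, memory_order_relaxed);
}

static void
count_verified(void) {
	atomic_fetch_add_explicit(&stats.nverified, 1, memory_order_relaxed);
}

/* Reader side: an atomic load replaces lock/read/unlock. */
static uint_fast64_t
signed_total(void) {
	return atomic_load_explicit(&stats.nsigned, memory_order_relaxed);
}
```

Relaxed ordering is sufficient in this sketch because each counter is independent and nothing else is synchronized through it.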
Closes #4634
Backport of MR !9150
Merge branch 'backport-4634-drop-dns.resolver-module-from-system-tests-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9492
When neither libxml2 nor json-c is available, the statistics channel can't return anything useful, so it is now disabled. In that case, use of `statistics-channels` in `named.conf` is a fatal error.
Closes #4895
Backport of MR !9423
Merge branch 'backport-4895-link-style-sheet-to-libxml2-support-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9487
If neither libxml2 nor json-c is available, have named-checkconf
fail if a statistics-channels block is specified.
(cherry picked from commit b9246418e8)
The `statschannel` system test failed if only one of `libxml2` or `json-c` was
available / configured, as checks were being run against the unavailable
statistics page.
Closes #4919
Backport of MR !9454
Merge branch 'backport-4919-fix-statschannel-system-test-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9484
clang 19 was updated in the base image.
Backport of MR !9475
Merge branch 'backport-mnowak/fix-clang-format-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9478
This change allows fallback from an IXFR failure to AXFR when the reason is `DNS_R_TOOMANYRECORDS`. This error condition may be only temporary, present in an intermediate version of the IXFR transactions, while the latest version of the zone no longer has it. In such a case, the secondary would never be able to update the zone (even though it could) without this fallback.
This fallback behavior is particularly useful with the recently introduced `max-records-per-type` and `max-types-per-name` options: the primary may not have these limitations and may temporarily introduce "too many" records, breaking IXFR. If the primary side subsequently deletes these records, this fallback will help recover from the zone transfer failure automatically; without it, the secondary side would first need to increase the limit, which requires more operational overhead and has its own adverse effects.
Closes #4928
Backport of MR !9333
Merge branch 'backport-fallback-ixfr-to-axfr-on-toomanyrecords-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9472
This change allows fallback from an IXFR failure to AXFR when the
reason is DNS_R_TOOMANYRECORDS. This error condition may be only
temporary, present in an intermediate version of the IXFR
transactions, while the latest version of the zone no longer has it.
In such a case, the secondary would never be able to update the zone
(even though it could) without this fallback.
This fallback behavior is particularly useful with the recently
introduced max-records-per-type and max-types-per-name options:
the primary may not have these limitations and may temporarily
introduce "too many" records, breaking IXFR. If the primary side
subsequently deletes these records, this fallback will help recover
from the zone transfer failure automatically; without it, the
secondary side would first need to increase the limit, which requires
more operational overhead and has its own adverse effects.
This change also fixes a minor glitch: DNS_R_TOOMANYRECORDS wasn't
logged in xfrin_fail.
(cherry picked from commit 7289090683)
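The decision described above can be sketched as a small function. The enum types and the xfr_next_action() name are hypothetical and only model the one rule from this change; the real xfrin code may fall back to AXFR on other conditions too, which this sketch deliberately ignores.

```c
#include <assert.h>

/* Hypothetical stand-ins for the transfer result codes. */
typedef enum { XR_SUCCESS, XR_TOOMANYRECORDS, XR_REFUSED } xfr_result_t;

typedef enum { XFR_IXFR, XFR_AXFR } xfr_type_t;

typedef enum { XA_DONE, XA_RETRY_AXFR, XA_FAIL } xfr_action_t;

/*
 * Decide what to do after a transfer attempt finishes. An IXFR that
 * fails with "too many records" is retried as a full AXFR, because
 * the latest zone version may no longer exceed the limits.
 */
static xfr_action_t
xfr_next_action(xfr_type_t type, xfr_result_t result) {
	if (result == XR_SUCCESS) {
		return XA_DONE;
	}
	if (type == XFR_IXFR && result == XR_TOOMANYRECORDS) {
		return XA_RETRY_AXFR;
	}
	/* An AXFR with too many records is a genuine failure. */
	return XA_FAIL;
}
```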
When the TCP test is run on a busy machine, it might take a
while to wind the server down because it might still be processing all
of the 300k invalid XFR requests.
Increase the rndc wait time to 120 seconds, the SIGTERM time to 300
seconds, and reduce the time to wait for ans servers from 1200 seconds
to just 120 seconds.
(cherry picked from commit d971472321)
Backport of MR !6847
Merge branch 'backport-ondrej-increase-the-time-to-wait-for-servers-to-gracefully-shutdown-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9467
Administrators may wish to constrain the set of cores that BIND 9 runs on via the 'taskset', 'cpuset' or 'numactl' programs (or their equivalents on other operating systems).
If the administrator has used taskset, `named` now automatically uses the given number of CPUs rather than the system-wide count.
Closes #4884
Backport of MR !9398
Merge branch 'backport-4884-use-cpuset-to-get-number-of-cpus-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9443
Administrators may wish to constrain the set of cores that BIND 9 runs
on via the 'taskset', 'cpuset' or 'numactl' programs (or equivalent on
other O/S), for example to achieve higher (or more stable) performance
by more closely associating threads with individual NIC rx queues. If
the administrator has used taskset, it follows that BIND ought to
automatically use the given number of CPUs rather than the system-wide
count.
Co-Authored-By: Ray Bellis <ray@isc.org>
(cherry picked from commit 5a2df8caf5)
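One way to obtain the number of CPUs a process is allowed to run on is sketched below. It assumes Linux with glibc, where sched_getaffinity() and CPU_COUNT() are available under _GNU_SOURCE, and falls back to the online CPU count elsewhere. The usable_cpus() function name is hypothetical and the code used by `named` may differ.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sched_getaffinity() and CPU_COUNT() on glibc */
#endif
#include <assert.h>
#include <sched.h>
#include <unistd.h>

/*
 * Return the number of CPUs this process may run on. Under
 * 'taskset -c 0-3' this yields 4 even on a 64-core machine.
 */
static long
usable_cpus(void) {
#if defined(__linux__) && defined(CPU_COUNT)
	cpu_set_t set;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) == 0) {
		int count = CPU_COUNT(&set);
		if (count > 0) {
			return count;
		}
	}
#endif
	/* Fallback: system-wide count of online CPUs. */
	long n = sysconf(_SC_NPROCESSORS_ONLN);
	return (n > 0) ? n : 1;
}
```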
Use the fact that the database returns the longest matching part of the requested name to find the required NSEC3 record. If there are multiple versions present in the database we may have to search further.
Closes #4460
Backport of MR !9436
Merge branch 'backport-4460-auth-nsec3-many-labels-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9439
Return partial match from dns_db_find/dns_db_find when requested
to short circuit the closest encloser discovery process. Most of the
time this will be the actual closest encloser, but it may not be when
there are yet-to-be-committed / cleaned-up versions of the zone with
names below the actual closest encloser.
(cherry picked from commit d42ea08f16)
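As a toy model of what a partial match means here, the sketch below finds the longest suffix of a query name, cut at label boundaries, that exists in a set of names. The longest_match() function and the plain-text name representation are assumptions for illustration; the zone database works on dns_name_t and a tree structure, and escaped dots are ignored in this model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the longest suffix of 'qname', cut at a label boundary, that
 * is present in 'names', or NULL if nothing matches. In a zone with a
 * single committed version, that suffix is the closest encloser.
 */
static const char *
longest_match(const char *qname, const char *const *names, size_t n) {
	assert(qname != NULL);

	for (const char *p = qname; p != NULL && *p != '\0';) {
		for (size_t i = 0; i < n; i++) {
			if (strcmp(p, names[i]) == 0) {
				return p;
			}
		}
		p = strchr(p, '.'); /* drop the leftmost label */
		if (p != NULL) {
			p++;
		}
	}
	return NULL;
}
```

For the query name "a.b.c.example." and a zone containing "example." and "c.example.", this model returns "c.example.".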
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
Closes #4897
Backport of MR !9435
Merge branch 'backport-4897-resolver-ns1-max-recursion-queries-100-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9441
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
(cherry picked from commit 8e0244d300)
Backport of MR !9424
Merge branch 'backport-mnowak/avoid-some-artifacts-in-stress-tests-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9429
`malloc_usable_size()` can return a size larger than originally allocated, and when these sizes disagree the fortifier enabled by `_FORTIFY_SOURCE=3` detects an overflow and stops `named` execution abruptly. Stop using these convenience functions, as they are primarily intended for introspection only.
Closes #4880
Backport of MR !9400
Merge branch 'backport-4880-dont-use-malloc_usable_size-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9419
Although the manual page of malloc_usable_size says:
    Although the excess bytes can be over‐written by the application
    without ill effects, this is not good programming practice: the
    number of excess bytes in an allocation depends on the underlying
    implementation.
it looks like the premise is broken with _FORTIFY_SOURCE=3 on newer
systems, where it might return a value that causes the program to stop
with "buffer overflow" detected by _FORTIFY_SOURCE. As we have our
own implementation that tracks the allocation size, we can stop
relying on this introspection function.
Also the newer manual page for malloc_usable_size changed the NOTES to:
    The value returned by malloc_usable_size() may be greater than the
    requested size of the allocation because of various internal
    implementation details, none of which the programmer should rely on.
    This function is intended to only be used for diagnostics and
    statistics; writing to the excess memory without first calling
    realloc(3) to resize the allocation is not supported. The returned
    value is only valid at the time of the call.
Remove usage of both malloc_usable_size() and malloc_size() to be on the
safe side and only use the internal size tracking mechanism when
jemalloc is not available.
(cherry picked from commit d61712d14e)
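Internal size tracking can be sketched as a small header stored in front of each allocation. This is a minimal illustration and not the isc_mem implementation: the 'size_info_t' type and the tracked_*() function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Header placed in front of every allocation. The union keeps the
 * pointer handed to the caller suitably aligned for any type.
 */
typedef union {
	size_t size;
	max_align_t align;
} size_info_t;

static void *
tracked_malloc(size_t size) {
	size_info_t *si;

	if (size > SIZE_MAX - sizeof(*si)) {
		return NULL; /* request would overflow */
	}
	si = malloc(sizeof(*si) + size);
	if (si == NULL) {
		return NULL;
	}
	si->size = size; /* remember exactly what was requested */
	return si + 1;
}

/* Return the requested size, never the allocator's usable size. */
static size_t
tracked_size(const void *ptr) {
	assert(ptr != NULL);
	return ((const size_info_t *)ptr - 1)->size;
}

static void
tracked_free(void *ptr) {
	if (ptr != NULL) {
		free((size_info_t *)ptr - 1);
	}
}
```

Since the reported size always equals the requested size, the fortifier and the application agree on the bounds of the object.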
The ISC_ATTR_UNUSED macro was missing in BIND 9.18, which
complicated things when backporting merge requests from main.
As __attribute__((__unused__)) is ubiquitous, just define the
macro.
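A sketch of such a definition and its use, assuming a GCC- or Clang-compatible compiler; the answer() function is a hypothetical example and the #ifndef guard is an addition for illustration.

```c
#include <assert.h>

#ifndef ISC_ATTR_UNUSED
#define ISC_ATTR_UNUSED __attribute__((__unused__))
#endif

/* 'reserved' is intentionally unused; the attribute silences the warning. */
static int
answer(int value, int reserved ISC_ATTR_UNUSED) {
	return value;
}
```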
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.
See the failure mode at https://gitlab.isc.org/isc-projects/bind9/-/jobs/4668947.
Backport of MR !9413
Merge branch 'backport-mnowak/remove-dialup-from-cross-version-config-tests-job-9.18' into 'bind-9.18'
See merge request isc-projects/bind9!9416
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.
(cherry picked from commit 60f5f2a9d9)