Commit Graph

15543 Commits

Author SHA1 Message Date
Mark Andrews
5fad79c92f Log the rcode returned to for a query
Log to the querylog the rcode of a previous query using
the identifier 'response:' to diffenciate queries from
responses.
2024-09-19 21:44:06 +00:00
Evan Hunt
5a444838db rename 'rbtiterator' and similar names in qpcache
when the QP cache was adapted from the RBT database, some names
weren't changed. this could be confusing, so let's change them now.
also, we no longer need to include rbt.h.
2024-09-19 19:32:27 +00:00
Nicki Křížek
377831a290 Merge tag 'v9.21.1' 2024-09-18 18:02:41 +02:00
Ondřej Surý
62d59766d6 Remove DNSRPS implementation
DNSRPS was the API for a commercial implementation of Response-Policy
Zones that was supposedly better.  However, it was never open-sourced
and has only ever been available from a single vendor.  This goes against
the principle that the open-source edition of BIND 9 should contain only
features that are generally available and universal.

This commit removes the DNSRPS implementation from BIND 9.  It may be
reinstated in the subscription edition if there's enough interest from
customers, but it would have to be rewritten as a plugin (hook) instead
of hard-wiring it again in so many places.
2024-09-18 17:39:14 +02:00
Evan Hunt
98ae5dfc7e fix DNSRPS errors
silence some reported snprintf() overrun warnings that prevented
DNSRPS from building on some platforms.
2024-09-18 17:24:13 +02:00
Evan Hunt
dc13333957 use uv_dlopen() instead of dlopen() when linking DNSRPZ
take advantage of libuv's shared library handling capability
when linking to a DNSRPS library.  (see b396f55586 and 37b9511ce1
for prior related work.)
2024-09-18 17:24:13 +02:00
Ondřej Surý
d7bff3c0f9 Remove old cruft from dnsrps code
There was some old cruft for ancient compilers checking for attributes
that we regularly use, etc.  Just remove the cruft.
2024-09-18 17:24:13 +02:00
Aram Sargsyan
7c45caa8a5 Set logging category for notify/xfer related messages
Some notify/xfer related log messages are logged at the general
category. Set a more suitable caterogry for those messages.
2024-09-17 15:08:40 +00:00
Ondřej Surý
b576c4c977 Limit the outgoing UDP send queue size
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages.  As those are not going to be
delivered anyway (as we argued when we stopped enlarging the operating
system send and receive buffers), try to send the UDP messages directly
using `uv_udp_try_send()` and if that fails, drop the outgoing UDP
message.
2024-09-17 14:02:03 +00:00
alessio
8b8149cdd2 Do not set SO_INCOMING_CPU
We currently set SO_INCOMING_CPU incorrectly, and testing by Ondrej
shows that fixing the issue and setting affinities is worse than letting
the kernel schedule threads without constraints. So we should not set
SO_INCOMING_CPU anymore.
2024-09-16 12:18:22 +00:00
Aram Sargsyan
a018b4e36f Implement the ForwardOnlyFail statistics channel counter
The new ForwardOnlyFail statistics channel counter indicates the
number of queries failed due to bad forwarders for 'forward only'
zones.
2024-09-16 09:31:14 +00:00
Aram Sargsyan
e430ce7039 Fix a 'serverquota' counter calculation bug
The 'all_spilled' local variable in resolver.c:fctx_getaddresses()
is 'true' by default, and only becomes false when there is at least
one successfully found NS address. However, when a 'forward only;'
configuration is used, the code jumps over the part where it looks
for NS addresses and doesn't reset the 'all_spilled' to false, which
results in incorretly increased 'serverquota' statistics variable,
and also in invalid return error code from the function. The result
code error didn't make any differences, because all codes other than
'ISC_R_SUCCESS' or 'DNS_R_WAIT' were treated in the same way, and
the result code was never logged anywhere.

Set the default value of 'all_spilled' to 'false', and only make it
'true' before actually starting to look up NS addresses.
2024-09-16 08:23:12 +00:00
Ondřej Surý
8a96a3af6a Move offloaded DNSSEC operations to different helper threads
Currently, the isc_work API is overloaded.  It runs both the
CPU-intensive operations like DNSSEC validations and long-term tasks
like RPZ processing, CATZ processing, zone file loading/dumping and few
others.

Under specific circumstances, when many large zones are being loaded, or
RPZ zones processed, this stops the CPU-intensive tasks and the DNSSEC
validation is practically stopped until the long-running tasks are
finished.

As this is undesireable, this commit moves the CPU-intensive operations
from the isc_work API to the isc_helper API that only runs fast memory
cleanups now.
2024-09-12 12:09:45 +00:00
Ondřej Surý
6370e9b311 Add isc_helper API that adds 1:1 thread for each loop
Add an extra thread that can be used to offload operations that would
affect latency, but are not long-running tasks; those are handled by
isc_work API.

Each isc_loop now has matching isc_helper thread that also built on top
of uv_loop.  In fact, it matches most of the isc_loop functionality, but
only the `isc_helper_run()` asynchronous call is exposed.
2024-09-12 12:09:45 +00:00
Aram Sargsyan
35ef25e5ea Fix data race in offloaded dns_message_checksig()
When verifying a message in an offloaded thread there is a race with
the worker thread which writes to the same buffer. Clone the message
buffer before offloading.
2024-09-12 09:08:35 +00:00
alessio
da0e48b611 Remove "port" from source address options
Remove the use of "port" when configuring query-source(-v6),
transfer-source(-v6), notify-source(-v6), parental-source(-v6),
etc. Remove the use of source ports for parental-agents.

Also remove the deprecated options use-{v4,v6}-udp-ports and
avoid-{v4,v6}udp-ports.
2024-09-12 08:15:58 +02:00
Mark Andrews
b9246418e8 Fix named-checkconf and statistics-channels
If neither libxml2 nor libjson_c are available have named-checkconf
fail if a statistics-channels block is specified.
2024-09-12 09:21:44 +10:00
Michal Nowak
ff69d07fed Update code formatting
clang 19 was updated in the base image.
2024-09-10 17:31:32 +02:00
JINMEI Tatuya
7289090683 allow IXFR-to-AXFR fallback on DNS_R_TOOMANYRECORDS
This change allows fallback from an IXFR failure to AXFR when the
reason is DNS_R_TOOMANYRECORDS. This is because this error condition
could be temporary only in an intermediate version of IXFR
transactions and it's possible that the latest version of the zone
doesn't have that condition. In such a case, the secondary would never
be able to update the zone (even if it could) without this fallback.

This fallback behavior is particularly useful with the recently
introduced max-records-per-type and max-types-per-name options:
the primary may not have these limitations and may temporarily
introduce "too many" records, breaking IXFR. If the primary side
subsequently deletes these records, this fallback will help recover
the zone transfer failure automatically; without it, the secondary
side would first need to increase the limit, which requires more
operational overhead and has its own adverse effect.

This change also fixes a minor glitch that DNS_R_TOOMANYRECORDS wasn't
logged in xfrin_fail.
2024-09-10 14:02:38 +02:00
Aram Sargsyan
0367c60759 Fix RCU API usage in acl.c
The rcu_xchg_pointer() function can be used outside of a critical
section, and usually must be followed by a synchronize_rcu() or
call_rcu() call to detach from the resource, unless if there are
some guarantees in place because of our own reference counting.
2024-09-10 09:54:20 +00:00
Mark Andrews
61faffd06f Add flag to named-checkconf to ignore "not configured" errors
named-checkconf now takes "-n" to ignore "not configured" errors.
This allows named-checkconf to check the syntax of configurations
from other builds which have support for more options.
2024-09-09 23:32:16 +00:00
Nicki Křížek
6857df20a4 Double the number of threadpool threads
Introduce this temporary workaround to reduce the impact of long-running
tasks in offload threads which can block the resolution of queries.
2024-09-06 14:15:21 +02:00
Matthijs Mekking
911daeb306 Nit logging change
Fix wrong function name (dns_dnssec_keymgr -> dns_keymgr_run).

Add error log if dns_keymgr_offline() fails.
2024-09-03 12:01:21 +02:00
Matthijs Mekking
5af53a329f Fix bug in dns_keymgr_offline
If the ZSK has lifetime unlimited, the timing metadata "Inactive" and
"Delete" cannot be found and is treated as an error. Fix by allowing
these metadata to not exist.
2024-09-03 11:57:56 +02:00
Aram Sargsyan
d85918aebf Process canceled/shut down results in validate_dnskey_dsset_done()
When a validator is already shut down, val->name becomes NULL. We
need to process and keep the ISC_R_CANCELED or ISC_R_SHUTTINGDOWN
result code before calling validate_async_done(), otherwise, when it
is called with the hardcoded DNS_R_NOVALIDSIG result code, it can
cause an assetion failure when val->name (being NULL) is used in
proveunsecure().
2024-09-02 15:40:30 +00:00
Ondřej Surý
5a2df8caf5 Follow the number of CPU set by taskset/cpuset
Administrators may wish to constrain the set of cores that BIND 9 runs
on via the 'taskset', 'cpuset' or 'numactl' programs (or equivalent on
other O/S), for example to achieve higher (or more stable) performance
by more closely associating threads with individual NIC rx queues. If
the admin has used taskset, it follows that BIND ought to
automatically use the given number of CPUs rather than the system wide
count.

Co-Authored-By: Ray Bellis <ray@isc.org>
2024-08-29 14:43:18 +00:00
Mark Andrews
d42ea08f16 Return partial match when requested
Return partial match from dns_db_find/dns_db_find when requested
to short circuit the closest encloser discover process.  Most of the
time this will be the actual closest encloser but may not be when
there yet to be committed / cleaned up versions of the zone with
names below the actual closest encloser.
2024-08-29 12:48:20 +00:00
Mark Andrews
43f0b0e8eb Move lock earlier in the call sequence
fctx->state should be read with the lock held.

    1559        /*
    1560         * Caller must be holding the fctx lock.
    1561         */

    CID 468796: (#1 of 1): Data race condition (MISSING_LOCK)
    1. missing_lock: Accessing fctx->state without holding lock fetchctx.lock.
       Elsewhere, fetchctx.state is written to with fetchctx.lock held 2 out of 2 times.
    1562        REQUIRE(fctx->state == fetchstate_done);
    1563
    1564        FCTXTRACE("sendevents");
    1565
    1566        LOCK(&fctx->lock);
    1567
2024-08-29 04:33:56 +00:00
Mark Andrews
a45e39d114 Use atomics to access find->status 2024-08-28 22:42:16 +00:00
Mark Andrews
c900300f21 Use an accessor fuction to access find->status
find->status is marked as private and access is controlled
by find->lock.
2024-08-28 22:42:16 +00:00
Aram Sargsyan
c7e8b7cf63 Exempt prefetches from the fetches-per-server quota
Give prefetches a free pass through the quota so that the cache
entries for popular zones could be updated successfully even if the
quota for is already reached.
2024-08-26 15:50:21 +00:00
Aram Sargsyan
cada2de31f Exempt prefetches from the fetches-per-zone quota
Give prefetches a free pass through the quota so that the cache entry
for a popular zone could be updated successfully even if the quota for
it is already reached.
2024-08-26 15:50:21 +00:00
Ondřej Surý
d61712d14e Stop using malloc_usable_size and malloc_size
Although the nanual page of malloc_usable_size says:

    Although the excess bytes can be over‐written by the application
    without ill effects, this is not good programming practice: the
    number of excess bytes in an allocation depends on the underlying
    implementation.

it looks like the premise is broken with _FORTIFY_SOURCE=3 on newer
systems and it might return a value that causes program to stop with
"buffer overflow" detected from the _FORTIFY_SOURCE.  As we do have own
implementation that tracks the allocation size that we can use to track
the allocation size, we can stop relying on this introspection function.

Also the newer manual page for malloc_usable_size changed the NOTES to:

    The value returned by malloc_usable_size() may be greater than the
    requested size of the allocation because of various internal
    implementation details, none of which the programmer should rely on.
    This function is intended to only be used for diagnostics and
    statistics; writing to the excess memory without first calling
    realloc(3) to resize the allocation is not supported.  The returned
    value is only valid at the time of the call.

Remove usage of both malloc_usable_size() and malloc_size() to be on the
safe size and only use the internal size tracking mechanism when
jemalloc is not available.
2024-08-26 15:00:44 +00:00
Evan Hunt
642a1b985d remove the "dialup" and "heartbeat-interval" options
mark "dialup" and "heartbeat-interval" options as ancient and
remove the documentation and the code implementing them.
2024-08-22 11:11:10 -07:00
Aram Sargsyan
c05a823e8b Implement the 'request-ixfr-max-diffs' configuration option
This limits the maximum number of received incremental zone
transfer differences for a secondary server. Upon reaching the
confgiured limit, the secondary aborts IXFR and initiates a full
zone transfer (AXFR).
2024-08-22 13:42:27 +00:00
Mark Andrews
035289be71 Check key tag range when matching dnssec keys to kasp keys 2024-08-22 12:12:02 +00:00
Mark Andrews
c5bc0a1805 Add optional range directive to keys in dnssec-policy 2024-08-22 12:12:02 +00:00
Mark Andrews
25bf77fac6 Add the concept of allowed key tag ranges to kasp 2024-08-22 12:12:02 +00:00
Matthijs Mekking
f37eb33f29 Fix algorithm rollover bug wrt keytag conflicts
If there is an algorithm rollover and two keys of different algorithm
share the same keytags, then there is a possibility that if we check
that a key matches a specific state, we are checking against the wrong
key.

Fix this by not only checking for matching key id but also key
algorithm.
2024-08-22 11:29:43 +02:00
Ondřej Surý
7b756350f5 Use clang-format-19 to update formatting
This is purely result of running:

    git-clang-format-19 --binary clang-format-19 origin/main
2024-08-22 09:21:55 +02:00
Matthijs Mekking
2e3068ed60 Disable some behavior in offline-ksk mode
Some things we no longer want to do when we are in offline-ksk mode.

1. Don't check for inactive and private keys if the key is a KSK.
2. Don't update the TTL of DNSKEY, CDS and CDNSKEY RRset, these come
   from the SKR.
2024-08-22 08:21:52 +02:00
Matthijs Mekking
61cf599fbf Retrieve RRSIG from SKR
When it is time to generate a new signature (dns_dnssec_sign), rather
than create a new one, retrieve it from the SKR.
2024-08-22 08:21:52 +02:00
Matthijs Mekking
30d20b110e Don't read private key files for offline KSKs
When we are appending contents of a DNSKEY rdataset to a keylist,
don't attempt to read the private key file of a KSK when we are in
offline-ksk mode.
2024-08-22 08:21:52 +02:00
Matthijs Mekking
2190aa904f Update key states in offline-ksk mode
With offline-ksk enabled, we don't run the keymgr because the key
timings are determined by the SKR. We do update the key states but
we derive them from the timing metadata.

Then, we can skip a other tasks in offline-ksk mode, like DS checking
at the parent and CDS synchronization, because the CDS and CDNSKEY
RRsets also come from the SKR.
2024-08-22 08:21:52 +02:00
Matthijs Mekking
63e058c29e Apply SKR bundle on rekey
When a zone has a skr structure, lookup the currently active bundle
that contains the right key and signature material.
2024-08-22 08:21:52 +02:00
Matthijs Mekking
037382c4a5 Implement SKR import
When 'rndc skr import' is called, read the file contents and store the
data in the zone's skr structure.
2024-08-22 08:21:52 +02:00
Matthijs Mekking
445722d2bf Add code to store SKR
This added source code stores SKR data. It is loosely based on:
https://www.iana.org/dnssec/archive/files/draft-icann-dnssec-keymgmt-01.txt

A SKR contains a list of signed DNSKEY RRsets. Each change in data
should be stored in a separate bundle. So if the RRSIG is refreshed that
means it is stored in the next bundle. Likewise, if there is a new ZSK
pre-published, it is in the next bundle.

In addition (not mentioned in the draft), each bundle may contain
signed CDS and CDNSKEY RRsets.

Each bundle has an inception time. These will determine when we need
to re-sign or re-key the zone.
2024-08-22 08:21:52 +02:00
Matthijs Mekking
0598381236 Add offline-ksk option
Add a new configuration option to enable Offline KSK key management.

Offline KSK cannot work with CSK because it splits how keys with the
KSK and ZSK role operate. Therefore, one key cannot have both roles.
Add a configuration check to ensure this.
2024-08-22 08:21:52 +02:00
Nicki Křížek
779de4ec34 Merge tag 'v9.21.0' 2024-08-21 16:23:09 +02:00
Ondřej Surý
3bca3cb5cf Destroy the dns_xfrin isc_timers on the correct loop
There are few places where we attach/detach from the dns_xfrin object
while running on a different thread than the zone's assigned thread -
xfrin_xmlrender() in the statschannel and dns_zone_stopxfr() to name the
two places where it happens now.  In the rare case, when the incoming
transfer completes (or shuts down) in the brief period between the other
thread attaches and detaches from the dns_xfrin, the isc_timer_destroy()
calls would be called by the last thread calling the xfrin_detach().
In the worst case, it would be this other thread causing assertion
failure.  Move the isc_timer_destroy() call to xfrin_end() function
which is always called on the right thread and to match this move
isc_timer_create() to xfrin_start() - although this other change makes
no difference.
2024-08-21 13:54:40 +02:00