42357 Commits

Author SHA1 Message Date
Aram Sargsyan
533d8c099d Test that RPZ "passthru" doesn't alter the answer's TTL with ANY queries
Expand the test_rpz_passthru_logging() check in the "rpzextra" system
test to check the answer's TTL values with ANY type queries.

(cherry picked from commit 98ff3a4432)
2025-02-27 09:22:01 +00:00
Aram Sargsyan
2d48cb33e3 Fix TTL issue with ANY queries processed through RPZ "passthru"
Answers to an "ANY" query which are processed by the RPZ "passthru"
policy have the response-policy's 'max-policy-ttl' value unexpectedly
applied. Do not change the records' TTL when RPZ uses a policy which
does not alter the answer.

(cherry picked from commit 5633dc90d3)
2025-02-27 09:22:01 +00:00
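The rule described in this fix can be illustrated with a minimal sketch. It is written in Python rather than BIND's C, and the function name, the policy strings, and the list-of-TTLs representation are hypothetical stand-ins, not named's actual API. It assumes 'max-policy-ttl' acts as a cap that should apply only when the policy rewrites the answer.

```python
# Hypothetical model: policies that leave the answer untouched.
NON_ALTERING_POLICIES = {"passthru"}


def apply_policy_ttl(ttls, policy, max_policy_ttl):
    """Return the TTLs to send, capping them only for answer-altering policies."""
    if policy in NON_ALTERING_POLICIES:
        # The answer is passed through unmodified, so its TTLs stay as they are.
        return list(ttls)
    # Any other policy (hypothetical names) rewrites the answer, so the cap applies.
    return [min(ttl, max_policy_ttl) for ttl in ttls]
```

With this model, `apply_policy_ttl([300, 86400], "passthru", 5)` leaves both TTLs unchanged, which is the behavior the system test above checks for ANY queries.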
Mark Andrews
ea9f0f4315 [9.20] fix: doc: Fix command to generate KSR in DNSSEC guide
Backport of MR !10087

Merge branch 'backport-doc-fix-dnssec-ksr-request-command-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10175
2025-02-26 02:38:41 +00:00
Doug Freed
fd2a37139c Fix command to generate KSR in DNSSEC guide
(cherry picked from commit 0dd046d007)
2025-02-26 01:52:13 +00:00
Mark Andrews
a47dab2c5e [9.20] fix: usr: Fix dual-stack-servers configuration option
The dual-stack-servers configuration option was not working as expected; the specified servers were not being used when they should have been, leading to resolution failures. This has been fixed.

Closes #5019

Backport of MR !9708

Merge branch 'backport-5019-dual-stack-servers-wasn-t-working-in-all-cases-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10174
2025-02-26 01:43:53 +00:00
Mark Andrews
c77032caf5 Removing now unneeded priming queries
Now that fctx_try is being called when adb returns DNS_ADB_NOMOREADDRESSES
we don't need these priming queries for the dual-stack-servers test
to succeed.

(cherry picked from commit 14ab1629b7)
2025-02-26 01:04:59 +00:00
Mark Andrews
14bd113b8f Fix dual-stack-servers
Named was stopping nameserver address resolution attempts too soon
when dual stack servers are configured.  Dual stack servers are
used when there are *not* addresses for the server in a particular
address family so find->status == DNS_ADB_NOMOREADDRESSES is not a
sufficient stopping condition when dual stack servers are available.
Call fctx_try to see if the alternate servers can be used.

(cherry picked from commit f98a8331aa)
2025-02-26 01:04:59 +00:00
Evan Hunt
0201e3eacb [9.20] fix: dev: Prevent a reference leak when using plugins
The `NS_QUERY_DONE_BEGIN` and `NS_QUERY_DONE_SEND` plugin hooks could cause a reference leak if they returned `NS_HOOK_RETURN` without cleaning up the query context properly.

Closes #2094

Backport of MR !9971

Merge branch 'backport-2094-plugin-reference-leak-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10170
2025-02-26 00:56:01 +00:00
Evan Hunt
cc0fc98244 wrap ns_client_error() for unit testing
When testing, the client object doesn't have a proper
netmgr handle, so ns_client_error() needs to be a no-op.

(cherry picked from commit ae37ef45ff)
2025-02-26 00:55:51 +00:00
Evan Hunt
4f1f958d6d prevent a reference leak from the ns_query_done hooks
if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is
used in a plugin and returns NS_HOOK_RETURN, some of the
cleanup in ns_query_done() can be skipped over, leading
to reference leaks that can cause named to hang on shut
down.

this has been addressed by adding more housekeeping
code after the cleanup: tag in ns_query_done().

(cherry picked from commit c2e4358267)
2025-02-26 00:55:51 +00:00
Mark Andrews
455080866c [9.20] fix: usr: Relax private DNSKEY and RRSIG constraints
DNSKEY, KEY, RRSIG and SIG constraints have been relaxed to allow empty key and signature material after the algorithm identifier for PRIVATEOID and PRIVATEDNS. It is arguable whether this falls within the expected use of these types as no key material is shared and the signatures are ineffective but these are private algorithms and they can be totally insecure.

Closes #5167

Backport of MR !10083

Merge branch 'backport-5167-relax-private-dnskey-constraints-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10173
2025-02-26 00:17:35 +00:00
Mark Andrews
a0dae15cd1 Relax private DNSKEY and RRSIG constraints
DNSKEY, KEY, RRSIG and SIG constraints have been relaxed to allow
empty key and signature material after the algorithm identifier for
PRIVATEOID and PRIVATEDNS. It is arguable whether this falls within
the expected use of these types as no key material is shared and
the signatures are ineffective but these are private algorithms and
they can be totally insecure.

(cherry picked from commit b048190e23)
2025-02-25 23:40:38 +00:00
Mark Andrews
2d4b4fe15e [9.20] fix: usr: dnssec-signzone needs to check for a NULL key when setting offline
dnssec-signzone could dereference a NULL key pointer when resigning a zone.  This has been fixed.

Closes #5192

Backport of MR !10161

Merge branch 'backport-5192-dnssec-signzone-needs-to-check-for-a-null-key-when-setting-offline-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10169
2025-02-25 23:21:58 +00:00
Mark Andrews
da9fbf72e4 Check if key is NULL before dereferencing it
(cherry picked from commit 1784e4a9ae)
2025-02-25 22:25:55 +00:00
Mark Andrews
a8f422d3dc [9.20] fix: test: Handle example3.db being modified in upforwd system test
The zone file for example3 (ns1/example3.db) can be modified in the
upforwd test as example3 is updated as part of the test.  Whether
the zone is written out or not by the end of the test is timing
dependent.  Rename ns1/example3.db to ns1/example3.db.in and copy it to
ns1/example3.db in setup so we don't trigger post test changes checks.

Closes #5180

Backport of MR !10160

Merge branch 'backport-5180-create-example3-in-setup-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10163
2025-02-25 22:15:40 +00:00
Mark Andrews
9bb9f0f21b Handle example3.db being modified in upforwd system test
The zone file for example3 (ns1/example3.db) can be modified in the
upforwd test as example3 is updated as part of the test.  Whether
the zone is written out or not by the end of the test is timing
dependent.  Rename ns1/example3.db to ns1/example3.db.in and copy
it to ns1/example3.db in setup so we don't trigger post test changes
checks.

(cherry picked from commit afc4413862)
2025-02-25 21:39:55 +00:00
Ondřej Surý
5d913c3383 [9.20] fix: usr: Fix assertion failure when dumping recursing clients
Previously, if a new counter was added to the hashtable
while dumping recursing clients via the `rndc recursing`
command, and `fetches-per-zone` was enabled, an assertion
failure could occur. This has been fixed.

Closes #5200

Backport of MR !10164

Merge branch 'backport-5200-destroy-iterator-inside-the-rwlock-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10168
2025-02-25 16:58:21 +00:00
Ondřej Surý
7682d63bd4 Destroy the hashmap iterator inside the rwlock
Previously, the hashmap iterator for fetches-per-zone was destroyed
outside the rwlock.  This could lead to an assertion failure due to a
timing race with the internal rehashing of the hashmap table as the
rehashing process requires no iterators to be running when rehashing the
hashmap table.  This has been fixed by moving the destruction of the
iterator inside the read locked section.

(cherry picked from commit 1e4fb53c61)
2025-02-25 15:41:30 +00:00
Evan Hunt
b8bd65763c [9.20] fix: dev: Fix a logic error in cache_name()
A change in 6aba56ae8 (checking whether a rejected RRset was identical
to the data it would have replaced, so that we could still cache a
signature) inadvertently introduced cases where processing of a
response would continue when previously it would have been skipped.

Closes #5197

Backport of MR !10157

Merge branch 'backport-5197-cache_name-logic-error-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10158
2025-02-25 00:23:52 +00:00
Evan Hunt
16a80f401a Fix a logic error in cache_name()
A change in 6aba56ae8 (checking whether a rejected RRset was identical
to the data it would have replaced, so that we could still cache a
signature) inadvertently introduced cases where processing of a
response would continue when previously it would have been skipped.

(cherry picked from commit d0fd9cbe3b)
2025-02-24 23:42:25 +00:00
Ondřej Surý
b2033b7e4c [9.20] fix: usr: Dump the active resolver fetches from dns_resolver_dumpfetches()
Previously, active resolver fetches were only dumped when the `fetches-per-zone` configuration option was enabled. Now, active resolver fetches are dumped along with the number of `clients-per-server` counters per resolver fetch.

Backport of MR !10107

Merge branch 'backport-ondrej/make-dns_resolver_dumpfetches-dump-fetches-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10148
2025-02-21 22:05:29 +00:00
Ondřej Surý
37e95cb4dd Dump the fetches from dns_resolver_dumpfetches()
Previously, the dns_resolver_dumpfetches() would go over the fetch
counters.  Alas, because of the earlier optimization, the fetch counters
would be increased only when fetches-per-zone was not 0, otherwise the
whole counting was skipped for performance reasons.

Instead of using the auxiliary fetch counters hash table, use the real
hash table that stores the fetch contexts to dump the ongoing fetches to
the recursing file.

Additionally print more information about the fetch context like start
and expiry times, number of fetch responses, number of queries and count
of allowed and dropped fetches.

(cherry picked from commit c6b0368b21)
2025-02-21 22:05:24 +00:00
Ondřej Surý
20cf51dfc5 [9.20] fix: usr: Fix the data race causing a permanent active client increase
Previously, a data race could cause a newly created fetch context for a new client to be used
before it had been fully initialized, which would cause the query to become stuck; queries for the same
data would be either paused indefinitely or dropped because of
the `clients-per-query` limit. This has been fixed.

Closes #5053

Backport of MR !10146

Merge branch 'backport-5053-fetch-context-create-data-race-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10147
2025-02-21 22:05:16 +00:00
Ondřej Surý
eec7b79ee0 Fix the fetch context hash table lock ordering
The order of the fetch context hash table rwlock and the individual
fetch context was reversed when calling the release_fctx() function.
This was causing a problem when iterating the hash table, and thus the
ordering has been corrected in a way that the hash table rwlock is now
always locked on the outside and the fctx lock is the interior lock.

(cherry picked from commit cf078fadeb)
2025-02-21 22:27:34 +01:00
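The lock ordering described in this commit can be sketched as a toy model. This is Python, not the resolver's C code: `FetchTable` and its fields are hypothetical, and a plain `threading.Lock` stands in for the hash table rwlock. The point is only that the release path and the iteration path take the table lock first and the per-context lock second.

```python
import threading


class FetchTable:
    """Toy stand-in for the fetch context hash table (not BIND's data structure)."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of the hash table rwlock
        self.fctxs = {}               # name -> (per-context lock, state dict)

    def add(self, name):
        with self.lock:
            self.fctxs[name] = (threading.Lock(), {"refs": 1})

    def release(self, name):
        # Outer lock first, inner lock second: the same order the iterator uses.
        with self.lock:
            fctx_lock, state = self.fctxs[name]
            with fctx_lock:
                state["refs"] -= 1
                if state["refs"] == 0:
                    del self.fctxs[name]

    def dump(self):
        # Iteration also takes the table lock, then each context lock in turn.
        with self.lock:
            out = []
            for name, (fctx_lock, state) in sorted(self.fctxs.items()):
                with fctx_lock:
                    out.append((name, state["refs"]))
            return out
```

Because both paths acquire the locks in the same order, a thread iterating the table and a thread releasing a context cannot each hold the lock the other is waiting for.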
Ondřej Surý
ace7c879a8 Add isc_timer_running() function to check status of timer
In the next commit, we need to know whether the timer has been started
or stopped.  Add isc_timer_running() function that returns true if the
timer has been started.

(cherry picked from commit b9e3cd5d2a)
2025-02-21 22:27:25 +01:00
Aram Sargsyan
eca9a3279e [9.20] fix: usr: Fix RPZ race condition during a reconfiguration
With RPZ in use, `named` could terminate unexpectedly because of a race condition when a reconfiguration command was received using `rndc`. This has been fixed.

Closes #5146

Backport of MR !10079

Merge branch 'backport-5146-rpz-reconfig-bug-fix-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10144
2025-02-21 12:45:27 +00:00
Aram Sargsyan
0add37862e Fix RPZ bug when resuming a query during a reconfiguration
After a reconfiguration the old view can be left without a valid
'rpzs' member, because when the RPZ is not changed during the named
reconfiguration 'rpzs' "migrate" from the old view into the new
view, so when a query resumes it can find that 'qctx->view->rpzs'
is NULL which query_resume() currently doesn't expect to happen if
it's recursing and 'qctx->rpz_st' is not NULL.

Fix the issue by adding a NULL-check. In order to not split the log
message into two different log messages depending on whether
'qctx->view->rpzs' is NULL or not, change the message to not log
the RPZ policy's "version" which is just a runtime counter and is
most likely not very useful for the users.

(cherry picked from commit 3ea2fbc238)
2025-02-21 11:45:45 +00:00
Mark Andrews
b752db0c3f [9.20] fix: usr: Remove NSEC/DS/NSEC3 RRSIG check from dns_message_parse
Previously, when parsing responses, named incorrectly rejected responses without matching RRSIG records for NSEC/DS/NSEC3 records in the authority section. This rejection, if appropriate, should have been left for the validator to determine and has been fixed.

Closes #5185

Backport of MR !10125

Merge branch 'backport-5185-remove-rrsig-check-from-dns_message_parse-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10142
2025-02-21 03:37:15 +00:00
Mark Andrews
3279aa7381 Check insecure response with missing RRSIG in authority
This scenario should succeed but did not, due to rejection of the
message at the message parsing stage.

(cherry picked from commit 4271d93f00)
2025-02-21 03:00:29 +00:00
Mark Andrews
db364baa83 Remove check for missing RRSIG records from getsection
Checking whether the authority section is properly signed should
be left to the validator.  Checking in getsection (dns_message_parse)
was way too early and resulted in resolution failures of lookups
that should have otherwise succeeded.

(cherry picked from commit 83159d0a54)
2025-02-21 03:00:29 +00:00
Aram Sargsyan
95af81b674 [9.20] fix: usr: Implement sig0key-checks-limit and sig0message-checks-limit
Previously, a hard-coded limit of at most two key or message
verification checks was introduced when checking the message's
SIG(0) signature. It was done in order to protect against possible
DoS attacks. The logic behind choosing the number 2 was that more
than a single key should only be required during key rotations, and
in that case two keys are enough. But later it became apparent that
there are other use cases too where even more keys are required, see
issue number #5050 in GitLab.

This change introduces two new configuration options for the views,
`sig0key-checks-limit` and `sig0message-checks-limit`, which define how
many keys are allowed to be checked to find a matching key, and how
many message verifications are allowed to take place once a matching
key has been found. The latter protects against expensive cryptographic
operations when there are keys with colliding tags and algorithm
numbers, with default being 2, and the former protects against a bit
less expensive key parsing operations and defaults to 16.

Closes #5050

Backport of MR !9967

Merge branch 'backport-5050-sig0-let-considering-more-than-two-keys-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10141
2025-02-20 15:22:24 +00:00
Aram Sargsyan
33ddef1244 Document sig0key-checks-limit and sig0message-checks-limit
(cherry picked from commit 5861c10dfb)
2025-02-20 14:48:01 +00:00
Aram Sargsyan
5d69aab92d Implement sig0key-checks-limit and sig0message-checks-limit
Previously, a hard-coded limit of at most two key or message
verification checks was introduced when checking the message's
SIG(0) signature. It was done in order to protect against possible
DoS attacks. The logic behind choosing the number two was that more
than one key should be required only during key rotations, and
in that case two keys are enough. But later it became apparent that
there are other use cases too where even more keys are required, see
issue number #5050 in GitLab.

This change introduces two new configuration options for the views,
sig0key-checks-limit and sig0message-checks-limit, which define how
many keys are allowed to be checked to find a matching key, and how
many message verifications are allowed to take place once a matching
key has been found. The latter protects against expensive cryptographic
operations when there are keys with colliding tags and algorithm
numbers, with default being 2, and the former protects against a bit
less expensive key parsing operations and defaults to 16.

(cherry picked from commit 716b936045)
2025-02-20 14:48:01 +00:00
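One way to read the two limits is sketched below. This is an interpretation in Python, not the code from this commit: the function and parameter names are hypothetical, `matches` models the cheaper key parsing and matching step, and `verifies` models the expensive signature verification. The defaults of 16 and 2 are the ones stated in the commit message. How named counts checks in detail may differ.

```python
def find_verifying_key(candidates, matches, verifies,
                       key_limit=16, message_limit=2):
    """Return the first key that verifies the message, or None.

    key_limit models sig0key-checks-limit and message_limit models
    sig0message-checks-limit.
    """
    key_checks = 0
    message_checks = 0
    for key in candidates:
        if key_checks >= key_limit:
            return None  # too many keys inspected
        key_checks += 1
        if not matches(key):
            continue
        if message_checks >= message_limit:
            return None  # too many verifications attempted
        message_checks += 1
        if verifies(key):
            return key
    return None
```

In this model, many keys with colliding tags and algorithm numbers exhaust the smaller verification budget quickly, while a long list of non-matching keys is bounded by the larger key budget.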
Aram Sargsyan
dbc635c148 [9.20] fix: dev: Fix isc_quota bug
Running jobs which were entered into the isc_quota queue is the
responsibility of the isc_quota_release() function, which, when
releasing a previously acquired quota, checks whether the queue
is empty, and if it's not, it runs a job from the queue without touching
the 'quota->used' counter. This mechanism is susceptible to a possible
hangup of a newly queued job if the last quota is released between
the time the decision was made to queue it (because used >= max) and
the time it was actually queued. Since there are no more quotas to
be released (unless some arrive in the future), the newly
entered job will be stuck in the queue.

Fix the issue by adding checks in both isc_quota_release() and
isc_quota_acquire_cb() to make sure that the described hangup does
not happen. Also see code comments.

Closes #4965

Backport of MR !10082

Merge branch 'backport-4965-isc_quota-bug-fix-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10139
2025-02-20 13:34:58 +00:00
Aram Sargsyan
18fbc3f735 Fix isc_quota bug
Running jobs which were entered into the isc_quota queue is the
responsibility of the isc_quota_release() function, which, when
releasing a previously acquired quota, checks whether the queue
is empty, and if it's not, it runs a job from the queue without touching
the 'quota->used' counter. This mechanism is susceptible to a possible
hangup of a newly queued job if the last quota is released between
the time the decision was made to queue it (because used >= max) and
the time it was actually queued. Since there are no more quotas to
be released (unless some arrive in the future), the newly
entered job will be stuck in the queue.

Fix the wrong memory ordering for 'quota->used', as the relaxed
ordering doesn't ensure that data modifications made by one thread
are visible in other threads.

Add checks in both isc_quota_release() and isc_quota_acquire_cb()
to make sure that the described hangup does not happen. Also see
code comments.

(cherry picked from commit c6529891bb)
2025-02-20 12:20:25 +00:00
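The handoff between releasing a quota and running a queued job can be modeled as follows. Note that this Python sketch deliberately swaps in a different technique from the commit: instead of atomics plus extra checks, it uses a single mutex so that "decide to queue" and "actually queue" happen as one step, which closes the window described above. The `Quota` class and its method names are hypothetical and only loosely mirror isc_quota.

```python
import threading
from collections import deque


class Quota:
    """Toy quota with a wait queue; a model of the idea, not isc_quota itself."""

    def __init__(self, maximum):
        self.max = maximum
        self.used = 0
        self.jobs = deque()
        self.lock = threading.Lock()

    def acquire_cb(self, job):
        """Run job now if quota is available, otherwise queue it."""
        with self.lock:
            if self.used < self.max:
                self.used += 1
                run_now = True
            else:
                # Decision and enqueue are atomic, so no release can slip between.
                self.jobs.append(job)
                run_now = False
        if run_now:
            job()
        return run_now

    def release(self):
        """Release one quota, handing it to a queued job if there is one."""
        with self.lock:
            if self.jobs:
                job = self.jobs.popleft()  # slot is handed over; 'used' is unchanged
            else:
                self.used -= 1
                job = None
        if job is not None:
            job()
```

The jobs are run outside the lock so that a job that itself acquires or releases quota does not deadlock on the non-reentrant mutex.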
Aram Sargsyan
4a5a9c8256 [9.20] new: usr: Implement the min-transfer-rate-in configuration option
A new option 'min-transfer-rate-in <bytes> <minutes>' has been added
to the view and zone configurations. It can abort incoming zone
transfers which run very slowly due to network related issues, for
example. The default value is set to 10240 bytes in 5 minutes.

Closes #3914

Backport of MR !9098

Merge branch 'backport-3914-detect-and-restart-stalled-zone-transfers-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10137
2025-02-20 12:18:08 +00:00
Aram Sargsyan
0bd251a496 Expose the incoming transfers' rates in the statistics channel
Expose the average transfer rate (in bytes-per-second) during the
last full 'min-transfer-rate-in <bytes> <minutes>' minutes interval.
If no such interval has passed yet, then the overall average rate is
reported instead.

(cherry picked from commit c701b590e4)
2025-02-20 11:05:09 +00:00
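The reporting rule in this commit (use the last full interval if one exists, otherwise the overall average) can be sketched as below. This is a Python illustration with hypothetical names and arguments; it does not reflect how named stores these counters or how the statistics channel formats them.

```python
def reported_rate(bytes_last_full_interval, interval_seconds,
                  total_bytes, elapsed_seconds):
    """Average incoming transfer rate in bytes per second.

    bytes_last_full_interval is None until one full interval has passed.
    """
    if bytes_last_full_interval is not None:
        # A full interval has completed: report its average rate.
        return bytes_last_full_interval / interval_seconds
    if elapsed_seconds <= 0:
        return 0.0  # nothing measured yet; avoid dividing by zero
    # No full interval yet: fall back to the overall average so far.
    return total_bytes / elapsed_seconds
```

For example, with a 5-minute interval (300 seconds), 30000 bytes in the last full interval gives 100.0 bytes per second regardless of the totals.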
Aram Sargsyan
0f5295af40 Test the new min-transfer-rate-in configuration option
Add a new big zone, run a zone transfer in slow mode, and check
whether the zone transfer gets canceled because 100000 bytes are
not transferred in 5 seconds (as it's running in slow mode).

(cherry picked from commit b9c6aa24f8)
2025-02-20 11:05:09 +00:00
Aram Sargsyan
a1e391aeb3 Document the min-transfer-rate-in configuration option
Add a new section in ARM describing min-transfer-rate-in.

(cherry picked from commit f6dfff01ab)
2025-02-20 11:05:09 +00:00
Aram Sargsyan
e6b14365ad Implement the min-transfer-rate-in configuration option
This new option sets a minimum transfer rate for an incoming zone
transfer; a transfer that runs slower than this rate, for example
because of network-related issues, is aborted.

(cherry picked from commit 91ea156203)
2025-02-20 11:05:09 +00:00
Evan Hunt
9b3e1facf6 [9.20] fix: dev: Do not cache signatures for rejected data
The cache has been updated so that if new data is rejected - for example, because there was already existing data at a higher trust level - then its covering RRSIG will also be rejected.

Closes #5132

Backport of MR !9999

Merge branch 'backport-5132-improve-cd-behavior-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10134
2025-02-20 03:13:14 +00:00
Evan Hunt
a6a75f8262 add a test with an inconsistent NS RRset
add a zone with different NS RRsets in the parent and child,
and test resolver and forwarder behavior with and without +CD.

(cherry picked from commit e4652a0444)
2025-02-19 18:29:47 -08:00
Evan Hunt
fad9b3771f Check whether a rejected rrset is different
Add a new dns_rdataset_equals() function to check whether two
rdatasets are equal in DNSSEC terms.

When an rdataset being cached is rejected because its trust
level is lower than the existing rdataset, we now check to see
whether the rejected data was identical to the existing data.
This allows us to cache a potentially useful RRSIG when handling
CD=1 queries, while still rejecting RRSIGs that would definitely
have resulted in a validation failure.

(cherry picked from commit 6aba56ae89)
2025-02-19 18:29:34 -08:00
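The caching decision described in this commit can be sketched as a small Python function. The names are hypothetical, trust levels are modeled as integers, and "equal in DNSSEC terms" is approximated here by order-insensitive set equality, which is an assumption and not the definition used by dns_rdataset_equals().

```python
def cache_decision(new_trust, old_trust, new_rdata, old_rdata):
    """Return (cache_rdataset, cache_rrsig) for an incoming rdataset.

    old_rdata is None when nothing is cached yet.
    """
    if old_rdata is None or new_trust >= old_trust:
        return (True, True)  # accepted: data and its signature are cached
    # Rejected because the existing data has a higher trust level.
    identical = set(new_rdata) == set(old_rdata)
    # Keep the RRSIG only if it covers data identical to what is already cached.
    return (False, identical)
```

In this model a lower-trust rdataset never replaces cached data, but its RRSIG can still be kept when the rejected data matches, which is the CD=1 case the message mentions.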
Artem Boldariev
9d4aa15c1f [9.20] fix: dev: Post [CVE-2024-12705] Performance Drop Fixes
This merge request fixes a [performance drop](https://gitlab.isc.org/isc-projects/bind9/-/pipelines/216728) after merging the fixes for #4795, in particular in 9.18.

The MR [fixes the problem](https://gitlab.isc.org/isc-projects/bind9/-/pipelines/219825) without affecting performance for the newer versions, in particular for [the development version](https://gitlab.isc.org/isc-projects/bind9/-/pipelines/220619).

Backport of MR !10109

Merge branch 'backport-artem-doh-performance-drop-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10129
2025-02-19 19:28:34 +00:00
Artem Boldariev
788e925261 DoH: http_send_outgoing() return value is not used
The value returned by http_send_outgoing() is not used anywhere, so we
make it not return anything (void). Probably it is an omission from
older times.

(cherry picked from commit 2adabe835a)
2025-02-19 20:34:29 +02:00
Artem Boldariev
47e9b47742 DoH: Fix missing send callback calls
When handling outgoing data, there were a couple of rarely executed
code paths that would not take into account that the callback MUST be
called.

It could lead to potential memory leaks and consequent shutdown hangs.

(cherry picked from commit 8b8f4d500d)
2025-02-19 20:34:29 +02:00
Artem Boldariev
6b9387e2ee DoH: change how the active streams number is calculated
This commit changes how the number of active HTTP streams is
calculated and allows it to scale with the value of the maximum
number of streams per connection, instead of effectively capping it at
STREAM_CLIENTS_PER_CONN.

The original limit is intended to define the pipelining limit
for TCP/DoT. However, it appeared to be too restrictive for DoH, as it
works quite differently and implements pipelining at protocol level by
the means of multiplexing multiple streams. That renders each stream
to be effectively a separate connection from the point of view of the
rest of the codebase.

(cherry picked from commit a22bc2d7d4)
2025-02-19 20:34:29 +02:00
Artem Boldariev
96e8ea1245 DoH: Track the amount of in flight outgoing data
Previously, we would limit the amount of incoming data to process based
solely on the presence of uncompleted send requests. That worked,
however, it was found to severely degrade performance in certain
cases, as was revealed during extended testing.

Now we switch to keeping track of how much data is in flight (or ready
to be in flight) and limit the amount of processed incoming data when
the amount of in flight data surpasses the given threshold, similar
to what we do in other transports.

(cherry picked from commit 05e8a50818)
2025-02-19 20:34:29 +02:00
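The byte-based throttling described in this commit can be illustrated with a minimal Python sketch. The `SendTracker` class, its method names, and the threshold value are hypothetical; the DoH code in BIND is C and tracks this state differently.

```python
class SendTracker:
    """Toy model of limiting input processing by bytes in flight."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical limit, in bytes
        self.in_flight = 0

    def queue_send(self, nbytes):
        # Data handed to the transport (or ready to be) counts as in flight.
        self.in_flight += nbytes

    def send_done(self, nbytes):
        # The send callback fired, so these bytes are no longer in flight.
        self.in_flight -= nbytes

    def may_process_input(self):
        # Stop processing incoming data once the threshold is surpassed.
        return self.in_flight <= self.threshold
```

Compared with checking only whether any send request is pending, counting bytes lets several small sends overlap with input processing and throttles only when the backlog is actually large.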
Andoni Duarte Pintado
5f6080a959 Merge tag 'v9.20.6' into bind-9.20
2025-02-19 17:43:41 +01:00
Evan Hunt
0682684028 [9.20] fix: dev: Delete dead nodes when committing a new version
In the qpzone implementation of `dns_db_closeversion()`, if there are changed nodes that have no remaining data, delete them.

Closes #5169

Backport of MR !10089

Merge branch 'backport-5169-qpzone-delete-dead-nodes-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10124
2025-02-18 23:28:41 +00:00