Commit Graph

612 Commits

Author SHA1 Message Date
Matthijs Mekking
f481073110 Don't set EDE in ns_client_aclchecksilent
The ns_client_aclchecksilent is used to check multiple ACLs before
the decision is made that a query is denied. It is also used to
determine if recursion is available. In those cases we should not
set the extended DNS error "Prohibited".

(cherry picked from commit 798c8f57d4)
2023-01-10 10:02:14 +00:00
Evan Hunt
5fd93c66aa remove nonfunctional DSCP implementation
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.

To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.

(cherry picked from commit 916ea26ead)
2023-01-09 14:23:26 -08:00
Michał Kępień
90408617d7 Check for NULL before dereferencing qctx->rpz_st
Commit 9ffb4a7ba1 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():

    query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
            if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
                                    ^~~~~~~~~~~~~~~~~~~
    1 warning generated.

The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume().  This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt.  Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.

(cherry picked from commit 07592d1315)
(cherry picked from commit a4547a1093)
2023-01-09 14:26:02 +01:00
Matthijs Mekking
271bc20b1c Consider non-stale data when in serve-stale mode
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:

A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.

When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.

With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.

(cherry picked from commit 91a1a8efc5)
2023-01-09 14:26:02 +01:00
Artem Boldariev
5de938c6cf Fix TLS session resumption via IDs when Mutual TLS is used
This commit fixes TLS session resumption via session IDs when
client certificates are used. To do so it makes sure that session ID
contexts are set within server TLS contexts. See OpenSSL documentation
for 'SSL_CTX_set_session_id_context()', the "Warnings" section.

(cherry picked from commit 837fef78b1)
2022-12-14 18:32:26 +02:00
Tom Krizek
f4d0b2dca9 Revert "Merge branch '3678-serve-stale-servfailing-unexpectedly-v9_18' into 'v9_18'"
This reverts commit 81b6f17e7c, reversing
changes made to ea47a9c100.

It also removes release note 6038, since the fix is reverted.
2022-12-08 10:22:33 +01:00
Mark Andrews
6f998bbe51 Extend dns_db_allrdatasets to control interation results
Add an options parameter to control what rdatasets are returned when
iteratating over the node.  Specific modes will be added later.

(cherry picked from commit 7695c36a5d)
2022-12-07 23:59:36 +00:00
Ondřej Surý
85e35d4c27 Propagate the shutdown event to the recursing ns_client(s)
Send the ns_query_cancel() on the recursing clients when we initiate the
named shutdown for faster shutdown.

When we are shutting down the resolver, we cancel all the outstanding
fetches, and the ISC_R_CANCEL events doesn't propagate to the ns_client
callback.

In the future, the better solution how to fix this would be to look at
the shutdown paths and let them all propagate from bottom (loopmgr) to
top (f.e. ns_client).

(cherry picked from commit d861d403bb9a7912e29a06aba6caf6d502839f1b)
2022-12-07 18:08:29 +01:00
Michał Kępień
a4547a1093 Check for NULL before dereferencing qctx->rpz_st
Commit 9ffb4a7ba1 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():

    query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
            if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
                                    ^~~~~~~~~~~~~~~~~~~
    1 warning generated.

The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume().  This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt.  Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.

(cherry picked from commit 07592d1315)
2022-12-06 13:47:51 +00:00
Matthijs Mekking
65443cb59f Consider non-stale data when in serve-stale mode
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:

A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.

When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.

With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.

(cherry picked from commit 86a80e723f)
2022-12-06 13:47:22 +00:00
Michal Nowak
1d7d504338 Update sources to Clang 15 formatting 2022-11-29 09:14:07 +01:00
Evan Hunt
2cc8874c90 ensure RPZ lookups handle CD=1 correctly
RPZ rewrites called dns_db_findext() without passing through the
client database options; as as result, if the client set CD=1,
DNS_DBFIND_PENDINGOK was not used as it should have been, and
cache lookups failed, resulting in failure of the rewrite.

(cherry picked from commit 305a50dbe1)
2022-10-19 13:12:31 -07:00
Aram Sargsyan
82991451b4 Fix ns_statscounter_recursclients counting bug
The incrementing and decrementing of 'ns_statscounter_recursclients'
were not properly balanced: for example, it would be incremented for
a prefetch query but not decremented if the query failed.

This commit ensures that the recursion quota and the recursive clients
counter are always in sync with each other.
2022-10-18 08:54:04 +00:00
Tony Finch
f273fdfc12 De-duplicate __FILE__, __LINE__
Mostly generated automatically with the following semantic patch,
except where coccinelle was confused by #ifdef in lib/isc/net.c

@@ expression list args; @@
- UNEXPECTED_ERROR(__FILE__, __LINE__, args)
+ UNEXPECTED_ERROR(args)
@@ expression list args; @@
- FATAL_ERROR(__FILE__, __LINE__, args)
+ FATAL_ERROR(args)

(cherry picked from commit ec50c58f52)
2022-10-17 16:00:26 +01:00
Michał Kępień
0a53f61727 Merge tag 'v9_18_7' into v9_18
BIND 9.18.7
2022-09-21 13:13:30 +02:00
Evan Hunt
592c7b1049 fix an incorrect detach in update processing
when processing UDPATE requests, hold the request handle until
we either drop the request or respond to it.

(cherry picked from commit 00e0758e12)
2022-09-15 11:34:33 -07:00
Petr Špaček
c095ac9ad1 Log reason why cache peek is not available
Log which ACL caused RD=0 query into cache to be refused.
Expected performance impact is negligible.

(cherry picked from commit fdf7456643)
2022-09-15 09:41:01 +02:00
Petr Špaček
e067d11396 Log reason why recursion is not available
Log which ACL caused RA=0 condition.
Expected performance impact is negligible.

(cherry picked from commit 95fc05c454)
2022-09-15 09:40:57 +02:00
Matthijs Mekking
b9e2f3333d Only refresh RRset once
Don't attempt to resolve DNS responses for intermediate results. This
may create multiple refreshes and can cause a crash.

One scenario is where for the query there is a CNAME and canonical
answer in cache that are both stale. This will trigger a refresh of
the RRsets because we encountered stale data and we prioritized it over
the lookup. It will trigger a refresh of both RRsets. When we start
recursing, it will detect a recursion loop because the recursion
parameters will eventually be the same. In 'dns_resolver_destroyfetch'
the sanity check fails, one of the callers did not get its event back
before trying to destroy the fetch.

Move the call to 'query_refresh_rrset' to 'ns_query_done', so that it
is only called once per client request.

Another scenario is where for the query there is a stale CNAME in the
cache that points to a record that is also in cache but not stale. This
will trigger a refresh of the RRset (because we encountered stale data
and we prioritized it over the lookup).

We mark RRsets that we add to the message with
DNS_RDATASETATTR_STALE_ADDED to prevent adding a duplicate RRset when
a stale lookup and a normal lookup conflict with each other. However,
the other non-stale RRset when following a CNAME chain will be added to
the message without setting that attribute, because it is not stale.

This is a variant of the bug in #2594. The fix covered the same crash
but for stale-answer-client-timeout > 0.

Fix this by clearing all RRsets from the message before refreshing.
This requires the refresh to happen after the query is send back to
the client.

(cherry picked from commit d939d2ecde)
2022-09-08 11:50:44 +02:00
Aram Sargsyan
35e37505f0 Fix RRL responses-per-second bypass using wildcard names
It is possible to bypass Response Rate Limiting (RRL)
`responses-per-second` limitation using specially crafted wildcard
names, because the current implementation, when encountering a found
DNS name generated from a wildcard record, just strips the leftmost
label of the name before making a key for the bucket.

While that technique helps with limiting random requests like
<random>.example.com (because all those requests will be accounted
as belonging to a bucket constructed from "example.com" name), it does
not help with random names like subdomain.<random>.example.com.

The best solution would have been to strip not just the leftmost
label, but as many labels as necessary until reaching the suffix part
of the wildcard record from which the found name is generated, however,
we do not have that information readily available in the context of RRL
processing code.

Fix the issue by interpreting all valid wildcard domain names as
the zone's origin name concatenated to the "*" name, so they all will
be put into the same bucket.

(cherry picked from commit baa9698c9d)
2022-09-08 09:36:50 +02:00
Evan Hunt
acfca3f4fa when creating an interface, set magic before linking
set the magic number in a newly-created interface object
before appending it to mgr->interfaces in order to prevent
a possible assertion.

(cherry picked from commit 8c01662048)
2022-09-06 21:48:28 -07:00
Matthijs Mekking
39c0c5022d Wait with NSEC3 during a DNSSEC policy change
When doing a dnssec-policy reconfiguration from a zone with NSEC only
keys to a zone that uses NSEC3, figure out to wait with building the
NSEC3 chain.

Previously, BIND 9 would attempt to sign such a zone, but failed to
do so because the NSEC3 chain conflicted with existing DNSKEY records
in the zone that were not compatible with NSEC3.

There exists logic for detecting such a case in the functions
dnskey_sane() (in lib/dns/zone.c) and check_dnssec() (in
lib/ns/update.c). Both functions look very similar so refactor them
to use the same code and call the new function (called
dns_zone_check_dnskey_nsec3()).

Also update the dns_nsec_nseconly() function to take an additional
parameter 'diff' that, if provided, will be checked whether an
offending NSEC only DNSKEY will be deleted from the zone. If so,
this key will not be considered when checking the zone for NSEC only
DNSKEYs. This is needed to allow a transition from an NSEC zone with
NSEC only DNSKEYs to an NSEC3 zone.

(cherry picked from commit 09a81dc84ce0fee37442f03cdbd63c2398215376)
2022-08-22 19:21:39 +02:00
Matthijs Mekking
5e908a988f Don't enable serve-stale on duplicate queries
When checking if we should enable serve-stale, add an early out case
when the result is an error signalling a duplicate query or a query
that would be dropped.

(cherry picked from commit 059a4c2f4d)
2022-08-09 09:36:11 +02:00
Ondřej Surý
3c1d6e164e Increase the BUFSIZ-long buffers
The BUFSIZ value varies between platforms, it could be 8K on Linux and
512 bytes on mingw.  Make sure the buffers are always big enough for the
output data to prevent truncation of the output by appropriately
enlarging or sizing the buffers.

(cherry picked from commit b19d932262)
2022-07-15 21:16:51 +02:00
Evan Hunt
cc3070a0b3 log the reason for falling back to AXFR from IXFR at level info
messages indicating the reason for a fallback to AXFR (i.e, because
the requested serial number is not present in the journal, or because
the size of the IXFR response would exceeed "max-ixfr-ratio") are now
logged at level info instead of debug(4).

(cherry picked from commit df1d81cf96)
2022-07-12 16:26:13 -07:00
Mark Andrews
44bfc8a9b2 Clone the message buffer before forwarding UPDATE messages
this prevents named forwarding a buffer that may have been over
written.

(cherry picked from commit 7a42417d61)
2022-07-12 19:00:38 +10:00
Mark Andrews
4d9287dca5 Check the synth-form-dnssec namespace when synthesising responses
Call dns_view_sfd_find to find the namespace to be used to verify
the covering NSEC records returned for the given QNAME.  Check that
the NSEC owner names are within that namespace.

(cherry picked from commit 228dadb026)
2022-07-07 07:47:45 +10:00
Artem Boldariev
b6b07c5646 Update the set of HTTP endpoints on reconfiguration
This commit ensures that on reconfiguration the set of HTTP
endpoints (=paths) is being updated within HTTP listeners.

(cherry picked from commit d2e13ddf22)
2022-06-28 16:37:31 +03:00
Artem Boldariev
bb8ba2c027 Update max concurrent streams limit in HTTP listeners on reconfig
This commit ensures that HTTP listeners concurrent streams limit gets
updated properly on reconfiguration.

(cherry picked from commit e72962d5f1)
2022-06-28 16:37:31 +03:00
Artem Boldariev
1ccbb24078 Update HTTP listeners quotas on reconfiguration
This commit ensures that on reconfiguration a proper value for HTTP
connections limit is picked up.

The commit also refactors how listeners settings are updated so that
there is less code duplication.

(cherry picked from commit a2379135fa)
2022-06-28 16:37:31 +03:00
Artem Boldariev
63a4c12227 Store HTTP quota size inside a listenlist instead of the quota
This way only quota size is passed to the interface/listener
management code instead of a quota object. Thus, we can implement
updating the quota object size instead of recreating the object.

(cherry picked from commit 3f0b310772)
2022-06-28 16:37:31 +03:00
Michał Kępień
457b666c6f Fix destination port extraction for client queries
The current logic for determining the address of the socket to which a
client sent its query is:

 1. Get the address:port tuple from the netmgr handle using
    isc_nmhandle_localaddr().

 2. Convert the address:port tuple from step 1 into an isc_netaddr_t
    using isc_netaddr_fromsockaddr().

 3. Convert the address from step 2 back into a socket address with the
    port set to 0 using isc_sockaddr_fromnetaddr().

Note that the port number (readily available in the netmgr handle) is
needlessly lost in the process, preventing it from being recorded in
dnstap captures of client traffic produced by named.

Fix by first storing the address:port tuple returned by
isc_nmhandle_localaddr() in client->destsockaddr and then creating an
isc_netaddr_t from that structure.  This allows the port number to be
retained in client->destsockaddr, which is what subsequently gets passed
to dns_dt_send().

(cherry picked from commit 2f945703f2)
2022-06-22 13:46:42 +02:00
Michal Nowak
d3eb307e3c Update clang to version 14
(cherry picked from commit 1c45a9885a)
2022-06-16 18:09:33 +02:00
Artem Boldariev
6ec48f1e78 Extend TLS context cache with TLS client session cache
This commit extends TLS context cache with TLS client session cache so
that an associated session cache can be stored alongside the TLS
context within the context cache.

(cherry picked from commit 987892d113)
2022-06-15 17:02:45 +03:00
Aram Sargsyan
12aefe6ced Fix a race condition between shutdown and route_connected()
When shutting down, the interface manager can be destroyed
before the `route_connected()` callback is called, which is
unexpected for the latter and can cause a crash.

Move the interface manager attachment code from the callback
to the place before the callback is registered using
`isc_nm_routeconnect()` function, which will make sure that
the interface manager will live at least until the callback
is called.

Make sure to detach the interface manager if the
`isc_nm_routeconnect()` function is not implemented, or when
the callback is called with a result value which differs from
`ISC_R_SUCCESS`.

(cherry picked from commit f6e729635f)
2022-06-14 14:57:23 +00:00
Aram Sargsyan
e92b261235 Do not use the interface manager until it is ready
The `ns_interfacemgr_create()` function, when calling
`isc_nm_routeconnect()`, uses the newly created `ns_interfacemgr_t`
instance before initializing its reference count and the magic value.

Defer the `isc_nm_routeconnect()` call until the initializations
are complete.

(cherry picked from commit 1d93fe973b)
2022-06-14 14:55:57 +00:00
Ondřej Surý
f128a9bcf2 Move all the unit tests to /tests/<libname>/
The unit tests are now using a common base, which means that
lib/dns/tests/ code now has to include lib/isc/include/isc/test.h and
link with lib/isc/test.c and lib/ns/tests has to include both libisc and
libdns parts.

Instead of cross-linking code between the directories, move the
/lib/<foo>/test.c to /tests/<foo>.c and /lib/<foo>/include/<foo>test.h
to /tests/include/tests/<foo>.h and create a single libtest.la
convenience library in /tests/.

At the same time, move the /lib/<foo>/tests/ to /tests/<foo>/ (but keep
it symlinked to the old location) and adjust paths accordingly.  In few
places, we are now using absolute paths instead of relative paths,
because the directory level has changed.  By moving the directories
under the /tests/ directory, the test-related code is kept in a single
place and we can avoid referencing files between libns->libdns->libisc
which is unhealthy because they live in a separate Makefile-space.

In the future, the /bin/tests/ should be merged to /tests/ and symlink
kept, and the /fuzz/ directory moved to /tests/fuzz/.

(cherry picked from commit 2c3b2dabe9)
2022-05-31 12:06:00 +02:00
Ondřej Surý
f0df0d679a Give the unit tests a big overhaul
The unit tests contain a lot of duplicated code and here's an attempt
to reduce code duplication.

This commit does several things:

1. Remove #ifdef HAVE_CMOCKA - we already solve this with automake
   conditionals.

2. Create a set of ISC_TEST_* and ISC_*_TEST_ macros to wrap the test
   implementations, test lists, and the main test routine, so we don't
   have to repeat this all over again.  The macros were modeled after
   libuv test suite but adapted to cmocka as the test driver.

   A simple example of a unit test would be:

    ISC_RUN_TEST_IMPL(test1) { assert_true(true); }

    ISC_TEST_LIST_START
    ISC_TEST_ENTRY(test1)
    ISC_TEST_LIST_END

    ISC_TEST_MAIN (Discussion: Should this be ISC_TEST_RUN ?)

   For more complicated examples including group setup and teardown
   functions, and per-test setup and teardown functions.

3. The macros prefix the test functions and cmocka entries, so the name
   of the test can now match the tested function name, and we don't have
   to append `_test` because `run_test_` is automatically prepended to
   the main test function, and `setup_test_` and `teardown_test_` is
   prepended to setup and teardown function.

4. Update all the unit tests to use the new syntax and fix a few bits
   here and there.

5. In the future, we can separate the test declarations and test
   implementations which are going to greatly help with uncluttering the
   bigger unit tests like doh_test and netmgr_test, because the test
   implementations are not declared static (see `ISC_RUN_TEST_DECLARE`
   and `ISC_RUN_TEST_IMPL` for more details.

NOTE: This heavily relies on preprocessor macros, but the result greatly
outweighs all the negatives of using the macros.  There's less
duplicated code, the tests are more uniform and the implementation can
be more flexible.

(cherry picked from commit 63fe9312ff)
2022-05-31 11:34:54 +02:00
Evan Hunt
60e25826c6 Cleanup: always count ns_statscounter_recursclients
The ns_statscounter_recursclients counter was previously only
incremented or decremented if client->recursionquota was non-NULL.
This was harmless, because that value should always be non-NULL if
recursion is enabled, but it made the code slightly confusing.

(cherry picked from commit 0201eab655)
2022-05-14 00:59:09 -07:00
Mark Andrews
a742b7c5d7 Check the cache as well when glue NS are returned processing RPZ
(cherry picked from commit 8fb72012e3)
2022-05-04 23:52:29 +10:00
Mark Andrews
83cb796dcd Process learned records as well as glue
(cherry picked from commit 07c828531c)
2022-05-04 23:52:29 +10:00
Mark Andrews
9b467801ac Process the delegating NS RRset when checking rpz rules
(cherry picked from commit cf97c61f48)
2022-05-04 23:52:29 +10:00
Artem Boldariev
6c05fb09c3 Add support for Strict/Mutual TLS into BIND
This commit adds support for Strict/Mutual TLS into BIND. It does so
by implementing the backing code for 'hostname' and 'ca-file' options
of the 'tls' statement. The commit also updates the documentation
accordingly.
2022-04-28 13:39:21 +03:00
Artem Boldariev
e2a3ec2ba5 Extend TLS context cache with CA certificates store
This commit adds support for keeping CA certificates stores associated
with TLS contexts. The intention is to keep one reusable store per a
set of related TLS contexts.
2022-04-28 13:39:21 +03:00
Matthijs Mekking
125603a543 Add stale answer extended errors
Add DNS extended errors 3 (Stale Answer) and 19 (Stale NXDOMAIN Answer)
to responses. Add extra text with the reason why the stale answer was
returned.

To test, we need to change the configuration such that for the first
set of tests the stale-refresh-time window does not interfer with the
expected extended errors.

(cherry picked from commit c66b9abc0b)
2022-04-28 11:21:22 +02:00
Evan Hunt
af37c6f60c prevent a deadlock in the shutdown system test
The shutdown test sends 'rdnc status' commands in parallel with
'rndc stop' A new rndc connection arriving will reference the ACL
environment to see whether the client is allowed to connect.
Commit c0995bc380 added a mutex lock to ns_interfacemgr_getaclenv(),
but if the new connection arrives while the interfaces are being
purged during shutdown, that lock is already being held. If the
the connection event slips in ahead of one of the netmgr's "stop
listening" events on a worker thread, a deadlock can occur.

The fix is not to hold the interfacemgr lock while shutting down
interfaces; only while actually traversing the interface list to
identify interfaces needing shutdown.

(cherry picked from commit 5c4cf3fcc4)
2022-04-27 23:56:59 -07:00
Artem Boldariev
f39041ee65 Replace listener TLS contexts on reconfiguration
This commit makes use of isc_nmsocket_set_tlsctx(). Now, instead of
recreating TLS-enabled listeners (including the underlying TCP
listener sockets), only the TLS context in use is replaced.
2022-04-27 23:58:38 +03:00
Ondřej Surý
192df8d2f1 The route socket and its storage was detached while still reading
The interfacemgr and the .route was being detached while the network
manager had pending read from the socket.  Instead of detaching from the
socket, we need to cancel the read which in turn will detach the route
socket and the associated interfacemgr.

(cherry picked from commit 9ae34a04e8)
2022-04-26 16:41:24 +02:00
Ondřej Surý
8beaee0b08 Remove task exclusive mode from ns_clientmgr
The .lock, .exiting and .excl members were not using for anything else
than starting task exclusive mode, setting .exiting to true and ending
exclusive mode.

Remove all the stray members and dead code eliminating the task
exclusive mode use from ns_clientmgr.

(cherry picked from commit 4f74e1010e)
2022-04-26 15:56:30 +02:00
Ondřej Surý
ce8ffdda69 Remove exclusive mode from ns_interfacemgr
Now that the dns_aclenv_t has now properly rwlocked .localhost and
.localnets member, we can remove the task exclusive mode use from the
ns_interfacemgr.  Some light related cleanup has been also done.

(cherry picked from commit c0995bc380)
2022-04-26 14:21:57 +02:00