The ns_client_aclchecksilent is used to check multiple ACLs before
the decision is made that a query is denied. It is also used to
determine if recursion is available. In those cases we should not
set the extended DNS error "Prohibited".
(cherry picked from commit 798c8f57d4)
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
(cherry picked from commit 916ea26ead)
Commit 9ffb4a7ba1 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():
query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
^~~~~~~~~~~~~~~~~~~
1 warning generated.
The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume(). This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt. Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.
(cherry picked from commit 07592d1315)
(cherry picked from commit a4547a1093)
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:
A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.
When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.
With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.
(cherry picked from commit 91a1a8efc5)
This commit fixes TLS session resumption via session IDs when
client certificates are used. To do so it makes sure that session ID
contexts are set within server TLS contexts. See OpenSSL documentation
for 'SSL_CTX_set_session_id_context()', the "Warnings" section.
(cherry picked from commit 837fef78b1)
Add an options parameter to control what rdatasets are returned when
iteratating over the node. Specific modes will be added later.
(cherry picked from commit 7695c36a5d)
Send the ns_query_cancel() on the recursing clients when we initiate the
named shutdown for faster shutdown.
When we are shutting down the resolver, we cancel all the outstanding
fetches, and the ISC_R_CANCEL events doesn't propagate to the ns_client
callback.
In the future, the better solution how to fix this would be to look at
the shutdown paths and let them all propagate from bottom (loopmgr) to
top (f.e. ns_client).
(cherry picked from commit d861d403bb9a7912e29a06aba6caf6d502839f1b)
Commit 9ffb4a7ba1 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():
query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
^~~~~~~~~~~~~~~~~~~
1 warning generated.
The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume(). This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt. Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.
(cherry picked from commit 07592d1315)
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:
A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.
When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.
With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.
(cherry picked from commit 86a80e723f)
RPZ rewrites called dns_db_findext() without passing through the
client database options; as as result, if the client set CD=1,
DNS_DBFIND_PENDINGOK was not used as it should have been, and
cache lookups failed, resulting in failure of the rewrite.
(cherry picked from commit 305a50dbe1)
The incrementing and decrementing of 'ns_statscounter_recursclients'
were not properly balanced: for example, it would be incremented for
a prefetch query but not decremented if the query failed.
This commit ensures that the recursion quota and the recursive clients
counter are always in sync with each other.
Mostly generated automatically with the following semantic patch,
except where coccinelle was confused by #ifdef in lib/isc/net.c
@@ expression list args; @@
- UNEXPECTED_ERROR(__FILE__, __LINE__, args)
+ UNEXPECTED_ERROR(args)
@@ expression list args; @@
- FATAL_ERROR(__FILE__, __LINE__, args)
+ FATAL_ERROR(args)
(cherry picked from commit ec50c58f52)
Don't attempt to resolve DNS responses for intermediate results. This
may create multiple refreshes and can cause a crash.
One scenario is where for the query there is a CNAME and canonical
answer in cache that are both stale. This will trigger a refresh of
the RRsets because we encountered stale data and we prioritized it over
the lookup. It will trigger a refresh of both RRsets. When we start
recursing, it will detect a recursion loop because the recursion
parameters will eventually be the same. In 'dns_resolver_destroyfetch'
the sanity check fails, one of the callers did not get its event back
before trying to destroy the fetch.
Move the call to 'query_refresh_rrset' to 'ns_query_done', so that it
is only called once per client request.
Another scenario is where for the query there is a stale CNAME in the
cache that points to a record that is also in cache but not stale. This
will trigger a refresh of the RRset (because we encountered stale data
and we prioritized it over the lookup).
We mark RRsets that we add to the message with
DNS_RDATASETATTR_STALE_ADDED to prevent adding a duplicate RRset when
a stale lookup and a normal lookup conflict with each other. However,
the other non-stale RRset when following a CNAME chain will be added to
the message without setting that attribute, because it is not stale.
This is a variant of the bug in #2594. The fix covered the same crash
but for stale-answer-client-timeout > 0.
Fix this by clearing all RRsets from the message before refreshing.
This requires the refresh to happen after the query is send back to
the client.
(cherry picked from commit d939d2ecde)
It is possible to bypass Response Rate Limiting (RRL)
`responses-per-second` limitation using specially crafted wildcard
names, because the current implementation, when encountering a found
DNS name generated from a wildcard record, just strips the leftmost
label of the name before making a key for the bucket.
While that technique helps with limiting random requests like
<random>.example.com (because all those requests will be accounted
as belonging to a bucket constructed from "example.com" name), it does
not help with random names like subdomain.<random>.example.com.
The best solution would have been to strip not just the leftmost
label, but as many labels as necessary until reaching the suffix part
of the wildcard record from which the found name is generated, however,
we do not have that information readily available in the context of RRL
processing code.
Fix the issue by interpreting all valid wildcard domain names as
the zone's origin name concatenated to the "*" name, so they all will
be put into the same bucket.
(cherry picked from commit baa9698c9d)
set the magic number in a newly-created interface object
before appending it to mgr->interfaces in order to prevent
a possible assertion.
(cherry picked from commit 8c01662048)
When doing a dnssec-policy reconfiguration from a zone with NSEC only
keys to a zone that uses NSEC3, figure out to wait with building the
NSEC3 chain.
Previously, BIND 9 would attempt to sign such a zone, but failed to
do so because the NSEC3 chain conflicted with existing DNSKEY records
in the zone that were not compatible with NSEC3.
There exists logic for detecting such a case in the functions
dnskey_sane() (in lib/dns/zone.c) and check_dnssec() (in
lib/ns/update.c). Both functions look very similar so refactor them
to use the same code and call the new function (called
dns_zone_check_dnskey_nsec3()).
Also update the dns_nsec_nseconly() function to take an additional
parameter 'diff' that, if provided, will be checked whether an
offending NSEC only DNSKEY will be deleted from the zone. If so,
this key will not be considered when checking the zone for NSEC only
DNSKEYs. This is needed to allow a transition from an NSEC zone with
NSEC only DNSKEYs to an NSEC3 zone.
(cherry picked from commit 09a81dc84ce0fee37442f03cdbd63c2398215376)
When checking if we should enable serve-stale, add an early out case
when the result is an error signalling a duplicate query or a query
that would be dropped.
(cherry picked from commit 059a4c2f4d)
The BUFSIZ value varies between platforms, it could be 8K on Linux and
512 bytes on mingw. Make sure the buffers are always big enough for the
output data to prevent truncation of the output by appropriately
enlarging or sizing the buffers.
(cherry picked from commit b19d932262)
messages indicating the reason for a fallback to AXFR (i.e, because
the requested serial number is not present in the journal, or because
the size of the IXFR response would exceeed "max-ixfr-ratio") are now
logged at level info instead of debug(4).
(cherry picked from commit df1d81cf96)
Call dns_view_sfd_find to find the namespace to be used to verify
the covering NSEC records returned for the given QNAME. Check that
the NSEC owner names are within that namespace.
(cherry picked from commit 228dadb026)
This commit ensures that on reconfiguration the set of HTTP
endpoints (=paths) is being updated within HTTP listeners.
(cherry picked from commit d2e13ddf22)
This commit ensures that on reconfiguration a proper value for HTTP
connections limit is picked up.
The commit also refactors how listeners settings are updated so that
there is less code duplication.
(cherry picked from commit a2379135fa)
This way only quota size is passed to the interface/listener
management code instead of a quota object. Thus, we can implement
updating the quota object size instead of recreating the object.
(cherry picked from commit 3f0b310772)
The current logic for determining the address of the socket to which a
client sent its query is:
1. Get the address:port tuple from the netmgr handle using
isc_nmhandle_localaddr().
2. Convert the address:port tuple from step 1 into an isc_netaddr_t
using isc_netaddr_fromsockaddr().
3. Convert the address from step 2 back into a socket address with the
port set to 0 using isc_sockaddr_fromnetaddr().
Note that the port number (readily available in the netmgr handle) is
needlessly lost in the process, preventing it from being recorded in
dnstap captures of client traffic produced by named.
Fix by first storing the address:port tuple returned by
isc_nmhandle_localaddr() in client->destsockaddr and then creating an
isc_netaddr_t from that structure. This allows the port number to be
retained in client->destsockaddr, which is what subsequently gets passed
to dns_dt_send().
(cherry picked from commit 2f945703f2)
This commit extends TLS context cache with TLS client session cache so
that an associated session cache can be stored alongside the TLS
context within the context cache.
(cherry picked from commit 987892d113)
When shutting down, the interface manager can be destroyed
before the `route_connected()` callback is called, which is
unexpected for the latter and can cause a crash.
Move the interface manager attachment code from the callback
to the place before the callback is registered using
`isc_nm_routeconnect()` function, which will make sure that
the interface manager will live at least until the callback
is called.
Make sure to detach the interface manager if the
`isc_nm_routeconnect()` function is not implemented, or when
the callback is called with a result value which differs from
`ISC_R_SUCCESS`.
(cherry picked from commit f6e729635f)
The `ns_interfacemgr_create()` function, when calling
`isc_nm_routeconnect()`, uses the newly created `ns_interfacemgr_t`
instance before initializing its reference count and the magic value.
Defer the `isc_nm_routeconnect()` call until the initializations
are complete.
(cherry picked from commit 1d93fe973b)
The unit tests are now using a common base, which means that
lib/dns/tests/ code now has to include lib/isc/include/isc/test.h and
link with lib/isc/test.c and lib/ns/tests has to include both libisc and
libdns parts.
Instead of cross-linking code between the directories, move the
/lib/<foo>/test.c to /tests/<foo>.c and /lib/<foo>/include/<foo>test.h
to /tests/include/tests/<foo>.h and create a single libtest.la
convenience library in /tests/.
At the same time, move the /lib/<foo>/tests/ to /tests/<foo>/ (but keep
it symlinked to the old location) and adjust paths accordingly. In few
places, we are now using absolute paths instead of relative paths,
because the directory level has changed. By moving the directories
under the /tests/ directory, the test-related code is kept in a single
place and we can avoid referencing files between libns->libdns->libisc
which is unhealthy because they live in a separate Makefile-space.
In the future, the /bin/tests/ should be merged to /tests/ and symlink
kept, and the /fuzz/ directory moved to /tests/fuzz/.
(cherry picked from commit 2c3b2dabe9)
The unit tests contain a lot of duplicated code and here's an attempt
to reduce code duplication.
This commit does several things:
1. Remove #ifdef HAVE_CMOCKA - we already solve this with automake
conditionals.
2. Create a set of ISC_TEST_* and ISC_*_TEST_ macros to wrap the test
implementations, test lists, and the main test routine, so we don't
have to repeat this all over again. The macros were modeled after
libuv test suite but adapted to cmocka as the test driver.
A simple example of a unit test would be:
ISC_RUN_TEST_IMPL(test1) { assert_true(true); }
ISC_TEST_LIST_START
ISC_TEST_ENTRY(test1)
ISC_TEST_LIST_END
ISC_TEST_MAIN (Discussion: Should this be ISC_TEST_RUN ?)
For more complicated examples including group setup and teardown
functions, and per-test setup and teardown functions.
3. The macros prefix the test functions and cmocka entries, so the name
of the test can now match the tested function name, and we don't have
to append `_test` because `run_test_` is automatically prepended to
the main test function, and `setup_test_` and `teardown_test_` is
prepended to setup and teardown function.
4. Update all the unit tests to use the new syntax and fix a few bits
here and there.
5. In the future, we can separate the test declarations and test
implementations which are going to greatly help with uncluttering the
bigger unit tests like doh_test and netmgr_test, because the test
implementations are not declared static (see `ISC_RUN_TEST_DECLARE`
and `ISC_RUN_TEST_IMPL` for more details.
NOTE: This heavily relies on preprocessor macros, but the result greatly
outweighs all the negatives of using the macros. There's less
duplicated code, the tests are more uniform and the implementation can
be more flexible.
(cherry picked from commit 63fe9312ff)
The ns_statscounter_recursclients counter was previously only
incremented or decremented if client->recursionquota was non-NULL.
This was harmless, because that value should always be non-NULL if
recursion is enabled, but it made the code slightly confusing.
(cherry picked from commit 0201eab655)
This commit adds support for Strict/Mutual TLS into BIND. It does so
by implementing the backing code for 'hostname' and 'ca-file' options
of the 'tls' statement. The commit also updates the documentation
accordingly.
This commit adds support for keeping CA certificates stores associated
with TLS contexts. The intention is to keep one reusable store per a
set of related TLS contexts.
Add DNS extended errors 3 (Stale Answer) and 19 (Stale NXDOMAIN Answer)
to responses. Add extra text with the reason why the stale answer was
returned.
To test, we need to change the configuration such that for the first
set of tests the stale-refresh-time window does not interfer with the
expected extended errors.
(cherry picked from commit c66b9abc0b)
The shutdown test sends 'rdnc status' commands in parallel with
'rndc stop' A new rndc connection arriving will reference the ACL
environment to see whether the client is allowed to connect.
Commit c0995bc380 added a mutex lock to ns_interfacemgr_getaclenv(),
but if the new connection arrives while the interfaces are being
purged during shutdown, that lock is already being held. If the
the connection event slips in ahead of one of the netmgr's "stop
listening" events on a worker thread, a deadlock can occur.
The fix is not to hold the interfacemgr lock while shutting down
interfaces; only while actually traversing the interface list to
identify interfaces needing shutdown.
(cherry picked from commit 5c4cf3fcc4)
This commit makes use of isc_nmsocket_set_tlsctx(). Now, instead of
recreating TLS-enabled listeners (including the underlying TCP
listener sockets), only the TLS context in use is replaced.
The interfacemgr and the .route was being detached while the network
manager had pending read from the socket. Instead of detaching from the
socket, we need to cancel the read which in turn will detach the route
socket and the associated interfacemgr.
(cherry picked from commit 9ae34a04e8)
The .lock, .exiting and .excl members were not using for anything else
than starting task exclusive mode, setting .exiting to true and ending
exclusive mode.
Remove all the stray members and dead code eliminating the task
exclusive mode use from ns_clientmgr.
(cherry picked from commit 4f74e1010e)
Now that the dns_aclenv_t has now properly rwlocked .localhost and
.localnets member, we can remove the task exclusive mode use from the
ns_interfacemgr. Some light related cleanup has been also done.
(cherry picked from commit c0995bc380)