Commit Graph

659 Commits

Author SHA1 Message Date
Mark Andrews
da6359345e Add check-svcb to named
check-svcb signals whether to perform additional contraint tests
when loading / update primary zone files.
2022-10-29 00:22:54 +11:00
Evan Hunt
305a50dbe1 ensure RPZ lookups handle CD=1 correctly
RPZ rewrites called dns_db_findext() without passing through the
client database options; as as result, if the client set CD=1,
DNS_DBFIND_PENDINGOK was not used as it should have been, and
cache lookups failed, resulting in failure of the rewrite.
2022-10-19 11:36:11 -07:00
Tony Finch
ec50c58f52 De-duplicate __FILE__, __LINE__
Mostly generated automatically with the following semantic patch,
except where coccinelle was confused by #ifdef in lib/isc/net.c

@@ expression list args; @@
- UNEXPECTED_ERROR(__FILE__, __LINE__, args)
+ UNEXPECTED_ERROR(args)
@@ expression list args; @@
- FATAL_ERROR(__FILE__, __LINE__, args)
+ FATAL_ERROR(args)
2022-10-17 11:58:26 +01:00
Tony Finch
45b2d8938b Simplify and speed up DNS name compression
All we need for compression is a very small hash set of compression
offsets, because most of the information we need (the previously added
names) can be found in the message using the compression offsets.

This change combines dns_compress_find() and dns_compress_add() into
one function dns_compress_name() that both finds any existing suffix,
and adds any new prefix to the table. The old split led to performance
problems caused by duplicate names in the compression context.

Compression contexts are now either small or large, which the caller
chooses depending on the expected size of the message. There is no
dynamic resizing.

There is a behaviour change: compression now acts on all the labels in
each name, instead of just the last few.

A small benchmark suggests this is about 2x faster.
2022-10-17 08:45:44 +02:00
Petr Špaček
53b3ceacd4 Replace #define DNS_NAMEATTR_ with struct of bools
sizeof(dns_name_t) did not change but the boolean attributes are now
separated as one-bit structure members. This allows debuggers to
pretty-print dns_name_t attributes without any special hacks, plus we
got rid of manual bit manipulation code.
2022-10-13 17:04:02 +02:00
Ondřej Surý
0dcbc6274b Record the 'edns-udp-size' in the view, not in the resolver
Getting the recorded value of 'edns-udp-size' from the resolver requires
strong attach to the dns_view because we are accessing `view->resolver`.
This is not the case in places (f.e. dns_zone unit) where `.udpsize` is
accessed.  By moving the .udpsize field from `struct dns_resolver` to
`struct dns_view`, we can access the value directly even with weakly
attached dns_view without the need to lock the view because `.udpsize`
can be accessed after the dns_view object has been shut down.
2022-10-05 11:59:36 -07:00
Ondřej Surý
c0598d404c Use designated initializers instead of memset()/MEM_ZERO for structs
In several places, the structures were cleaned with memset(...)) and
thus the semantic patch converted the isc_mem_get(...) to
isc_mem_getx(..., ISC_MEM_ZERO).  Use the designated initializer to
initialized the structures instead of zeroing the memory with
ISC_MEM_ZERO flag as this better matches the intended purpose.
2022-10-05 16:44:05 +02:00
Ondřej Surý
c1d26b53eb Add and use semantic patch to replace isc_mem_get/allocate+memset
Add new semantic patch to replace the straightfoward uses of:

  ptr = isc_mem_{get,allocate}(..., size);
  memset(ptr, 0, size);

with the new API call:

  ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);
2022-10-05 16:44:05 +02:00
Matthijs Mekking
0681b15225 If refresh stale RRset times out, start stale-refresh-time
The previous commit failed some tests because we expect that if a
fetch fails and we have stale candidates in cache, the
stale-refresh-time window is started. This means that if we hit a stale
entry in cache and answering stale data is allowed, we don't bother
resolving it again for as long we are within the stale-refresh-time
window.

This is useful for two reasons:
- If we failed to fetch the RRset that we are looking for, we are not
  hammering the authoritative servers.

- Successor clients don't need to wait for stale-answer-client-timeout
  to get their DNS response, only the first one to query will take
  the latency penalty.

The latter is not useful when stale-answer-client-timeout is 0 though.

So this exception code only to make sure we don't try to refresh the
RRset again if it failed to do so recently.
2022-10-05 08:20:48 +02:00
Matthijs Mekking
64d51285d5 Reuse recursion type code for refresh stale RRset
Refreshing a stale RRset is similar to prefetching an RRset, so
reuse the existing code. When refreshing an RRset we need to clear
all db options related to serve-stale so that stale RRsets in cache
are ignored during the refresh.

We no longer need to set the "nodetach" flag, because the refresh
fetch is now a "fetch and forget". So we can detach from the client
in the query_send().

This code will break some serve-stale test cases, this will be fixed
in the successor commit.

TODO: add explanation why the serve-stale test cases fail.
2022-10-05 08:20:48 +02:00
Matthijs Mekking
5fb8e555bc Add new recursion type for refreshing stale RRset
Refreshing a stale RRset is similar to a prefetch query, so we can
refactor this code to use the new recursion types introduced in !5883.
2022-10-05 08:20:48 +02:00
Ondřej Surý
c14a4ac763 Add a case-insensitive option directly to siphash 2-4 implementation
Formerly, the isc_hash32() would have to change the key in a local copy
to make it case insensitive.  Change the isc_siphash24() and
isc_halfsiphash24() functions to lowercase the input directly when
reading it from the memory and converting the uint8_t * array to
64-bit (respectively 32-bit numbers).
2022-10-04 10:32:40 +02:00
Michał Kępień
2ee16067c5 Merge tag 'v9_19_5'
BIND 9.19.5
2022-09-21 13:04:58 +02:00
Ondřej Surý
f6e4f620b3 Use the semantic patch to do the unsigned -> unsigned int change
Apply the semantic patch on the whole code base to get rid of 'unsigned'
usage in favor of explicit 'unsigned int'.
2022-09-19 15:56:02 +02:00
Evan Hunt
00e0758e12 fix an incorrect detach in update processing
when processing UDPATE requests, hold the request handle until
we either drop the request or respond to it.
2022-09-15 10:33:42 -07:00
Petr Špaček
fdf7456643 Log reason why cache peek is not available
Log which ACL caused RD=0 query into cache to be refused.
Expected performance impact is negligible.
2022-09-15 06:50:13 +02:00
Petr Špaček
95fc05c454 Log reason why recursion is not available
Log which ACL caused RA=0 condition.
Expected performance impact is negligible.
2022-09-15 06:50:13 +02:00
Matthijs Mekking
d939d2ecde Only refresh RRset once
Don't attempt to resolve DNS responses for intermediate results. This
may create multiple refreshes and can cause a crash.

One scenario is where for the query there is a CNAME and canonical
answer in cache that are both stale. This will trigger a refresh of
the RRsets because we encountered stale data and we prioritized it over
the lookup. It will trigger a refresh of both RRsets. When we start
recursing, it will detect a recursion loop because the recursion
parameters will eventually be the same. In 'dns_resolver_destroyfetch'
the sanity check fails, one of the callers did not get its event back
before trying to destroy the fetch.

Move the call to 'query_refresh_rrset' to 'ns_query_done', so that it
is only called once per client request.

Another scenario is where for the query there is a stale CNAME in the
cache that points to a record that is also in cache but not stale. This
will trigger a refresh of the RRset (because we encountered stale data
and we prioritized it over the lookup).

We mark RRsets that we add to the message with
DNS_RDATASETATTR_STALE_ADDED to prevent adding a duplicate RRset when
a stale lookup and a normal lookup conflict with each other. However,
the other non-stale RRset when following a CNAME chain will be added to
the message without setting that attribute, because it is not stale.

This is a variant of the bug in #2594. The fix covered the same crash
but for stale-answer-client-timeout > 0.

Fix this by clearing all RRsets from the message before refreshing.
This requires the refresh to happen after the query is send back to
the client.
2022-09-08 11:24:37 +02:00
Aram Sargsyan
baa9698c9d Fix RRL responses-per-second bypass using wildcard names
It is possible to bypass Response Rate Limiting (RRL)
`responses-per-second` limitation using specially crafted wildcard
names, because the current implementation, when encountering a found
DNS name generated from a wildcard record, just strips the leftmost
label of the name before making a key for the bucket.

While that technique helps with limiting random requests like
<random>.example.com (because all those requests will be accounted
as belonging to a bucket constructed from "example.com" name), it does
not help with random names like subdomain.<random>.example.com.

The best solution would have been to strip not just the leftmost
label, but as many labels as necessary until reaching the suffix part
of the wildcard record from which the found name is generated, however,
we do not have that information readily available in the context of RRL
processing code.

Fix the issue by interpreting all valid wildcard domain names as
the zone's origin name concatenated to the "*" name, so they all will
be put into the same bucket.
2022-09-08 09:15:30 +02:00
Evan Hunt
8c01662048 when creating an interface, set magic before linking
set the magic number in a newly-created interface object
before appending it to mgr->interfaces in order to prevent
a possible assertion.
2022-09-06 17:12:14 -07:00
Mark Andrews
785d021d00 Remove dead code
*** CID 352817:  Control flow issues  (DEADCODE) /lib/ns/xfrout.c: 1568 in sendstream()
    1562
    1563     	/* Advance lasttsig to be the last TSIG generated */
    1564     	CHECK(dns_message_getquerytsig(msg, xfr->mctx, &xfr->lasttsig));
    1565
    1566     failure:
    1567     	if (msgname != NULL) {
    >>>     CID 352817:  Control flow issues  (DEADCODE)
    >>>     Execution cannot reach this statement: "if (msgrds != NULL) {
      if ...".
    1568     		if (msgrds != NULL) {
    1569     			if (dns_rdataset_isassociated(msgrds)) {
    1570     				dns_rdataset_disassociate(msgrds);
    1571     			}
    1572     			dns_message_puttemprdataset(msg, &msgrds);
    1573     		}
2022-09-06 12:47:08 +00:00
Mark Andrews
5805457d9d Remove dead code
*** CID 352816:  Control flow issues  (DEADCODE) /lib/ns/query.c: 8443 in query_dns64()
    8437     cleanup:
    8438     	if (buffer != NULL) {
    8439     		isc_buffer_free(&buffer);
    8440     	}
    8441
    8442     	if (dns64_rdata != NULL) {
    >>>     CID 352816:  Control flow issues  (DEADCODE)
    >>>     Execution cannot reach this statement: "dns_message_puttemprdata(cl...".
    8443     		dns_message_puttemprdata(client->message, &dns64_rdata);
    8444     	}
    8445
    8446     	if (dns64_rdataset != NULL) {
    8447     		dns_message_puttemprdataset(client->message, &dns64_rdataset);
    8448     	}
2022-09-06 12:47:08 +00:00
Mark Andrews
3ef734e0f5 Remove dead code
*** CID 352812:  Control flow issues  (DEADCODE) /lib/ns/query.c: 8584 in query_filter64()
    8578     cleanup:
    8579     	if (buffer != NULL) {
    8580     		isc_buffer_free(&buffer);
    8581     	}
    8582
    8583     	if (myrdata != NULL) {
    >>>     CID 352812:  Control flow issues  (DEADCODE)
    >>>     Execution cannot reach this statement: "dns_message_puttemprdata(cl...".
    8584     		dns_message_puttemprdata(client->message, &myrdata);
    8585     	}
    8586
    8587     	if (myrdataset != NULL) {
    8588     		dns_message_puttemprdataset(client->message, &myrdataset);
    8589     	}
2022-09-06 12:47:08 +00:00
Aram Sargsyan
83395f4cfb Set the extended DNS error code for RPZ-modified queries
When enabled through a configuration option, set the configured EDE code
for the modified queries.
2022-08-31 08:56:03 +00:00
Ondřej Surý
b69e783164 Update netmgr, tasks, and applications to use isc_loopmgr
Previously:

* applications were using isc_app as the base unit for running the
  application and signal handling.

* networking was handled in the netmgr layer, which would start a
  number of threads, each with a uv_loop event loop.

* task/event handling was done in the isc_task unit, which used
  netmgr event loops to run the isc_event calls.

In this refactoring:

* the network manager now uses isc_loop instead of maintaining its
  own worker threads and event loops.

* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
  and every isc_task runs on a specific isc_loop bound to the specific
  thread.

* applications have been updated as necessary to use the new API.

* new ISC_LOOP_TEST macros have been added to enable unit tests to
  run isc_loop event loops. unit tests have been updated to use this
  where needed.
2022-08-26 09:09:24 +02:00
Ondřej Surý
49b149f5fd Update isc_timer to use isc_loopmgr
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
  completely removed; isc_timer objects are now directly created on the
  isc_loop event loops.

* the isc_timer API has been simplified. the "inactive" timer type has
  been removed; timers are now stopped by calling isc_timer_stop()
  instead of resetting to inactive.

* isc_manager now creates a loop manager rather than a timer manager.

* modules and applications using isc_timer have been updated to use the
  new API.
2022-08-25 17:17:07 +02:00
Matthijs Mekking
501dc87d75 Wait with NSEC3 during a DNSSEC policy change
When doing a dnssec-policy reconfiguration from a zone with NSEC only
keys to a zone that uses NSEC3, figure out to wait with building the
NSEC3 chain.

Previously, BIND 9 would attempt to sign such a zone, but failed to
do so because the NSEC3 chain conflicted with existing DNSKEY records
in the zone that were not compatible with NSEC3.

There exists logic for detecting such a case in the functions
dnskey_sane() (in lib/dns/zone.c) and check_dnssec() (in
lib/ns/update.c). Both functions look very similar so refactor them
to use the same code and call the new function (called
dns_zone_check_dnskey_nsec3()).

Also update the dns_nsec_nseconly() function to take an additional
parameter 'diff' that, if provided, will be checked whether an
offending NSEC only DNSKEY will be deleted from the zone. If so,
this key will not be considered when checking the zone for NSEC only
DNSKEYs. This is needed to allow a transition from an NSEC zone with
NSEC only DNSKEYs to an NSEC3 zone.
2022-08-22 15:55:46 +02:00
Aram Sargsyan
c51b052827 dns_rdatalist_tordataset() and dns_rdatalist_fromrdataset() can not fail
Clean up dns_rdatalist_tordataset() and dns_rdatalist_fromrdataset()
functions by making them return void, because they cannot fail.

Clean up other functions that subsequently cannot fail.
2022-08-09 08:19:51 +00:00
Matthijs Mekking
c5b71e2472 Don't enable serve-stale on duplicate queries
When checking if we should enable serve-stale, add an early out case
when the result is an error signalling a duplicate query or a query
that would be dropped.
2022-08-09 09:13:53 +02:00
Ondřej Surý
b35861f1eb Increase the BUFSIZ-long buffers
The BUFSIZ value varies between platforms, it could be 8K on Linux and
512 bytes on mingw.  Make sure the buffers are always big enough for the
output data to prevent truncation of the output by appropriately
enlarging or sizing the buffers.
2022-07-15 10:33:46 +00:00
Evan Hunt
df1d81cf96 log the reason for falling back to AXFR from IXFR at level info
messages indicating the reason for a fallback to AXFR (i.e, because
the requested serial number is not present in the journal, or because
the size of the IXFR response would exceeed "max-ixfr-ratio") are now
logged at level info instead of debug(4).
2022-07-12 16:02:54 -07:00
Mark Andrews
7a42417d61 Clone the message buffer before forwarding UPDATE messages
this prevents named forwarding a buffer that may have been over
written.
2022-07-12 17:13:24 +10:00
Mark Andrews
228dadb026 Check the synth-form-dnssec namespace when synthesising responses
Call dns_view_sfd_find to find the namespace to be used to verify
the covering NSEC records returned for the given QNAME.  Check that
the NSEC owner names are within that namespace.
2022-07-05 12:29:01 +10:00
Michał Kępień
887c666caf Obsolete the "glue-cache" option
The "glue-cache" option was marked as deprecated by commit
5ae33351f2 (first released in BIND 9.17.6,
back in October 2020), so now obsolete that option, removing all code
and documentation related to it.

Note: this causes the glue cache feature to be permanently enabled, not
disabled.
2022-06-30 15:24:08 +02:00
Artem Boldariev
d2e13ddf22 Update the set of HTTP endpoints on reconfiguration
This commit ensures that on reconfiguration the set of HTTP
endpoints (=paths) is being updated within HTTP listeners.
2022-06-28 15:42:38 +03:00
Artem Boldariev
e72962d5f1 Update max concurrent streams limit in HTTP listeners on reconfig
This commit ensures that HTTP listeners concurrent streams limit gets
updated properly on reconfiguration.
2022-06-28 15:42:38 +03:00
Artem Boldariev
a2379135fa Update HTTP listeners quotas on reconfiguration
This commit ensures that on reconfiguration a proper value for HTTP
connections limit is picked up.

The commit also refactors how listeners settings are updated so that
there is less code duplication.
2022-06-28 15:42:38 +03:00
Artem Boldariev
3f0b310772 Store HTTP quota size inside a listenlist instead of the quota
This way only quota size is passed to the interface/listener
management code instead of a quota object. Thus, we can implement
updating the quota object size instead of recreating the object.
2022-06-28 15:42:38 +03:00
Michał Kępień
2f945703f2 Fix destination port extraction for client queries
The current logic for determining the address of the socket to which a
client sent its query is:

 1. Get the address:port tuple from the netmgr handle using
    isc_nmhandle_localaddr().

 2. Convert the address:port tuple from step 1 into an isc_netaddr_t
    using isc_netaddr_fromsockaddr().

 3. Convert the address from step 2 back into a socket address with the
    port set to 0 using isc_sockaddr_fromnetaddr().

Note that the port number (readily available in the netmgr handle) is
needlessly lost in the process, preventing it from being recorded in
dnstap captures of client traffic produced by named.

Fix by first storing the address:port tuple returned by
isc_nmhandle_localaddr() in client->destsockaddr and then creating an
isc_netaddr_t from that structure.  This allows the port number to be
retained in client->destsockaddr, which is what subsequently gets passed
to dns_dt_send().
2022-06-22 13:45:46 +02:00
Michal Nowak
1c45a9885a Update clang to version 14 2022-06-16 17:21:11 +02:00
Aram Sargsyan
f6e729635f Fix a race condition between shutdown and route_connected()
When shutting down, the interface manager can be destroyed
before the `route_connected()` callback is called, which is
unexpected for the latter and can cause a crash.

Move the interface manager attachment code from the callback
to the place before the callback is registered using
`isc_nm_routeconnect()` function, which will make sure that
the interface manager will live at least until the callback
is called.

Make sure to detach the interface manager if the
`isc_nm_routeconnect()` function is not implemented, or when
the callback is called with a result value which differs from
`ISC_R_SUCCESS`.
2022-06-14 14:31:24 +00:00
Aram Sargsyan
1d93fe973b Do not use the interface manager until it is ready
The `ns_interfacemgr_create()` function, when calling
`isc_nm_routeconnect()`, uses the newly created `ns_interfacemgr_t`
instance before initializing its reference count and the magic value.

Defer the `isc_nm_routeconnect()` call until the initializations
are complete.
2022-06-14 14:31:24 +00:00
Michał Kępień
0d7e5d513c Assert on unknown isc_quota_attach() return values
The only values that the isc_quota_attach() function (called from
check_recursionquota() via recursionquotatype_attach_soft()) can
currently return are: ISC_R_SUCCESS, ISC_R_SOFTQUOTA, and ISC_R_QUOTA.
Instead of just propagating any other (unexpected) error up the call
stack, assert immediately, so that if the isc_quota_* API gets updated
in the future to return values currently matching the "default"
statement, check_recursionquota() can be promptly updated to handle such
new return values as desired.
2022-06-14 13:13:32 +02:00
Michał Kępień
8d64beb06f Use a switch statement in check_recursionquota()
Improve readability of the check_recursionquota() function by replacing
a sequence of conditional statements with a switch statement.
2022-06-14 13:13:32 +02:00
Michał Kępień
a06cdfc7b7 Add helper function for recursive-clients logging
Reduce code duplication in check_recursionquota() by extracting its
parts responsible for logging to a separate helper function.

Remove result text from the "no more recursive clients" message because
it always says "quota reached" (as the relevant branch is only evaluated
when 'result' is set to ISC_R_QUOTA) and therefore brings no additional
value.
2022-06-14 13:13:32 +02:00
Michał Kępień
7bc4425e2a Remove redundant recursion quota pointer checks
When the client->recursionquota pointer was overloaded by different
features, each of those features had to be aware of that fact and handle
any updates of that pointer gracefully.  Example: prefetch code
initiates recursion, attaching to client->recursionquota, then query
processing restarts due to a CNAME being encountered, then that CNAME is
not found in the cache, so another recursion is triggered, but
client->recursionquota is already attached to; even though it is not
CNAME chaining code that attached to that pointer, that code still has
to handle such a situation gracefully.

However, all features that can initiate recursion have now been updated
to use separate slots in the 'recursions' array, so keeping the old
checks in place means masking future programming errors that could
otherwise be caught - and should be caught because each feature needs to
properly maintain its own quota reference.

Remove outdated recursion quota pointer checks to enable the assertions
in isc_quota_*() functions to detect programming errors in code paths
that can start recursion.  Remove an outdated comment to prevent
confusion.
2022-06-14 13:13:32 +02:00
Michał Kępień
ccee3507c2 Check recursions at the end of request processing
ns_client_endrequest() currently contains code that looks for
outstanding quota references and cleans them up if necessary.  This
approach masks programming errors because ns_client_endrequest() is only
called from ns__client_reset_cb(), which in turn is only called when all
references to the client's netmgr handle are released, which in turn
only happens after all recursion completion callbacks are invoked
(because isc_nmhandle_attach() is called before every call to
dns_resolver_createfetch() in lib/ns/query.c and the completion callback
is expected to detach from the handle), which in turn is expected to
happen for all recursions attempts, even those that get canceled.

Furthermore, declaring the prototype of ns_client_endrequest() at the
top of lib/ns/client.c is redundant because the definition of that
function is placed before its first use in that file.  Remove the
redundant function prototype.

Finally, remove INSIST assertions ensuring quota pointers are NULL in
ns__client_reset_cb() because the latter calls ns_client_endrequest() a
few lines earlier.
2022-06-14 13:13:32 +02:00
Michał Kępień
e09b36f2cc Adjust recursion quota when starting a fetch fails
Some functions fail to detach from the recursion quota if an error
occurs while initiating recursion.  This causes the recursive client
counter to be off.  Add missing recursionquota_detach() calls, reworking
cleanup code where appropriate.
2022-06-14 13:13:32 +02:00
Michał Kępień
172e15f7ad Attach to separate recursion quota pointers
Similarly to how different code paths reused common client handle
pointers and fetch references despite being logically unrelated, they
also reuse client->recursionquota, the field in which a reference to the
recursion quota is stored.  This unnecessarily forces all code using
that field to be aware of the fact that it is overloaded by different
features.

Overloading client->recursionquota also causes inconsistent behavior.
For example, if prefetch code triggers recursion and then delegation
handling code also triggers recursion, only one of these code paths will
be able to attach to the recursion quota, but both recursions will be
started anyway.  In other words, each code path only checks whether the
recursion quota has not been exceeded if the quota has not yet been
attached to by another code path.  This behavior theoretically allows
the configured recursion quota to be slightly exceeded; while it is not
expected to be a real-world operational issue, it is still confusing and
should therefore be fixed.

Extend the structures comprising the 'recursions' array with a new field
holding a pointer to the recursion quota that a given recursion process
attached to.  Update all code paths using client->recursionquota so that
they use the appropriate slot in the 'recursions' array.  Drop the
'recursionquota' field from ns_client_t.
2022-06-14 13:13:32 +02:00
Michał Kępień
95e703121d Ensure ns_query_cancel() handles all recursions
Previously, multiple code paths reused client->query.fetch, so it was
enough for ns_query_cancel() to issue a single call to
dns_resolver_cancelfetch() with that fetch as an argument.  Now, since
each slot in the 'recursions' array can hold a reference to a separate
resolver fetch, ns_query_cancel() needs to handle all of them, so that
all recursion callbacks get a chance to clean up the associated
resources when a query is canceled.
2022-06-14 13:13:32 +02:00