'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.
'-T transferslowly' makes named to sleep for one second for every
xfrout message.
'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
In e185412872, the TCP accept quota code
became broken in a subtle way - the quota would get initialized on the
first accept for the server socket and then deleted from the server
socket, so it would never get applied again.
Properly fixing this required a bigger refactoring of the isc_quota API
code to make it much simpler. The new code decouples the ownership of
the quota and acquiring/releasing the quota limit.
After (during) the refactoring it became more clear that we need to use
the callback from the child side of the accepted connection, and not the
server side.
This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
This should have no functional effects.
The message size stats are specified by RSSAC002 so it's best not
to mess around with how they appear in the statschannel. But it's
worth changing the implementation to use general-purpose histograms,
to reduce code size and benefit from sharded counters.
The isc_time_now() and isc_time_now_hires() were used inconsistently
through the code - either with status check, or without status check,
or via TIME_NOW() macro with RUNTIME_CHECK() on failure.
Refactor the isc_time_now() and isc_time_now_hires() to always fail when
getting current time has failed, and return the isc_time_t value as
return value instead of passing the pointer to result in the argument.
This is a simple replacement using the semantic patch from the previous
commit and as added bonus, one removal of previously undetected unused
variable in named/server.c.
add a public function ns_interface_create() allowing the caller
to set up a listening interface directly without having to set
up listen-on and scan network interfaces.
This implements node reference tracing that passes all the internal
layers from dns_db API (and friends) to increment_reference() and
decrement_reference().
It can be enabled by #defining DNS_DB_NODETRACE in <dns/trace.h> header.
The output then looks like this:
incr:node:check_address_records:rootns.c:409:0x7f67f5a55a40->references = 1
decr:node:check_address_records:rootns.c:449:0x7f67f5a55a40->references = 0
incr:nodelock:check_address_records:rootns.c:409:0x7f67f5a55a40:0x7f68304d7040->references = 1
decr:nodelock:check_address_records:rootns.c:449:0x7f67f5a55a40:0x7f68304d7040->references = 0
There's associated python script to find the missing detach located at:
https://gitlab.isc.org/isc-projects/bind9/-/snippets/1038
this function was just a front-end for gethostname(). it was
needed when we supported windows, which has a different function
for looking up the hostname; it's not needed any longer.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
callback events from dns_resolver_createfetch() are now posted
using isc_async_run.
other modules which called the resolver and maintained task/taskmgr
objects for this purpose have been cleaned up.
removed some functions that are no longer used and unlikely to
be resurrected, and also some that were only used to support Windows
and can now be replaced with generic versions.
Instead of using an extra rarely-used paramater to dns_clientinfo_init()
to set ECS information for a client, this commit adds a function
dns_clientinfo_setecs() which can be called only when ECS is needed.
Return 'isc_result_t' type value instead of 'bool' to indicate
the actual failure. Rename the function to something not suggesting
a boolean type result. Make changes in the places where the API
function is being used to check for the result code instead of
a boolean value.
Although 'dns_fetch_t' fetch can have two associated events, one for
each of 'DNS_EVENT_FETCHDONE' and 'DNS_EVENT_TRYSTALE' types, the
dns_resolver_cancelfetch() function is designed in a way that it
expects only one existing event, which it must cancel, and when it
happens so that 'stale-answer-client-timeout' is enabled and there
are two events, only one of them is canceled, and it results in an
assertion in dns_resolver_destroyfetch(), when it finds a dangling
event.
Change the logic of dns_resolver_cancelfetch() function so that it
cancels both the events (if they exist), and in the right order.
dns_db_findext() asserts if RRSIG is passed to it and
query_lookup_stale() failed to map RRSIG to ANY to prevent this. To
avoid cases like this in the future, move the mapping of SIG and RRSIG
to ANY for qctx->type to qctx_init().
check allow-update, update-policy, and allow-update-forwarding before
consuming quota slots, so that unauthorized clients can't fill the
quota.
(this moves the access check before the prerequisite check, which
violates the precise wording of RFC 2136. however, RFC co-author Paul
Vixie has stated that the RFC is mistaken on this point; it should have
said that access checking must happen *no later than* the completion of
prerequisite checks, not that it must happen exactly then.)
limit the number of simultaneous DNS UPDATE events that can be
processed by adding a quota for update and update forwarding.
this quota currently, arbitrarily, defaults to 100.
also add a statistics counter to record when the update quota
has been exceeded.
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
The ns_client_aclchecksilent is used to check multiple ACLs before
the decision is made that a query is denied. It is also used to
determine if recursion is available. In those cases we should not
set the extended DNS error "Prohibited".
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:
A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.
When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.
With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.
The Stream DNS implementation needs a peek methods that read the value
from the buffer, but it doesn't advance the current position. Add
isc_buffer_peekuintX methods, refactor the isc_buffer_{get,put}uintN
methods to modern integer types, and move the isc_buffer_getuintN to the
header as static inline functions.
This commit fixes TLS session resumption via session IDs when
client certificates are used. To do so it makes sure that session ID
contexts are set within server TLS contexts. See OpenSSL documentation
for 'SSL_CTX_set_session_id_context()', the "Warnings" section.
Send the ns_query_cancel() on the recursing clients when we initiate the
named shutdown for faster shutdown.
When we are shutting down the resolver, we cancel all the outstanding
fetches, and the ISC_R_CANCEL events doesn't propagate to the ns_client
callback.
In the future, the better solution how to fix this would be to look at
the shutdown paths and let them all propagate from bottom (loopmgr) to
top (f.e. ns_client).
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:
A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.
When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.
With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.
After change 5995, zone transfers were using a small
compression context that only had space for the first
few dozen names in each message. They now use a large
compression context with enough space for every name.
The dns_cache API contained a cache cleaning mechanism that would be
disabled for 'rbt' based cache. As named doesn't have any other cache
implementations, remove the cache cleaning mechanism from dns_cache API.
the 'nupdates' field was originally used to track whether a client
was ready to shut down, along with other similar counters nreads,
nrecvs, naccepts and nsends. this is now tracked differently, but
nupdates was overlooked when the other counters were removed.
Remove code that triggers key and denial of existence management
operations. Dynamic update should no longer be used to do DNSSEC
maintenance (other than that of course signatures need to be
created for the new zone contents).
The small/large tuning has been completely removed from the code with
last remnant of the dead code in ns_interfacemgr. Remove the dead code
and the configure option.
RPZ rewrites called dns_db_findext() without passing through the
client database options; as as result, if the client set CD=1,
DNS_DBFIND_PENDINGOK was not used as it should have been, and
cache lookups failed, resulting in failure of the rewrite.
Mostly generated automatically with the following semantic patch,
except where coccinelle was confused by #ifdef in lib/isc/net.c
@@ expression list args; @@
- UNEXPECTED_ERROR(__FILE__, __LINE__, args)
+ UNEXPECTED_ERROR(args)
@@ expression list args; @@
- FATAL_ERROR(__FILE__, __LINE__, args)
+ FATAL_ERROR(args)
All we need for compression is a very small hash set of compression
offsets, because most of the information we need (the previously added
names) can be found in the message using the compression offsets.
This change combines dns_compress_find() and dns_compress_add() into
one function dns_compress_name() that both finds any existing suffix,
and adds any new prefix to the table. The old split led to performance
problems caused by duplicate names in the compression context.
Compression contexts are now either small or large, which the caller
chooses depending on the expected size of the message. There is no
dynamic resizing.
There is a behaviour change: compression now acts on all the labels in
each name, instead of just the last few.
A small benchmark suggests this is about 2x faster.
sizeof(dns_name_t) did not change but the boolean attributes are now
separated as one-bit structure members. This allows debuggers to
pretty-print dns_name_t attributes without any special hacks, plus we
got rid of manual bit manipulation code.