Using a restored catalog zone exercised a use-after-free bug.
The test checks that the use-after-free bug is gone and is also
a reasonable behaviour check in its own right.
dns_db_updatenotify_unregister needed to be called earlier to ensure
that listener->onupdate_arg always points to a valid object. The
existing lazy cleanup in rbtdb_free did not ensure that.
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:
A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.
When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's stale-related
options (such as DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when no stale data is found in the
cache, even if valid non-stale data is available there.
With this change, query_lookup() considers non-stale data in the
cache first, and returns it if it is available.
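The intended lookup order can be sketched as follows. This is only an illustration under stated assumptions: FIND_STALEOK, cache_entry_t, result_t and cache_lookup() are hypothetical stand-ins, not the BIND 9 flags, types or functions. The point it shows is that fresh data is returned even when the "stale is acceptable" flag is still set from an earlier failed lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the BIND 9 flags or types. */
#define FIND_STALEOK 0x01u /* caller will accept expired data */

typedef enum { R_SUCCESS, R_STALE, R_SERVFAIL } result_t;

typedef struct {
    bool present;     /* is there an rrset in the cache at all? */
    unsigned expires; /* absolute expiry time, in seconds */
} cache_entry_t;

/*
 * Look up an entry at time 'now'. Non-stale data is considered first,
 * so a leftover FIND_STALEOK flag cannot hide a valid answer.
 */
result_t
cache_lookup(const cache_entry_t *entry, unsigned now, unsigned flags) {
    assert(entry != NULL);

    if (!entry->present) {
        return R_SERVFAIL;
    }
    if (now < entry->expires) {
        return R_SUCCESS; /* fresh data always wins */
    }
    if ((flags & FIND_STALEOK) != 0) {
        return R_STALE; /* expired, but the caller accepts stale data */
    }
    return R_SERVFAIL;
}
```

With an expired CNAME and a still-valid target, a lookup of the CNAME with the flag set yields a stale answer, and the restarted lookup of the target yields a normal answer even though the flag is still set.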
Prime the cache with the following records:
shortttl.cname.example. 1 IN CNAME longttl.target.example.
longttl.target.example. 600 IN A 10.53.0.2
Wait for the CNAME record to expire, disable the authoritative server,
and query 'shortttl.cname.example' again, expecting a stale answer.
This commit adds a check for whether 'sock->recv_cb' was nullified
during the call to 'sock->recv_cb'. That could happen, e.g., through an
indirect call to 'isc_nmhandle_close()' from within the callback when
wrapping up.
In that case, the TLS connection is closed.
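A minimal sketch of this kind of check, assuming a simplified socket structure. The names sock_t, deliver(), closing_cb() and keepalive_cb() are hypothetical and only illustrate the pattern; they are not the Network Manager API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct sock sock_t;
typedef void (*recv_cb_t)(sock_t *sock, void *arg);

/* Hypothetical, heavily simplified socket object. */
struct sock {
    recv_cb_t recv_cb;
    void *recv_cbarg;
    bool closed;
};

/* Hypothetical callback that tears down reading while it runs. */
void
closing_cb(sock_t *sock, void *arg) {
    (void)arg;
    sock->recv_cb = NULL;
    sock->recv_cbarg = NULL;
}

/* Hypothetical callback that leaves the socket untouched. */
void
keepalive_cb(sock_t *sock, void *arg) {
    (void)sock;
    (void)arg;
}

/*
 * Deliver data to the read callback. Returns true if the socket is
 * still usable afterwards, false if it was (or had to be) closed.
 */
bool
deliver(sock_t *sock) {
    assert(sock != NULL);

    if (sock->recv_cb == NULL) {
        return false;
    }

    /* Copy the pointer first: the callback may clear the field. */
    recv_cb_t cb = sock->recv_cb;
    cb(sock, sock->recv_cbarg);

    /* Re-check after the call; a cleared callback means "wrap up". */
    if (sock->recv_cb == NULL) {
        sock->closed = true;
        return false;
    }
    return true;
}
```

The important detail is the second test of the callback pointer after the call returns, rather than assuming it is still set.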
This commit ensures that the non-atomic flags inside a DoH listener
socket object (and associated worker) are accessed, when accepting a
connection, only from within the context of the dedicated thread and
not from other worker threads.
The purpose of this commit is to avoid TSAN errors during
isc__nmsocket_closing() calls. It is a continuation of
4b5559cd8f.
This commit ensures that the non-atomic flags inside a TLS listener
socket object (and associated worker) are accessed, when doing the
handshake for a connection, only from within the context of the
dedicated thread and not from other worker threads.
The purpose of this commit is to avoid TSAN errors during
isc__nmsocket_closing() calls. It is a continuation of
4b5559cd8f.
While some of these tests are for DoT, which doesn't require nghttp2,
the server configs won't allow the server to start unless nghttp2
support was enabled at compile time.
It might be possible to split these tests into DoT and DoH and only
require nghttp2 for DoH tests, but since almost all of our CI jobs are
compiled with nghttp2, we wouldn't gain a lot of coverage, so it's
probably not worth the effort.
Avoid using the environment variables for feature detection and use the
feature-test utility instead.
Remove the obsolete environment variables from conf.sh, since they're no
longer used anywhere.
Previously, there were two different ways to detect feature support.
Either through an environment variable set by configure in conf.sh, or
using the feature-test utility.
It is simpler and more consistent to have only one way of detecting
feature support. Using the feature-test utility seems superior to the
environment variables set by configure.
This commit ensures that the flags inside a TLS listener socket
object (and associated worker) are accessed, when accepting a
connection, only from within the context of the dedicated thread and
not from other worker threads.
The TLSDNS transport was not honouring the single read callback for
the TLSDNS client. It would call the read callback repeatedly when a
single TLS read resulted in multiple DNS messages in the decoded
buffer.
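The "one message per read" behaviour can be sketched like this, assuming the usual two-byte big-endian length prefix used for DNS over stream transports. reader_t and read_one() are hypothetical names for illustration and do not correspond to the TLSDNS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state over an already-decrypted buffer. */
typedef struct {
    const uint8_t *buf;
    size_t len;         /* bytes available in buf */
    size_t pos;         /* offset of the next unread byte */
    unsigned delivered; /* how many messages have been handed out */
} reader_t;

/*
 * Hand out at most ONE complete DNS message per call, even if the
 * buffer holds several. Remaining messages stay buffered until the
 * caller asks to read again. Returns false if no complete message
 * is available yet.
 */
bool
read_one(reader_t *r, size_t *msglen) {
    assert(r != NULL && msglen != NULL && r->pos <= r->len);

    size_t avail = r->len - r->pos;
    if (avail < 2) {
        return false; /* length prefix not complete */
    }

    size_t len = ((size_t)r->buf[r->pos] << 8) | r->buf[r->pos + 1];
    if (avail - 2 < len) {
        return false; /* message body not complete */
    }

    r->pos += 2 + len;
    r->delivered++;
    *msglen = len;
    return true;
}
```

A caller that wants a single read callback invokes read_one() once per read request instead of looping until the buffer is empty.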
This is the second in a series of fixes to the usage of hashtables in
the dns_adb and the dns_resolver units.
Currently, the fetch buckets (used to hold the fetch context) and zone
buckets (used to hold per-domain counters) are never removed from
memory. Combined with the fact that the hashtable now grows as
needed (instead of using the hashtable as buckets), the memory usage in
the resolver can only grow and never drops.
In this commit, the usage of hashtables (hashmaps) has been completely
rewritten, so there are no "buckets" and all the matching conditions are
directly mapped into the hashtable key:
1. For the per-domain counter hashtable, this is simple, as the lowercase
domain name is used directly as the key.
2. For fetch context hashtable, this requires copying some extra flags
back and forth in the key.
As we don't hold the "buckets" forever, the cleaning mechanism has been
rewritten as well:
1. For the per-domain counter hashtable, this is again much simpler, as we
only need to check whether the usage counter is still zero under the
lock and bail out of cleaning if the counter is in use.
2. For the fetch context hashtable, this is more complicated, as the fetch
context cannot be reused after it has been finished. The algorithm
is different: the fetch context is always removed from the
hashtable, but if we find a fetch context that has been marked
as finished in the lookup function, we help with the cleaning from
the hashtable and try again.
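The per-domain counter cleanup described above can be sketched as follows. This is a simplified, single-threaded illustration: a small linear table stands in for the hashtable, the lock is only mentioned in comments, and table_t, counter_acquire(), counter_release() and counter_cleanup() are hypothetical names rather than the resolver's own.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_COUNTERS 8
#define MAX_NAME 64

typedef struct {
    bool used;
    char key[MAX_NAME]; /* lowercased domain name */
    unsigned count;     /* current number of users */
} counter_t;

/* A linear table standing in for the real hashtable. */
typedef struct {
    counter_t slots[MAX_COUNTERS];
} table_t;

/* Build the lookup key: the domain name folded to lowercase. */
static void
make_key(const char *name, char key[MAX_NAME]) {
    size_t i;
    for (i = 0; name[i] != '\0' && i < MAX_NAME - 1; i++) {
        key[i] = (char)tolower((unsigned char)name[i]);
    }
    key[i] = '\0';
}

counter_t *
counter_find(table_t *t, const char *name) {
    char key[MAX_NAME];
    assert(t != NULL && name != NULL);
    make_key(name, key);
    for (size_t i = 0; i < MAX_COUNTERS; i++) {
        if (t->slots[i].used && strcmp(t->slots[i].key, key) == 0) {
            return &t->slots[i];
        }
    }
    return NULL;
}

/* Start using the counter for 'name', creating it if needed. */
counter_t *
counter_acquire(table_t *t, const char *name) {
    counter_t *c = counter_find(t, name);
    if (c == NULL) {
        for (size_t i = 0; i < MAX_COUNTERS && c == NULL; i++) {
            if (!t->slots[i].used) {
                c = &t->slots[i];
            }
        }
        if (c == NULL) {
            return NULL; /* table full */
        }
        c->used = true;
        c->count = 0;
        make_key(name, c->key);
    }
    c->count++;
    return c;
}

void
counter_release(counter_t *c) {
    assert(c != NULL && c->count > 0);
    c->count--;
}

/*
 * Cleanup step; in a threaded program this would run with the table
 * lock held. The entry is removed only if its count is still zero,
 * otherwise cleaning bails out. Returns true if the entry was removed.
 */
bool
counter_cleanup(table_t *t, const char *name) {
    counter_t *c = counter_find(t, name);
    if (c == NULL || c->count != 0) {
        return false;
    }
    c->used = false;
    return true;
}
```

Because the key is the lowercased name, "Example.COM" and "example.com" share one counter, and the entry disappears only once the last user has released it.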
A couple of additional changes have been implemented in this refactoring,
as they were needed for correct functionality and could not be split
into individual commits (or would not make sense as separate commits):
1. dns_resolver_createfetch() has an option to create an "unshared"
fetch. An "unshared" fetch will never get matched, so there's
little point in storing it in the hashtable.
Therefore the "unshared" fetches are now detached from the
hashtable and live on their own.
2. Replace the custom reference counting with ISC_REFCOUNT_DECL/IMPL
macros for better tracing.
3. fctx_done_detach() is idempotent, it makes the "final" detach (the
one matching the create function) only once. But that also means
that it has to be called before the detach that kept the fetch
context alive in the callback. A new macro fctx_done_unref() has
been added to allow this code flow:
fctx_done_unref(fctx, result);
fctx_detach(&fctx);
Doing this the other way around could cause fctx to get destroyed in
the fctx_unref() first and fctx_done_detach() would cause UAF.
4. The resume_qmin() and resume_dslookup() callbacks have been
refactored for more readability and simpler code paths. The
validated() callback has also received some of the simplifications,
but it should be refactored in the future as it is a bit of spaghetti
now.
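The ordering constraint from item 3 above (finish first, then drop the reference that keeps the object alive) can be sketched with a toy reference-counted object. obj_t and the obj_*() functions are hypothetical and much simpler than the fetch context code; the 'destroyed' counter exists only so the example can observe when the object is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct obj {
    unsigned refs;
    bool done;      /* has the "final" unref already happened? */
    int *destroyed; /* incremented when the object is freed */
} obj_t;

/* Create with one reference, owned by the creator. */
obj_t *
obj_create(int *destroyed) {
    obj_t *o = malloc(sizeof(*o));
    if (o == NULL) {
        return NULL;
    }
    o->refs = 1;
    o->done = false;
    o->destroyed = destroyed;
    return o;
}

void
obj_attach(obj_t *o, obj_t **targetp) {
    assert(o != NULL && targetp != NULL && *targetp == NULL);
    o->refs++;
    *targetp = o;
}

/* Drop one reference; free the object when the last one goes away. */
void
obj_detach(obj_t **op) {
    assert(op != NULL && *op != NULL);
    obj_t *o = *op;
    *op = NULL;
    assert(o->refs > 0);
    if (--o->refs == 0) {
        (*o->destroyed)++;
        free(o);
    }
}

/*
 * Idempotent "done": drops the creator's reference exactly once.
 * It must run while the caller still holds its own reference, so it
 * can never be the call that frees the object.
 */
void
obj_done_unref(obj_t *o) {
    assert(o != NULL);
    if (o->done) {
        return;
    }
    o->done = true;
    assert(o->refs > 1);
    o->refs--;
}
```

Calling obj_done_unref() before obj_detach() keeps the object valid for the whole of the "done" step. Reversing the two calls would let obj_detach() free the object first.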
The mechanism for associating a worker task to a database now
uses loops rather than tasks.
For this reason, the parameters to dns_cache_create() have been
updated to take a loop manager rather than a task manager.
This commit ensures that send callbacks are always called from within
the context of the socket's worker thread, even in the case of a
shutting down/inactive socket, just as the TCP transport does, with
which TLS attempts to be as compatible as possible.
This commit changes the ISC_R_NOTCONNECTED error code to ISC_R_CANCELLED
when attempting to start reading data on a socket that is shutting
down, in order to make its behaviour compatible with that of TCP and
not break the common code in the unit tests.
It turned out that after the latest Network Manager refactoring, the
'sock->reading' flag was not processed correctly. Because of this,
isc_nm_read_stop() might not work as expected, since reading from the
underlying TCP socket could have been resumed in 'tls_do_bio()'
regardless of the 'sock->reading' value.
This bug did not seem to cause problems with DoH, so it was not
noticed, but Stream DNS has more strict expectations regarding the
underlying transport.
In addition to the above, the 'sock->recv_read' flag was completely
ignored and the corresponding logic was unimplemented. That made it
impossible to implement one fine detail of TCP behaviour: once reading
is started, it can be satisfied by reading a single datum.
This commit fixes the issues above.