Using a restored catalog zone excercised a use-after-free bug.
The test checks that the use-after-free bug is gone and is just
a reasonable behaviour check in its own right.
Prime the cache with the following records:
shortttl.cname.example. 1 IN CNAME longttl.target.example.
longttl.target.example. 600 IN A 10.53.0.2
Wait for the CNAME record to expire, disable the authoritative server,
and query 'shortttl.cname.example' again, expecting a stale answer.
While some of these tests are for DoT which doesn't require nghttp2,
the server configs won't allow the server to start without nghttp2
support during compile time.
It might be possible to split these tests into DoT and DoH and only
require nghttp2 for DoH tests, but since almost all of our CI jobs are
compiled with nghttp2, we wouldn't gain a lot of coverage, so it's
probably not worth the effort.
Avoid using the environment variables for feature detection and use the
feature-test utility instead.
Remove the obsolete environment variables from conf.sh, since they're no
longer used anywhere.
Previously, there were two different ways to detect feature support.
Either through an environment variable set by configure in conf.sh, or
using the feature-test utility.
It is more simple and consistent to have only one way of detecting the
feature support. Using the feature-test utility seems superior the the
environment variables set by configure.
The mechanism for associating a worker task to a database now
uses loops rather than tasks.
For this reason, the parameters to dns_cache_create() have been
updated to take a loop manager rather than a task manager.
The dns_adb unit has been refactored to be much simpler. Following
changes have been made:
1. Simplify the ADB to always allow GLUE and hints
There were only two places where dns_adb_createfind() was used - in
the dns_resolver unit where hints and GLUE addresses were ok, and in
the dns_zone where dns_adb_createfind() would be called without
DNS_ADBFIND_HINTOK and DNS_ADBFIND_GLUEOK set.
Simplify the logic by allowing hint and GLUE addresses when looking
up the nameserver addresses to notify. The difference is negligible
and would cause a difference in the notified addresses only when
there's mismatch between the parent and child addresses and we
haven't cached the child addresses yet.
2. Drop the namebuckets and entrybuckets
Formerly, the namebuckets and entrybuckets were used to reduced the
lock contention when accessing the double-linked lists stored in each
bucket. In the previous refactoring, the custom hashtable for the
buckets has been replaced with isc_ht/isc_hashmap, so only a single
item (mostly, see below) would end up in each bucket.
Removing the entrybuckets has been straightforward, the only matching
was done on the isc_sockaddr_t member of the dns_adbentry.
Removing the zonebuckets required GLUEOK and HINTOK bits to be
removed because the find could match entries with-or-without the bits
set, and creating a custom key that stores the
DNS_ADBFIND_STARTATZONE in the first byte of the key, so we can do a
straightforward lookup into the hashtable without traversing a list
that contains items with different flags.
3. Remove unassociated entries from ADB database
Previously, the adbentries could live in the ADB database even after
unlinking them from dns_adbnames. Such entries would show up as
"Unassociated entries" in the ADB dump. The benefit of keeping such
entries is little - the chance that we link such entry to a adbname
is small, and it's simpler to evict unlinked entries from the ADB
cache (and the hashtable) than create second LRU cleaning mechanism.
Unlinked ADB entries are now directly deleted from the hash
table (hashmap) upon destruction.
4. Cleanup expired entries from the hash table
When buckets were still in place, the code would keep the buckets
always allocated and never shrink the hash table (hashmap). With
proper reference counting in place, we can delete the adbnames from
the hash table and the LRU list.
5. Stop purging the names early when we hit the time limit
Because the LRU list is now time ordered, we can stop purging the
names when we find a first entry that doesn't fullfil our time-based
eviction criteria because no further entry on the LRU list will meet
the criteria.
Future work:
1. Lock contention
In this commit, the focus was on correctness of the data structure,
but in the future, the lock contention in the ADB database needs to
be addressed. Currently, we use simple mutex to lock the hash
tables, because we almost always need to use a write lock for
properly purging the hashtables. The ADB database needs to be
sharded (similar to the effect that buckets had in the past). Each
shard would contain own hashmap and own LRU list.
2. Time-based purging
The ADB names and entries stay intact when there are no lookups.
When we add separate shards, a timer needs to be added for time-based
cleaning in case there's no traffic hashing to the inactive shard.
3. Revisit the 30 minutes limit
The ADB cache is capped at 30 minutes. This needs to be revisited,
and at least the limit should be configurable (in both directions).
The system test should never attempt to start or stop any other server
than those that belong to that system test. Therefore, it is not
necessary to specify the system test name in function calls.
Additionally, this makes it possible to run the test inside a
differently named directory, as its name is automatically detected with
the $SYSTESTDIR variable. This enables running the system tests inside a
temporary directory.
Direct use of stop.pl was replaced with a more systematic approach to
use stop_servers helper function.
Remove test cases that rely upon key and denial of existence
management operations triggered by dynamic updates.
The autosign system test needed a bit more care than just removing
because the test cases are dependent on each other, so there are some
additional tweaks such as setting the NSEC3PARAM via rndc signing,
and renaming zone input files. In the process, some additional
debug output files have been added, and a 'ret' fail case overwrite
was fixed.
Create a zone that triggers DNAME owner name checks in a zone that
is only reachable using a dual stack server. The answer contains
a name that is higher in the tree than the query name.
e.g.
foo.v4only.net. CNAME v4only.net.
v4only.net. A 10.0.0.1
ns4 is serving the test zone (ipv4-only)
ns6 is the root server for this test (dual stacked)
ns7 is acting as the dual stack server (dual stacked)
ns9 is the server under test (ipv6-only)
checkbashisms warns about possible reliance on HOSTNAME environmental
variable which Bash sets to the name of the current host, and some
commands may leverage it:
possible bashism in builtin/tests.sh line 199 ($HOST(TYPE|NAME)):
grep "^\"$HOSTNAME\"$" dig.out.ns1.$n > /dev/null || ret=1
possible bashism in builtin/tests.sh line 221 ($HOST(TYPE|NAME)):
grep "^\"$HOSTNAME\"$" dig.out.ns2.$n > /dev/null || ret=1
possible bashism in builtin/tests.sh line 228 ($HOST(TYPE|NAME)):
grep "^; NSID: .* (\"$HOSTNAME\")$" dig.out.ns2.$n > /dev/null || ret=1
We don't use the variable this way but rename it to HOST_NAME to silence
the tool.
"next_key_event_threshold" is assigned with
"next_key_event_threshold+i", but "i" is empty (never set, nor used
afterwards).
posh, the Policy-compliant Ordinary SHell, failed on this assignment
with:
tests.sh:253: : unexpected `end of expression'
checkbashisms gets confused by the rndc command being on two lines:
possible bashism in bin/tests/system/nzd2nzf/tests.sh line 37 (type):
rndccmd 10.53.0.1 addzone "added.example { type primary; file \"added.db\";
checkbashisms reports Bash-style ("==") string comparisons inside test/[
command:
possible bashism in bin/tests/system/checkconf/tests.sh line 105 (should be 'b = a'):
if [ $? == 0 ]; then echo_i "failed"; ret=1; fi
possible bashism in bin/tests/system/keyfromlabel/tests.sh line 62 (should be 'b = a'):
test $ret == 0 || continue
possible bashism in bin/tests/system/keyfromlabel/tests.sh line 79 (should be 'b = a'):
test $ret == 0 || continue
The checkbashisms script reports errors like this one:
script util/check-line-length.sh does not appear to have a #! interpreter line;
you may get strange results
It was possible to set operating system limits (RLIMIT_DATA,
RLIMIT_STACK, RLIMIT_CORE and RLIMIT_NOFILE) from named.conf. It's
better to leave these untouched as setting these is responsibility of
the operating system and/or supervisor.
Deprecate the configuration options and remove them in future BIND 9
release.
The retry 3 times when checking signatures did not make sense because
at this point the input file does not change.
Raise the number of retries when checking the apex DNSKEY response to
reduce the number of intermittent failures due to unexpected delays.
If 'set -x' is in effect file.prev gets populated with debugging output.
To prevent this open descriptor 3 and redirect stderr from the awk
command to descriptor 3. Debugging output will stay directed to stderr.
The changes in the code have the side effect that the CDNSKEY and CDS
records in the secure version of the zone are not reusable and thus
are thrashed from the zone. Remove the apex checks for this use case.
We only care about that the zone is not immediately goes bogus, but
a user really should use the built-in "insecure" policy when unsigning
a zone.
Add one more case that tests reconfiguring a zone to turn off
inline-signing. It should still be a valid DNSSEC zone and the NSEC3
parameters should not change.
Add another test to ensure that you cannot update the zone with a
NSEC3 record.
We no longer accept copying DNSSEC records from the raw zone to
the secure zone, so update the kasp system test that relies on this
accordingly.
Also add more debugging and store the dnssec-verify results in a file.
Add a kasp system test that reconfigures a dnssec-policy zone from
maintaining DNSSEC records directly to the zone to using inline-signing.
Add a similar test case to the nsec3 system test, testing the same
thing but now with NSEC3 in use.
the dupsigs test is prone to failing on slow CI machines
because the first test can occur before the zone is fully
signed.
instead of just waiting ten seconds arbitrarily, we now
check every second, and allow up to 30 seconds before giving
up.