The dns_adb unit has been refactored to be much simpler. Following
changes have been made:
1. Simplify the ADB to always allow GLUE and hints
There were only two places where dns_adb_createfind() was used - in
the dns_resolver unit where hints and GLUE addresses were ok, and in
the dns_zone where dns_adb_createfind() would be called without
DNS_ADBFIND_HINTOK and DNS_ADBFIND_GLUEOK set.
Simplify the logic by allowing hint and GLUE addresses when looking
up the nameserver addresses to notify. The difference is negligible
and would cause a difference in the notified addresses only when
there's mismatch between the parent and child addresses and we
haven't cached the child addresses yet.
2. Drop the namebuckets and entrybuckets
Formerly, the namebuckets and entrybuckets were used to reduced the
lock contention when accessing the double-linked lists stored in each
bucket. In the previous refactoring, the custom hashtable for the
buckets has been replaced with isc_ht/isc_hashmap, so only a single
item (mostly, see below) would end up in each bucket.
Removing the entrybuckets has been straightforward, the only matching
was done on the isc_sockaddr_t member of the dns_adbentry.
Removing the zonebuckets required GLUEOK and HINTOK bits to be
removed because the find could match entries with-or-without the bits
set, and creating a custom key that stores the
DNS_ADBFIND_STARTATZONE in the first byte of the key, so we can do a
straightforward lookup into the hashtable without traversing a list
that contains items with different flags.
3. Remove unassociated entries from ADB database
Previously, the adbentries could live in the ADB database even after
unlinking them from dns_adbnames. Such entries would show up as
"Unassociated entries" in the ADB dump. The benefit of keeping such
entries is little - the chance that we link such entry to a adbname
is small, and it's simpler to evict unlinked entries from the ADB
cache (and the hashtable) than create second LRU cleaning mechanism.
Unlinked ADB entries are now directly deleted from the hash
table (hashmap) upon destruction.
4. Cleanup expired entries from the hash table
When buckets were still in place, the code would keep the buckets
always allocated and never shrink the hash table (hashmap). With
proper reference counting in place, we can delete the adbnames from
the hash table and the LRU list.
5. Stop purging the names early when we hit the time limit
Because the LRU list is now time ordered, we can stop purging the
names when we find a first entry that doesn't fullfil our time-based
eviction criteria because no further entry on the LRU list will meet
the criteria.
Future work:
1. Lock contention
In this commit, the focus was on correctness of the data structure,
but in the future, the lock contention in the ADB database needs to
be addressed. Currently, we use simple mutex to lock the hash
tables, because we almost always need to use a write lock for
properly purging the hashtables. The ADB database needs to be
sharded (similar to the effect that buckets had in the past). Each
shard would contain own hashmap and own LRU list.
2. Time-based purging
The ADB names and entries stay intact when there are no lookups.
When we add separate shards, a timer needs to be added for time-based
cleaning in case there's no traffic hashing to the inactive shard.
3. Revisit the 30 minutes limit
The ADB cache is capped at 30 minutes. This needs to be revisited,
and at least the limit should be configurable (in both directions).
The dns_rpz_zones structure was using .refs and .irefs for strong and
weak reference counting. Rewrite the unit to use just a single
reference counting + shutdown sequence (dns_rpz_destroy_rpzs) that must
be called by the creator of the dns_rpz_zones_t object. Remove the
reference counting from the dns_rpz_zone structure as it is not needed
because the zone objects are fully embedded into the dns_rpz_zones
structure and dns_rpz_zones_t object must never be destroyed before all
dns_rpz_zone_t objects.
The dns_rps_zones_t reference counting uses the new ISC_REFCOUNT_TRACE
capability - enable by defining DNS_RPZ_TRACE in the dns/rpz.h header.
Additionally, add magic numbers to the dns_rpz_zone and dns_rpz_zones
structures.
The dns_cache API contained a cache cleaning mechanism that would be
disabled for 'rbt' based cache. As named doesn't have any other cache
implementations, remove the cache cleaning mechanism from dns_cache API.
The various factors like NS_PER_MS are now defined in a single place
and the names are no longer inconsistent. I chose the _PER_SEC names
rather than _PER_S because it is slightly more clear in isolation;
but the smaller units are always NS, US, and MS.
The system test should never attempt to start or stop any other server
than those that belong to that system test. Therefore, it is not
necessary to specify the system test name in function calls.
Additionally, this makes it possible to run the test inside a
differently named directory, as its name is automatically detected with
the $SYSTESTDIR variable. This enables running the system tests inside a
temporary directory.
Direct use of stop.pl was replaced with a more systematic approach to
use stop_servers helper function.
Remove test cases that rely upon key and denial of existence
management operations triggered by dynamic updates.
The autosign system test needed a bit more care than just removing
because the test cases are dependent on each other, so there are some
additional tweaks such as setting the NSEC3PARAM via rndc signing,
and renaming zone input files. In the process, some additional
debug output files have been added, and a 'ret' fail case overwrite
was fixed.
Create a zone that triggers DNAME owner name checks in a zone that
is only reachable using a dual stack server. The answer contains
a name that is higher in the tree than the query name.
e.g.
foo.v4only.net. CNAME v4only.net.
v4only.net. A 10.0.0.1
ns4 is serving the test zone (ipv4-only)
ns6 is the root server for this test (dual stacked)
ns7 is acting as the dual stack server (dual stacked)
ns9 is the server under test (ipv6-only)
checkbashisms warns about possible reliance on HOSTNAME environmental
variable which Bash sets to the name of the current host, and some
commands may leverage it:
possible bashism in builtin/tests.sh line 199 ($HOST(TYPE|NAME)):
grep "^\"$HOSTNAME\"$" dig.out.ns1.$n > /dev/null || ret=1
possible bashism in builtin/tests.sh line 221 ($HOST(TYPE|NAME)):
grep "^\"$HOSTNAME\"$" dig.out.ns2.$n > /dev/null || ret=1
possible bashism in builtin/tests.sh line 228 ($HOST(TYPE|NAME)):
grep "^; NSID: .* (\"$HOSTNAME\")$" dig.out.ns2.$n > /dev/null || ret=1
We don't use the variable this way but rename it to HOST_NAME to silence
the tool.
"next_key_event_threshold" is assigned with
"next_key_event_threshold+i", but "i" is empty (never set, nor used
afterwards).
posh, the Policy-compliant Ordinary SHell, failed on this assignment
with:
tests.sh:253: : unexpected `end of expression'
checkbashisms gets confused by the rndc command being on two lines:
possible bashism in bin/tests/system/nzd2nzf/tests.sh line 37 (type):
rndccmd 10.53.0.1 addzone "added.example { type primary; file \"added.db\";
checkbashisms reports Bash-style ("==") string comparisons inside test/[
command:
possible bashism in bin/tests/system/checkconf/tests.sh line 105 (should be 'b = a'):
if [ $? == 0 ]; then echo_i "failed"; ret=1; fi
possible bashism in bin/tests/system/keyfromlabel/tests.sh line 62 (should be 'b = a'):
test $ret == 0 || continue
possible bashism in bin/tests/system/keyfromlabel/tests.sh line 79 (should be 'b = a'):
test $ret == 0 || continue
The checkbashisms script reports errors like this one:
script util/check-line-length.sh does not appear to have a #! interpreter line;
you may get strange results
It was possible to set operating system limits (RLIMIT_DATA,
RLIMIT_STACK, RLIMIT_CORE and RLIMIT_NOFILE) from named.conf. It's
better to leave these untouched as setting these is responsibility of
the operating system and/or supervisor.
Deprecate the configuration options and remove them in future BIND 9
release.
There were a number of places where the zone table should have been
locked, but wasn't, when dns_zt_apply was called.
Added a isc_rwlocktype_t type parameter to dns_zt_apply and adjusted
all calls to using it. Removed locks in callers.
The retry 3 times when checking signatures did not make sense because
at this point the input file does not change.
Raise the number of retries when checking the apex DNSKEY response to
reduce the number of intermittent failures due to unexpected delays.
If 'set -x' is in effect file.prev gets populated with debugging output.
To prevent this open descriptor 3 and redirect stderr from the awk
command to descriptor 3. Debugging output will stay directed to stderr.
If after a reconfig a zone is not reusable because inline-signing
was turned on/off, trigger a full resign. This is necessary because
otherwise the zone maintenance may decide to only apply the changes
in the journal, leaving the zone in an inconsistent DNSSEC state.
The changes in the code have the side effect that the CDNSKEY and CDS
records in the secure version of the zone are not reusable and thus
are thrashed from the zone. Remove the apex checks for this use case.
We only care about that the zone is not immediately goes bogus, but
a user really should use the built-in "insecure" policy when unsigning
a zone.
Add one more case that tests reconfiguring a zone to turn off
inline-signing. It should still be a valid DNSSEC zone and the NSEC3
parameters should not change.
Add another test to ensure that you cannot update the zone with a
NSEC3 record.
We no longer accept copying DNSSEC records from the raw zone to
the secure zone, so update the kasp system test that relies on this
accordingly.
Also add more debugging and store the dnssec-verify results in a file.
Add a kasp system test that reconfigures a dnssec-policy zone from
maintaining DNSSEC records directly to the zone to using inline-signing.
Add a similar test case to the nsec3 system test, testing the same
thing but now with NSEC3 in use.
On Linux, the libcap is now mandatory. It makes things simpler for us.
System without {set,get}res{uid,gid} now have compatibility shim using
setreuid/setregid or seteuid/setegid to setup effective UID/GID, so the
same code can be called all the time (including on Linux).
The ns_interfacemgr_scan() now requires the loopmgr to be running, so we
need to end exclusive mode for the rescan and then begin it again.
This is relatively safe operation (because the scan happens on the timer
anyway), but we need to ensure that we won't load the configuration from
different threads. This is already the case because the initial load
happens on the main thread and the control channel also listens just on
the main loop.
the dupsigs test is prone to failing on slow CI machines
because the first test can occur before the zone is fully
signed.
instead of just waiting ten seconds arbitrarily, we now
check every second, and allow up to 30 seconds before giving
up.
Checks that malformed _dns SVCB records are rejected unless
check-svcb is set to no, in which case they are accepted. Both
missing ALPN and missing DOHPATH are checked for.
_dns SVBC records have additional constrains which should be checked
when records are being added. This adds those constraint checks but
allows the user to override them using 'check-svcb no'.
Add a new upforwd system test that checks if update forwarding still
works if the first primary is badly configured.
We cannot reuse the 'example.' zone for this test because that
checks if update forwarding works for DoT. What transport is used
in the new test is of no relevance.
Update the system test to use different known good file names for
the different zones that are being tested.
Use the ALGORITHM_SET option to use randomly selected default algorithm
in this test. Make sure the test works by using variables instead of
hard-coding values.