The Windows support has been completely removed from the source tree
and BIND 9 now no longer supports native compilation on Windows.
We might consider reviewing mingw-w64 port if contributed by external
party, but no development efforts will be put into making BIND 9 compile
and run on Windows again.
Add a test case where 'named' is restarted and ensure that an already
signed zone does not change its NSEC3 parameters.
The test case first tests the current zone and saves the used salt
value. Then after restart it checks if the salt (and other parameters)
are the same as before the restart.
This test case changes 'set_nsec3param'. This will now reset the salt
value, and when checking for NSEC3PARAM we will store the salt and
use it when testing the NXDOMAIN response. This does mean that for
every test case we now have to call 'set_nsec3param' explicitly (and
can not omit it because it is the same as the previous zone).
Finally, slightly changed some echo output to make debugging friendlier.
Make sure an incoming IXFR containing an SOA record which is not placed
at the apex of the transferred zone does not result in a broken version
of the zone being served by named and/or a subsequent crash.
Add a test case where a client request is received and the stale
timeout occurs, but it is not served stale data because there is no entry
in the cache, then is served an authoritative answer once the background
fetch completes. This ensures that a stale timeout only affects a
subsequent response if the client was answered.
- send a query for an AAAA which will be resolved as a mapped A
- disable authoritative responses
- wait for the negative AAAA response to become stale
- send another query, wait for the stale answer
- re-enable authorative responses so that a real answer arrives
- currently, this triggers an assertion in query.c
In the shutdown system test multiple queries are sent to a resolver
instance, in the meantime we terminate the same resolver process for
which the queries were sent to, either via rndc stop or a SIGTERM
signal, that means the resolver may not be able to answer all those
queries, since it has initiated the shutdown process.
The dnspython library raises a dns.resolver.NoNameservers exception when
a resolver object fails to receive an answer from the specified list
of nameservers (resolver.nameservers list), we need to handle this
exception as this is something that may happen since we asked the
resolver to terminate, as a result it may not answer clients even if
an answer is available, as the operation will be canceled.
The isc_nmiface_t type was holding just a single isc_sockaddr_t,
so we got rid of the datatype and use plain isc_sockaddr_t in place
where isc_nmiface_t was used before. This means less type-casting and
shorter path to access isc_sockaddr_t members.
At the same time, instead of keeping the reference to the isc_sockaddr_t
that was passed to us when we start listening, we will keep a local
copy. This prevents the data race on destruction of the ns_interface_t
objects where pending nmsockets could reference the sockaddr of already
destroyed ns_interface_t object.
the idle timeout for rndc connections was set to 10 seconds, but this
caused intermittent system failures of the 'rndc' system test on slow
platforms, since 'rndc reconfig' could time out before reconfiguration
was complete.
this commit restores the original timeout value of 60 seconds, which was
changed inadvertently after rndc was updated to use the network manager.
even with this change, however, the test can still time out under
TSAN because loading the huge zone can take a very long time (upwards
of two minutes). so the test is modified here to generate a smaller zone
file when running under TSAN.
When executed in "legacy mode" (i.e. without the '-r' parameter)
run.sh invokes make with a modified environment.
SYSTEMTEST_FORCE_COLOR is now preserved for use by the individual test
scripts.
CYGWIN is now preserved for named, as it controls behavior relating to
crash reporting.
This restores legacy behavior in bin/tests/system where running:
SYSTEMTEST_NO_CLEAN=1 ./run.sh <testname>
would run the test and preserve the output files.
This has been broken since the change that has run.sh invoke "make",
due to SYSTEMTEST_NO_CLEAN not being preserved in the environment
that's set up for "make".
Another option would be to completely remove SYSTEMTEST_NO_CLEAN.
This seems to be the only behavior-changing environment variable
not accounted for in the call to "make".
I don't think this needs a CHANGES entry.
dns_message_gettempname() now returns a pointer to an initialized
name associated with a dns_fixedname_t object. it is no longer
necessary to allocate a buffer for temporary names associated with
the message object.
Also, add "set -e" to all shell scripts of the views test to exit when
any command fails or is unknown, e.g., this on OpenBSD:
tests.sh[174]: seq: not found
The seq command is not defined in the POSIX standard and is missing on
OpenBSD. Given that the system test code is meant to be POSIX-compliant
replace it with a shell construct.
Add two tests to make sure named-checkconf catches key-directory issues
where a zone in multiple views uses the same directory but has
different dnssec-policies. One test sets the key-directory specifically,
the other inherits the default key-directory (NULL, aka the working
directory).
Also update the good.conf test to allow zones in different views
with the same key-directory if they use the same dnssec-policy.
Also allow zones in different views with different key-directories if
they use different dnssec-policies.
Also allow zones in different views with the same key-directories if
only one view uses a dnssec-policy (the other is set to "none").
Also allow zones in different views with the same key-directories if
no views uses a dnssec-policy (zone in both views has the dnssec-policy
set to "none").
PyLint 2.8.2 reports the following suggestions for two Python scripts
used in the system test suite:
************* Module tests_rndc_deadlock
bin/tests/system/addzone/tests_rndc_deadlock.py:71:4: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
************* Module tests-shutdown
bin/tests/system/shutdown/tests-shutdown.py:68:4: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
bin/tests/system/shutdown/tests-shutdown.py:154:8: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
Implement the above suggestions by using
concurrent.futures.ThreadPoolExecutor() and subprocess.Popen() as
context managers.
With taskmgr running on top of netmgr, the ordering of how the tasks and
netmgr shutdown interacts was wrong as previously isc_taskmgr_destroy()
was waiting until all tasks were properly shutdown and detached. This
responsibility was moved to netmgr, so we now need to do the following:
1. shutdown all the tasks - this schedules all shutdown events onto
the netmgr queue
2. shutdown the netmgr - this also makes sure all the tasks and
events are properly executed
3. Shutdown the taskmgr - this now waits for all the tasks to finish
running before returning
4. Shutdown the netmgr - this call waits for all the netmgr netievents
to finish before returning
This solves the race when the taskmgr object would be destroyed before
all the tasks were finished running in the netmgr loops.
Previously, netmgr, taskmgr, timermgr and socketmgr all had their own
isc_<*>mgr_create() and isc_<*>mgr_destroy() functions. The new
isc_managers_create() and isc_managers_destroy() fold all four into a
single function and makes sure the objects are created and destroy in
correct order.
Especially now, when taskmgr runs on top of netmgr, the correct order is
important and when the code was duplicated at many places it's easy to
make mistake.
The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were
made private and a single call to isc_managers_create() and
isc_managers_destroy() is required at the program startup / shutdown.
Just like with dynamic and/or inline-signing zones, check if no two
or more zone configurations set the same filename. In these cases,
the zone files are not read-only and named-checkconf should catch
a configuration where multiple zone statements write to the same file.
Add some bad configuration tests where KASP zones reference the same
zone file.
Update the good-kasp test to allow for two zones configure the same
file name, dnssec-policy none.
Add a test for default.kasp that if we remove the private key file,
no successor key is created for it. We need to update the kasp script
to deal with a missing private key. If this is the case, skip checks
for private key files.
Add a test with a zone for which the private key of the ZSK is missing.
Add a test with a zone for which the private key of the KSK is missing.
The kasp system test performs for each zone a couple of checks to make
sure the zone is signed correctly. To avoid test failures caused by
timing issues, there is first a check to ensure the zone is done
signing, 'wait_for_done_signing'. This function waits with the DNSSEC
checks until a "zone_rekey done" log message is seen for a specific
key.
Unfortunately this is not sufficient to avoid test failures due to
timing issues, because there is a small amount of time in between this
log message and the newly signed zone actually being served.
Therefore, in 'check_apex', retry for three seconds the DNSKEY query
check. After that, additional checks should pass without retries,
because at that point we know for sure the zone has been resigned with
the expected keys.
Also reduce the number of redundant 'check_signatures'
The nsupdate system test did not record failures from the
'update_test.pl' Perl script. This was because the 'ret' value was
not being saved outside the '{ $PERL ... || ret=1 } cat_i' scope.
Change this piece to store the output in a separate file and then
cat its contents. Now the 'ret' value is being saved.
Also record failures in 'update_test.pl' if sending the update
failed.
Add missing 'n' incrementals to 'nsupdate/test.sh' to keep track of
test numbers.
Add a test case when a dnssec-policy is reconfigured to "none",
without setting it to "insecure" first. This is unsupported behavior,
but we want to make sure the behavior is somewhat expected. The
zone should remain signed (but will go bogus once the signatures
expire).
While it is meant to be used for transitioning a zone to insecure,
add a test case where a zone uses the "insecure" policy immediately.
The zone will go through DNSSEC maintenance, but the outcome should
be the same as 'dnssec-policy none;', that is the zone should be
unsigned.
The tests for going insecure should be changed to use the built-in
"insecure" policy.
The function that checks dnssec status output should again check
for the special case "none".
* The location of the digest type field has changed to where the
reserved field was.
* The reserved field is now called scheme and is where the digest
type field was.
* Digest type 2 has been defined (SHA256).
The pytest "cacheprovider" plugin produces a .cache/v/cache/lastfailed
file, which holds a Python dictionary structure with failed tests.
However, on Ubuntu 16.04 (Xenial) the file is created even though the
test passed and the file contains just an empty dictionary ("{}").
Given that we are not interested in this feature, disabling the
"cacheprovider" plugin globally and removing per-test removals of the
.cache directory seems like the best course of action.
When the `named` would hang on startup it would be killed with SIGKILL
leaving us with no information about the state the process was in.
This commit changes the start.pl script to send SIGABRT instead, so we
can properly collect and process the coredump from the hung named
process.
When reducing the number of NSEC3 iterations to 150, commit
aa26cde2ae added tests for dnssec-policy
to check that a too high iteration count is a configuration failure.
The test is not sufficient because 151 was always too high for
ECDSAP256SHA256. The test should check for a different algorithm.
There was an existing test case that checks for NSEC3 iterations.
Update the test with the new maximum values.
Update the code in 'kaspconf.c' to allow at most 150 iterations.
While working on the serve-stale backports, I noticed the following
oddities:
1. In the serve-stale system test, in one case we keep track of the
time how long it took for dig to complete. In commit
aaed7f9d8c, the code removed the
exception to check for result == ISC_R_SUCCESS on stale found
answers, and adjusted the test accordingly. This failed to update
the time tracking accordingly. Move the t1/t2 time track variables
back around the two dig commands to ensure the lookups resolved
faster than the resolver-query-timeout.
2. We can remove the setting of NS_QUERYATTR_STALEOK and
DNS_RDATASETATTR_STALE_ADDED on the "else if (stale_timeout)"
code path, because they are added later when we know we have
actually found a stale answer on a stale timeout lookup.
3. We should clear the NS_QUERYATTR_STALEOK flag from the client
query attributes instead of DNS_RDATASETATTR_STALE_ADDED (that
flag is set on the rdataset attributes).
4. In 'bin/named/config.c' we should set the configuration options
in alpabetical order.
5. In the ARM, in the backports we have added "(stale)" between
"cached" and "RRset" to make more clear a stale RRset may be
returned in this scenario.
Exerting excessive I/O load on the host running system tests should be
avoided in order to limit the number of false positives reported by the
system test suite. In some cases, running named with "-d 99" (which is
the default for system tests) results in a massive amount of logs being
generated, most of which are useless. Implement a log file size check
to draw developers' attention to overly verbose named instances used in
system tests. The warning threshold of 200,000 lines was chosen
arbitrarily.
The regression test for CVE-2020-8620 causes a lot of useless messages
to be logged. However, globally decreasing the log level for the
affected named instance would be a step too far as debugging information
may be useful for troubleshooting other checks in the "tcp" system test.
Starting a separate named instance for a single check should be avoided
when possible and thus is also not a good solution. As a compromise,
run "rndc trace 1" for the affected named instance before starting the
regression test for CVE-2020-8620.