While working on the serve-stale backports, I noticed the following
oddities:
1. In the serve-stale system test, in one case we keep track of the
time how long it took for dig to complete. In commit
aaed7f9d8c, the code removed the
exception to check for result == ISC_R_SUCCESS on stale found
answers, and adjusted the test accordingly. This failed to update
the time tracking accordingly. Move the t1/t2 time track variables
back around the two dig commands to ensure the lookups resolved
faster than the resolver-query-timeout.
2. We can remove the setting of NS_QUERYATTR_STALEOK and
DNS_RDATASETATTR_STALE_ADDED on the "else if (stale_timeout)"
code path, because they are added later when we know we have
actually found a stale answer on a stale timeout lookup.
3. We should clear the NS_QUERYATTR_STALEOK flag from the client
query attributes instead of DNS_RDATASETATTR_STALE_ADDED (that
flag is set on the rdataset attributes).
4. In 'bin/named/config.c' we should set the configuration options
in alpabetical order.
5. In the ARM, in the backports we have added "(stale)" between
"cached" and "RRset" to make more clear a stale RRset may be
returned in this scenario.
Exerting excessive I/O load on the host running system tests should be
avoided in order to limit the number of false positives reported by the
system test suite. In some cases, running named with "-d 99" (which is
the default for system tests) results in a massive amount of logs being
generated, most of which are useless. Implement a log file size check
to draw developers' attention to overly verbose named instances used in
system tests. The warning threshold of 200,000 lines was chosen
arbitrarily.
The regression test for CVE-2020-8620 causes a lot of useless messages
to be logged. However, globally decreasing the log level for the
affected named instance would be a step too far as debugging information
may be useful for troubleshooting other checks in the "tcp" system test.
Starting a separate named instance for a single check should be avoided
when possible and thus is also not a good solution. As a compromise,
run "rndc trace 1" for the affected named instance before starting the
regression test for CVE-2020-8620.
The system test framework starts all named instances with the "-d 99"
command line option (unless it is overridden by a named.args file in a
given instance's working directory). This causes a lot of log messages
to be written to named.run files - currently over 5 million lines for a
single test suite run. While debugging information preserved in the log
files is essential for troubleshooting intermittent test failures, some
system tests involve sending hundreds or even thousands of queries,
which causes the relevant log files to explode in size. When multiple
tests (or even multiple test suites) are run in parallel, excessive
logging contributes considerably to the I/O load on the test host,
increasing the odds of intermittent test failures getting triggered.
Decrease the debug level for the seven most verbose named instances:
- use "-d 3" for ns2 in the "cacheclean" system test (it is the lowest
logging level at which the test still passes without the need to
apply any changes to tests.sh),
- use "-d 1" for the other six named instances.
This roughly halves the number of lines logged by each test suite run
while still leaving enough information in the logs to allow at least
basic troubleshooting in case of test failures.
This approach was chosen as it results in a greater decrease in the
number of lines logged than running all named instances with "-d 3",
without causing any test failures.
Add a check to the "dnssec" system test which ensures that RRSIG(SOA)
RRsets present anywhere else than at the zone apex are automatically
removed after a zone containing such RRsets is loaded.
If "tkey-gssapi-credential" is set in the configuration and GSSAPI
support is not available, named will refuse to start. As the test
system framework does not support starting named instances
conditionally, ensure that "tkey-gssapi-credential" is only present in
named.conf if GSSAPI support is available.
The test spawns 4 parallel workers that keep adding, modifying and
deleting zones, the main thread repeatedly checks wheter rndc
status responds within a reasonable period.
While environment and timing issues may affect the test, in most
test cases the deadlock that was taking place before the fix used to
trigger in less than 7 seconds in a machine with at least 2 cores.
Four named instances in the "nsupdate" system test have GSS-TSIG support
enabled. All of them currently use "tkey-gssapi-keytab". Configure two
of them with "tkey-gssapi-credential" to test that option.
As "tkey-gssapi-keytab" and "tkey-gssapi-credential" both provide the
same functionality, no test modifications are required. The difference
between the two options is that the value of "tkey-gssapi-keytab" is an
explicit path to the keytab file to acquire credentials from, while the
value of "tkey-gssapi-credential" is the name of the principal whose
credentials should be used; those credentials are looked up in the
keytab file expected by the Kerberos library, i.e. /etc/krb5.keytab by
default. The path to the default keytab file can be overridden using by
setting the KRB5_KTNAME environment variable. Utilize that variable to
use existing keytab files with the "tkey-gssapi-credential" option.
The KRB5_KTNAME environment variable should not interfere with the
"tkey-gssapi-keytab" option. Nevertheless, rename one of the keytab
files used with "tkey-gssapi-keytab" to something else than the contents
of the KRB5_KTNAME environment variable in order to make sure that both
"tkey-gssapi-keytab" and "tkey-gssapi-credential" are actually tested.
This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers. While an effort has been put into keeping the
taskmgr interface intact, couple of changes have been made:
* The taskmgr has no concept of universal privileged mode - rather the
tasks are either privileged or unprivileged (normal). The privileged
tasks are run as a first thing when the netmgr is unpaused. There
are now four different queues in in the netmgr:
1. priority queue - netievent on the priority queue are run even when
the taskmgr enter exclusive mode and netmgr is paused. This is
needed to properly start listening on the interfaces, free
resources and resume.
2. privileged task queue - only privileged tasks are queued here and
this is the first queue that gets processed when network manager
is unpaused using isc_nm_resume(). All netmgr workers need to
clean the privileged task queue before they all proceed normal
operation. Both task queues are processed when the workers are
finished.
3. task queue - only (traditional) task are scheduled here and this
queue along with privileged task queues are process when the
netmgr workers are finishing. This is needed to process the task
shutdown events.
4. normal queue - this is the queue with netmgr events, e.g. reading,
sending, callbacks and pretty much everything is processed here.
* The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t)
object.
* The isc_nm_destroy() function now waits for indefinite time, but it
will print out the active objects when in tracing mode
(-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been
made a little bit more asynchronous and it might take longer time to
shutdown all the active networking connections.
* Previously, the isc_nm_stoplistening() was a synchronous operation.
This has been changed and the isc_nm_stoplistening() just schedules
the child sockets to stop listening and exits. This was needed to
prevent a deadlock as the the (traditional) tasks are now executed on
the netmgr threads.
* The socket selection logic in isc__nm_udp_send() was flawed, but
fortunatelly, it was broken, so we never hit the problem where we
created uvreq_t on a socket from nmhandle_t, but then a different
socket could be picked up and then we were trying to run the send
callback on a socket that had different threadid than currently
running.
When resolve.c was moved from lib/samples to bin/tests/system, the
resolve.vcxproj.in would still contain old paths to the directory
root. This commit adds one more ..\ to match the directory depth.
Additionally, fixup the path in BINDInstall.vcxproj.in to be
bin/tests/system and not bin/tests/samples.
Previously, the taskmgr, timermgr and socketmgr had a constructor
variant, that would create the mgr on top of existing appctx. This was
no longer true and isc_<*>mgr was just calling isc_<*>mgr_create()
directly without any extra code.
This commit just cleans up the extra function.
"resolve" is used by the resolver system tests, and I'm not
certain whether delv exercises the same code, so rather than
remove it, I moved it to bin/tests/system.
the libdns client API is no longer being maintained for
external use, we can remove the code that isn't being used
internally, as well as the related tests.
Too much logic was cramped inside the dns_journal_rollforward() that
made it harder to follow. The dns_journal_rollforward() was refactored
to work over already opened journal and some of the previous logic was
moved to new static zone_journal_rollforward() that separates the
journal "rollforward" logic from the "zone" logic.
when dns_journal_rollforward returned ISC_R_RECOVERABLE the distintion
between 'up to date' and 'success' was lost, as a consequence
zone_needdump() was called writing out the zone file when it shouldn't
have been. This change restores that distintion. Adjust system
test to reflect visible changes.
Due to the lack of "match-clients" clauses in ns4/named2.conf.in, the
same view is incorrectly chosen for all queries received by ns4 in the
"keymgr2kasp" system test. This causes only one version of the
"view-rsasha256.kasp" zone to actually be checked. Add "match-clients"
clauses to ns4/named2.conf.in to ensure the test really checks what it
claims to.
Use identical view names ("ext", "int") in ns4/named.conf.in and
ns4/named2.conf.in so that it is easier to quickly identify the
differences between these two files.
Update tests.sh to account for the above changes. Also fix a copy-paste
error in a comment to prevent confusion.
The test case for a zone with a missing include file was wrong for two
reasons:
1. It was loading the wrong file (master5 instead of master6)
2. It did actually not set the $ret variable to 1 if the test failed
(it should default to ret=1 and clear the variable if the
appropriate log is found).
Add a test case for inline-signing for a zone with an $INCLUDE
statement. There is already a test for a missing include file, this
one adds a test for a zone with an include file that does exist.
Test if the record in the included file is loaded.
The draft says that the NSEC(3) TTL must have the same TTL value
as the minimum of the SOA MINIMUM field and the SOA TTL. This was
always the intended behaviour.
Update the zone structure to also track the SOA TTL. Whenever we
use the MINIMUM value to determine the NSEC(3) TTL, use the minimum
of MINIMUM and SOA TTL instead.
There is no specific test for this, however two tests need adjusting
because otherwise they failed: They were testing for NSEC3 records
including the TTL. Update these checks to use 600 (the SOA TTL),
rather than 3600 (the SOA MINIMUM).
It is more intuitive to have the countdown 'max-stale-ttl' as the
RRset TTL, instead of 0 TTL. This information was already available
in a comment "; stale (will be retained for x more seconds", but
Support suggested to put it in the TTL field instead.
Before binding an RRset, check the time and see if this record is
stale (or perhaps even ancient). Marking a header stale or ancient
happens only when looking up an RRset in cache, but binding an RRset
can also happen on other occasions (for example when dumping the
database).
Check the time and compare it to the header. If according to the
time the entry is stale, but not ancient, set the STALE attribute.
If according to the time is ancient, set the ANCIENT attribute.
We could mark the header stale or ancient here, but that requires
locking, so that's why we only compare the current time against
the rdh_ttl.
Adjust the test to check the dump-db before querying for data. In the
dumped file the entry should be marked as stale, despite no cache
lookup happened since the initial query.
When introducing change 5149, "rndc dumpdb" started to print a line
above a stale RRset, indicating how long the data will be retained.
At that time, I thought it should also be possible to load
a cache from file. But if a TTL has a value of 0 (because it is stale),
stale entries wouldn't be loaded from file. So, I added the
'max-stale-ttl' to TTL values, and adjusted the $DATE accordingly.
Since we actually don't have a "load cache from file" feature, this
is premature and is causing confusion at operators. This commit
changes the 'max-stale-ttl' adjustments.
A check in the serve-stale system test is added for a non-stale
RRset (longttl.example) to make sure the TTL in cache is sensible.
Also, the comment above stale RRsets could have nonsensical
values. A possible reason why this may happen is when the RRset was
marked a stale but the 'max-stale-ttl' has passed (and is actually an
RRset awaiting cleanup). This would lead to the "will be retained"
value to be negative (but since it is stored in an uint32_t, you would
get a nonsensical value (e.g. 4294362497).
To mitigate against this, we now also check if the header is not
ancient. In addition we check if the stale_ttl would be negative, and
if so we set it to 0. Most likely this will not happen because the
header would already have been marked ancient, but there is a possible
race condition where the 'rdh_ttl + serve_stale_ttl' has passed,
but the header has not been checked for staleness.
When system tests are run on Windows, they are assigned port ranges that
are 100 ports wide and start from port number 5000. This is a different
port assignment method than the one used on Unix systems. Drop the "-p"
command line option from bin/tests/system/run.sh invocations used for
starting system tests on Windows to unify the port assignment method
used across all operating systems.
The get_ports.sh script is used for determining the range of ports a
given system test should use. It first determines the start of the port
range to return (the base port); it can either be specified explicitly
by the caller or chosen randomly. Subsequent ports are picked
sequentially, starting from the base port. To ensure no single port is
used by multiple tests, a state file (get_ports.state) containing the
last assigned port is maintained by the script. Concurrent access to
the state file is protected by a lock file (get_ports.lock); if one
instance of the script holds the lock file while another instance tries
to acquire it, the latter retries its attempt to acquire the lock file
after sleeping for 1 second; this retry process can be repeated up to 10
times before the script returns an error.
There are some problems with this approach:
- the sleep period in case of failure to acquire the lock file is
fixed, which leads to a "thundering herd" type of problem, where
(depending on how processes are scheduled by the operating system)
multiple system tests try to acquire the lock file at the same time
and subsequently sleep for 1 second, only for the same situation to
likely happen the next time around,
- the lock file is being locked and then unlocked for every single
port assignment made, not just once for the entire range of ports a
system test should use; in other words, the lock file is currently
locked and unlocked 13 times per system test; this increases the
odds of the "thundering herd" problem described above preventing a
system test from getting one or more ports assigned before the
maximum retry count is reached (assuming multiple system tests are
run in parallel); it also enables the range of ports used by a given
system test to be non-sequential (which is a rather cosmetic issue,
but one that can make log interpretation harder than necessary when
test failures are diagnosed),
- both issues described above cause unnecessary delays when multiple
system tests are started in parallel (due to high lock file
contention among the system tests being started),
- maintaining a state file requires ensuring proper locking, which
complicates the script's source code.
Rework the get_ports.sh script so that it assigns non-overlapping port
ranges to its callers without using a state file or a lock file:
- add a new command line switch, "-t", which takes the name of the
system test to assign ports for,
- ensure every instance of get_ports.sh knows how many ports all
system tests which form the test suite are going to need in total
(based on the number of subdirectories found in bin/tests/system/),
- in order to ensure all instances of get_ports.sh work on the same
global port range (so that no port range collisions happen), a
stable (throughout the expected run time of a single system test
suite) base port selection method is used instead of the random one;
specifically, the base port, unless specified explicitly using the
"-p" command line switch, is derived from the number of hours which
passed since the Unix Epoch time,
- use the name of the system test to assign ports for (passed via the
new "-t" command line switch) as a unique index into the global
system test range, to ensure all system tests use disjoint port
ranges.
The fromhex.pl script needs to be copied from the source directory to
the build directory before any test is run, otherwise the out-of-tree
fails to find it. Given that the script is used only in system test,
move it to bin/tests/system/.
Update the system to include a recoverable managed.keys journal created
with <size,serial0,serial1,0> transactions and test that it has been
updated as part of the start up process.
The isc_nm_*connect() functions were refactored to always return the
connection status via the connect callback instead of sometimes returning
the hard failure directly (for example, when the socket could not be
created, or when the network manager was shutting down).
This commit changes the connect functions in all the network manager
modules, and also makes the necessary refactoring changes in places
where the connect functions are called.
Using "stale-answer-client-timeout" turns out to have unforeseen
negative consequences, and thus it is better to disable the feature
by default for the time being.
When implementing "stale-answer-client-timeout", we decided that
we should only return positive answers prematurely to clients. A
negative response is not useful, and in that case it is better to
wait for the recursion to complete.
To do so, we check the result and if it is not ISC_R_SUCCESS, we
decide that it is not good enough. However, there are more return
codes that could lead to a positive answer (e.g. CNAME chains).
This commit removes the exception and now uses the same logic that
other stale lookups use to determine if we found a useful stale
answer (stale_found == true).
This means we can simplify two test cases in the serve-stale system
test: nodata.example is no longer treated differently than data.example.
Tag the libraries with check_ to prevent them being installed
by "make install". Additionally make check requires .so to be
create which requires .lai files to be constructed which, in
turn, requires -rpath <dir> as part of "linking" the .la file.
Added tests to ensure that dig won't retry sending a query over tcp
(+tcp) when a TCP connection is closed prematurely (EOF is read) if
either +tries=1 or retry=0 is specified on the command line.
Now that premature EOF on tcp connections take +tries and +retry into
account, the dig system tests handling TCP EOF with +tries=1 were
expecting dig to do a second attempt in handling the tcp query, which
doesn't happen anymore.
To make the test work as expected +tries value was adjusted to 2, to
make it behave as before after the new update on dig.
Add kasp.sh to the list of scripts copied from the source directory to
the build directory before any test is run. This will fix
the out-of-tree test failures introduced in commit
ecb073bdd6 on the 'main' branch.
When calling "rndc dnssec -checkds", it may take some milliseconds
before the appropriate changes have been written to the state file.
Add retry_quiet mechanisms to allow the write operation to finish.
Also retry_quiet the check for the next key event. A "rndc dnssec"
command may trigger a zone_rekey event and this will write out
a new "next key event" log line, but it may take a bit longer than
than expected in the tests.
Call 'dns_zone_rekey' after a 'rndc dnssec -checkds' or 'rndc dnssec
-rollover' command is received, because such a command may influence
the next key event. Updating the keys immediately avoids unnecessary
rollover delays.
The kasp system test no longer needs to call 'rndc loadkeys' after
a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command.
CDS/CDNSKEY DELETE records are only useful if they are signed,
otherwise the parent cannot verify these RRsets anyway. So once the DS
has been removed (and signaled to BIND), we can remove the DNSKEY and
RRSIG records, and at this point we can also remove the CDS/CDNSKEY
records.
Change the 'check_keys' function to try three times. Some intermittent
kasp test failures are because we are inspecting the key files
before the actual change has happen. The 'retry_quiet' approach allows
for a bit more time to let the write operation finish.
Add two test zones that migrate to dnssec-policy. Test if the key
states are set accordingly given the timing metadata.
The rumoured.kasp zone has its Publish/Active/SyncPublish times set
not too long ago so the key states should be set to RUMOURED. The
omnipresent.kasp zone has its Publish/Active/SyncPublish times set
long enough to set the key states to OMNIPRESENT.
Slightly change the init_migration_keys function to set the
key lifetime to "none" (legacy keys don't have lifetime). Then in the
test case set the expected key lifetime explicitly.
This commit is somewhat editorial as it does not introduce something
new nor fixes anything.
The layout in keymgr2kasp/tests.sh has been changed, with the
intention to make more clear where a test scenario ends and begins.
The publication time of some ZSKs has been changed. It makes a more
clear distinction between publication time and activation time.
The kasp system test was getting pretty large, and more tests are on
the way. Time to split up. Move tests that are related to migrating
to dnssec-policy to a separate directory 'keymgr2kasp'.
The system tests were missing a test that would test tcp-initial-timeout
and tcp-idle-timeout.
This commit adds new "timeouts" system test that adds:
* Test that waits longer than tcp-initial-timeout and then checks
whether the socket was closed
* Test that sends and receives DNS message then waits longer than
tcp-initial-timeout but shorter time than tcp-idle-timeout than
sends DNS message again than waits longer than tcp-idle-timeout
and checks whether the socket was closed
* Similar test, but bursting 25 DNS messages than waiting longer than
tcp-initial-timeout and shorter than tcp-idle-timeout than do second
25 DNS message burst
* Check whether transfer longer than tcp-initial-timeout succeeds