Simply looking for the key ID surrounded by spaces in the tested
dnssec-signzone output file is not a precise enough method of checking
for signatures prepared using a given key ID: it can be tripped up by
cross-algorithm key ID collisions and certain low key IDs (e.g. 60, the
TTL specified in bin/tests/system/dnssec/signer/example.db.in), which
triggers false positives for the "dnssec" system test. Make key ID
extraction precise by using an awk script which operates on specific
fields.
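The difference between the two approaches can be sketched in shell as
follows. This is only an illustration, not the script used by the test:
the sig_key_ids helper name and the sample records are made up, and the
field numbers assume that every RRSIG is printed on a single line with an
explicit TTL and class (so that the algorithm is field 6 and the key ID
is field 11).

```shell
# Hypothetical helper: print the key IDs of all RRSIGs read from stdin
# that were made with the given algorithm number.
# Assumed layout (one record per line):
#   owner TTL class RRSIG covered alg labels origttl exp inc keyid signer sig
sig_key_ids() {
	alg=$1
	awk -v alg="$alg" '$4 == "RRSIG" && $6 == alg { print $11 }' | sort -un
}

# Made-up sample data; key ID 60 deliberately equals the TTL.
sample='example. 60 IN SOA ns.example. admin.example. 1 60 60 60 60
example. 60 IN RRSIG SOA 5 1 60 20300101000000 20200101000000 12345 example. c2ln
example. 60 IN RRSIG SOA 7 1 60 20300101000000 20200101000000 60 example. c2ln'

# Naive check: every line contains " 60 " because of the TTL.
naive=$(printf '%s\n' "$sample" | grep -c ' 60 ')
echo "lines matching ' 60 ': $naive"

# Field-based check: only signatures made with the requested algorithm.
echo "algorithm 5 key IDs: $(printf '%s\n' "$sample" | sig_key_ids 5)"
echo "algorithm 7 key IDs: $(printf '%s\n' "$sample" | sig_key_ids 7)"
```

Comparing the algorithm field as well as the key ID field is what keeps
cross-algorithm key ID collisions from being counted.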
The "mirror" system test expects all dig queries (including recursive
ones) to be responded to within 1 second, which turns out to be overly
optimistic in certain cases and leads to false positives being
triggered. Increase dig query timeout used throughout the "mirror"
system test to 2 seconds in order to alleviate the issue.

Currently, ns3 in the "mirror" system test sends trust anchor telemetry
queries every second as it is started with "-T tat=1". Given the number
of trust anchors configured on ns3 (9), TAT-related traffic clutters up
log files, hindering troubleshooting efforts. Increase TAT query
interval to 3 seconds in order to alleviate the issue.

Note that the interval chosen cannot be much higher if intermittent test
failures are to be avoided: TAT queries are only sent after the
configured number of seconds passes since resolver startup. Quick
experiments show that even on contemporary hardware, ns3 should be
running for at least 5 seconds before it is first shut down, so a
3-second TAT query interval seems to be a reasonable, future-proof
compromise. Ensure the relevant check is performed before ns3 is first
shut down to emphasize this trade-off and make it more clear by what
time TAT queries are expected to be sent.

"rndc dumpdb" works asynchronously, i.e. the requested dump may not yet
be fully written to disk by the time "rndc" returns. Prevent false
positives for the "serve-stale" system test by only checking dump
contents after the line indicating that it is complete is written.
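A minimal sketch of such a wait loop is shown below. It is not the code
used by the test: the wait_for_dump helper name is hypothetical, and the
sketch assumes that the completion marker is a line reading exactly
"; Dump complete". The background subshell merely simulates named
finishing the dump after "rndc" has already returned.

```shell
# Hypothetical helper: wait up to $2 seconds for dump file $1 to contain
# the completion marker (assumed to be the line "; Dump complete").
wait_for_dump() {
	file=$1
	tries=$2
	while [ "$tries" -gt 0 ]; do
		if grep -q '^; Dump complete$' "$file" 2>/dev/null; then
			return 0
		fi
		sleep 1
		tries=$((tries - 1))
	done
	return 1
}

dump=$(mktemp)
printf '; Start view _default\n' > "$dump"

# Simulate the dump being completed one second later.
(sleep 1; printf '; Dump complete\n' >> "$dump") &

if wait_for_dump "$dump" 5; then
	status=complete		# only now is it safe to inspect the dump
else
	status=timeout
fi
wait
echo "dump status: $status"
rm -f "$dump"
```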
This tests two cases: the DLV trust anchor uses an unsupported or
disabled algorithm, and the DLV zone contains a key with an unsupported
or disabled algorithm.

Some values returned by dstkey_fromconfig() indicate that key loading
should be interrupted, others do not. There are also certain subsequent
checks to be made after parsing a key from configuration and the results
of these checks also affect the key loading process. All of this
complicates the key loading logic.

In order to make the relevant parts of the code easier to follow, reduce
the body of the inner for loop in load_view_keys() to a single call to a
new function, process_key(). Move dstkey_fromconfig() error handling to
process_key() as well and add comments to clearly describe the effects
of various key loading errors.

More specifically: ignore configured trusted and managed keys that
match a disabled algorithm. The behavioral change is that the
associated responses are no longer SERVFAIL, but are returned as
insecure.

bin/tests/system/stop.pl only waits for the PID file to be cleaned up,
while named cleans up the lock file after the PID file. Thus, the
aforementioned script may consider a named instance to be fully shut
down when in fact it is not.

Fix by also checking whether the lock file exists when determining a
given instance's shutdown status. This change assumes that if a named
instance uses a lock file, it is called "named.lock".

Also rename clean_pid_file() to pid_file_exists(), so that its name
reflects what it does (it does not clean up the PID file itself, it only
returns the server's identifier if its PID file is not yet cleaned up).
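The shutdown condition described above can be illustrated with a short
shell sketch. It is not the code in stop.pl (which is written in Perl):
the server_is_down helper is hypothetical, and the temporary directory
only stands in for a server directory containing "named.pid" and
"named.lock".

```shell
# Hypothetical helper: a server directory counts as shut down only when
# neither the PID file nor the lock file is left in it.
server_is_down() {
	dir=$1
	[ ! -e "$dir/named.pid" ] && [ ! -e "$dir/named.lock" ]
}

dir=$(mktemp -d)
touch "$dir/named.pid" "$dir/named.lock"

# named removes the PID file first...
rm -f "$dir/named.pid"
if server_is_down "$dir"; then after_pid=down; else after_pid=running; fi

# ...and the lock file only afterwards.
rm -f "$dir/named.lock"
if server_is_down "$dir"; then after_lock=down; else after_lock=running; fi

echo "after PID file removal: $after_pid"
echo "after lock file removal: $after_lock"
rm -rf "$dir"
```

Checking only the PID file would report the instance as shut down after
the first removal, which is the window this change closes.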
MR !1141 broke the way stop.pl is invoked when start.pl fails:

  - start.pl changes the working directory to $testdir/$server before
    attempting to start $server,

  - commit 27ee629e6b causes the $testdir variable in stop.pl to be
    determined using the $SYSTEMTESTTOP environment variable, which is
    set to ".." by all tests.sh scripts,

  - commit e227815af5 makes start.pl pass $test (the test's name)
    rather than $testdir (the path to the test's directory) to stop.pl
    when a given server fails to start.

Thus, when a server is restarted from within a tests.sh script and such
a restart fails, stop.pl attempts to look for the server directory in a
nonexistent location ($testdir/$server/../$test, i.e. $testdir/$test,
instead of $testdir/../$test). Fix the issue by changing the working
directory before stop.pl is invoked in the scenario described above.
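The path resolution problem can be reproduced with a throwaway directory
tree. The sketch below is an illustration only: the "mirror" and "ns3"
names and the temporary directory are arbitrary choices, and the
relative path "../$test/$server" stands in for what stop.pl derives from
$SYSTEMTESTTOP and the test name.

```shell
top=$(mktemp -d)
test=mirror		# arbitrary test name for this illustration
server=ns3		# arbitrary server name for this illustration
testdir=$top/$test
mkdir -p "$testdir/$server"

# From the test directory (where tests.sh runs), "../$test" is the test
# directory itself, so the server directory is found.
if (cd "$testdir" && [ -d "../$test/$server" ]); then
	from_testdir=found
else
	from_testdir=missing
fi

# From the server directory (where start.pl has changed to), "../$test"
# resolves to $testdir/$test, which does not exist.
if (cd "$testdir/$server" && [ -d "../$test/$server" ]); then
	from_serverdir=found
else
	from_serverdir=missing
fi

echo "lookup from test directory: $from_testdir"
echo "lookup from server directory: $from_serverdir"
rm -rf "$top"
```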