This mostly comprises of:
* using $(...) instead of `...`
* changing the directories in subshell and not ignoring `cd` return code
* handling every error gracefully instead of ignoring the return code
The indentation for dumping the master zone was driven by two
global variables dns_master_indent and dns_master_indentstr. In
threaded mode, this becomes prone to data access races, so this commit
converts the global variables into a local per-context tuple that
consist of count and string.
The oldsigs test was checking only for the validity of the A
a.oldsigs.example. resource record and associated DNSSEC signature while
the zone might not have been fully signed yet leading to validation
failures because of bogus signatures on the validation path.
This commit changes the test to test that all old signatures in the
oldsigs.example. zone were replaced and the zone is fully resigned
before running the main check.
- restore support for tcp-initial-timeout, tcp-idle-timeout,
tcp-keepalive-timeout and tcp-advertised-timeout configuration
options, which were ineffective previously.
Make the "tcp" system test fail if the Python tool used for establishing
TCP connections (ans6) logs a result different than "OK" after
processing a command sent to it (as that means the tool was unable to
successfully perform the requested action), with the exception of
cleanup errors at the end of the test which can be safely ignored. Note
that the tool not returning any result at all in 10 seconds is still a
fatal error in all cases.
this adds functions in conf.sh.common to create DS-style trust anchor
files. those functions are then used to create nearly all of the trust
anchors in the system tests.
there are a few exceptions:
- some tests in dnssec and mkeys rely on detection of unsupported
algorithms, which only works with key-style trust anchors, so those
are used for those tests in particular.
- the mirror test had a problem with the use of a CSK without a
SEP bit, which still needs addressing
in the future, some of these tests should be changed back to using
traditional trust anchors, so that both types will be exercised going
forward.
use empty placeholder KEYDATA records for all trust anchors, not just
DS-style trust anchors.
this revealed a pre-existing bug: keyfetch_done() skips keys without
the SEP bit when populating the managed-keys zone. consequently, if a
zone only has a single ZSK which is configured as trust anchor and no
KSKs, then no KEYDATA record is ever written to the managed-keys zone
when keys are refreshed.
that was how the root server in the dnssec system test was configured.
however, previously, the KEYDATA was created when the key was
initialized; this prevented us from noticing the bug until now.
configuring a ZSK as an RFC 5011 trust anchor is not forbidden by the
spec, but it is highly unusual and not well defined. so for the time
being, I have modified the system test to generate both a KSK and ZSK
for the root zone, enabling the test to pass.
we should consider adding code to detect this condition and allow keys
without the SEP bit to be used as trust anchors if no key with the SEP
bit is available, or at minimum, log a warning.
With the netmgr in use, named may start answering queries before zones
are loaded. This can cause transient failures in system tests after
servers are restarted or reconfigured. This commit adds retry loops
and sleep statements where needed to address this problem.
Also incidentally silenced a clang warning.
- ns__client_request() is now called by netmgr with an isc_nmhandle_t
parameter. The handle can then be permanently associated with an
ns_client object.
- The task manager is paused so that isc_task events that may be
triggred during client processing will not fire until after the netmgr is
finished with it. Before any asynchronous event, the client MUST
call isc_nmhandle_ref(client->handle), to prevent the client from
being reset and reused while waiting for an event to process. When
the asynchronous event is complete, isc_nmhandle_unref(client->handle)
must be called to ensure the handle can be reused later.
- reference counting of client objects is now handled in the nmhandle
object. when the handle references drop to zero, the client's "reset"
callback is used to free temporary resources and reiniialize it,
whereupon the handle (and associated client) is placed in the
"inactive handles" queue. when the sysstem is shutdown and the
handles are cleaned up, the client's "put" callback is called to free
all remaining resources.
- because client allocation is no longer handled in the same way,
the '-T clienttest' option has now been removed and is no longer
used by any system tests.
- the unit tests require wrapping the isc_nmhandle_unref() function;
when LD_WRAP is supported, that is used. otherwise we link a
libwrap.so interposer library and use that.
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.
Ensure any unexpected failure in the "tcp" system test causes it to be
immediately interrupted with an error to make the aforementioned test
more reliable. Since the exit code for "expr 0 + 0" is 1, the status
variable needs to be updated using arithmetic expansion.
assert_int_equal() calls in bin/tests/system/tcp/tests.sh pass the found
value as the first argument and the expected value as the second
argument, while the function interprets its arguments the other way
round. Fix argument handling in assert_int_equal() to make sure the
error messages printed by that function are correct.
In the TCP high-water checks, "rndc stats" is run after ans6 reports
that it opened the requested number of TCP connections. However, we
fail to account for the fact that ns5 might not yet have called accept()
for these connections, in which case the counts output by "rndc stats"
will be off. To prevent intermittent "tcp" system test failures, allow
the relevant connection count checks to be retried (just once, after one
second, as that should be enough for any system to accept() a dozen TCP
connections under any circumstances).
the current method used for testing distribution of signatures
is failure-prone. we need to replace it with something both
effective and portable, but in the meantime we're commenting
out the jitter test.
The original requirement for the check to pass was <-10;10> interval and
the first test was failing by 1 second. As the minimum interval for
checking is 7200 seconds, the commit relaxes the requirement to <-60;60>
interval, which is still sane, but not that draconic.