- restore support for tcp-initial-timeout, tcp-idle-timeout,
tcp-keepalive-timeout and tcp-advertised-timeout configuration
options, which were ineffective previously.
Make the "tcp" system test fail if the Python tool used for establishing
TCP connections (ans6) logs a result different than "OK" after
processing a command sent to it (as that means the tool was unable to
successfully perform the requested action), with the exception of
cleanup errors at the end of the test which can be safely ignored. Note
that the tool not returning any result at all in 10 seconds is still a
fatal error in all cases.
this adds functions in conf.sh.common to create DS-style trust anchor
files. those functions are then used to create nearly all of the trust
anchors in the system tests.
there are a few exceptions:
- some tests in dnssec and mkeys rely on detection of unsupported
algorithms, which only works with key-style trust anchors, so those
are used for those tests in particular.
- the mirror test had a problem with the use of a CSK without a
SEP bit, which still needs addressing
in the future, some of these tests should be changed back to using
traditional trust anchors, so that both types will be exercised going
forward.
use empty placeholder KEYDATA records for all trust anchors, not just
DS-style trust anchors.
this revealed a pre-existing bug: keyfetch_done() skips keys without
the SEP bit when populating the managed-keys zone. consequently, if a
zone only has a single ZSK which is configured as trust anchor and no
KSKs, then no KEYDATA record is ever written to the managed-keys zone
when keys are refreshed.
that was how the root server in the dnssec system test was configured.
however, previously, the KEYDATA was created when the key was
initialized; this prevented us from noticing the bug until now.
configuring a ZSK as an RFC 5011 trust anchor is not forbidden by the
spec, but it is highly unusual and not well defined. so for the time
being, I have modified the system test to generate both a KSK and ZSK
for the root zone, enabling the test to pass.
we should consider adding code to detect this condition and allow keys
without the SEP bit to be used as trust anchors if no key with the SEP
bit is available, or at minimum, log a warning.
note: this also needs further refactoring.
- when initializing RFC 5011 for a name, we populate the managed-keys
zone with KEYDATA records derived from the initial-key trust anchors.
however, with initial-ds trust anchors, there is no key. but the
managed-keys zone still must have a KEYDATA record for the name,
otherwise zone_refreshkeys() won't refresh that key. so, for
initial-ds trust anchors, we now add an empty KEYDATA record and set
the key refresh timer so that the real keys will be looked up as soon
as possible.
- when a key refresh query is done, we verify it against the
trust anchor; this is done in two ways, one with the DS RRset
set up during configuration if present, or with the keys linked
from each keynode in the list if not. because there are two different
verification methods, the loop structure is overly complex and should
be simplified.
- the keyfetch_done() and sync_keyzone() functions are both too long
and should be broken into smaller functions.
note: this is a frankensteinian kluge which needs further refactoring.
the keytable started as an RBT where the node->data points to a list of
dns_keynode structures, each of which points to a single dst_key.
later it was modified so that the list could instead point to a single
"null" keynode structure, which does not reference a key; this means
a trust anchor has been configured but the RFC 5011 refresh failed.
in this branch it is further updated to allow the first keynode in
the list to point to an rdatalist of DS-style trust anchors. these will
be used by the validator to populate 'val->dsset' when validating a zone
key.
a DS style trust anchor can be updated as a result of RFC 5011
processing to contain DST keys instead; this results in the DS list
being freed. the reverse is not possible; attempting to add a DS-style
trust anchor if a key-style trust anchor is already in place results
in an error.
later, this should be refactored to use rdatalists for both DS-style
and key-style trust anchors, but we're keeping the existing code for
old-style trust anchors for now.
With the netmgr in use, named may start answering queries before zones
are loaded. This can cause transient failures in system tests after
servers are restarted or reconfigured. This commit adds retry loops
and sleep statements where needed to address this problem.
Also incidentally silenced a clang warning.
- ns__client_request() is now called by netmgr with an isc_nmhandle_t
parameter. The handle can then be permanently associated with an
ns_client object.
- The task manager is paused so that isc_task events that may be
triggred during client processing will not fire until after the netmgr is
finished with it. Before any asynchronous event, the client MUST
call isc_nmhandle_ref(client->handle), to prevent the client from
being reset and reused while waiting for an event to process. When
the asynchronous event is complete, isc_nmhandle_unref(client->handle)
must be called to ensure the handle can be reused later.
- reference counting of client objects is now handled in the nmhandle
object. when the handle references drop to zero, the client's "reset"
callback is used to free temporary resources and reiniialize it,
whereupon the handle (and associated client) is placed in the
"inactive handles" queue. when the sysstem is shutdown and the
handles are cleaned up, the client's "put" callback is called to free
all remaining resources.
- because client allocation is no longer handled in the same way,
the '-T clienttest' option has now been removed and is no longer
used by any system tests.
- the unit tests require wrapping the isc_nmhandle_unref() function;
when LD_WRAP is supported, that is used. otherwise we link a
libwrap.so interposer library and use that.
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.
Ensure any unexpected failure in the "tcp" system test causes it to be
immediately interrupted with an error to make the aforementioned test
more reliable. Since the exit code for "expr 0 + 0" is 1, the status
variable needs to be updated using arithmetic expansion.
assert_int_equal() calls in bin/tests/system/tcp/tests.sh pass the found
value as the first argument and the expected value as the second
argument, while the function interprets its arguments the other way
round. Fix argument handling in assert_int_equal() to make sure the
error messages printed by that function are correct.
In the TCP high-water checks, "rndc stats" is run after ans6 reports
that it opened the requested number of TCP connections. However, we
fail to account for the fact that ns5 might not yet have called accept()
for these connections, in which case the counts output by "rndc stats"
will be off. To prevent intermittent "tcp" system test failures, allow
the relevant connection count checks to be retried (just once, after one
second, as that should be enough for any system to accept() a dozen TCP
connections under any circumstances).
the current method used for testing distribution of signatures
is failure-prone. we need to replace it with something both
effective and portable, but in the meantime we're commenting
out the jitter test.
The original requirement for the check to pass was <-10;10> interval and
the first test was failing by 1 second. As the minimum interval for
checking is 7200 seconds, the commit relaxes the requirement to <-60;60>
interval, which is still sane, but not that draconic.
The get_keyids() function can return multiple keyids, when the
return value was not quoted, only the first keyid would be checked
with check_key() function. This MR fixes both the error that came
with quoting the "$id" with value "12345 54321", and the code now
checks all returned keyids.