The kasp system test is timing-critical. The test passes on all
Linux-based machines, but fails frequently on Windows. The test
takes a lot more time on Windows, and the final checks fail because
the next key event is too far off from the expected time. For example:
I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
I:kasp:error: bad next key event time 20909 for zone \
step2.algorithm-roll.kasp (expect 21600)
I:kasp:failed
This is because the kasp system test calculates the time when the
next key event should occur based on the policy. This assumes that
named is able to do key management within a minute. But starting
named, doing key management for other zones, and reconfiguring take
much more time on Windows, and thus the time until the next key event
on Windows is much shorter than anticipated.
That this happens is a good thing because it means that the
correct next key event is used, but it is not so nice for testing, as
it is hard to determine how much time named needed before finishing
the current key event.
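To make the numbers in the log excerpt above concrete: the test expected
21600 seconds but named reported 20909, a difference of 691 seconds,
far more than the one-minute allowance. The following is a minimal
sketch of such a check, not the test suite's actual code; the helper
name, the symmetric window, and the 60-second constant are assumptions
made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed tolerance: named finishes key management within a minute. */
#define NEXT_KEY_EVENT_TOLERANCE 60

/*
 * Hypothetical helper: return true if the observed time until the next
 * key event is within the tolerance of the expected time (in seconds).
 */
static bool
next_key_event_ok(long expected, long observed) {
    assert(expected >= 0 && observed >= 0);

    /* Absolute difference between expectation and observation. */
    long diff = (expected > observed) ? expected - observed
                                      : observed - expected;

    return diff <= NEXT_KEY_EVENT_TOLERANCE;
}
```

With the values from the log, next_key_event_ok(21600, 20909) is false,
which matches the reported failure.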
Disable the kasp test on Windows now because it is blocking the
release. We know the cause of these test failures, and it is clear
that this is a fault in the test, not the code. Therefore we feel
comfortable disabling the test right now and working on a fix while
unblocking the release.
A data race was happening while BIND was starting because the
isc_log_wouldlog() function accessed lctx->logconfig without a lock.
To prevent that without incurring much cost, that variable was made
atomic.
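The general technique can be sketched with C11 atomics as follows. This
is an illustration only, not the code in lib/isc/log.c: the type names,
the highest_level field, and both functions are hypothetical stand-ins.
It also leaves out how the old configuration is freed safely, which a
real implementation would have to handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the logging structures. */
typedef struct logconfig {
    int highest_level;
} logconfig_t;

typedef struct logctx {
    /* The pointer itself is atomic, so readers need no lock. */
    _Atomic(logconfig_t *) logconfig;
} logctx_t;

/* Reader: may run concurrently with use_config() without a data race. */
static bool
wouldlog(logctx_t *lctx, int level) {
    assert(lctx != NULL);

    logconfig_t *cfg = atomic_load_explicit(&lctx->logconfig,
                                            memory_order_acquire);
    if (cfg == NULL) {
        return false; /* No configuration installed yet. */
    }
    return level <= cfg->highest_level;
}

/* Writer: publish a fully initialized configuration. */
static void
use_config(logctx_t *lctx, logconfig_t *cfg) {
    assert(lctx != NULL);
    atomic_store_explicit(&lctx->logconfig, cfg, memory_order_release);
}
```

The release store paired with the acquire load ensures that a reader
that sees the new pointer also sees the fields written before it was
published.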
ABI checker tools generate HTML and TXT API compatibility reports of
BIND libraries. The comparison is done between two BIND source trees,
each of which holds a built BIND.
In CI, one version is the reference version defined by the
BIND_BASELINE_VERSION variable; the other is the HEAD of the branch
under test.
When configuring the same dnssec-policy for two zones with the same
name but in different views, there is a race condition for who will
run the keymgr first. If they run sequentially, only one set of keys
is created; if they run in parallel, two sets of keys are created.
Lock the kasp when looking for keys and running the key manager.
This way, for the same zone in different views only one keyset will
be created.
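A minimal sketch of the locking pattern, assuming a POSIX mutex stored
in a policy object that both views share. The structure, its fields,
and rekey_zone() are hypothetical and much simpler than the real kasp
and keymgr code; the have_keys flag stands in for "key files for this
zone already exist".

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical policy object shared by the zone in both views. */
typedef struct kasp {
    pthread_mutex_t lock;
    bool have_keys;       /* Stand-in for "keys found on disk". */
    int keysets_created;  /* Counts how often keys were generated. */
} kasp_t;

#define KASP_INITIALIZER { PTHREAD_MUTEX_INITIALIZER, false, 0 }

/*
 * Look for keys and run the key manager as one critical section, so a
 * second caller always sees the keys created by the first.
 */
static void
rekey_zone(kasp_t *kasp) {
    assert(kasp != NULL);

    pthread_mutex_lock(&kasp->lock);
    if (!kasp->have_keys) {
        /* No keys found: the key manager creates a new set. */
        kasp->keysets_created++;
        kasp->have_keys = true;
    }
    pthread_mutex_unlock(&kasp->lock);
}
```

Without the lock, two threads could both observe have_keys == false
before either sets it, and each would create its own set of keys.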
The dnssec-policy does not implement sharing keys between different
zones.
OpenBSD virtual machines seem to be affected particularly badly by other
activity happening on the host. This causes trouble around release
time: when multiple tags are pushed to the repository, a large number of
jobs is started concurrently on all CI runners. In extreme cases, this
causes the system test suite to run for about an hour (!) on OpenBSD
VMs, with multiple tests failing. We investigated the test artifacts
for all such cases in the past and the outcome was always the same: test
failures were caused by extremely slow I/O on the guest. We tried
various tricks to work around this problem, but nothing helped.
Given the above, stop running OpenBSD system test jobs for pending BIND
releases to prevent the results of these jobs from affecting the
assessment of a given release's readiness for publication. This change
does not affect OpenBSD build jobs. OpenBSD system test jobs will still
be run for scheduled and web-requested pipelines, to make sure we catch
any severe issues with test code on that platform sooner or later.
Some comments started with a lowercase letter. Capitalize them to
be more consistent with the rest of the comments.
Add some newlines between `set_*` calls and check calls, also to be
more consistent with the other test cases.
There is a failure mode which gets triggered on heavily loaded
systems. A key change is scheduled in 5 seconds to make ZSK2 inactive
and ZSK3 active, but `named` takes more than 5 seconds to progress
from `rndc loadkeys` to the query check. At this time the SOA RRset
is already signed by the new ZSK which is not expected to be active
at that point yet.
Split up the checks to test the case where RRsets are signed
correctly with the offline KSK (the existing signature is retained)
and the active ZSK. In the first run, RRsets should be signed with the
still-active ZSK2; in the second run, RRsets should be signed with the
newly active ZSK3.
This commit fixes the isc_glob() function on Windows.
The file_list_t * object pointed to by pglob->reserved was never
initialized with the ISC_LIST_INIT macro.
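Why the missing initialization matters can be shown with a simplified
list. The types and the FILE_LIST_INIT macro below are hypothetical
stand-ins, not the ISC list macros or the actual Windows glob code; the
point is only that an append reads the head and tail pointers, so they
must be set before the first use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry and list head. */
typedef struct file_path {
    const char *path;
    struct file_path *next;
} file_path_t;

typedef struct file_list {
    file_path_t *head;
    file_path_t *tail;
} file_list_t;

/* Stand-in for a list initialization macro: mark the list as empty. */
#define FILE_LIST_INIT(list)   \
    do {                       \
        (list)->head = NULL;   \
        (list)->tail = NULL;   \
    } while (0)

/* Append reads list->tail, so the list must have been initialized. */
static void
file_list_append(file_list_t *list, file_path_t *entry) {
    assert(list != NULL && entry != NULL);

    entry->next = NULL;
    if (list->tail != NULL) {
        list->tail->next = entry;
    } else {
        list->head = entry;
    }
    list->tail = entry;
}
```

If FILE_LIST_INIT() is skipped on a freshly allocated list, tail holds
an indeterminate value and the first append dereferences it.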
We may be checking the algorithm steps too fast: the reconfig
command may still be in progress. Make sure the zones are signed
and loaded by digging the NSEC records for these zones.
Algorithm rollover waited too long before introducing zone
signatures. It waited to make sure all signatures were resigned,
but when introducing a new algorithm, all signatures are resigned
immediately. Only add the sign delay if there is a predecessor key.
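The rule in the last sentence can be sketched as below. The function
and parameter names are hypothetical, and base_wait stands for whatever
interval the key manager already waits; this is not the keymgr code
itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: how long to wait (in seconds) before the new
 * signatures are considered fully introduced.
 */
static uint32_t
signature_wait(uint32_t base_wait, uint32_t sign_delay,
               bool has_predecessor) {
    /* Guard against overflow in this sketch. */
    assert(base_wait <= UINT32_MAX - sign_delay);

    uint32_t wait = base_wait;
    if (has_predecessor) {
        /*
         * A successor key replaces signatures gradually, so allow
         * time for all of them to be regenerated.
         */
        wait += sign_delay;
    }
    /* A key for a new algorithm has no predecessor: no extra delay. */
    return wait;
}
```

For example, with a base wait of 3600 and a sign delay of 86400, a key
without a predecessor waits 3600 seconds and a successor key waits
90000.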
Algorithm rollover was stuck on submitting DS because keymgr thought
it would move to an invalid state. It did not match the current
key because it checked it against the current key in the next state.
Fix this by checking the current key against the desired state, not
the existing state.
Add a test case for algorithm rollover. This is triggered by
changing the dnssec-policy. A new nameserver ns6 is introduced
for tests related to dnssec-policy changes.
This requires a slight change in check_next_key_event to only
check the last occurrence. Also, change the debug log message in
lib/dns/zone.c to deal with checks when no next scheduled key event
exists (and default to loadkeys interval 3600).