Some versions of Perl failed to run packet.pl because the 'last'
keyword can't be used outside of a loop block. This commit moves
the packet dumping code into a function so we can use 'return' instead.
The tcp system test uses the 'packet.pl' test tool to send a packet
thousands of times. This took a long time because the tool was waiting
for replies and parsing them; however, for that particular test the
replies aren't relevant.
This commit uses non-blocking sockets and moves the reply parsing
outside the send loop, which speeds the system test up substantially.
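The idea of sending without waiting on replies can be sketched in Python
(packet.pl itself is Perl; the function name and structure here are
hypothetical and only illustrate the technique, not the tool's code):

```python
import select
import socket


def send_repeated(sock, payload, repeats):
    """Send `payload` `repeats` times on a connected socket.

    Replies are only collected as raw bytes so the send loop never blocks
    on them; any parsing is left to the caller, after the loop.
    """
    sock.setblocking(False)
    replies = bytearray()
    peer_open = True
    for _ in range(repeats):
        pending = memoryview(payload)
        while len(pending) > 0:
            # Stop polling for reads once the peer has closed its side.
            read_set = [sock] if peer_open else []
            readable, writable, _unused = select.select(read_set, [sock], [])
            if readable:
                try:
                    chunk = sock.recv(65536)
                except BlockingIOError:
                    chunk = None
                if chunk == b"":
                    peer_open = False
                elif chunk:
                    replies += chunk
            if writable:
                try:
                    sent = sock.send(pending)
                except BlockingIOError:
                    continue
                pending = pending[sent:]
    return bytes(replies)
```

The returned bytes can be parsed once, after all packets have been sent,
or simply discarded when the replies are irrelevant to the test.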
The "huge.zone" zone can take longer than 100 seconds to load when
running under a sanitizer. Increase the relevant zone load timeout to
prevent intermittent failures of the "rndc" system test.
The CDS/CDNSKEY record will be published when the DS is in the
RUMOURED state. However, with the introduction of the rndc '-checkds'
command, the logic in the keymgr was changed to prevent the DS
state from moving to RUMOURED unless the specific command was given.
Hence, the CDS was never published before it was seen in the parent.
Initially I thought this was a policy approval rule, however it is
actually a DNSSEC timing rule. Remove the restriction from
'keymgr_policy_approval' and update the 'keymgr_transition_time'
function. When looking to move the DS state to OMNIPRESENT it will
no longer calculate the transition time from the last state change,
but from when the DS was seen in the parent ("DS Publish"). If that
time was not set, default to a next key event of one hour.
Similarly for moving the DS state to HIDDEN, the time to wait will
be derived from the "DS Delete" time, not from when the DS state
last changed.
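As a rough illustration of the timing rule (not the keymgr code, which is
C), the calculation can be sketched as follows. The function name and the
"wait" parameter are hypothetical; "wait" stands for whatever interval the
policy requires after the DS was seen published or removed:

```python
ONE_HOUR = 3600


def ds_transition_time(now, ds_seen, wait):
    """Return the time at which the DS state may move on.

    ds_seen: the "DS Publish" time (for moving to OMNIPRESENT) or the
             "DS Delete" time (for moving to HIDDEN); None if not set.
    wait:    hypothetical required interval in seconds.
    """
    if ds_seen is None:
        # Time not set: default to a next key event one hour from now.
        return now + ONE_HOUR
    # Count from when the DS was seen, not from the last state change.
    return ds_seen + wait
```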
The 'rndc_checkds' utility now allows "now" as the time when the DS
has been seen in/seen removed from the parent.
Also it uses "KEYX" as the key argument, rather than key id.
The 'rndc_checkds' will retrieve the key from the "KEYX" string. This
makes the call a bit more readable.
This commit has a lot of updates on comments, mainly to make the
system test more readable.
Also remove some redundant signing policy checks (check_keys,
check_dnssecstatus, check_keytimes).
Finally, move key time checks and expected key time settings above
'rndc_checkds' calls (with the new way of testing next key event
times there is no need to do them after 'rndc_checkds', and moving
them above 'rndc_checkds' makes the flow of testing easier to follow).
The test works as follows:
1. Client wants to resolve unusual ip6.arpa. name:
test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN TXT
2. Query is sent to ns7, a qmin enabled resolver.
3. ns7 does the first stage of query minimization for the name and sends a
new query to root (ns1):
_.1.0.0.2.ip6.arpa. IN A
4. ns1 delegates ip6.arpa. to ns2.good.:
;; AUTHORITY SECTION:
;ip6.arpa. 20 IN NS ns2.good.
;; ADDITIONAL SECTION:
;ns2.good. 20 IN A 10.53.0.2
5. ns7 does a second round of minimizing the name and sends a new query
to ns2.good. (10.53.0.2):
_.8.2.6.0.1.0.0.2.ip6.arpa. IN A
6. ans2 delegates 8.2.6.0.1.0.0.2.ip6.arpa. to ns3.good.:
;; AUTHORITY SECTION:
;8.2.6.0.1.0.0.2.ip6.arpa. 60 IN NS ns3.good.
;; ADDITIONAL SECTION:
;ns3.good. 60 IN A 10.53.0.3
7. ns7 does a third round of minimizing the name and sends a new query to
ns3.good.:
_.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN A
8. ans3 delegates 1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. to ns4.good.:
;; AUTHORITY SECTION:
;1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 60 IN NS ns4.good.
;; ADDITIONAL SECTION:
;ns4.good. 60 IN A 10.53.0.4
9. ns7 does a fourth round of minimizing the name and sends a new query to
ns4.good.:
_.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN A
10. ns4.good. doesn't know such a name, but answers stating it is authoritative for
the domain:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 53815
...
;; AUTHORITY SECTION:
1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 60 IN SOA ns4.good. ...
11. ns7 minimizes the name further:
_.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
sends it to ns4.good. and gets the same SOA response as in item #10.
12. ns7 minimizes the name further:
_.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
sends it to ns4.good. and gets the same SOA response as in item #10.
13. ns7 finishes query minimization for the ip6.arpa. QNAME.
After all IPv6 labels are exhausted, the algorithm falls back to the
original QNAME:
test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
ns7 sends a new query with the original QNAME to ans4.
14. Finally ans4 answers with the expected response:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40969
;; flags: qr aa; QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN TXT
;; ANSWER SECTION:
;test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 1 IN TXT "long_ip6_name"
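The sequence of queries in the steps above can be reproduced with a short
Python sketch. The stopping points (4, 8, 12, 14, 16 and 32 labels below
ip6.arpa.) are read off the steps listed here; the function and constant
names are hypothetical and this is not the resolver's implementation:

```python
# Label counts below ip6.arpa. at which a minimized query is sent,
# taken from the walkthrough above.
IP6_ARPA_STOPS = (4, 8, 12, 14, 16, 32)


def minimized_queries(qname, qtype):
    """Return the (name, type) queries sent for an ip6.arpa. QNAME."""
    labels = qname.rstrip(".").split(".")
    if [label.lower() for label in labels[-2:]] != ["ip6", "arpa"]:
        raise ValueError("not an ip6.arpa. name")
    below = labels[:-2]  # labels below ip6.arpa.
    queries = []
    for stop in IP6_ARPA_STOPS:
        if stop < len(below):
            # Minimized queries use a "_" label and type A, as in the steps.
            name = "_." + ".".join(below[-stop:]) + ".ip6.arpa."
            queries.append((name, "A"))
    # Once the stops are exhausted, fall back to the original QNAME/QTYPE.
    queries.append((qname, qtype))
    return queries
```

For the name used in this test, the function yields six minimized type A
queries followed by the original TXT query.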
The test for assertion failure via large TCP packet needs to be repeated
multiple times (we use 300000). This commit fixes the input file to be
properly hexlified and uses the new packet.pl -r feature to send it
300000 times via TCP.
For some tests, we need to send big data streams (for TCP) or repeated
packets (for UDP). This commit adds a `-r` option to packet.pl that sends
the same input <repeats> times using the specified protocol.
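A minimal Python sketch of the idea follows. It assumes the input is hex
text in which whitespace and '#' comments are ignored; that format and
both function names are assumptions for illustration, not a description
of packet.pl:

```python
def parse_hex_input(text):
    """Decode hex text into bytes, skipping whitespace and '#' comments.

    Raises ValueError if the remaining text is not valid hex, which is
    how improperly encoded input would surface.
    """
    cleaned = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]      # drop trailing comment
        cleaned.append("".join(line.split()))
    return bytes.fromhex("".join(cleaned))


def repeated_stream(payload, repeats):
    """Return the payload repeated `repeats` times, e.g. for a TCP stream."""
    return payload * repeats
```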
In order to lower the amount of memory allocated at startup by named
instances used in the BIND system test suite, set the default value of
"max-cache-size" for these to 2 megabytes. The purpose of this change
is to prevent named instances (or even entire virtual machines) from
getting killed by the operating system on the test host due to excessive
memory use.
Remove all "max-cache-size" statements from named configuration files
used in system tests ("checkconf" notwithstanding) to prevent confusion
as the "-T maxcachesize=..." command line option takes precedence over
configuration files.
Prevent intermittent false positives on slow platforms by subtracting
the number of seconds which passed between key creation and invoking
'rndc dnssec -checkds' from the expected next key event time.
This particularly fails for the step3.csk-roll2.autosign zone because
the closest next key event is when the zone signatures become
omnipresent. Running 'rndc dnssec -checkds' some time later means
that the next key event is in fact closer than the calculated time
and thus we need to adjust the expected time by the time already
passed.
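The adjustment amounts to simple arithmetic, sketched here in Python (the
test itself is a shell script; the names are hypothetical and the clamp
to zero is an added assumption):

```python
def expected_next_key_event(interval, created, checkds_invoked):
    """Return the expected seconds until the next key event.

    interval:        seconds until the next key event, counted from key
                     creation
    created:         key creation time (epoch seconds)
    checkds_invoked: time 'rndc dnssec -checkds' was run (epoch seconds)
    """
    elapsed = checkds_invoked - created
    # Assumption: never report a negative wait.
    return max(interval - elapsed, 0)
```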
Previously, .txt files containing a full backtrace could be
misidentified as a crashed test:
I:Core dumps were found for the following system tests:
I: core.19948-backtrace.txt
I: shutdown
Now .txt files are removed from the list.
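The filtering can be illustrated with a small Python sketch. The "core*"
and "*.core" patterns are assumptions made for this example; the actual
scripts are shell and their exact patterns are not shown here:

```python
import fnmatch


def core_dump_candidates(filenames):
    """Return names that look like core dumps, excluding .txt files."""
    found = []
    for name in filenames:
        looks_like_core = (fnmatch.fnmatchcase(name, "core*")
                           or fnmatch.fnmatchcase(name, "*.core"))
        # Backtrace text files share the "core" prefix but are not dumps.
        if looks_like_core and not name.endswith(".txt"):
            found.append(name)
    return found
```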
Change 'run.sh.in' to match the core matching pattern in
'testsummary.sh'.
Make sure the 'checkds' command correctly sets the right key timing
metadata and also make sure that it rejects setting the key timing
metadata if there are multiple keys with the KSK role and no key
identifier is provided.
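The rejection rule can be sketched as follows. This is a hypothetical
Python model using plain dictionaries for keys, not the named code:

```python
def select_ksk_for_checkds(keys, key_id=None):
    """Pick the key whose DS timing metadata should be set.

    keys:   list of dicts with "id" (int) and "ksk" (bool) entries
    key_id: optional key identifier given on the command line
    """
    ksks = [key for key in keys if key["ksk"]]
    if key_id is not None:
        matches = [key for key in ksks if key["id"] == key_id]
        if len(matches) != 1:
            raise ValueError("no unique KSK with id %d" % key_id)
        return matches[0]
    if len(ksks) != 1:
        # Ambiguous (or no KSK at all): refuse to set timing metadata.
        raise ValueError("key identifier required")
    return ksks[0]
```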
With 'checkds' replacing 'parent-registration-delay', the kasp
test needs the expected times to be adjusted. Also the system test
needs to call 'rndc dnssec -checkds' to progress the rollovers.
Since we pretend that the KSK is active as soon as the DS is
submitted (and parent registration delay is no longer applicable)
we can simplify the 'csk_rollover_predecessor_keytimes' function
to take only one "addtime" parameter.
This commit also slightly changes the 'check_dnssecstatus' function,
passing the zone as a parameter.
The named configuration files used in the "geoip2" system test cause a
rather large number of views (6-8) to be set up in each tested named
instance. Each view has its own cache.
Commit e24bc324b4 caused the RBT hash
table to be pre-allocated to a size derived from "max-cache-size", so
that it never needs to be rehashed. The size of that hash table is not
expected to be significant enough to cause memory use issues in typical
conditions even for large "max-cache-size" settings.
However, these two factors combined can cause memory exhaustion issues
in GitLab CI, where we run multiple "instances" of the test suite in
parallel on the same runner, each test suite executes multiple system
tests concurrently, and each system test may potentially start multiple
named instances at the same time. In practice, this problem currently
only seems to be affecting the "geoip2" system test, which is failing
intermittently due to named instances used by that test getting killed
by oom-killer.
Prevent the "geoip2" system test from failing intermittently by setting
"max-cache-size" in named configuration files used in that test to a low
value in order to keep memory usage at bay even with a large number of
views configured.
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).
This allows DNS operators to turn on serve-stale answers in the event of
a large authoritative DNS outage. The caching of the stale answers needs
to be enabled before the outage happens; otherwise the feature would be
useless.
The negative consequence of the default setting is the inevitable
cache bloat that happens for each and every DNS operator running named.
In this MR, a new configuration option `stale-cache-enable` is
introduced that allows operators to selectively enable or disable
the serve-stale feature of BIND 9.
The newly introduced option is disabled by default,
i.e. serve-stale is disabled in the default configuration and has to be
enabled if required.
It seems that config.guess always gets created in the source root, so for
the sake of out-of-tree system tests, we should expect the file there
instead of where configure was run.
The $SYSTEMTESTTOP shell variable is often set to .. in various shell
scripts inside bin/tests/system/, but most of the time it is only
used one line later, while sourcing conf.sh. This hardly improves
code readability.
$SYSTEMTESTTOP is also used for the purpose of referencing
scripts/files living in bin/tests/system/, but given that the
variable is always set to a short, relative path, we can drop it and
replace all of its occurrences with the relative path without adversely
affecting code readability.