Commit Graph

4612 Commits

Author SHA1 Message Date
Mark Andrews
1ef4fa9a0b Check 'deny name' + 'grant subdomain' for the same name
(cherry picked from commit a402ffbced)
2020-09-03 16:22:01 +10:00
Mark Andrews
91daae5c62 Increase zone load timeout in the "rndc" test
The "huge.zone" zone can take longer than 100 seconds to load when
running under a sanitizer.  Increase the relevant zone load timeout to
prevent intermittent failures of the "rndc" system test.

(cherry picked from commit fd08918df5)
2020-09-02 21:41:30 +00:00
Diego Fronza
55c0fa2bf6 Added test for the proposed fix
The test works as follows:

1. Client wants to resolve unusual ip6.arpa. name:

   test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN TXT

2. Query is sent to ns7, a qmin enabled resolver.

3. ns7 do the first stage in query minimization for the name and send a new
   query to root (ns1):

  _.1.0.0.2.ip6.arpa.        IN  A

4. ns1 delegates ip6.arpa. to ns2.good.:

    ;; AUTHORITY SECTION:
    ;ip6.arpa.      20  IN  NS  ns2.good.

    ;; ADDITIONAL SECTION:
    ;ns2.good.      20  IN  A   10.53.0.2

5. ns7 do a second round in minimizing the name and send a new query
   to ns2.good. (10.53.0.2):

   _.8.2.6.0.1.0.0.2.ip6.arpa.    IN  A

6. ans2 delegates 8.2.6.0.1.0.0.2.ip6.arpa. to ns3.good.:

    ;; AUTHORITY SECTION:
    ;8.2.6.0.1.0.0.2.ip6.arpa. 60   IN  NS  ns3.good.

    ;; ADDITIONAL SECTION:
    ;ns3.good.      60  IN  A   10.53.0.3

7. ns7 do a third round in minimizing the name and send a new query to
   ns3.good.:

    _.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN A

8. ans3 delegates 1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. to ns4.good.:

    ;; AUTHORITY SECTION:
    ;1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 60 IN    NS  ns4.good.

    ;; ADDITIONAL SECTION:
    ;ns4.good.      60  IN  A   10.53.0.4

9. ns7 do fourth round in minimizing the name and send a new query to
   ns4.good.:

	_.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa.    IN A

10. ns4.good. doesn't know such name, but answers stating it is authoritative for
    the domai:

	;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id:  53815
	...
	;; AUTHORITY SECTION:
	1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 60 IN    SOA ns4.good.  ...

11. ns7 do another minimization on name:
   _.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
   sends to ns4.good. and gets the same SOA response stated in item #10

12. ns7 do another minimization on name:
	_.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa
	sends to ns4.good. and gets the same SOA response stated in item #10.

13. ns7 do the last query minimization name for the ip6.arpa. QNAME.
	After all IPv6 labels are exausted the algorithm falls back to the
	original QNAME:
	test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa

    ns7 sends a new query with the original QNAME to ans4.

14. Finally ans4 answers with the expected response:
	;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:  40969
	;; flags: qr aa; QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
	;; OPT PSEUDOSECTION:
	; EDNS: version: 0, flags:; udp: 8192
	;; QUESTION SECTION:
	;test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. IN TXT

	;; ANSWER SECTION:
	;test1.test2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.9.0.9.4.1.1.1.1.8.2.6.0.1.0.0.2.ip6.arpa. 1    IN TXT "long_ip6_name"

(cherry picked from commit 11add69198)
2020-09-02 16:52:39 +02:00
Matthijs Mekking
4a7f87aa89 Log when CDS/CDNSKEY is published in zone.
Log when named decides to add a CDS/CDNSKEY record to the zone. Now
you understand how the bug was found that was fixed in the previous
commits.

(cherry picked from commit f9ef5120c1)
2020-09-02 14:59:20 +02:00
Matthijs Mekking
6405b04477 Fix CDS (non-)publication
The CDS/CDNSKEY record will be published when the DS is in the
rumoured state. However, with the introduction of the rndc '-checkds'
command, the logic in the keymgr was changed to prevent the DS
state to go in RUMOURED unless the specific command was given. Hence,
the CDS was never published before it was seen in the parent.

Initially I thought this was a policy approval rule, however it is
actually a DNSSEC timing rule. Remove the restriction from
'keymgr_policy_approval' and update the 'keymgr_transition_time'
function. When looking to move the DS state to OMNIPRESENT it will
no longer calculate the state from its last change, but from when
the DS was seen in the parent, "DS Publish". If the time was not set,
default to next key event of an hour.

Similarly for moving the DS state to HIDDEN, the time to wait will
be derived from the "DS Delete" time, not from when the DS state
last changed.

(cherry picked from commit c8205bfa0e)
2020-09-02 14:59:20 +02:00
Matthijs Mekking
7065299a9d Silence two grep calls
(cherry picked from commit 2d2b8e7c02)
2020-09-02 14:59:20 +02:00
Matthijs Mekking
94fe9f1fdf Update rndc_checkds test util
The 'rndc_checkds' utility now allows "now" as the time when the DS
has been seen in/seen removed from the parent.

Also it uses "KEYX" as the key argument, rather than key id.
The 'rndc_checkds' will retrieve the key from the "KEYX" string. This
makes the call a bit more readable.

(cherry picked from commit dd754a974c)
2020-09-02 14:59:20 +02:00
Matthijs Mekking
2a9e4fea5a Improve kasp test readability
This commit has a lot of updates on comments, mainly to make the
system test more readable.

Also remove some redundant signing policy checks (check_keys,
check_dnssecstatus, check_keytimes).

Finally, move key time checks and expected key time settings above
'rndc_checkds' calls (with the new way of testing next key event
times there is no need to do them after 'rndc_checkds', and moving
them above 'rndc_checkds' makes the flow of testing easier to follow.

(cherry picked from commit 8cb394e047)
2020-09-02 14:59:20 +02:00
Matthijs Mekking
a33c49a838 Add dnssec-settime [-P ds|-D ds] to kasp test
Add the new '-P ds' and '-D ds' calls to the kasp test setup so that
next key event times can reliably be tested.

(cherry picked from commit 4a67cdabfe)
2020-09-02 14:59:20 +02:00
Ondřej Surý
c978d3efdf Skip the large TCP assertion failure test in the CI environment 2020-09-02 13:11:10 +02:00
Ondřej Surý
d1643772eb Reorder the response reading in packet.pl to not fill TCP buffers 2020-09-02 12:46:43 +02:00
Mark Andrews
02be5fc953 Dump the returned packet 2020-09-02 12:46:43 +02:00
Ondřej Surý
92df4ba652 Multiply 1996-alloc_dnsbuf-crash-test.pkt by 300000 via TCP
The test for assertion failure via large TCP packet needs to be repeated
multiple times (we use 300000).  This commit fixes the input file to be
properly hexlified and uses the new packet.pl -r feature to send it
300000 times via TCP.

(cherry picked from commit 5f6eb014aa)
2020-09-02 12:46:43 +02:00
Ondřej Surý
4dc666d474 Add -r <repeats> option to packet.pl
For some tests, we need to send big data streams (for TCP) or repeated
packets (for UDP), this commits adds `-r` option to packet.pl that sends
the same input <repeats> times using the specified protocol.

(cherry picked from commit dd46559a19)
2020-09-02 12:46:43 +02:00
Ondřej Surý
677e569dda Properly format 2037-pk11_numbits-crash-test.pkt file
(cherry picked from commit 22e0272063)
2020-09-02 12:46:43 +02:00
Mark Andrews
0dc04cb901 dig +bufsize=0 failed to disable EDNS as a side effect. 2020-09-02 09:07:55 +00:00
Michał Kępień
665dcfae5f Fix the "forward" system test on Windows
Make the "forward" system test work on Windows again by fixing a
backporting glitch in commit 544ea41224.
2020-09-02 10:01:29 +02:00
Michał Kępień
894b7a8345 Use "-T maxcachesize=2097152" in all system tests
In order to lower the amount of memory allocated at startup by named
instances used in the BIND system test suite, set the default value of
"max-cache-size" for these to 2 megabytes.  The purpose of this change
is to prevent named instances (or even entire virtual machines) from
getting killed by the operating system on the test host due to excessive
memory use.

Remove all "max-cache-size" statements from named configuration files
used in system tests ("checkconf" notwithstanding) to prevent confusion
as the "-T maxcachesize=..." command line option takes precedence over
configuration files.

(cherry picked from commit dad6572093)
2020-08-31 23:42:38 +02:00
Ondřej Surý
9d3c6785b5 Add PoC for assertion failure on large TCP DNS messages
(cherry picked from commit 2c796bb9c8)
2020-08-31 13:38:17 +02:00
Evan Hunt
544ea41224 test whether DS chasing works correctly when forwarding
(cherry picked from commit dd8db89525)
2020-08-31 12:00:13 +02:00
Ondřej Surý
f195c192a6 Add PoC system test for pk11_numbits() assertion
(cherry picked from commit a69433ba40)
2020-08-31 10:58:56 +02:00
Mark Andrews
6acd6ae943 check that a malformed truncated response to a TSIG query is handled
(cherry picked from commit 8bbf3eb5f3)
2020-08-31 08:35:30 +02:00
Evan Hunt
1c7e3c8515 Merge tag 'v9_16_6' into v9_16
BIND 9.16.6
2020-08-20 12:08:57 -07:00
Matthijs Mekking
9b69ab4c38 Fix check next key event check in kasp test
Prevent intermittent false positives on slow platforms by subtracting
the number of seconds which passed between key creation and invoking
'rndc dnssec -checkds'.

This particularly fails for the step3.csk-roll2.autosign zone because
the closest next key event is when the zone signatures become
omnipresent. Running 'rndc dnssec -checkds' some time later means
that the next key event is in fact closer than the calculated time
and thus we need to adjust the expected time by the time already
passed.

(cherry picked from commit 262b52a154)
2020-08-13 12:07:08 +02:00
Michal Nowak
01119ac4f9 Make sure .txt files are not identified as crashed test
Previously .txt files with full backtrace may be identified as a
crashed test:

    I:Core dumps were found for the following system tests:
    I:	 core.19948-backtrace.txt
    I:   shutdown

Now .txt files are removed from the list.

Change 'run.sh.in' to match the core matching pattern in
'testsummary.sh'.

(cherry picked from commit c2dcd95966)
2020-08-12 09:52:07 +02:00
Matthijs Mekking
624f1b9531 rndc dnssec -checkds set algorithm
In the rare case that you have multiple keys acting as KSK and that
have the same keytag, you can now set the algorithm when calling
'-checkds'.

(cherry picked from commit 46fcd927e7)
2020-08-07 13:34:10 +02:00
Matthijs Mekking
ad0752bc22 Test 'rndc dnssec -checkds' on multiple zones
Make sure the 'checkds' command correctly sets the right key timing
metadata and also make sure that it rejects setting the key timing
metadata if there are multiple keys with the KSK role and no key
identifier is provided.

(cherry picked from commit a43bb41909)
2020-08-07 13:30:59 +02:00
Matthijs Mekking
4892006a92 Make 'parent-registration-delay' obsolete
With the introduction of 'checkds', the 'parent-registration-delay'
option becomes obsolete.

(cherry picked from commit a25f49f153)
2020-08-07 13:30:50 +02:00
Matthijs Mekking
0475646ebb Adjust kasp tests to use 'checkds'
With 'checkds' replacing 'parent-registration-delay', the kasp
test needs the expected times to be adjusted. Also the system test
needs to call 'rndc dnssec -checkds' to progress the rollovers.

Since we pretend that the KSK is active as soon as the DS is
submitted (and parent registration delay is no longer applicable)
we can simplify the 'csk_rollover_predecessor_keytimes' function
to take only one "addtime" parameter.

This commit also slightly changes the 'check_dnssecstatus' function,
passing the zone as a parameter.

(cherry picked from commit 38cb43bc86)
2020-08-07 13:30:40 +02:00
Mark Andrews
14aa0c5df6 Add a test for update-policy 'zonesub'
The new test checks that 'update-policy zonesub' is properly enforced.
2020-08-05 15:55:06 +02:00
Mark Andrews
5bf457e89a Add a test for update-policy 'subdomain'
The new test checks that 'update-policy subdomain' is properly enforced.
2020-08-05 15:55:06 +02:00
Michał Kępień
fec4664ab0 Set "max-cache-size" in the "geoip2" system test
The named configuration files used in the "geoip2" system test cause a
rather large number of views (6-8) to be set up in each tested named
instance.  Each view has its own cache.

Commit aa72c31422 caused the RBT hash
table to be pre-allocated to a size derived from "max-cache-size", so
that it never needs to be rehashed.  The size of that hash table is not
expected to be significant enough to cause memory use issues in typical
conditions even for large "max-cache-size" settings.

However, these two factors combined can cause memory exhaustion issues
in GitLab CI, where we run multiple "instances" of the test suite in
parallel on the same runner, each test suite executes multiple system
tests concurrently, and each system test may potentially start multiple
named instances at the same time.  In practice, this problem currently
only seems to be affecting the "geoip2" system test, which is failing
intermittently due to named instances used by that test getting killed
by oom-killer.

Prevent the "geoip2" system test from failing intermittently by setting
"max-cache-size" in named configuration files used in that test to a low
value in order to keep memory usage at bay even with a large number of
views configured.

(cherry picked from commit 4292d5bdfe)
2020-08-05 11:08:24 +02:00
Matthijs Mekking
f3103660d0 keyword 'primaries' is unknown in 9.16
In 9.17 we introduced 'primaries' as a synonym for 'masters' in the
configuration file. This synonym has not been backported so change
the serve-stale test to make use of the 'masters' keyword.
2020-08-05 09:09:16 +02:00
Matthijs Mekking
c92de6cb44 stale-cache-enable is enabled by default
Because this is a backport, the option should default to keep the
serve-stale caching enabled.
2020-08-05 09:09:16 +02:00
Ondřej Surý
c4e6ade0e5 Add tests with stale-cache-disabled into serve-stale system test
Add a fifth named (ns5) that runs with `stale-cache-enable no;` and
check that there are no stale records in the cache.

(cherry picked from commit abc2ab9223)
2020-08-05 09:09:16 +02:00
Ondřej Surý
b48e9ab201 Add stale-cache-enable option and disable serve-stable by default
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).

This allows DNS operators to turn the serve-stale answers in an event of
large authoritative DNS outage.  The caching of the stale answers needs
to be enabled before the outage happens or the feature would be
otherwise useless.

The negative consequence of the default setting is the inevitable
cache-bloat that happens for every and each DNS operator running named.

In this MR, a new configuration option `stale-cache-enable` is
introduced that allows the operators to selectively enable or disable
the serve-stale feature of BIND 9 based on their decision.

The newly introduced option has been disabled by default,
e.g. serve-stale is disabled in the default configuration and has to be
enabled if required.

(cherry picked from commit ce53db34d6)
2020-08-05 09:09:16 +02:00
Mark Andrews
20bc6aefff Check rcode is FORMERR
(cherry picked from commit 88ff6b846c)
2020-08-04 23:04:34 +10:00
Michał Kępień
e734651fbd Only run system tests as root in developer mode
Running system tests with root privileges is potentially dangerous.
Only allow it when explicitly requested (by building with
--enable-developer).

(cherry picked from commit 3ef106f69d)
2020-07-31 07:46:27 +02:00
Michal Nowak
0f319908f0 Remove cross-test dependency on ckdnsrps.sh 2020-07-30 16:25:23 +02:00
Michal Nowak
72a6b0dc6f Fix name of the test directory of stop.pl in masterformat test 2020-07-30 16:24:18 +02:00
Michal Nowak
24f5f68d7a Ensure test fails if packet.pl does not work as expected 2020-07-30 16:20:46 +02:00
Ondřej Surý
aa72c31422 Fix the rbt hashtable and grow it when setting max-cache-size
There were several problems with rbt hashtable implementation:

1. Our internal hashing function returns uint64_t value, but it was
   silently truncated to unsigned int in dns_name_hash() and
   dns_name_fullhash() functions.  As the SipHash 2-4 higher bits are
   more random, we need to use the upper half of the return value.

2. The hashtable implementation in rbt.c was using modulo to pick the
   slot number for the hash table.  This has several problems because
   modulo is: a) slow, b) oblivious to patterns in the input data.  This
   could lead to very uneven distribution of the hashed data in the
   hashtable.  Combined with the single-linked lists we use, it could
   really hog-down the lookup and removal of the nodes from the rbt
   tree[a].  The Fibonacci Hashing is much better fit for the hashtable
   function here.  For longer description, read "Fibonacci Hashing: The
   Optimization that the World Forgot"[b] or just look at the Linux
   kernel.  Also this will make Diego very happy :).

3. The hashtable would rehash every time the number of nodes in the rbt
   tree would exceed 3 * (hashtable size).  The overcommit will make the
   uneven distribution in the hashtable even worse, but the main problem
   lies in the rehashing - every time the database grows beyond the
   limit, each subsequent rehashing will be much slower.  The mitigation
   here is letting the rbt know how big the cache can grown and
   pre-allocate the hashtable to be big enough to actually never need to
   rehash.  This will consume more memory at the start, but since the
   size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
   will only consume maximum of 32GB of memory for hashtable in the
   worst case (and max-cache-size would need to be set to more than
   4TB).  Calling the dns_db_adjusthashsize() will also cap the maximum
   size of the hashtable to the pre-computed number of bits, so it won't
   try to consume more gigabytes of memory than available for the
   database.

   FIXME: What is the average size of the rbt node that gets hashed?  I
   chose the pagesize (4k) as initial value to precompute the size of
   the hashtable, but the value is based on feeling and not any real
   data.

For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.

Notes:
a. A doubly linked list should be used here to speedup the removal of
   the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/

(cherry picked from commit e24bc324b4)
2020-07-30 11:57:24 +02:00
Diego Fronza
1a101f223c Add test for RPZ wildcard passthru ignored fix 2020-07-27 17:17:02 -03:00
Mark Andrews
b0942c2442 Check walking the hip rendezvous servers.
Also fixes extraneous white space at end of record when
there are no rendezvous servers.

(cherry picked from commit 78db46d746)
2020-07-24 15:24:49 +10:00
Michal Nowak
9509af7008 Check tests for core files regardless of test status
Failed test should be checked for core files et al. and have
backtrace generated.
2020-07-20 13:09:06 +02:00
Michal Nowak
ace988990a Rationalize backtrace logging
GDB backtrace generated via "thread apply all bt full" is too long for
standard output, lets save them to .txt file among other log files.
2020-07-20 12:48:29 +02:00
Michal Nowak
c2bbe11349 Fold stop_servers_failed() to stop_servers() 2020-07-20 12:48:11 +02:00
Mark Andrews
90154d203b Add regression test for [GL !3735]
Check that resign interval is actually in days rather than hours
by checking that RRSIGs are all within the allowed day range.

(cherry picked from commit 11ecf7901b)
2020-07-14 12:11:42 +10:00
Mark Andrews
7e62d76b6b Don't verify the zone when setting expire to "now+1s" as it can fail
as too much wall clock time may have elapsed.

Also capture signzone output for forensic analysis

(cherry picked from commit a0e8a11cc6)
2020-07-13 12:42:46 +10:00
Ondřej Surý
b9b1366bf0 Add prereq.sh script to the shutdown system test
The shutdown test requires python, pytest and dnspython.
2020-07-03 08:54:01 +02:00