Commit Graph

10269 Commits

Author SHA1 Message Date
Matthijs Mekking
4c337a8e72 Add missing VERIFY export
This makes the 'dnssec-verify' tool visible to the test environment.
2021-06-30 17:28:48 +02:00
Matthijs Mekking
71d5932a14 Slightly improved dnssec tools fatal message
Return the offending key state identifier.
2021-06-30 17:28:48 +02:00
Matthijs Mekking
40331a20c4 Add helpful function 'dns_zone_getdnsseckeys'
This code gathers DNSSEC keys from key files and from the DNSKEY RRset.
It is used for the 'rndc dnssec -status' command, but will also be
needed for "checkds". Turn it into a function.
2021-06-30 17:28:48 +02:00
Matthijs Mekking
2872d6a12e Add "parental-source[-v6]" config option
Similar to "notify-source" and "transfer-source", add options to
set the source address when querying parental agents for DS records.
2021-06-30 17:28:48 +02:00
Matthijs Mekking
6f92d4b9a5 Parse "parental-agents" configuration
Parse the new "parental-agents" configuration and store it in the zone
structure.
2021-06-30 17:28:48 +02:00
Matthijs Mekking
6040c71478 Make "primaries" config parsing generic
Make the code to parse "primaries" configuration more generic so
it can be reused for "parental-agents".
2021-06-30 17:28:48 +02:00
Matthijs Mekking
8327cb7839 Remove stray "setup zone" in kasp system setup 2021-06-30 17:28:48 +02:00
Matthijs Mekking
56262db9cd Add checkds system test
Add a Pytest based system test for the 'checkds' feature. There is
one nameserver (ns9, because it should be started the latest) that
has configured several zones with dnssec-policy. The zones are set
in such a state that they are waiting for DS publication or DS
withdrawal.

Then several other name servers act as parent servers that either have
the DS for these published, or not. Also one server in the mix is
to test a badly configured parental-agent.

There are tests for DS publication, DS publication error handling,
DS withdrawal and DS withdrawal error handling.

The tests ensures that the zone is DNSSEC valid, and that the
DSPublish/DSRemoved key metadata is set (or not in case of the error
handling).

It does not test if the rollover continues, this is already tested in
the kasp system test (that uses 'rndc -dnssec checkds' to set the
DSPublish/DSRemoved key metadata).
2021-06-30 17:28:48 +02:00
Matthijs Mekking
1e763e582b Check parental-agents config
Add checks for "parental-agents" configuration, checking for the option
being at wrong type of zone (only allowed for primaries and
secondaries), duplicate definitions, duplicate references, and
undefined parental clauses (the name referenced in the zone clause
does not have a matching "parental-agent" clause).
2021-06-30 17:28:48 +02:00
Matthijs Mekking
0311705d4b Add parental-agents configuration
Introduce a way to configure parental agents that can be used to
query DS records to be used in automatic key rollovers.
2021-06-30 17:28:47 +02:00
Matthijs Mekking
39a961112f Change primaries objects to remote-servers
Change the primaries configuration objects to the more generic
remote-servers, that we can reuse for other purposes (such as
parental-agents).
2021-06-30 17:21:11 +02:00
Petr Špaček
9290d9752d fix tcp-send-buffer, udp-receive-buffer, udp-send-buffer limits 2021-06-28 11:16:00 +02:00
Matthijs Mekking
10055d44e3 Fix setnsec3param hang on shutdown
When performing the 'setnsec3param' task, zones that are not loaded will have
their task rescheduled. We should do this only if the zone load is still
pending, this prevents zones that failed to load get stuck in a busy wait and
causing a hang on shutdown.
2021-06-28 10:35:34 +02:00
Matthijs Mekking
3631a23c7f Add configuration that causes setnsec3param hang
Add a zone to the configuration file that uses NSEC3 with dnssec-policy
and fails to load. This will cause setnsec3param to go into a busy wait
and will cause a hang on shutdown.
2021-06-28 10:34:19 +02:00
Ondřej Surý
e59a359929 Move the include Makefile.tests to the bottom of Makefile.am(s)
The Makefile.tests was modifying global AM_CFLAGS and LDADD and could
accidentally pull /usr/include to be listed before the internal
libraries, which is known to cause problems if the headers from the
previous version of BIND 9 has been installed on the build machine.
2021-06-24 15:33:52 +02:00
Matthijs Mekking
75ec7d1d9f Fix checkconf dnssec-policy inheritance bug
Similar to #2778, the check for 'dnssec-policy' failed to account for
it being inheritable.
2021-06-24 09:31:59 +02:00
Evan Hunt
d02210607d add test for server failover on REFUSED
- add an 'nsupdate -C' option to override resolv.conf file for nsupdate
- set resolv.conf to use two test servers, the first one of which will
  return REFUSED for a query for 'example'.
2021-06-23 09:00:29 -07:00
Evan Hunt
2100331307 nsupdate: try next server on REFUSED
when nsupdate sends an SOA query to a resolver, if it fails
with REFUSED, nsupdate will now try the next server rather than
aborting the update completely.
2021-06-23 09:00:29 -07:00
Matthijs Mekking
9bd6c96b78 Add more test cases for #2778
Add three more test cases that detect a configuration error if the
key-directory is inherited but has the same value for a zone in a
different view with a deviating DNSSEC policy.
2021-06-23 17:28:06 +02:00
Matthijs Mekking
05e73a24f0 Bump wait time in servestale test with 1 second
This check intermittently failed:

I:serve-stale:check not in cache longttl.example times out...
I:serve-stale:failed

This corresponds to this query in the test:

$DIG -p ${PORT} +tries=1 +timeout=3  @10.53.0.3 longttl.example TXT

Looking at the dig output for a failed test, the query actually got a
response from the authoritative server (in one specific example the
query time was 2991 msec, close to 3 seconds).

After doing the query for the test, we enable the authoritative
server after a sleep of three seconds. If we bump this sleep to 4
seconds, the race will be more in favor of the query timing out,
making it unlikely that this test will fail intermittently.

Bump the subsequent wait_for_log checks also with one second.
2021-06-23 13:09:59 +00:00
Ondřej Surý
0d35b3f1a9 Don't set locale globally, just use it when needed
Previously, we would set the locale on a global level and that could
possibly lead to different behaviour in underlying functions.  In this
commit, we change to code to use the system locale only when calling the
libidn2 functions and reset the locale back to "POSIX" when exiting the
libidn2 code.
2021-06-23 11:12:00 +02:00
Artem Boldariev
ef9f09252c System tests to check named behaviour for unexpected opcodes
This commit adds a set of tests to verify that BIND will not crash
when some opcodes are sent over DoT or DoH, leading to marking network
handle in question as sequential.
2021-06-22 17:21:44 +03:00
Michał Kępień
86698ded32 Hardcode "max-cache-size" for the "_bind" view
The built-in "_bind" view does not allow recursion and therefore does
not need a large cache database.  However, as "max-cache-size" is not
explicitly set for that view in the default configuration, it inherits
that setting from global options.  Set "max-cache-size" for the built-in
"_bind" view to a fixed value (2 MB, i.e. the smallest allowed value) to
prevent needlessly preallocating memory for its cache RBT hash table.
2021-06-22 15:28:31 +02:00
Michał Kępień
86541b39d3 Use minimal-sized caches for non-recursive views
Currently the implicit default for the "max-cache-size" option is "90%".
As this option is inherited by all configured views, using multiple
views can lead to memory exhaustion over time due to overcommitment.
The "max-cache-size 90%;" default also causes cache RBT hash tables to
be preallocated for every configured view, which does not really make
sense for views which do not allow recursion.

To limit this problem's potential for causing operational issues, use a
minimal-sized cache for views which do not allow recursion and do not
have "max-cache-size" explicitly set (either in global configuration or
in view configuration).

For configurations which include multiple views allowing recursion,
adjusting "max-cache-size" appropriately is still left to the operator.
2021-06-22 15:28:31 +02:00
Matthijs Mekking
acd83881ff Add test case for in-view with dnssec-policy
Add a test case for a zone that uses 'in-view' and 'dnssec-policy'.
BIND should not deadlock.
2021-06-21 16:03:35 +02:00
Mark Andrews
d1e283ede1 Checking of key-directory and dnssec-policy was broken
the checks failed to account for key-directory being inheritable.
2021-06-18 16:46:02 +10:00
Mark Andrews
c65dc2f7dc Check wild card expansions by code point 2021-06-18 15:51:36 +10:00
Michał Kępień
ac4c58e8ce Increase timeout in the rndc deadlock test
The timeout originally picked for "rndc status" invocations (2 seconds)
in the test attempting to reproduce a deadlock caused by running
multiple "rndc addzone", "rndc modzone", and "rndc delzone" commands
concurrently causes intermittent failures of the "addzone" system test
in GitLab CI.  Increase the timeout to 10 seconds to make such failures
less probable.  Adjust code comments accordingly.
2021-06-17 12:39:32 +02:00
Mark Andrews
47ca495108 make it clear algorithm field is a domain name 2021-06-16 05:26:00 +00:00
Artem Boldariev
b84fa122ce Make BIND refuse to serve XFRs over DoH
We cannot use DoH for zone transfers.  According to RFC8484 a DoH
request contains exactly one DNS message (see Section 6: Definition of
the "application/dns-message" Media Type,
https://datatracker.ietf.org/doc/html/rfc8484#section-6).  This makes
DoH unsuitable for zone transfers as often (and usually!) these need
more than one DNS message, especially for larger zones.

As zone transfers over DoH are not (yet) standardised, nor discussed
in RFC8484, the best thing we can do is to return "not implemented."

Technically DoH can be used to transfer small zones which fit in one
message, but that is not enough for the generic case.

Also, this commit makes the server-side DoH code ensure that no
multiple responses could be attempted to be sent over one HTTP/2
stream. In HTTP/2 one stream is mapped to one request/response
transaction. Now the write callback will be called with failure error
code in such a case.
2021-06-14 11:37:36 +03:00
Artem Boldariev
5b507c1136 Fix BIND to serve large HTTP responses
This commit makes NM code to report HTTP as a stream protocol. This
makes it possible to handle large responses properly. Like:

dig +https @127.0.0.1 A cmts1-dhcp.longlines.com
2021-06-14 11:37:17 +03:00
Michał Kępień
26ec4b9a89 Add AUTHORITY tests for CNAME-sourced delegations
Add a set of system tests which check the contents of the AUTHORITY
section for signed, insecure delegation responses constructed from CNAME
records and wildcards, both for zones using NSEC and NSEC3.
2021-06-10 10:13:23 +02:00
Ondřej Surý
440fb3d225 Completely remove BIND 9 Windows support
The Windows support has been completely removed from the source tree
and BIND 9 now no longer supports native compilation on Windows.

We might consider reviewing mingw-w64 port if contributed by external
party, but no development efforts will be put into making BIND 9 compile
and run on Windows again.
2021-06-09 14:35:14 +02:00
Matthijs Mekking
08a9e7add1 Add test for NSEC3PARAM not changed after restart
Add a test case where 'named' is restarted and ensure that an already
signed zone does not change its NSEC3 parameters.

The test case first tests the current zone and saves the used salt
value. Then after restart it checks if the salt (and other parameters)
are the same as before the restart.

This test case changes 'set_nsec3param'. This will now reset the salt
value, and when checking for NSEC3PARAM we will store the salt and
use it when testing the NXDOMAIN response. This does mean that for
every test case we now have to call 'set_nsec3param' explicitly (and
can not omit it because it is the same as the previous zone).

Finally, slightly changed some echo output to make debugging friendlier.
2021-06-09 09:14:09 +02:00
Mark Andrews
af95cb8ccc Address test race condition in serve-stale
the dig.out.test# files could still be being written when the
content greps where being made.
2021-06-03 18:20:14 +10:00
Mark Andrews
02726cb66e Add timeout to url get requests
to prevent the system test taking forever on failures.
2021-06-02 22:18:21 +00:00
Mark Andrews
cbdea694e8 Check DNAME resolution via itself 2021-06-02 14:20:35 +02:00
Mark Andrews
5547003a3d Add a system test checking a malformed IXFR
Make sure an incoming IXFR containing an SOA record which is not placed
at the apex of the transferred zone does not result in a broken version
of the zone being served by named and/or a subsequent crash.
2021-06-02 13:15:25 +02:00
Ondřej Surý
8a5c62de83 Refactor zone dumping code to use netmgr async threadpools
Previously, dumping the zones to the files were quantized, so it doesn't
slow down network IO processing.  With the introduction of network
manager asynchronous threadpools, we can move the IO intensive work to
use that API and we don't have to quantize the work anymore as it the
file IO won't block anything except other zone dumping processes.
2021-05-31 14:52:05 +02:00
Evan Hunt
8c047feb3a add a system test for the prefetch bug
Ensure that if prefetch is triggered as a result of a query
restart, it won't have the TRYSTALE_ONTIMEOUT flag set.
2021-05-30 00:04:01 -07:00
Matthijs Mekking
c64589bf46 Test with stale timeout cache miss, then fetch completes
Add a test case where a client request is received and the stale
timeout occurs, but it is not served stale data because there is no entry
in the cache, then is served an authoritative answer once the background
fetch completes. This ensures that a stale timeout only affects a
subsequent response if the client was answered.
2021-05-27 10:35:48 -07:00
Evan Hunt
453e905d7e add a test of DNS64 processing with a stale negative response
- send a query for an AAAA which will be resolved as a mapped A
- disable authoritative responses
- wait for the negative AAAA response to become stale
- send another query, wait for the stale answer
- re-enable authorative responses so that a real answer arrives
- currently, this triggers an assertion in query.c
2021-05-27 10:33:31 -07:00
Diego Fronza
b19cd2d83b Handling NoNameservers exception
In the shutdown system test multiple queries are sent to a resolver
instance, in the meantime we terminate the same resolver process for
which the queries were sent to, either via rndc stop or a SIGTERM
signal, that means the resolver may not be able to answer all those
queries, since it has initiated the shutdown process.

The dnspython library raises a dns.resolver.NoNameservers exception when
a resolver object fails to receive an answer from the specified list
of nameservers (resolver.nameservers list), we need to handle this
exception as this is something that may happen since we asked the
resolver to terminate, as a result it may not answer clients even if
an answer is available, as the operation will be canceled.
2021-05-27 12:37:49 +10:00
Mark Andrews
715a2c7fc1 Add missing initialisations
configuring with --enable-mutex-atomics flagged these incorrectly
initialised variables on systems where pthread_mutex_init doesn't
just zero out the structure.
2021-05-26 08:15:08 +00:00
Ondřej Surý
50270de8a0 Refactor the interface handling in the netmgr
The isc_nmiface_t type was holding just a single isc_sockaddr_t,
so we got rid of the datatype and use plain isc_sockaddr_t in place
where isc_nmiface_t was used before.  This means less type-casting and
shorter path to access isc_sockaddr_t members.

At the same time, instead of keeping the reference to the isc_sockaddr_t
that was passed to us when we start listening, we will keep a local
copy. This prevents the data race on destruction of the ns_interface_t
objects where pending nmsockets could reference the sockaddr of already
destroyed ns_interface_t object.
2021-05-26 09:43:12 +02:00
Mark Andrews
68d203ff1c Check that IXFR delta size is correct 2021-05-25 22:27:54 +10:00
Ondřej Surý
28b65d8256 Reduce the number of clientmgr objects created
Previously, as a way of reducing the contention between threads a
clientmgr object would be created for each interface/IP address.

We tasks being more strictly bound to netmgr workers, this is no longer
needed and we can just create clientmgr object per worker queue (ncpus).

Each clientmgr object than would have a single task and single memory
context.
2021-05-24 20:44:54 +02:00
Evan Hunt
3ed35b3035 extend rndc timeout to 60 seconds
the idle timeout for rndc connections was set to 10 seconds, but this
caused intermittent system failures of the 'rndc' system test on slow
platforms, since 'rndc reconfig' could time out before reconfiguration
was complete.

this commit restores the original timeout value of 60 seconds, which was
changed inadvertently after rndc was updated to use the network manager.

even with this change, however, the test can still time out under
TSAN because loading the huge zone can take a very long time (upwards
of two minutes). so the test is modified here to generate a smaller zone
file when running under TSAN.
2021-05-22 01:11:31 -07:00
Evan Hunt
b0aadaac8e rename dns_name_copynf() to dns_name_copy()
dns_name_copy() is now the standard name-copying function.
2021-05-22 00:37:27 -07:00
Evan Hunt
b1fe1b8ae3 remove the remaining uses of dns_name_copy()
dns_name_copy() has been replaced nearly everywhere with
dns_name_copynf().  this commit changes the last two uses of
the original function.  afterward, we can remove the old
dns_name_copy() implementation, and replace it with _copynf().
2021-05-22 00:22:32 -07:00