Commit Graph

4717 Commits

Author SHA1 Message Date
Matthijs Mekking
70fbbedc24 Adjust serve-stale test
The number of queries to use in the burst can be reduced, as we have
a very low fetch limit of 1.

The dig command in 'wait_for_fetchlimits()' should time out sooner as
we expect a SERVFAIL to be returned promptly.

Enabling serve-stale can be done before hitting fetch-limits. This
reduces the chance that the resolver queries time out and fetch count
is reset. The chance of that happening is already slim because
'resolver-query-timeout' is 10 seconds, but better to first let the
data become stale rather than doing that while attempting to resolve.

(cherry picked from commit 00f575e7ef)
2021-02-08 16:10:12 +01:00
Matthijs Mekking
2e28e5587e Add test for serve-stale /w fetch-limits
Add a test case when fetch-limits are reached and we have stale data
in cache.

This test starts with a positive answer for 'data.example/TXT' in
cache.

1. Reload named.conf to set fetch limits.
2. Disable responses from the authoritative server.
3. Now send a batch of queries to the resolver, until hitting the
   fetch limits. We can detect this by looking at the response RCODE,
   at some point we will see SERVFAIL responses.
4. At that point we will turn on serve-stale.
5. Clients should see stale answers now.
6. An incoming query should not set the stale-refresh-time window,
   so a following query should still get a stale answer because of a
   resolver failure (and not because it was in the stale-refresh-time
   window).

(cherry picked from commit 11b74fc176)
2021-02-08 16:07:43 +01:00
Mark Andrews
a900d79ea8 Cleanup redundant isc_rwlock_init() result checks
(cherry picked from commit 3b11bacbb7)
2021-02-08 15:13:49 +11:00
Mark Andrews
7976a7264a Check that A record is accepted with _spf label present
(cherry picked from commit a3b2b86e7f)
2021-02-03 16:26:32 +01:00
Matthijs Mekking
76f7f598b3 Add kasp test todo for [#2375]
This bugfix has been manually verified but is missing a unit test.
Created GL #2471 to track this.

(cherry picked from commit 189f5a3f28)
2021-02-03 15:48:29 +01:00
Matthijs Mekking
314accf7ef Update legacy-keys kasp test
The 'legacy-keys.kasp' test checks that a zone with key files but not
yet state files is signed correctly. This test is expanded to cover
the case where old key files still exist in the key directory. This
covers bug #2406 where keys with the "Delete" timing metadata are
picked up by the keymgr as active keys.

Fix the 'legacy-keys.kasp' test, by creating the right key files
(for zone 'legacy-keys.kasp', not 'legacy,kasp').

Use a unique policy for this zone, using shorter lifetimes.

Create two more keys for the zone, and use 'dnssec-settime' to set
the timing metadata in the past, long enough ago so that the keys
should not be considered by the keymgr.

Update the 'key_unused()' test function, and consider keys with
their "Delete" timing metadata in the past as unused.

Extend the test to ensure that the keys to be used are not the old
predecessor keys (with their "Delete" timing metadata in the past).

Update the test so that the checks performed are consistent with the
newly configured policy.

(cherry picked from commit d4b2b7072d)
2021-02-03 08:41:00 +01:00
Michal Nowak
7aa33bf5e4 Add README.md file to rsabigexponent system test
This README.md describes why is bigkey needed.

(cherry picked from commit a247f24dfa)
2021-01-29 15:54:07 +01:00
Matthijs Mekking
dba01187e3 Rewrap comments to 80 char width serve-stale test
(cherry picked from commit d8c6655d7d)
2021-01-29 10:43:51 +01:00
Diego Fronza
bea63000db Add system tests for stale-answer-client-timeout
This commit add 4 tests for the new option:
	1. Test default configuration of stale-answer-client-timeout, a
	   value of 1.8 seconds, with stale-refresh-time disabled.

	2. Test disabling of stale-answer-client-timeout.

	3. Test stale-answer-client-timeout with a value of zero, in this
	   case we take advantage of a log entry which shows that a stale
	   answer was promptly used before an attempt to refresh the RRset
	   is made. We also check, by activating a disabled authoritative
	   server, that the RRset was successfully refreshed after that.

	4. Test stale-answer-client-timeout 0 with stale-refresh-time 4, in
	   this test we want to ensure a couple things:

	   - If we have a stale RRSet entry in cache, a request must be
		 promptly answered with this data, while BIND must also attempt
		 to refresh the RRSet in background.

	   - If the attempt to refresh the RRSet times out, the RRSet must
		 have its stale-refresh-time window activated.

	   - If a new request for the same RRSet arrives, it must be
		 promptly answered with stale data due to stale-refresh-time
		 being active for this RRSet, in this case no attempt to refresh
		 the RRSet is made.

	   - Enable authoritative server, ensure that the RRSet was not
		 refreshed, to honor stale-refresh-time.

	   - Wait for stale-refresh-window time pass, send another request
		 for the same RRSet, this time we expect the answer to be the
		 stale entry in cache being hit due to
		 stale-answer-client-timeout 0.

	    - Send another request, this time we expect the answer to be an
		  active RRSet, since it must have been refreshed during the
		  previous request.

(cherry picked from commit 35fd039d03)
2021-01-29 10:39:20 +01:00
Diego Fronza
93ca968644 Adjusted serve-stale test
After the addition of stale-answer-client-timeout a test was broken due
to the following behavior expected by the test.

1. Prime cache data.example txt.
2. Disable authoritative server.
3. Send a query for data.example txt.
4. Recursive server will timeout and answer from cache with stale RRset.
5. Recursive server will activate stale-refresh-time due to the previous
   failure in attempting to refresh the RRset.
6. Send a query for data.example txt.
7. Expect stale answer from cache due to stale-refresh-time
window being active, even if authoritative server is up.

Problem is that in step 4, due to the new option
stale-answer-client-timeout, recursive server will answer with stale
data before the actual fetch completes.

Since the original fetch is still running in background, if we re-enable
the authoritative server during that time, the RRset will actually be
successfully refreshed, and stale-refresh-window will not be activated.

The next queries will fail because they expect the TTL of the RRset to
match the one in the stale cache, not the one just refreshed.

To solve this, we explicitly disable stale-answer-client-timeout for
this test, as it's not the feature we are interested in testing here
anyways.

(cherry picked from commit a12bf4b61b)
2021-01-29 10:38:47 +01:00
Mark Andrews
abb2eb9d97 Add a named acl example
(cherry picked from commit 3dee62cfa5)
2021-01-28 13:43:47 +11:00
Mark Andrews
a30675998b Check that 'nsupdate -y' works for all HMAC algorithms
(cherry picked from commit 4b01ba44ea)
2021-01-28 12:57:03 +11:00
Mark Andrews
52b73db20b Check 'rndc retransfer' of primary error message
(cherry picked from commit 8f36b8567a)
2021-01-28 09:44:26 +11:00
Evan Hunt
62202b0e6d prevent ixfr/ns1 being removed 2021-01-26 12:38:32 +01:00
Evan Hunt
9529d1ed0d add a system test for AXFR fallback when max-ixfr-ratio is exceeded
also cleaned up the ixfr system test:

- use retry_quiet when applicable
- use scripts to generate test zones
- improve consistency
2021-01-26 12:38:32 +01:00
Evan Hunt
57aadd6cea add syntax and setter/getter functions to configure max-ixfr-ratio 2021-01-26 12:38:32 +01:00
Evan Hunt
0a1e1ead94 check whether taskset works before running cpu test
the taskset command used for the cpu system test seems
to be failing under vmware, causing a test failure. we
can try the taskset command and skip the test if it doesn't
work.

(cherry picked from commit a8a49bb783)
2021-01-20 15:44:31 -08:00
Matthijs Mekking
f77ec3cf58 Update serve-stale system test with new defaults
(cherry picked from commit 3be65246f8)
2021-01-15 10:38:45 +01:00
Evan Hunt
7b2880d191 further tidying of primary/secondary terminology in system tests
this changes most visble uses of master/slave terminology in tests.sh
and most uses of 'type master' or 'type slave' in named.conf files.
files in the checkconf test were not updated in order to confirm that
the old syntax still works. rpzrecurse was also left mostly unchanged
to avoid interference with DNSRPS.

(cherry picked from commit e43b3c1fa1)
2021-01-12 15:21:14 +01:00
Evan Hunt
1a32a4d001 prevent "primaries" lists from having duplicate names
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.

(cherry picked from commit f619708bbf)
2021-01-12 15:21:14 +01:00
Evan Hunt
746aa2581c add "primary-only" as a synonym for "master-only"
update the "notify" option to use RFC 8499 terminology as well.

(cherry picked from commit 424a3cf3cc)
2021-01-12 15:21:14 +01:00
Evan Hunt
04b9cdb53c add "primaries" as a synonym for "masters" in named.conf
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.

added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.

(cherry picked from commit 16e14353b1)
2021-01-12 15:21:14 +01:00
Matthijs Mekking
e4f4977c1e Fix a quirky mkeys test failure
The mkeys system test started to fail after introducing support for
zones transitioning to unsigned without going bogus. This is because
there was actually a bug in the code: if you reconfigure a zone and
remove the "auto-dnssec" option, the zone is actually still DNSSEC
maintained. This is because in zoneconf.c there is no call
to 'dns_zone_setkeyopt()' if the configuration option is not used
(cfg_map_get(zoptions, "auto-dnssec", &obj) will return an error).

The mkeys system test implicitly relied on this bug: initially the
root zone is being DNSSEC maintained, then at some point it needs to
reset the root zone in order to prepare for some tests with bad
signatures. Because it needs to inject a bad signature, 'auto-dnssec'
is removed from the configuration.

The test pass but for the wrong reasons:

I:mkeys:reset the root server
I:mkeys:reinitialize trust anchors
I:mkeys:check positive validation (18)

The 'check positive validation' test works because the zone is still
DNSSEC maintained: The DNSSEC records in the signed root zone file on
disk are being ignored.

After fixing the bug/introducing graceful transition to insecure,
the root zone is no longer DNSSEC maintained after the reconfig.

The zone now explicitly needs to be reloaded because otherwise the
'check positive validation' test works against an old version of the
zone (the one with all the revoked keys), and the test will obviously
fail.

(cherry picked from commit 2fc42b598b)
2020-12-23 11:57:03 +01:00
Matthijs Mekking
cf0439cd5f Treat dnssec-policy "none" as a builtin zone
Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig'
api so that the 'name' will determine what policy needs to be
configured.

When transitioning a zone from secure to insecure, there will be
cases when a zone with no DNSSEC policy (dnssec-policy none) should
be using KASP. When there are key state files available, this is an
indication that the zone once was DNSSEC signed but is reconfigured
to become insecure.

If we would not run the keymgr, named would abruptly remove the
DNSSEC records from the zone, making the zone bogus. Therefore,
change the code such that a zone will use kasp if there is a valid
dnssec-policy configured, or if there are state files available.

(cherry picked from commit cf420b2af0)
2020-12-23 11:56:33 +01:00
Matthijs Mekking
63b72ad5e9 Small adjustments to kasp rndc_checkds function
Slightly better test output, and only call 'load keys' if the
'rndc checkds' call succeeded.

(cherry picked from commit 756674f6d1)
2020-12-23 11:56:16 +01:00
Matthijs Mekking
c3d2843915 Add tests for going from secure to insecure
Add two test zones that will be reconfigured to go insecure, by
setting the 'dnssec-policy' option to 'none'.

One zone was using inline-signing (implicitly through dnssec-policy),
the other is a dynamic zone.

Two tweaks to the kasp system test are required: we need to set
when to except the CDS/CDS Delete Records, and we need to know
when we are dealing with a dynamic zone (because the logs to look for
are slightly different, inline-signing prints "(signed)" after the
zone name, dynamic zones do not).

(cherry picked from commit fa2e4e66b0)
2020-12-23 11:56:07 +01:00
Matthijs Mekking
ba75744331 Add test for cpu affinity
Add a test to check BIND 9 honors CPU affinity mask. This requires
some changes to the start script, to construct the named command.

(cherry picked from commit f1a097964c)
2020-12-23 09:25:48 +11:00
Ondřej Surý
7fc62f829d Add libssl libraries to Windows build
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
2020-12-09 10:46:16 +01:00
Ondřej Surý
a35a666a7c Reformat sources using clang-format-11
(cherry picked from commit 7ba18870dc)
2020-12-08 19:34:05 +01:00
Mark Andrews
2898f530cd Check that missing cookies are handled
(cherry picked from commit bd9155590e)
2020-11-27 08:44:00 +11:00
Michal Nowak
f48b221ffa Write traceback file to the same directory as core file
The traceback files could overwrite each other on systems which do not
use different core dump file names for different processes.  Prevent
that by writing the traceback file to the same directory as the core
dump file.

These changes still do not prevent the operating system from overwriting
a core dump file if the same binary crashes multiple times in the same
directory and core dump files are named identically for different
processes.

(cherry picked from commit 6428fc26af)
2020-11-26 18:29:41 +01:00
Mark Andrews
a50ce1bd43 Unify whitespace in bin/tests/system/run.sh
Replace tabs with spaces to make whitespace consistent across the entire
bin/tests/system/run.sh script.

(cherry picked from commit 0f0a006c7e)
2020-11-26 18:29:33 +01:00
Matthijs Mekking
52d3bf5f31 Change nsec3param salt config to saltlen
Upon request from Mark, change the configuration of salt to salt
length.

Introduce a new function 'dns_zone_checknsec3aram' that can be used
upon reconfiguration to check if the existing NSEC3 parameters are
in sync with the configuration. If a salt is used that matches the
configured salt length, don't change the NSEC3 parameters.

(cherry picked from commit 6f97bb6b1f)
2020-11-26 14:15:04 +00:00
Matthijs Mekking
d35dab3db8 Add check for NSEC3 and key algorithms
NSEC3 is not backwards compatible with key algorithms that existed
before the RFC 5155 specification was published.

(cherry picked from commit 00c5dabea3)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
00d7cc5144 Disable one nsec3 test due to GL #2216
This known bug makes the test fail. There is no trivial fix so disable
test case for now.

(cherry picked from commit f10790b02d)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
cf79e6ccc1 Add some NSEC3 optout tests
Make sure that just changing the optout value recreates the chain.

(cherry picked from commit a5b45bdd03)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
2a1793a2be Check nsec3param configuration values
Check 'nsec3param' configuration for the number of iterations.  The
maximum number of iterations that are allowed are based on the key
size (see https://tools.ietf.org/html/rfc5155#section-10.3).

Check 'nsec3param' configuration for correct salt. If the string is
not "-" or hex-based, this is a bad salt.

(cherry picked from commit 7039c5f805)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
008e84e965 Support for NSEC3 in dnssec-policy
Implement support for NSEC3 in dnssec-policy.  Store the configuration
in kasp objects. When configuring a zone, call 'dns_zone_setnsec3param'
to queue an nsec3param event. This will ensure that any previous
chains will be removed and a chain according to the dnssec-policy is
created.

Add tests for dnssec-policy zones that uses the new 'nsec3param'
option, as well as changing to new values, changing to NSEC, and
changing from NSEC.

(cherry picked from commit 114af58ee2)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
5dfd3b2d7b Add kasp nsec3param configuration
Add configuration and documentation on how to enable NSEC3 when
using dnssec-policy for signing your zones.

(cherry picked from commit f7ca96c805)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
259db79579 Fix syntax in echo_i messages
It's either "record returns" or "records return".

(cherry picked from commit 53188daf5b)
2020-11-20 10:43:21 +11:00
Mark Andrews
b3d259107f Fix DNAME when QTYPE is CNAME or ANY
The synthesised CNAME is not supposed to be followed when the
QTYPE is CNAME or ANY as the lookup is satisfied by the CNAME
record.

(cherry picked from commit e980affba0)
2020-11-19 10:52:29 +11:00
Diego Fronza
f321d95464 Adjusted test to match new rndc serve-stale status output 2020-11-11 16:06:36 -03:00
Matthijs Mekking
4d52ddbd15 Add two more system tests for stale-refresh-time
Add one test that checks the behavior when serve-stale is enabled
via configuration (as opposed to enabled via rndc).

Add one test that checks the behavior when stale-refresh-time is
disabled (set to 0).
2020-11-11 16:06:16 -03:00
Matthijs Mekking
276c912953 Change serve-stale test stale-answer-ttl
Using a 'stale-answer-ttl' the same value as the authoritative ttl
value makes it hard to differentiate between a response from the
stale cache and a response from the authoritative server.

Change the stale-answer-ttl from 2 to 4, so that it differs from the
authoritative ttl.
2020-11-11 16:06:07 -03:00
Diego Fronza
8383621ba8 Wait for multiple parallel dig commands to fully finish
The strategy of running many dig commands in parallel and
waiting for the respective output files to be non empty was
resulting in random test failures, hard to reproduce, where
it was possible that the subsequent reading of the files could
have been failing due to the file's content not being fully flushed.

Instead of checking if output files are non empty, we now wait
for the dig processes to finish.
2020-11-11 16:05:30 -03:00
Diego Fronza
0123a0f44f Added system test for stale-refresh-time
This test works as follow:
- Query for data.example rrset.
- Sleep until its TTL expires (2 secs).
- Disable authoritative server.
- Query for data.example again.
- Since server is down, answer come from stale cache, which has
  a configured stale-answer-ttl of 3 seconds.
- Enable authoritative server.
- Query for data.example again
- Since last query before activating authoritative server failed, and
  since 'stale-refresh-time' seconds hasn't elapsed yet, answer should
  come from stale cache and not from the authoritative server.
2020-11-11 16:01:59 -03:00
Diego Fronza
e8c3d538d5 Adjusted ancient rrset system test
Before the stale-refresh-time feature, the system test for ancient rrset
was somewhat based on the average time the previous tests and queries
were taking, thus not very precise.

After the addition of stale-refresh-time the system test for ancient
rrset started to fail since the queries for stale records (low
max-stale-ttl) were not taking the time to do a full resolution
anymore, since the answers now were coming from the cache (because the
rrset were stale and within stale-refresh-time window after the
previous resolution failure).

To handle this, the correct time to wait before rrset become ancient is
calculated from max-stale-ttl configuration plus the TTL set in the
rrset used in the tests (ans2/ans.pl).

Then before sending queries for ancient rrset, we check if we need to
sleep enough to ensure those rrset will be marked as ancient.
2020-11-11 16:01:51 -03:00
Diego Fronza
24ec021e50 Warn if 'stale-refresh-time' < 30 (default)
RFC 8767 recommends that attempts to refresh to be done no more
frequently than every 30 seconds.

Added check into named-checkconf, which will warn if values below the
default are found in configuration.

BIND will also log the warning during loading of configuration in the
same fashion.
2020-11-11 16:00:22 -03:00
Diego Fronza
8cc5abff23 Add stale-refresh-time option
Before this update, BIND would attempt to do a full recursive resolution
process for each query received if the requested rrset had its ttl
expired. If the resolution fails for any reason, only then BIND would
check for stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' is on).

The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed in the next seconds.

A better approach to improve performance in those cases, is to mark the
moment in which a resolution failed, and if new queries arrive for that
same rrset, try to respond directly from the stale cache, and do that
for a window of time configured via 'stale-refresh-time'.

Only when this interval expires we then try to do a normal refresh of
the rrset.

The logic behind this commit is as following:

- In query.c / query_gotanswer(), if the test of 'result' variable falls
  to the default case, an error is assumed to have happened, and a call
  to 'query_usestale()' is made to check if serving of stale rrset is
  enabled in configuration.

- If serving of stale answers is enabled, a flag will be turned on in
  the query context to look for stale records:
  query.c:6839
  qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;

- A call to query_lookup() will be made again, inside it a call to
  'dns_db_findext()' is made, which in turn will invoke rbdb.c /
  cache_find().

- In rbtdb.c / cache_find() the important bits of this change is the
  call to 'check_stale_header()', which is a function that yields true
  if we should skip the stale entry, or false if we should consider it.

- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
  is set, if that is the case we know that this new search for stale
  records was made due to a failure in a normal resolution, so we keep
  track of the time in which the failured occured in rbtdb.c:4559:
  header->last_refresh_fail_ts = search->now;

- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
  know this is a normal lookup, if the record is stale and the query
  time is between last failure time + stale-refresh-time window, then
  we return false so cache_find() knows it can consider this stale
  rrset entry to return as a response.

The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh

Those were added so rbtdb can be aware of the value set in configuration
option, since in that level we have no access to the view object.
2020-11-11 15:59:56 -03:00
Mark Andrews
23d2d95d28 Check that DNSTAP captures forwarded UPDATE responses
(cherry picked from commit 2b7128fede)
2020-11-10 17:59:04 +11:00