Commit Graph

9957 Commits

Author SHA1 Message Date
Mark Andrews
b3d259107f Fix DNAME when QTYPE is CNAME or ANY
The synthesised CNAME is not supposed to be followed when the
QTYPE is CNAME or ANY as the lookup is satisfied by the CNAME
record.

(cherry picked from commit e980affba0)
2020-11-19 10:52:29 +11:00
Diego Fronza
10860b09be Update ARM and other documents 2020-11-12 10:13:04 +01:00
Diego Fronza
f321d95464 Adjusted test to match new rndc serve-stale status output 2020-11-11 16:06:36 -03:00
Diego Fronza
4905c2e24a Output 'stale-refresh-time' value on rndc serve-stale status 2020-11-11 16:06:30 -03:00
Diego Fronza
73c199dec7 Check 'stale-refresh-time' when sharing cache between views
This commit ensures that, along with previous restrictions, a cache is
shareable between views only if their 'stale-refresh-time' value are
equal.
2020-11-11 16:06:23 -03:00
Matthijs Mekking
4d52ddbd15 Add two more system tests for stale-refresh-time
Add one test that checks the behavior when serve-stale is enabled
via configuration (as opposed to enabled via rndc).

Add one test that checks the behavior when stale-refresh-time is
disabled (set to 0).
2020-11-11 16:06:16 -03:00
Matthijs Mekking
276c912953 Change serve-stale test stale-answer-ttl
Using a 'stale-answer-ttl' the same value as the authoritative ttl
value makes it hard to differentiate between a response from the
stale cache and a response from the authoritative server.

Change the stale-answer-ttl from 2 to 4, so that it differs from the
authoritative ttl.
2020-11-11 16:06:07 -03:00
Diego Fronza
8383621ba8 Wait for multiple parallel dig commands to fully finish
The strategy of running many dig commands in parallel and
waiting for the respective output files to be non empty was
resulting in random test failures, hard to reproduce, where
it was possible that the subsequent reading of the files could
have been failing due to the file's content not being fully flushed.

Instead of checking if output files are non empty, we now wait
for the dig processes to finish.
2020-11-11 16:05:30 -03:00
Diego Fronza
0123a0f44f Added system test for stale-refresh-time
This test works as follow:
- Query for data.example rrset.
- Sleep until its TTL expires (2 secs).
- Disable authoritative server.
- Query for data.example again.
- Since server is down, answer come from stale cache, which has
  a configured stale-answer-ttl of 3 seconds.
- Enable authoritative server.
- Query for data.example again
- Since last query before activating authoritative server failed, and
  since 'stale-refresh-time' seconds hasn't elapsed yet, answer should
  come from stale cache and not from the authoritative server.
2020-11-11 16:01:59 -03:00
Diego Fronza
e8c3d538d5 Adjusted ancient rrset system test
Before the stale-refresh-time feature, the system test for ancient rrset
was somewhat based on the average time the previous tests and queries
were taking, thus not very precise.

After the addition of stale-refresh-time the system test for ancient
rrset started to fail since the queries for stale records (low
max-stale-ttl) were not taking the time to do a full resolution
anymore, since the answers now were coming from the cache (because the
rrset were stale and within stale-refresh-time window after the
previous resolution failure).

To handle this, the correct time to wait before rrset become ancient is
calculated from max-stale-ttl configuration plus the TTL set in the
rrset used in the tests (ans2/ans.pl).

Then before sending queries for ancient rrset, we check if we need to
sleep enough to ensure those rrset will be marked as ancient.
2020-11-11 16:01:51 -03:00
Diego Fronza
24ec021e50 Warn if 'stale-refresh-time' < 30 (default)
RFC 8767 recommends that attempts to refresh to be done no more
frequently than every 30 seconds.

Added check into named-checkconf, which will warn if values below the
default are found in configuration.

BIND will also log the warning during loading of configuration in the
same fashion.
2020-11-11 16:00:22 -03:00
Diego Fronza
8cc5abff23 Add stale-refresh-time option
Before this update, BIND would attempt to do a full recursive resolution
process for each query received if the requested rrset had its ttl
expired. If the resolution fails for any reason, only then BIND would
check for stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' is on).

The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed in the next seconds.

A better approach to improve performance in those cases, is to mark the
moment in which a resolution failed, and if new queries arrive for that
same rrset, try to respond directly from the stale cache, and do that
for a window of time configured via 'stale-refresh-time'.

Only when this interval expires we then try to do a normal refresh of
the rrset.

The logic behind this commit is as following:

- In query.c / query_gotanswer(), if the test of 'result' variable falls
  to the default case, an error is assumed to have happened, and a call
  to 'query_usestale()' is made to check if serving of stale rrset is
  enabled in configuration.

- If serving of stale answers is enabled, a flag will be turned on in
  the query context to look for stale records:
  query.c:6839
  qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;

- A call to query_lookup() will be made again, inside it a call to
  'dns_db_findext()' is made, which in turn will invoke rbdb.c /
  cache_find().

- In rbtdb.c / cache_find() the important bits of this change is the
  call to 'check_stale_header()', which is a function that yields true
  if we should skip the stale entry, or false if we should consider it.

- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
  is set, if that is the case we know that this new search for stale
  records was made due to a failure in a normal resolution, so we keep
  track of the time in which the failured occured in rbtdb.c:4559:
  header->last_refresh_fail_ts = search->now;

- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
  know this is a normal lookup, if the record is stale and the query
  time is between last failure time + stale-refresh-time window, then
  we return false so cache_find() knows it can consider this stale
  rrset entry to return as a response.

The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh

Those were added so rbtdb can be aware of the value set in configuration
option, since in that level we have no access to the view object.
2020-11-11 15:59:56 -03:00
Mark Andrews
23d2d95d28 Check that DNSTAP captures forwarded UPDATE responses
(cherry picked from commit 2b7128fede)
2020-11-10 17:59:04 +11:00
Michał Kępień
64ca2e0061 Wait for the "fast-expire" zone to be transferred
In order for a "fast-expire/IN: response-policy zone expired" message to
be logged in ns3/named.run, the "fast-expire" zone must first be
transferred in by that server.  However, with unfavorable timing, ns3
may be stopped before it manages to fetch the "fast-expire" zone from
ns5 and after the latter has been reconfigured to no longer serve that
zone.  In such a case, the "rpz" system test will report a false
positive for the relevant check.  Prevent that from happening by
ensuring ns3 manages to transfer the "fast-expire" zone before getting
shut down.

(cherry picked from commit 39191052ad)
2020-11-05 07:55:36 +01:00
Matthijs Mekking
67b9e80b1e kasp test: Use DEFAULT_ALGORITHM in tests.sh
Some setup scripts uses DEFAULT_ALGORITHM in their dnssec-policy
and/or initial signing. The tests still used the literal values
13, ECDSAP256SHA256, and 256. Replace those occurrences where
appropriate.

(cherry picked from commit 518dd0bb17)
2020-11-04 14:28:19 +01:00
Matthijs Mekking
a0a4f7e318 Add a test for RFC 8901 signer model 2
The new 'dnssec-policy' was already compatible with multi-signer
model 2, now we also have a test for it.

(cherry picked from commit 7e0ec9f624)
2020-11-04 14:28:10 +01:00
Mark Andrews
939e735e2c Check that a zone in the process of being signed resolves
ans10 simulates a local anycast server which has both signed and
unsigned instances of a zone.  'A' queries get answered from the
signed instance.  Everything else gets answered from the unsigned
instance.  The resulting answer should be insecure.

(cherry picked from commit d7840f4b93)
2020-10-30 09:19:12 +11:00
Evan Hunt
bc9a1b0b2d fix a typo in rpz test
"tcp-only" was not being tested correctly in the RPZ system test
because the option to the "digcmd" function that causes queries to
be sent via TCP was misspelled in one case, and was being interpreted
as a query name.

the "ckresult" function has also been changed to be case sensitive
for consistency with "digcmd".

(cherry picked from commit 78af071c11)
2020-10-28 22:38:55 -07:00
Michal Nowak
175f03f5db Replace a seq invocation with a shell loop
seq is not portable.  Use a while loop instead to make the "dnssec"
system test script POSIX-compatible.

(cherry picked from commit c0c4c024c6)
2020-10-27 12:26:03 +01:00
Michal Nowak
3e937a8c7c Get rid of bashisms in string comparisons
The double equal sign ('==') is a Bash-specific string comparison
operator.  Ensure the single equal sign ('=') is used in all POSIX shell
scripts in the system test suite in order to retain their portability.

(cherry picked from commit 481dfb9671)
2020-10-27 12:26:03 +01:00
Michal Nowak
659feff963 Fix system test backtrace generation on OpenBSD
On Linux core dump contains absolute path to crashed binary

    Core was generated by `/home/newman/isc/ws/bind9/bin/named/.libs/lt-named -D glue-ns1 -X named.lock -m'.

However, on OpenBSD there's only a basename

    Core was generated by `named'.

This commit adds support for the latter, retains the former.

(cherry picked from commit f0b13873a3)
2020-10-26 15:01:52 +01:00
Michal Nowak
47862fc559 Ensure use of "echo_i" where possible
In many instances 'echo "I:' construct was used where echo_i function
should have been.
2020-10-22 12:15:15 +02:00
Diego Fronza
be98c78802 Adjusted additional system test (NS, non-root zone)
After the updates from this branch, BIND now sends glue records for
NS queries even when configured with minimal-responses yes.
2020-10-21 12:12:57 -03:00
Diego Fronza
69e6bea835 Added test for the proposed fix
This test is very simple, two nameserver instances are created:
    - ns4: master, with 'minimal-responses yes', authoritative
        for example. zone
    - ns5: slave, stub zone

The first thing verified is the transfer of zone data from master
to slave, which should be saved in ns5/example.db.

After that, a query is issued to ns5 asking for target.example.
TXT, a record present in the master database with the "test" string
as content.

If that query works, it means stub zone successfully request
nameserver addresses from master, ns4.example. A/AAAA

The presence of both A/AAAA records for ns4 is also verified in the
stub zone local file, ns5/example.db.
2020-10-21 12:12:36 -03:00
Matthijs Mekking
5c0b5b64e5 Don't increment network error stats on UV_EOF
When networking statistics was added to the netmgr (in commit
5234a8e00a), two lines were added that
increment the 'STATID_RECVFAIL' statistic: One if 'uv_read_start'
fails and one at the end of the 'read_cb'.  The latter happens
if 'nread < 0'.

According to the libuv documentation, I/O read callbacks (such as for
files and sockets) are passed a parameter 'nread'. If 'nread' is less
than 0, there was an error and 'UV_EOF' is the end of file error, which
you may want to handle differently.

In other words, we should not treat EOF as a RECVFAIL error.

(cherry picked from commit 6c5ff94218)
2020-10-20 14:05:09 +00:00
Diego Fronza
64ae91c62a Fix dnstap system test on FreeBSD
This commit ensures that dnstap output files captured
by fstrm_capture are properly flushed before any attempt
on reading them with dnstap-read is done.

By reading fstrm-capture source code it was noticed that
signal SIGHUP is used to flush the capture file.
2020-10-20 10:22:50 -03:00
Mark Andrews
7147e4f93a Drop the expected minimum number of buckets to 4.
The previous value of 5 produced too many false errors.

(cherry picked from commit 0abb49034e)
2020-10-15 12:32:11 +11:00
Mark Andrews
07017d0a8e Try to improve rrl timing
Add a +burst option to mdig so that we have a second to setup the
mdig calls then they run at the start of the next second.

RRL uses 'queries in a second' as a approximation to
'queries per second'. Getting the bursts of traffic to all happen in
the same second should prevent false negatives in the system test.

We now have a second to setup the traffic in.  Then the traffic should
be sent at the start of the next second.  If that still fails we
should move to +burst=<now+2> (further extend mdig) instead of the
implicit <now+1> as the trigger second.

(cherry picked from commit 92cdc7b6c7)
2020-10-15 11:41:20 +11:00
Matthijs Mekking
5d3c4baad0 The kasp system test requires Python
Only run the "kasp" system test if the path to the Python interpreter is
set.
2020-10-07 14:14:14 +02:00
Havard Eidnes
cf19c9d3ba Avoid a non-standard bashism: use of "==" in "test".
(cherry picked from commit 7c3f62082bb0c6776ff560f0aef09ad2dfdf77ea)
2020-10-07 13:29:55 +00:00
Ondřej Surý
4d2390c0b9 Adjust legacy tests for default 1232 EDNS Buffer Size
* legacy test was just expecting default server EDNS buffer size to be 4096,
  the test needed the adjustment to reset the buffer sizes back to 4096.

(cherry picked from commit 354a2e102d5b8b0a73c9bcea14a4af7091ed6e31)
2020-10-06 09:35:21 +02:00
Ondřej Surý
b2ebbaf4a0 Adjust digdelv tests for default 1232 EDNS Buffer Size
* digdelv test was just expecting default server EDNS buffer size to be
  4096, the test needed only slight adjustment

(cherry picked from commit f1556f8c41)
2020-10-06 09:35:20 +02:00
Ondřej Surý
58a518adca Change the default ENDS buffer size to 1232 for DNS Flag Day 2020
The DNS Flag Day 2020 aims to remove the IP fragmentation problem from
the UDP DNS communication.  In this commit, we implement the minimal
required changes by changing the defaults for `edns-udp-size`,
`max-udp-size` and `nocookie-udp-size` to `1232` (the value picked by
DNS Flag Day 2020).

(cherry picked from commit bb990030d3)
2020-10-06 09:35:20 +02:00
Mark Andrews
2614cc7610 run.sh failed to exit with a error code when it should
* if a core was detected 'status' was not updated.
* if a tsan or asan error was detected 'status' was not updated.
2020-10-06 06:03:59 +00:00
Mark Andrews
387e2e0c06 run.sh failed to report when system test failed. 2020-10-06 06:03:59 +00:00
Mark Andrews
a1714cf4da incorrect markup in rndc.rst lead to bad layout 2020-10-06 11:09:05 +11:00
Matthijs Mekking
a87fb09eb4 Use default algorithm in kasp test if possible
These tests don't require a specific algorithm so they should use
the DEFAULT_ALGORITHM from 'conf.sh.common'.

(cherry picked from commit 78c09f5622)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
63652ca58f Use explicit result codes for 'rndc dnssec' cmd
It is better to add new result codes than to overload existing codes.

(cherry picked from commit 70d1ec432f)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
6bbb2a8581 Various rndc dnssec -checkds fixes
While working on 'rndc dnssec -rollover' I noticed the following
(small) issues:

- The key files where updated with hints set to "-when" and that
  should always be "now.
- The kasp system test did not properly update the test number when
  calling 'rndc dnssec -checkds' (and ensuring that works).
- There was a missing ']' in the rndc.c help output.

(cherry picked from commit edc53fc416)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
5bbecc5116 Test rndc rollover inactive key
When users (accidentally) try to roll an inactive key, throw an error.

(cherry picked from commit fcd34abb9e)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
4d0dc466b5 Add rndc dnssec -rollover command
This command is similar in arguments as -checkds so refactor the
'named_server_dnssec' function accordingly.  The only difference
are that:

- It does not take a "publish" or "withdrawn" argument.
- It requires the key id to be set (add a check to make sure).

Add tests that will trigger rollover immediately and one that
schedules a test in the future.

(cherry picked from commit e826facadb)
2020-10-05 11:20:35 +02:00
Matthijs Mekking
1b69a49c6e Fix a timing issue in kasp system test
Sometimes, not all keys have been created in time before 'check_keys'
is called. Run a 'retry_quiet' on checking the number of keys before
continuing checking the key data.

(cherry picked from commit af3b014976)
2020-10-02 10:19:07 +02:00
Matthijs Mekking
0e07dbe263 Test migration to dnssec-policy with views
This test case is unrelated to the fix for #2171 but was added to
reproduce the problem.

(cherry picked from commit 621093fe69)
2020-10-02 10:18:52 +02:00
Matthijs Mekking
d31297c9f8 Minor fix in kasp system test
The 'wait_for_nsec' does not need to add TSIG because it calls
'dig_with_opts' and that already checks for TSIG.

(cherry picked from commit 43c6806779)
2020-10-02 10:18:44 +02:00
Matthijs Mekking
91a686c031 Add kasp tests for Ed25519 and Ed448
Use the testcrypto script to see if these algorithms are supported by
openssl. If so, add the specific configuration to the named.conf file
and touch a file to indicate support. If the file exists, the
corresponding setup and tests are performed.

(cherry picked from commit 7be1835795)
2020-10-02 10:18:17 +02:00
Michał Kępień
502d79ae4f Add tests for "order none" RRset ordering rules
Make sure "order none" RRset ordering rules are tested in the
"rrsetorder" system test just like all other rule types are.  As the
check for the case of no "rrset-order" rule matching a given RRset also
tests "order none" (rather than "order random", as the test code may
suggest at first glance), replace the test code for that case so that it
matches other "order none" tests.

(cherry picked from commit abdd4c89fc)
2020-10-02 08:51:29 +02:00
Evan Hunt
a73e807a46 add more logging to the shutdown system test
the test server running in shutdown/resolver was not logging
any debug info, which made it difficult to diagnose test failures.

(cherry picked from commit cc7ceace7d)
2020-10-01 18:09:35 +02:00
Evan Hunt
ba2e9dfb99 change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.

A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:

        - reqhandle:    held while waiting for a request callback (query,
                        notify, update)
        - sendhandle:   held while waiting for a send callback
        - fetchhandle:  held while waiting for a recursive fetch to
                        complete
        - updatehandle: held while waiting for an update-forwarding
                        task to complete

(cherry picked from commit 57b4dde974)
2020-10-01 18:09:35 +02:00
Witold Kręcicki
0202b289c2 assorted small netmgr-related changes
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
  indicate whether it is to be used as a client or server socket

(cherry picked from commit 7eb4564895)
2020-10-01 16:44:43 +02:00
Evan Hunt
1263201732 don't use exclusive mode for rndc commands that don't need it
"showzone" and "tsig-list" both used exclusive mode unnecessarily;
changing this will simplify future refactoring a bit.

(cherry picked from commit 002c328437)
2020-10-01 16:44:43 +02:00