The synthesised CNAME is not supposed to be followed when the
QTYPE is CNAME or ANY as the lookup is satisfied by the CNAME
record.
(cherry picked from commit e980affba0)
Add one test that checks the behavior when serve-stale is enabled
via configuration (as opposed to enabled via rndc).
Add one test that checks the behavior when stale-refresh-time is
disabled (set to 0).
Using a 'stale-answer-ttl' the same value as the authoritative ttl
value makes it hard to differentiate between a response from the
stale cache and a response from the authoritative server.
Change the stale-answer-ttl from 2 to 4, so that it differs from the
authoritative ttl.
The strategy of running many dig commands in parallel and
waiting for the respective output files to be non empty was
resulting in random test failures, hard to reproduce, where
it was possible that the subsequent reading of the files could
have been failing due to the file's content not being fully flushed.
Instead of checking if output files are non empty, we now wait
for the dig processes to finish.
This test works as follow:
- Query for data.example rrset.
- Sleep until its TTL expires (2 secs).
- Disable authoritative server.
- Query for data.example again.
- Since server is down, answer come from stale cache, which has
a configured stale-answer-ttl of 3 seconds.
- Enable authoritative server.
- Query for data.example again
- Since last query before activating authoritative server failed, and
since 'stale-refresh-time' seconds hasn't elapsed yet, answer should
come from stale cache and not from the authoritative server.
Before the stale-refresh-time feature, the system test for ancient rrset
was somewhat based on the average time the previous tests and queries
were taking, thus not very precise.
After the addition of stale-refresh-time the system test for ancient
rrset started to fail since the queries for stale records (low
max-stale-ttl) were not taking the time to do a full resolution
anymore, since the answers now were coming from the cache (because the
rrset were stale and within stale-refresh-time window after the
previous resolution failure).
To handle this, the correct time to wait before rrset become ancient is
calculated from max-stale-ttl configuration plus the TTL set in the
rrset used in the tests (ans2/ans.pl).
Then before sending queries for ancient rrset, we check if we need to
sleep enough to ensure those rrset will be marked as ancient.
RFC 8767 recommends that attempts to refresh to be done no more
frequently than every 30 seconds.
Added check into named-checkconf, which will warn if values below the
default are found in configuration.
BIND will also log the warning during loading of configuration in the
same fashion.
Before this update, BIND would attempt to do a full recursive resolution
process for each query received if the requested rrset had its ttl
expired. If the resolution fails for any reason, only then BIND would
check for stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' is on).
The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed in the next seconds.
A better approach to improve performance in those cases, is to mark the
moment in which a resolution failed, and if new queries arrive for that
same rrset, try to respond directly from the stale cache, and do that
for a window of time configured via 'stale-refresh-time'.
Only when this interval expires we then try to do a normal refresh of
the rrset.
The logic behind this commit is as following:
- In query.c / query_gotanswer(), if the test of 'result' variable falls
to the default case, an error is assumed to have happened, and a call
to 'query_usestale()' is made to check if serving of stale rrset is
enabled in configuration.
- If serving of stale answers is enabled, a flag will be turned on in
the query context to look for stale records:
query.c:6839
qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;
- A call to query_lookup() will be made again, inside it a call to
'dns_db_findext()' is made, which in turn will invoke rbdb.c /
cache_find().
- In rbtdb.c / cache_find() the important bits of this change is the
call to 'check_stale_header()', which is a function that yields true
if we should skip the stale entry, or false if we should consider it.
- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
is set, if that is the case we know that this new search for stale
records was made due to a failure in a normal resolution, so we keep
track of the time in which the failured occured in rbtdb.c:4559:
header->last_refresh_fail_ts = search->now;
- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
know this is a normal lookup, if the record is stale and the query
time is between last failure time + stale-refresh-time window, then
we return false so cache_find() knows it can consider this stale
rrset entry to return as a response.
The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh
Those were added so rbtdb can be aware of the value set in configuration
option, since in that level we have no access to the view object.
In order for a "fast-expire/IN: response-policy zone expired" message to
be logged in ns3/named.run, the "fast-expire" zone must first be
transferred in by that server. However, with unfavorable timing, ns3
may be stopped before it manages to fetch the "fast-expire" zone from
ns5 and after the latter has been reconfigured to no longer serve that
zone. In such a case, the "rpz" system test will report a false
positive for the relevant check. Prevent that from happening by
ensuring ns3 manages to transfer the "fast-expire" zone before getting
shut down.
(cherry picked from commit 39191052ad)
Some setup scripts uses DEFAULT_ALGORITHM in their dnssec-policy
and/or initial signing. The tests still used the literal values
13, ECDSAP256SHA256, and 256. Replace those occurrences where
appropriate.
(cherry picked from commit 518dd0bb17)
ans10 simulates a local anycast server which has both signed and
unsigned instances of a zone. 'A' queries get answered from the
signed instance. Everything else gets answered from the unsigned
instance. The resulting answer should be insecure.
(cherry picked from commit d7840f4b93)
"tcp-only" was not being tested correctly in the RPZ system test
because the option to the "digcmd" function that causes queries to
be sent via TCP was misspelled in one case, and was being interpreted
as a query name.
the "ckresult" function has also been changed to be case sensitive
for consistency with "digcmd".
(cherry picked from commit 78af071c11)
The double equal sign ('==') is a Bash-specific string comparison
operator. Ensure the single equal sign ('=') is used in all POSIX shell
scripts in the system test suite in order to retain their portability.
(cherry picked from commit 481dfb9671)
On Linux core dump contains absolute path to crashed binary
Core was generated by `/home/newman/isc/ws/bind9/bin/named/.libs/lt-named -D glue-ns1 -X named.lock -m'.
However, on OpenBSD there's only a basename
Core was generated by `named'.
This commit adds support for the latter, retains the former.
(cherry picked from commit f0b13873a3)
This test is very simple, two nameserver instances are created:
- ns4: master, with 'minimal-responses yes', authoritative
for example. zone
- ns5: slave, stub zone
The first thing verified is the transfer of zone data from master
to slave, which should be saved in ns5/example.db.
After that, a query is issued to ns5 asking for target.example.
TXT, a record present in the master database with the "test" string
as content.
If that query works, it means stub zone successfully request
nameserver addresses from master, ns4.example. A/AAAA
The presence of both A/AAAA records for ns4 is also verified in the
stub zone local file, ns5/example.db.
When networking statistics was added to the netmgr (in commit
5234a8e00a), two lines were added that
increment the 'STATID_RECVFAIL' statistic: One if 'uv_read_start'
fails and one at the end of the 'read_cb'. The latter happens
if 'nread < 0'.
According to the libuv documentation, I/O read callbacks (such as for
files and sockets) are passed a parameter 'nread'. If 'nread' is less
than 0, there was an error and 'UV_EOF' is the end of file error, which
you may want to handle differently.
In other words, we should not treat EOF as a RECVFAIL error.
(cherry picked from commit 6c5ff94218)
This commit ensures that dnstap output files captured
by fstrm_capture are properly flushed before any attempt
on reading them with dnstap-read is done.
By reading fstrm-capture source code it was noticed that
signal SIGHUP is used to flush the capture file.
Add a +burst option to mdig so that we have a second to setup the
mdig calls then they run at the start of the next second.
RRL uses 'queries in a second' as a approximation to
'queries per second'. Getting the bursts of traffic to all happen in
the same second should prevent false negatives in the system test.
We now have a second to setup the traffic in. Then the traffic should
be sent at the start of the next second. If that still fails we
should move to +burst=<now+2> (further extend mdig) instead of the
implicit <now+1> as the trigger second.
(cherry picked from commit 92cdc7b6c7)
* legacy test was just expecting default server EDNS buffer size to be 4096,
the test needed the adjustment to reset the buffer sizes back to 4096.
(cherry picked from commit 354a2e102d5b8b0a73c9bcea14a4af7091ed6e31)
* digdelv test was just expecting default server EDNS buffer size to be
4096, the test needed only slight adjustment
(cherry picked from commit f1556f8c41)
The DNS Flag Day 2020 aims to remove the IP fragmentation problem from
the UDP DNS communication. In this commit, we implement the minimal
required changes by changing the defaults for `edns-udp-size`,
`max-udp-size` and `nocookie-udp-size` to `1232` (the value picked by
DNS Flag Day 2020).
(cherry picked from commit bb990030d3)
While working on 'rndc dnssec -rollover' I noticed the following
(small) issues:
- The key files where updated with hints set to "-when" and that
should always be "now.
- The kasp system test did not properly update the test number when
calling 'rndc dnssec -checkds' (and ensuring that works).
- There was a missing ']' in the rndc.c help output.
(cherry picked from commit edc53fc416)
This command is similar in arguments as -checkds so refactor the
'named_server_dnssec' function accordingly. The only difference
are that:
- It does not take a "publish" or "withdrawn" argument.
- It requires the key id to be set (add a check to make sure).
Add tests that will trigger rollover immediately and one that
schedules a test in the future.
(cherry picked from commit e826facadb)
Sometimes, not all keys have been created in time before 'check_keys'
is called. Run a 'retry_quiet' on checking the number of keys before
continuing checking the key data.
(cherry picked from commit af3b014976)
The 'wait_for_nsec' does not need to add TSIG because it calls
'dig_with_opts' and that already checks for TSIG.
(cherry picked from commit 43c6806779)
Use the testcrypto script to see if these algorithms are supported by
openssl. If so, add the specific configuration to the named.conf file
and touch a file to indicate support. If the file exists, the
corresponding setup and tests are performed.
(cherry picked from commit 7be1835795)
Make sure "order none" RRset ordering rules are tested in the
"rrsetorder" system test just like all other rule types are. As the
check for the case of no "rrset-order" rule matching a given RRset also
tests "order none" (rather than "order random", as the test code may
suggest at first glance), replace the test code for that case so that it
matches other "order none" tests.
(cherry picked from commit abdd4c89fc)
the test server running in shutdown/resolver was not logging
any debug info, which made it difficult to diagnose test failures.
(cherry picked from commit cc7ceace7d)
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.
A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:
- reqhandle: held while waiting for a request callback (query,
notify, update)
- sendhandle: held while waiting for a send callback
- fetchhandle: held while waiting for a recursive fetch to
complete
- updatehandle: held while waiting for an update-forwarding
task to complete
(cherry picked from commit 57b4dde974)
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
indicate whether it is to be used as a client or server socket
(cherry picked from commit 7eb4564895)
"showzone" and "tsig-list" both used exclusive mode unnecessarily;
changing this will simplify future refactoring a bit.
(cherry picked from commit 002c328437)