The synthesised CNAME is not supposed to be followed when the
QTYPE is CNAME or ANY as the lookup is satisfied by the CNAME
record.
(cherry picked from commit e980affba0)
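The rule above can be sketched as a small predicate. This is only an illustration, not the code from the commit: the function name and the enum are hypothetical, and only the numeric type codes (A = 1, CNAME = 5, ANY = 255) are the standard DNS values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard DNS RR type codes; the enum itself is illustrative. */
enum {
	TYPE_A = 1,
	TYPE_CNAME = 5,
	TYPE_ANY = 255
};

/*
 * Hypothetical helper: decide whether a synthesised CNAME should be
 * chased. For QTYPE CNAME or ANY the CNAME record itself already
 * satisfies the lookup, so it is not followed.
 */
bool
should_follow_synth_cname(uint16_t qtype) {
	return (qtype != TYPE_CNAME && qtype != TYPE_ANY);
}
```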
Add one test that checks the behavior when serve-stale is enabled
via configuration (as opposed to enabled via rndc).
Add one test that checks the behavior when stale-refresh-time is
disabled (set to 0).
Using a 'stale-answer-ttl' equal to the authoritative TTL makes it
hard to differentiate between a response from the stale cache and a
response from the authoritative server.
Change the stale-answer-ttl from 2 to 4, so that it differs from the
authoritative ttl.
The strategy of running many dig commands in parallel and waiting
for the respective output files to be non-empty resulted in random,
hard-to-reproduce test failures, possibly because a file could be
read before its content had been fully flushed.
Instead of checking whether the output files are non-empty, we now
wait for the dig processes to finish.
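The system test itself is a shell script, so the following is only a C analogue of the same idea: start several commands in parallel and block until every one of them has exited, instead of polling their output files. The function name, the MAX_JOBS limit, and the use of /bin/sh are assumptions made for this sketch.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 64 /* arbitrary limit for this sketch */

/*
 * Hypothetical helper: run each command through /bin/sh in parallel,
 * then wait for all children. Returns the number of commands that
 * exited with status 0, or -1 if not all of them could be started.
 */
int
run_all_and_wait(const char *const cmds[], size_t n) {
	pid_t pids[MAX_JOBS];
	size_t started = 0;
	int ok = 0;

	if (n > MAX_JOBS) {
		return (-1);
	}

	for (size_t i = 0; i < n; i++) {
		pid_t pid = fork();
		if (pid < 0) {
			break; /* still reap the children started so far */
		}
		if (pid == 0) {
			execl("/bin/sh", "sh", "-c", cmds[i], (char *)NULL);
			_exit(127); /* exec failed */
		}
		pids[started++] = pid;
	}

	/* Output is only read after this loop, once every child is gone. */
	for (size_t i = 0; i < started; i++) {
		int status;
		if (waitpid(pids[i], &status, 0) == pids[i] &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
		{
			ok++;
		}
	}

	return ((started == n) ? ok : -1);
}
```

Once the function returns, each child has exited and closed its output, so the files are no longer being written by those processes.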
This test works as follows:
- Query for data.example rrset.
- Sleep until its TTL expires (2 secs).
- Disable authoritative server.
- Query for data.example again.
- Since the server is down, the answer comes from the stale cache,
which has a configured stale-answer-ttl of 3 seconds.
- Enable authoritative server.
- Query for data.example again.
- Since the last query before enabling the authoritative server
failed, and since 'stale-refresh-time' seconds haven't elapsed yet,
the answer should come from the stale cache and not from the
authoritative server.
Before the stale-refresh-time feature, the system test for ancient
rrsets relied loosely on the average time taken by the preceding tests
and queries, and was thus not very precise.
After the addition of stale-refresh-time, the system test for ancient
rrsets started to fail: the queries for stale records (low
max-stale-ttl) no longer took the time needed for a full resolution,
since the answers were now coming from the cache (because the rrsets
were stale and within the stale-refresh-time window after the previous
resolution failure).
To handle this, the correct time to wait before an rrset becomes
ancient is calculated from the max-stale-ttl configuration plus the TTL
set in the rrset used in the tests (ans2/ans.pl).
Then, before sending queries for ancient rrsets, we check whether we
need to sleep to ensure those rrsets will be marked as ancient.
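The wait calculation described above amounts to simple arithmetic, shown here in C for illustration (the test itself is a shell script). The function and parameter names are hypothetical, and "elapsed" stands for the seconds that have passed since the rrset was cached.

```c
#include <stdint.h>

/*
 * Hypothetical helper: seconds still to sleep before a cached rrset is
 * guaranteed to be ancient, i.e. older than its TTL plus max-stale-ttl.
 * Returns 0 if enough time has already passed.
 */
uint32_t
seconds_until_ancient(uint32_t rrset_ttl, uint32_t max_stale_ttl,
		      uint32_t elapsed) {
	uint32_t lifetime = rrset_ttl + max_stale_ttl;

	return ((elapsed >= lifetime) ? 0 : lifetime - elapsed);
}
```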
RFC 8767 recommends that refresh attempts be made no more frequently
than every 30 seconds.
Add a check to named-checkconf, which warns if values below the
default are found in the configuration.
BIND also logs the same warning when loading the configuration.
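A minimal sketch of such a check follows. It is not the named-checkconf code: the macro and function names are hypothetical, and exempting 0 from the warning (because 0 disables the feature) is an assumption of this sketch, not something the text above states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default and RFC 8767 recommended minimum, in seconds. */
#define STALE_REFRESH_TIME_DEFAULT 30u

/*
 * Hypothetical helper: report whether a configured stale-refresh-time
 * should trigger a warning. Assumption: 0 means "feature disabled" and
 * is therefore not warned about.
 */
bool
stale_refresh_time_needs_warning(uint32_t configured) {
	return (configured > 0 && configured < STALE_REFRESH_TIME_DEFAULT);
}
```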
Before this update, BIND would attempt a full recursive resolution
for each query received if the requested rrset had an expired TTL.
Only if the resolution failed for any reason would BIND then check
for a stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' are on).
The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed within the next few seconds.
A better approach to improve performance in those cases is to record
the moment at which a resolution failed and, if new queries arrive for
that same rrset, try to respond directly from the stale cache, and do
that for a window of time configured via 'stale-refresh-time'.
Only when this interval expires do we try a normal refresh of the
rrset again.
The logic behind this commit is as follows:
- In query.c / query_gotanswer(), if the test of 'result' variable falls
to the default case, an error is assumed to have happened, and a call
to 'query_usestale()' is made to check if serving of stale rrset is
enabled in configuration.
- If serving of stale answers is enabled, a flag will be turned on in
the query context to look for stale records:
query.c:6839
qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;
- A call to query_lookup() will be made again; inside it a call to
'dns_db_findext()' is made, which in turn will invoke rbtdb.c /
cache_find().
- In rbtdb.c / cache_find() the important part of this change is the
call to 'check_stale_header()', which is a function that yields true
if we should skip the stale entry, or false if we should consider it.
- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
is set. If that is the case, we know that this new search for stale
records was made due to a failure in a normal resolution, so we keep
track of the time at which the failure occurred in rbtdb.c:4559:
header->last_refresh_fail_ts = search->now;
- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
know this is a normal lookup. If the record is stale and the query
time falls within the stale-refresh-time window that starts at the
last failure time, then we return false so cache_find() knows it can
consider this stale rrset entry to return as a response.
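The decision described in the last two steps can be modelled as follows. This is a deliberately simplified sketch, not the rbtdb.c code: the struct layouts, the FIND_STALEOK flag (standing in for DNS_DBFIND_STALEOK), the function name, and the use of 0 as "no failure recorded" are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIND_STALEOK 0x01u /* stands in for DNS_DBFIND_STALEOK */

typedef struct {
	uint32_t expires;              /* time at which the TTL runs out */
	uint32_t stale_until;          /* expires + max-stale-ttl */
	uint32_t last_refresh_fail_ts; /* 0 = no failure recorded (assumed) */
} header_t;

typedef struct {
	uint32_t now;
	unsigned int options;
	uint32_t stale_refresh_time; /* 0 disables the window */
} search_t;

/*
 * Returns true if the entry should be skipped, false if it may be
 * considered for the response.
 */
bool
skip_stale_header(header_t *h, const search_t *s) {
	if (s->now < h->expires) {
		return (false); /* still fresh */
	}
	if (s->now >= h->stale_until) {
		return (true); /* ancient: never served */
	}
	if ((s->options & FIND_STALEOK) != 0) {
		/* Retry after a failed resolution: remember when it failed. */
		h->last_refresh_fail_ts = s->now;
		return (false);
	}
	/* Normal lookup: serve stale data only inside the refresh window. */
	if (s->stale_refresh_time > 0 && h->last_refresh_fail_ts != 0 &&
	    s->now < h->last_refresh_fail_ts + s->stale_refresh_time)
	{
		return (false);
	}
	return (true);
}
```

In this model, a normal lookup on a stale entry is skipped (forcing a refresh attempt) until a failure has been recorded; from then on, normal lookups are answered from the stale entry until the window closes.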
The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh
Those were added so rbtdb can be aware of the value set in the
configuration option, since at that level we have no access to the
view object.
ns_client_sendraw() is currently only used to relay UPDATE
responses back to the client. dns_dt_send() is called with
this assumption.
(cherry picked from commit b09727a765)
As of libuv 1.36.0, CMake is the only supported build method for libuv
on Windows. Account for that fact by adjusting the relevant paths and
DLL file names used in the win32utils/Configure script. Update
Windows-specific documentation accordingly.
(cherry picked from commit 64a091d587)
Our GitLab Runner Custom executor scripts now use the "image" key for
determining the Windows Docker image to use for a given CI job. Update
.gitlab-ci.yml to reflect that change.
(cherry picked from commit 004ca913f2)
In order for a "fast-expire/IN: response-policy zone expired" message to
be logged in ns3/named.run, the "fast-expire" zone must first be
transferred in by that server. However, with unfavorable timing, ns3
may be stopped before it manages to fetch the "fast-expire" zone from
ns5 and after the latter has been reconfigured to no longer serve that
zone. In such a case, the "rpz" system test will report a false
positive for the relevant check. Prevent that from happening by
ensuring ns3 manages to transfer the "fast-expire" zone before getting
shut down.
(cherry picked from commit 39191052ad)
Some setup scripts use DEFAULT_ALGORITHM in their dnssec-policy
and/or initial signing. The tests still used the literal values
13, ECDSAP256SHA256, and 256. Replace those occurrences where
appropriate.
(cherry picked from commit 518dd0bb17)
Using AC_RUN_IFELSE() in configure.ac breaks cross-compilation:
configure: error: cannot run test program while cross compiling
Commit 978c7b2e89 caused AC_RUN_IFELSE()
to be used instead of AC_LINK_IFELSE() because the latter had seemingly
been causing the check for --wrap support in the linker to not work as
expected. However, it later turned out that the problem lay elsewhere:
a minus sign ('-') was missing from the LDFLAGS variable used in the
relevant check [1].
Revert to using AC_LINK_IFELSE() for checking whether the linker
supports the --wrap option in order to make cross-compilation possible
again.
[1] see commit cfa4ea64bc
The following compiler warning is emitted for the BACKTRACE_X86STACK
part of lib/isc/backtrace.c:
backtrace.c: In function ‘getrbp’:
backtrace.c:142:1: warning: no return statement in function returning non-void [-Wreturn-type]
While getrbp() stores the value of the RBP register in the RAX register
and thus does attempt to return a value, this is not enough for an
optimizing compiler to always produce the expected result. With -O2,
the following machine code may be generated in isc_backtrace_gettrace():
0x00007ffff7b0ff7a <+10>: mov %rbp,%rax
0x00007ffff7b0ff7d <+13>: mov $0x17,%eax
0x00007ffff7b0ff82 <+18>: retq
The above is equivalent to:
sp = (void **)getrbp();
return (ISC_R_NOTFOUND);
and results in the backtrace never getting printed.
Fix by using an intermediate variable. With this change in place, the
machine code generated with -O2 becomes something like:
0x00007ffff7af5638 <+24>: mov $0x17,%eax
0x00007ffff7af563d <+29>: mov %rbp,%rdx
0x00007ffff7af5640 <+32>: test %rdx,%rdx
0x00007ffff7af5643 <+35>: je 0x7ffff7af56bd <isc_backtrace_gettrace+157>
...
0x00007ffff7af56bd <+157>: retq
(Note that this method of grabbing a stack trace is finicky anyway
because in order for RBP to be relied upon, -fno-omit-frame-pointer must
be present among CFLAGS.)
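The "intermediate variable" approach can be sketched like this. It is an illustration rather than the exact change to lib/isc/backtrace.c: the function name is hypothetical, and the non-x86-64 branches exist only so the sketch compiles elsewhere.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read the frame pointer into a local variable
 * through an output operand and return that variable explicitly, so
 * the compiler sees a real return value instead of relying on RAX
 * being left in place by the inline assembly.
 */
#if defined(__GNUC__) && defined(__x86_64__)
static inline uintptr_t
read_frame_pointer(void) {
	uintptr_t rbp;

	__asm__ volatile("movq %%rbp, %0" : "=r"(rbp));
	return (rbp);
}
#elif defined(__GNUC__)
static inline uintptr_t
read_frame_pointer(void) {
	/* Fallback for other architectures (GCC/Clang builtin). */
	return ((uintptr_t)__builtin_frame_address(0));
}
#else
static inline uintptr_t
read_frame_pointer(void) {
	return (0); /* not supported in this sketch */
}
#endif
```

Because the value flows through a declared output operand, the optimizer can no longer drop or overwrite it the way it did with the implicit RAX return.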