DiG implements different logic in the `recv_done()` callback function
when processing a failure:
1. For a timed-out query it applies the "retries" logic first, then,
when that fails, fails over to the next server.
2. For an EOF (end-of-file, or unexpected disconnect) error it makes
a single retry attempt (even if the user has requested more
retries), then, when that fails, fails over to the next server.
3. For other types of failures, DiG does not apply the "retries" logic
and fails over to the next server (again, even if the user has
requested retries).
Simplify this by applying the same logic (1), retries first and then
fail-over, to all types of failures in `recv_done()`.
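The unified behavior can be modeled with a short Python sketch. This is not DiG's C code: `run_lookup`, `send_query`, and the `tries` parameter (total attempts per server) are hypothetical names used only for illustration.

```python
from typing import Callable, List, Optional, Tuple


def run_lookup(
    servers: List[str],
    tries: int,
    send_query: Callable[[str, int], str],
) -> Tuple[Optional[str], List[Tuple[str, int]]]:
    """Try every server up to `tries` times before failing over to the next."""
    attempts = []
    for server in servers:
        for attempt in range(tries):
            attempts.append((server, attempt))
            result = send_query(server, attempt)
            if result == "success":
                return server, attempts
            # Timeouts, EOF and all other failures take the same path:
            # retry the same server until its attempts are used up.
        # All attempts against this server failed: fail over to the next one.
    return None, attempts


def flaky(server: str, attempt: int) -> str:
    # Hypothetical transport: only the second attempt against "ns2" succeeds.
    return "success" if (server, attempt) == ("ns2", 1) else "eof"


winner, attempts = run_lookup(["ns1", "ns2"], 3, flaky)
print(winner, len(attempts))
```

In this model, "ns1" is tried three times regardless of the failure type before "ns2" is contacted.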
(cherry picked from commit abfd0d363f)
When the `send_done()` callback function gets called with a failure
result code, DiG erroneously cancels the lookup.
Stop canceling the lookup and give DiG a chance to retry the failed
query, or fail over to another server, using the logic implemented in
the `recv_done()` callback function.
(cherry picked from commit c2329dd110)
When the `udp_ready()` callback function gets called with a failure
result code, DiG erroneously cancels the lookup.
Copy the logic behind the `tcp_connected()` callback function into
`udp_ready()` so that DiG now retries the failed query (if retries
are enabled) and then, if it fails again, fails over to the next
server in the list. This synchronizes the behavior between TCP and UDP
modes.
Also, `udp_ready()` was calling `lookup_detach()` without calling
`lookup_attach()` first, but the issue was masked behind the fact
that `clear_current_lookup()` wasn't being called when needed, and
`lookup_detach()` was compensating for that. This has also been fixed.
(cherry picked from commit 3f31085525)
SphinxDirective does not exist in Sphinx 1.6.7. Luckily, we do not rely
on any of its functionality, so replace it with the docutils Directive.
The transform_content() callback was only added in Sphinx 3.0.0.
Detect if it was not called and call it manually.
The transform_content() function requires access to the inner
"contentnode", which is created inside run(). This workaround relies on
the order of nodes as it was in the pre-3.0.0 versions, but that should
not matter as newer versions will not trigger the workaround.
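The detection pattern can be sketched without importing Sphinx. `OldBase` and `NewBase` below are stand-ins invented for this illustration (they are not Sphinx classes), and plain lists stand in for docutils nodes; the assumption carried over from the text is that the content node is the last child of the last node returned by run().

```python
class OldBase:
    """Stand-in for a pre-3.0.0 base class: run() never calls transform_content()."""

    def run(self):
        signature = ["signature"]
        contentnode = ["content"]
        desc = [signature, contentnode]  # contentnode is the last child
        return ["index", desc]           # the description node comes last


class NewBase(OldBase):
    """Stand-in for a 3.0.0+ base class: run() calls transform_content() itself."""

    def run(self):
        nodes = super().run()
        self.transform_content(nodes[-1][-1])
        return nodes

    def transform_content(self, contentnode):
        pass


class WorkaroundMixin:
    def run(self):
        self._transform_called = False
        nodes = super().run()
        if not self._transform_called:
            # Old base class: locate contentnode by position and call manually.
            self.transform_content(nodes[-1][-1])
        return nodes

    def transform_content(self, contentnode):
        self._transform_called = True
        contentnode.append("extra")


class OldDirective(WorkaroundMixin, OldBase):
    pass


class NewDirective(WorkaroundMixin, NewBase):
    pass


print(OldDirective().run()[-1][-1] == NewDirective().run()[-1][-1])
```

Either way the content is transformed exactly once, which is the property the workaround is meant to preserve.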
(cherry picked from commit 8796ad7fe8)
Ancient versions of docutils cannot cope with bare text inside a table
cell. Wrap text in a paragraph to work around that.
(cherry picked from commit af5bbb433a)
Since !6413 we discourage opt-out, so we should not be advertising it in
the examples. Even worse, it was just thrown into the command line
without even mentioning its meaning in the surrounding text.
Related: !6413
(cherry picked from commit beae857288)
Running a respdiff test for every merge request would be useful for
catching protocol-breaking changes before they are applied to the source
code. However, the existing respdiff-based tests take a while to
complete (about half an hour with our current CI infrastructure), which
does not make them a good fit for this purpose. Add a new GitLab CI
job, "respdiff-short", which uses a smaller query set that gets
processed within a couple of minutes on our current CI infrastructure.
Rename the existing respdiff-based jobs to make distinguishing them
easier.
(cherry picked from commit 31ee43a314)
Ensure the common parts of all jobs using respdiff are available in the
form of a reusable YAML anchor, to reduce code duplication and to
simplify adding more respdiff-based jobs to GitLab CI.
(cherry picked from commit ca20a189f7)
The "respdiff" GitLab CI job compares DNS responses produced by the
current version of named with those produced by a reference version.
The latter is built from source in each "respdiff" job, despite the fact
that the reference version changes very rarely. Use a pre-built named
executable as the reference version instead, assuming it is available in
the OS image used for "respdiff" tests.
(cherry picked from commit ab90a4705a)
The BUFSIZ value varies between platforms: it could be 8K on Linux and
512 bytes on mingw. Make sure the buffers are always big enough for the
output data, enlarging or explicitly sizing them as appropriate, to
prevent truncation of the output.
(cherry picked from commit b19d932262)
Remove "external" from the list of legal values for the -M command-line
option as it has not been allowed since the internal memory allocator
was removed by commit 55ace5d3aa.
Make the style of the relevant paragraph more in line with the next one
and split its contents up into an unordered list of options for improved
readability.
(cherry picked from commit f0c31ceb3b)
When a thread calls dns_dispatch_connect() on an unconnected TCP socket
it sets `tcpstate` from `DNS_DISPATCHSTATE_NONE` to `_CONNECTING`.
Previously, it then INSISTed that there were no pending connections
before calling isc_nm_tcpdnsconnect().
If a second thread called dns_dispatch_connect() during that window
of time, it could add a pending connection to the list, and trigger
an assertion failure.
This commit removes the INSIST since the condition is actually
harmless.
(cherry picked from commit 25ddec8a0a)
The statistics system test sends a query for foo.info to check for
pending connections, because ans4 does not respond to that query.
Depending on exact timing, this might or might not increment the failed
TCP connection counter when the query is retried over TCP, because ans4
does not listen on TCP.
Wait for 'connection refused' to appear in the ns3 log file so that
exactly one failed TCP connection can be counted.
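Waiting for a log message before reading a counter is a generic polling pattern. A minimal Python sketch follows; the helper name `wait_for_line` and its `tries`/`delay` parameters are hypothetical and do not correspond to the helpers used by the actual system test.

```python
import time


def wait_for_line(path: str, needle: str, tries: int = 10, delay: float = 1.0) -> bool:
    """Poll a log file until a line containing `needle` appears."""
    for _ in range(tries):
        try:
            with open(path, encoding="utf-8", errors="replace") as log:
                if any(needle in line for line in log):
                    return True
        except FileNotFoundError:
            pass  # the log file may not have been created yet
        time.sleep(delay)
    return False
```

Only once this returns True is it safe to assert an exact value for the failed-connection counter.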
(cherry picked from commit 0227d82dc8)
The STATID_CONNECT and STATID_CONNECTFAIL statistics were used
incorrectly. STATID_CONNECT was incremented twice (once in
*_connect_direct() and once in the callback) and STATID_CONNECTFAIL
was not incremented at all if the failure happened in the callback.
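One way to avoid both problems is to update the counters in exactly one place. The Python sketch below illustrates that idea only; the class and method names are hypothetical, and where the actual fix updates the counters is an assumption not confirmed by this message.

```python
from collections import Counter


class ConnectStats:
    """Count every connection attempt exactly once, in the completion callback."""

    def __init__(self):
        self.counters = Counter()

    def connect_direct(self, start_ok: bool) -> bool:
        # Starting a connection never touches the counters itself; an
        # immediate failure is reported through the same callback.
        if not start_ok:
            self.connect_cb(False)
            return False
        return True

    def connect_cb(self, ok: bool) -> None:
        self.counters["connect" if ok else "connectfail"] += 1


stats = ConnectStats()
if stats.connect_direct(True):
    stats.connect_cb(True)    # connection established
if stats.connect_direct(True):
    stats.connect_cb(False)   # failure detected in the callback
stats.connect_direct(False)   # failure detected when starting
print(dict(stats.counters))
```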
Closes: #3452
(cherry picked from commit 59e1703b50)
On FreeBSD (and perhaps other *BSD) systems, the TCP connect() call (via
uv_tcp_connect()) can fail with a transient UV_EADDRINUSE error. The UDP
code already handles this by trying three times (third time's a charm)
before giving up. Add code to the TCP, TCPDNS and TLSDNS layers to also
try three times before giving up, by calling uv_tcp_connect() from the
callback two more times on a UV_EADDRINUSE error.
Additionally, stop the timer only if we succeed or on hard error via
isc__nm_failed_connect_cb().
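The retry rule can be illustrated with a small Python model. It uses `errno.EADDRINUSE` in place of libuv's UV_EADDRINUSE, and `connect_with_retries` and `try_connect` are hypothetical names; the real code re-issues uv_tcp_connect() from the connect callback rather than looping.

```python
import errno
from typing import Callable, Tuple

MAX_CONNECT_TRIES = 3


def connect_with_retries(try_connect: Callable[[], int]) -> Tuple[int, int]:
    """Return (error code, attempts used); 0 means the connection succeeded."""
    err = errno.EADDRINUSE
    for attempt in range(1, MAX_CONNECT_TRIES + 1):
        err = try_connect()
        if err != errno.EADDRINUSE:
            # Success (0) or a hard error: stop retrying immediately.
            return err, attempt
        # Transient EADDRINUSE: try again unless the attempts are used up.
    return err, MAX_CONNECT_TRIES


results = iter([errno.EADDRINUSE, errno.EADDRINUSE, 0])
print(connect_with_retries(lambda: next(results)))
```

Any error other than the transient one ends the loop on the first attempt, which mirrors the "hard error" case in the text.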
(cherry picked from commit b21f507c0a)
free_namelist could be passed names with associated rdatasets
when handling errors. These need to be disassociated before
calling dns_message_puttemprdataset.
(cherry picked from commit 745d5edc3a)
Some zones were not being logged when only DNSSEC keys were being
generated in the system test setup phase. Add logging for these zones.
(cherry picked from commit 04627997eb)
The tests/libtest directory is missing from the .dir-locals.el, so the
emacs flycheck would not work for the unit tests. Add it to the
configuration.
(cherry picked from commit 80fbd849d5)
Commit 7b2ea97e46 introduced a logic bug
in resume_dslookup(): that function now only conditionally checks
whether DS chasing can still make progress. Specifically, that check is
only performed when the previous resume_dslookup() call invokes
dns_resolver_createfetch() with the 'nameservers' argument set to
something other than NULL, which may not always be the case. Failing to
perform that check may trigger assertion failures as a result of
dns_resolver_createfetch() attempting to resolve an invalid name.
An example scenario that leads to such an outcome:
1. A validating resolver is configured to forward all queries to
another resolver. The latter returns broken DS responses that
trigger DS chasing.
2. rctx_chaseds() calls dns_resolver_createfetch() with the
'nameservers' argument set to NULL.
3. The fetch fails, so resume_dslookup() is called. Due to
fevent->result being set to e.g. DNS_R_SERVFAIL, the default branch
is taken in the switch statement.
4. Since 'nameservers' was set to NULL for the fetch which caused the
resume_dslookup() callback to be invoked
(fctx->nsfetch->private->nameservers), resume_dslookup() chops
one label off fctx->nsname and calls dns_resolver_createfetch()
again, for a name containing one label less than before.
5. Steps 3-4 are repeated (i.e. all attempts to find the name servers
authoritative for the DS RRset being chased fail) until fctx->nsname
becomes stripped down to the root name.
6. Since resume_dslookup() does not check whether DS chasing can still
make progress, it strips a label off the root name and continues
its attempts at finding the name servers authoritative for the DS
RRset being chased, passing an invalid name to
dns_resolver_createfetch().
Fix by ensuring resume_dslookup() always checks whether DS chasing can
still make progress when a name server fetch fails. Update code
comments to ensure the purpose of the relevant dns_name_equal() check is
clear.
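The effect of always performing the progress check can be shown with a simplified Python model in which every name server fetch fails. Names are plain strings here, `chase_parents` is a hypothetical helper, and using the root name as the stopping point is an assumption made for this sketch; the actual code compares names with dns_name_equal().

```python
from typing import List


def chase_parents(name: str) -> List[str]:
    """List the names tried when every fetch fails, stopping at the root."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    tried = []
    while True:
        current = ".".join(labels) + "." if labels else "."
        tried.append(current)
        if current == ".":
            # No progress is possible: there is no label left to strip,
            # so give up instead of building an invalid name.
            break
        labels = labels[1:]  # chop off the leftmost label and try again
    return tried


print(chase_parents("a.example.com."))
```

Without the check inside the loop, the model would try to strip a label from the root name, which corresponds to step 6 above.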
(cherry picked from commit 1a79aeab44)
There should be 2 keys with the same key id after the numerically
lower one is revoked (serial space arithmetic). The DS points
at the non-revoked key so validation should still succeed.
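For background, the key id (key tag) is computed over the DNSKEY RDATA, which includes the flags field, so setting the REVOKE bit (0x0080) changes it, typically by 128 modulo 65536. The sketch below implements the key tag algorithm from RFC 4034 Appendix B (for algorithms other than 1); the eight-byte public key is made-up sample data, not a real key.

```python
import struct

REVOKE = 0x0080  # REVOKE flag bit from RFC 5011


def key_tag(flags: int, protocol: int, algorithm: int, public_key: bytes) -> int:
    """Compute the key tag over DNSKEY RDATA as in RFC 4034, Appendix B."""
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    acc = 0
    for i, byte in enumerate(rdata):
        # Even offsets contribute the high octet, odd offsets the low octet.
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


sample_key = bytes(range(8))  # made-up key material for illustration only
before = key_tag(257, 3, 13, sample_key)
after = key_tag(257 | REVOKE, 3, 13, sample_key)
print(before, after, (after - before) % 65536)
```

So if two keys have ids that differ by 128, revoking the numerically lower one can give both the same id, which is the situation the test sets up.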
(cherry picked from commit 513cb24b55)
Messages indicating the reason for a fallback to AXFR (i.e., because
the requested serial number is not present in the journal, or because
the size of the IXFR response would exceed "max-ixfr-ratio") are now
logged at level info instead of debug(4).
(cherry picked from commit df1d81cf96)
When dnssec-policy is used, and the zone is not dynamic, BIND will
assume that the zone is inline-signed. However, the function responsible
for this did not inherit the dnssec-policy option from the view or
options level, and thus never enabled inline-signing for such zones
even though it should have.
Fix this by taking the inherited dnssec-policy value into account.
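The inheritance rule can be sketched as a lookup that falls back from zone to view to options. This is a simplified Python model with dictionaries standing in for parsed configuration; the function names are hypothetical, and treating a policy of "none" as "no policy" is an assumption of the sketch.

```python
from typing import Optional


def effective_option(name: str, zone: dict, view: dict, options: dict) -> Optional[str]:
    """Return the most specific setting: zone first, then view, then options."""
    for scope in (zone, view, options):
        if name in scope:
            return scope[name]
    return None


def implies_inline_signing(zone: dict, view: dict, options: dict, dynamic: bool) -> bool:
    policy = effective_option("dnssec-policy", zone, view, options)
    # Assumption: a missing policy or "none" means no dnssec-policy is in use.
    return policy is not None and policy != "none" and not dynamic


# The policy is set only at the view level, and the zone is not dynamic.
print(implies_inline_signing({}, {"dnssec-policy": "default"}, {}, dynamic=False))
```

Looking only at the zone dictionary would return False for this example, which corresponds to the bug described above.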
(cherry picked from commit 576b21b168)
When dnssec-policy is used, and the zone is not dynamic, BIND will
assume that the zone is inline-signed. Add test cases to verify this.
(cherry picked from commit efa8a4e88d)
Fix a comment, ensuring the right parameters are used (zone is
parameter $3, not $2) and add view and policy parameters to the comment.
Fix the view tests and test the correct view (example3 instead of
example2).
Fix the placement of "n=$((n+1))" for two test cases.
(cherry picked from commit ff65f07779)
Before this change the TLS code would ignore the accept callback result,
and would not try to gracefully close the connection. This had not been
noticed, as it is not really required for DoH. Now the code tries to
shut down the TLS connection gracefully when accepting it is not
successful.
(cherry picked from commit ffcb54211e)