The dig commands appear to be failing unexpectedly on some platforms
when rate limiting kicks in and the response is dropped. Correct
behaviour should be for dig to retry the query. Set +qr and capture
stdout and stderr of each of the dig commands involved.
(cherry picked from commit 614cf5a030)
3034 next = ISC_LIST_NEXT(query, link);
3035 } else {
3036 next = NULL;
3037 }
CID 352554 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking connectquery suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
3038 if (connectquery != NULL) {
3039 query_detach(&connectquery);
3040 }
(cherry picked from commit 30f3d51368)
In '_check_apex_dnskey' we check for each key (KEY1 to KEY4) if they
are present in the DNSKEY RRset if they should be.
However, we only grep the dig output for the first seven fields (owner,
ttl, class, type, flags, protocol, algorithm). This can be the same
for different keys.
For example, KEY1 may be KSK predecessor and KEY2 a KSK successor,
both DNSKEY records for these keys are the same up to the public key
field. This can cause test failures if KEY1 needs to be present, but
KEY2 not, because when grepping for KEY2 we will falsely detect the
key to be present (because the grep matches KEY1).
Fix the function by grepping looking for the first seven fields in the
corresponding key file and retrieve the public key part. Grep for this
in the dig output.
(cherry picked from commit 3e1d09ac66)
It might be useful to display built-in configuration with all its
values. It should make it easier to test what default values has changed
in a new release.
Related: #1326
(cherry picked from commit cf722d18b3)
- var_decl: Declaring variable "tbuf" without initializer
- assign: Assigning: "target.base" = "tbuf", which points to
uninitialized data
- assign: Assigning: "r.base" = "target.base", which points to
uninitialized data
I expect it would correctly initialize length always. Add simple
initialization to silent coverity.
(cherry picked from commit 59132bd3ec)
Parser ensures new-zones-directory has qstring parameter before it can
reach this place. dir == NULL then should never happen on any
configuration. Replace silent check with insist.
(cherry picked from commit 0a7d04367a)
The DNS catalog zones draft version 5 document requires that catalog
zones consumers must reset the member zone's internal zone state when
its unique label changes (either within the same catalog zone or
during change of ownership performed using the "coo" property).
BIND already behaves like that, and, in fact, doesn't support keeping
the zone state during change of ownership even if the unique label
has been kept the same, because BIND always removes the member zone
and adds it back during unique label renaming or change of ownership.
Document the described behavior and add a log message to inform when
unique label renaming occurs.
Add a system test case with unique label renaming.
(cherry picked from commit 2f2e02ff0c)
There is already a check for the missing version property case
(catalog-bad1.example), and this new test should result in the same
outcome, but differs in a way that there exists a version record in the
zone, but it is of a wrong type (A instead of the expected TXT).
(cherry picked from commit 5bfe655835)
According to DNS catalog zones draft version 5 document, the CLASS field
of every RR in a catalog zone MUST be IN.
Add a new check in the catz system test to verify that a non-IN class
catalog zone (in this case CH) fails to load.
BIND does not support having a non-IN class RR in an IN class zone, or
non-IN class zone in an IN class view, so to verify that BIND respects
the mentioned restriction we must try to add a non-IN class catalog
zone and check that it didn't succeed.
The `named` configuration files had to be restructured to put all the
zones inside views, which also resulted in some corresponding changes
in the tests.sh script.
(cherry picked from commit 247ae534a0)
When parsing the configuration file, log a warning message in
configure_view() function when encountering a `catalog-zones`
option in a view with non-IN rdata class.
(cherry picked from commit dfd5a01eba)
The DNS catalog zones draft version 5 document describes various
situations when a catalog zones must be considered as "broken" and
not be processed.
Implement those checks in catz.c and add corresponding system tests.
(cherry picked from commit a8228d5f19)
There was a query_detach() call missing in dig, which could lead to
dig hanging on TLS context creation errors. This commit fixes.
The error was introduced because the Strict TLS implementation was
initially made over an older version of the code, where this extra
query_detach() call was not needed.
This commit extends the 'doth' system test with a set of Strict/Mutual
TLS related checks.
This commit also makes each doth NS instance use its own TLS
certificate that includes FQDN, IPv4, and IPv6 addresses, issued using
a common Certificate Authority, instead of ad-hoc certs.
Extend servers initialisation timeout to 60 seconds to improve the
tests stability in the CI as certain configurations could fail to
initialise on time under load.
This commit adds support for Strict/Mutual TLS into BIND. It does so
by implementing the backing code for 'hostname' and 'ca-file' options
of the 'tls' statement. The commit also updates the documentation
accordingly.
This commit adds support for Strict/Mutual TLS to dig.
The new command-line options and their behaviour are modelled after
kdig (+tls-ca, +tls-hostname, +tls-certfile, +tls-keyfile) for
compatibility reasons. That is, using +tls-* is sufficient to enable
DoT in dig, implying +tls-ca
If there is no other DNS transport specified via command-line,
specifying any of +tls-* options makes dig use DoT. In this case, its
behaviour is the same as if +tls-ca is specified: that is, the remote
peer's certificate is verified using the platform-specific
intermediate CA certificates store. This behaviour is introduced for
compatibility with kdig.
Add DNS extended errors 3 (Stale Answer) and 19 (Stale NXDOMAIN Answer)
to responses. Add extra text with the reason why the stale answer was
returned.
To test, we need to change the configuration such that for the first
set of tests the stale-refresh-time window does not interfer with the
expected extended errors.
(cherry picked from commit c66b9abc0b)
Man pages for dig/mdig/delv used `.. option:: +[no]bla` to describe two
options at once, and very old Sphinx does not support that [] in option
names.
Solution is to split negative and positive options into `+bla, +nobla`
form. In the end it improves readability because it transforms hard to
read strings with double brackets from
`+[no]subnet=addr[/prefix-length]` to
`+subnet=addr[/prefix-length], +nosubnet`.
As a side-effect it also allows easier linking to dig/mdig/delv options
using their name directly instead of always overriding the link target
to `+[no]bla` form.
Transformation was done using regex:
s/:: +\[no\]\(.*\)/:: +\1, +no\1
... and manual review around occurences matching regex
+no.*=
Fixes: #3301
(cherry picked from commit 0342dddce7)
Sphinx "standard domain" provides directive types ".. program::" and
".. option::" to create link anchor for a program name + option combination.
These can be referenced using :ref:`program option` syntax.
The problem is that Sphinx 1.8.5 (e.g. in Ubuntu 18.04) generates
conflicting link targets if a page contains two option directives
starting with the same word, e.g.:
.. program:: dnssec-settime
.. option:: -P date
.. option:: -P ds date
The reason is that option directive consumes only first word as "option
name" (-P) and all the rest is considered "option argument" (date, ds
date). Newer versions of Sphinx (e.g. 4.5.0) handle this by creating
numbered link anchors, but older versions warn and BIND build system
turns the warning into a hard error.
To handle that we use method recommended by Sphinx maintainer:
https://github.com/sphinx-doc/sphinx/issues/10218#issuecomment-1059925508
As a bonus it provides more accurate link anchors for sub-options.
Alternatives considered:
- Replacing standard domain definition of .. option - causes more
problems, see BIND issue #3294.
- Removing hyperlinks for options - that would be a step back.
Fixes: #3295
(cherry picked from commit bbb24264bb)
PyLint 2.13.7 reports the following error:
bin/tests/system/doth/conftest.py:34:28: E0601: Using variable 'stderr' before assignment (used-before-assignment)
The reason the current code has not caused problems before is that
invoking gnutls-cli with just the --logfile=/dev/null argument causes it
to always return with a non-zero exit code, either due to the option not
being supported or due to the hostname argument not being provided. In
other words, the 'except' branch has always been taken. PyLint is
obviously right on a syntactical level, though.
Instead of relying on a less than obvious code flow (where the 'except'
branch is always taken), rework the flagged code by employing
subprocess.run(..., check=False) instead of subprocess.check_output(),
making exception handling redundant.
While this issue was investigated, it was also noticed that
subprocess.check_output() was incorrectly used as a context manager:
Popen objects are context managers, but subprocess.check_output() and
subprocess.run() are not. Fix by dropping the relevant 'with'
statement.
(cherry picked from commit 3f5318f094)
Commit f64cd23e7b added a Python-based
name server (bin/tests/system/digdelv/ans8/ans.py) to the "digdelv"
system test, but did not update bin/tests/system/Makefile.am to ensure
Python is present in the test environment before the "digdelv" system
test is run. Update bin/tests/system/Makefile.am to enforce that
requirement.
(cherry picked from commit aaa0223752)
Fix another occurrence of the mistake of passing a regex to
wait_for_log by using the new wait_for_log_re instead.
(cherry picked from commit f4c2909353)
There were two problems in the notify system test when it waited for
log messages to appear: the shellcheck refactoring introduced a call
to `wait_for_log` with a regex, but `wait_for_log` only supports fixed
strings, so it always ran for the full 45 second timeout; and the new
test to ensure that notify messages time out failed to reset the
nextpart pointer, so if the notify messages timed out before the test
ran, it would fail to see them.
This change adds a `wait_for_log_re` helper that matches a regex, and
uses it where appropriate in the notify system test, which stops the
test from waiting longer than necessary; and it resets the nextpart
pointer so that the notify timeout test works reliably.
Closes#3275
(cherry picked from commit 4a30733ae5)
Prime the cache with a negative cache DS entry then make a query for
name beneath that entry. This will cause the DS entry to be retieved
as part of the validation process. Each RRset in the ncache entry
will be validated and the trust level for each will be updated.
(cherry picked from commit d2d9910da2)
dig previously set an exit code of 9 when a TCP connection failed
or when a UDP connection timed out, but when the server address is
localhost it's possible for a UDP query to fail with ISC_R_CONNREFUSED.
that code path didn't update the exit code, causing dig to exit with
status 0. we now set the exit code to 9 in this failure case.
(cherry picked from commit 4eee6460ff)
Catalog zones change of ownership is special mechanism to facilitate
controlled migration of a member zone from one catalog to another.
It is implemented using catalog zones property named "coo" and is
documented in DNS catalog zones draft version 5 document.
Implement the feature using a new hash table in the catalog zone
structure, which holds the added "coo" properties for the catalog zone
(containing the target catalog zone's name), and the key for the hash
table being the member zone's name for which the "coo" property is being
created.
Change some log messages to have consistent zone name quoting types.
Update the ARM with change of ownership documentation and usage
examples.
Add tests which check newly the added features.
(cherry picked from commit bb837db4ee)
According to DNS catalog zones draft version 5 document, catalog
zone custom properties must be placed under the "ext" label.
Make necessary changes to support the new custom properties syntax in
catalog zones with version "2" of the schema.
Change the default catalog zones schema version from "1" to "2" in
ARM to prepare for the new features and changes which come starting
from this commit in order to support the latest DNS catalog zones draft
document.
Make some restructuring in ARM and rename the term catalog zone "option"
to "custom property" to better reflect the terms used in the draft.
Change the version of 'catalog1.zone.' catalog zone in the "catz" system
test to "2", and leave the version of 'catalog2.zone.' catalog zone at
version "1" to test both versions.
Add tests to check that the new syntax works only with the new schema
version, and that the old syntax works only with the legacy schema
version catalog zones.
(cherry picked from commit cedfebc64a)
when a query was canceled while still in the process of connecting,
tcp_connected() and udp_ready() didn't detach the query object.
(cherry picked from commit 6bf8535542)
In `+nssearch` mode `dig` starts the next query of the followup lookup
using `start_udp()` or `start_tcp()` calls without waiting for the
previous query to complete.
In UDP mode that happens in the `send_done()` callback of the previous
query, but in TCP mode that happens in the `start_tcp()` call of the
previous query (recursion) which doesn't work because `start_tcp()`
attaches the `lookup->current_query` to the query it is starting, so a
recursive call will result in an assertion failure.
Make the TCP mode to start the next query in `send_done()`, just like in
the UDP mode. During that time the `lookup->current_query` is already
detached by the `tcp_connected()`/`udp_ready()` callbacks.
(cherry picked from commit b944bf4120)
Add a test case for a dynamically added CDS DELETE record and make
sure it is not removed when signing the zone. This happens because
BIND maintains CDS and CDNSKEY publishing and it will only allow
CDS DELETE records if the zone is transitioning to insecure. This is
a state that can be identified when using KASP through 'dnssec-policy',
but not when using 'auto-dnssec'.
(cherry picked from commit f08277f9fb)
Commit 3b3495a631 added a Python-based
name server (bin/tests/system/forward/ans11/ans.py) to the "forward"
system test, but did not update bin/tests/system/Makefile.am to ensure
Python is present in the test environment before the "forward" system
test is run. Update bin/tests/system/Makefile.am to enforce that
requirement.
(cherry picked from commit 806f457147)
Implement TCP support in the `ans11` Python-based DNS server.
Implement a control command channel in `ans11` to support an optional
silent mode of operation, which, when enabled, will ignore incoming
queries.
In the added check, make the `ans11` the NS server of
"a.root-servers.nil." for `ns3`, so it uses `ans11` (in silent mode)
for the regular (non-forwarded) name resolutions.
This will trigger the "hung fetch" scenario, which was causing `named`
to crash.
(cherry picked from commit 848094d6f7)
- Check that an NS in an authority section returned from a forwarder
which is above the name in a configured "forward first" or "forward
only" zone (i.e., net/NS in a response from a forwarder configured for
local.net) is not cached.
- Test that a DNAME for a parent domain will not be cached when sent
in a response from a forwarder configured to answer for a child.
- Check that glue is rejected if its name falls below that of zone
configured locally.
- Check that an extra out-of-bailiwick data in the answer section is
not cached (this was already working correctly, but was not explicitly
tested before).
(cherry picked from commit bf3fffff67)
Add a test case to check for lingering TCP sockets stuck in the
CLOSE_WAIT state. This can happen if a client sends some garbage after
its first query.
The system test runs the reproducer script and then sends another TCP
query to the resolver. The resolver is configured to allow one TCP
client only. If BIND has its TCP socket stuck in CLOSE_WAIT, it does
not have the resources available to answer the second query.
Note: A better test would be to check if the named daemon does not
have a TCP socket stuck in CLOSE_WAIT for example with netstat. When
running this test locally you can examine named with netstat manually.
But since netstat is platform specific it is not a good candidate to do
this as a system test.
If you, if you could return, don't let it burn.
Do you have to let it linger?
- Cranberries
(cherry picked from commit b9ebde705b)
This allows Gitlab to show nice summary for individual tests/test
directories and to expose the results in Gitlab API for consumption
elsewhere.
A catch: As of Gitlab 14.7.7, the detailed results are stored
only in artifacts and thus expire. All consumers (including API) need
to be "fast enough" to get the data before they disappear.
This also forces us to always store the artifacts intead of storing them
only on failure.
(cherry picked from commit d26d4f289f)
There are a couple of problems with dns_request_createvia(): a UDP
retry count of zero means unlimited retries (it should mean no
retries), and the overall request timeout is not enforced. The
combination of these bugs means that requests can be retried forever.
This change alters calls to dns_request_createvia() to avoid the
infinite retry bug by providing an explicit retry count. Previously,
the calls specified infinite retries and relied on the limit implied
by the overall request timeout and the UDP timeout (which did not work
because the overall timeout is not enforced). The `udpretries`
argument is also changed to be the number of retries; previously, zero
was interpreted as infinity because of an underflow to UINT_MAX, which
appeared to be a mistake. And `mdig` is updated to match the change in
retry accounting.
The bug could be triggered by zone maintenance queries, including
NOTIFY messages, DS parental checks, refresh SOA queries and stub zone
nameserver lookups. It could also occur with `nsupdate -r 0`.
(But `mdig` had its own code to avoid the bug.)
(cherry picked from commit 71ce8b0a51)
After some back and forth, it was decidede to match the configuration
option with unbound ("so-reuseport"), PowerDNS ("reuseport") and/or
nginx ("reuseport").
(cherry picked from commit 7e71c4d0cc)
The error code path handling the `ISC_R_CANCELED` code lacks a
`clear_current_lookup()` call, without which dig hangs indefinitely
when handling the error.
Add the missing call to account for all references of the lookup so
it can be destroyed.
(cherry picked from commit 2771a5b64d)
In `send_udp()` and `launch_next_query()` functions, when calling
`dighost_printmessage()` to print detailed information about the
sent query, dig always prints the data of the first query in the
lookup's queries list.
The first query in the list can be already finished, having its handles
freed, and accessing this information results in assertion failure.
Print the current query's information instead.
(cherry picked from commit f831e758d1)