We might have some leftover data in bio buffer that we received
before pausing, we need to do tls_do_bio to make sure we call
the recv() callback on it.
- add isc_nm_tlsdnsconnect() function
- add "+[no]tls" option to dig to enable TLS mode
- override the default port number in dig from 53 to 853 when using TLS
Add an optional SSL_CTX argument to isc_nm_listentcpdns - if not NULL,
use isc_nm_listentls instead of isc_nm_listentcp to listen on a TLS
socket for DoT.
Add server-side TLS support to netmgr - that includes moving some of the isc_nm_
functions from tcp.c to a wrapper in netmgr.c calling a proper tcp or tls
function, and a new isc_nm_listentls function.
if more than 10 seconds pass while we wait for netmgr
events to finish running on shutdown, something is almost
certainly wrong and we should assert and crash.
this function sets up a UDP socket, connected to a specified peer
address, then immediately calls a callback with a handle so that
the caller can begin sending.
Resolve "[CVE-2020-8623] A flaw in native PKCS#11 code can lead to a remotely triggerable assertion failure in pk11.c"
See merge request isc-projects/bind9!4037
It was discovered, that some systems might set EPROTO instead of EACCESS
on recvmsg() call causing spurious syslog messages from the socket
code. This commit returns soft handling of EPROTO errno code to the
socket code. [GL #1928]
When calculating the new hashtable bitsize, there was an off-by-one
error that would allow the new bitsize to be larger than maximum allowed
causing assertion failure in the rehash() function.
Printing test-suite.log on system test failure does not work for system
test run from tarball because the "after_script" step does not honour
directory change from the "before_script" step and fails with:
Running after script...
$ cat bin/tests/system/test-suite.log
cat: bin/tests/system/test-suite.log: No such file or directory
The rbtdb version glue_table has been refactored similarly to rbt.c hash
table, so it does use 32-bit hash function return values and apply
Fibonacci Hashing to lookup the index to the hash table instead of
modulo. For more details, see the lib/dns/rbt.c commit log.
The non-minimized corpus from https://github.com/CZ-NIC/dns-fuzzing was
used as input to afl-cmin, then every case were processed by afl-tmin
and then afl-cmin was used to further minimize the corpus again.
Previously, the bin/system/wire_test.c was optionally used as a fuzzer,
this commit extracts the parts relevant to the fuzzing into a
specialized fuzzer that can be used in oss-fuzz project.
The fuzzer parses the input as UDP DNS message, then prints parsed DNS
message, then renders the DNS message and then prints the rendered DNS
message. No part of the code should cause a assertion failure.
Shifting (signed) integer left could trigger undefined behaviour when
the shifted value would overflow into the sign bit (e.g. 2048).
The issue was found when using AFL++ and UBSAN:
message.c:2274:33: runtime error: left shift of 2048 by 20 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior message.c:2274:33 in
sockaddr.c:147:49: error: pointer targets in passing argument 2 of ‘isc__buffer_putmem’ differ in signedness
rdata.c:1780:30: error: pointer targets in passing argument 2 of ‘isc__buffer_putmem’ differ in signedness
When updating source files, we might forget to update pre-generated
files (generated by sphinx-build and cfg_gen) and then the extra changes
would get included in the random merge request.
This commit updates the tarball-create job to enable the maintainer
mode, then clean all maintainer files (`make maintainer-clean`) rebuild
all the file from scratch and compare the result which must be a clean
git directory.
This commit updates and simplifies the checks for the readline support
in nslookup and nsupdate:
* Change the autoconf checks to pkg-config only, all supported
libraries have accompanying .pc files now.
* Add editline support in addition to libedit and GNU readline
* Add isc/readline.h shim header that defines dummy readline()
function when no readline library is available
* Disallow compression pointers in names as we are not
reading from a packet and as a result length checks fail.
* Increase totext buffer size as fuzzer ran out of space on
big bitmaps.
* NUL terminate totext to make fault diagnosis easier.
* Add debugging messages to make fault diagnosie easier.
base32_decode_char() added a extra zero octet to the output
if the fifth character was a pad character. The length
of octets to copy to the output was set to 3 instead of 2.
Prevent intermittent false positives on slow platforms by subtracting
the number of seconds which passed between key creation and invoking
'rndc dnssec -checkds'.
This particularly fails for the step3.csk-roll2.autosign zone because
the closest next key event is when the zone signatures become
omnipresent. Running 'rndc dnssec -checkds' some time later means
that the next key event is in fact closer than the calculated time
and thus we need to adjust the expected time by the time already
passed.
The --enable-fuzzing option now allows third choice "ossfuzz" that just
adds $LIB_FUZZING_ENGINE to FUZZ_LDFLAGS to make the fuzzer builds
compatible with OSS-Fuzz project that has some special quirks (the
main() routine is provided in the static library the project provides).
Previously, we have disallowed static linking (for good reasons).
However, there are legitimate reasons where static linking might be
useful, and one of the reasons is the OSS-Fuzz project that doesn't have
the libraries used for build, so static linking is the sane option here.
The static linking is still disallowed in the "production" builds, but
it's not possible to disable shared and enable static libraries when
used together with --enable-developer.
There was a copy&paste error in fuzz/isc_lex_getmastertoken.c where we
didn't really test the function we wanted to test. Update the test to
have the input data to always include expected 'tokentype' in the first
byte, `eol` argument in the second byte and the rest of the input is the
data to parse.
Previously .txt files with full backtrace may be identified as a
crashed test:
I:Core dumps were found for the following system tests:
I: core.19948-backtrace.txt
I: shutdown
Now .txt files are removed from the list.
Change 'run.sh.in' to match the core matching pattern in
'testsummary.sh'.
Resolve "readline/rltypedefs.h:35:22: error: this function declaration is not a prototype on NetBSD 9"
Closes#2045
See merge request isc-projects/bind9!3926
Hold a weak reference to the view so that it can't go away while
nta is performing its lookups. Cancel nta timers once all external
references to the view have gone to prevent them triggering new work.
The hash table rework MRs (!3865, !3871) increased the default RBT hash
table size from 64 to 65,536 entries (for 64-bit architectures, that is
512 bytes before vs. 524,288 bytes after). This works fine for RBTs
used for cache databases, but since three separate RBT databases are
created for every zone loaded (RRs, NSEC, NSEC3), memory usage would
skyrocket when BIND 9 is used as an authoritative DNS server with many
zones.
The default RBT hash table size before the rework was 64 entries, this
commit reduces it to 16 entries because our educated guess is that most
zones are just couple of entries (SOA, NS, A, AAAA, MX) and rehashing
small hash tables is actually cheap. The rework we did in the previous
MRs tries to avoid growing the hash tables for big-to-huge caches where
growing the hash table comes at a price because the whole cache needs to
be locked.
(cherry picked from commit 1e043a011b)
The hash table rework MRs (!3865, !3871) increased the default RBT hash
table size from 64 to 65,536 entries (for 64-bit architectures, that is
512 bytes before vs. 524,288 bytes after). This works fine for RBTs
used for cache databases, but since three separate RBT databases are
created for every zone loaded (RRs, NSEC, NSEC3), memory usage would
skyrocket when BIND 9 is used as an authoritative DNS server with many
zones.
The default RBT hash table size before the rework was 64 entries, this
commit reduces it to 16 entries because our educated guess is that most
zones are just couple of entries (SOA, NS, A, AAAA, MX) and rehashing
small hash tables is actually cheap. The rework we did in the previous
MRs tries to avoid growing the hash tables for big-to-huge caches where
growing the hash table comes at a price because the whole cache needs to
be locked.
Running "make recheck" after the test suite fails hides intermittent
system test failures in GitLab CI. This makes it hard to identify which
branches are affected by a particular test failure mode and causes CI
results to be overly optimistic. Prevent "make recheck" from being run
when "make check" fails to ensure GitLab CI results properly reflect the
stability of the "main" branch.
Make sure the 'checkds' command correctly sets the right key timing
metadata and also make sure that it rejects setting the key timing
metadata if there are multiple keys with the KSK role and no key
identifier is provided.
With 'checkds' replacing 'parent-registration-delay', the kasp
test needs the expected times to be adjusted. Also the system test
needs to call 'rndc dnssec -checkds' to progress the rollovers.
Since we pretend that the KSK is active as soon as the DS is
submitted (and parent registration delay is no longer applicable)
we can simplify the 'csk_rollover_predecessor_keytimes' function
to take only one "addtime" parameter.
This commit also slightly changes the 'check_dnssecstatus' function,
passing the zone as a parameter.
Don't strip off the final character when printing times in key files.
With the introduction of 'rndc dnssec -status' we introduced
'isc_stdtime_tostring()'. This changed in behavior such that it was no
longer needed to strip of the final '\n' of the string format
datetime. However, in 'printtime()' it still stripped the final
character.
Add a new 'rndc' command 'dnssec -checkds' that allows the user to
signal named that a new DS record has been seen published in the
parent, or that an existing DS record has been withdrawn from the
parent.
Upon the 'checkds' request, 'named' will write out the new state for
the key, updating the 'DSPublish' or 'DSRemoved' timing metadata.
This replaces the "parent-registration-delay" configuration option,
this was unreliable because it was purely time based (if the user
did not actually submit the new DS to the parent for example, this
could result in an invalid DNSSEC state).
Because we cannot rely on the parent registration delay for state
transition, we need to replace it with a different guard. Instead,
if a key wants its DS state to be moved to RUMOURED, the "DSPublish"
time must be set and must not be in the future. If a key wants its
DS state to be moved to UNRETENTIVE, the "DSRemoved" time must be set
and must not be in the future.
By default, with '-checkds' you set the time that the DS has been
published or withdrawn to now, but you can set a different time with
'-when'. If there is only one KSK for the zone, that key has its
DS state moved to RUMOURED. If there are multiple keys for the zone,
specify the right key with '-key'.
When pk11_numbits() is passed a user provided input that contains all
zeroes (via crafted DNS message), it would crash with assertion
failure. Fix that by properly handling such input.
QNAME minimization is normally disabled when forwarding. if, in the
course of processing a fetch, we switch back to normal recursion at
some point, we can't safely start minimizing because we may have
been left in an inconsistent state.
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.
This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
The only arm64 runner we have at our disposal is suffering from
intermittent connectivity issues which make it unusable for extended
periods of time. Remove arm64 jobs from GitLab CI until we manage to
set up an arm64 runner with more reliable connectivity.
The named configuration files used in the "geoip2" system test cause a
rather large number of views (6-8) to be set up in each tested named
instance. Each view has its own cache.
Commit e24bc324b4 caused the RBT hash
table to be pre-allocated to a size derived from "max-cache-size", so
that it never needs to be rehashed. The size of that hash table is not
expected to be significant enough to cause memory use issues in typical
conditions even for large "max-cache-size" settings.
However, these two factors combined can cause memory exhaustion issues
in GitLab CI, where we run multiple "instances" of the test suite in
parallel on the same runner, each test suite executes multiple system
tests concurrently, and each system test may potentially start multiple
named instances at the same time. In practice, this problem currently
only seems to be affecting the "geoip2" system test, which is failing
intermittently due to named instances used by that test getting killed
by oom-killer.
Prevent the "geoip2" system test from failing intermittently by setting
"max-cache-size" in named configuration files used in that test to a low
value in order to keep memory usage at bay even with a large number of
views configured.
Resolve "BIND ARM incorrectly documents the processing of forwarders (still has the pre 9.3.0 explanation)"
Closes#2030
See merge request isc-projects/bind9!3881
Now that the log message has been printed set the result code to
DNS_R_FORMERR. We don't do this via dns_result_torcode() as we
don't want upstream errors to produce FORMERR if that processing
end with DNS_R_BADTSIG.
When a received RRSet has TTL 0, they would be preserved for
serve-stale (default `max-stale-cache` is 12 hours) rather than expiring
them quickly from the cache database.
This commit makes sure the RRSet didn't have TTL 0 before marking the
entry in the database as "stale".
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).
This allows DNS operators to turn the serve-stale answers in an event of
large authoritative DNS outage. The caching of the stale answers needs
to be enabled before the outage happens or the feature would be
otherwise useless.
The negative consequence of the default setting is the inevitable
cache-bloat that happens for every and each DNS operator running named.
In this MR, a new configuration option `stale-cache-enable` is
introduced that allows the operators to selectively enable or disable
the serve-stale feature of BIND 9 based on their decision.
The newly introduced option has been disabled by default,
e.g. serve-stale is disabled in the default configuration and has to be
enabled if required.
In this commit, the simple fuzzing tests for the isc_lex_gettoken() and
isc_lex_getmastertoken() functions have been added.
As part of this commit, the initialization has been moved from fuzz.h
constructor/destructor to LLVMFuzzerInitialize() in each fuzz test. The
main.c of no-fuzzing and AFL modes have been modified to run the
LLVMFuzzerInitialize() at the start of the main() function mimicking
the libfuzzer mode of operation.
The fuzzing tests were temporarily disabled when the build system has been
converted to automake. This commit restores the functionality to run the
fuzzing tests as part of the `make check`. When the afl or libfuzzer
is enabled via ./configure, it uses a custom LOG_DRIVER (fuzz/<fuzzer.sh>).
Currently only libfuzzer.sh has been implemented that runs each fuzz
test for 5 seconds each.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.
Created isc_refcount_decrement_expect macro to test conditionally
the return value to ensure it is in expected range. Converted
unchecked isc_refcount_decrement to use isc_refcount_decrement_expect.
Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect.
When silencing the Coverity warning in remove_old_tsversions(), the code
was refactored to reduce the indentation levels and break down the long
code into individual functions. This improve fix for [GL #1989].
':!.gitlab-ci.yml' is a pathspec pattern used to limit paths in the "git
grep" command to all but the .gitlab-ci.yml file which includes the
checked word itself. This requires Git 2.13.
It seems that config.guess gets always created in source root, so for
that sake of out-of-tree system test, we should expect the file there
instead of where configure was run.
The $SYSTEMTESTTOP shell variable if often set to .. in various shell
scripts inside bin/tests/system/, but most of the time it is only
used one line later, while sourcing conf.sh. This hardly improves
code readability.
$SYSTEMTESTTOP is also used for the purpose of referencing
scripts/files living in bin/tests/system/, but given that the
variable is always set to a short, relative path, we can drop it and
replace all of its occurrences with the relative path without adversely
affecting code readability.
When named acting as a resolver connects to an authoritative server over
TCP, it sets the idle timeout for that connection to 20 seconds. This
fixed timeout was picked back when the default processing timeout for
each client query was hardcoded to 30 seconds. Commit
000a8970f8 made this processing timeout
configurable through "resolver-query-timeout" and decreased its default
value to 10 seconds, but the idle TCP timeout was not adjusted to
reflect that change. As a result, with the current defaults in effect,
a single hung TCP connection will consistently cause the resolution
process for a given query to time out.
Set the idle timeout for connected TCP sockets to half of the client
query processing timeout configured for a resolver. This allows named
to handle hung TCP connections more robustly and prevents the timeout
mismatch issue from resurfacing in the future if the default is ever
changed again.
when building without ISC_BUFFER_USEINLINE (which is the default on
Windows) an assertion failure could occur when setting up a new
isc_httpd_t object for the statistics channel.
Whenever an exact match is found by dns_rbt_findnode(),
the highest level node in the chain will not be put into
chain->levels[] array, but instead the chain->end
pointer will be adjusted to point to that node.
Suppose we have the following entries in a rpz zone:
example.com CNAME rpz-passthru.
*.example.com CNAME rpz-passthru.
A query for www.example.com would result in the
following chain object returned by dns_rbt_findnode():
chain->level_count = 2
chain->level_matches = 2
chain->levels[0] = .
chain->levels[1] = example.com
chain->levels[2] = NULL
chain->end = www
Since exact matches only care for testing rpz set bits,
we need to test for rpz wild bits through iterating the nodechain, and
that includes testing the rpz wild bits in the highest level node found.
In the case of an exact match, chain->levels[chain->level_matches]
will be NULL, to address that we must use chain->end as the start point,
then iterate over the remaining levels in the chain.
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, so until we fix the code
to reuse the OpenSSL contexts and keys we'll use our own implementation of
siphash instead of trying to integrate with OpenSSL.
There were several problems with rbt hashtable implementation:
1. Our internal hashing function returns uint64_t value, but it was
silently truncated to unsigned int in dns_name_hash() and
dns_name_fullhash() functions. As the SipHash 2-4 higher bits are
more random, we need to use the upper half of the return value.
2. The hashtable implementation in rbt.c was using modulo to pick the
slot number for the hash table. This has several problems because
modulo is: a) slow, b) oblivious to patterns in the input data. This
could lead to very uneven distribution of the hashed data in the
hashtable. Combined with the single-linked lists we use, it could
really hog-down the lookup and removal of the nodes from the rbt
tree[a]. The Fibonacci Hashing is much better fit for the hashtable
function here. For longer description, read "Fibonacci Hashing: The
Optimization that the World Forgot"[b] or just look at the Linux
kernel. Also this will make Diego very happy :).
3. The hashtable would rehash every time the number of nodes in the rbt
tree would exceed 3 * (hashtable size). The overcommit will make the
uneven distribution in the hashtable even worse, but the main problem
lies in the rehashing - every time the database grows beyond the
limit, each subsequent rehashing will be much slower. The mitigation
here is letting the rbt know how big the cache can grown and
pre-allocate the hashtable to be big enough to actually never need to
rehash. This will consume more memory at the start, but since the
size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
will only consume maximum of 32GB of memory for hashtable in the
worst case (and max-cache-size would need to be set to more than
4TB). Calling the dns_db_adjusthashsize() will also cap the maximum
size of the hashtable to the pre-computed number of bits, so it won't
try to consume more gigabytes of memory than available for the
database.
FIXME: What is the average size of the rbt node that gets hashed? I
chose the pagesize (4k) as initial value to precompute the size of
the hashtable, but the value is based on feeling and not any real
data.
For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.
Notes:
a. A doubly linked list should be used here to speedup the removal of
the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
Make sure bin/tests/system/run.sh returns a non-zero exit code if any of
the following happens:
- the test being run produces a core dump,
- assertion failures are found in the test's logs,
- ThreadSanitizer reports are found after the test completes,
- the servers started by the test fail to shut down cleanly.
This change is necessary to always fail a test in such cases (before the
migration to Automake, test failures were determined based on the
presence of "R:<test-name>:FAIL" lines in the test suite output and thus
it was not necessary for bin/tests/system/run.sh to return a non-zero
exit code).
Add an item to the release checklist to make sure confidential issues
assigned to the relevant milestone are made public after the BIND
versions addressing them are released.
Fix Coverity CHECKED_RETURN reports for dst_key_getbool(). In most
cases we do not really care about its return value, but it is prudent
to check it.
In one case, where a dst_key_getbool() error should be treated
identically as success, cast the return value to void and add a relevant
comment.
Our GitLab Runner Custom executor scripts now use the "image" key
instead of the job name for determining the QCOW2 image to use for a
given CI job. Update .gitlab-ci.yml to reflect that change.
Since October 2019 I have had complaints from `dnssec-cds` reporting
that the signatures on some of my test zones had expired. These were
zones signed by BIND 9.15 or 9.17, with a DNSKEY TTL of 24h and
`sig-validity-interval 10 8`.
This is the same setup we have used for our production zones since
2015, which is intended to re-sign the zones every 2 days, keeping
at least 8 days signature validity. The SOA expire interval is 7
days, so even in the presence of zone transfer problems, no-one
should ever see expired signatures. (These timers are a bit too
tight to be completely correct, because I should have increased
the expiry timers when I increased the DNSKEY TTLs from 1h to 24h.
But that should only matter when zone transfers are broken, which
was not the case for the error reports that led to this patch.)
For example, this morning my test zone contained:
dev.dns.cam.ac.uk. 86400 IN RRSIG DNSKEY 13 5 86400 (
20200701221418 20200621213022 ...)
But one of my resolvers had cached:
dev.dns.cam.ac.uk. 21424 IN RRSIG DNSKEY 13 5 86400 (
20200622063022 20200612061136 ...)
This TTL was captured at 20200622105807 so the resolver cached the
RRset 64976 seconds previously (18h02m56s), at 20200621165511
only about 12h before expiry.
The other symptom of this error was incorrect `resign` times in
the output from `rndc zonestatus`.
For example, I have configured a test zone
zone fast.dotat.at {
file "../u/z/fast.dotat.at";
type primary;
auto-dnssec maintain;
sig-validity-interval 500 499;
};
The zone is reset to a minimal zone containing only SOA and NS
records, and when `named` starts it loads and signs the zone. After
that, `rndc zonestatus` reports:
next resign node: fast.dotat.at/NS
next resign time: Fri, 28 May 2021 12:48:47 GMT
The resign time should be within the next 24h, but instead it is
near the signature expiry time, which the RRSIG(NS) says is
20210618074847. (Note 499 hours is a bit more than 20 days.)
May/June 2021 is less than 500 days from now because expiry time
jitter is applied to the NS records.
Using this test I bisected this bug to 09990672d which contained a
mistake leading to the resigning interval always being calculated in
hours, when days are expected.
This bug only occurs for configurations that use the two-argument form
of `sig-validity-interval`.
Resolve "gssapictx.c:681:10: error: implicit declaration of function 'gsskrb5_register_acceptor_identity' on illumos"
Closes#1995
See merge request isc-projects/bind9!3830
When we're shutting the system down via "rndc stop" or "rndc halt",
or reconfiguring the control channel, there are potential shutdown
races between the server task and network manager. These are adressed by:
- purging any pending command tasks when shutting down the control channel
- adding an extra handle reference before the command handler to
ensure the handle can't be deleted out from under us before calling
command_respond()
- using an isc_task to execute all rndc functions makes it relatively
simple for them to acquire task exclusive mode when needed
- control_recvmessage() has been separated into two functions,
control_recvmessage() and control_respond(). the respond function
can be called immediately from control_recvmessage() when processing
a nonce, or it can be called after returning from the task event
that ran the rndc command function.
- updated libisccc to use netmgr events
- updated rndc to use isc_nm_tcpconnect() to establish connections
- updated control channel to use isc_nm_listentcp()
open issues:
- the control channel timeout was previously 60 seconds, but it is now
overridden by the TCP idle timeout setting, which defaults to 30
seconds. we should add a function that sets the timeout value for
a specific listener socket, instead of always using the global value
set in the netmgr. (for the moment, since 30 seconds is a reasonable
timeout for the control channel, I'm not prioritizing this.)
- the netmgr currently has no support for UNIX-domain sockets; until
this is addressed, it will not be possible to configure rndc to use
them. we will need to either fix this or document the change in
behavior.
The basic scenario for the problem was that in the process of
resolving a query, if any rrset was eligible for prefetching, then it
would trigger a call to query_prefetch(), this call would run in
parallel to the normal query processing.
The problem arises due to the fact that both query_prefetch(), and,
in the original thread, a call to ns_query_recurse(), try to attach
to the recursionquota, but recursing client stats counter is only
incremented if ns_query_recurse() attachs to it first.
Conversely, if fetch_callback() is called before prefetch_done(),
it would not only detach from recursionquota, but also decrement
the stats counter, if query_prefetch() attached to te quota first
that would result in a decrement not matched by an increment, as
expected.
To solve this issue an atomic bool was added, it is set once in
ns_query_recurse(), allowing fetch_callback() to check for it
and decrement stats accordingly.
For a more compreensive explanation check the thread comment below:
https://gitlab.isc.org/isc-projects/bind9/-/issues/1719#note_145857
If too many versions of log / dnstap files to be saved where requests
the memory after to_keep could be overwritten. Force the number of
versions to be saved to a save level. Additionally the memmove length
was incorrect.
When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views. Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:
1. mdb_env_open(envA) is called when the view is first created.
2. "rndc reconfig" is called.
3. mdb_env_open(envB) is called for the new instance of the view.
4. mdb_env_close(envA) is called for the old instance of the view.
This seems to have worked so far. However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).
Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.
To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead. Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps). Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.
[1] 2fd44e3251
There are still some pregenerated files left in the git
repository (cleaned up during `make maintainer-clean`) and we currently
don't notice if any of those needs to be updated in the git repository
because we ignore changes in the repository done during the build.
This commit adds a safeguard that fails the build job if the contents of
the git repository gets modified during the build.
There were some missing bits in the other rst files and Makefile.am(s)
that didn't reflect the rename of the main document. Also add
ddns-confgen.8 manpage.
The ThreadSanitizer found a data race when updating the stale header.
Instead of trying to acquire the write lock and failing occasionally
which would skew the statistics, the dns_rdatasetheader_t.attributes
field has been promoted to use stdatomics. Updating the attributes in
the mark_header_ancient() and mark_header_stale() now uses the cmpxchg
to update the attributes forfeiting the need to hold the write lock on
the tree. Please note that mark_header_ancient() still needs to hold
the lock because .dirty is being updated in the same go.
The stdatomic shims for non-C11 compilers (Windows, old gcc, ...) and
mutexatomic implemented only and minimal subset of the atomic types.
This commit adds 16-bit operations for Windows and all atomic types as
defined in standard.
BUFSIZ (512 bytes on Windows) may not be enough to fit the status of a
DNSSEC policy and three DNSSEC keys.
Set the size of the relevant buffer to a hardcoded value of 4096 bytes,
which should be enough for most scenarios.
While the creation and publication times of the various keys
in this policy are nearly at the same time there is a chance that
one key is created a second later than the other.
The `set_keytimes_algorithm_policy` mistakenly set the keytimes
for KEY3 based of the "published" time from KEY2.
this changes most visble uses of master/slave terminology in tests.sh
and most uses of 'type master' or 'type slave' in named.conf files.
files in the checkconf test were not updated in order to confirm that
the old syntax still works. rpzrecurse was also left mostly unchanged
to avoid interference with DNSRPS.
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.
added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
We erroneously tried to destroy a socket after issuing
isc__nm_tcp{,dns}_close. Under some (race) circumstances we could get
nm_socket_cleanup to be called twice for the same socket, causing an
access to a dead memory.
There's a possibility of race in isc__nm_tcpconnect if the asynchronous
connect operation finishes with all the callbacks before we exit the
isc__nm_tcpconnect itself we might access an already freed memory.
Fix it by creating an additional reference to the socket freed at the
end of isc__nm_tcpconnect.
When we're coming back from recursion fetch_callback does not accept
DNS_R_NXDOMAIN as an rcode - query_gotanswer calls query_nxdomain in
which an assertion fails on qctx->is_zone. Yet, under some
circumstances, qname minimization will return an DNS_R_NXDOMAIN - when
root zone mirror is not yet loaded. The fix changes the DNS_R_NXDOMAIN
answer to DNS_R_SERVFAIL.
This test ensures that named will correctly shutdown
when receiving multiple control connections after processing
of either "rncd stop" or "kill -SIGTERM" commands.
Before the fix, named was crashing due to a race condition happening
between two threads, one running shutdown logic in named/server.c
and other handling control logic in controlconf.c.
This test tries to reproduce the above scenario by issuing multiple
queries to a target named instance, issuing either rndc stop or kill
-SIGTERM command to the same named instance, then starting multiple rndc
status connections to ensure it is not crashing anymore.
Due to lack of synchronization, whenever named was being requested to
stop using rndc, controlconf.c module could be trying to access an already
released pointer through named_g_server->interfacemgr in a separate
thread.
The race could only be triggered if named was being shutdown and more
rndc connections were ocurring at the same time.
This fix correctly checks if the server is shutting down before opening
a new rndc connection.
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.
in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.
Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.
Add the code and documentation required to provide DNSSEC signing
status through rndc. This does not yet show any useful information,
just provide the command that will output some dummy string.
I'd like to use the same functionality (pretty print the datetime
of keytime metadata) in the 'rndc dnssec -status' command. So it is
better that this logic is done in a separate function.
Since the stdtime.c code have differernt files for unix and win32,
I think the "#ifdef WIN32" define can be dropped.
The rndc.conf main header was missing the header markup and that was
breaking the TOC for all manpages in the ARM because sphinx-build
incorrectly remembered the markup for subheader to be ~~~~ instead of
----.
The "krb5-devel" package on openSUSE Tumbleweed installs the
"krb5-config" binary into a custom prefix, which prevents BIND's
"configure" script from autodetecting it. Fix by specifying the path to
the "krb5-config" binary using --with-gssapi.
Since lib/dns/include/dns/view.h unconditionally defines dnstap-related
fields in struct dns_view (and includes <dns/dnstap.h>), care must be
taken to ensure that any source file which includes <dns/view.h> gets
built with a set of CFLAGS which allows <dns/dnstap.h> to be properly
processed (particularly its <fstrm.h> and <protobuf-c/protobuf-c.h>
conditional dependencies which are only included for dnstap-enabled
builds). Ensure that by making LIBDNS_CFLAGS include DNSTAP_CFLAGS when
building with dnstap support.
The same reasoning applies for LMDB_CFLAGS.
The AX_LIB_LMDB() macro attempts to test the potential LMDB installation
path provided to it by temporarily updating CFLAGS and LIBS, calling
AC_SEARCH_LIBS(), and then restoring CFLAGS and LIBS to their original
values. However, including certain statements (e.g. "break") in the
arguments provided to the AX_LIB_LMDB() macro may cause an early exit
from it, in which case CFLAGS and LIBS will be left polluted. Fix by
resetting CFLAGS and LIBS to their original values before executing the
commands provided as AX_LIB_LMDB() arguments.
The wait until zones are signed after rndc reconfig is broken
because the zones are already signed before the reconfig. Fix
by having a different way to ensure the signing of the zone is
complete. This does require a call to the "wait_for_done_signing"
function after each "check_keys" call after the ns6 reconfig.
The "wait_for_done_signing" looks for a (newly added) debug log
message that named will output if it is done signing with a certain
key.
isc__nm_tcpdns_send() was not asynchronous and accessed socket
internal fields in an unsafe manner, which could lead to a race
condition and subsequent crash. Fix it by moving tcpdns processing
to a proper netmgr thread.
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process; otherwise we might destroy the
callback argument before we actually stop listening, and call
the callback on bad memory.
Resolve "BIND stops DNSKEY lookup in get_dst_key() when a key with unsupported algorithm is found first"
Closes#1689
See merge request isc-projects/bind9!3736
Add a note why we don't have a test case for the issue.
It is tricky to write a good test case for this if our tools are
not allowed to create signatures for unsupported algorithms.
Assign and then check node for NULL to address another thread
changing radix->head in the meantime.
Move 'node != NULL' check into while loop test to silence cppcheck
false positive.
Fix pointer != NULL style.
enough buffer space. Also change named_os_uname() prototype so
that it is now returning (const char *) rather than (char *). If
uname() is not supported on a UNIX build prepopulate unamebuf[]
with "unknown architecture".
The LT_INIT() call in configure.ac is effectively a no-op because it is
preceded by a call to AC_PROG_LIBTOOL(), which is the previous name of
LT_INIT() used in older libtool versions. Replace AC_PROG_LIBTOOL()
with AC_PATH_PROG() to look for libtool in PATH without initializing it,
which is the originally intended behavior.
Without this change, --enable-static is used by default, which causes a
plain ./configure invocation to fail because static linking is now
disallowed. Drop --disable-static from the ./configure invocations used
in GitLab CI to test this scenario continuously.
Linking BIND 9 programs and libraries statically disables several
important features:
* dlopen() - relied on by dynamic loading of modules, dlz, and dyndb,
* RELRO (read-only relocations) and ASLR (address space layout
randomization) - security features which are important for any
program interacting with the network and/or user input.
Disable and disallow linking BIND 9 binaries statically, thus enforcing
dlopen() support and allowing use of RELRO and ASLR by default.
The `rndc` argument was always overridden by the static configuration,
because the logic for handling the number of dnstap files to retain
was both backwards and a bit redundant.
When maintainer mode is enabled (./configure --enable-maintainer-mode)
it enables rebuild of documentation source files that require extra
tools to be installed or compiled. For a convenience, those files are
already committed into the repository and their rebuild is not required
to build BIND 9 from sources.
Similarly, the manpage sources will get rebuild only when in maintainer
mode because they require sphinx-build to be available locally and that
might not be always the case.
The files in doc/misc requires all the BIND 9 libraries to be built
before the documentation can be built. One of the extra automake
features is maintainer mode that allows to conditionally build and clean
files that require special tools. Make use of the automake maintainer
mode to not rebuild the files in doc/misc under normal circumstances.
if tests that take a particularly long time to complete
(serve-stale, dnssec, rpzrecurse) are run first, a parallel
run of the system tests can finish 1-2 minutes faster.
The doc/misc/options is used to generate a file describing all
configuration options. Currently, the file contents could differ
based on ./configure option which is kind of suboptimal.
We already removed the "// not configured" from the options.active, and
this time we remove generation of the string altogether.
these keywords were added to the parser as synonyms for "master"
and "slave" but were never hooked in to the configuration of named,
so they were ignored. this has been fixed and the option is now
checked for correctness.
The isc_nm_cancelread() function cancels reading on a connected
socket and calls its read callback function with a 'result'
parameter of ISC_R_CANCELED.
when isc_nm_destroy() is called, there's a loop that waits for
other references to be detached, pausing and unpausing the netmgr
to ensure that all the workers' events are run, followed by a
1-second sleep. this caused a delay on shutdown which will be
noticeable when netmgr is used in tools other than named itself,
so the delay has now been reduced to a hundredth of a second.
the isc_nm_tcpconnect() function establishes a client connection via
TCP. once the connection is esablished, a callback function will be
called with a newly created network manager handle.
A TCPDNS socket creates a handle for each complete DNS message.
Previously, when all the handles were disconnected, the socket
would be closed, but the wrapped TCP socket might still have
more to read.
Now, when a connection is established, the TCPDNS socket creates
a reference to itself by attaching itself to sock->self. This
reference isn't cleared until the connection is closed via
EOF, timeout, or server shutdown. This allows the socket to remain
open even when there are no active handles for it.
- isc__nmhandle_get() now attaches to the sock in the nmhandle object.
the caller is responsible for dereferencing the original socket
pointer when necessary.
- tcpdns listener sockets attach sock->outer to the outer tcp listener
socket. tcpdns connected sockets attach sock->outerhandle to the handle
for the tcp connected socket.
- only listener sockets need to be attached/detached directly. connected
sockets should only be accessed and reference-counted via their
associated handles.
there is no need for a caller to reference-count socket objects.
they need tto be able tto close listener sockets (i.e., those
returned by isc_nm_listen{udp,tcp,tcpdns}), and an isc_nmsocket_close()
function has been added for that. other sockets are only accessed via
handles.
Since the reference BIND version for the ABI check job which is run for
the main branch is now 9.17.2, autoreconf needs to be run before
./configure as the latter is no longer present in the Git repository.
RBTDB node can now appear on the deadnodes lists following the changes
to decrement_reference in 176b23b6cd to
defer checking of node->down when the tree write lock is not held. The
node should be unlinked instead.
NS_CLIENT_TCP_BUFFER_SIZE was 2 byte too large following the
move to netmgr add associated changes to lib/ns/client.c and
as a result an INSIST could be trigger if the DNS message being
constructed had a checkpoint stage that fell in those two extra
bytes. Adjusted NS_CLIENT_TCP_BUFFER_SIZE and cleaned up
client_allocsendbuf now that the previously reserved 2 bytes
are no longer used.
The ThreadSanitizer uses system synchronization primitives to check for
data race. The netmgr handle->references was missing acquire memory
barrier before resetting and reusing the memory occupied by isc_nmhandle_t.
- clone keynode->dsset rather than return a pointer so that thread
use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
while in use.
- create a new keynode when removing DS records so that dangling
pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
when DS records are added.
There's a possibility of a race in TCP accepting code:
T1 accepts a connection C1
T2 accepts a connection C2
T1 tries to accept a connection C3, but we hit a quota,
isc_quota_cb_init() sets quota_accept_cb for the socket,
we return from accept_connection
T2 drops C2, but we race in quota_release with accepting C3 so
we don't see quota->waiting is > 0, we don't launch the callback
T1 accepts a connection C4, we are able to get the quota we clear
the quota_accept_cb from sock->quotacb
T1 drops C1, tries to call the callback which is zeroed, sigsegv.
Due to the changes introduced by the Automake migration, system tests
requiring Python (chain, pipelined, qmin, tcp), dynamic loading of
shared objects (dlzexternal, dyndb, filter-aaaa), or LMDB (nzd2nzf)
currently do not work on Windows. Temporarily disable them on that
platform by moving them from the PARALLEL_COMMON list to the
PARALLEL_UNIX list until the situation is rectified.
Without SYSTEMTESTTOP=.. lines in tests.sh scripts, SYSTEMTESTTOP is
being set to an absolute path. On Windows, this means that an absolute
Cygwin path gets passed as a command line argument to native Windows
binaries, which cannot work and causes system tests to break. Fix by
passing SYSTEMTESTTOP through cygpath on Windows, which causes that
variable to be set to an absolute "mixed mode" path (Windows path with
forward slashes).
With "make dist" producing usable source tarballs and documentation
building working again, restore the script which allows a release
tarball to be built by a GitLab CI job, only making minimal adjustments
required due to the changes in the documentation building process and
due to dropping the "version" file.
As the "configure" script is no longer stored in the Git repository, run
"autoreconf -fi" at the beginning of the respdiff job in GitLab CI in
order to enable that job to work properly.
For the time being, "make all" needs to be run before "make dist" can
succeed as parts of the documentation are generated by programs compiled
during the regular build process.
As only one source tarball is published for each BIND release, make sure
the tarball creation job in GitLab CI only contains one tarball in the
desired format among its artifacts.
Drop the TARBALL_COMPRESSOR .gitlab-ci.yml variable as it is no longer
used in the source tarball creation process.
The "srcid" file present in each BIND source tarball contains a
shortened hash of the Git commit corresponding to a given BIND release.
This allows a Git reference to be included in an archive that otherwise
lacks any Git information.
Before the move to Automake, if an "srcid" file was present in the root
source directory at the time ./configure was run, its contents were used
as the value of a compile-time constant which was then baked into BIND
binaries; otherwise, "git rev-parse" was used to determine the value of
that constant.
With Automake, a similar approach was attempted that required the
"srcid" file to be present at autoreconf time in order for it to be
used. However, note that this means that even if that file is present
in a source tarball created using "make dist", its contents are not
going to influence the value of the aforementioned compile-time constant
because autoreconf hardcodes the output of "git rev-parse" into the
configure script at autoreconf time.
To make things more clear, always use "git rev-parse" for determining
the value of the PACKAGE_SRCID compile-time constant when running
autoreconf. This causes "srcid" to be an empty string in source
tarballs built from other source tarballs, but that is not deemed to be
much of an issue as "make dist" is expected to be run from Git
repository clones. Remove stderr redirections to /dev/null to ensure
errors caused e.g. by running "make dist" from outside a Git repository
clone are not hidden. Trim the Git commit hash to 7 characters for
consistency between Unix and Windows systems.
Despite the above, ensure the "srcid" file is present in source tarballs
created using "make dist" as that file is used by the build process on
Windows.
Resolve "Correct the BIND ARM to say that the default session-key for use with 'update-policy local;' is generated at startup"
Closes#1842
See merge request isc-projects/bind9!3664
We were passing client address to dns_resolver_createfetch as a pointer
and it was saved as a pointer. The client (with its address) could be
gone before the fetch is finished, and in a very odd scenario
log_formerr would call isc_sockaddr_format() which first checks if the
address family is valid (and at this point it still is), then the
sockaddr is cleared, and then isc_netaddr_fromsockaddr is called which
fails an assertion as the address family is now invalid.
Make various adjustments necessary to enable "make dist" to build a BIND
source tarball whose contents are complete enough to build binaries, run
unit & system tests, and generate documentation on Unix systems.
Known outstanding issues:
- "make distcheck" does not work yet.
- Tests do not work for out-of-tree source-tarball-based builds.
- Source tarballs are not complete enough for building on Windows.
All of the above will be addressed in due course.
Merge lib/isc/unix/ifiter_getifaddrs.c into lib/isc/unix/interfaceiter.c
and lib/isc/xoshiro128starstar.c into lib/isc/random.c. This avoids the
need for extra Automake directives required to process the "helper" *.c
files properly and makes the code more localized.
Turn the static check_bad_bits() function used by both Unix and Windows
systems into a "private" function and extract the "private" parts of
lib/isc/fsaccess.c to lib/isc/fsaccess_common_p.h. Instead of including
lib/isc/fsaccess.c from lib/isc/{unix,win32}/fsaccess.c, make the former
an independent C source file.
Rename lib/isc/fsaccess.c to lib/isc/fsaccess_common.c to prevent build
issues on Windows caused by multiple source files (lib/isc/fsaccess.c,
lib/isc/win32/fsaccess.c) being compiled into the same object file.
These changes improve consistency with the way "private" functions and
macros are treated elsewhere in the source tree.
There was a case where an primary server sent a response
on the wrong TCP connection and failure to check the question
section resulted in a truncated zone being served.
DS records only belong at delegation points and if present
at the zone apex are invariably the result of administrative
errors. Additionally they can't be queried for with modern
resolvers as the parent servers will be queried.
When ./run.sh <test> is invoked, it acts as a wrapper around
`env - TESTS="<test>" make -e check` to preserve the ability to build
files defined only in the `check` target. Unfortunately, cleaning the
full environment had a side-effect of some tests failing due to missing
binaries and libraries. We now preserve the two most important
variables - PATH and LD_LIBRARY_PATH.
To indicate the SoftHSM version used in each CI job while avoiding the
need to add another token to job names, replace "pkcs11" with
"softhsm2.4" and "fedora31:amd64" with "softhsm2.6".
Various SoftHSM versions differ in algorithm support. Since Fedora
tends to have the latest SoftHSM version available in its stock package
repositories, enable PKCS#11 support in Fedora jobs to test multiple
SoftHSM versions in GitLab CI.
Move BIND binaries which are neither daemons nor administrative programs
to $bindir. This results in only the following binaries being left in
$sbindir:
- ddns-confgen
- named
- rndc
- rndc-confgen
- tsig-confgen
It might be possible some pending task would run when kserver is already
cleaned up. Postpone gsstsig structures cleanup after task and timer
managers are destroyed. No pending threads are possible after it.
Make action in maybeshutdown only if doshutdown was not already called.
Might be called from getinput event.
The release notes were previously built as a separate document
(including the PDF version). It was agreed that this doesn't make much
sense, so the release notes are now included only as an appendix to the
BIND 9 ARM.
This includes reorganization of the lists of RFCs supported by BIND 9.
I included all the RFCs and notes from the list identified by Vicky in
any DNS-related RFCs written by current ISC engineers, on the assumption
that BIND would comply with them.
As a leftover from old TCP accept code isc_uv_import passed TCP_SERVER
flag when importing a socket on Windows.
Since now we're importing/exporting accepted connections it needs to
pass TCP_CONNECTION flag.
The Danger script inspects differences between the current version of a
given merge request's target branch and the merge request branch. If
the latter falls behind the former, the Danger script will wrongly warn
about missing GitLab/RT identifiers because it incorrectly treats the
"+++" diff marker as an indication of the merge request adding new lines
to a file. Tweak the relevant conditional expression to prevent such
invalid warnings from being raised.
As GitLab Runner Docker executor caches Git repositories between jobs,
prevent the Danger script from attempting to update local refs to ensure
"git fetch" returns with an exit code of 0. Use the FETCH_HEAD ref for
determining the differences between the merge request branch and its
target branch.
Commits adding CHANGES entries and/or release notes do not need a commit
log message. Do not warn about a missing commit log message for such
commits to make the warning more meaningful.
The SO_INCOMING_CPU is available since Linux 3.19 for getting the value,
but only since Linux 4.4 for setting the value (see below for a full
description). BIND 9 should not fail when setting the option on the
socket fails, as this is only an optimization and not hard requirement
to run BIND 9.
SO_INCOMING_CPU (gettable since Linux 3.19, settable since Linux 4.4)
Sets or gets the CPU affinity of a socket. Expects an integer flag.
int cpu = 1;
setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, sizeof(cpu));
Because all of the packets for a single stream (i.e., all
packets for the same 4-tuple) arrive on the single RX queue that
is associated with a particular CPU, the typical use case is to
employ one listening process per RX queue, with the incoming
flow being handled by a listener on the same CPU that is
handling the RX queue. This provides optimal NUMA behavior and
keeps CPU caches hot.
Originally, the default value for max-stale-ttl was 1 week, which could
and in some scenarios lead to cache exhaustion on a busy resolvers.
Picking the default value will always be juggling between value that's
useful (e.g. keeping the already cached records after they have already
expired and the upstream name servers are down) and not bloating the
cache too much (e.g. keeping everything for a very long time). The new
default reflects what we think is a reasonable to time to react on both
sides (upstream authoritative and downstream recursive).
When creating the successor, the current active key (predecessor)
should change its goal state to HIDDEN.
Also add two useful debug logs in the keymgr_key_rollover function.
Catch a case where if the prepublication time of the successor key
is later than the retire time of the predecessor. If that is the
case we should prepublish as soon as possible, a.k.a. now.
The `dns_keymgr_run()` function became quite long, put the logic
that looks if a new key needs to be created (start a key rollover)
in a separate function.
The logic in `keymgr_key_has_successor(key, keyring)` is flawed, it
returns true if there is any key in the keyring that has a successor,
while what we really want here is to make sure that the given key
has a successor in the given keyring.
Rather than relying on `keymgr_key_exists_with_state`, walk the
list of keys in the keyring and check if the key is a successor of
the given predecessor key.
This improves keytime testing on CSK rollover. It now
tests for specific times, and also tests for SyncPublish and
Removed keytimes.
Since an "active key" for ZSK and KSK means something
different, this makes it tricky to decide when a CSK is
active. An "active key" intuitively means the key is signing
so we say a CSK is active when it is creating zone signatures.
This change means a lot of timings for the CSK rollover tests
need to be adjusted.
The keymgr code needs a slight change on calculating the
prepublication time: For a KSK we need to include the parent
registration delay, but for CSK we look at the zone signing
property and stick with the ZSK prepublication calculation.
Registration delay is not part of the Iret retire interval, thus
removed from the calculation when setting the Delete time metadata.
Include the registration delay in prepublication time, because
we need to prepublish the key sooner than just the Ipub
publication interval.
This commit adds testing keytiming metadata. In order to facilitate
this, the kasp system test undergoes a few changes:
1. When finding a key file, rather than only saving the key ID,
also save the base filename and creation date with `key_save`.
These can be used later to set expected key times.
2. Add a test function `set_addkeytime` that takes a key, which
keytiming to update, a datetime in keytiming format, and a number
(seconds) to add, and sets the new time in the given keytime
parameter of the given key. This is used to set the expected key
times.
3. Split `check_keys` in `check_keys` and `check_keytimes`. First we
need to find the keyfile before we can check the keytimes.
We need to retrieve the creation date (and sometimes other
keytimes) to determine the other expected key times.
4. Add helper functions to set the expected key times per policy.
This avoids lots of duplication.
Check for keytimes for the first test cases (all that do not cover
rollovers).
After removing dnssec-settime calls that set key rollover
relationship, we can adjust the counts in test output filenames.
Also fix a couple of more wrong counts in output filenames.
Using dnssec-setttime after dnssec-keygen in the kasp system test
can lead to off by one second failures, so reduce the usage of
dnssec-settime in the setup scripts. This commit deals with
setting the key rollover relationship (predecessor/successor).
In the kasp system test, we are going to set the keytimes on
dnssec-keygen so we can test them against the key creation time.
This prevents off by one second in the test, something that can
happen if you set those times with dnssec-settime after
dnssec-keygen.
Also fix some test output filenames.
While kasp relies on key states to determine when a key needs to
be published or be used for signing, the keytimes are used by
operators to get some expectation of key publication and usage.
Update the code such that these keytimes are set appropriately.
That means:
- Print "PublishCDS" and "DeleteCDS" times in the state files.
- The keymgr sets the "Removed" and "PublishCDS" times and derives
those from the dnssec-policy.
- Tweak setting of the "Retired" time, when retiring keys, only
update the time to now when the retire time is not yet set, or is
in the future.
This also fixes a bug in "keymgr_transition_time" where we may wait
too long before zone signatrues become omnipresent or hidden. Not
only can we skip waiting the sign delay Dsgn if there is no
predecessor, we can also skip it if there is no successor.
Finally, this commit moves setting the lifetime, reducing two calls
to one.
For testing purposes mainly, we want to allow set keytimings on
generated keys, such that we don't have to "keygen/settime" which
can result in one second off times.
Certain rules of the BIND development process are not codified anywhere
and/or are used inconsistently. In an attempt to improve this
situation, add a GitLab CI job which uses Danger Python to add comments
to merge requests when certain expectations are not met. Two categories
of feedback are used, only one of which - fail() - causes the GitLab CI
job to fail. Exclude dangerfile.py from Python QA checks as the way the
contents of that file are evaluated triggers a lot of Flake8 and PyLint
warnings.
in addition to being more efficient, this prevents a possible crash by
looking up the node name before the tree sructure can be changed when
cleaning up dead nodes in addrdataset().
The code in util/bindkeys.pl was overly complicated and it could not be
reused on Windows because redirecting stdin and stdout at the same time
from perl is overly complicated.
Now the util/bindkeys.pl accepts the input file as the first and only
argument and prints the header file to stdout. This allows the same
utility to be used from automake and win32/Configure script.
when built with "configure --enable-singletrace", named will produce
detailed query logging at the highest debug level for any query with
query ID zero.
this enables monitoring of the progress of a single query by specifying
the QID using "dig +qid=0". the "client" logging category should be set
to a low severity level to suppress logging of other queries. (the
chance of another query using QID=0 at the same time is only 1 in 2^16.)
"--enable-singletrace" turns on "--enable-querytrace" as well, so if the
logging severity is not lowered, all other queries will be logged
verbosely as well. compiling with either of these options will impair
query performance; they should only be turned on when testing or
troubleshooting.
Replace an existing comment with a more verbose explanation of when the
"hint" variable is set in resquery_send() and how its value affects the
advertised UDP buffer size in outgoing queries.
If "edns-udp-size" is set in a "server" block matching the queried
server, it is accounted for in the process of determining the advertised
UDP buffer size, but its value may still be overridden before the query
is sent. This behavior contradicts the ARM which claims that when set,
the server-specific "edns-udp-size" value is used for all EDNS queries
sent to a given server.
Furthermore, calling dns_peer_getudpsize() with the "udpsize" variable
as an argument makes the code hard to follow as that call may either
update the value of "udpsize" or leave it untouched.
Ensure the code matches the documentation by moving the
dns_peer_getudpsize() call below all other blocks of code potentially
affecting the advertised UDP buffer size, which is where it was located
when server-specific "edns-udp-size" support was first implemented [1].
Improve code readability by calling dns_peer_getudpsize() with a helper
variable instead of "udpsize".
[1] see commit 1c153afce5
When the DNS_FETCHOPT_EDNS512 flag was first introduced [1], it enforced
advertising a 512-byte UDP buffer size in an outgoing query. Ever since
EDNS processing code got updated [2], that flag has still been set upon
detection of certain query timeout patterns, but it has no longer been
affecting the calculations of the advertised UDP buffer size in outgoing
queries. Restore original semantic meaning of DNS_FETCHOPT_EDNS512 by
ensuring the advertised UDP buffer size is set to 512 bytes when that
flag is set. Update existing comments and add new ones to improve code
readability.
[1] see commit 08c9026166
[2] see commit 8e15d5eb3a
The following message:
success resolving '<name>' (in '<domain>'?) after reducing the advertised EDNS UDP packet size to 512 octets
can currently be logged even if the EDNS UDP buffer size advertised in
queries sent to a given server had already been set to 512 octets before
the fetch context was created (e.g. due to the server responding
intermittently). In other words, this log message may be misleading as
lowering the advertised EDNS UDP buffer size may not be the actual cause
of <name> being successfully resolved. Remove the log message in
question to prevent confusion.
As this log message is the only existing user of the "reason" field in
struct fetchctx, remove that field as well, along with all the code
related to it.
This adds a unit test driver for BIND with Automake. It runs the unit
test program provided as its sole command line argument and then looks
for a core dump generated by that test program. If one is found, the
driver prints the backtrace into the test log.
Some operating systems (e.g. CentOS, OpenBSD) install the main pytest
script as "py.test-3". Add that name to the list of names passed to
AC_PATH_PROGS() in order for pytest to be properly detected on a broader
range of operating systems.
Use str.format() instead of f-strings in Python system tests to enable
them to work on Python 3 versions older than 3.6 as the latter is not
available on some operating systems used in GitLab CI that are still
actively supported (CentOS 6, Debian 9, Ubuntu 16.04).
As documentation building utilities are now all included in operating
system images used in GitLab CI, do not install them in each "docs" CI
job any more.
As Python QA tools, BIND system test prerequisites, and documentation
building utilities are now all included in operating system images used
in GitLab CI, do not use pip for installing them in each CI job any
more.
- First merge release branches to maintenance branches, then push
tags. If tags are pushed first and a given set of releases contains
security fixes, the push will be rejected by a server-side Git hook.
- Update ABI check job name.
- Add an item for updating QA tools used in GitLab CI after each
public release.
In process_fd we lock sock->lock and then internal_accept locks mgr->lock,
in isc_sockmgr_render* functions we lock mgr->lock and then lock sock->lock,
that can cause a deadlock when accessing stats. Unlock sock->lock early in
all the internal_{send,recv,connect,accept} functions instead of late
in process_fd.
Add a system test that counts how many address fetches are made
for different numbers of NS records and checks that the number
are successfully limited.
If there are more that 5 NS record for a zone only perform a
maximum of 4 address lookups for all the name servers. This
limits the amount of remote lookup performed for server
addresses at each level for a given query.
as the update triggers by the rndc command to clear the signing records
may not have completed by the time the subsequent rndc command to test
that the records have been removed is commenced. Loop several times to
prevent false negative.
cppcheck 2.0 reports false positives about uninitialized variables in a
lot of places throughout BIND source code, e.g.:
bin/dnssec/dnssec-cds.c:283:6: error: Uninitialized variable: length [uninitvar]
if (isc_buffer_availablelength(&buf) <= len) {
^
Apparently cppcheck 2.0 has issues with processing (&var)->field syntax,
which is what the macros from lib/isc/include/isc/buffer.h are evaluated
to. This issue was reported upstream [1] and will hopefully be
addressed in a future cppcheck release.
In the meantime, to avoid modifying BIND source code in multiple places
just because of a static checker false positive, work around the issue
by adding intermediate variables to buffer macro definitions using a sed
invocation in the cppcheck job script.
[1] https://sourceforge.net/p/cppcheck/discussion/general/thread/122153e3c1/
Add whitespace to the regular expression used for extracting the GCC
version from "gcc --version" output so that it works properly with
multi-digit major version numbers.
Commit ec72d1100d broke the cppcheck job
in GitLab CI: when cppcheck fails, the script is immediately
interrupted, preventing cppcheck-htmlreport from being run. To ensure
the HTML report is generated when cppcheck fails, revert to invoking
cppcheck-htmlreport in the "after_script" part of the job.
the dnstap test was pausing for 20 seconds to search for a string in
named.run, which only appears if named is built with --enable-developer or
--enable-querytrace.
wire_test is not only used by the dnstap system test, but also in
fuzz testing. it doesn't need to be installed, but it's useful to have it
built when BIND is. this commit moves it back from bin/tests/system to
bin/tests, as a noinst_PROGRAM so that it's built by "make all" but
not installed.
Although in util/api-checker.sh we create textual reports, we don't
preserve them in job artifacts, but we should.
We don't want to keep all HTML pages present in the project root, but
just those produced by ABI checker.
Instead of using bind() and passing the listening socket to the children
threads using uv_export/uv_import use one thread that does the accepting,
and then passes the connected socket using uv_export/uv_import to a random
worker. The previous solution had thundering herd problems (all workers
waking up on one connection and trying to accept()), this one avoids this
and is simpler.
The tcp clients quota is simplified with isc_quota_attach_cb - a callback
is issued when the quota is available.
The `statschannel/ns2/` was missing `manykeys.db.in`, but the test
succeeded even when `setup.sh` (or `clean.sh`) failed to execute. This
commit makes run.sh to run in stricter mode and fail the test
immediately when `clean.sh` or `setup.sh` fails.
The ARM and the manpages have been converted into Sphinx documentation
format.
Sphinx uses reStructuredText as its markup language, and many of its
strengths come from the power and straightforwardness of
reStructuredText and its parsing and translating suite, the Docutils.
There were two errors:
1. get_random() function was returning random number with leading zeros
that could lead the shell to interpret the number as octal value
instead of decimal. The surrounding whitespace was also causing
problems.
2. The calculation of the port was off, it was adding the whole range
and not just the min port to the base.
The introduction of netmgr doubled the number of threads from which
dnstap data may be logged: previously, it could only happen from within
taskmgr worker threads; with netmgr, it can happen both from taskmgr
worker threads and from network threads. Since the argument passed to
fstrm_iothr_options_set_num_input_queues() was not updated to reflect
this change, some calls to fstrm_iothr_get_input_queue() can now return
NULL, effectively preventing some dnstap data from being logged.
Whether this bug is triggered or not depends on thread scheduling order
and packet distribution between network threads, but will almost
certainly be triggered on any recursive resolver sooner or later. Fix
by requesting the correct number of dnstap input queues to be allocated.
Ensure the release checklist reflects our current release process:
- add an additional deadline for introducing code changes ("code
freeze"); only test and documentation tweaks can be applied to
pending releases after this deadline passes,
- notify Support and Marketing about an impending release earlier in
the process so that they have time to schedule a release note review
before the tagging deadline,
- examine current test results on all platforms in advance, to prevent
diagnosing and addressing test failures in the last minute before
the tagging deadline,
- check Perflab results earlier in the process to leave some room for
addressing any potential problems before code freeze,
- ensure empty release notes for the next set of releases are prepared
after public release.
nzf_append is conditionally compiled and this is intended to
catch error introduced by changes to the called functions on all
systems before the changes are run through the CI.
Test zones with various escape sequences and filesystem seperator
characters.
* escaped double quote (\")
* escaped escape (\\)
* escaped decimal byte value (\032)
* slash seperator (/)
Per Current Mechanisms 2.3.5, the curve name is DER-encoded in the
EC_PARAMS attribute, and the public key value is DER-encoded in the
EC_POINT attribute.
The system tests currently uses patchwork of shell scripts which doesn't
offer proper error handling.
This commit introduced option to write new tests in pytest framework
that also allows easier manipulation of DNS traffic (using dnspython),
native XML and JSON manipulation and proper error reporting.
named_os_openfile was being called with switch_user set to true
unconditionally leading to log messages about being unable to
switch user identity from named when regenerating the key.
When running on Linux and system capabilities are available, named will
drop the extra capabilities before loading the configuration. This led
to spurious warnings from `seteuid()` because named already dropped
CAP_SETUID and CAP_GETUID capabilities.
The fix removes setting the effective uid/gid when capabilities are
available, and adds a check that we are running under the user we were
requested to run.
Add recursive "test" and "unit" rules, which execute "make check"
in specific directories - "make test" runs the system tests, and
"make unit" runs the unit tests.
The current script used ephemeral port range which clashed with the
ports used by the tools (dig, ...), and the range always started with
the first port and there was 100 ports allocated for each system test.
In this commit, the first port has been randomized, the get_ports.sh
script outputs the variables (the output has to be eval'ed from run.sh)
and there's less waste in the port range.
There are several improvements over the default/previous behaviour of
the test log driver and log compiler:
* The system-test-driver.sh was dropped (it was used incorrectly)
* The run.sh script is now both log compiler and cli script to run
individual tests
* The custom-test-driver was added as extended version of the automake
test-driver with capability to tee the test output to stdout when
`--verbose yes` is passed to it (you can use LOG_DRIVER_FLAGS to
add the option by default)
* Makefile.am has been extended to honor V=1 for the system tests
test-driver (e.g. V=1 adds `--verbose yes` to AM_LOG_DRIVER_FLAGS)
fstrm_capture is not an essential utility, but its corresponding
Makefile token needs to substituted even if it is not found in PATH or
else the "dnstap" system test will consistently fail.
The bin/tests/wire_test helper program is currently not included in any
Makefile.am file. Move its source code to bin/tests/system and build it
along other helper tools when dnstap support is requested as the
"dnstap" system test needs this tool in order to pass.
Make yaml.load_all() use yaml.SafeLoader to address a warning currently
emitted when bin/tests/system/dnstap/ydump.py is run:
ydump.py:28: YAMLLoadWarning: calling yaml.load_all() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
for l in yaml.load_all(f.stdout):
The libirs contained own re-implementations of the getaddrinfo,
getnameinfo and gai_strerror + irs_context and irs_dnsconf API that was
unused anywhere in the BIND 9.
Keep just the irs_resonf API that is being extensively used to parse
/etc/resolv.conf by several of BIND 9 tools.
The 'ephemeral' database implementation was used to provide a
lightweight database implemenation that doesn't cache results, and the
only place where it was really use is "samples" because delv is
overriding this to use "rbtdb" instead. Otherwise it was completely
unused.
* The 'ephemeral' cache DB (ecdb) implementation. An ecdb just provides
* temporary storage for ongoing name resolution with the common DB interfaces.
* It actually doesn't cache anything. The implementation expects any stored
* data is released within a short period, and does not care about the
* scalability in terms of the number of nodes.
The three libdns tests directly include ../dst_internal.h which
in turn directly include openssl headers, thus there was a missing
path and build failure on systems where OpenSSL is not in the default
include path.
Right before the release API version (LIBINTERFACE, LIBREVISION, LIBAGE)
for older and newer libraries tends to be the same. Given that, commit
hash can't be the determining factor here, Unix time of the commit
should suit us better and is placed after the API version. The commit
hash is preserved as it's useful to see it in the actual report.
'-nosymtbl' versions of libraries are not produced in Automake builds.
this addresses a race that could occur during shutdown or when
reconfiguring to remove RPZ zones.
this change should ensure that the rpzs structure and the incremental
updates don't interfere with each other: rpzs->zones entries cannot
be set to NULL while an update quantum is running, and the
task should be destroyed and its queue purged so that no subsequent
quanta will run.
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
With the introduction of dnssec-policy, the aforementioned tools were
either rendered obsolete, or they will be replaced with dnssec-policy
based tools. Remove the tools and the requirement to have Python
installed. Python 3 is still being used for tests, so keep the autoconf
test, but make it much simpler.
Previously, the code would do:
REQUIRE(alg == CURVE1 || alg == CURVE2);
[...]
if (alg == CURVE1) { /* code for CURVE1 */ }
else { /* code for CURVE2 */ }
This approach is less extensible and also more prone to errors in case
the initial REQUIRE() is forgotten. The code has been refactored to
use:
REQUIRE(alg == CURVE1 || alg == CURVE2);
[...]
switch (alg) {
case CURVE1: /* code for CURVE1 */; break;
case CURVE2: /* code for CURVE2 */; break;
default: INSIST(0);
}
The pk11/constants.h header contained static CK_BYTE arrays and
we had to use #defines to pull only those we need. This commit
changes the constants to only define byte arrays with the content
and either use them directly or define the CK_BYTE arrays locally
where used.
Coverity showed that the return value of `dst_key_gettime` was
unchecked in INITIALIZE_STATE. If DST_TIME_CREATED was not set we
would set the state to be initialized to a weird last changed time.
This would normally not happen because DST_TIME_CREATED is always
set. However, we would rather set the time to now (as the comment
also indicates) not match the creation time.
The comment on INITIALIZE_STATE also needs updating as we no
longer always initialize to HIDDEN.
Revert the change from ad03c22e97 as
further testing has shown that with hyper-threading disabled, named with
ISC rwlocks outperforms named with pthread rwlocks in cold cache testing
scenarios. Since building named with pthread rwlocks might still be a
better choice for some workloads, keep the compile-time option which
enables that.
Add two tests that checks that dynamic zones
can be updated and will be signed appropriately.
One zone covers an update with freeze/thaw, the
other covers an update through nsupdate.
When dnssec-policy was introduced, it implicitly set inline-signing.
But DNSSEC maintenance required either inline-signing to be enabled,
or a dynamic zone. In other words, not in all cases you want to
DNSSEC maintain your zone with inline-signing.
Change the behavior and determine whether inline-signing is
required: if the zone is dynamic, don't use inline-signing,
otherwise implicitly set it.
You can also explicitly set inline-signing to yes with dnssec-policy,
the restriction that both inline-signing and dnssec-policy cannot
be set at the same time is now lifted.
However, 'inline-signing no;' on a non-dynamic zone with a
dnssec-policy is not possible.
The yamlget.py file was changed in !3311 as part of making the
python code pylint and flake8 compliant. This omitted setting
'item' to 'item[key]' which caused the digdelv yaml tests to fail.
Also, the pretty printing is not really necessary, so remove
the "if key not in item; print error" logic.
Change 5332 renamed "dnssec-keys" configuration statement to the
more descriptive "trust-anchors". Not all occurrences in the
documentation had been updated.
All our MSVS Project files share the same intermediate directory. We
know that this doesn't cause any problems, so we can just disable the
detection in the project files.
Example of the warning:
warning MSB8028: The intermediate directory (.\Release\) contains files shared from another project (dnssectool.vcxproj). This can lead to incorrect clean and rebuild behavior.
There was a missing indirection for the pluginlist_cb_t *callback in the
declaration of the cfg_pluginlist_foreach function. Reported by MSVC as:
lib\isccfg\parser.c(4057): warning C4028: formal parameter 4 different from declaration
Due to a way the stdatomic.h shim is implemented on Windows, the MSVC
always things that the outside type is the largest - atomic_(u)int_fast64_t.
This can lead to false positives as this one:
lib\dns\adb.c(3678): warning C4477: 'fprintf' : format string '%u' requires an argument of type 'unsigned int', but variadic argument 2 has type 'unsigned __int64'
We workaround the issue by loading the value in a scoped local variable
with correct type first.
MSVC documentation states: "This warning can be caused when a pointer to
a const or volatile item is assigned to a pointer not declared as
pointing to const or volatile."
Unfortunately, this happens when we dynamically allocate and deallocate
block of atomic variables using isc_mem_get and isc_mem_put.
Couple of examples:
lib\isc\hp.c(134): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
lib\isc\hp.c(144): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
lib\isc\stats.c(55): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
lib\isc\stats.c(87): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
The InterlockedOr8() and InterlockedAnd8() first argument was cast
to (atomic_int_fast8_t) instead of (atomic_int_fast8_t *), this was
reported by MSVC as:
warning C4024: '_InterlockedOr8': different types for formal and actual parameter 1
warning C4024: '_InterlockedAnd8': different types for formal and actual parameter 1
Our vcxproj files set the WarningLevel to Level3, which is too verbose
for a code that needs to be portable. That basically leads to ignoring
all the errors that MSVC produces. This commits downgrades the
WarningLevel to Level1 and enables treating warnings as errors for
Release builds. For the Debug builds the WarningLevel got upgraded to
Level4, and treating warnings as errors is explicitly disabled.
We should eventually make the code clean of all MSVC warnings, but it's
a long way to go for Level4, so it's more reasonable to start at Level1.
For reference[1], these are the warning levels as described by MSVC
documentation:
* /W0 suppresses all warnings. It's equivalent to /w.
* /W1 displays level 1 (severe) warnings. /W1 is the default setting
in the command-line compiler.
* /W2 displays level 1 and level 2 (significant) warnings.
* /W3 displays level 1, level 2, and level 3 (production quality)
warnings. /W3 is the default setting in the IDE.
* /W4 displays level 1, level 2, and level 3 warnings, and all level 4
(informational) warnings that aren't off by default. We recommend
that you use this option to provide lint-like warnings. For a new
project, it may be best to use /W4 in all compilations. This option
helps ensure the fewest possible hard-to-find code defects.
* /Wall displays all warnings displayed by /W4 and all other warnings
that /W4 doesn't include — for example, warnings that are off by
default.
* /WX treats all compiler warnings as errors. For a new project, it
may be best to use /WX in all compilations; resolving all warnings
ensures the fewest possible hard-to-find code defects.
1. https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level?view=vs-2019
Windows BIND releases produced by GitLab CI are built from Git
repositories, not from release tarballs, which means the "srcid" file is
not present in the top source directory when MSBuild is invoked. This
causes the Git commit hash for such builds to be set to "unset_id".
Enable win32utils/Configure to try determining the commit hash for a
build by invoking Git on the build host if the "srcid" file is not
present (which is what its Unix counterpart does).
Our python code didn't adhere to any coding standard. In this commit, we add
flame8 (https://pypi.org/project/flake8/), and pylint (https://www.pylint.org/).
There's couple of exceptions:
- ans.py scripts are not checked, nor fixed as part of this MR
- pylint's missing-*-docstring and duplicate-code checks have
been disabled via .pylintrc
Both exceptions should be removed in due time.
The assembly code generated by MSVC for at least some signed comparisons
involving atomic variables incorrectly uses unsigned conditional jumps
instead of signed ones. In particular, the checks in isc_log_wouldlog()
are affected in a way which breaks logging on Windows and thus also all
system tests involving a named instance. Work around the issue by
assigning the values returned by atomic_load_acquire() calls in
isc_log_wouldlog() to local variables before performing comparisons.
Resolve "DNS rebinding protection is ineffective when BIND is configured as a forwarding DNS server"
Closes#1574
See merge request isc-projects/bind9!3342
This test asserts that option "deny-answer-aliases" works correctly
when forwarding requests.
As a matter of example, the behavior expected for a forwarder BIND
instance, having an option such as deny-answer-aliases { "domain"; }
is that when forwarding a request for *.anything-but-domain, it is
expected that it will return SERVFAIL if any answer received has a CNAME
for "*.domain".
(cherry picked from commit 9bdb960a16a69997b08746e698b6b02c8dc6c795)
BIND wasn't honoring option "deny-answer-aliases" when configured to
forward queries.
Before the fix it was possible for nameservers listed in "forwarders"
option to return CNAME answers pointing to unrelated domains of the
original query, which could be used as a vector for rebinding attacks.
The fix ensures that BIND apply filters even if configured as a forwarder
instance.
(cherry picked from commit af6a4de3d5ad6c1967173facf366e6c86b3ffc28)
Increate the DNSKEY TTL of the migrate.kasp zone for the following
reason: The key states are initialized depending on the timing
metadata. If a key is present long enough in the zone it will be
initialized to OMNIPRESENT. Long enough here is the time when it
was published (when the setup script was run) plus DNSKEY TTL.
Otherwise it is set to RUMOURED, or to HIDDEN if no timing metadata
is set or the time is still in the future.
Since the TTL is "only" 5 minutes, the DNSKEY state may be
initialized to OMNIPRESENT if the test is slow, but we expect it
to be in RUMOURED state. If we increase the TTL to a couple of
hours it is very unlikely that it will be initialized to something
else than RUMOURED.
This fixes another intermittent failure in the kasp system test.
It does not happen often, except for in the Windows platform tests
where it takes a long time to run the tests.
In the "kasp" system test, there is an "rndc reconfig" call which
triggers a new rekey event. check_next_key_event() verifies the time
remaining from the moment "rndc reconfig" is called until the next key
event. However, the next key event time is calculated from the key
times provided during key creation (i.e. during test setup). Given
this, if "rndc reconfig" is called a significant amount of time after
the test is started, some check_next_key_event() checks will fail.
Fix by calculating the time passed since the start of the test and
when 'rndc reconfig' happens. Substract this time from the
calculated next key event.
This only needs to be done after an "rndc reconfig" on zones where
the keymgr needs to wait for a period of time (for example for keys
to become OMNIPRESENT, or HIDDEN). This is on step 2 and step 5 of
the algorithm rollover. In step 2 there is a waiting period before
the DNSKEY is OMNIPRESENT, in step 5 there is a waiting period
before the DNSKEY is HIDDEN.
In step 1 new keys are created, in step 3 and 4 key states just
entered OMNIPRESENT, and in step 6 we no longer care because the
key lifetime is unlimited and we default to checking once per hour.
Regardless of our indifference about the next key event after step 6,
change some of the key timings in the setup script to better
reflect reality: DNSKEY is in HIDDEN after step 5, DS times have
changed when the new DS became active.
In case of normal fetch, the .recursionquota is attached and
ns_statscounter_recursclients is incremented when the fetch is created. Then
the .recursionquota is detached and the counter decremented in the
fetch_callback().
In case of prefetch or rpzfetch, the quota is attached, but the counter is not
incremented. When we reach the soft-quota, the function returns early but don't
detach from the quota, and it gets destroyed during the ns_client_endrequest(),
so no memory was leaked.
But because the ns_statscounter_recursclients is only incremented during the
normal fetch the counter would be incorrectly decremented on two occassions:
1) When we reached the softquota, because the quota was not properly detached
2) When the prefetch or rpzfetch was cancelled mid-flight and the callback
function was never called.
Rather than group key ids together, group key id with its
corresponding counters. This should make growing / shrinking easier
than having keyids then counters.
Add a statschannel test case for DNSSEC sign metrics that has more
keys than there are allocated stats counters for. This will produce
gibberish, but at least it should not crash.
The first attempt to add DNSSEC sign statistics was naive: for each
zone we allocated 64K counters, twice. In reality each zone has at
most four keys, so the new approach only has room for four keys per
zone. If after a rollover more keys have signed the zone, existing
keys are rotated out.
The DNSSEC sign statistics has three counters per key, so twelve
counters per zone. First counter is actually a key id, so it is
clear what key contributed to the metrics. The second counter
tracks the number of generated signatures, and the third tracks
how many of those are refreshes.
This means that in the zone structure we no longer need two separate
references to DNSSEC sign metrics: both the resign and refresh stats
are kept in a single dns_stats structure.
Incrementing dnssecsignstats:
Whenever a dnssecsignstat is incremented, we look up the key id
to see if we already are counting metrics for this key. If so,
we update the corresponding operation counter (resign or
refresh).
If the key is new, store the value in a new counter and increment
corresponding counter.
If all slots are full, we rotate the keys and overwrite the last
slot with the new key.
Dumping dnssecsignstats:
Dumping dnssecsignstats is no longer a simple wrapper around
isc_stats_dump, but uses the same principle. The difference is that
rather than dumping the index (key tag) and counter, we have to look
up the corresponding counter.
Resolve "Changing from auto-dnssec maintain to dnssec-policy x immediately deletes existing keys"
Closes#1706
See merge request isc-projects/bind9!3322
Add a test to ensure migration from 'auto-dnssec maintain;' to
dnssec-policy works even if the algorithm is changed. The existing
keys should not be removed immediately, but their goal should be
changed to become hidden, and the new keys with the different
algorithm should be introduced immediately.
If we initialize goals on all keys, superfluous keys that match
the policy all desire to be active. For example, there are six
keys available for a policy that needs just two, we only want to
set the goal state to OMNIPRESENT on two keys, not six.
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly. Earlier commit deals with migration where existing keys
match the policy. This commit deals with migration where existing
keys do not match the policy. In that case, named must not
immediately delete the existing keys, but gracefully roll to the
dnssec-policy.
However, named did remove the existing keys immediately. This is
because the legacy key states were initialized badly. Because
those keys had their states initialized to HIDDEN or RUMOURED, the
keymgr decides that they can be removed (because only when the key
has its states in OMNIPRESENT it can be used safely).
The original thought to initialize key states to HIDDEN (and
RUMOURED to deal with existing keys) was to ensure that those keys
will go through the required propagation time before the keymgr
decides they can be used safely. However, those keys are already
in the zone for a long time and making the key states represent
otherwise is dangerous: keys may be pulled out of the zone while
in fact they are required to establish the chain of trust.
Fix initializing key states for existing keys by looking more closely
at the time metadata. Add TTL and propagation delays to the time
metadata and see if the DNSSEC records have been propagated.
Initialize the state to OMNIPRESENT if so, otherwise initialize to
RUMOURED. If the time metadata is in the future, or does not exist,
keep initializing the state to HIDDEN.
The added test makes sure that new keys matching the policy are
introduced, but existing keys are kept in the zone until the new
keys have been propagated.
A few kasp system test tweaks to improve test failure debugging and
deal with tests related to migration to dnssec-policy.
1. When clearing a key, set lifetime to "none". If "none", skip
expect no lifetime set in the state file. Legacy keys that
are migrated but don't match the dnssec-policy will not have a
lifetime.
2. The kasp system test prints which key id and file it is checking.
Log explicitly if we are checking the id or a file.
3. Add quotes around "ID" when setting the key id, for consistency.
4. Fix a typo (non -> none).
5. Print which key ids are found, this way it is easier to see what
KEY[1-4] failed to match one of the key files.
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly. Several adjustments in the keymgr are required to get it right:
- Set published time on keys when we calculate prepublication time.
This is not strictly necessary, but it is weird to have an active
key without the published time set.
- Initalize key states also before matching keys. Determine the
target state by looking at existing time metadata: If the time
data is set and is in the past, it is a hint that the key and
its corresponding records have been published in the zone already,
and the state is initialized to RUMOURED. Otherwise, initialize it
as HIDDEN. This fixes migration to dnssec-policy from existing
keys.
- Initialize key goal on keys that match key policy to OMNIPRESENT.
These may be existing legacy keys that are being migrated.
- A key that has its goal to OMNIPRESENT *or* an active key can
match a kasp key. The code was changed with CHANGE 5354 that
was a bugfix to prevent creating new KSK keys for zones in the
initial stage of signing. However, this caused problems for
restarts when rollovers are in progress, because an outroducing
key can still be an active key.
The test for this introduces a new KEY property 'legacy'. This is
used to skip tests related to .state files.
The rwlock introduced to protect the .logconfig member of isc_log_t
structure caused a significant performance drop because of the rwlock
contention. It was also found, that the debug_level member of said
structure was not protected from concurrent read/writes.
The .dynamic and .highest_level members of isc_logconfig_t structure
were actually just cached values pulled from the assigned channels.
We introduced an even higher cache level for .dynamic and .highest_level
members directly into the isc_log_t structure, so we don't have to
access the .logconfig member in the isc_log_wouldlog() function.
After an RPZ zone is updated via zone transfer, the RPZ summary
database is updated, inserting the newly added names in the policy
zone and deleting the newly removed ones. The first part of this
was quantized so it would not run too long and starve other tasks
during large updates, but the second part was not quantized, so
that an update in which a large number of records were deleted
could cause named to become briefly unresponsive.
We could have a race between handle closing and processing async
callback. Deactivate the handle before issuing the callback - we
have the socket referenced anyway so it's not a problem.
We introduce a isc_quota_attach_cb function - if ISC_R_QUOTA is returned
at the time the function is called, then a callback will be called when
there's quota available (with quota already attached). The callbacks are
organized as a LIFO queue in the quota structure.
It's needed for TCP client quota - with old networking code we had one
single place where tcp clients quota was processed so we could resume
accepting when the we had spare slots, but it's gone with netmgr - now
we need to notify the listener/accepter that there's quota available so
that it can resume accepting.
Remove unused isc_quota_force() function.
The isc_quote_reserve and isc_quota_release were used only internally
from the quota.c and the tests. We should not expose API we are not
using.
ORACLE MySQL 8.0 has dropped the my_bool type, so we need to reinstate
it back when compiling with that version or higher. MariaDB is still
keeping the my_bool type. The numbering between the two (MariaDB 5.x
jumped to MariaDB 10.x) doesn't make the life of the developer easy.
Most build/test job names already contain a "clang", "gcc", or "msvc"
prefix which indicates the compiler used for a given job. Apply that
naming convention to all build/test job names.
Multiple YAML keys have identical values for both TSAN unit test job
definitions. Extract these common keys to a YAML anchor and use it in
TSAN unit test job definitions to reduce code duplication.
Definitions of jobs running unit tests under TSAN contain an
"after_script" YAML key. Since the "unit_test_job" anchor is included
in those job definitions before "after_script" is defined, the
job-specific value of that key overrides the one defined in the included
anchor. This prevents "kyua report-html" from being run for TSAN unit
test jobs. Moving the invocation of "kyua report-html" to the "script"
key in the "unit_test_job" anchor is not acceptable as it would cause
the exit code of that command to determine the result of all unit test
jobs and we need that to be the exit code of "make unit". Instead, add
"kyua report-html" invocations to the "after_script" key of TSAN unit
test job definitions to address the problem without affecting other job
definitions.
Multiple YAML keys have identical values for both TSAN system test job
definitions. Extract these common keys to a YAML anchor and use it in
TSAN system test job definitions to reduce code duplication.
Both "system_test_job" and "unit_test_job" YAML anchors contain a
"before_script" key. TSAN job definitions first specify their own value
of the "before_script" key and then include the aforementioned YAML
anchors, which results in the value of the "before_script" key being
overridden with the value specified by the included anchor. Given this,
remove "before_script" definitions specific to TSAN jobs as they serve
no practical purpose.
All assignments for the TSAN_OPTIONS variable are identical across the
entire .gitlab-ci.yml file. Define a global TSAN_OPTIONS_COMMON
variable and use it in job definitions to reduce code duplication.
The custom builds (oot, asan, tsan) were mostly built using Debian sid
amd64 image. The problem was that this image broke too easily, because
it's Debian "unstable" after all.
This commit introduces "base_image" that should be most stable with
extra bits on top (clang, coccinelle, cppcheck, ...). Currently, that
would be Debian buster amd64.
Other changes introduced by this commit:
* Change the default clang version to 10
* Run both ASAN and TSAN with both gcc and clang compilers
* Remove Clang Debian stretch i386 job
These are mostly false positives, the clang-analyzer FAQ[1] specifies
why and how to fix it:
> The reason the analyzer often thinks that a pointer can be null is
> because the preceding code checked compared it against null. So if you
> are absolutely sure that it cannot be null, remove the preceding check
> and, preferably, add an assertion as well.
The 4 warnings reported are:
dnssec-cds.c:781:4: warning: Access to field 'base' results in a dereference of a null pointer (loaded from variable 'buf')
isc_buffer_availableregion(buf, &r);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:996:36: note: expanded from macro 'isc_buffer_availableregion'
^
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:821:16: note: expanded from macro 'ISC__BUFFER_AVAILABLEREGION'
(_r)->base = isc_buffer_used(_b); \
^~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:152:29: note: expanded from macro 'isc_buffer_used'
((void *)((unsigned char *)(b)->base + (b)->used)) /*d*/
^~~~~~~~~
1 warning generated.
--
byname_test.c:308:34: warning: Access to field 'fwdtable' results in a dereference of a null pointer (loaded from variable 'view')
RUNTIME_CHECK(dns_fwdtable_add(view->fwdtable, dns_rootname,
^~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/util.h:318:52: note: expanded from macro 'RUNTIME_CHECK'
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/error.h:50:21: note: expanded from macro 'ISC_ERROR_RUNTIMECHECK'
((void)(ISC_LIKELY(cond) || \
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/likely.h:23:43: note: expanded from macro 'ISC_LIKELY'
^
1 warning generated.
--
./rndc.c:255:6: warning: Dereference of null pointer (loaded from variable 'host')
if (*host == '/') {
^~~~~
1 warning generated.
--
./main.c:1254:9: warning: Access to field 'sctx' results in a dereference of a null pointer (loaded from variable 'named_g_server')
sctx = named_g_server->sctx;
^~~~~~~~~~~~~~~~~~~~
1 warning generated.
References:
1. https://clang-analyzer.llvm.org/faq.html#null_pointer
The 3 warnings reported are:
os.c:872:7: warning: Although the value stored to 'ptr' is used in the enclosing expression, the value is never actually read from 'ptr'
if ((ptr = strtok_r(command, " \t", &last)) == NULL) {
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
--
rpz.c:1117:10: warning: Although the value stored to 'zbits' is used in the enclosing expression, the value is never actually read from 'zbits'
return (zbits &= x);
^ ~
1 warning generated.
--
openssleddsa_link.c:532:10: warning: Although the value stored to 'err' is used in the enclosing expression, the value is never actually read from 'err'
while ((err = ERR_get_error()) != 0) {
^ ~~~~~~~~~~~~~~~
1 warning generated.
tcpdns used transport-specific functions to operate on the outer socket.
Use generic ones instead, and select the proper call in netmgr.c.
Make the missing functions (e.g. isc_nm_read) generic and add type-specific
calls (isc__nm_tcp_read). This is the preparation for netmgr TLS layer.
There are several reason why remove Debian 8 from the CI:
* Debian 8 ("jessie") has been superseded by Debian 9 ("stretch").
* Regular security support updates have been discontinued as of
June 17th, 2018.
* Jessie LTS is supported from 17th June 2018 to June 30, 2020.
In other words, it's no longer officially supported by Debian security
team, but by the volunteer/paid contributor composed LTS team. And the
release will be discontinued in three months from now. We can use the
freed CI resources to bring new platforms or just to make the jobs run a
bit faster.
The reference BIND version used in the ABI check CI job is not
determined automatically - it needs to be updated after each BIND
release. Reflect that fact in the release checklist to make sure the
ABI check CI job is always comparing current code with the latest BIND
release on a given branch.
The files configure.ac and version are already up to date.
Updated CHANGES with 9.17.0 release line.
Fixed CHANGES by adding GitLab reference to entry 5357 and fix
grammar mistakes.
Add missing /util/check-make-install.in to .gitattributes.
The lib/*/api are already updated to match the new ranges.
I listed two new features under BIND 9.17 features that to me
seemed noteworthy.
The release notes look good to me.
When unit test fails, core file is created. Kyua's 'debug' command can
run GDB on it and provide backtrace. Unfortunately Kyua is picky about
location of these core files we opt to use custom Kyua fork and copy
core files from Kyua working directory to source tree and make it
available in GitLab.
The tkey test was not adapted to dynamic ports, so we had to run it in
sequence. This commit adds support for dynamic ports, and also makes
all the scripts shellcheck clean.
The eddsa test was not adapted to dynamic ports, so we had to run it in
sequence. This commit adds support for dynamic ports, and also makes
all the scripts shellcheck clean.
The ecdsa test was not adapted to dynamic ports, so we had to run it in
sequence. This commit adds support for dynamic ports, and also makes
all the scripts shellcheck clean.
The environment variable MAKE has been replaced with MAKE_COMMAND,
because overriding MAKE variable also changed the definition of the MAKE
inside the Makefiles, and we want only a single wrapper around the whole
build process.
Previously, setting `MAKE` to `bear make` meant that `bear make` would
be run at every nested make invocation, which messed up the upcoming
automake transition as compile_commands.json would be generated in every
subdirectory instead of just having one central file at the top of the
build tree.
All jobs now use solely the newer needs configuration to declare
dependencies between jobs:
needs:
- job: <foo>
artifacts: true
instead of combination of dependencies and needs which is deprecated.
This change completely unbundles the stages (alas the stages still needs
to stay because the job graph has to stay acyclic between the stages).
In isc_log_woudlog() the .logconfig member of isc_log_t structure was
accessed unlocked on the merit that there could be just a race when
.logconfig would be NULL, so the message would not be logged. This
turned not to be true, as there's also data race deeper. The accessed
isc_logconfig_t object could be in the middle of destruction, so the
pointer would be still non-NULL, but the structure members could point
to a chunk of memory no longer belonging to the object. Since we are
only accessing integer types (the log level), this would never lead to
a crash, it leads to memory access to memory area no longer belonging to
the object and this a) wrong, b) raises a red flag in thread-safety tools.
The isc_mem API now crashes on memory allocation failure, and this is
the next commit in series to cleanup the code that could fail before,
but cannot fail now, e.g. isc_result_t return type has been changed to
void for the isc_log API functions that could only return ISC_R_SUCCESS.
On Windows, C11 localtime_r() and gmtime_r() functions are not
available. While localtime() and gmtime() functions are already thread
safe because they use Thread Local Storage, it's quite ugly to #ifdef
around every localtime_r() and gmtime_r() usage to make the usage also
thread-safe on POSIX platforms.
The commit adds wrappers around Windows localtime_s() and gmtime_s()
functions.
NOTE: The implementation of localtime_s and gmtime_s in Microsoft CRT
are incompatible with the C standard since it has reversed parameter
order and errno_t return type.
The <isc/md.h> header directly included <openssl/evp.h> header which
enforced all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace, we no longer enforce this.
In the long run, this might also allow us to switch cryptographic
library implementation without affecting the downstream users.
While making the isc_md_type_t type opaque, the API using the data type
was changed to use the pointer to isc_md_type_t instead of using the
type directly.
This new option was added to fill a gap in RPZ configuration
options.
It was possible to instruct BIND wheter NSIP rewritting rules would
apply or not, as long as the required data was already in cache or not,
respectively, by means of the option nsip-wait-recurse.
A value of yes (default) could incur a little processing cost, since
BIND would need to recurse to find NS addresses in case they were not in
the cache.
This behavior could be changed by setting nsip-wait-recurse value to no,
in which case BIND would promptly return some error code if the NS IP addresses
data were not in cache, then BIND would start a recursive query
in background, so future similar requests would have the required data
(NS IPs) in cache, allowing BIND to apply NSIP rules accordingly.
A similar feature wasn't available for NSDNAME triggers, so this commit
adds the option nsdname-wait-recurse to fill this gap, as it was
expected by couple BIND users.
To get rid of the currently used FreeBSD-specific executor, move FreeBSD
CI jobs to libvirt-based executors. Make the necessary tag and variable
adjustments.
Waiting for the reply message will ensure that all messages being
looked for exist in the logs at the time of checking. When the
test was only waiting for the send message there was a race between
grep and the ns1 instance of named logging that it had seen the
request.
Save 'i' to 'locknum' and use that rather than using
'header->node->locknum' when performing the deferred
unlock as 'header->node->locknum' can theoretically be
different to 'i'.
The <isc/md.h> header directly included <openssl/hmac.h> header which
enforced all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace, we no longer enforce this.
In the long run, this might also allow us to switch cryptographic
library implementation without affecting the downstream users.
The two "functions" that isc/safe.h declared before were actually simple
defines to matching OpenSSL functions. The downside of the approach was
enforcing all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace changing the defines into
simple functions, we no longer enforce this. In the long run, this
might also allow us to switch cryptographic library implementation
without affecting the downstream users.
The previous commit removed the code related to the internal symbol
table. On platforms where available, we can now use backtrace_symbols()
to print more verbose symbols table to the output.
As there's now general availability of backtrace() and
backtrace_symbols() functions (see below), the commit also removes the
usage of glibc internals and the custom stack tracing.
* backtrace(), backtrace_symbols(), and backtrace_symbols_fd() are
provided in glibc since version 2.1.
* backtrace(), backtrace_symbols(), and backtrace_symbols_fd() first
appeared in Mac OS X 10.5.
* The backtrace() library of functions first appeared in NetBSD 7.0 and
FreeBSD 10.0.
The kasp system test is timing critical. The test passes on all
Linux based machines, but fails frequently on Windows. The test
takes a lot more time on Windows and at the final checks fail
because the expected next key event is too far off. For example:
I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
I:kasp:error: bad next key event time 20909 for zone \
step2.algorithm-roll.kasp (expect 21600)
I:kasp:failed
This is because the kasp system test calculates the time when the
next key event should occur based on the policy. This assumes that
named is able to do key management within a minute. But starting,
named, doing key management for other zones, and reconfiguring takes
much more time on Windows and thus the next key event on Windows is
much shorter than anticipated.
That this happens is a good thing because this means that the
correct next key event is used, but is not so nice for testing, as
it is hard to determine how much time named needed before finishing
the current key event.
Disable the kasp test on Windows now because it is blocking the
release. We know the cause of these test failures, and it is clear
that this is a fault in the test, not the code. Therefore we feel
comfortable disabling the test right now and work on a fix while
unblocking the release.
A data race was happening while BIND was starting due to
isc_log_wouldlog function accessing lctx->logconfig without a lock.
To prevent that without incurring much costs, that variable was made
atomic.
ABI checker tools generate HTML and TXT API compatibility reports of
BIND libraries. Comparison is being done between two bind source trees
which hold built BIND.
In the CI one version is the reference version defined by
BIND_BASELINE_VERSION variable, the latter one is the HEAD of branch
under test.
When configuring the same dnssec-policy for two zones with the same
name but in different views, there is a race condition for who will
run the keymgr first. If running sequential only one set of keys will
be created, if running parallel two set of keys will be created.
Lock the kasp when running looking for keys and running the key
manager. This way, for the same zone in different views only one
keyset will be created.
The dnssec-policy does not implement sharing keys between different
zones.
OpenBSD virtual machines seem to affected particularly badly by other
activity happening on the host. This causes trouble around release
time: when multiple tags are pushed to the repository, a large number of
jobs is started concurrently on all CI runners. In extreme cases, this
causes the system test suite to run for about an hour (!) on OpenBSD
VMs, with multiple tests failing. We investigated the test artifacts
for all such cases in the past and the outcome was always the same: test
failures were caused by extremely slow I/O on the guest. We tried
various tricks to work around this problem, but nothing helped.
Given the above, stop running OpenBSD system test jobs for pending BIND
releases to prevent the results of these jobs from affecting the
assessment of a given release's readiness for publication. This change
does not affect OpenBSD build jobs. OpenBSD system test jobs will still
be run for scheduled and web-requested pipelines, to make sure we catch
any severe issues with test code on that platform sooner or later.
Some comments started with a lowercased letter. Capitalized them to
be more consistent with the rest of the comments.
Add some newlines between `set_*` calls and check calls, also to be
more consistent with the other test cases.
There is a failure mode which gets triggered on heavily loaded
systems. A key change is scheduled in 5 seconds to make ZSK2 inactive
and ZSK3 active, but `named` takes more than 5 seconds to progress
from `rndc loadkeys` to the query check. At this time the SOA RRset
is already signed by the new ZSK which is not expected to be active
at that point yet.
Split up the checks to test the case where RRsets are signed
correctly with the offline KSK (maintained the signature) and
the active ZSK. First run, RRsets should be signed with the still
active ZSK2, second run RRsets should be signed with the new active
ZSK3.
This commit fixes isc_glob function on windows environments.
The file_list_t * object pointed to by pglob->reserved was missing
ISC_LIST_INIT intialization macro.
We may be checking the algorithm steps too fast: the reconfig
command may still be in progress. Make sure the zones are signed
and loaded by digging the NSEC records for these zones.
Algorithm rollover waited too long before introducing zone
signatures. It waited to make sure all signatures were resigned,
but when introducing a new algorithm, all signatures are resigned
immediately. Only add the sign delay if there is a predecessor key.
Algorithm rollover was stuck on submitting DS because keymgr thought
it would move to an invalid state. It did not match the current
key because it checked it against the current key in the next state.
Fixed by when checking the current key, check it against the desired
state, not the existing state.
Add a test case for algorithm rollover. This is triggered by
changing the dnssec-policy. A new nameserver ns6 is introduced
for tests related to dnssec-policy changes.
This requires a slight change in check_next_key_event to only
check the last occurrence. Also, change the debug log message in
lib/dns/zone.c to deal with checks when no next scheduled key event
exists (and default to loadkeys interval 3600).
Algorithm rollover will require four keys so introduce KEY4.
Also it requires to look at key files for multiple algorithms so
change getting key ids to be algorithm rollover agnostic (adjusting
count checks). The algorithm will be verified in check_key so
relaxing 'get_keyids' is fine.
Replace '${_alg_num}' with '$(key_get KEY[1-4] ALG_NUM)' in checks
to deal with multiple algorithms.
HAVE_UV_IMPORT and other config.h macros must not be set unconditionally
because no existing libuv release exposes uv_import() and/or uv_export()
yet. Windows builds not passing an explicit path to libuv to
win32utils/Configure are currently broken because of this, so comment
out the offending lines and describe when the aforementioned config.h
macros should be set.
Send AXFR instead of requested IXFR if the size of the incremental transfer is too large to sensibly IXFR
Closes#1375 and #1515
See merge request isc-projects/bind9!3113
- change name of 'bytes' to 'xfrsize' in dns_db_getsize() parameter list
and related variables; this is a more accurate representation of what
the function is doing
- change the size calculations in dns_db_getsize() to more accurately
represent the space needed for a *XFR message or journal file to contain
the data in the database. previously we returned the sizes of all
rdataslabs, including header overhead and offset tables, which
resulted in the database size being reported as much larger than the
equivalent *XFR or journal.
- map files caused a particular problem here: the fullname can't be
determined from the node while a file is being deserialized, because
the uppernode pointers aren't set yet. so we store "full name length"
in the dns_rbtnode structure while serializing, and clear it after
deserialization is complete.
the call initailizing a journal iterator can now optionally return
to the caller the size in bytes of an IXFR message (not including
DNS header overhead, signatures etc) containing the differences from
the beginning to the ending serial number.
this is calculated by scanning the journal transaction headers to
calculate the transfer size. since journal file records contain a length
field that is not included in IXFR messages, we subtract out the length
of those fields from the overall transaction length.
this necessitated adding an "RR count" field to the journal transaction
header, so we know how many length fields to subract. NOTE: this will
make existing journal files stop working!
Fixes a race between ns_client_killoldestquery and ns_client_endrequest -
killoldestquery takes a client from `recursing` list while endrequest
destroys client object, then killoldestquery works on a destroyed client
object. Prevent it by holding reclist lock while cancelling query.
- Define the SLOT environment variable before starting the test. This
variable defaults to 0 and that does not work with SoftHSM 2.
- The system test expects the PIN environment variable to be set to
"1234" while bin/tests/prepare-softhsm2.sh sets it to "0000".
Update bin/tests/prepare-softhsm2.sh so that it sets the PIN to
"1234".
- Move contents of bin/tests/system/pkcs11/prereq.sh to
bin/tests/system/pkcs11/setup.sh as the former was creating a file
called "supported" that was getting removed by the latter before
bin/tests/system/pkcs11/tests.sh could access it.
- Fix typo in "have_ecx".
- no longer exclude these entries when dumping the NTA table
- indicate "validate-except" entries with the keyword "permanent" in
place of an expiry date
- add a test for this feature, and update other tests to account for
the presence of extra lines in some rndc outputs
- incidentally removed the unused function dns_ntatable_dump()
- CHANGES, release note
There was a very slim chance of a race between isc_socket_detach and
process_fd: isc_socket_detach decrements references to 0, and before it
calls destroy gets preempted. Second thread calls process_fd, increments
socket references temporarily to 1, and then gets preempted, first thread
then hits assertion in destroy() as the reference counter is now 1 and
not 0.
With RRSIG records no longer being signed with the full
sig-validity-interval we need to ensure the zone->resigntime
as it may need to be set to a earlier time.
Previously badcache used one single mutex for everything, which
was causing performance issues. Use one global rwlock for the whole
hashtable and per-bucket mutexes.
When --with-zlib is passed to ./configure (or when the latter
autodetects zlib's presence), libisc uses certain zlib functions and
thus libisc's users should be linked against zlib in that case. Adjust
Makefile variables appropriately to prevent shared build failures caused
by underlinking.
mem.c:add_trace_entry() -> isc_hash_function() -> isc_siphash24()
129 for (; in != end; in += 8) {
6. byte_swapping: Performing a byte swapping operation on
in implies that it came from an external source, and is
therefore tainted.
130 uint64_t m = U8TO64_LE(in);
sending each group of queries simultaneously, and then checking the
output after the last one finishes, reduces the runtime of the
serve-stale test by about six minutes.
"yes" and "no" are permissible synonyms for "on" and "off", which
use exactly the same code paths. making sure they work isn't a good
use of 80 seconds of test time.
Fix a potential assertion failure on shutdown in ns__client_endrequest.
Scenario:
1. We are shutting down, interface->clientmgr is gone.
2. We receive a packet, it gets through ns__client_request
3. mgr == NULL, return
4. isc_nmhandle_detach calls ns_client_reset_cb
5. ns_client_reset_cb calls ns_client_endrequest
6. INSIST(client->state == NS_CLIENTSTATE_WORKING ||
client->state == NS_CLIENTSTATE_RECURSING) is not met
- we haven't started processing this packet so
client->state == NS_CLIENTSTATE_READY.
As a solution - don't do anything in ns_client_reset_cb if the client
is still in READY state.
- the configuration summary reported zlib compression was not
supported even when it was.
- when bind.keys.h was regenerated it violated clang-format style.
The change introduced by commit be159f5565
was not fully complete. Adjust ./configure summary so that it reflects
the new way the --with-tuning switch works, fixing the Autoconf variable
used for determining the value of that switch. Fix win32utils/Configure
so that it behaves the same way as its Unix counterpart.
* ctx needs to be destroyed before it is regenerated.
* emit the name of the signature to be replaced.
* cleanup memory before asserting so post longjump doesn't detect a
memory leak.
* comment code.
If a filename (the last argument) is not provided for named-checkzone or
named-compilezone, or if it is a single dash "-" character,
zone data will be read from stdin.
Example of invocation:
cat /etc/zone_name.db | named-compilezone -f text -F raw \
-o zone_name.raw zone_name
BSD sed does not recognize \s as a whitespace matching token. Make the
sed script in doc/arm/Makefile.in which ensures GitLab identifiers are
not split across lines portable by replacing \s with [[:space:]].
Artifacts generated by the docs:sid:amd64 job need to be retained longer
than for other jobs as they are used for building bind.isc.org contents.
If these artifacts are removed too quickly, pipelines in the pages/bind
GitLab project start failing, preventing content updates from being
published. Increase lifetime of the relevant job artifacts to prevent
this from happening.
We were using our own versions of isc_uv_{export,import} functions
for multithreaded TCP listeners. Upcoming libuv version will
contain proper uv_{export,import} functions - use them if they're
available.
Upcoming version of libuv will suport uv_recvmmsg and uv_sendmmsg. To
use uv_recvmmsg we need to provide a larger buffer and be able to
properly free it.
isc_task_pause/unpause were inherently thread-unsafe - a task
could be paused only once by one thread, if the task was running
while we paused it it led to races. Fix it by making sure that
the task will pause if requested to, and by using a 'pause reference
counter' to count task pause requests - a task will be unpaused
iff all threads unpause it.
Don't remove from queue when pausing task - we lock the queue lock
(expensive), while it's unlikely that the task will be running -
and we'll remove it anyway in dispatcher
this corrects some style glitches such as:
```
long_function_call(arg, arg2, arg3, arg4, arg5, "str"
"ing");
```
...by adjusting the penalties for breaking strings and call
parameter lists.
While testing BIND 9 on arm64 8+ core machine, it was discovered that
the weak variants in fact does spuriously fail, we haven't observed that
on other architectures.
This commit replaces all non-loop usage of atomic_compare_exchange_weak
with atomic_compare_exchange_strong.
This commit simplifies a bit the lock management within dns_resolver_prime()
and prime_done() functions by means of turning resolver's attribute
"priming" into an atomic_bool and by creating only one dependent object on the
lock "primelock", namely the "primefetch" attribute.
By having the attribute "priming" as an atomic type, it save us from having to
use a lock just to test if priming is on or off for the given resolver context
object, within "dns_resolver_prime" function.
The "primelock" lock is still necessary, since dns_resolver_prime() function
internally calls dns_resolver_createfetch(), and whenever this function
succeeds it registers an event in the task manager which could be called by
another thread, namely the "prime_done" function, and this function is
responsible for disposing the "primefetch" attribute in the resolver object,
also for resetting "priming" attribute to false.
It is important that the invariant "priming == false AND primefetch == NULL"
remains constant, so that any thread calling "dns_resolver_prime" knows for sure
that if the "priming" attribute is false, "primefetch" attribute should also be
NULL, so a new fetch context could be created to fulfill this purpose, and
assigned to "primefetch" attribute under the lock protection.
To honor the explanation above, dns_resolver_prime is implemented as follow:
1. Atomically checks the attribute "priming" for the given resolver context.
2. If "priming" is false, assumes that "primefetch" is NULL (this is
ensured by the "prime_done" implementation), acquire "primelock"
lock and create a new fetch context, update "primefetch" pointer to
point to the newly allocated fetch context.
3. If "priming" is true, assumes that the job is already in progress,
no locks are acquired, nothing else to do.
To keep the previous invariant consistent, "prime_done" is implemented as follow:
1. Acquire "primefetch" lock.
2. Keep a reference to the current "primefetch" object;
3. Reset "primefetch" attribute to NULL.
4. Release "primefetch" lock.
5. Atomically update "priming" attribute to false.
6. Destroy the "primefetch" object by using the temporary reference.
This ensures that if "priming" is false, "primefetch" was already reset to NULL.
It doesn't make any difference in having the "priming" attribute not protected
by a lock, since the visible state of this variable would depend on the calling
order of the functions "dns_resolver_prime" and "prime_done".
As an example, suppose that instead of using an atomic for the "priming" attribute
we employed a lock to protect it.
Now suppose that "prime_done" function is called by Thread A, it is then preempted
before acquiring the lock, thus not reseting "priming" to false.
In parallel to that suppose that a Thread B is scheduled and that it calls
"dns_resolver_prime()", it then acquires the lock and check that "priming" is true,
thus it will consider that this resolver object is already priming and it won't do
any more job.
Conversely if the lock order was acquired in the other direction, Thread B would check
that "priming" is false (since prime_done acquired the lock first and set "priming" to false)
and it would initiate a priming fetch for this resolver.
An atomic variable wouldn't change this behavior, since it would behave exactly the
same, depending on the function call order, with the exception that it would avoid
having to use a lock.
There should be no side effects resulting from this change, since the previous
implementation employed use of the more general resolver's "lock" mutex, which
is used in far more contexts, but in the specifics of the "dns_resolver_prime"
and "prime_done" it was only used to protect "primefetch" and "priming" attributes,
which are not used in any of the other critical sections protected by the same lock,
thus having zero dependency on those variables.
- add util/cformat.sh, which runs clang-format on all C files with
the default .clang-format, and on all header files with a slightly
modified version.
- use correct bracing after multi-line control statements
- stop aligning variable declarations to avoid problems with pointer
alignment, but retain aligned declarations in header files so that
struct definitions look cleaner.
- static function prototypes in C files can skip the line break after
the return type, but function prototypes in header files still have
the line break.
- don't break-before-brace in function definitions. ISC style calls
for braces on the same line when function parameters fit on a single
line, and a line break if they don't, but clang-format doesn't yet
support that distinction. one-line function definitions are about
four times more common than multi-line, so let's use the option that
deviates less.
Both clang-tidy and uncrustify chokes on statement like this:
for (...)
if (...)
break;
This commit uses a very simple semantic patch (below) to add braces around such
statements.
Semantic patch used:
@@
statement S;
expression E;
@@
while (...)
- if (E) S
+ { if (E) { S } }
@@
statement S;
expression E;
@@
for (...;...;...)
- if (E) S
+ { if (E) { S } }
@@
statement S;
expression E;
@@
if (...)
- if (E) S
+ { if (E) { S } }
Submissions to Coverity Scan should be limited to those originated from
release branches and only from a specific schedule which holds
COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN variables.
This job requires two CI variables to be set:
- COVERITY_SCAN_PROJECT_NAME: project name, which is associated with
the BIND branch for which this job is executed, e.g. "bind-master",
- COVERITY_SCAN_TOKEN: project token.
There was a hard limit set on number of uvreq and nmhandles
that can be allocated by a pool, but we don't handle a situation
where we can't get an uvreq. Don't limit the number at all,
let the OS deal with it.
The memory ordering in the rwlock was all wrong, I am copying excerpts
from the https://en.cppreference.com/w/c/atomic/memory_order#Relaxed_ordering
for the convenience of the reader:
Relaxed ordering
Atomic operations tagged memory_order_relaxed are not synchronization
operations; they do not impose an order among concurrent memory
accesses. They only guarantee atomicity and modification order
consistency.
Release-Acquire ordering
If an atomic store in thread A is tagged memory_order_release and an
atomic load in thread B from the same variable is tagged
memory_order_acquire, all memory writes (non-atomic and relaxed atomic)
that happened-before the atomic store from the point of view of thread
A, become visible side-effects in thread B. That is, once the atomic
load is completed, thread B is guaranteed to see everything thread A
wrote to memory.
The synchronization is established only between the threads releasing
and acquiring the same atomic variable. Other threads can see different
order of memory accesses than either or both of the synchronized
threads.
Which basically means that we had no or weak synchronization between
threads using the same variables in the rwlock structure. There should
not be a significant performance drop because the critical sections were
already protected by:
while(1) {
if (relaxed_atomic_operation) {
break;
}
LOCK(lock);
if (!relaxed_atomic_operation) {
WAIT(sem, lock);
}
UNLOCK(lock)l
}
I would add one more thing to "Don't do your own crypto, folks.":
- Also don't do your own locking, folks.
The code for specifying OpenSSL PKCS#11 engine as part of the label
(e.g. -l "pkcs11:token=..." instead of -E pkcs11 -l "token=...")
was non-functional. This commit just cleans the related code.
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
Our destroy functions usually look like this:
void
foo_destroy(foo_t **foop) {
foo_t foo = *foop;
...destroy the contents of foo...
*foop = NULL;
}
nulling the pointer should be done as soon as possible which is
not always the case. This commit adds simple semantic patch that
changes the example function to:
void
foo_destroy(foo_t **foop) {
foo_t foo = *foop;
*foop = NULL;
...destroy the contents of foo...
}
On OpenBSD, the bin/tests/system/pipelined/ans5/ans.py script does not
shut down when it is sent the SIGTERM signal. What seems to be
happening is that starting the UDP listening thread somehow makes the
accept() calls in the script's main thread uninterruptible and thus the
SIGTERM signal sent to the main thread does not get processed until a
TCP connection is established with the script's TCP socket. Work around
the issue by setting a timeout for operations performed on the script's
TCP socket, so that each accept() call in the main thread's infinite
loop returns after at most 1 second, allowing termination signals sent
to the script to be processed.
The key-directory keyword actually does nothing right now but may
be useful in the future if we want to differentiate between key
directories or HSM keys, or if we want to speficy different
directories for different keys or policies. Make it optional for
the time being.
The keyword 'unlimited' can be used instead of PT0S which means the
same but is more comprehensible for users.
Also fix some redundant "none" parameters in the kasp test.
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, until
we fix the code to reuse the context and key we'll use our own
implementation of siphash.
Add checks to the kasp system test to verify CDNSKEY publication.
This test is not entirely complete, because when there is a CDNSKEY
available but there should not be one for KEY N, it is hard to tell
whether the existing CDNSKEY actually belongs to KEY N or another
key.
The check works if we expect a CDNSKEY although we cannot guarantee
that the CDNSKEY is correct: The test verifies existence, not
correctness of the record.
When you do a restart or reconfig of named, or rndc loadkeys, this
triggers the key manager to run. The key manager will check if new
keys need to be created. If there is an active key, and key rollover
is scheduled far enough away, no new key needs to be created.
However, there was a bug that when you just start to sign your zone,
it takes a while before the KSK becomes an active key. An active KSK
has its DS submitted or published, but before the key manager allows
that, the DNSKEY needs to be omnipresent. If you restart named
or rndc loadkeys in quick succession when you just started to sign
your zone, new keys will be created because the KSK is not yet
considered active.
Fix is to check for introducing as well as active keys. These keys
all have in common that their goal is to become omnipresent.
In system tests on Windows tool's local port can sometimes clash with
'named'. On Unix the system is poked for the minimal local port,
otherwise is set to 32768 as a sane minimum. For Windows we don't
poke but set a hardcoded limit; this change aligns the limit with
Unix and changes it to 32768.
10067 cleanup:
CID 1452683 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking dispatch suggests that it
may be null, but it has already been dereferenced on all
paths leading to the check.
10068 if (dispatch != NULL)
10069 isc_mem_put(server->mctx, dispatch, sizeof(*dispatch));
1549 cleanup:
1550 if (dctx->dbiter != NULL)
1551 dns_dbiterator_destroy(&dctx->dbiter);
1552 if (dctx->db != NULL)
1553 dns_db_detach(&dctx->db);
CID 1452686 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking dctx suggests that it may
be null, but it has already been dereferenced on all paths
leading to the check.
1554 if (dctx != NULL)
1555 isc_mem_put(mctx, dctx, sizeof(*dctx));
707 complete_allnds:
CID 1452689 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking dir_list suggests that it
may be null, but it has already been dereferenced on all
paths leading to the check.
708 if (dir_list != NULL) {
709 /* clean up entries from list. */
389 else
CID 1452695 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking lcfg suggests that it may
be null, but it has already been dereferenced on all paths
leading to the check.
390 if (lcfg != NULL)
391 isc_logconfig_destroy(&lcfg);
122 cleanup:
CID 1452696 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking s suggests that it may be
null, but it has already been dereferenced on all paths
leading to the check.
123 if (s != NULL)
124 isc_mem_free(mctx, s);
255 flag_fail:
256 /* get rid of what was build of the query list */
CID 1452697 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking tql suggests that it may
be null, but it has already been dereferenced on all paths
leading to the check.
257 if (tql != NULL)
258 destroy_querylist(mctx, &tql);
6412 cleanup:
6413 dns_rdataset_disassociate(&neg);
6414 dns_rdataset_disassociate(&negsig);
CID 1452700 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking closest suggests that it
may be null, but it has already been dereferenced on all
paths leading to the check.
6415 if (closest != NULL)
6416 free_noqname(mctx, &closest);
336 cleanup_mem:
337 /* cleanup memory */
338
339 /* free tmpPath memory */
CID 1452701 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking tmpPath suggests that it
may be null, but it has already been dereferenced on all
paths leading to the check.
340 if (tmpPath != NULL && result != ISC_R_SUCCESS)
341 isc_mem_free(named_g_mctx, tmpPath);
342
343 /* free tmpPath memory */
344 return (result);
13429 cleanup:
13430 cancel_refresh(zone);
CID 1452702 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking stub suggests that it may
be null, but it has already been dereferenced on all paths
leading to the check.
13431 if (stub != NULL) {
13432 stub->magic = 0;
6367cleanup:
6368 dns_rdataset_disassociate(&neg);
6369 dns_rdataset_disassociate(&negsig);
CID 1452704 (#1 of 1): Dereference before null check
(REVERSE_INULL) check_after_deref: Null-checking noqname
suggests that it may be null, but it has already been
dereferenced on all paths leading to the check.
6370 if (noqname != NULL)
6371 free_noqname(mctx, &noqname);
11030 cleanup:
CID 1452705 (#1 of 1): Dereference before null check
(REVERSE_INULL) check_after_deref: Null-checking dctx
suggests that it may be null, but it has already been
dereferenced on all paths leading to the check.
11031 if (dctx != NULL)
11032 dumpcontext_destroy(dctx);
11033 return (result);
1401 }
CID 1453455 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking event suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.
1402 if (event != NULL)
1403 isc_event_free(ISC_EVENT_PTR(&event));
13836 if (zone != NULL)
13837 dns_zone_detach(&zone);
null: At condition dz != NULL, the value of dz must be NULL.
dead_error_condition: The condition dz != NULL cannot be true.
13838 if (dz != NULL) {
CID 1453456 (#1 of 1): Logically dead code (DEADCODE)
dead_error_begin: Execution cannot reach this statement:
dns_zone_detach(&dz->zone);.
13839 dns_zone_detach(&dz->zone);
13840 isc_mem_put(named_g_mctx, dz, sizeof(*dz));
13841 }
128 return (ISC_R_SUCCESS);
129
CID 1456146 (#1 of 1): Structurally dead code (UNREACHABLE)
unreachable: This code cannot be reached: {
if (dst->labels[i] != N....
130 do {
402 ctx->serve_stale_ttl = 0;
notnull: At condition indentctx, the value of indentctx
cannot be NULL. dead_error_condition: The condition indentctx
must be true.
CID 1456147 (#1 of 1): Logically dead code (DEADCODE)
dead_error_line: Execution cannot reach the expression
default_indent inside this statement: ctx->indent = (indentctx
? ....
403 ctx->indent = indentctx ? *indentctx : default_indent;
1636 cleanup:
CID 1458130 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking buffer suggests that it may be
null, but it has already been dereferenced on all paths leading to
the check.
1637 if (buffer != NULL)
1638 isc_buffer_free(&buffer);
Found by LGTM.com (see below for description), and while it should not
happen as EDNS OPT RDLEN is uint16_t, the fix is easy. A little bit
of cleanup is included too.
> In a loop condition, comparison of a value of a narrow type with a value
> of a wide type may result in unexpected behavior if the wider value is
> sufficiently large (or small). This is because the narrower value may
> overflow. This can lead to an infinite loop.
Increase the short lived record TTL and negative SOA TTL to make
this test less vulnerable to timing issues. The drawback is that we
also have to sleep longer in this test.
Add queries and checks for CAA RRtype in the serve-stale test.
Ensure that the "Others" rrtype stat counter is incremented and
decremented properly if the RRset becomes stale/ancient.
The low max-stale-ttl config option needs to be increased in order
to match the timing when things expire (aka become ancient).
This commit simplifies the cachedb rrset statistics in two ways:
- Introduce new rdtypecounter arithmetics, allowing bitwise
operations.
- Remove the special DLV statistic counter.
New rdtypecounter arithmetics
-----------------------------
"The rdtypecounter arithmetics is a brain twister". Replace the
enum counters with some defines. A rdtypecounter is now 8 bits for
RRtypes and 3 bits for flags:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | S |NX| RRType |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
If the 8 bits for RRtype are all zero, this is an Other RRtype.
Bit 7 is the NXRRSET (NX) flag and indicates whether this is a
positive (0) or a negative (1) RRset.
Then bit 5 and 6 mostly tell you if this counter is for an active,
stale, or ancient RRtype:
S = 0x00 means Active
S = 0x01 means Stale
S = 0x10 means Ancient
Since a counter cannot be stale and ancient at the same time, we
treat S = 0x11 as a special case to deal with NXDOMAIN counters.
S = 0x11 indicates an NXDOMAIN counter and in this case the RRtype
field signals the expiry of this cached item:
RRType = 0 means Active
RRType = 1 means Stale
RRType = 2 means Ancient
This also removes counting the DLV RRtype separately. Since we have
deprecated the lookaside validation it makes no sense to keep this
special statistic counter.
Since OpenBSD 6.6 is the current OpenBSD release, replace OpenBSD 6.5
GitLab CI jobs with their up-to-date counterparts.
As CI jobs for OpenBSD 6.6 will be run by a generalized libvirt executor
rather than an OpenBSD-specific one, make the necessary tag and variable
adjustments as well.
- Add quotes before and after zone name when generating "addzone"
input so avoid "unexpected token" errors.
- Use a hex digest for zone filenames when the zone or view name
contains a slash.
- Test with a domain name containing a slash.
- Incidentally added 'catzhash.py' to contrib/scripts to generate
hash labels for catalog zones, as it was needed to write the test.
The isc_buffer_allocate() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_buffer_allocate() now returns void.
The isc_mempool_create() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_mempool_create() now returns void.
Each system test can be marked as failed not only due to some tested
component(s) not behaving as expected, but also because of core dumps,
assertion failures, and/or ThreadSanitizer reports being found among its
artifacts. Make the system test summary list the tests which exhibit
such atypical symptoms to more clearly present the nature of problems
found.
The `rndc signing -clear` command cleans up the private-type records
that keep track of zone signing activity, but before this change it
did not tell the secondary servers that the zone has changed.
Resolve "bind 9.14.8 and 9.14.9 aborts when queried for non-existing domain in chaos class"
Closes#1569 and #1540
See merge request isc-projects/bind9!2843
Added test to ensure that NXDOMAIN is returned when BIND is queried for a
non existing domain in CH class (if a view of CHAOS class is configured)
and that it also doesn't crash anymore in those cases.
Function dns_view_findzonecut in view.c wasn't correctly handling
classes other than IN (chaos, hesiod, etc) whenever the name being
looked up wasn't in cache or in any of the configured zone views' database.
That resulted in a NULL fname being used in resolver.c:4900, which
in turn was triggering abort.
When both 'broken' and 'failed' test cases appear in unit test output
...
===> Broken tests
lib/isc/tests/socket_test:main -> broken: Test case timed out [300.022s]
===> Failed tests
lib/isc/tests/time_test:main -> failed: 2 of 6 tests failed [0.006s]
===> Summary
...
spurious '===>' string gets matched, that results in the following
error:
Usage error for command debug: '===>' is not a test case identifier (missing ':'?).
Following change makes sure the string is omitted.
I checked on FreeBSD and OpenBSD that the AWK construct is supported.
If we created a key, mark its SyncPublish time as 'now' and started
bind the key might not be published if the SyncPublish time is in
the same second as the time the zone is loaded. This is mostly
for dnssec system test, as this kind of scenario is very unlikely
in a real world environment.
To reproduce the race - create a task, send two events to it, first one
must take some time. Then, from the outside, pause(), unpause() and detach()
the task.
When the long-running event is processed by the task it is in
task_state_running state. When we called pause() the state changed to
task_state_paused, on unpause we checked that there are events in the task
queue, changed the state to task_state_ready and enqueued the task on the
workers readyq. We then detach the task.
The dispatch() is done with processing the event, it processes the second
event in the queue, and then shuts down the task and frees it (as it's not
referenced anymore). Dispatcher then takes the, already freed, task from
the queue where it was wrongly put, causing an use-after free and,
subsequently, either an assertion failure or a segmentation fault.
The probability of this happening is very slim, yet it might happen under a
very high load, more probably on a recursive resolver than on an
authoritative.
The fix introduces a new 'task_state_pausing' state - to which tasks
are moved if they're being paused while still running. They are moved
to task_state_paused state when dispatcher is done with them, and
if we unpause a task in paused state it's moved back to task_state_running
and not requeued.
This is a bug I encountered when trying to schedule an algorithm
rollover. My plan, for a zone whose maximum TTL is 48h, was to sign
with the new algorithm and schedule a change of CDS records for more
than 48 hours in the future, roughly like this:
$ dnssec-keygen -a 13 -fk -Psync now+50h $zone
$ dnssec-keygen -a 13 $zone
$ dnssec-settime -Dsync now+50h $zone_ksk_old
However the algorithm 13 CDS was published immediately, which could
have made the zone bogus.
To reveal the bug using the `smartsign` test, this change just adds a
KSK with all its times in the future, so it should not affect the
existing checks at all. But the final check (that there are no CDS or
CDSNSKEY records after -Dsync) fails with the old `syncpublish()`
logic, because the future key's sync records appear early. With the
new `syncpublish()` logic the future key does not affect the test, as
expected, and it now passes.
When two threads unreferenced handles coming from one socket while
the socket was being destructed we could get a use-after-free:
Having handle H1 coming from socket S1, H2 coming from socket S2,
S0 being a parent socket to S1 and S2:
Thread A Thread B
Unref handle H1 Unref handle H2
Remove H1 from S1 active handles Remove H2 from S2 active handles
nmsocket_maybe_destroy(S1) nmsocket_maybe_destroy(S2)
nmsocket_maybe_destroy(S0) nmsocket_maybe_destroy(S0)
LOCK(S0->lock)
Go through all children, figure
out that we have no more active
handles:
sum of S0->children[i]->ah == 0
UNLOCK(S0->lock)
destroy(S0)
LOCK(S0->lock)
- but S0 is already gone
We weren't consistent about who should unreference the handle in
case of network error. Make it consistent so that it's always the
client code responsibility to unreference the handle - either
in the callback or right away if send function failed and the callback
will never be called.
If taskmgr is shutting down ns_client_setup will fail to create
a task for the newly created client, we weren't cleaning up already
created/attached things (memory context, server, clientmgr).
In tcp and udp stoplistening code we accessed libuv structures
from a different thread, which caused a shutdown crash when named
was under load. Also added additional DbC checks making sure we're
in a proper thread when accessing uv_ functions.
We had a race in which n UDP socket could have been already closing
by libuv but we still sent data to it. Mark socket as not-active
when stopping listening and verify that socket is not active when
trying to send data to it.
if validator_start() is called with validator->event->message set to
NULL, we can't use message->rcode to decide which negative proofs are
needed, so we use the rdataset attributes instead to determine whether
the rdataset was cached as NXDOMAIN or NODATA.
We pass interface as an opaque argument to tcpdns listening socket.
If we stop listening on an interface but still have in-flight connections
the opaque 'interface' is not properly reference counted, and we might
hit a dead memory. We put just a single source of truth in a listening
socket and make the child sockets use that instead of copying the
value from listening socket. We clean the callback when we stop listening.
- isc__netievent_storage_t was to small to contain
isc__netievent__socket_streaminfo_t on Windows
- handle isc_uv_export and isc_uv_import errors properly
- rewrite isc_uv_export and isc_uv_import on Windows
hp implementation requires an object for each thread accessing
a hazard pointer. previous implementation had a hardcoded
HP_MAX_THREAD value of 128, which failed on machines with lots of
CPU cores (named uses 3n threads). We make isc__hp_max_threads
configurable at startup, with the value set to 4*named_g_cpus.
It's also important for this value not to be too big as we do
linear searches on a list.
it now removes matching trust anchors from from the dslist while leaving
the other trust anchors in place.
also cleaned up the API to remove functions that were never being used.
NOTE: the keytable test is still failing because dns_keytable_deletekey()
is looking for exact matches in keynodes containing dst_key objects,
which no keynode has anymore.
the internal keytable structure has not yet been changed, but
insertion of DS anchors is the only method now available.
NOTE: the keytable unit test is currently failing because of tests
that expect individual keynode objects to contain single DST key
objects.
as initial-key and static-key trust anchors will now be stored as a
DS rrset, code referencing keynodes storing DNSKEY trust anchors will
no longer be reached.
this function is used by dns_view_untrust() to handle revoked keys, so
it will still be needed after the keytable/validator refactoring is
complete, even though the keytable will be storing DS trust anchors
instead of keys. to simplify the way it's called, it now takes a DNSKEY
rdata struct instead of a DST key.
The isc_refcount API that provides reference counting lost DbC checks for
overflows and underflows in the isc_refcount_{increment,decrement} functions.
The commit restores the overflow check in the isc_refcount_increment and
underflows check in the isc_refcount_decrement by checking for the previous
value to not be on the boundary.
This commits removes superfluous checks when using the isc_refcount API.
Examples of superfluous checks:
1. The isc_refcount_decrement function ensures there was not underflow,
so this check is superfluous:
INSIST(isc_refcount_decrement(&r) > 0);
2 .The isc_refcount_destroy() includes check whether the counter
is zero, therefore this is superfluous:
INSIST(isc_refcount_decrement(&r) == 1 && isc_refcount_destroy(&r));
If a connection was closed early (right after accept()) an assertion
that assumed that the connection was still alive could be triggered
in accept_connection. Handle those errors properly and not with
assertions, free all the resources afterwards.
- the socket stat counters have been moved from socket.h to stats.h.
- isc_nm_t now attaches to the same stats counter group as
isc_socketmgr_t, so that both managers can increment the same
set of statistics
- isc__nmsocket_init() now takes an interface as a paramter so that
the address family can be determined when initializing the socket.
- based on the address family and socket type, a group of statistics
counters will be associated with the socket - for example, UDP4Active
with IPv4 UDP sockets and TCP6Active with IPv6 TCP sockets. note
that no counters are currently associated with TCPDNS sockets; those
stats will be handled by the underlying TCP socket.
- the counters are not actually used by netmgr sockets yet; counter
increment and decrement calls will be added in a later commit.
If pktinfo were supported then we could listen on :: for ipv6 and get
the information about the destination address from pktinfo structure passed
in recvmsg but this method is not portable and libuv doesn't support it - so
we need to listen on all interfaces.
We should verify that this doesn't impact performance (we already do it for
ipv4) and either remove all the ipv6pktinfo detection code or think of fixing
libuv.
- use UV_{TC,UD}P_IPV6ONLY for IPv6 sockets, keeping the pre-netmgr
behaviour.
- add a new listening_error bool flag which is set if the child
listener fails to start listening. This fixes a bug where named would
hang if, e.g., we failed to bind to a TCP socket.
For multithreaded TCP listening we need to pass a bound socket to all
listening threads. Instead of using uv_pipe handle passing method which
is quite complex (lots of callbacks, each of them with its own error
handling) we now use isc_uv_export() to export the socket, pass it as a
member of the isc__netievent_tcpchildlisten_t structure, and then
isc_uv_import() it in the child thread, simplifying the process
significantly.
These functions can be used to pass a uv handle between threads in a
safe manner. The other option is to use uv_pipe and pass the uv_handle
via IPC, which is way more complex. uv_export() and uv_import() functions
existed in libuv at some point but were removed later. This code is
based on the original removed code.
The Windows version of the code uses two functions internal to libuv;
a patch for libuv is attached for exporting these functions.
Ensure BIND is continuously tested on Tumbleweed, a pure rolling release
version of openSUSE. This will allow BIND incompatibilities with latest
upstream versions of its dependencies to be caught more quickly.
5339. [bug] With some libmaxminddb versions, named could erroneously
match an IP address not belonging to any subnet defined
in a given GeoIP2 database to one of the existing
entries in that database. [GL #1552]
Only comparing the value of the integer passed as the last argument to
MMDB_lookup_sockaddr() against MMDB_SUCCESS is not enough to ensure that
an MMDB lookup was successful - the 'found_entry' field of the
MMDB_lookup_result_s structure returned by that function also needs to
be true or else the remaining contents of that structure should be
ignored as the lookup failed. Extend the relevant logical condition in
get_entry_for() to ensure the latter does not return incorrect MMDB
entries for IP addresses which do not belong to any subnet defined in a
given GeoIP2 database.
Before this change, there was a missing blank line between the
negative trust anchors for one view, and the heading line for the next
view. This is because dns_ntatable_totext() omits the last newline.
There is an example of the incorrect output below; the fixed output
has a blank line before "Start view auth".
secure roots as of 21-Oct-2019 12:03:23.500:
Start view rec
Secure roots:
./RSASHA256/20326 ; managed
Negative trust anchors:
example.com: expiry 21-Oct-2019 13:03:15.000
Start view auth
Secure roots:
./RSASHA256/20326 ; managed
Negative trust anchors:
example.com: expiry 21-Oct-2019 13:03:07.000
Some unit tests need various managers to be created before they are run.
The interface manager spawned during libns tests listens on a fixed port
number, which causes intermittent issues when multiple tests using an
interface manager are run concurrently. Make the interface manager
listen on a randomized port number to greatly reduce the risk of
multiple unit tests using the same port concurrently.
"rndc signing -serial <value>" could take longer than a second to
complete. Loop waiting for update to succeed.
For tests where "rndc signing -serial <value>" is supposed to not
succeed, repeatedly test that we don't get the new serial, then
test that we have the old value. This should prevent false negatives.
GitLab issue and merge request numbers placed in release notes (in the
form of "#1234" for issues and "!5678" for merge requests) should not be
split across two lines. Extend the shell pipeline generating
doc/arm/notes.txt with a sed invocation which prevents such splitting.
The initial tcp statistics test was not testing tcp-highwater counter,
but only initial number of current TCP clients, so this missing test was
added to ensure initial tcp-highwater value is correct.
After the network manager rewrite, tcp-higwater stats was only being
updated when a valid DNS query was received over tcp.
It turns out tcp-quota is updated right after a tcp connection is
accepted, before any data is read, so in the event that some client
connect but don't send a valid query, it wouldn't be taken into
account to update tcp-highwater stats, that is wrong.
This commit fix tcp-highwater to update its stats whenever a tcp connection
is established, independent of what happens after (timeout/invalid
request, etc).
During BIND startup it scans for network interfaces available, in this
process it ensures that for every interface it will bind and listen to,
at least one socket will be always available accepting connections on
that interface, this way avoiding some DOS attacks that could exploit
tcp quota on some interface and make others unavailable.
In the previous network implementation this initial "reserved" tcp-quota
used by BIND was already been added to the tcp-highwater stats, but with
the new network code it was necesary to add this workaround to ensure
tcp-highwater stats reflect the tcp-quota used by BIND after startup.
- Add a GitLab merge request number to the "trust-anchors" release
note and slightly rephrase its second half.
- Replace tabs with spaces in doc/arm/notes-9.15.7.xml to retain
consistency with other XML files containing release notes.
- Move the "Security Fixes" section for BIND 9.15.6 higher up, for
consistency with release notes for other versions.
- Add a missing release note for TCP high-water. That feature was not
yet merged when the initial version of !2524 was prepared and its
release note was missed when that merge request was later rebased.
- Rephrase the release note for CVE-2019-6477 so that it uses the same
text as its corresponding notes in all other releases.
- Unify whitespace in doc/arm/notes-9.15.6.xml.
Add a GitLab CI job (which is run only if all other jobs in a pipeline
succeed) that builds a BIND release tarball, i.e. fetches the source
tarball from the tarball building job, creates Windows zips, puts
certain parts of BIND documentation into the appropriate places, and
packs it all up into a single tarball whose contents can be subsequently
signed and published.
Add a system test job for binaries created by Visual Studio in the
"Debug" build configuration to GitLab CI so that they can be tested
along their "Release" counterparts when necessary.
Add a Visual Studio build job using the "Debug" build configuration to
GitLab CI without enabling it for every pipeline as it takes about twice
as long to complete as its "Release" counterpart.
Add a set of jobs to GitLab CI that create a BIND source tarball and
then build and test its contents. Run those extra jobs only when a tag
is pushed to the Git repository as they are only meant to be sanity
checks of BIND source tarball contents.
The util/prepare-softhsm2.sh script is useful for initializing a working
SoftHSM environment which can be used by unit tests and system tests.
However, since it is a test-specific script, it does not really belong
in the util/ subdirectory which is mostly pruned during the BIND source
tarball creation process. Move the prepare-softhsm2.sh script to
bin/tests/ so that its location is more appropriate for its purpose and
also so that it does not get removed during the BIND source tarball
creation process, allowing it to be used for setting up test
environments for tarball-based builds.
Convert the logic (currently present in the form of "rm -rf" calls in
util/kit.sh) for removing files and directories which are tracked by Git
but redundant in release tarballs into a set of .gitattributes rules
which allow the same effect to be achieved using "git archive".
Resolve "ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1027 in nmhandle_free"
Closes#1473
See merge request isc-projects/bind9!2739
The LC_ALL=C assignments in the "idna" system test, which were only
meant to affect a certain subset of checks, in fact persist throughout
all the subsequent checks in that system test. That affects the test's
behavior and is misleading.
When the "VARIABLE=value command ..." syntax is used in a shell script,
in order for the variable assignment to only apply to "command", the
latter must be an external binary; otherwise, the VARIABLE=value
assignment persists for all subsequent commands in a script:
$ cat foo.sh
#!/bin/sh
foo() {
/bin/sh bar.sh
}
BAR="baz0"
BAR="baz1" /bin/sh bar.sh
echo "foo: BAR=${BAR}"
BAR="baz2" foo
echo "foo: BAR=${BAR}"
$ cat bar.sh
#!/bin/sh
echo "bar: BAR=${BAR}"
$ /bin/sh foo.sh
bar: BAR=baz1
foo: BAR=baz0
bar: BAR=baz2
foo: BAR=baz2
$
Fix by saving the value of LC_ALL before the relevant set of checks in
the "idna" system test, restoring it afterwards, and dropping the
"LC_ALL=C command ..." syntax.
- make tcp listening IPC pipe name saner
- put the pipe in /tmp on unices
- add pid to the pipe name to avoid conflicts between processes
- fsync directory in which the pipe resides to make sure that the
child threads will see it and be able to open it
even when worker is paused (e.g. interface reconfiguration). This is
needed to prevent deadlocks when reconfiguring interfaces - as network
manager is paused then, but we still need to stop/start listening.
- Proper handling of TCP listen errors in netmgr - bind to the socket first,
then return the error code.
When listening for TCP connections we create a socket, bind it
and then pass it over IPC to all threads - which then listen on
in and accept connections. This sounds broken, but it's the
official way of dealing with multithreaded TCP listeners in libuv,
and works on all platforms supported by libuv.
In decrement_reference only test node->down if the tree lock
is held. As node->down is not always tested in
decrement_reference we need to test that it is non NULL in
cleanup_dead_nodes prior to removing the node from the rbt
tree. Additionally it is not always possible to aquire the
node lock and reactivate a node when adding parent nodes.
Reactivate such nodes in cleanup_dead_nodes if required.
Before, the zero system test could get stuck almost infinitely, because
the first test sends > 300 queries with 5 seconds timeout on each in
each pass. If named crashed early, it would took the test more than 4
hours to properly timeout.
This commit introduces a "watchdog" on the dig commands running in the
background and failing the test on timeout, failing any test if any dig
command fails to return successfully, and making the tests.sh script
shellcheck clean.
The kasp system test has a call to sed to retrieve the key identifier
without leading zeros. The sed call could not handle key id 0.
Update the kasp test to also correctly deal with this case.
The autosign test has a test case where a DNSSEC maintaiend zone
has a set of DNSSEC keys without any timing metadata set. It
tests if named picks up the key for publication and signing if a
delayed dnssec-settime/loadkeys event has occured.
The test failed intermittently despite the fact it sleeps for 5
seconds but the triggered key reconfigure action should happen after
3 seconds.
However, the test output showed that the test query came in before
the key reconfigure action was complete (see excerpts below).
The loadkeys command is received:
15:38:36 received control channel command 'loadkeys delay.example.'
The reconfiguring zone keys action is triggered after 3 seconds:
15:38:39 zone delay.example/IN: reconfiguring zone keys
15:38:39 DNSKEY delay.example/NSEC3RSASHA1/7484 (ZSK) is now published
15:38:39 DNSKEY delay.example/NSEC3RSASHA1/7455 (KSK) is now published
15:38:39 writing to journal
Two seconds later the test query comes in:
15:38:41 client @0x7f1b8c0562b0 10.53.0.1#44177: query
15:38:41 client @0x7f1b8c0562b0 10.53.0.1#44177: endrequest
And 6 more seconds later the reconfigure keys action is complete:
15:38:47 zone delay.example/IN: next key event: 05-Dec-2019 15:48:39
This commit fixes the test by checking the "next key event" log has
been seen before executing the test query, making sure that the
reconfigure keys action has been complete.
This commit however does not fix, nor explain why it took such a long
time (8 seconds) to reconfigure the keys.
Clarify in the ARM that TTL-style options can also now take ISO
8601 durations.
Mention the built-in dnssec policies "default" and "none". Mention
that "none" is the default.
Add a file documenting the default dnssec-policy configuration options.
Fix dnssec-policy syntax in ARM (dnssec-policy.grammar.xml).
The first step in all existing setup.sh scripts is to call clean.sh. To
reduce code duplication and ensure all system tests added in the future
behave consistently with existing ones, invoke clean.sh from run.sh
before calling setup.sh.
At the end of each system test suite run, the system test framework
collects all existing test.output files from system test subdirectories
and produces bin/tests/system/systests.output from those files.
However, it does not check whether a test.output file was found for
every executed test. Thus, if the test.output file is accidentally
deleted by the system test itself (e.g. due to an overly broad file
removal wildcard present in clean.sh), its output will not be included
in bin/tests/system/systests.output. Since the result of each system
test suite run is determined by bin/tests/system/testsummary.sh, which
only operates on the contents of bin/tests/system/systests.output, this
can lead to test failures being ignored. Fix by ensuring the number of
test results found in bin/tests/system/systests.output is equal to the
number of tests run and triggering a system test suite failure in case
of a discrepancy between these two values.
Since the role of the bin/tests/system/clean.sh script has now been
reduced to calling a given system test's clean.sh script, remove the
former altogether and replace its only use with a direct invocation of
the latter.
Since files containing system test output are no longer stored in test
subdirectories, bin/tests/system/clean.sh no longer needs to take care
of removing the test.output file for a given test as testsummary.sh
already takes care of that and even if a test suite terminates
abnormally and another one is started, tee invoked without the -a
command line switch overwrites the destination file if it exists, so
leftover test.output.* files from previous test suite runs are not a
concern. Remove the -r command line switch and the code associated with
it from the relevant scripts.
Some clean.sh scripts contain overly broad file deletion wildcards which
cause the test.output file (used by the system test framework for
collecting output) in a given system test's directory to be erroneously
removed immediately after the test is started (due to setup.sh scripts
calling clean.sh at the beginning). This prevents the test's output
from being placed in bin/tests/system/systests.output at the end of a
test suite run and thus can lead to test failures being ignored. Fix by
storing each test's output in a test.output.<test-name> file in
bin/tests/system/, which prevents clean.sh scripts from removing it (as
they should only ever affect files contained in a given system test's
directory).
When a function returns void, it can be used as an argument to return in
function returning also void, e.g.:
void in(void) {
return;
}
void out(void) {
return (in());
}
while this is legal, it should be rewritten as:
void out(void) {
in();
return;
}
The semantic patch just find the occurrences, and they need to be fixed
by hand.
Since the introduction of durations, all ttlval configuration types
are replaced with durations. Duration is an ISO 8601 duration, a
TTL-style value, or a number. These two references were missed and
are now also replaced.
This commit makes some minor changes to the trust anchor code:
1. Replace the undescriptive n1, n2 and n3 identifiers with slightly
better rdata1, rdata2, and rdata3.
2. Fix an occurrence where in the error log message a static number
32 was printed, rather than the rdata3 length.
3. Add a default case to the switch statement checking DS digest
algorithms to catch unknown algorithms.
Previously, the fetchlimit tested the recursive-clients soft limit
that's defined as 90% of the hard limit (the actual configured value).
This worked previously because the reaping of the oldest recursive
client was put on the same event queue as the current TCP client, thus
the cleaning has happened before the new TCP client established a new
connection.
With the change in BIND 9.14 that added a multiple event queues the
cleaning of the oldests clients is no longer synchronous and could
happen stochastically making the soft limit testing fail often. The
situation became even worse with the new networking manager, thus we
change the system test to fail only if the hard limit bound is not
honored.
Changing the accounting of the already reaped TCP clients so the soft
limit testing is possible again is out of the scope for this change.
These two tests were failing basically because in order for prefetching to
happen, the TTL for a given DNS record must be greater than or equal to
the prefetch config value + 9.
The previous TTL for both records was 10, while prefetch value in
configuration was 3, thus making only records with TTL >= 12 elligible
for prefetching.
TTL value for both records was adjusted to the value 13, and prefetch
value was set to 4 (inc by 1), so records with TTL (4 + 9) >= 13 are
elligible for prefetching.
Adjusting prefetch value to 4 gives the test 1 second more to avoid time
problems when sharing resources on a heavy loaded PC.
Also prefetch value in settings is now read by the script and used
by it to corrrectly calculate the amount of time needed to delay before
sending a request to trigger prefetch, adding a bit of flexibility to
fine tune the test in the future.
The previous test had two problems:
1. It wasn't written specifically for testing what it was supposed to:
prefetch disabled.
2. It could fail in some circunstances if the computer's load is too
high, due to sleeps not taking parallel tests and cpu load into account.
The new test is testing prefetch disabled as follows:
1. It asks for a txt record for a given domain and takes note of the
record's TTL (which is 10).
2. It sleeps for (TTL - 5) = 5 seconds, having a window of 5 seconds to
issue new queries before the record expires from cache.
3. Three(3) queries are executed in a row, with a interval of 1 second
between them, and for each query we verify that the TTL in response is
less than the previous one, thus ensuring that prefetch is disabled (if
it were enabled this record would have been refreshed already and TTL
would be >= the first TTL).
Having a window of 5 seconds to perform 3 queries with a interval of 1
second between them gives the test a reasonable amount of time
to not suffer from a machine with heavy load.
For BIND 9.16+, TLS aware compiler is required, and using
ISC_THREAD_LOCAL is preferred way of using Thread Local Storage. The
isc_thread_key API is no longer used anywhere and hence was removed from
BIND 9.
Previously, the irs_context API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables.
Previously, the dns_geoip API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables, and
creating the local memory context was moved to named and stored in the
named_g_geoip global context.
Previously, the dns_dt API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables.
Previously, the dns_name API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables.
The new ISC_THREAD_LOCAL macro unifies usage of platform dependent
Thread Local Storage definition thread_local vs __thread vs
__declspec(thread) to a single macro.
The commit also unifies the required level of support for TLS as for
some parts of the code it was mandatory and for some parts of the code
it wasn't.
FCTX_ATTR_SHUTTINGDOWN needs to be set and tested while holding the node
lock but the rest of the attributes don't as they are task locked. Making
fctx->attributes atomic allows both behaviours without races.
Disabling ASAN memory leak detection for a build job is pointless
because ASAN is only used in test jobs. (Also, memory leak detection
should not be disabled globally - explicit suppressions should be used
in case of issues with external code.)
xmlInitThreads() and xmlCleanupThreads() are called from within
named_statschannels_configure() and named_statschannels_shutdown(),
respectively. Both of these functions are executed by worker threads,
not the main named thread. This causes ASAN to report memory leaks like
the following one upon shutdown (as long as named is asked to produce
any XML output over its configured statistics channels during its
lifetime):
Direct leak of 968 byte(s) in 1 object(s) allocated from:
#0 0x7f677c249cd8 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cc:153
#1 0x7f677bc1838f in xmlGetGlobalState (/usr/lib/libxml2.so.2+0xa838f)
The data mentioned in the above report is a libxml2 state structure
stored as thread-specific data. Such chunks of memory are automatically
released (by a destructor passed to pthread_key_create() by libxml2)
whenever a thread that allocated a given chunk exits. However, if
xmlCleanupThreads() is called by a given thread before it exits, the
destructor will not be invoked (due to xmlCleanupThreads() calling
pthread_key_delete()) and ASAN will report a memory leak. Thus,
xmlInitThreads() and xmlCleanupThreads() must not be called from worker
threads. Since xmlInitThreads() must be called on Windows in order for
libxml2 to work at all, move xmlInitThreads() and xmlCleanupThreads()
calls to the main named thread (which does not produce any XML output
itself) in order to prevent the memory leak from being reported by ASAN.
Loaded GeoIP2 databases are only released when named is shut down, but
not during server reconfiguration. This causes memory to be leaked
every time "rndc reconfig" or "rndc reload" is used, as long as any
GeoIP2 database is in use. Fix by releasing any loaded GeoIP2 databases
before reloading them. Do not call dns_geoip_shutdown() until server
shutdown as that function releases the memory context used for caching
GeoIP2 lookup results.
This prevents races on fctx->client whenever a new fetch joins a existing
fetch (by calling fctx_join) as it is now invariant for the active life of
fctx.
Some semantic patches are meant to be run just once, as they work on
functions with changed prototypes. We keep them for reference, but
disabled them from the CI to save time.
The saved_command_line buffer in bin/named/main.c is 8192 bytes long.
The size of libisc's internal logging buffer (defined by the value of
the LOG_BUFFER_SIZE constant in lib/isc/log.c) is also 8192 bytes.
Since the buffer containing the ellipsis is passed as the last argument
to isc_log_write() and the buffer containing the potentially trimmed
named command line (saved_command_line) is passed as the second argument
in the same isc_log_write() call, it may happen that saved_command_line
will exhaust all available space in libisc's internal logging buffer, in
which case the ellipsis will be elided from the output.
Make saved_command_line 4096 bytes long as that value is arguably also
large enough for any reasonable use case and at the same time it ensures
ellipsis will always be printed for excessively long named command
lines.
The "runtime" system test currently fails on Windows because it waits
for named to log a message indicating successful startup ("running"),
but that never happens since named on Windows fails to open the
configuration file as its path includes control characters.
Instead of putting control characters in directory names, put them in
the value of the -D command line switch passed to named, which is used
for identifying an instance of named in a process listing and whose
value is completely ignored by named, but still logged.
While a similar check using special characters appears to be working
fine on Windows for the time being, modify it in the same way to avoid
potential future problems on other platforms and make the test cleaner.
This mostly comprises of:
* using $(...) instead of `...`
* changing the directories in subshell and not ignoring `cd` return code
* handling every error gracefully instead of ignoring the return code
The indentation for dumping the master zone was driven by two
global variables dns_master_indent and dns_master_indentstr. In
threaded mode, this becomes prone to data access races, so this commit
converts the global variables into a local per-context tuple that
consist of count and string.
When loading the configuration fails, there might be already other tasks
running and calling OpenSSL library functions. The OpenSSL on_exit
handler is called when exiting the main process and there's a timing
race between the on_exit function that destroys OpenSSL allocated
resources (threads, locks, ...) and other tasks accessing the very same
resources leading to a crash in the system threading library. Therefore,
the fatal() function needs to request exlusive access to the task
manager to finish the already running tasks and exit only when no other
tasks are running.
The oldsigs test was checking only for the validity of the A
a.oldsigs.example. resource record and associated DNSSEC signature while
the zone might not have been fully signed yet leading to validation
failures because of bogus signatures on the validation path.
This commit changes the test to test that all old signatures in the
oldsigs.example. zone were replaced and the zone is fully resigned
before running the main check.
- restore support for tcp-initial-timeout, tcp-idle-timeout,
tcp-keepalive-timeout and tcp-advertised-timeout configuration
options, which were ineffective previously.
- add timeout support for TCP and TCPDNS connections to protect against
slowloris style attacks. currently, all timeouts are hard-coded.
- rework and simplify the TCPDNS state machine.
We were not reseting the keynode value when iterating over DNSKEYs in
RRSET, so we weren't checking all DNSKEYs against all trust anchors. This
commit fixes the issue by resetting keynode with every loop.
Make the "tcp" system test fail if the Python tool used for establishing
TCP connections (ans6) logs a result different than "OK" after
processing a command sent to it (as that means the tool was unable to
successfully perform the requested action), with the exception of
cleanup errors at the end of the test which can be safely ignored. Note
that the tool not returning any result at all in 10 seconds is still a
fatal error in all cases.
The dns_adb_beginudpfetch() is called only for UDP queries, but
the dns_adb_endudpfetch() is called for all queries, including
TCP. This messages the quota counting in adb.c.
- TSAN can't handle more than 64 locks in one thread, lock ADB bucket-by-bucket
in TSAN mode. This means that the dump won't be consistent but it's good
enough for testing
- Use proper order when unlocking adb->namelocks and adb->entrylocks when
dumping ADB.
isc_mem_traceflag_test messes with stdout/stderr, which can cause
problems with subsequent tests (no output, libuv problems). Moving that
test case to the end ensures there are no side effects.
close the uv_handle for the worker async channel, and call
uv_loop_close() on shutdown to ensure that the event loop's
internal resources are properly freed.
when the TCPDNS_CLIENTS_PER_CONN limit has been exceeded for a TCP
DNS connection, switch to sequential mode to ensure that memory cannot
be exhausted by too many simultaneous queries.
-Wl,-z,interpose is not supported.
-Wl,rpath=<path> is not supported use -Wl,rpath,<path> instead.
Use @SO@ for loadable extension.
Use -L <path> -l libwrap instead of libwrap.sa.
this adds functions in conf.sh.common to create DS-style trust anchor
files. those functions are then used to create nearly all of the trust
anchors in the system tests.
there are a few exceptions:
- some tests in dnssec and mkeys rely on detection of unsupported
algorithms, which only works with key-style trust anchors, so those
are used for those tests in particular.
- the mirror test had a problem with the use of a CSK without a
SEP bit, which still needs addressing
in the future, some of these tests should be changed back to using
traditional trust anchors, so that both types will be exercised going
forward.
use empty placeholder KEYDATA records for all trust anchors, not just
DS-style trust anchors.
this revealed a pre-existing bug: keyfetch_done() skips keys without
the SEP bit when populating the managed-keys zone. consequently, if a
zone only has a single ZSK which is configured as trust anchor and no
KSKs, then no KEYDATA record is ever written to the managed-keys zone
when keys are refreshed.
that was how the root server in the dnssec system test was configured.
however, previously, the KEYDATA was created when the key was
initialized; this prevented us from noticing the bug until now.
configuring a ZSK as an RFC 5011 trust anchor is not forbidden by the
spec, but it is highly unusual and not well defined. so for the time
being, I have modified the system test to generate both a KSK and ZSK
for the root zone, enabling the test to pass.
we should consider adding code to detect this condition and allow keys
without the SEP bit to be used as trust anchors if no key with the SEP
bit is available, or at minimum, log a warning.
note: this also needs further refactoring.
- when initializing RFC 5011 for a name, we populate the managed-keys
zone with KEYDATA records derived from the initial-key trust anchors.
however, with initial-ds trust anchors, there is no key. but the
managed-keys zone still must have a KEYDATA record for the name,
otherwise zone_refreshkeys() won't refresh that key. so, for
initial-ds trust anchors, we now add an empty KEYDATA record and set
the key refresh timer so that the real keys will be looked up as soon
as possible.
- when a key refresh query is done, we verify it against the
trust anchor; this is done in two ways, one with the DS RRset
set up during configuration if present, or with the keys linked
from each keynode in the list if not. because there are two different
verification methods, the loop structure is overly complex and should
be simplified.
- the keyfetch_done() and sync_keyzone() functions are both too long
and should be broken into smaller functions.
note: this is a frankensteinian kluge which needs further refactoring.
the keytable started as an RBT where the node->data points to a list of
dns_keynode structures, each of which points to a single dst_key.
later it was modified so that the list could instead point to a single
"null" keynode structure, which does not reference a key; this means
a trust anchor has been configured but the RFC 5011 refresh failed.
in this branch it is further updated to allow the first keynode in
the list to point to an rdatalist of DS-style trust anchors. these will
be used by the validator to populate 'val->dsset' when validating a zone
key.
a DS style trust anchor can be updated as a result of RFC 5011
processing to contain DST keys instead; this results in the DS list
being freed. the reverse is not possible; attempting to add a DS-style
trust anchor if a key-style trust anchor is already in place results
in an error.
later, this should be refactored to use rdatalists for both DS-style
and key-style trust anchors, but we're keeping the existing code for
old-style trust anchors for now.
- val->keynode and val->seensig were set but never used.
- val->nearest, val->soaset, val->soaname, val->nsecset and val->nsec3set
were never used at all.
- pull out the code that checks whether a key was signed by a trust
anchor into a separate function, anchor_signed().
- pull out the code that looks up a DS while validating a zone key
into a separate function, get_dsset().
- check in create_validator() whether the sigrdataset is bound, so that
we can always pass in &val->fsigrdataset during an insecurity proof;
this will allow a reduction of code duplication.
- also simplified some calls: don't pass siginfo where val->siginfo
is sufficient, don't INSIST where returning false is sufficient.
- also added header comments to several local functions.
- the netmgr was not correctly being specified when creating the task
manager, and was cleaned up in the wrong order when shutting down.
- on freebsd, timer_test appears to be prone to failure if the
netmgr is set up and torn down before and after ever test case, but
less so if it's only set up once at the beginning and once at the
end.
This commit renames isctest {mctx,lctx} to test_{mctx,lctx} and cleans
up their usage in the individual unit tests. This allows embedding
library .c files directly into the unit tests.
Commit 09ac224c5c made dnssec-keygen
depend on libisccfg but the Visual Studio solution file was not updated
to reflect that change. Make sure the dnssec-keygen Visual Studio
project depends on the libisccfg project to prevent compilation issues
during parallel builds.
Intertwining release notes from different BIND releases in a single XML
file has caused confusion in the past due to different (and often
arbitrary) approaches to keeping/removing release notes from older
releases on different BIND branches. Divide doc/arm/notes.xml into
per-version sections to simplify determining the set of changes
introduced by a given release and to make adding/reviewing release notes
less error-prone.
With the netmgr in use, named may start answering queries before zones
are loaded. This can cause transient failures in system tests after
servers are restarted or reconfigured. This commit adds retry loops
and sleep statements where needed to address this problem.
Also incidentally silenced a clang warning.
- ns__client_request() is now called by netmgr with an isc_nmhandle_t
parameter. The handle can then be permanently associated with an
ns_client object.
- The task manager is paused so that isc_task events that may be
triggred during client processing will not fire until after the netmgr is
finished with it. Before any asynchronous event, the client MUST
call isc_nmhandle_ref(client->handle), to prevent the client from
being reset and reused while waiting for an event to process. When
the asynchronous event is complete, isc_nmhandle_unref(client->handle)
must be called to ensure the handle can be reused later.
- reference counting of client objects is now handled in the nmhandle
object. when the handle references drop to zero, the client's "reset"
callback is used to free temporary resources and reiniialize it,
whereupon the handle (and associated client) is placed in the
"inactive handles" queue. when the sysstem is shutdown and the
handles are cleaned up, the client's "put" callback is called to free
all remaining resources.
- because client allocation is no longer handled in the same way,
the '-T clienttest' option has now been removed and is no longer
used by any system tests.
- the unit tests require wrapping the isc_nmhandle_unref() function;
when LD_WRAP is supported, that is used. otherwise we link a
libwrap.so interposer library and use that.
This allows a task to be temporary disabled so that objects won't be
processed simultaneously by libuv events and isc_task events. When a
task is paused, currently running events may complete, but no further
event will added to the run queue will be executed until the task is
unpaused.
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.
This is a replacement for the existing isc_socket and isc_socketmgr
implementation. It uses libuv for asynchronous network communication;
"networker" objects will be distributed across worker threads reading
incoming packets and sending them for processing.
UDP listener sockets automatically create an array of "child" sockets
so each worker can listen separately.
TCP sockets are shared amongst worker threads.
A TCPDNS socket is a wrapper around a TCP socket, which handles the
the two-byte length field at the beginning of DNS messages over TCP.
(Other wrapper socket types can be implemented in the future to handle
DNS over TLS, DNS over HTTPS, etc.)
The double-locked queue implementation is still currently in use
in ns_client, but will be replaced by a fetch-and-add array queue.
This commit moves it from queue.h to list.h so that queue.h can be
used for the new data structure, and clean up dependencies between
list.h and types.h. Later, when the ISC_QUEUE is no longer is use,
it will be removed completely.
Ensure any unexpected failure in the "tcp" system test causes it to be
immediately interrupted with an error to make the aforementioned test
more reliable. Since the exit code for "expr 0 + 0" is 1, the status
variable needs to be updated using arithmetic expansion.
assert_int_equal() calls in bin/tests/system/tcp/tests.sh pass the found
value as the first argument and the expected value as the second
argument, while the function interprets its arguments the other way
round. Fix argument handling in assert_int_equal() to make sure the
error messages printed by that function are correct.
In the TCP high-water checks, "rndc stats" is run after ans6 reports
that it opened the requested number of TCP connections. However, we
fail to account for the fact that ns5 might not yet have called accept()
for these connections, in which case the counts output by "rndc stats"
will be off. To prevent intermittent "tcp" system test failures, allow
the relevant connection count checks to be retried (just once, after one
second, as that should be enough for any system to accept() a dozen TCP
connections under any circumstances).
the current method used for testing distribution of signatures
is failure-prone. we need to replace it with something both
effective and portable, but in the meantime we're commenting
out the jitter test.
The original requirement for the check to pass was <-10;10> interval and
the first test was failing by 1 second. As the minimum interval for
checking is 7200 seconds, the commit relaxes the requirement to <-60;60>
interval, which is still sane, but not that draconic.
The get_keyids() function can return multiple keyids, when the
return value was not quoted, only the first keyid would be checked
with check_key() function. This MR fixes both the error that came
with quoting the "$id" with value "12345 54321", and the code now
checks all returned keyids.
'dnssec-policy' can now also be set on the options and view level and
a zone that does not set 'dnssec-policy' explicitly will inherit it
from the view or options level.
This requires a new keyword to be introduced: 'none'. If set to
'none' the zone will not be DNSSEC maintained, in other words it will
stay unsigned. You can use this to break the inheritance. Of course
you can also break the inheritance by referring to a different
policy.
The keywords 'default' and 'none' are not allowed when configuring
your own dnssec-policy statement.
Add appropriate tests for checking the configuration (checkconf)
and add tests to the kasp system test to verify the inheritance
works.
Edit the kasp system test such that it can deal with unsigned zones
and views (so setting a TSIG on the query).
The kasp system tests are updated with 'check_cds' calls that will
verify that the correct CDS and CDNSKEY records are published during
a rollover and that they are signed with the correct KSK.
This requires a change in 'dnssec.c' to check the kasp key states
whether the CDS/CDNSKEY of a key should be published or not. If no
kasp state exist, fall back to key timings.
The 'sign_apex()' function has special processing for signing the
DNSKEY RRset such that it will always be signed with the active
KSK. Since CDS and CDNSKEY are also signed with the KSK, it
should have the same special processing. The special processing is
moved into a new function 'tickle_apex_rrset()' and is applied to
all three RR types (DNSKEY, CDS, CDNSKEY).
In addition, when kasp is involved, update the DNSKEY TTL accordingly
to what is in the policy.
Test two CSK rollover scenarios, one where the DS is swapped before the zone
signatures are all replaced, and one where the signatures are replaced sooner
than the DS is swapped.
Update dns_dnssec_keyactive to differentiate between the roles ZSK
and KSK. A key is active if it is signing but that differs per role.
A ZSK is signing if its ZRRSIG state is in RUMOURED or OMNIPRESENT,
a KSK is signing if its KRRSIG state is in RUMOURED or OMNIPRESENT.
This means that a key can be actively signing for one role but not
the other. Add checks in inline signing (zone.c and update.c) to
cover the case where a CSK is active in its KSK role but not the ZSK
role.
Add more tests for kasp:
- Add tests for different algorithms.
- Add a test to ensure that an edit in an unsigned zone is
picked up and properly signed.
- Add two tests that ensures that a zone gets signed when it is
configured as so-called 'inline-signing'. In other words, a
secondary zone that is configured with a 'dnssec-policy'. A zone
that is transferred over AXFR or IXFR will get signed.
- Add a test to ensure signatures are reused if they are still
fresh enough.
- Adds two more tests to verify that expired and unfresh signatures
will be regenerated.
- Add tests for various cases with keys already available in the
key-directory.
A significant refactor of the kasp system test in an attempt to
make the test script somewhat brief. When writing a test case,
you can/should use the functions 'zone_properties',
'key_properties', and 'key_timings' to set the expected values
when checking a key with 'check_key'. All these four functions
can be used to set environment variables that come in handy when
testing output.
Update the signing code in lib/dns/zone.c and lib/dns/update.c to
use kasp logic if a dnssec-policy is enabled.
This means zones with dnssec-policy should no longer follow
'update-check-ksk' and 'dnssec-dnskey-kskonly' logic, instead the
KASP keys configured dictate which RRset gets signed with what key.
Also use the next rekey event from the key manager rather than
setting it to one hour.
Mark the zone dynamic, as otherwise a zone with dnssec-policy is
not eligble for automatic DNSSEC maintenance.
Update dns_dnssec_get_hints and dns_dnssec_keyactive to use dst_key
functions and thus if dnssec-policy/KASP is used the key states are
being considered.
Add a new variable to 'struct dns_dnsseckey' to signal whether this
key is a zone-signing key (it is no longer true that ksk == !zsk).
Also introduce a hint for revoke.
Update 'dns_dnssec_findzonekeys' and 'dns_dnssec_findmatchingkeys'
to also read the key state file, if available.
Remove 'allzsk' from 'dns_dnssec_updatekeys' as this was only a
hint for logging.
Also make get_hints() (now dns_dnssec_get_hints()) public so that
we can use it in the key manager.
If a zone has a dnssec-policy set, use signature validity,
dnskey signature validity, and signature refresh from
dnssec-policy.
Zones configured with 'dnssec-policy' will allow 'named' to create
DNSSEC keys (similar to dnssec-keymgr) if not available.
Add a key manager to named. If a 'dnssec-policy' is set, 'named'
will run a key manager on the matching keys. This will do a couple
of things:
1. Create keys when needed (in case of rollover for example)
according to the set policy.
2. Retire keys that are in excess of the policy.
3. Maintain key states according to "Flexible and Robust Key
Rollover" [1]. After key manager ran, key files will be saved to
disk.
[1] https://matthijsmekking.nl/static/pdf/satin2012-Schaeffer.pdf
KEY GENERATION
Create keys according to DNSSEC policy. Zones configured with
'dnssec-policy' will allow 'named' to create DNSSEC keys (similar
to dnssec-keymgr) if not available.
KEY ROLLOVER
Rather than determining the desired state from timing metadata,
add a key state goal. Any keys that are created or picked from the
key ring and selected to be a successor has its key state goal set
to OMNIPRESENT (this key wants to be signing!). At the same time,
a key that is being retired has its key state goal set to HIDDEN.
The keymgr state machine with the three rules will make sure no
introduction or withdrawal of DNSSEC records happens too soon.
KEY TIMINGS
All timings are based on RFC 7583.
The keymgr will return when the next action is happening so
that the zone can set the proper rekey event. Prior to this change
the rekey event will run every hour by default (configurable),
but with kasp we can determine exactly when we need to run again.
The prepublication time is derived from policy.
Add a couple of dst_key functions for determining hints that
consider key states if they are available.
- dst_key_is_unused:
A key has no timing metadata set other than Created.
- dst_key_is_published:
A key has publish timing metadata <= now, DNSKEY state in
RUMOURED or OMNIPRESENT.
- dst_key_is_active:
A key has active timing metadata <= now, RRSIG state in
RUMOURED or OMNIPRESENT.
- dst_key_is_signing:
KSK is_signing and is_active means different things than
for a ZSK. A ZSK is active means it is also signing, but
a KSK always signs its DNSKEY RRset but is considered
active if its DS is present (rumoured or omnipresent).
- dst_key_is_revoked:
A key has revoke timing metadata <= now.
- dst_key_is_removed:
A key has delete timing metadata <= now, DNSKEY state in
UNRETENTIVE or HIDDEN.
When doing rollover in a timely manner we need to have access to the
relevant kasp configured durations.
Most of these are simple get functions, but 'dns_kasp_signdelay'
will calculate the maximum time that is needed with this policy to
resign the complete zone (taking into account the refresh interval
and signature validity).
Introduce parent-propagation-delay, parent-registration-delay,
parent-ds-ttl, zone-max-ttl, zone-propagation-delay.
When signing a zone with dnssec-policy, we don't mind DNSSEC records.
This is useful for testing purposes, and perhaps it is better to
signal this behavior with a different configuration option.
Introduce a new option '-s' for dnssec-settime that when manipulating
timing metadata, it also updates the key state file.
For testing purposes, add options to dnssec-settime to set key
states and when they last changed.
The dst code adds ways to write and read the new key states and
timing metadata. It updates the parsing code for private key files
to not parse the newly introduced metadata (these are for state
files only).
Introduce key goal (the state the key wants to be in).
When reading a key from file, you can set the DST_TYPE_STATE option
to also read the key state.
This expects the Algorithm and Length fields go above the metadata,
so update the write functionality to do so accordingly.
Introduce new DST metadata types for KSK, ZSK, Lifetime and the
timing metadata used in state files.
This commit adds code for generating keys with dnssec-keygen given
a specific dnssec-policy.
The dnssec-policy can be set with a new option '-k'. The '-l'
option can be used to set a configuration file that contains a
specific dnssec-policy.
Because the dnssec-policy dictates how the keys should look like,
many of the existing dnssec-keygen options cannot be used together
with '-k'.
If the dnssec-policy lists multiple keys, dnssec-keygen has now the
possibility to generate multiple keys at one run.
Add two tests for creating keys with '-k': One with the default
policy, one with multiple keys from the configuration.
Write functions to access various elements of the kasp structure,
and the kasp keys. This in preparation of code in dnssec-keygen,
dnssec-settime, named...
Add a number of metadata variables (lifetime, ksk and zsk role).
For the roles we add a new type of metadata (booleans).
Add a function to write the state of the key to a separate file.
Only write out known metadata to private file. With the
introduction of the numeric metadata "Lifetime", adjust the write
private key file functionality to only write out metadata it knows
about.
Code and documentation were not in line:
- Remove -z option from code
- Remove -k option from docbook
- Add -d option to docbook
- Add -T option to docbook
This stores the dnssec-policy configuration and adds methods to
create, destroy, and attach/detach, as well as find a policy with
the same name in a list.
Also, add structures and functions for creating and destroying
kasp keys.
This commit introduces the initial `dnssec-policy` configuration
statement. It has an initial set of options to deal with signature
and key maintenance.
Add some checks to ensure that dnssec-policy is configured at the
right locations, and that policies referenced to in zone statements
actually exist.
Add some checks that when a user adds the new `dnssec-policy`
configuration, it will no longer contain existing DNSSEC
configuration options. Specifically: `inline-signing`,
`auto-dnssec`, `dnssec-dnskey-kskonly`, `dnssec-secure-to-insecure`,
`update-check-ksk`, `dnssec-update-mode`, `dnskey-sig-validity`,
and `sig-validity-interval`.
Test a good kasp configuration, and some bad configurations.
The ttlval configuration types are replaced by duration configuration
types. The duration is an ISO 8601 duration that is going to be used
for DNSSEC key timings such as key lifetimes, signature resign
intervals and refresh periods, etc. But it is also still allowed to
use the BIND ttlval ways of configuring intervals (number plus
optional unit).
A duration is stored as an array of 7 different time parts.
A duration can either be expressed in weeks, or in a combination of
the other datetime indicators.
Add several unit tests to ensure the correct value is parsed given
different string values.
This commit does not change anything significant, it just makes
the file more readable in preparation for upcoming changes related
to the `dnssec-policy` configuration option.
glibc 2.30 deprecated the <sys/sysctl.h> header [1]. However, that
header is still used on other Unix-like systems, so only prevent it from
being used on Linux, in order to prevent compiler warnings from being
triggered.
[1] https://sourceware.org/ml/libc-alpha/2019-08/msg00029.html
Add a shell function which is used in the "tcp" system test, but has
been accidentally omitted from !2425. Make sure the function does not
change the value of "ret" itself, so that the caller can decide what to
do with the function's return value.
When doing regular signing expiry time is jittered to make sure
that the re-signing times are not clumped together. This expands
this behaviour to expiry times of dynamically added records.
When incrementally re-signing a zone use the full jitter range if
the server appears to have been offline for greater than 5 minutes
otherwise use a small jitter range of 3600 seconds. This will stop
the signatures becoming more clustered if the server has been off
line for a significant period of time (> 5 minutes).
This variable will report the maximum number of simultaneous tcp clients
that BIND has served while running.
It can be verified by running rndc status, then inspect "tcp high-water:
count", or by generating statistics file, rndc stats, then inspect the
line with "TCP connection high-water" text.
The tcp-highwater variable is atomically updated based on an existing
tcp-quota system handled in ns/client.c.
Related scan-build report:
dnstap_test.c:169:2: warning: Value stored to 'result' is never read
result = dns_test_makeview("test", &view);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dnstap_test.c:193:2: warning: Value stored to 'result' is never read
result = dns_compress_init(&cctx, -1, dt_mctx);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
The named_g_defaultdnstap was never used as the dnstap requires
explicit configuration of the output file.
Related scan-build report:
./server.c:3476:14: warning: Value stored to 'dpath' during its initialization is never read
const char *dpath = named_g_defaultdnstap;
^~~~~ ~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
And add a note to the man page that `rndc validation` flushes the
cache when the validation state is changed. (It is necessary to flush
the cache when turning on validation, to avoid continuing to use
cryptographically invalid data. It is probably wise to flush the cache
when turning off validation to recover from lameness problems.)
The implementation of `rndc validation status` iterates over all the
views to print their validation status. It takes care to print newlines
in between, but it also used put a nul byte at the end of the first view
which truncated the output.
After this change, the nul byte is added at the end so that it prints
the validation status in all views. The `_bind` view is skipped
because its validation status is irrelevant.
Portion of the digdelv test are skipped on IPv6 due to extra quotes
around $TESTSOCK6: "I:digdelv:IPv6 unavailable; skipping".
Researched by @michal.
Regressed with 351efd8812.
If a TCP connection fails while attempting to send a query to a server,
the fetch context will be restarted without marking the target server as
a bad one. If this happens for a server which:
- was already marked with the DNS_FETCHOPT_EDNS512 flag,
- responds to EDNS queries with the UDP payload size set to 512 bytes,
- does not send response packets larger than 512 bytes,
and the response for the query being sent is larger than 512 byes, then
named will pointlessly alternate between sending UDP queries with EDNS
UDP payload size set to 512 bytes (which are responded to with truncated
answers) and TCP connections until the fetch context retry limit is
reached. Prevent such query loops by marking the server as bad for a
given fetch context if the advertised EDNS UDP payload size for that
server gets reduced to 512 bytes and it is impossible to reach it using
TCP.
I was truncating zone files for experimental purposes when I found
that `named-compilezone | head` got stuck. The full command line that
exhibited the problem was:
dig axfr dotat.at |
named-compilezone -o /dev/stdout dotat.at /dev/stdin |
head
This requires a large enough zone to exhibit the problem, more than
about 70000 bytes of plain text output from named-compilezone.
I was running the command on Debian Stretch amd64.
This was puzzling since it looked like something was suppressing the
SIGPIPE. I used `strace` to examine what was happening at the hang.
The program was just calling write() a lot to print the zone file, and
the last write() hanged until I sent it a SIGINT.
During some discussion with friends, Ian Jackson guessed that opening
/dev/stdout O_RDRW might be the problem, and after some tests we found
that this does in fact suppress SIGPIPE.
Since `named-compilezone` only needs to write to its output file, the
fix is to omit the stdio "+" update flag.
It was found that NSEC Aggressive Caching has a significant performance impact
on BIND 9 when used as recursor. This commit disables the synth-from-dnssec
configuration option by default to provide immediate remedy for people running
BIND 9.12+. The NSEC Aggressive Cache will be enabled again after a proper fix
will be prepared.
Make the release checklist match the current release process better by
adding missing steps, rearranging existing ones, reassigning
responsibilities, and dividing the list into sections (by due date).
cppcheck 1.89 emits a false positive for lib/dns/spnego_asn1.c:
lib/dns/spnego_asn1.c:698:9: error: Uninitialized variable: data [uninitvar]
memset(data, 0, sizeof(*data));
^
lib/dns/spnego.c:1707:47: note: Calling function 'decode_NegTokenResp', 3rd argument '&resp' value is <Uninit>
ret = decode_NegTokenResp(buf + taglen, len, &resp, NULL);
^
lib/dns/spnego_asn1.c:698:9: note: Uninitialized variable: data
memset(data, 0, sizeof(*data));
^
This message started appearing with cppcheck 1.89 [1], but it will be
gone in the next release [2], so just suppress it for the time being.
[1] af214e8212
[2] 2595b82634
cppcheck 1.89 enabled certain value flow analysis mechanisms [1] which
trigger null pointer dereference false positives in lib/dns/rpz.c:
lib/dns/rpz.c:582:7: warning: Possible null pointer dereference: tgt_ip [nullPointer]
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:1419:44: note: Calling function 'adj_trigger_cnt', 4th argument 'NULL' value is 0
adj_trigger_cnt(rpzs, rpz_num, rpz_type, NULL, 0, true);
^
lib/dns/rpz.c:582:7: note: Null pointer dereference
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:596:7: warning: Possible null pointer dereference: tgt_ip [nullPointer]
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:1419:44: note: Calling function 'adj_trigger_cnt', 4th argument 'NULL' value is 0
adj_trigger_cnt(rpzs, rpz_num, rpz_type, NULL, 0, true);
^
lib/dns/rpz.c:596:7: note: Null pointer dereference
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:610:7: warning: Possible null pointer dereference: tgt_ip [nullPointer]
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
lib/dns/rpz.c:1419:44: note: Calling function 'adj_trigger_cnt', 4th argument 'NULL' value is 0
adj_trigger_cnt(rpzs, rpz_num, rpz_type, NULL, 0, true);
^
lib/dns/rpz.c:610:7: note: Null pointer dereference
if (KEY_IS_IPV4(tgt_prefix, tgt_ip)) {
^
It seems that cppcheck no longer treats at least some REQUIRE()
assertion failures as fatal, so add extra assertion macro definitions to
lib/isc/include/isc/util.h that are only used when the CPPCHECK
preprocessor macro is defined; these definitions make cppcheck 1.89
behave as expected.
There is an important requirement for these custom definitions to work:
cppcheck must properly treat abort() as a function which does not
return. In order for that to happen, the __GNUC__ macro must be set to
a high enough number (because system include directories are used and
system headers compile attributes away if __GNUC__ is not high enough).
__GNUC__ is thus set to the major version number of the GCC compiler
used, which is what that latter does itself during compilation.
[1] aaeec462e6
Commit afa81ee4e4 omitted some spots in
the source tree which are still referencing the removed --with-cc-alg
"configure" option. Make sure the latter is removed completely.
When a GitLab CI runner is not under load, a single OpenBSD system test
job completes in about 12 minutes, which is considered decent. However,
such jobs are usually multiplexed with other system test jobs on the
same host, which causes each of them to take even 40 minutes to
complete. Taking retries into account, this is completely unacceptable
for everyday use, so only start OpenBSD system test jobs for pipelines
created through GitLab's web interface and for pipelines created for Git
tags.
Since the Windows build job does not use the files created as a result
of running "autoreconf -fi" in the "autoreconf:sid:amd64" job, set its
dependencies to an empty list.
Since it is currently not possible to use "needs: []" for jobs which do
not belong to the first stage of a pipeline, set the "needs" key for the
Windows build job to the "autoreconf:sid:amd64" job so that all build
jobs are started at the same time (without this change, the Windows
build job does not start until all jobs in the "precheck" stage are
finished).
As a side note, these changes also attempt to eliminate intermittent,
bogus GitLab error messages ("There has been a missing dependency
failure").
The intended purpose of the "autoreconf:sid:amd64" GitLab CI job is to
run "autoreconf -fi" and then pass the updated files on to subsequent
non-Windows build jobs. However, the artifacts currently created by
that job only include files which are not tracked by Git. Since we
currently do track e.g. "configure" with Git, the aforementioned job is
essentially a no-op. Fix by manually specifying the files generated by
the "autoreconf:sid:amd64" job that should be passed on to subsequent
build jobs.
Ensure BIND can be tested on OpenBSD in GitLab CI to more quickly catch
build and test errors on that operating system.
Some notes:
- While GCC is packaged for OpenBSD, only old versions (4.2.1, 4.9.4)
are readily available and none of them is the default system
compiler, so we are only doing Clang builds in GitLab CI.
- Unit tests are currently not run on OpenBSD because it ships with an
old version of kyua which does not handle skipped tests properly.
These jobs will be added when we move away from using kyua in the
future as the test code itself works fine.
- All OpenBSD jobs are run inside QEMU virtual machines, using GitLab
Runner Custom executor.
Consider the following Makefile:
foo:
false
On OpenBSD, the following happens for this Makefile:
- "make foo" returns 1,
- "make -k foo" returns 0,
- "make -k -j6 foo" returns 1.
However, if the .NOTPARALLEL pseudo-target is added to this Makefile,
"make -k -j6 foo" will return 0 as well.
Since bin/tests/Makefile contains the .NOTPARALLEL pseudo-target,
running "make -k -j6 test" from bin/tests/ on OpenBSD prevents any
errors from being reported through that command's exit code.
Work around the issue by running "make -k -j6 test" in the
bin/tests/system/ directory instead as bin/tests/system/Makefile does
not contain the .NOTPARALLEL pseudo-target and thus things work as
expected there.
Resolve "A minor documentation issue & consideration of parsing inconsistencies in IPv4s in address match lists and in a controls/inet statement"
Closes#1143
See merge request isc-projects/bind9!2152
BIND supports the non-standard DNSKEY algorithm mnemonic ECDSA256
everywhere ECDSAP256SHA256 is allowed, and allows algorithm numbers
interchangeably with mnemonics. This is all done in one place by the
dns_secalg_fromtext() function.
DS digest types were less consistent: the rdata parser does not allow
abbreviations like SHA1, but the dnssec-* command line tools do; and
the command line tools do not alow numeric types though that is the
norm in rdata.
The command line tools now use the dns_dsdigest_fromtext() function
instead of rolling their own variant, and dns_dsdigest_fromtext() now
knows about abbreviated digest type mnemonics.
previously, if the option was empty, then it was printed without a
colon, which could not be parsed as YAML. adding a colon in all cases
addresses this problem.
'isc_commandline_index' is a global variable so it can theoretically
change result between if expressions. Save 'argv[isc_commandline_index]'
to local variable 'arg1' and use 'arg1 == NULL' in if expressions
instead of 'argc < isc_commandline_index + 1'. This allows clang
to correctly determine what code is reachable.
From Cppcheck:
Passing NULL after the last typed argument to a variadic function leads to
undefined behaviour. The C99 standard, in section 7.15.1.1, states that if the
type used by va_arg() is not compatible with the type of the actual next
argument (as promoted according to the default argument promotions), the
behavior is undefined. The value of the NULL macro is an implementation-defined
null pointer constant (7.17), which can be any integer constant expression with
the value 0, or such an expression casted to (void*) (6.3.2.3). This includes
values like 0, 0L, or even 0LL.In practice on common architectures, this will
cause real crashes if sizeof(int) != sizeof(void*), and NULL is defined to 0 or
any other null pointer constant that promotes to int. To reproduce you might be
able to use this little code example on 64bit platforms. If the output includes
"ERROR", the sentinel had only 4 out of 8 bytes initialized to zero and was not
detected as the final argument to stop argument processing via
va_arg(). Changing the 0 to (void*)0 or 0L will make the "ERROR" output go away.
void f(char *s, ...) {
va_list ap;
va_start(ap,s);
for (;;) {
char *p = va_arg(ap,char*);
printf("%018p, %s\n", p, (long)p & 255 ? p : "");
if(!p) break;
}
va_end(ap);
}
void g() {
char *s2 = "x";
char *s3 = "ERROR";
// changing 0 to 0L for the 7th argument (which is intended to act as
// sentinel) makes the error go away on x86_64
f("first", s2, s2, s2, s2, s2, 0, s3, (char*)0);
}
void h() {
int i;
volatile unsigned char a[1000];
for (i = 0; i<sizeof(a); i++)
a[i] = -1;
}
int main() {
h();
g();
return 0;
}
This MR changes the default Debian sid build to wrap make with bear
that creates compilation database and use the compilation database
to run Cppcheck on the source files systematically.
The job is currently set to be allowed to fail as it will take some
time to fix all the Cppcheck detected issues.
- compare key data when checking for a trust anchor match.
- allow for the possibility of multiple trust anchors with the same key ID
so we don't overlook possible matches.
The coccinellery repository provides many little semantic patches to fix common
problems in the code. The number of semantic patches in the coccinellery
repository is high and most of the semantic patches apply only for Linux, so it
doesn't make sense to run them on regular basis as the processing takes a lot of
time.
The list of issue found in BIND 9, by no means complete, includes:
- double assignment to a variable
- `continue` at the end of the loop
- double checks for `NULL`
- useless checks for `NULL` (cannot be `NULL`, because of earlier return)
- using `0` instead of `NULL`
- useless extra condition (`if (foo) return; if (!foo) { ...; }`)
- removing & in front of static functions passed as arguments
The dns_name_copy() function followed two different semanitcs that was driven
whether the last argument was or wasn't NULL. This commit splits the function
in two where now third argument to dns_name_copy() can't be NULL and
dns_name_copynf() doesn't have third argument.
This commit was done by hand to add the RUNTIME_CHECK() around stray
dns_name_copy() calls with NULL as third argument. This covers the edge cases
that doesn't make sense to write a semantic patch since the usage pattern was
unique or almost unique.
This second commit uses second semantic patch to replace the calls to
dns_name_copy() with NULL as third argument where the result was stored in a
isc_result_t variable. As the dns_name_copy(..., NULL) cannot fail gracefully
when the third argument is NULL, it was just a bunch of dead code.
Couple of manual tweaks (removing dead labels and unused variables) were
manually applied on top of the semantic patch.
This commit add RUNTIME_CHECK() around all simple dns_name_copy() calls where
the third argument is NULL using the semantic patch from the previous commit.
The dns_name_copy() function cannot fail gracefully when the last argument
(target) is NULL. Add RUNTIME_CHECK()s around such calls.
The first semantic patch adds RUNTIME_CHECK() around any call that ignores the
return value and is very safe to apply.
The second semantic patch attempts to properly add RUNTIME_CHECK() to places
where the return value from `dns_name_copy()` is recorded into `result`
variable. The result of this semantic patch needs to be reviewed by hand.
Both patches misses couple places where the code surrounding the
`dns_name_copy(..., NULL)` usage is more complicated and is better suited to be
fixed by a human being that understands the surrounding code.
The libidn2 library on Ubuntu Bionic is broken and idn2_to_unicode_8zlz() does't
fail when it should. This commit ensures that we don't run the system test for
valid A-label in locale that cannot display with the buggy libidn2 as it would
break the tests.
It is possible dig used ACE encoded name in locale, which does not
support converting it to unicode. Instead of fatal error, fallback to
ACE name on output.
Bring the files describing Windows-specific aspects of building and
installing BIND up to date. Remove the parts which are either outdated
(e.g. 32-bit build instructions), already included elsewhere (e.g. the
list of Windows systems BIND is known to run on), or inconvenient to
keep up to date in the long run (e.g. ARM chapter numbers).
Ensure BIND can be tested on Windows in GitLab to more quickly catch
build and test errors on that operating system.
Some notes:
- While build jobs are triggered for all pipelines, system test jobs
are not - due to the time it takes to run the complete system test
suite on Windows (about 20 minutes), the latter are only run for
pipelines created through GitLab's web interface and for pipelines
created for Git tags.
- Only the "Release" build configuration is currently used. Adding
"Debug" builds is a matter of extending .gitlab-ci.yml, but it was
not done for the time being due to questionable usefulness of
performing such builds in GitLab CI.
- Only a 64-bit build is performed. Adding support for 32-bit builds
is not planned to be implemented.
- Unit tests are still not run on Windows, but adding support for that
is on the roadmap.
- All Windows GitLab CI jobs are run inside Windows Server containers,
using the Custom executor feature of GitLab Runner as Windows Server
2016 is not supported by GitLab Runner's native Docker on Windows
executor and Windows Server 2019 is not yet widely available from
hosting providers.
- The Windows Docker image used by GitLab CI is not stored in the
GitLab Container Registry as it is over 27 GB in size and thus
passing it between GitLab and its runners is impractical.
- There is no vcvarsall.bat variant written in PowerShell and batch
scripts are no longer supported by GitLab Runner Custom executor, so
the environment variables set by vcvarsall.bat are injected back
into the PowerShell environment by processing the output of "set".
- Visual Studio parallel builds are a bit different than "make -jX"
builds as parallelization happens in two tiers: project parallelism
(controlled by the "/maxCpuCount" msbuild.exe switch) and compiler
parallelism (controlled by the "/MP" cl.exe switch). To limit the
total number of compiler processes spawned concurrently to a value
similar to the one used for Unix builds, msbuild.exe is allowed to
build at most 2 projects at once, each of which can spawn up to half
of BUILD_PARALLEL_JOBS worth of compiler processes. Using such
parameters is a fairly arbitrary decision taken to solve the
trade-off between compilation speed and runner load.
- Configuring network addresses in Windows Server containers is
tricky. Adding 10.53.0.1/24 and similar addresses to the vEthernet
interface created by Docker never causes ifconfig.bat to fail, but
in fact only one container can have any given IP address configured
at any given time (the request to add the same address in another
container is silently ignored). Thus, in order to allow multiple
system test jobs to be run in parallel, the addresses used in system
tests are configured on the loopback interfaces. Interestingly
enough, the addresses set on the loopback interfaces... persist
between containers. Fortunately, this is acceptable for the time
being and only requires ifconfig.bat failures to be ignored (as
ifconfig.bat will fail if it attempts to configure an already
existing address on an interface). We also need to wait for a brief
moment after calling ifconfig.bat as the addresses the latter
attempts to configure may not be immediately available after it
returns (and that causes runall.sh to error out). Finally, for some
reason we also need to signal that the DNS servers on each loopback
interface are to be configured using DHCP or else ifconfig.bat will
fail to add the requested addresses.
- Since named.pid files created by named instances used in system
tests contain Windows PIDs instead of Cygwin PIDs and various
versions of Cygwin "kill" react differently when passed Windows PIDs
without the -W switch, all "kill" invocations in GitLab CI need to
use that switch (otherwise they would print error messages which
would cause stop.pl to assume the process being killed died
prematurely). However, to preserve compatibility with older Cygwin
versions used in our other Windows test environments, we alter the
relevant scripts "on the fly" rather than in the Git repository.
- In the containers used for running system tests, Windows Error
Reporting is configured to automatically create crash dumps in
C:\CrashDumps. This directory is examined after the test suite is
run to ensure no crashes went under stop.pl's radar.
The SYSTEMTESTTOP variable is set by bin/tests/system/run.sh. When
system tests are run on Windows, that variable will contain an absolute
Cygwin path. In the case of the "statschannel" system test, using the
unmodified SYSTEMTESTTOP variable in tests.sh causes the RNDCCMD
variable to contain an invocation of a native Windows application with
an absolute Cygwin path passed as a parameter, which prevents rndc from
working in that system test. Until we have a cleaner solution, override
SYSTEMTESTTOP with a relative path to work around the issue and thus fix
the "statschannel" system test on Windows.
Make sure the CYGWIN environment variable is set whenever system tests
are run on Windows to prevent stop.pl from making incorrect assumptions
about the environment it is running in, which triggers e.g. false
reports about named instances crashing on shutdown when system tests are
run on Windows. This issue has not been caught earlier because the
CYGWIN environment variable was incidentally being set on a higher level
in our Windows test environments.
Error reporting for parallel system tests on Windows has been broken all
along: since all parallel.mk targets generated by parallel.sh pipe their
output through "tee", the return code from run.sh is lost and thus
running "make -f parallel.mk check" will not yield a non-zero return
code if some system tests fail. The same applies to runsequential.sh.
Yet, runall.sh on Windows only sets its return code to a non-zero value
if either "make -f parallel.mk check" or runsequential.sh returns a
non-zero return code. Fix by making runall.sh yield a non-zero return
code when testsummary.sh fails, which is the same approach as the one
used in the "test" target in bin/tests/system/Makefile.
Until now, the build process for BIND on Windows involved upgrading the
solution file to the version of Visual Studio used on the build host.
Unfortunately, the executable used for that (devenv.exe) is not part of
Visual Studio Build Tools and thus there is no clean way to make that
executable part of a Windows Server container.
Luckily, the solution upgrade process boils down to just adding XML tags
to Visual Studio project files and modifying certain XML attributes - in
files which we pregenerate anyway using win32utils/Configure. Thus,
extend win32utils/Configure with three new command line parameters that
enable it to mimic what "devenv.exe bind9.sln /upgrade" does. This
makes the devenv.exe build step redundant and thus facilitates building
BIND in Windows Server containers.
Build configuration for the dnssec-cds Visual Studio project is absent
from the solution file template, which means the solution needs to be
upgraded using "devenv bind9.sln /upgrade" in order for the dnssec-cds
project to be built. Add the build configuration for dnssec-cds to the
solution file template so that upgrading the solution is not necessary
for building that project.
named-checkzone does not use libbind9. Update the Visual Studio project
file template for named-checkzone to reflect that, thus preventing
compilation issues during parallel builds.
When commit 8eb88aafee removed liblwres,
it also modified nsupdate to use libirs instead of liblwres, but the
Visual Studio project files were not updated to reflect that change.
Make sure the nsupdate Visual Studio project depends on the libirs
project to prevent compilation issues during parallel builds.
Make stderr fully buffered on Windows to improve named performance when
it is logging to stderr, which happens e.g. in system tests. Note that:
- line buffering (_IOLBF) is unavailable on Windows,
- fflush() is called anyway after each log message gets written to the
default stderr logging channels created by libisc.
BIND system tests are run in a Cygwin environment. Apparently Cygwin
shell sets the SEM_NOGPFAULTERRORBOX bit in its process error mode which
is then inherited by all spawned child processes. This bit prevents the
Windows Error Reporting dialog from being displayed, which I assume is
part of an effort to contain memory handling errors triggered by Cygwin
binaries in the Cygwin environment. Unfortunately, this also prevents
automatic crash dump creation by Windows Error Reporting and Cygwin
itself does not handle memory errors in native Windows processes spawned
from a Cygwin shell.
Fix by clearing the SEM_NOGPFAULTERRORBOX bit inside named if it is
started in a Cygwin environment, thus overriding the Cygwin-set process
error mode in order to enable Windows Error Reporting to handle all
named crashes.
When libxml2 is to be used in a multi-threaded application, the
xmlInitThreads() function must be called before any other libxml2
function. This function does different things on various platforms and
thus one can get away without calling it on Unix systems, but not on
Windows, where it initializes critical section objects used for
synchronizing access to data structures shared between threads. Add the
missing xmlInitThreads() call to prevent crashes on affected systems.
Also add a matching xmlCleanupThreads() call to properly release the
resources set up by xmlInitThreads().
No problems have been observed on the FreeBSD GitLab CI runner during
the burn-in period, when FreeBSD jobs needed to be triggered manually.
Thus, make the FreeBSD jobs run automatically along other GitLab CI
jobs.
`/usr/share/sgml/docbook/xsl-stylesheets` and `/usr/share/dblatex` are
places where docbook-style-xsl and, respectively, dblatex packages on
Red Hat systems put their XSL templates. Unless we hint this place it
has to be added to `./configure` manually (`--with-docbook-xsl=...`):
https://src.fedoraproject.org/rpms/bind/blob/master/f/bind.spec#_691.
On Fedora 30:
Before
```
./configure
...
checking for Docbook-XSL path... auto
checking for html/docbook.xsl... "not found"
checking for xhtml/docbook.xsl... "not found"
checking for manpages/docbook.xsl... "not found"
checking for html/chunk.xsl... "not found"
checking for xhtml/chunk.xsl... "not found"
checking for html/chunktoc.xsl... "not found"
checking for xhtml/chunktoc.xsl... "not found"
checking for html/maketoc.xsl... "not found"
checking for xhtml/maketoc.xsl... "not found"
checking for xsl/docbook.xsl... "not found"
checking for xsl/latex_book_fast.xsl... "not found"
```
After:
```
./configure
...
checking for Docbook-XSL path... auto
checking for html/docbook.xsl... /usr/share/sgml/docbook/xsl-stylesheets/html/docbook.xsl
checking for xhtml/docbook.xsl... /usr/share/sgml/docbook/xsl-stylesheets/xhtml/docbook.xsl
checking for manpages/docbook.xsl... /usr/share/sgml/docbook/xsl-stylesheets/manpages/docbook.xsl
checking for html/chunk.xsl... /usr/share/sgml/docbook/xsl-stylesheets/html/chunk.xsl
checking for xhtml/chunk.xsl... /usr/share/sgml/docbook/xsl-stylesheets/xhtml/chunk.xsl
checking for html/chunktoc.xsl... /usr/share/sgml/docbook/xsl-stylesheets/html/chunktoc.xsl
checking for xhtml/chunktoc.xsl... /usr/share/sgml/docbook/xsl-stylesheets/xhtml/chunktoc.xsl
checking for html/maketoc.xsl... /usr/share/sgml/docbook/xsl-stylesheets/html/maketoc.xsl
checking for xhtml/maketoc.xsl... /usr/share/sgml/docbook/xsl-stylesheets/xhtml/maketoc.xsl
checking for xsl/docbook.xsl... /usr/share/dblatex/xsl/docbook.xsl
checking for xsl/latex_book_fast.xsl... /usr/share/dblatex/xsl/latex_book_fast.xsl
```
* CKR_CRYPTOKI_ALREADY_INITIALIZED: This value can only be returned by
`C_Initialize`. It means that the Cryptoki library has already been
initialized (by a previous call to `C_Initialize` which did not have a
matching `C_Finalize` call).
* CKR_FUNCTION_NOT_SUPPORTED: The requested function is not supported by this
Cryptoki library. Even unsupported functions in the Cryptoki API should have a
“stub” in the library; this stub should simply return the value
CKR_FUNCTION_NOT_SUPPORTED.
* CKR_LIBRARY_LOAD_FAILED: The Cryptoki library could not load a dependent
shared library.
The OASIS pkcs11.h header has a restrictive license. Replace the
pkcs11.h pkcs11f.h and pkcs11t.h headers with pkcs11.h from p11-kit.
For source distribution, the license for the OASIS headers itself
doesn't pose any licensing problem when combined with MPL license, but
it possibly creates problem for downstream distributors of BIND 9.
The isc_refcount_decrement() was either used as:
if (isc_refcount_decrement() == 1) { destroy(); }
or
if (isc_refcount_decrement() != 1) { return; } destroy();
This commits eradicates the last usage of the later, so the code is unified to
use the former.
Fixing typos, typographical glitches. Added backticks around binaries,
modules, and libraries so it's more consistent. Added a paragraph with
ISC Security Policy.
Ensure BIND can be tested on FreeBSD in GitLab to more quickly catch
build and test errors on that operating system. Make the relevant jobs
optional until the CI environment supporting them is deemed stable
enough for continuous use.
FreeBSD jobs are run using the Custom executor feature of GitLab Runner.
Unlike the Docker executor, the Custom executor does not support the
"image" option and thus some way of informing the runner about the OS
version to use for a given job is necessary. Arguably the simplest way
of doing that without a lot of code duplication in .gitlab-ci.yml would
be to use a YAML template with a "variables" block specifying the
desired FreeBSD release to use, but including such a template in a job
definition would cause issues in case other variables also needed to be
set for that job (e.g. CFLAGS or EXTRA_CONFIGURE for build jobs). Thus,
only one FreeBSD YAML template is defined instead and the Custom
executor scripts on FreeBSD runners extract the OS version to use from
the CI job name. This allows .gitlab-ci.yml variables to be defined for
FreeBSD jobs in the same way as for Docker-based jobs.
Currently, the lib/dns/tests/tkey_test unit test is only run when the
linker supports the --wrap option. However, linker support for that
option is only needed for static builds. As a result, the unit test
mentioned before is not being run everywhere it can be run as even for
builds done using --with-libtool, the test is not run unless the linker
supports the --wrap option.
Tweak preprocessor directives in lib/dns/tests/tkey_test.c so that this
test is run:
- for all builds using --with-libtool,
- for static builds done using a linker supporting the --wrap option.
Weak symbols are handled differently by different dynamic linkers. With
glibc, lib/dns/tests/tkey_test works as expected no matter whether
--with-libtool is used or not: __attribute__((weak)) prevents a static
build from failing and it just so happens that the desired symbols are
picked at runtime for dynamic builds. However, with BSD libc, the
libdns functions called from lib/dns/tests/tkey_test.c use the "real"
memory allocation functions from libisc, thus breaking that unit test.
(Note: similar behavior can be reproduced with glibc by setting the
LD_DYNAMIC_WEAK environment variable.)
The simplest way to make lib/dns/tests/tkey_test work reliably is to
drop all uses of __attribute__((weak)) in it - this way, the memory
functions inside lib/dns/tests/tkey_test.c will always be used instead
of the "real" libisc ones for dynamic builds. However, this would not
work with static builds as it would result in multiple strong symbols
with the same name being present in a single binary.
Work around the problem by only compiling in the overriding definitions
of memory functions when building using --with-libtool. For static
builds, keep relying on the --wrap linker option for replacing calls to
the functions we are interested in.
When kyua is called without the --logfile command line option, the log
file is created at a default location which is derived from the HOME
environment variable. On FreeBSD GitLab CI runners, /home is a
read-only directory and thus kyua invocations not using the --logfile
option fail when HOME is set to something beneath /home. Set --logfile
to /dev/null for all kyua invocations whose logs are irrelevant in order
to prevent kyua failures caused by HOME being non-writable.
For newer versions of Xcode, "xcode-select --install" no longer installs
system headers into /usr/include (instead, they are installed in the
Xcode directory tree), so do not mention that path in the macOS section
of README to prevent confusion.
Previously the libisc allocator had ability to run unlocked when threading was
disabled. As the threading is now always on, remove the ISC_MEMFLAG_NOLOCK
memory flag as it serves no purpose.
The isc_mem_createx() function was only used in the tests to eliminate using the
default flags (which as of writing this commit message was ISC_MEMFLAG_INTERNAL
and ISC_MEMFLAG_FILL). This commit removes the isc_mem_createx() function from
the public API.
Previously, the isc_mem_create() and isc_mem_createx() functions took `max_size`
and `target_size` as first two arguments. Those values were never used in the
BIND 9 code. The refactoring removes those arguments and let BIND 9 always use
the default values.
Previously, the isc_mem_create() and isc_mem_createx() functions could have
failed because of failed memory allocation. As this was no longer true and the
functions have always returned ISC_R_SUCCESS, the have been refactored to return
void.
Resolve "BIND | Potential for NULL pointer de-references plus memory leaks (CWE-476) in file 'dlz_mysqldyn_mod.c'"
Closes#1207
See merge request isc-projects/bind9!2299
This commits adds an OpenSSL based isc_siphash24() implementation, which is
preferred when available.
The siphash_test has been modified to test both implementation with a trick that
renames the isc_siphash24() to openssl_ or native_ prefixed name and includes
the ../siphash.c two times (when the OpenSSL implementation is available).
Add check for creating new EVP_PKEY with EVP_PKEY_SIPHASH, but disable SipHash
on OpenSSL 1.1.1 as the hash length initialization is broken before OpenSSL
1.1.1a release.
The native implementation's conversion from the uint8_t buffers to uint64_t now
follows the reference implementation that doesn't require aligned buffers.
isc_event_allocate() calls isc_mem_get() to allocate the event structure. As
isc_mem_get() cannot fail softly (e.g. it never returns NULL), the
isc_event_allocate() cannot return NULL, hence we remove the (ret == NULL)
handling blocks using the semantic patch from the previous commit.
when looking for a possible wildcard match in the RPZ summary database,
use an rbtnodechain to walk up label by label, rather than using the
node's parent pointer.
GitLab 12.2 has introduced Directed Acyclic Graphs in the GitLab CI[1] that
allow jobs to run out-of-order and not wait for the whole previous stage to
complete.
1. https://docs.gitlab.com/ee/ci/directed_acyclic_graph/
When updating the statistics for RRset types, if a header is marked
stale or ancient, the appropriate statistic counters are decremented,
then incremented.
Also fix some out of date comments.
Having the decrement/increment logic in stats makes the code hard
to follow. Remove it here and adjust the unit test. The caller
will be responsible for maintaining the correct increments and
decrements for statistics counters (in the following commit).
The stale RR types are now printed with '#'. This used to be the
prefix for RR types that were marked ancient, but commit
df50751585 changed the meaning. It is
probably better to keep '#' for stale RR types and introduce a new
prefix for reintroducing ancient type stat counters.
In the ARM section about RPZ, add text explicitly stating that ACLs take
precedence over RPZ to prevent users from expecting RPZ actions to be
applied to queries coming from clients which are not permitted access to
the resolver by ACLs.
- this required modification to the code that generates grammar text for
the documentation, because the "dnssec-lookaside" option spanned more
than one line in doc/misc/options, so grepping out only the lines
marked "// obsolete" didn't remove the whole option. this commit adds
an option to cfg_test to print named.conf clauses only if they don't
have the obsolete, ancient, test-only, or not-yet-implemented flags
set.
Add a helper shell function, rndc_dumpdb(), which provides a convenient
way to call "rndc dumpdb" for a given server with optional additional
arguments. Since database dumping is an asynchronous process, the
function waits until the dump is complete before returning, which
prevents false positives in system tests caused by inspecting the dump
before its preparation is finished. The function also renames the dump
file before returning so that it does not get overwritten by subsequent
calls; this retains forensic data in case of an unexpected test failure.
The change fixes the following build failure on sparc T3 and older CPUs:
```
sparc-unknown-linux-gnu-gcc ... -O2 -mcpu=niagara2 ... -c rwlock.c
{standard input}: Assembler messages:
{standard input}:398: Error: Architecture mismatch on "pause ".
{standard input}:398: (Requires v9e|v9v|v9m|m8; requested architecture is v9b.)
make[1]: *** [Makefile:280: rwlock.o] Error 1
```
`pause` insutruction exists only on `-mcpu=niagara4` (`T4`) and upper.
The change adds `pause` configure-time autodetection and uses it if available.
config.h.in got new `HAVE_SPARC_PAUSE` knob. Fallback is a fall-through no-op.
Build-tested on:
- sparc-unknown-linux-gnu-gcc (no `pause`, build succeeds)
- sparc-unknown-linux-gnu-gcc -mcpu=niagara4 (`pause`, build succeeds)
Reported-by: Rolf Eike Beer
Bug: https://bugs.gentoo.org/691708
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
There's a deadlock in BIND 9 code where (dns_view_t){ .lock } and
(dns_resolver_t){ .buckets[i].lock } gets locked in different order. When
view->weakrefs gets converted to a reference counting we can reduce the locking
in dns_view_weakdetach only to cases where it's the last instance of the
dns_view_t object.
(cherry picked from commit a7c9a52c89)
(cherry picked from commit 232140edae)
There's no strong reason to keep `make tags` in our build system. The previous
functionality of `make tags` could be simply retained by aliasing variant of:
etags $(git ls-files '*.c' '*.h')
which would be universal for all C-code projects.
Previously isc_thread_join() would return ISC_R_UNEXPECTED on a failure to
create new thread. All such occurences were caught and wrapped into assert
function at higher level. The function was simplified to assert directly in the
isc_thread_join() function and all caller level assertions were removed.
Previously isc_thread_create() would return ISC_R_UNEXPECTED on a failure to
create new thread. All such occurences were caught and wrapped into assert
function at higher level. The function was simplified to assert directly in the
isc_thread_create() function and all caller level assertions were removed.
Multiple resolvers in the "wildcard" system test are configured with a
single root hint: "ns.root-servers.nil", pointing to 10.53.0.1, which is
inconsistent with authoritative data served by ns1. This may cause
intermittent resolution failures, triggering false positives for the
"wildcard" system test. Prevent this from happening by making ns2, ns3,
and ns5 use root hints corresponding to the contents of ns1/root.db.in.
The isc-config.sh script was introduced before pkg-config as is a purely
historical thing. There are two reason for removal of isc-config.sh scripts:
a) The BIND 9 libraries are now meant to be used only from BIND 9, so there's no
reason to provide convenience script to link with the libraries.
b) Even if that was not the case, we should and would replace the isc-config.sh
with respective pkg-config (.pc) file for every library.
Resolve "Replace the isc_mem_put(mctx, ...)+isc_mem_detach(&mctx) usage with isc_mem_putanddetach(&mctx)"
Closes#1160
See merge request isc-projects/bind9!2195
Using isc_mem_put(mctx, ...) + isc_mem_detach(mctx) required juggling with the
local variables when mctx was part of the freed object. The isc_mem_putanddetach
function can handle this case internally, but it wasn't used everywhere. This
commit apply the semantic patching plus bit of manual work to replace all such
occurrences with proper usage of isc_mem_putanddetach().
With the move of the normal output to stdout, we need a way how to silence the
extra output, so the signed file name can be captured in a simple way. This
commit adds `-q` command line option that will silence all the normal output
that get's printed from both tools.
The lib/dns/zoneverify.c output was hardwired to stderr, which was inconsistent
with lib/dns/dnssec.c. This commit changes zoneverify.c to print the normal run
information to caller supplied function - same model as in the lib/dns/dnssec.c.
Previously, the default output from the libdns library went to stderr by
default. This was inconsistent with the rest of the output. This commit
changes the default logging to go to stdout, with notable exception - when the
output of the signing process goes to stdout, the messages are printed to the
stderr. This is consistent with other functions that output information about
the signing process - e.g. print_stats().
The ns2 named instance in the "staticstub" system test is configured
with a single root hint commonly used in BIND system tests
(a.root-servers.nil with an address of 10.53.0.1), which is inconsistent
with authoritative data served by ns1. This may cause intermittent
resolution failures, triggering false positives for the "staticstub"
system test. Prevent this from happening by making ns1 serve data
corresponding to the contents of bin/tests/system/common/root.hint.
Ensure BIND is continuously tested on Alpine Linux as it is commonly
used as a base for Docker containers and employs a less popular libc
implementation, musl libc.
"PST8PDT" is a legacy time zone name whose use in modern code is
discouraged. It so happens that using this time zone with musl libc
time functions results in different output than for other libc
implementations, which breaks the lib/isc/tests/time_test unit test.
Use the "America/Los_Angeles" time zone instead in order to get
consistent output across all tested libc implementations.
Appending output of a command to the same file as the one that command
is reading from is a dangerous practice. It seems to have accidentally
worked with all the awk implementations we have tested against so far,
but for BusyBox awk, doing this may result in the input/output file
being written to in an infinite loop. Prevent this from happening by
redirect awk output to a temporary file and appending its contents to
the original file in a separate shell pipeline.
The Net::DNS Perl module needs the Digest::HMAC module to support TSIG.
However, since the latter is not a hard requirement for the former, some
packagers do not make Net::DNS depend on Digest::HMAC. If Net::DNS is
installed on a host but Digest::HMAC is not, the "xfer" system test
breaks in a very hard-to-debug way (ans5 returns TSIG RRs with empty
RDATA, which prevents TSIG-signed SOA queries and transfers from
working). Prevent this from happening by making the "xfer" system test
explicitly require Digest::HMAC apart from Net::DNS.
The BusyBox version of sed treats leading '\+' in a regular expression
to be matched as a syntax error ("Repetition not preceded by valid
expression"), which triggers false positives for the "digdelv" system
test. Make the relevant sed invocations work portably across all sed
implementations by removing the leading backslash.
The BusyBox version of awk treats some variables which other awk
implementations consider to be decimal values as octal values. This
intermittently breaks key event interval calculations in the "autosign"
system test, trigger false positives for it. Prevent the problem from
happening by stripping leading zeros from the affected awk variables.
For some libc implementations, BUFSIZ is small enough (e.g. 1024 for
musl libc) to trigger compilation warnings about insufficient size of
certain buffers. Since the relevant buffers are used for printing DNS
names, increase their size to '(n + 1) * DNS_NAME_FORMATSIZE', where 'n'
is the number of DNS names which are printed to a given buffer. This
results in somewhat arbitrary, albeit nicely-aligned and large enough
buffer sizes.
Including <sys/errno.h> instead of <errno.h> raises a compiler warning
when building against musl libc. Always include <errno.h> instead of
<sys/errno.h> to prevent that compilation warning from being triggered
and to achieve consistency in this regard across the entire source tree.
Make sure all unit tests include headers in a similar order:
1. Three headers which must be included before <cmocka.h>.
2. System headers.
3. UNIT_TESTING definition, followed by the <cmocka.h> header.
4. libisc headers.
5. Headers from other BIND libraries.
6. Local headers.
Also make sure header file names are sorted alphabetically within each
block of #include directives.
All unit tests define the UNIT_TESTING macro, which causes <cmocka.h> to
replace malloc(), calloc(), realloc(), and free() with its own functions
tracking memory allocations. In order for this not to break
compilation, the system header declaring the prototypes for these
standard functions must be included before <cmocka.h>.
Normally, these prototypes are only present in <stdlib.h>, so we make
sure it is included before <cmocka.h>. However, musl libc also defines
the prototypes for calloc() and free() in <sched.h>, which is included
by <pthread.h>, which is included e.g. by <isc/mutex.h>. Thus, unit
tests including "dnstest.h" (which includes <isc/mem.h>, which includes
<isc/mutex.h>) after <cmocka.h> will not compile with musl libc as for
these programs, <sched.h> will be included after <cmocka.h>.
Always including <cmocka.h> after all other header files is not a
feasible solution as that causes the mock assertion macros defined in
<isc/util.h> to mangle the contents of <cmocka.h>, thus breaking
compilation. We cannot really use the __noreturn__ or analyzer_noreturn
attributes with cmocka assertion functions because they do return if the
tested condition is true. The problem is that what BIND unit tests do
is incompatible with Clang Static Analyzer's assumptions: since we use
cmocka, our custom assertion handlers are present in a shared library
(i.e. it is the cmocka library that checks the assertion condition, not
a macro in unit test code). Redefining cmocka's assertion macros in
<isc/util.h> is an ugly hack to overcome that problem - unfortunately,
this is the only way we can think of to make Clang Static Analyzer
properly process unit test code. Giving up on Clang Static Analyzer
being able to properly process unit test code is not a satisfactory
solution.
Undefining _GNU_SOURCE for unit test code could work around the problem
(musl libc's <sched.h> only defines the prototypes for calloc() and
free() when _GNU_SOURCE is defined), but doing that could introduce
discrepancies for unit tests including entire *.c files, so it is also
not a good solution.
All in all, including <sched.h> before <cmocka.h> for all affected unit
tests seems to be the most benign way of working around this musl libc
quirk. While quite an ugly solution, it achieves our goals here, which
are to keep the benefit of proper static analysis of unit test code and
to fix compilation against musl libc.
Resolvers in the "filter-aaaa" system test are configured with a single
root hint: "ns.rootservers.net", pointing to 10.53.0.1. However,
querying ns1 for "ns.rootservers.net" results in NXDOMAIN answers.
Since the TTL for the root hint is set to 0, it may happen that a
resolver's ADB will be asked to return any known addresses for
"ns.rootservers.net", but it will only have access to a cached NXDOMAIN
answer for that name and an expired root hint, which will result in a
resolution failure, triggering a false positive for the "filter-aaaa"
system test. Prevent this from happening by making all the root hints
consistent with authoritative data served by ns1.
The HTML view of the statistics channel creates
pages with many long tables. These can be difficult
to navigate.
This commit adds a "show/hide" toggle to each
heading, which makes it easy to compress/expand
the view.
- removed some dead code
- dns_zone_setdbtype is now void as it could no longer return
anything but ISC_R_SUCCESS; calls to it no longer check for a result
- controlkeylist_fromconfig() is also now void
- fixed a whitespace error
The isc_mem_get() cannot fail gracefully now, it either gets memory of
assert()s. The added semantic patch cleans all the blocks checking whether
the return value of isc_mem_get() was NULL.
The coccinelle and util/update_copyright script have different
idea about how the whitespace should look like. Revert the script
to the previous version, so it doesn't mangle the files in place,
and deal with just whitespace changes.
Commit 9da902a201 removed locking around
the fctx_decreference() call inside resume_dslookup(). This allows
fctx_unlink() to be called without the bucket lock being held, which
must never happen. Ensure the bucket lock is held by resume_dslookup()
before it calls fctx_decreference().
Ensure BIND with dnstap support enabled is being continuously tested by
adding --enable-dnstap to the ./configure invocation used for CentOS 7
and Debian sid builds in GitLab CI.
When the unit test is linked with dynamic libraries, the wrapping
doesn't occur, probably because it's different translation unit.
To workaround the issue, we provide thin wrappers with *real* symbol
names that just call the mocked functions.
1. Restore locking in the fctx_decreference() code, because the insides of the
function needs to be protected when fctx->references drops to 0.
2. Restore locking in the dns_resolver_attach() code, because two variables are
accessed at the same time and there's slight chance of data race.
Although the struct dns_resolver.exiting member is protected by stdatomics, we
actually need to wait for whole dns_resolver_shutdown() to finish before
destroying the resolver object. Otherwise, there would be a data race and some
fctx objects might not be destroyed yet at the time we tear down the
dns_resolver object.
This commit changes the BIND cookie algorithms to match
draft-sury-toorop-dnsop-server-cookies-00. Namely, it changes the Client Cookie
algorithm to use SipHash 2-4, adds the new Server Cookie algorithm using SipHash
2-4, and changes the default for the Server Cookie algorithm to be siphash24.
Add siphash24 cookie algorithm, and make it keep legacy aes as
Each individual test opened GeoIP databased but the database handles were never
closed. This commit moves the open/close from the individual unit tests into
the _setup and _teardown methods where they really belong.
Instead of the explicit struct initializer with all member, rely on the fact
that static variables are explicitly initialized to 0 if not explicitly
initialized.
The MSVS C compiler requires every struct to have at least one member.
The dns_geoip_databases_t structure had one set of members for
HAVE_GEOIP and a different set for HAVE_GEOIP2, and none when neither
API is in use.
This commit silences the compiler error by moving the declaration of
dns_geoip_databases_t to types.h as an opaque reference, and commenting
out the contents of geoip.h when neither version of GeoIP is enabled.
Commit b104a9bc50 introduced unconditional
use of the ATOMIC_VAR_INIT() macro in bin/dnssec/dnssec-signzone.c even
though that macro is only defined on Unix platforms. Define it on
Windows systems as well in order to prevent build failures.
The ThreadSanitizer found several possible data races in our rwlock
implementation. This commit changes all the unprotected variables to atomic and
also changes the explicit memory ordering (atomic_<foo>_explicit(..., <order>)
functions to use our convenience macros (atomic_<foo>_<order>).
The SO_BSDCOMPAT socket option is no-op since Linux 2.4, see the manpage:
SO_BSDCOMPAT
Enable BSD bug-to-bug compatibility. This is used by the UDP protocol
module in Linux 2.0 and 2.2. If enabled, ICMP errors received for a UDP
socket will not be passed to the user program. In later kernel
versions, support for this option has been phased out: Linux 2.4
silently ignores it, and Linux 2.6 generates a kernel warning (printk())
if a program uses this option. Linux 2.0 also enabled BSD bug-to-bug
compatibility options (random header changing, skipping of the broadcast
flag) for raw sockets with this option, but that was removed in Linux
2.2.
The 'managed-keys' (and 'trusted-keys') options have been deprecated
by 'dnssec-keys'. Some documentation references to 'managed-keys'
had not yet been marked or noted as such.
When trying to extract the key ID from a key file name, some test code
incorrectly attempts to strip all leading zeros. This breaks tests when
keys with ID 0 are generated. Add a new helper shell function,
keyfile_to_key_id(), which properly handles keys with ID 0 and use it in
test code whenever a key ID needs to be extracted from a key file name.
When printing a packet, dnstap-read checks whether its text form takes
up more than the 2048 bytes allocated for the output buffer by default.
If that is the case, the output buffer is automatically expanded, but
the truncated output is left in the buffer, resulting in malformed data
being printed. Clear the output buffer before expanding it to prevent
this issue from occurring.
Adds a new option to named-checkconf, -i. If set, named-checkconf
will not warn you about deprecated options. This allows people
to use named-checkconf in automated deployment precoesses where an
operator only cares if their conf is valid, even if it is not optimal.
This was added as a request as part of introducing a policy on
removing named.conf options.
- revise mapping of search terms to database types to match the
GeoIP2 schemas.
- open GeoIP2 databases when starting up; close when shutting down.
- clarify the logged error message when an unknown database type
is configured.
- add new geoip ACL subtypes to support searching for continent in
country databases.
- map geoip ACL subtypes to specific MMDB database queries.
- perform MMDB lookups based on subtype, saving state between
queries so repeated lookups for the same address aren't necessary.
- "--with-geoip" is used to enable the legacy GeoIP library.
- "--with-geoip2" is used to enable the new GeoIP2 library
(libmaxminddb), and is on by default if the library is found.
- using both "--with-geoip" and "--with-geoip2" at the same time
is an error.
- an attempt is made to determine the default GeoIP2 database path at
compile time if pkg-config is able to report the module prefix. if
this fails, it will be necessary to set the path in named.conf with
geoip-directory
- Makefiles have been updated, and a stub lib/dns/geoip2.c has been
added for the eventual GeoIP2 search implementation.
When GNU C Compiler is used on Solaris (11), the Thread Local Storage
is completely broken. The behaviour doesn't manifest when GNU ld is
used. Thus, we need to enforce usage of GNU ld when GNU C Compiler is
the compiler of choice.
For more background for this change, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90912
In ISC-Bugs 45340, I wrote:
The Statistics channel offers links to Zones and Traffic.
Both produce valid data, but display as blank pages with
a web browser.
Zones never had XSL (I provided the original
implementation, but punted on the XSL).
Traffic has XSL, but it wasn't updated to reflect the
split between IPv4 and IPv6 data.
I've picked up enough XSL to fix my original omission,
and as penance for my sloth, fixed the Traffic bug as well.
- when processing authoritative queries for ./NS, set 'gluedb' so
that glue will be included in the response, regardless of how
'minimal-responses' has been configured.
if "rndc reload" fails, the result code is supposed to be passed to
zone_postload, but for inline-signing zones, the result can be
overwritten first by a call to the ZONE_TRYLOCK macro. this can lead
to the partially-loaded unsigned zone being synced over to the signed
zone instead of being rejected.
libidn2 2.2.0+ parses Punycode more strictly than older versions and
thus "dig +idnin +noidnout xn--19g" fails with libidn2 2.2.0+ but
succeeds with older versions.
We could preserve the old behavior by using the IDN2_NO_ALABEL_ROUNDTRIP
flag available in libidn2 2.2.0+, but:
- this change in behavior is considered a libidn2 bug fix [1],
- we want to make sure dig behaves as expected, not libidn2,
- implementing that would require additional configure.ac cruft.
Removing the problematic check appears to be the simplest solution as it
does not prevent the relevant block of checks in the "idna" system test
from achieving its purpose, i.e. ensuring dig properly handles invalid
U-labels.
[1] see upstream commit 241e8f486134793cb0f4a5b0e5817a97883401f5
Since commit 0771dd3be8, <isc/mem.h> no
longer includes <isc/xml.h>. On some systems (e.g. FreeBSD), this means
that no header included by lib/dns/dnsrps.c (and no header included by
those headers) contains a definition of free() any more, which triggers
a compiler warning as lib/dns/dnsrps.c calls that function. Add the
missing #include directive to prevent that warning from being triggered.
No function called dns_dnssecsignstats_decrement() actually exists.
Putting it into lib/dns/win32/libdns.def.in breaks at least some Windows
builds. Remove the nonexistent function from that file.
Since the message confirming outgoing transfer completion is logged
asynchronously, it may happen that transfer statistics may not yet be
logged by the time the dig command triggering a given transfer returns.
This causes false positives for the "ixfr" and "xfer" system tests.
Prevent this from happening by checking outgoing transfer statistics up
to 10 times, in 1-second intervals.
The ax_check_openssl m4 macro used OPENSSL_INCLUDES. Rename the
subst variable to OPENSSL_CFLAGS and wrap AX_CHECK_OPENSSL() in
action-if-not-found part of PKG_CHECK_MODULE check for libcrypto.
The json-c have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header. This MR fixes the usage making the caller object opaque.
The libxml2 have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header. This MR fixes the usage making the caller object opaque.
In addition to gather how many times signatures are created per
key in a zone, also count how many of those signature creations are
because of DNSSEC maintenance. These maintenance counters are
incremented if a signature is refreshed (but the RRset did not
changed), when the DNSKEY RRset is changed, and when that leads
to additional RRset / RRSIG updates (for example SOA, NSEC).
This adds tests to the statschannel system test for testing if
the dnskey sign operation counters are incremented correctly.
It tests three cases:
1. A zone maintenance event where all the signatures that are about
to expire are resigned.
2. A dynamic update event where the new RR and other relevant records
(SOA, NSEC) are resigned.
3. Adding a standby key, that means the DNSKEY and SOA RRset are
resigned.
After a failed reload I noticed two problems:
* There was a missing newline in the output of `rndc status` so it
finished "reload/reconfig in progressserver is up and running"
* The "reconfig in progress" note should have said "reconfig failed"
Previously the autoconf script set sysconfdir to /etc and localstatedir to /var
if they were not explicitly set in the ./configure invocation. This MR reverts
the override and make it more in line with default and generally expected
autoconf behavior.
AM_MAINTAINER_MODE macro adds ability to disable rebuilding build file
(Makefile.in, configure, ...) when the source file changes. This is
important in the CI where the timestamps could get skewed and that
triggers the rebuild on every ./configure run.
The differences between two files are very minimal and most of the
code is common. Merge those two files and use #ifdef WIN32 to include
the right bits on Windows.
Using atomic_int_fast64_t variables with atomic functions on x86 does
not cause Visual Studio to report build errors, but such operations
yield useless results. Since the isc_stat_t type is unconditionally
typedef'd to atomic_int_fast64_t, any code performing atomic operations
on isc_stat_t variables is broken in x86 Windows builds. Fix by using
the atomic_int_fast32_t type for isc_stat_t in x86 Windows builds.
BIND 9.11.0 has bumped DNS_CLIENTINFOMETHODS_VERSION and _AGE to
version 2 and 1 in the dlz_minimal.h because a member was addet to the
dnsclientinfo struct. It was found out that the new member is not
used anywhere and there are no accessor functions therefore the change
was reverted.
Later on, it was found out that the revert caused some problems to the
users of BIND 9, and thus this changes takes a different approach by
syncing the values other way around.
The common construct seen in the BIND 9 source is func(isc_mem_t *mctx, ...).
Unfortunately, the dnstest.{h,c} has been using mctx as a global symbol, which
in turn generated a lot of errors when update.c got included in update_test.c.
As a rule of thumb, we should avoid naming global symbols with generic names
(like mctx) and we should prefix them with "namespace" (like dt_mctx).
The CHECK() macro has been defined both in dnstest.h and update.c
files. This has created a conflict between macro definitions when
including both of the files in update_test.c. While the CHECK() macro
is convenient for the tests, it has been really used in just two
files, so the MR moves them into those respective .c files.
lib/dns/tests/update_test was failing on macOS on random occasions. It
turned out this was a linker problem - it preferred isc_stdtime_get()
from libisc instead of the local version in lib/dns/tests/update_test.c.
Fix by including the original .c file in the unit test. This has two
benefits:
a) linking order may no longer cause issues as symbols found in the
same compilation unit are always preferred,
b) it allows writing tests for static functions in lib/dns/update.c.
Pull and use several autoconf archive convenience macros to simplify
configure.ac.
* AX_CHECK_COMPILE_FLAG(FLAG, ...) - check whether given CFLAG works
* AX_CHECK_LINK_FLAG(FLAG, ...) - check whether given LDFLAG works
* AX_CHECK_PREPROC_FLAG(FLAG, ...) - check whether give CPPFLAG works
* AX_SAVE_FLAGS/AX_RESTORE_FLAGS - save and restore *FLAGS
In certain situations (e.g. a named instance crashing upon shutdown in a
system test which involves shutting down a server and restarting it
afterwards), a system test may succeed despite a named crash being
triggered. This must never be the case. Extend run.sh to mark a test
as failed if core dumps or log lines indicating assertion failures are
detected (the latter is only an extra measure aimed at test environments
in which core dumps are not generated; note that some types of crashes,
e.g. segmentation faults, will not be detected using this method alone).
Make the get_named_xfer_stats() helper shell function more precise in
order to prevent it from matching the wrong lines as that may trigger
false positives for the "ixfr" and "xfer" system tests. As an example,
the regular expression responsible for extracting the number of bytes
transmitted throughout an entire zone transfer could also match a line
containing the following string:
transfer of '<zone-name>/IN': sending TCP message of <integer> bytes
However, such a line is not one summarizing a zone transfer.
Also simplify both get_dig_xfer_stats() and get_named_xfer_stats() by
eliminating the need for "echo" statements in them.
If ns1/setup.sh generates a key with ID 0, the "KEYID" token in
ns1/named.conf.in will be replaced with an empty string, causing the
following broken statement to appear in ns1/named.conf:
tkey-dhkey "server" ;
Such a statement triggers false positives for the "tkey" system test due
to ns1 being unable to start with a broken configuration file. Fix by
tweaking the regular expression used for removing leading zeros from the
key ID, so that it removes at most 4 leading zeros.
We increase recursclients when we attach to recursion quota,
decrease when we detach. In some cases, when we hit soft
quota, we might attach to quota without increasing recursclients
gauge. We then decrease the gauge when we detach from quota,
and it causes the statistics to underflow.
Fix makes sure that we increase recursclients always when we
succesfully attach to recursion quota.
Compiling with -O3 triggers the following warnings with GCC 9.1:
task.c: In function ‘isc_taskmgr_create’:
task.c:1384:43: warning: ‘%04u’ directive output may be truncated writing between 4 and 10 bytes into a region of size 6 [-Wformat-truncation=]
1384 | snprintf(name, sizeof(name), "isc-worker%04u", i);
| ^~~~
task.c:1384:32: note: directive argument in the range [0, 4294967294]
1384 | snprintf(name, sizeof(name), "isc-worker%04u", i);
| ^~~~~~~~~~~~~~~~
task.c:1384:3: note: ‘snprintf’ output between 15 and 21 bytes into a destination of size 16
1384 | snprintf(name, sizeof(name), "isc-worker%04u", i);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
private_test.c: In function ‘private_nsec3_totext_test’:
private_test.c:110:9: warning: array subscript 4 is outside array bounds of ‘uint32_t[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds]
110 | while (*sp == '\0' && slen > 0) {
| ^~~
private_test.c:103:11: note: while referencing ‘salt’
103 | uint32_t salt;
| ^~~~
Prevent these warnings from being triggered by increasing the size of
the relevant array (task.c) and reordering conditions (private_test.c).
Compiling with -O3 triggers the following warning with GCC 8.3:
driver.c: In function ‘dlz_findzonedb’:
driver.c:191:29: warning: ‘%u’ directive output may be truncated writing between 1 and 5 bytes into a region of size between 0 and 99 [-Wformat-truncation=]
snprintf(buffer, size, "%s#%u", addr_buf, port);
^~
driver.c:191:25: note: directive argument in the range [0, 65535]
snprintf(buffer, size, "%s#%u", addr_buf, port);
^~~~~~~
driver.c:191:2: note: ‘snprintf’ output between 3 and 106 bytes into a destination of size 100
snprintf(buffer, size, "%s#%u", addr_buf, port);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Increase the size of the relevant array to prevent this warning from
being triggered.
Change the compiler optimization level for Debian sid build jobs from
-O2 to -O3 in order to enable triggering compilation warnings which are
not raised when -O2 is used.
write locking.
Unreachable cache in zonemgr is realized as an static LRU list.
When we 'use' an entry we need to update the last-used time, we
can use atomics to do so without the necessity to upgrading
read-lock to write-lock.
this change silences a warning message and prevents the unwanted
use of smart quotes when using pandoc 2.7.1 to generate human-readable
versions of README and other markdown files.
- change references to trusted-keys to dnssec-keys with static-key
- rebuild doc/misc/options and other generated grammar doc
- add a "see MANAGED-KEYS" note when building named.conf.docbook
- managed-keys is now deprecated as well as trusted-keys, though
it continues to work as a synonym for dnssec-keys
- references to managed-keys have been updated throughout the code.
- tests have been updated to use dnssec-keys format
- also the trusted-keys entries have been removed from the generated
bind.keys.h file and are no longer generated by bindkeys.pl.
- any use of trusted or static keys for the root zone will now
elicit a warning, regardless of what the keys may be
- ditto for any use of a key for dlv.isc.org, static or managed
- trusted-keys is now flagged as deprecated, but still works
- managed-keys can be used to configure permanent trust anchors by
using the "static-key" keyword in place of "initial-key"
- parser now uses an enum for static-key and initial-key keywords
Since 2008, the cleaning-interval timer has been documented as
"effectively obsolete" and disabled in the default configuration with
a comment saying "now meaningless".
This change deletes all the code that implements the cleaning-interval
timer, except for the config parser in whcih it is now explicitly
marked as obsolete.
I have verified (using the deletelru and deletettl cache stats) that
named still cleans the cache after this change.
Move the macOS section of <isc/endian.h> to a lower spot as it is
believed not to be the most popular platform for running BIND. Add a
comment and remove redundant definitions.
Instead of only supporting Linux, try making <isc/endian.h> support
other GNU platforms as well. Since some compilers define __GNUC__ on
BSDs (e.g. Clang on FreeBSD), move the relevant section to the bottom of
the platform-specific part of <isc/endian.h>, so that it only gets
evaluated when more specific platform determination criteria are not
met. Also include <byteswap.h> so that any byte-swapping macros which
may be defined in that file on older platforms are used in the fallback
definitions of the nonstandard hto[bl]e{16,32,64}() and
[bl]e{16,32,64}toh() conversion functions.
While Solaris does not support the nonstandard hto[bl]e{16,32,64}() and
[bl]e{16,32,64}toh() conversion functions, it does have some
byte-swapping macros available in <sys/byteorder.h>. Ensure these
macros are used in the fallback definitions of the aforementioned
nonstandard functions.
Since the hto[bl]e{16,32,64}() and [bl]e{16,32,64}toh() conversion
functions are nonstandard, add fallback definitions of these functions
to <isc/endian.h>, so that their unavailability does not prevent
compilation from succeeding.
Current versions of DragonFly BSD, FreeBSD, NetBSD, and OpenBSD all
support the modern variants of functions converting values between host
and big-endian/little-endian byte order while older ones might not.
Ensure <isc/endian.h> works properly in both cases.
Replace grep calls with awk scripts to more precisely detect presence of
CDS and CDNSKEY records in a signed zone file, in order to prevent rare
false positives for the "smartsign" system test triggered by the strings
"CDS" and/or "CDNSKEY" being accidentally present in the Base64 form of
DNSSEC-related data in the zone file being checked.
There's a small possibility of race between udp dispatcher and
socket code - socket code can still hold internal reference to a
socket while dispatcher calls isc_socket_open, which can cause
an assertion failure. Fix it by relaxing the assertion test, and
instead simply locking the socket in isc_socket_open.
qname minimization, even in relaxed mode, can fail on
some very broken domains. In relaxed mode, instead of
asking for "foo.bar NS" ask for "_.foo.bar A" to either
get a delegation or NXDOMAIN. It will require more queries
than regular mode for proper NXDOMAINs.
Performing server setup checks using "+tries=3 +time=5" is redundant as
a single query is arguably good enough for determining whether a given
named instance was set up properly. Only use multiple queries with a
long timeout for resolution checks in the "legacy" system test, in order
to significantly reduce its run time (on a contemporary machine, from
about 1m45s to 0m40s).
In the "legacy" system test, in order to make server setup checks more
consistent with each other, add further checks for either presence or
absence of the EDNS OPT pseudo-RR in the responses returned by the
tested named instances.
Extract repeated dig and grep calls into two helper shell functions,
resolution_succeeds() and resolution_fails(), in order to reduce code
duplication in the "legacy" system test, emphasize the similarity
between all the resolution checks in that test, and make the conditions
for success and failure uniform for all resolution checks in that test.
When testing named instances which are configured to drop outgoing UDP
responses larger than 512 bytes, querying with DO=1 may be used instead
of querying for large TXT records as the effect achieved will be
identical: an unsigned response for a SOA query will be below 512 bytes
in size while a signed response for the same query will be over 512
bytes in size. Doing this makes all resolution checks in the "legacy"
system test more similar. Add checks for the TC flag being set in UDP
responses which are expected to be truncated to further make sure that
tested named instances behave as expected.
Sending TCP queries to test named instances with TCP support disabled
should cause dig output to contain the phrase "connection refused", not
"connection timed out", as such instances never open the relevant
sockets. Make sure that the "legacy" system test fails if the expected
phrase is not found in any of the relevant files containing dig output.
On some systems (namely Debian buster armhf) the readdir() call fails
with `Value too large for defined data type` unless the
_FILE_OFFSET_BITS=64 is defined. The correct way to fix this is to
get the appropriate compilation parameters from getconf system
interface.
Each network thread holds an array of locks, indexed by a hash
of fd. When we accept a connection we hold a lock in accepting thread.
We then generate the thread number and lock bucket for the new
connection socket - if we hit the same thread and lock bucket as
accepting socket we get a deadlock. Avoid this by checking if we're
in the same thread/lock bucket and not locking in this case.
5235. [cleanup] Refactor lib/isc/app.c to be thread-safe, unused
parts of the API has been removed and the
isc_appctx_t data type has been changed to be
fully opaque. [GL #1023]
This work cleans up the API which includes couple of things:
1. Make the isc_appctx_t type fully opaque
2. Protect all access to the isc_app_t members via stdatomics
3. sigwait() is part of POSIX.1, remove dead non-sigwait code
4. Remove unused code: isc_appctx_set{taskmgr,sockmgr,timermgr}
The header file <isc/atomic.h> now contains convenience macros for
most useful explicit memory ordering for C11 stdatomics, only relaxed
and acquire-release semantics is being used. These macros SHOULD be
used instead of atomic_<func>_explicit functions.
If named is configured to perform DNSSEC validation and also forwards
all queries ("forward only;") to validating resolvers, negative trust
anchors do not work properly because the CD bit is not set in queries
sent to the forwarders. As a result, instead of retrieving bogus DNSSEC
material and making validation decisions based on its configuration,
named is only receiving SERVFAIL responses to queries for bogus data.
Fix by ensuring the CD bit is always set in queries sent to forwarders
if the query name is covered by an NTA.
Previously, only a message about missing Python was printed, which was
misleading to many users. The new message clearly states that Python
AND PLY is required and prints basic instructions how to install PLY
package.
This affects CDS records generated by `named` and `dnssec-signzone`
based on `-P sync` and `-D sync` key timing instructions.
This is for conformance with the DS/CDS algorithm requirements in
https://tools.ietf.org/html/draft-ietf-dnsop-algorithm-update
This affects two cases:
* When writing a `dsset` file for this zone, to be used by its
parent, only write a SHA-256 DS record.
* When reading a `keyset` file for a child, to generate DS records
to include in this zone, generate SHA-256 DS records only.
This change does not affect digests used in CDS records.
This is for conformance with the DS/CDS algorithm requirements in
https://tools.ietf.org/html/draft-ietf-dnsop-algorithm-update
This changes the behaviour so that it explicitly lists DS records that
are present in the parent but do not have keys in the child. Any
inconsistency is reported as an error, which is somewhat stricter than
before.
This is for conformance with the DS/CDS algorithm requirements in
https://tools.ietf.org/html/draft-ietf-dnsop-algorithm-update
This makes the `-12a` options to `dnssec-dsfromkey` work more like
`dnssec-cds`, in that you can specify more than one digest and you
will get multiple records. (Previously you could only get one
non-default digest type at a time.)
The default is now `-2`. You can get the old behaviour with `-12`.
Tests and tools that use `dnssec-dsfromkey` have been updated to use
`-12` where necessary.
This is for conformance with the DS/CDS algorithm requirements in
https://tools.ietf.org/html/draft-ietf-dnsop-algorithm-update
Fuzz input to dns_rdata_fromwire(). Then convert the result
to text, back to wire format, to multiline text, and back to wire
format again, checking for consistency throughout the sequence.
Resolve "Bind returning malformed packet error when sshfp record has fingerprint value less than 4 characters"
Closes#852
See merge request isc-projects/bind9!1445
there is now a common list of tests in conf.sh.common, with the
tests that are either unique to windows or to unix, or which are
enabled or disabled by configure or Configure, being listed in
separate variables in conf.sh.in and conf.sh.win32.
this moves the creation of "parallel.mk" into a separate shell script
instead of bin/tests/system/Makefile. that shell script can now be
executed by runall.sh, allowing us to make use of the cygwin "make"
command, which supports parallel execution.
Windows systems do not allow a trailing period in file names while Unix
systems do. When BIND system tests are run, the $TP environment
variable is set to an empty string on Windows systems and to "." on Unix
systems. This environment variable is then used by system test scripts
for handling this discrepancy properly.
In multiple system test scripts, a variable holding a zone name is set
to a string with a trailing period while the names of the zone's
corresponding dlvset-* and/or dsset-* files are determined using
numerous sed invocations like the following one:
dlvsets="$dlvsets dlvset-`echo $zone |sed -e "s/.$//g"`$TP"
In order to improve code readability, use zone names without trailing
periods and replace sed invocations with variable substitutions.
To retain local consistency, also remove the trailing period from
certain other zone names used in system tests that are not subsequently
processed using sed.
In the "allow-query" system test, ns3 uses a root hints file which
contains a single entry for a.root-servers.nil (10.53.0.1). This name
is not present in the root zone served by ns1, which means querying it
for that name and any type will yield an NXDOMAIN response. When
combined with unfavorable thread scheduling, this can lead to ns3
caching an NXDOMAIN response for the only root server it is aware of and
thus to false positives for the "allow-query" system test caused by ns3
returning unexpected SERVFAIL responses. Fix by modifying the root zone
served by ns1 so that authoritative responses to a.root-servers.nil
queries match the root hints file used by ns3.
in the "refactor tcpquota and pipeline refs" commit, the counting
of active interfaces was tightened in such a way that named could
fail to listen on an interface if there were more interfaces than
tcp-clients. when checking the quota to start accepting on an
interface, if the number of active clients was above zero, then
it was presumed that some other client was able to handle accepting
new connections. this, however, ignored the fact that the current client
could be included in that count, so if the quota was already exceeded
before all the interfaces were listening, some interfaces would never
listen.
we now check whether the current client has been marked active; if so,
then the number of active clients on the interface must be greater
than 1, not 0.
(cherry picked from commit 02365b87ea0b1ea5ea8b17376f6734c811c95e61)
(cherry picked from commit cae79e1bab)
- if the TCP quota has been exceeded but there are no clients listening
for new connections on the interface, we can now force attachment to the
quota using isc_quota_force(), instead of carrying on with the quota not
attached.
- the TCP client quota is now referenced via a reference-counted
'ns_tcpconn' object, one of which is created whenever a client begins
listening for new connections, and attached to by members of that
client's pipeline group. when the last reference to the tcpconn
object is detached, it is freed and the TCP quota slot is released.
- reduce code duplication by adding mark_tcp_active() function
- convert counters to stdatomic
(cherry picked from commit a8dd133d270873b736c1be9bf50ebaa074f5b38f)
(cherry picked from commit 4a8fc979c4)
- ensure that tcpactive is cleaned up correctly when accept() fails.
- set 'client->tcpattached' when the client is attached to the tcpquota.
carry this value on to new clients sharing the same pipeline group.
don't call isc_quota_detach() on the tcpquota unless tcpattached is
set. this way clients that were allowed to accept TCP connections
despite being over quota (and therefore, were never attached to the
quota) will not inadvertently detach from it and mess up the
accounting.
- simplify the code for tcpquota disconnection by using a new function
tcpquota_disconnect().
- before deciding whether to reject a new connection due to quota
exhaustion, check to see whether there are at least two active
clients. previously, this was "at least one", but that could be
insufficient if there was one other client in READING state (waiting
for messages on an open connection) but none in READY (listening
for new connections).
- before deciding whether a TCP client object can to go inactive, we
must ensure there are enough other clients to maintain service
afterward -- both accepting new connections and reading/processing new
queries. A TCP client can't shut down unless at least one
client is accepting new connections and (in the case of pipelined
clients) at least one additional client is waiting to read.
(cherry picked from commit 427a2fb4d17bc04ca3262f58a9dcf5c93fc6d33e)
(cherry picked from commit 0896841272)
Track pipeline groups using a shared reference counter
instead of a linked list.
(cherry picked from commit 31f392db20207a1b05d6286c3c56f76c8d69e574)
(cherry picked from commit 2211120222)
the TCP client quota could still be ineffective under some
circumstances. this change:
- improves quota accounting to ensure that TCP clients are
properly limited, while still guaranteeing that at least one client
is always available to serve TCP connections on each interface.
- uses more descriptive names and removes one (ntcptarget) that
was no longer needed
- adds comments
(cherry picked from commit 9e74969f85329fe26df2fad390468715215e2edd)
(cherry picked from commit d7e84cee0b)
tcp-clients settings could be exceeded in some cases by
creating more and more active TCP clients that are over
the set quota limit, which in the end could lead to a
DoS attack by e.g. exhaustion of file descriptors.
If TCP client we're closing went over the quota (so it's
not attached to a quota) mark it as mortal - so that it
will be destroyed and not set up to listen for new
connections - unless it's the last client for a specific
interface.
(cherry picked from commit eafcff07c25bdbe038ae1e4b6660602a080b9395)
(cherry picked from commit 9e7617cc84)
- Always set is_zonep in query_getdb; previously it was only set if
result was ISC_R_SUCCESS or ISC_R_NOTFOUND.
- Don't reset is_zone for redirect.
- Style cleanup.
(cherry picked from commit a85cc641d7a4c66cbde03cc4e31edc038a24df46)
(cherry picked from commit 486a201149)
Key IDs may accidentally match dig output that is not the key ID (for
example the RRSIG inception or expiration time, the query ID, ...).
Search for key ID + signer name should prevent that, as that is what
only should occur in the RRSIG record, and signer name always follows
the key ID.
Remove sleep calls from test, rely on wait_for_log(). Make
wait_for_log() and dnssec_loadkeys_on() fail the test if the
appropriate log line is not found.
Slightly adjust the echo_i() lines to print only the key ID (not the
key name).
One second may not be enough for an NSEC3 chain change triggered by an
UPDATE message to complete. Wait up to 10 seconds when checking whether
a given NSEC3 chain change is complete in the "nsupdate" system test.
In the "nsupdate" system test, do not sleep before checking results of
changes which are expected to be processed synchronously, i.e. before
nsupdate returns.
- named could return FORMERR if parsing iterative responses
ended with a result code such as DNS_R_OPTERR. instead of
computing a response code based on the result, in this case
we now just force the response to be SERVFAIL.
Make bin/tests/system/ifconfig.bat also configure addresses ending with
9 and 10, so that the script is in sync with its Unix counterpart.
Update comments listing the interfaces created by ifconfig.{bat,sh} so
that they do not include addresses whose last octet is zero (since an
address like 10.53.1.0/24 is not a valid host address and thus the
aforementioned scripts do not even attempt configuring them).
On Windows, the bin/tests/system/dnssec/signer/example.db.signed file
contains carriage return characters at the end of each line. Remove
them before passing the aforementioned file to the awk script extracting
key IDs so that the latter can work properly.
As signals are currently not handled by named on Windows, instances
terminated using signals are not able to perform a clean shutdown, which
involves e.g. removing the lock file. Thus, waiting for a given
instance's lock file to be removed beforing assuming it is shut down
is pointless on Windows, so do not even attempt it.
When a Windows service receives a request to stop, it should not set its
state to SERVICE_STOPPED until it is completely shut down as doing that
allows the operating system to kill that service prematurely, which in
the case of named may e.g. prevent the PID file and/or the lock file
from being cleaned up.
Set service state to SERVICE_STOP_PENDING when named begins its shutdown
and only report the SERVICE_STOPPED state immediately before exiting.
this restores functionality that was removed in commit 03be5a6b4e,
allowing named to search in authoritative zone databases outside the
current zone for additional data, if and only if recursion is allowed
and minimal-responses is disabled.
The option `update-check-ksk` will look if both KSK and ZSK are
available before signing records. It will make sure the keys are
active and available. However, for operational practices keys may
be offline. This commit relaxes the update-check-ksk check and will
mark a key that is offline to be available when adding signature
tasks.
This commit adds a lengthy test where the ZSK is rolled but the
KSK is offline (except for when the DNSKEY RRset is changed). The
specific scenario has the `dnskey-kskonly` configuration option set
meaning the DNSKEY RRset should only be signed with the KSK.
A new zone `updatecheck-kskonly.secure` is added to test against,
that can be dynamically updated, and that can be controlled with rndc
to load the DNSSEC keys.
There are some pre-checks for this test to make sure everything is
fine before the ZSK roll, after the new ZSK is published, and after
the old ZSK is deleted. Note there are actually two ZSK rolls in
quick succession.
When the latest added ZSK becomes active and its predecessor becomes
inactive, the KSK is offline. However, the DNSKEY RRset did not
change and it has a good signature that is valid for long enough.
The expected behavior is that the DNSKEY RRset stays signed with
the KSK only (signature does not need to change). However, the
test will fail because after reconfiguring the keys for the zone,
it wants to add re-sign tasks for the new active keys (in sign_apex).
Because the KSK is offline, named determines that the only other
active key, the latest ZSK, will be used to resign the DNSKEY RRset,
in addition to keeping the RRSIG of the KSK.
The question is: Why do we need to resign the DNSKEY RRset
immediately when a new key becomes active? This is not required,
only once the next resign task is triggered the new active key
should replace signatures that are in need of refreshing.
Add dns_rdata_totext() and dns_rdata_fromtext() to fromwire for
valid inputs to ensure that what we accept in dns_rdata_fromwire()
can be written out and read back in.
In dns_rpz_update_from_db we call setup_update which creates the db
iterator and calls dns_dbiterator_first. This unpauses the iterator and
might cause db->tree_lock to be acquired. We then do isc_task_send(...)
on an event to do quantum_update, which (correctly) after each iteration
calls dns_dbiterator_pause, and re-isc_task_sends itself.
That's an obvious bug, as we're holding a lock over an async task send -
if a task requesting write (e.g. prune_tree) is scheduled on the same
workers queue as update_quantum but before it, it will wait for the
write lock indefinitely, resulting in a deadlock.
To fix it we have to pause dbiterator in setup_update.
Some system tests assume dig's default setings are in effect. While
these defaults may only be silently overridden (because of specific
options set in /etc/resolv.conf) for BIND releases using liblwres for
parsing /etc/resolv.conf (i.e. BIND 9.11 and older), it is arguably
prudent to make sure that tests relying on specific +timeout and +tries
settings specify these explicitly in their dig invocations, in order to
prevent test failures from being triggered by any potential changes to
current defaults.
When parsing message with DNS_MESSAGE_BESTEFFORT (used exclusively in
tools, never in named itself) if we hit an invalid SIG(0) in wrong
place we continue parsing the message, and put the sig0 in msg->sig0.
If we then hit another sig0 in a proper place we see that msg->sig0
is already 'taken' and we don't free name and rdataset, and we don't
set seen_problem. This causes an assertion failure.
This fixes that issue by setting seen_problem if we hit second sig0,
tsig or opt, which causes name and rdataset to be always freed.
Simply looking for the key ID surrounded by spaces in the tested
dnssec-signzone output file is not a precise enough method of checking
for signatures prepared using a given key ID: it can be tripped up by
cross-algorithm key ID collisions and certain low key IDs (e.g. 60, the
TTL specified in bin/tests/system/dnssec/signer/example.db.in), which
triggers false positives for the "dnssec" system test. Make key ID
extraction precise by using an awk script which operates on specific
fields.
The "mirror" system test expects all dig queries (including recursive
ones) to be responded to within 1 second, which turns out to be overly
optimistic in certain cases and leads to false positives being
triggered. Increase dig query timeout used throughout the "mirror"
system test to 2 seconds in order to alleviate the issue.
Currently, ns3 in the "mirror" system test sends trust anchor telemetry
queries every second as it is started with "-T tat=1". Given the number
of trust anchors configured on ns3 (9), TAT-related traffic clutters up
log files, hindering troubleshooting efforts. Increase TAT query
interval to 3 seconds in order to alleviate the issue.
Note that the interval chosen cannot be much higher if intermittent test
failures are to be avoided: TAT queries are only sent after the
configured number of seconds passes since resolver startup. Quick
experiments show that even on contemporary hardware, ns3 should be
running for at least 5 seconds before it is first shut down, so a
3-second TAT query interval seems to be a reasonable, future-proof
compromise. Ensure the relevant check is performed before ns3 is first
shut down to emphasize this trade-off and make it more clear by what
time TAT queries are expected to be sent.
"rndc dumpdb" works asynchronously, i.e. the requested dump may not yet
be fully written to disk by the time "rndc" returns. Prevent false
positives for the "serve-stale" system test by only checking dump
contents after the line indicating that it is complete is written.
This tests both the cases when the DLV trust anchor is of an
unsupported or disabled algorithm, as well as if the DLV zone
contains a key with an unsupported or disabled algorithm.
Some values returned by dstkey_fromconfig() indicate that key loading
should be interrupted, others do not. There are also certain subsequent
checks to be made after parsing a key from configuration and the results
of these checks also affect the key loading process. All of this
complicates the key loading logic.
In order to make the relevant parts of the code easier to follow, reduce
the body of the inner for loop in load_view_keys() to a single call to a
new function, process_key(). Move dstkey_fromconfig() error handling to
process_key() as well and add comments to clearly describe the effects
of various key loading errors.
More specifically: ignore configured trusted and managed keys that
match a disabled algorithm. The behavioral change is that
associated responses no longer SERVFAIL, but return insecure.
bin/tests/system/stop.pl only waits for the PID file to be cleaned up
while named cleans up the lock file after the PID file. Thus, the
aforementioned script may consider a named instance to be fully shut
down when in fact it is not.
Fix by also checking whether the lock file exists when determining a
given instance's shutdown status. This change assumes that if a named
instance uses a lock file, it is called "named.lock".
Also rename clean_pid_file() to pid_file_exists(), so that it is called
more appropriately (it does not clean up the PID file itself, it only
returns the server's identifier if its PID file is not yet cleaned up).
MR !1141 broke the way stop.pl is invoked when start.pl fails:
- start.pl changes the working directory to $testdir/$server before
attempting to start $server,
- commit 27ee629e6b causes the $testdir
variable in stop.pl to be determined using the $SYSTEMTESTTOP
environment variable, which is set to ".." by all tests.sh scripts,
- commit e227815af5 makes start.pl pass
$test (the test's name) rather than $testdir (the path to the test's
directory) to stop.pl when a given server fails to start.
Thus, when a server is restarted from within a tests.sh script and such
a restart fails, stop.pl attempts to look for the server directory in a
nonexistent location ($testdir/$server/../$test, i.e. $testdir/$test,
instead of $testdir/../$test). Fix the issue by changing the working
directory before stop.pl is invoked in the scenario described above.
Change to cmocka broken initialization of TZ environment. This time,
commit 1cf1254051 is not soon enough. Has
to be moved more forward, before any other tests. It library is not full
reinitialized on each test.
Remove another remnant of shared secret HMAC-MD5 support.
Explain that with currently recommended setups DNSKEY records are
inserted automatically, but you can still use $INCLUDE in other cases.
When sending an udp query (resquery_send) we first issue an asynchronous
isc_socket_connect and increment query->connects, then isc_socket_sendto2
and increment query->sends.
If we happen to cancel this query (fctx_cancelquery) we need to cancel
all operations we might have issued on this socket. If we are under very high
load the callback from isc_socket_connect (resquery_udpconnected) might have
not yet been fired. In this case we only cancel the CONNECT event on socket,
and ignore the SEND that's waiting there (as there is an `else if`).
Then we call dns_dispatch_removeresponse which kills the dispatcher socket
and calls isc_socket_close - but if system is under very high load, the send
we issued earlier might still not be complete - which triggers an assertion
because we're trying to close a socket that's still in use.
The fix is to always check if we have incomplete sends on the socket and cancel
them if we do.
On Unix systems, the CYGWIN environment variable is not set at all when
BIND system tests are run. If a named instance crashes on shutdown or
otherwise fails to clean up its pidfile and the CYGWIN environment
variable is not set, stop.pl will print an uninitialized value warning
on standard error. Prevent this by using defined().
ifconfig.sh depends on config.guess for platform guessing. It uses it to
choose between ifconfig or ip tools to configure interfaces. If
system-wide automake script is installed and local was not found, use
platform guess. It should work well on mostly any sane platform. Still
prefers local guess, but passes when if cannot find it.
When a zone is converted from NSEC to NSEC3, the private record at zone
apex indicating that NSEC3 chain creation is in progress may be removed
during a different (later) zone_nsec3chain() call than the one which
adds the NSEC3PARAM record. The "delzsk.example" zone check only waits
for the NSEC3PARAM record to start appearing in dig output while private
records at zone apex directly affect "rndc signing -list" output. This
may trigger false positives for the "autosign" system test as the output
of the "rndc signing -list" command used for checking ZSK deletion
progress may contain extra lines which are not accounted for. Ensure
the private record is removed from zone apex before triggering ZSK
deletion in the aforementioned check.
Also future-proof the ZSK deletion progress check by making it only look
at lines it should care about.
For checks querying a named instance with "dnssec-accept-expired yes;"
set, authoritative responses have a TTL of 300 seconds. Assuming empty
resolver cache, TTLs of RRsets in the ANSWER section of the first
response to a given query will always match their authoritative
counterparts. Also note that for a DNSSEC-validating named resolver,
validated RRsets replace any existing non-validated RRsets with the same
owner name and type, e.g. cached from responses received while resolving
CD=1 queries. Since TTL capping happens before a validated RRset is
inserted into the cache and RRSIG expiry time does not impose an upper
TTL bound when "dnssec-accept-expired yes;" is set and, as pointed out
above, the original TTLs of the relevant RRsets equal 300 seconds, the
RRsets in the ANSWER section of the responses to expiring.example/SOA
and expired.example/SOA queries sent with CD=0 should always be exactly
120 seconds, never a lower value. Make the relevant TTL checks stricter
to reflect that.
Always expecting a TTL of exactly 300 seconds for RRsets found in the
ADDITIONAL section of responses received for CD=1 queries sent during
TTL capping checks is too strict since these responses will contain
records cached from multiple DNS messages received during the resolution
process.
In responses to queries sent with CD=1, ns.expiring.example/A in the
ADDITIONAL section will come from a delegation returned by ns2 while the
ANSWER section will come from an authoritative answer returned by ns3.
If the queries to ns2 and ns3 happen at different Unix timestamps,
RRsets cached from the older response will have a different TTL by the
time they are returned to dig, triggering a false positive.
Allow a safety margin of 60 seconds for checks inspecting the ADDITIONAL
section of responses to queries sent with CD=1 to fix the issue. A
safety margin this large is likely overkill, but it is used nevertheless
for consistency with similar safety margins used in other TTL capping
checks.
Commit c032c54dda inadvertently changed
the DNS message section inspected by one of the TTL capping checks from
ADDITIONAL to ANSWER, introducing a discrepancy between that check's
description and its actual meaning. Revert to inspecting the ADDITIONAL
section in the aforementioned check.
Changes introduced by commit 6b8e4d6e69
were incomplete as not all time-sensitive checks were updated to match
revised "nta-lifetime" and "nta-recheck" values. Prevent rare false
positives by updating all NTA-related checks so that they work reliably
with "nta-lifetime 12s;" and "nta-recheck 9s;". Update comments as well
to prevent confusion.
Resolve "Add return code to allow dlz's allowzonexfr to fall back to to the view's allow-transfer setting."
Closes#803
See merge request isc-projects/bind9!1292
During "dlv" system test setup, the "sed" regex used for mangling the
DNSKEY RRset for the "druz" zone does not include the plus sign ("+"),
which may:
- cause the replacement to happen near the end of DNSKEY RDATA, which
can cause the latter to become an invalid Base64 string,
- prevent the replacement from being performed altogether.
Both cases prevent the "dlv" system test from behaving as intended and
may trigger false positives. Add the missing character to the
aforementioned regex to ensure the replacement is always performed on
bytes 10-25 of DNSKEY RDATA.
Use them in structs for various rdata types where they are missing.
This doesn't change the structs since we are replacing explicit
uint8_t field types with aliases for uint8_t.
Use dns_dsdigest_t in library function arguments.
Improve dnssec-cds with these more specific types.
Alphabetize options and synopsis; remove spurious -z from synopsis;
remove remnants of deprecated -k option; remove mention of long-gone
TSIG support; refer to -T KEY in options that are only relevant to
pre-RFC3755 DNSSEC; remove unnecessary -n ZONE from the example, and
add a -f KSK example.
During server reconfiguration, plugin instances set up for the old views
are unloaded very close to the end of the whole process, after new
plugin instances are set up. As the log message announcing plugin
unloading is emitted at the default "info" level, the user might be
misled into thinking that it is the new plugin instances that are being
unloaded for some reason, particularly because all other messages logged
at the "info" level around the same time inform about setting things up
rather than tearing them down. Since no distinction is currently made
between destroying a view due to reconfiguration and due to a shutdown
in progress, there is no easy way to vary the contents of the log
message depending on circumstances. Since this message is not a
particularly critical one, demote it to debug level to prevent
confusion.
5161. [func] named plugins are now installed into a separate
directory. Supplying a filename (a string without path
separators) in a "plugin" configuration stanza now
causes named to look for that plugin in that directory.
[GL #878]
When the "library" part of a "plugin" configuration stanza does not
contain at least one path separator, treat it as a filename and assume
it is a name of a shared object present in the named plugin installation
directory. Absolute and relative paths can still be used and will be
used verbatim. Get the full path to a plugin before attempting to
check/register it so that all relevant log messages include the same
plugin path (apart from the one logged when the full path cannot be
determined).
Implement a helper function which, given an input string:
- copies it verbatim if it contains at least one path separator,
- prepends the named plugin installation directory to it otherwise.
This function will allow configuration parsing code to conveniently
determine the full path to a plugin module given either a path or a
filename.
While other, simpler ways exist for making sure filenames passed to
dlopen() cause the latter to look for shared objects in a specific
directory, they are very platform-specific. Using full paths is thus
likely the most portable and reliable solution.
Also added unit tests for ns_plugin_expandpath() to ensure it behaves
as expected for absolute paths, relative paths, and filenames, for
various target buffer sizes.
(Note: plugins share a directory with named on Windows; there is no
default plugin path. Therefore the source path is copied to the
destination path with no modification.)
Installing named plugins into ${libdir} clutters the latter and is not
in line with common filesystem conventions. Instead, install named
plugins into a separate directory, ${libdir}/named.
The "check key refreshes are resumed after root servers become
available" check may trigger a false positive for the "mkeys" system
test if the second example/TXT query sent by dig is received by ns5 less
than a second after it receives a REFUSED response to the upstream query
it sends to ns1 in order to resolve the first example/TXT query sent by
dig. Since that REFUSED response from ns1 causes ns5 to return a
SERVFAIL answer to dig, example/TXT is added to the SERVFAIL cache,
which is enabled by default with a TTL of 1 second. This in turn may
cause ns5 to return a cached SERVFAIL response to the second example/TXT
query sent by dig, i.e. make ns5 not perform full query processing as
expected by the check.
Since the primary purpose of the check in question is to ensure that key
refreshes are resumed once initially unavailable root servers become
available, the optimal solution appears to be disabling SERVFAIL cache
for ns5 as doing that still allows the check to fulfill its purpose and
it is arguably more prudent than always sleeping for 1 second.
For consistency between all system tests, add missing setup.sh scripts
for tests which do not have one yet and ensure every setup.sh script
calls its respective clean.sh script.
Temporary files created by a given system test should be removed by its
clean.sh script, not its setup.sh script. Remove redundant "rm"
invocations from setup.sh scripts. Move required "rm" invocations from
setup.sh scripts to their corresponding clean.sh scripts.
If dots are not escaped in the "1.2.3.4" regular expressions used for
checking whether IP address 1.2.3.4 is present in the tested resolver's
answers, a COOKIE that matches such a regular expression will trigger a
false positive for the "resolver" system test. Properly escape dots in
the aforementioned regular expressions to prevent that from happening.
in query_respond_any(), the assumption had previously been made that it
was impossible to get past iterating the node with a return value of
ISC_R_NOMORE but not have found any records, unless we were searching
for RRSIG or SIG. however, it is possible for other types to exist but
be hidden, such as when the zone is transitioning from insecure to
secure and DNSSEC types are encountered, and this situation could
trigger an assertion. removed the assertion and reorganized the code.
Including $SYSTEMTESTTOP/conf.sh from a system test's clean.sh script is
not needed for anything while it causes an error message to be printed
out when "./configure" is run, as "make clean" is invoked at the end.
Remove the offending line to prevent the error from occurring.
For all system tests utilizing named instances, call clean.sh from each
test's setup.sh script in a consistent way to make sure running the same
system test multiple times using run.sh does not trigger false positives
caused by stale files created by previous runs.
Ideally we would just call clean.sh from run.sh, but that would break
some quirky system tests like "rpz" or "rpzrecurse" and being consistent
for the time being does not hurt.
In case when a zone fails to load because the file does not exist
or is malformed, we should not run the callback that updates the
zone database when the load is done. This is achieved by
unregistering the callbacks if at zone load end if the result
indicates something else than success.
As pointed out in !813 db_registered is sort of redundant. It is
set to `true` only in `dns_zone_rpz_enable_db()` right before the
`dns_rpz_dbupdate_callback()` callback is registered. It is only
required in that callback and it is the only place that the callback
is registered. Therefore there is no path that that `REQUIRE` can
fail.
The `db_registered` variable is only set to `false` in
`dns_rpz_new_zone`, so it is not like the variable is unset again
later.
The only other place where `db_registered` is checked is in
`rpz_detach()`. If `true`, it will call
`dns_db_updatenotify_unregister()`. However if that happens, the
`db_registered` is not set back to `false` thus this implies that
this may happen multiple times. If called a second time, most
likely the unregister function will return `ISC_R_NOTFOUND`, but
the return value is not checked anyway. So it can do without the
`db_registered` check.
This may happen when loading an RPZ failed and the code path skips
calling dns_db_endload(). The dns_rpz_zone_t object is still kept
marked as having registered db. So when this object is finally
destroyed in rpz_detach(), this code will incorrectly call
`dns_db_updatenotify_unregister()`:
if (rpz->db_registered)
dns_db_updatenotify_unregister(rpz->db,
dns_rpz_dbupdate_callback, rpz);
and trigger this assertion failure:
REQUIRE(db != NULL);
To fix this, only call `dns_db_updatenotify_unregister()` when
`rpz->db` is not NULL.
These tests check if a key with an unsupported algorithm in
managed-keys is ignored and when seeing an algorithm rollover to
an unsupported algorithm, the new key will be ignored too.
If `dns_dnssec_keyfromrdata` failed we don't need to call
`dst_key_free` because no `dstkey` was created. Doing so
nevertheless will result in an assertion failure.
This can happen if the key uses an unsupported algorithm.
removed the SDB databases in contrib/sdb as they hadn't been
maintained in some time, and were no longer able to link to named
without modification. also:
- cleaned up contrib/README, which still referred to contrib
subdirectores that were removed already, and linked to an obsolete URL.
- removed references to sdb in doc/misc/roadmap and doc/misc/sdb.
Illustrate the syntax for the policy options, with semicolons.
Explicitly mention the "default" policy.
Fix a few typos and remove some redundant wording.
When a mirror zone is verified, the 'ignore_kskflag' argument passed to
dns_zoneverify_dnssec() is set to false. This means that in order for
its verification to succeed, a mirror zone needs to have at least one
key with the SEP bit set configured as a trust anchor. This brings no
security benefit and prevents zones signed only using keys without the
SEP bit set from being mirrored, so change the value of the
'ignore_kskflag' argument passed to dns_zoneverify_dnssec() to true.
The "mirror" system test checks whether log messages announcing a mirror
zone coming into effect are emitted properly. However, the helper
functions responsible for waiting for zone transfers and zone loading to
complete do not wait for these exact log messages, but rather for other
ones preceding them, which introduces a possibility of false positives.
This problem cannot be addressed by just changing the log message to
look for because the test still needs to discern between transferring a
zone and loading a zone.
Add two new log messages at debug level 99 (which is what named
instances used in system tests are configured with) that are to be
emitted after the log messages announcing a mirror zone coming into
effect. Tweak the aforementioned helper functions to only return once
the log messages they originally looked for are followed by the newly
added log messages. This reliably prevents races when looking for
"mirror zone is now in use" log messages and also enables a workaround
previously put into place in the "mirror" system test to be reverted.
In the "mirror" system test, ns3 periodically sends trust anchor
telemetry queries to ns1 and ns2. It may thus happen that for some
non-recursive queries for names inside mirror zones which are not yet
loaded, ns3 will be able to synthesize a negative answer from the cached
records it obtained from trust anchor telemetry responses. In such
cases, NXDOMAIN responses will be sent with the root zone SOA in the
AUTHORITY section. Since the root zone used in the "mirror" system test
has the same serial number as ns2/verify.db.in and zone verification
checks look for the specified serial numbers anywhere in the answer, the
test could be broken if different zone names were used.
The +noauth dig option could be used to address this weakness, but that
would prevent entire responses from being stored for later inspection,
which in turn would hamper troubleshooting test failures. Instead, use
a different serial number for ns2/verify.db.in than for any other zone
used in the "mirror" system test and check the number of records in the
ANSWER section of each response.
Due to the way the "mirror" system test is set up, it is impossible for
the "verify-unsigned" and "verify-untrusted" zones to contain any serial
number other than the original one present in ns2/verify.db.in. Thus,
using presence of a different serial number in the SOA records of these
zones as an indicator of problems with mirror zone verification is
wrong. Look for the original zone serial number instead as that is the
one that will be returned by ns3 if one of the aforementioned zones is
successfully verified.
Add a warning about potential performance implications of configuring a
non-root zone as a mirror zone. Explain in more detail how each mirror
zone version is validated and how validation failures are handled. Move
the paragraphs describing how to set up IANA root zone mirroring higher
up, so that they can be more easily found by the reader. Explicitly
state that the "masters" option needs to be present for any mirror zone
which is not the root zone. Tweak the description of the interaction
between the "dnssec-validation" setting and root zone mirroring to make
it less ambiguous. Specify what the default "notify" setting is for
mirror zones.
* Alphabetize the option lists in the man page and help text
* Make the synopses more consistent between the man page and help
text, in particular the number of different modes
* Group mutually exclusive options in the man page synopses, and order
options so that it is more clear which are available in every mode
* Expand the DESCRIPTION to provide an overview of the output modes
and input modes
* Improve cross-references between options
* Leave RFC citations to the SEE ALSO section, and clarify which RFC
specifies what
* Clarify list of digest algorithms in dnssec-dsfromkey and dnssec-cds
man pages
Running "make install" in a separate job in the "test" phase of a CI
pipeline causes a lot of object files to be rebuilt due to the way
artifacts are passed between GitLab CI jobs (object files extracted from
the artifacts archive have older modification times than their
respective source files checked out using Git by the worker running the
"install" job). Test "make install" in one of the build jobs instead,
in order to prevent object rebuilding.
Using 'after_script' for this purpose was not an option because its
failures are ignored.
Duplicating the build script in two places would be error-prone in the
long run and thus was rejected as a solution. YAML anchors would also
not help in this case.
A "positive" test (`test -n "${RUN_MAKE_INSTALL}" && make install`)
would not work because:
- it would cause the build script to fail for any job not supposed to
run "make install",
- appending `|| :` to the shell pipeline would prevent "make install"
errors from causing a job failure.
Due to the above, a "negative" test is performed, so that:
- jobs not supposed to run "make install" succeed immediately,
- jobs supposed to run "make install" only succeed when "make install"
succeeds.
5153. [func] Zone transfer statistics (size, number of records, and
number of messages) are now logged for outgoing
transfers as well as incoming ones. [GL #513]
Ensure IXFR statistics are calculated correctly by dig and named, both
for incoming and outgoing transfers. Disable EDNS when using dig to
request an IXFR so that the same reference file can be used for testing
statistics calculated by both dig and named (dig uses EDNS by default
when sending transfer requests, which affects the number of bytes
transferred).
Ensure AXFR statistics are calculated correctly by dig and named, both
for incoming and outgoing transfers. Rather than employing a zone which
is already used in the "xfer" system test, create a new one whose AXFR
form spans multiple TCP messages. Disable EDNS when using dig to
request an AXFR so that the same reference file can be used for testing
statistics calculated by both dig and named (dig uses EDNS by default
when sending transfer requests, which affects the number of bytes
transferred).
Transfer statistics are currently only reported for incoming transfers,
even though they are equally useful for outgoing transfers. Define a
separate structure for keeping track of the number of messages, records,
and bytes sent during each outgoing transfer, along with the time each
outgoing transfer took. Repurpose the 'nmsg' field of the xfrout_ctx_t
structure for tracking the number of messages actually sent, ensuring it
is only increased after isc_socket_send() indicates success. Report the
statistics gathered when an outgoing transfer completes.
The 'nmsg' field of the xfrout_ctx_t structure is an integer, even
though it is only ever compared against 0 (for tracking whether the
QUESTION section has already been sent to the client). Use a boolean
instead as it is more appropriate and also enables 'nmsg' to be
repurposed.
the occluded-key test creates both a KEY and a DNSKEY. the second
call to dnssec-keygen calls dns_dnssec_findmatchingkeys(), which causes
a spurious warning to be printed when it sees the type KEY record.
this should be fixed in dnssec.c, but the meantime this change silences
the warning by reversing the order in which the keys are created.
- there was a memory leak when using negotiated TSIG keys.
- TKEY responses could only be signed when using a newly negotiated
key; if an existent matching TSIG was found in in the keyring it
would not be used.
- options that were flagged as obsolete or not implemented in 9.0.0
are now flagged as "ancient", and are a fatal error
- the ARM has been updated to remove these, along with other
obsolete descriptions of BIND 8 behavior
- the log message for obsolete options explicitly recommends removal
This adds a test for rndc dumpdb to ensure the correct "stale
comment" is printed. It also adds a test for non-stale data to
ensure no "stale comment" is printed for active RRsets.
In addition, the serve-stale tests are hardened with more accurate
grep calls.
This change makes rndc dumpdb correctly print the "; stale" line.
It also provides extra information on how long this data may still
be served to clients (in other words how long the stale RRset may
still be used).
up until now, message->tsigkey could only be set during parsing
of the request, but gss-tsig allows one to be created afterward.
this commit adds a new flag to the message structure, `new_tsigkey`,
which indicates that in this case it's okay for `dns_message_settsigkey()`
to be run on a message after parsing, without hitting any assertions due
to the lack of a TSIG in the request. this allows us to keep the current
restriction in place generally, but add an exception for TKEY processing.
it's probably better to just remove the restriction entirely (see next
commit).
The introduced grep call checks whether there was a
response that has an answer and an additional record.
There should be only one in the nsupdate output that is
for the TKEY response.
The introduced grep call checks whether there was a
response that has an answer and an additional record.
There should be only one in the nsupdate output that is
for the TKEY response.
Resolve "--enable-querytrace has negative performance impact - update the documentation to say this"
Closes#766
See merge request isc-projects/bind9!1367
also:
- rearranged things a little, adding a "dependencies" section
- removed the documentation of 'enable-threads'. (this part of
the change should not be backported.)
- Change 5023 (present in BIND 9.13.3+) removed BIND's internal
implementation of the getifaddrs() function which was required for
iterating network interfaces on Solaris 10 as that system does not
support that function natively.
- As of January 2019, FreeBSD 10.x is neither supported upstream nor
regularly tested by ISC, so move it from the list of regularly tested
platforms to the "Best effort" section.
- Debian 10, OpenBSD 6.3, and Fedora 29 have been released and are now
tested regularly.
use a lame server configuration to force SERVFAILs instead of killing ns2.
this prevents test failures that occurred due to a different behavior of
the netowrking stack in windows.
test the average delay between notifies instead of the minimum delay;
this helps avoid unnecessary test failures on systems with bursty
network performance.
dig retries a TCP query when a server closes the connection prematurely.
However, dig's exit code remains unaffected even if the second attempt
to get a response also fails with the same error for the same lookup,
which should not be the case. Ensure the exit code is updated
appropriately when a retry triggered by a TCP EOF condition fails.
- mishandling of trailing dots caused bad behavior with the
root zone or names like "example.com."
- fixing this exposed an error in dnssec-coverage caused the
wrong return value if there were KSK errors but no ZSK errors
- incidentally silenced the dnssec-keygen output in the coverage
system test
- dig command had the @ parameter in the wrong place
- private-dnskey and private-cdnskey are queried in a separate
loop, which strips 'private-' from the name to determine the qtype
In an attempt to ensure that:
- all important changes to repository contents are tested,
- pipelines are not automatically created for every single push,
- some flexibility is allowed for corner cases,
change pipeline triggering settings so that:
- full build & test pipelines are only automatically created for merge
requests and tags (both for creation and updates),
- pipelines for other repository changes (e.g. pushes to arbitrary
branches) can only be created manually, using GitLab's web
interface,
- merging a merge request only causes jobs pushing the updated ARM to
GitLab Pages to be run (as semi-linear Git history is enforced and
thus testing a MR is identical to testing the target branch
post-merge in terms of code),
- repository synchronization does not trigger duplicate pipelines in
projects which are set as mirroring targets.
Make sure all jobs are named using the following pattern:
[<job-type>:]<build-type>:<system>:<architecture>
where specifying <job-type> is optional for "precheck" and "build" jobs.
This should make it easier to quickly recognize:
- what kind of actions are performed by each job,
- which BIND build flavor is used by each job,
- which operating system image is used by each job.
There is no need to build BIND binaries before building docs and thus
the job building the current version of the ARM can be moved to the
build stage of CI.
Remove the following from .gitlab-ci.yml:
- unused variable definitions,
- unused Docker image definitions,
- commands which have no effect,
- sections which were commented out.
If we try to fetch a record from cache and need to look into
hints database we assume that the resolver is not primed and
start dns_resolver_prime(). Priming query is supposed to return
NSes for "." in ANSWER section and glue records for them in
ADDITIONAL section, so that we can fill that info in 'regular'
cache and not use hints db anymore.
However, if we're using a forwarder the priming query goes through
it, and if it's configured to return minimal answers we won't get
the addresses of root servers in ADDITIONAL section. Since the
only records for root servers we have are in hints database we'll
try to prime the resolver with every single query.
This patch adds a DNS_FETCHOPT_NOFORWARD flag which avoids using
forwarders if possible (that is if we have forward-first policy).
Using this flag on priming fetch fixes the problem as we get the
proper glue. With forward-only policy the problem is non-existent,
as we'll never ask for root server addresses because we'll never
have a need to query them.
Also added a test to confirm priming queries are not forwarded.
go back to regular resolution. When this happens the fetch timer is
already running, and we might end up in a situation where we we create
a fetch for qname-minimized query and after that the timer is triggered
and the query is retried (fctx_try) - which causes relaunching of
qname-minimization fetch - and since we already have a qmin fetch
for this fctx - assertion failure.
This fix stops the timer when doing qname minimization - qmin fetch
internal timer should take care of all the possible timeouts.
Log a message if a mirror zone becomes unusable for the resolver (most
usually due to the zone's expiration timer firing). Ensure that
verification failures do not cause a mirror zone to be unloaded
(instead, its last successfully verified version should be served if it
is available).
Log a message when a mirror zone is successfully loaded from disk and
subsequently verified.
This could have been implemented in a simpler manner, e.g. by modifying
an earlier code branch inside zone_postload() which checks whether the
zone already has a database attached and calls attachdb() if it does
not, but that would cause the resulting logs to indicate that a mirror
zone comes into effect before the "loaded serial ..." message is logged,
which would be confusing.
Tweak some existing sed commands used in the "mirror" system test to
ensure that separate test cases comprising it do not break each other.
Log a message when a mirror zone is successfully transferred and
verified, but only if no database for that zone was yet loaded at the
time the transfer was initiated.
This could have been implemented in a simpler manner, e.g. by modifying
zone_replacedb(), but (due to the calling order of the functions
involved in finalizing a zone transfer) that would cause the resulting
logs to suggest that a mirror zone comes into effect before its transfer
is finished, which would be confusing given the nature of mirror zones
and the fact that no message is logged upon successful mirror zone
verification.
Once the dns_zone_replacedb() call in axfr_finalize() is made, it
becomes impossible to determine whether the transferred zone had a
database attached before the transfer was started. Thus, that check is
instead performed when the transfer context is first created and the
result of this check is passed around in a field of the transfer context
structure. If it turns out to be desired, the relevant log message is
then emitted just before the transfer context is freed.
Taking this approach means that the log message added by this commit is
not timed precisely, i.e. mirror zone data may be used before this
message is logged. However, that can only be fixed by logging the
message inside zone_replacedb(), which causes arguably more dire issues
discussed above.
dns_zone_isloaded() is not used to double-check that transferred zone
data was correctly loaded since the 'shutdown_result' field of the zone
transfer context will not be set to ISC_R_SUCCESS unless axfr_finalize()
succeeds (and that in turn will not happen unless dns_zone_replacedb()
succeeds).
The handling of class and view arguments was broken, because the code
didn't realise that next_token() would overwrite the class name when
it parsed the view name. The code was trying to implement a syntax
like `refresh [[class] view]`, but it was documented to have a syntax
like `refresh [class [view]]`. The latter is consistent with other rndc
commands, so that is how I have fixed it.
Before:
$ rndc managed-keys refresh in rec
rndc: 'managed-keys' failed: unknown class/type
unknown class 'rec'
After:
$ rndc managed-keys refresh in rec
refreshing managed keys for 'rec'
There were missing newlines in the output from `rndc managed-keys
refresh` and `rndc managed-keys destroy`.
Before:
$ rndc managed-keys refresh
refreshing managed keys for 'rec'refreshing managed keys for 'auth'
After:
$ rndc managed-keys refresh
refreshing managed keys for 'rec'
refreshing managed keys for 'auth'
- the checkprivate function in the dnssec test set ret=0, erasing
results from previous tests and making the test appear to have passed
when it shouldn't have
- checkprivate needed a delay loop to ensure there was time for all
private signing records to be updated before the test
Resolve "Large NSEC3 responses cause failure in adding records to ncache and, eventually, FORMERR (instead of NXDOMAIN)"
Closes#804
See merge request isc-projects/bind9!1295
Resolve "Large NSEC3 responses cause failure in adding records to ncache and, eventually, FORMERR (instead of NXDOMAIN)"
Closes#804
See merge request isc-projects/bind9!1298
When a query times out after a socket is created and associated with a
given dig_query_t structure, calling isc_socket_cancel() causes
connect_done() to be run, which in turn takes care of all necessary
cleanups. However, certain errors (e.g. get_address() returning
ISC_R_FAMILYNOSUPPORT) may prevent a TCP socket from being created in
the first place. Since force_timeout() may be used in code handling
such errors, connect_timeout() needs to properly clean up a TCP query
which is not associated with any socket. Call clear_query() from
connect_timeout() after attempting to send a TCP query to the next
available server if the timed out query does not have a socket
associated with it, in order to prevent dig from hanging indefinitely
due to the dig_query_t structure not being detached from its parent
dig_lookup_t structure.
When a query times out and another server is available for querying
within the same lookup, the timeout handler - connect_timeout() - is
responsible for sending the query to the next server. Extract the
relevant part of connect_timeout() to a separate function in order to
improve code readability.
Before commit c2ec022f57, using the "-b"
command line switch for dig did not disable use of the other address
family than the one to which the address supplied to that option
belonged to. Thus, bind9_getaddresses() could e.g. prepare an
isc_sockaddr_t structure for an IPv6 address when an IPv4 address has
been passed to the "-b" command line option. To avoid attempting the
impossible (e.g. querying an IPv6 address from a socket bound to an IPv4
address), a certain code block in send_tcp_connect() checked whether the
address family of the server to be queried was the same as the address
family of the socket set up for sending that query; if there was a
mismatch, that particular server address was skipped.
Commit c2ec022f57 made
bind9_getaddresses() fail upon an address family mismatch between the
address the hostname passed to it resolved to and the address supplied
to the "-b" command line option. Such failures were fatal to dig back
then.
Commit 7f65860391 made
bind9_getaddresses() failures non-fatal, but also ensured that a
get_address() failure in send_tcp_connect() still causes the given query
address to be skipped (and also made such failures trigger an early
return from send_tcp_connect()).
Summing up, the code block handling address family mismatches in
send_tcp_connect() has been redundant since commit
c2ec022f57. Remove it.
5122. [bug] In a "forward first;" configuration, a forwarder
timeout did not prevent that forwarder from being
queried again after falling back to full recursive
resolution. [GL #315]
Since following a delegation resets most fetch context state, address
marks (FCTX_ADDRINFO_MARK) set inside lib/dns/resolver.c are not
preserved when a delegation is followed. This is fine for full
recursive resolution but when named is configured with "forward first;"
and one of the specified forwarders times out, triggering a fallback to
full recursive resolution, that forwarder should no longer be consulted
at each delegation point subsequently reached within a given fetch
context.
Add a new badnstype_t enum value, badns_forwarder, and use it to mark a
forwarder as bad when it times out in a "forward first;" configuration.
Since the bad server list is not cleaned when a fetch context follows a
delegation, this prevents a forwarder from being queried again after
falling back to full recursive resolution. Yet, as each fetch context
maintains its own list of bad servers, this change does not cause a
forwarder timeout to prevent that forwarder from being used by other
fetch contexts.
dnssec-signzone should sign a zonefile that contains a DNSKEY record
with an unsupported algorithm. Current behavior is that it will
fail, hitting a fatal error. The fix detects unsupported algorithms
and will not try to add it to the keylist.
Also when determining the maximum iterations for NSEC3, don't take
into account DNSKEY records in the zonefile with an unsupported
algorithm.
Resolve "current version of cygwin grep causes tests to fail when grepping for end of line character"
Closes#782
See merge request isc-projects/bind9!1230
Resolve "[ISC-support #13767] NSEC3 typemap improperly includes DNSKEY RRset instead of ignoring it as out-of-zone"
Closes#742
See merge request isc-projects/bind9!1231
Add a new libisccfg function, cfg_pluginlist_foreach(), which allows an
arbitrary callback to be invoked for every "plugin" stanza present in a
configuration object. Use this function for both loading plugins and
checking their configuration in order to reduce duplication of
configuration processing code present in bin/named/server.c and
lib/bind9/check.c.
- "hook" is now used only for hook points and hook actions
- the "hook" statement in named.conf is now "plugin"
- ns_module and ns_modlist are now ns_plugin and ns_plugins
- ns_module_load is renamed ns_plugin_register
- the mandatory functions in plugin modules (hook_register,
hook_check, hook_version, hook_destroy) have been renamed
- use a per-view module list instead of global hook_modules
- create an 'instance' pointer when registering modules, store it in
the module structure, and use it as action_data when calling
hook functions - this enables multiple module instances to be set
up in parallel
- also some nomenclature changes and cleanup
- added some hook points that will be needed for a dns64 module later
- moved some code from the beginning of query_respond() to
the end of query_prepresponse(); this has no effect on functionality
but means we can have a hook point at the top of query_respond(),
which seems nicer
- compressed duplicated code into query_zerottl_refetch() function
- added a qctx->answered flag so that a module can prevent
query_addrrset() from being called from query_respond() when
it's already been called from the module.
- this is necessary because adding the same hook to multiple views
causes the ISC_LIST link value to become inconsistent; it isn't
noticeable when only one hook action is ever registered at a
given hook point, but it will break things when there are two.
- eliminate qctx->hookdata and client->hookflags.
- use a memory pool to allocate data blobs in the filter-aaaa module,
and associate them with the client address in a hash table
- instead of detaching the client in query_done(), mark it for deletion
and then call ns_client_detach() from qctx_destroy(); this ensures
that it will still exist when the QCTX_DESTROYED hook point is
reached.
- use a get_hooktab() function to determine the hook table.
- PROCESS_HOOK now jumps to a cleanup tag on failure
- add PROCESS_ALL_HOOKS in query.c, to run all hook functions at
a specified hook point without stopping. this is to be used for
intiialization and destruction functions that must run in every
module.
- 'result' is set in PROCESS_HOOK only when a hook function
interrupts processing.
- revised terminology: a "callback" is now a "hook action"
- remove unused NS_PROCESS_HOOK and NS_PROCESS_HOOK_VOID macros.
- added a 'hookdata' array to qctx to store pointers to up to
16 blobs of data which are allocated by modules as needed.
each module is assigned an ID number as it's loaded, and this
is the index into the hook data array. this is to be used for
holding persistent state between calls to a hook module for a
specific query.
- instead of using qctx->filter_aaaa, we now use qctx->hookdata.
(this was the last piece of filter-aaaa specific code outside the
module.)
- added hook points for qctx initialization and destruction. we get
a filter-aaaa data pointer from the mempool when initializing and
store it in the qctx->hookdata table; return to to the mempool
when destroying the qctx.
- link the view to the qctx so that detaching the client doesn't cause
hooks to fail
- added a qctx_destroy() function which must be called after qctx_init;
this calls the QCTX_DESTROY hook and detaches the view
- general cleanup and comments
- make some cfg-parsing functions global so they can be run
from filter-aaaa.so
- add filter-aaaa options to the hook module's parser
- mark filter-aaaa options in named.conf as obsolete, remove
from named and checkconf, and update the filter-aaaa test not to
use checkconf anymore
- remove filter-aaaa-related struct members from dns_view
- allow multiple "hook" statements at global or view level
- add "optional bracketed text" type for optional parameter list
- load hook module from specified path rather than hardcoded path
- add a hooktable pointer (and a callback for freeing it) to the
view structure
- change the hooktable functions so they no longer update ns__hook_table
by default, and modify PROCESS_HOOK so it uses the view hooktable, if
set, rather than ns__hook_table. (ns__hook_table is retained for
use by unit tests.)
- update the filter-aaaa system test to load filter-aaaa.so
- add a prereq script to check for dlopen support before running
the filter-aaaa system test
not yet done:
- configuration parameters are not being passed to the filter-aaaa
module; the filter-aaaa ACL and filter-aaaa-on-{v4,v6} settings are
still stored in dns_view
- temporary kluge! in this version, for testing purposes,
named always searches for a filter-aaaa module at /tmp/filter-aaaa.so.
this enables the filter-aaaa system test to run even though the
code to configure hooks in named.conf hasn't been written yet.
- filter-aaaa-on-v4, filter-aaaa-on-v6 and the filter-aaaa ACL are
still configured in the view as they were before, not in the hook.
- these formerly static helper functions have been moved into client.c
and made external so that they can be used in hook modules as well as
internally in libns: query_newrdataset, query_putrdataset,
query_newnamebuf, query_newname, query_getnamebuf, query_keepname,
query_releasename, query_newdbversion, query_findversion
- made query_recurse() and query_done() into public functions
ns_query_recurse() and ns_query_done() so they can be called from
modules.
- the goal of this change is for AAAA filtering to be fully contained
in the query logic, and implemented at discrete points that can be
replaced with hook callouts later on.
- the new code may be slightly less efficient than the old filter-aaaa
implementation, but maximum efficiency was never a priority for AAAA
filtering anyway.
- we now use the rdataset RENDERED attribute to indicate that an AAAA
rdataset should not be included when rendering the message. (this
flag was originally meant to indicate that an rdataset has already
been rendered and should not be repeated, but it can also be used to
prevent rendering in the first place.)
- the DNS_MESSAGERENDER_FILTER_AAAA, NS_CLIENTATTR_FILTER_AAAA,
and DNS_RDATASETGLUE_FILTERAAAA flags are all now unnecessary and
have been removed.
- the purpose of this change is allow for more well-defined hook points
to be available in the query processing logic. some functions that
formerly didn't have access to 'qctx' do now; this is needed because
'qctx' is what gets passed when calling a hook function.
- query_addrdataset() has been broken up into three separate functions
since it used to do three unrelated things, and what was formerly
query_addadditional() has been renamed query_additional_cb() for
clarity.
- client->filter_aaaa is now qctx->filter_aaaa. (later, it will be moved
into opaque storage in the qctx, for use by the filter-aaaa module.)
- cleaned up style and braces
- move hooks.h to public include directory
- ns_hooktable_init() initializes a hook table. if NULL is passed in, it
initializes the global hook table
- ns_hooktable_save() saves a pointer to the current global hook table.
- ns_hooktable_reset() replaces the global hook table with different
one
- ns_hook_add() adds hooks at specified hook points in a hook table (or
the global hook table if the specified table is NULL)
- load and unload functions support dlopen() of hook modules (this is
adapted from dyndb and not yet functional)
- began adding new hook points to query.c
We can remove this, because it is used in `strtodsdigest` but that
already no longer covers the algorithm name "GOST".
There is one more GOST reference in `bin/python/isc/checkds.py.in`
but that is used for presentation format and probably should stay.
(cherry picked from commit 57d44fbc628d3c7dafdd545f6b83dbdcdc39a986)
In the test the quota is set to 400, and softquota to 90%*400=360.
We first attach to quota, and then if we're above softquota we
drop the oldest client. With new socket code and taskmgr it's
parallel enough to create a race between multiple instances doing
'attach to quota' and then 'drop oldest client' - making number
of clients go over softquota. It's not a problem in real life, as
it's just soft quota.
If you have a catalog zone containing 10.in-addr.arpa and an
explicitly-configured version which overrides the catz version,
`named` used to log:
catz: error "success" while trying to add zone "10.in-addr.arpa"
After this patch it logs:
catz: zone "10.in-addr.arpa" is overridden by explicitly configured zone
Apply various fixes and tweaks to Python configuration logic implemented
in the "configure" script:
- Prevent PYTHON_INSTALL_DIR, which holds the value passed to the
--with-python-install-dir option, from being set to "unspec" by
default as this breaks installing Python modules when the
--with-python-install-dir option is not used.
- Make the --with-python-install-dir option also work when the Python
interpreter is specified explicitly (using --with-python=<...>).
- Remove dnspython dependency which was erroneously introduced in
commit 31b0dc1f20: no installed Python
module depends on dnspython, it is only used in system tests, for
which dedicated scripts exist that check whether dnspython is
available and act accordingly.
- Improve contents and placement of error messages.
- Reduce duplication of code checking Python dependencies.
- Use Autoconf macros AS_CASE() and AS_IF() instead of plain shell
code.
- Update comments. Capitalize the word "Python" when referring to the
language itself rather than a specific executable.
The stock toolchain available on CentOS 6 for i386 is unable to use the
_mm_pause() intrinsic. Fix by using "rep; nop" assembly instructions on
that platform instead.
If we know that we'll have a task pool doing specific thing it's better
to use this knowledge and bind tasks to task queues, this behaves better
than randomly choosing the task queue.
- use bound resolver tasks - we have a pool of tasks doing resolutions,
we can spread the load evenly using isc_task_create_bound
- quantum set universally to 25
Mutexes are slower if they're in the same cache line. Since
fd's come in herds, and usually our listen sockets will have nearby
fd numbers, we mangle fdlocks so that the locks are further away.
Sometimes it is useful to set a 'floor' on the TTL for records
to be cached. Some sites like to use ridiculously low TTLs for
some reason, and that often is not compatible with slow links.
Signed-off-by: Michael Milligan <milli@acmeps.com>
Signed-off-by: LaMont Jones <lamont@debian.org>
5089. [bug] Restore localhost fallback in dig and host which is
used when no nameserver addresses present in
/etc/resolv.conf are usable due to the requested
address family restrictions. [GL #433]
In BIND 9.11 and earlier, dig and similar tools used liblwres for
parsing /etc/resolv.conf. After getting a list of servers from
liblwres, a tool would check the address family of each server found and
reject those unusable. When the resulting list of usable servers was
empty, localhost addresses were queried as a fallback.
When liblwres was removed in BIND 9.12, dig and similar tools were
updated to parse /etc/resolv.conf using libirs instead. As part of that
process, the localhost fallback was removed from bin/dig/dighost.c since
the localhost fallback built into libirs was deemed to be sufficient.
However, libirs only falls back to localhost if it does not find any
name servers at all; if it does find any valid nameserver entry in
/etc/resolv.conf, it just returns it to the caller because it is
oblivious to whether the caller supports IPv4 and/or IPv6 or not. The
code in bin/dig/dighost.c subsequently filters the returned list of
servers in get_server_list() according to the requested address family
restrictions. This may result in none of the addresses returned by
libirs being usable, in which case a tool will attempt to work with an
empty server list, causing a hang and subsequently a crash upon user
interruption.
Restore the localhost fallback in bin/dig/dighost.c to prevent the
aforementioned hangs and crashes and ensure recent BIND versions behave
identically to the older ones in the circumstances described above.
If a tool using the routines defined in bin/dig/dighost.c is sent an
interruption signal around the time a connection timeout is scheduled to
fire, connect_timeout() may be executed after destroy_libs() detaches
from the global task (setting 'global_task' to NULL), which results in a
crash upon a UDP retry due to bringup_timer() attempting to create a
timer with 'task' set to NULL. Fix by preventing connect_timeout() from
attempting a retry when shutdown is in progress.
Resolve "Follow-up from "Redefine ISC's int and boolean types to use <stdint.h> and <stdbool.h> types""
Closes#449
See merge request isc-projects/bind9!998
While implementing the new unit testing framework cmocka, it was found that the
BIND 9 code doesn't compile when assertions are disabled or replaced with any
function (such as mock_assert() from cmocka unit testing framework) that's not
directly recognized as assertion by the compiler.
This made the compiler to complain about blocks of code that was recognized as
unreachable before, but now it isn't.
The changes in this commit include:
* assigns default values to couple of local variables,
* moves some return statements around INSIST assertions,
* adds __builtin_unreachable(); annotations after some INSIST assertions,
* fixes one broken assertion (= instead of ==)
Tell the user explicitly about their mistakes:
* Unknown options, e.g. -list instead of -dump
or -delete instead of -remove.
* Unknown view names.
* Excess arguments.
Include the view name in `rndc nta -dump` output, for consistency with
the NTA add and remove actions.
When removing an NTA from all views, do not abort with an error if the
NTA was not found in one of the views.
At the beginning of qname minimization we get fctx->finds filled with what's
in the cache at this point, in worst case root servers. After doing full
run querying for NSes at different levels we need to clean it and refill
it with proper values from cache.
Ensure that serve-stale works as expected when returning stale answers
is enabled, the authoritative server does not respond, and there is no
cached answer available.
Remove the following functions in order to simplify socket code:
- isc_socket_recvv()
- isc_socket_sendtov()
- isc_socket_sendtov2()
- isc_socket_sendv()
Remove the following functions in order to simplify socket code:
- isc_socket_recvv()
- isc_socket_sendtov()
- isc_socket_sendtov2()
- isc_socket_sendv()
Refactor diagnostic tools code to no longer use:
- isc_socket_recvv()
- isc_socket_sendtov2()
- isc_socket_sendv()
as these functions will be removed shortly.
While isc_buffer_copyregion() calls isc_buffer_reserve() to ensure the
target buffer will have enough available space to append the contents of
the source region to it, the variables used for subsequently checking
available space are not updated accordingly after that call. This
prevents isc_buffer_copyregion() from working as expected for
auto-reallocated buffers: ISC_R_NOSPACE will be returned if enough space
is not already available in the target buffer before it is reallocated.
Fix by calling isc_buffer_used() and isc_buffer_availablelength()
directly instead of assigning their return values to local variables.
Add some basic checks for isc_buffer_copyregion() to ensure it behaves
as expected for both fixed-size buffers and buffers which can be
automatically reallocated. Adjust the list of headers included by
lib/isc/tests/buffer_test.c so that it matches what that test program
really uses.
If an RPZ zone is to be freed during an update, canceling the
update_quantum() event is not enough because the resources released when
an update completes also need to be accounted for. Failure to do this
results in a hang upon shutdown. Fix by copying cleanup code from the
end of update_quantum() to rpz_detach().
If another RPZ update is pending when processing the previous one nears
completion and min-update-interval is set to 0, isc_timer_reset() gets
called with 'interval' set to 0, which triggers an assertion failure.
To prevent such a scenario from causing a crash, queue the update event
directly instead of asking the timer thread to do it.
Rationale: the nonce here is only used to make sure there is a low
probability of duplication, according to section B.2 of RFC7873.
It is only 32-bit, and even if an attacker knows the algorithm used
to generate nonces it won't, in any way, give him any platform to
attack the server as long as server secret used to sign the
(nonce, time) pair with HMAC-SHA1 is secure.
On the other hand, currently, each packet sent requires (unnecessarily)
a CS pseudo-random number which is ineffective.
The XSL stylesheet used by the web interface does not currently include
any element which would cause a list of zones configured in each view to
be displayed, making the "Zones" section of the web interface empty
unless some zone has been configured with "zone-statistics full;" and
queried. Since this can be confusing, modify the XSL stylesheet so that
a list of zones configured in each view is displayed in the web
interface.
XXXX. [func] Replace old message digest and hmac APIs with more
generic isc_md and isc_hmac APIs, and convert their
respective tests to cmocka. [GL #305]
XXXX. [func] A default list of primary servers for the root zone is
now built into named, allowing the "masters" statement
to be omitted when configuring an IANA root zone
mirror. [GL #564]
XXXX. [func] Attempts to use mirror zones with recursion disabled
are now considered a configuration error. [GL #564]
XXXX. [func] The only valid zone-level NOTIFY settings for mirror
zones are now "notify no;" and "notify explicit;".
[GL #564]
XXXX. [func] Mirror zones are now configured using "type mirror;"
rather than "mirror yes;". [GL #564]
To minimize the effort required to set up IANA root zone mirroring,
define a default master server list for the root zone and use it when
that zone is to be mirrored and no master server list was explicitly
specified. Contents of that list are taken from RFC 7706 and are
subject to change in future releases.
Since the static get_masters_def() function in bin/named/config.c does
exactly what named_zone_configure() in bin/named/zoneconf.c needs to do,
make the former non-static and use it in the latter to prevent code
duplication.
Since mirror zone data is treated as cache data for access control
purposes, configuring a mirror zone and disabling recursion at the same
time would effectively prevent mirror zone data from being used since
disabling recursion also disables cache access to all clients by
default. Even though this behavior can be inhibited by configuration,
mirror zones are a recursive resolver feature and thus recursion is now
required to use them.
Ignore the fact that certain configurations might still trick named into
assuming recursion is enabled when it effectively is not since this
change is not meant to put a hard policy in place but rather just to
prevent accidental mirror zone misuse.
Previous way of handling NOTIFY settings for mirror zones was a bit
tricky: any value of the "notify" option was accepted, but it was
subsequently overridden with dns_notifytype_explicit. Given the way
zone configuration is performed, this resulted in the following
behavior:
- if "notify yes;" was set explicitly at any configuration level or
inherited from default configuration, it was silently changed and so
only hosts specified in "also-notify", if any, were notified,
- if "notify no;" was set at any configuration level, it was
effectively honored since even though zone->notifytype was silently
set to dns_notifytype_explicit, the "also-notify" option was never
processed due to "notify no;" being set.
Effectively, this only allowed the hosts specified in "also-notify" to
be notified, when either "notify yes;" or "notify explicit;" was
explicitly set or inherited from default configuration.
Clean up handling of NOTIFY settings for mirror zones by:
- reporting a configuration error when anything else than "notify no;"
or "notify explicit;" is set for a mirror zone at the zone level,
- overriding inherited "notify yes;" setting with "notify explicit;"
for mirror zones,
- informing the user when the "notify" setting is overridden, unless
the setting in question was inherited from default configuration.
Use a zone's 'type' field instead of the value of its DNS_ZONEOPT_MIRROR
option for checking whether it is a mirror zone. This makes said zone
option and its associated helper function, dns_zone_mirror(), redundant,
so remove them. Remove a check specific to mirror zones from
named_zone_reusable() since another check in that function ensures that
changing a zone's type prevents it from being reused during
reconfiguration.
Rather than overloading dns_zone_slave and discerning between a slave
zone and a mirror zone using a zone option, define a separate enum
value, dns_zone_mirror, to be used exclusively by mirror zones. Update
code handling slave zones to ensure it also handles mirror zones where
applicable.
Add a new zone type, CFG_ZONE_MIRROR, to libisccfg, in order to limit
the list of options which are considered valid for mirror zones. Update
the relevant configuration checks.
Contrary to what the documentation states, the "server-addresses"
static-stub zone option does not accept custom port numbers. Fix the
configuration type used by the "server-addresses" option to ensure
documentation matches source code. Remove a check_zoneconf() test which
is unnecessary with this fix in place.
Commonly used network configuration tools write scoped IPv6 nameserver
addresses to /etc/resolv.conf. libirs only handles these when it is
compiled with -DIRS_HAVE_SIN6_SCOPE_ID, which is not the default, and
only handles numeric scopes, which is not what network configuration
tools typically use. This causes dig to be practically unable to handle
scoped IPv6 nameserver addresses in /etc/resolv.conf.
Fix the problem by:
- not requiring a custom compile-time flag to be set in order for
scoped IPv6 addresses to be processed by getaddrinfo(),
- parsing non-numeric scope identifiers using if_nametoindex(),
- setting the sin6_scope_id field in struct sockaddr_in6 structures
returned by getaddrinfo() even if the AI_CANONNAME flag is not set.
Commit ba91243542 causes the resolver to
respond to a client query with FORMERR when all upstream queries sent to
the servers authoritative for QNAME elicit FORMERR responses. This
happens because resolver code returns DNS_R_FORMERR in such a case and
dns_result_torcode() acts as a pass-through for all arguments which are
already a valid RCODE.
The correct RCODE to set in the response returned to the client in the
case described above is SERVFAIL. Make sure this happens by overriding
the RCODE in query_gotanswer(), on the grounds that any format errors in
the client query itself should be caught long before execution reaches
that point. This change should not reduce query error logging accuracy
as the resolver code itself reports the exact reason for returning a
DNS_R_FORMERR result using log_formerr().
* Add configure option --enable-fips-mode that detects and enables FIPS mode
* Add a function to enable FIPS mode and call it on crypto init
* Log an OpenSSL error when FIPS_mode_set() fails and exit
* Report FIPS mode status in a separate log message from named
Whenever master or one for the v9_* branches gets updated, the current
ARM should be published on GitLab Pages. Add a pipeline stage which
takes care of triggering GitLab Pages pipelines. Extend the lifetime of
artifact archives containing the ARM to prevent GitLab Pages pipelines
from failing due to artifacts being unavailable.
Add a CI job which generates the HTML version of the ARM and makes it
available for download. Since this is expected to be a quick process,
the new job is enabled for all pipelines.
- this enables memory to be allocated and freed in dyndb modules
when named is linked statically. when we standardize on libtool,
this should become unnecessary.
- also, simplified the isc_mem_create/createx API by removing
extra compatibility functions
In some cases, setting qctx->result to DNS_R_SERVFAIL causes the value
of a 'result' variable containing a more specific failure reason to be
effectively discarded. This may cause certain query error log messages
to lack specificity despite a more accurate problem cause being
determined during query processing.
In other cases, qctx->result is set to DNS_R_SERVFAIL even though a more
specific error (e.g. ISC_R_NOMEMORY) could be explicitly indicated.
Since the response message's RCODE is derived from qctx->result using
dns_result_torcode(), which handles a number of possible isc_result_t
values and returns SERVFAIL for anything not explicitly listed, it is
fine to set qctx->result to something more specific than DNS_R_SERVFAIL
(in fact, this is already being done in a few cases). Modify most
QUERY_ERROR() calls so that qctx->result is set to a more specific error
code when possible. Adjust query_error() so that statistics are still
calculated properly. Remove the RECURSE_ERROR() macro which was
introduced exactly because qctx->result could be set to DNS_R_SERVFAIL
instead of DNS_R_DUPLICATE or DNS_R_DROP, which need special handling.
Modify dns_sdlz_putrr() so that it returns DNS_R_SERVFAIL when a DLZ
driver returns invalid RDATA, in order to prevent setting RCODE to
FORMERR (which is what dns_result_torcode() translates e.g. DNS_R_SYNTAX
to) while responding authoritatively.
When something goes wrong while recursing for an answer to a query,
query_gotanswer() sets a flag (qctx->want_stale) in the query context.
query_done() is subsequently called and it can either set up a stale
response lookup (if serve-stale is enabled) or conclude that a SERVFAIL
response should be sent. This may cause confusion when looking at query
error logs since the QUERY_ERROR() line responsible for setting the
response's RCODE to SERVFAIL is not in a catch-all branch of a switch
statement inside query_gotanswer() (like it is for authoritative
responses) but rather in a code branch which appears to have something
to do with serve-stale, even when the latter is not enabled.
Extract the part of query_done() responsible for checking serve-stale
configuration and optionally setting up a stale response lookup into a
separate function, query_usestale(), shifting the responsibility for
setting the response's RCODE to SERVFAIL to the same QUERY_ERROR() line
in query_gotanswer() which is evaluated for authoritative responses.
Manual page of host contained instructions to disable IDN processing
when it was built with libidn2. When refactoring IDN support however,
support for disabling IDN in host and nslookup was lost. Use also
environment variable and document it for nslookup, host and dig.
- tried to improve struct variable alignment
- ignore braces on function definitions so we can keep the existing
BIND style; braces can be on a new line or not
to update file, run: uncrustify --replace -c $TOP/.uncrustify.cfg <filename>
- note that if this is in the user's $HOME dir, it's the default
uncrustify config path name. this can be overridden with
'uncrustify -c filenaeme' or the UNCRUSTIFY_CONFIG environment
variable
- if dnssec-enable is no, then dnssec-validation now also defaults to
no. if dnssec-enable is yes, dnssec-validation defaults to auto or yes
depending on --disable-auto-validation.
- correct the doc
Revert parts of commit c3b8130fe8 which
inadvertently broke creating and validating EdDSA signatures:
1. EVP_DigestSignInit() returns 1 on success.
2. EdDSA does not support streaming (EVP_Digest*Update() followed by
EVP_Digest*Final()), only one shot operations.
Resolve "Make the chained delegations in reclimit behave like they would in a regular name server."
Closes#578
See merge request isc-projects/bind9!840
- update_log() is called to log update errors, but if those errors
occur before the zone is set (for example, when returning NOTAUTH)
it returns without logging anything.
Resolve "allow-recursion-on and allow-query-cache-on should default to each other if only one is set"
Closes#319
See merge request isc-projects/bind9!556
The race condition is the timer elapses before isc__timer_create()
returns the pointer to the caller. Assigning the return pointer before
enabling the timer will fix it.
Zone loading happens in a different task (zone->loadtask) than other
zone actions (zone->task). Thus, when zone_postload() is called in the
context of zone->loadtask, it may cause zone maintenance to be queued in
zone->task and another thread can then execute zone_maintenance() before
zone_postload() gets a chance to finish its work in the first thread.
This would not be a problem if zone_maintenance() accounted for this
possibility by locking the zone before checking the state of its
DNS_ZONEFLG_LOADPENDING flag. However, the zone is currently not locked
before the state of that flag is checked, which may prevent zone
maintenance from happening despite zone_postload() scheduling it. Fix
by locking the zone in zone_maintenance() before checking the state of
the zone's DNS_ZONEFLG_LOADPENDING flag.
- the text returned by "rndc nta" when adding NTAs to multiple views
was incorrectly terminated after the first line, so users only saw
on NTA added unless they checked the logs.
5031. [cleanup] Various defines in platform.h has been either dropped
if always or never triggered on supported platforms
or replaced with config.h equivalents if the defines
didn't have any impact on public headers. Workarounds
for LinuxThreads have been removed because NPTL is
available since Linux kernel 2.6.0.
5023. [func] Replace custom assembly for atomic operations with
atomic support from the compiler. The code will now use
C11 stdatomic, or __atomic, or __sync builtins with GCC
or Clang compilers, and Interlocked functions with MSVC.
[GL #10]
This properly orders clearing the freed pointer and calling isc_refcount_destroy
as early as possible to have ability to put proper memory barrier when cleaning
up reference counting.
Remove all kind of legacy compatibility layers (including IPv6, networking and functions defined by C99 or POSIX.1)
Closes#192
See merge request isc-projects/bind9!668
5016. [cleanup] Remove wrappers that try to fix broken or incomplete
implementations of IPv6, pthreads and other core
functionality required and used by BIND. [GL #192]
The "exitcode" variable is set to 9 if a TCP connection fails, but is
not reset to 0 if a subsequent TCP connection succeeds. This causes dig
to return a non-zero exit code if it succeeds in getting a TCP response
after a retry. Fix by resetting "exitcode" to 0 if connect_done()
receives an event with the "result" field set to ISC_R_SUCCESS.
Force ns3 to use a constant source address (10.53.0.3) when sending
transfer requests for the "initially-unavailable" zone to prevent
failures of transfers not triggered by bin/tests/system/mirror/tests.sh
from causing fallback to using a source address for which transfers of
that zone are refused throughout the entire "mirror" system test since
that might yield false positives.
Add a missing semicolon to prevent "make test" run from the top-level
directory from failing even when all system and unit tests succeed due
to "(cd fuzz && ${MAKE} check)" returning a non-zero exit code.
For inline-signed zones, the value of "ixfr-from-differences" is
hardcoded to:
- "yes" for the raw version of the zone,
- "no" for the signed version of the zone.
In other words, any user-provided "ixfr-from-differences" setting is
effectively ignored for an inline-signed zone. Ensure the user is aware
of that by adding a note to the ARM and logging a message when an
"ixfr-from-differences" option is found at the zone level.
A short time window exists between logging the addition of an NSEC3PARAM
record to a zone and committing it to the current version of the zone
database. If a query arrives during such a time window, an unsigned
response will be returned. One of the checks in the "inline" system
test requires NSEC3 records to be present in an answer - that check
would fail in the case described above. Use rndc instead of log
watching for checking whether zone signing and NSEC3 chain modifications
are complete in order to prevent intermittent "inline" system test
failures.
While "rndc reload" causes dns_zone_asyncload() to be called for the
signed version of an inline-signed zone, the subsequent zone_load() call
causes the raw version to be reloaded from storage. This means that
DNS_ZONEFLG_LOADPENDING gets set for the signed version of the zone by
dns_zone_asyncload() before the reload is attempted, but zone_postload()
is only called for the raw version and thus DNS_ZONEFLG_LOADPENDING is
cleared for the raw version, but not for the signed version. This in
turn prevents zone maintenance from happening for the signed version of
the zone.
Until commit 29b7efdd9f, this problem
remained dormant because DNS_ZONEFLG_LOADPENDING was previously
immediately, unconditionally cleared after zone loading was started
(whereas it should only be cleared when zone loading is finished or an
error occurs). This behavior caused other issues [1] and thus had to be
changed.
Fix reloading inline-signed zones by clearing DNS_ZONEFLG_LOADPENDING
for the signed version of the zone once the raw version reload
completes. Take care not to clear it prematurely during initial zone
load. Also make sure that DNS_ZONEFLG_LOADPENDING gets cleared when
zone_postload() encounters an error or returns early, to prevent other
scenarios from resulting in the same problem. Add comments aiming to
help explain code flow.
[1] see RT #47076
When an inline-signed zone is loaded, the master file for its signed
version is loaded and then a rollforward of the journal for the signed
version of the zone is performed. If DNS_JOURNALOPT_RESIGN is not set
during the latter phase, signatures loaded from the journal for the
signed version of the zone will not be scheduled for refresh. Fix the
conditional expression determining which flags should be used for the
dns_journal_rollforward() call so that DNS_JOURNALOPT_RESIGN is set when
zone_postload() is called for the signed version of an inline-signed
zone.
Extend bin/tests/system/stop.pl so that it can use "rndc halt" instead
of "rndc stop" as the former allows master file flushing upon shutdown
to be suppressed.
As part of resquery_response() refactoring [1], a goto statement was
replaced [2] with a call to a new function - originally called
rctx_delegation(), now folded into rctx_answer_none() - extracted from
existing code. However, one call site of that refactored function does
not reset the "result" variable, causing a referral with a non-empty
ANSWER section to be inadvertently treated as an error, which prevents
resolution of names reliant on servers sending such responses. Fix by
resetting the "result" variable to ISC_R_SUCCESS when a response
containing a non-empty ANSWER section can be treated as a delegation.
[1] see RT #45362
[2] see commit e1380a16741a3b4a57e54d7a9ce09dd12691522f
- added new 'validate-except' option, which configures an NTA with
expiry of 0xffffffff. NTAs with that value in the expiry field do not
expire, are are not written out when saving the NTA table and are not
dumped by rndc secroots
dst__openssl_toresult3() first calls toresult() and subsequently uses
ERR_get_error_line_data() in a loop. Given this, it is a mistake to use
ERR_get_error() in toresult() because it causes the retrieved error to
be removed from the OpenSSL error queue, thus preventing it from being
retrieved by the subsequent ERR_get_error_line_data() calls. Fix by
using ERR_peek_error() instead of ERR_get_error() in toresult().
When two or more absolute, two-label names are added to a completely
empty RBT, an extra, empty node for the root name will be created due to
node splitting. check_tree() expects that, but the extra node will not
be created when just one name is added to a completely empty RBT. This
problem could be handled inside check_tree(), but that would introduce
unnecessary complexity into it since adding a single name will result in
a different node count for a completely empty RBT (node count will be 1)
and an RBT containing only an empty node for the root name, created due
to prior node splitting (node count will be 2). Thus, first explicitly
create a node for the root name to prevent rare check_tree() failures
caused by a single name being added in the first iteration of the
insert/remove loop.
Each zone used in the "inline" system test contains a few dozen records.
Over a dozen of these zones are used in the test. Most records present
in these zones are not subsequently used in the test itself, but all of
them need to be signed by the named instances launched by the test,
which puts quite a bit of strain on lower-end machines, leading to
intermittent failures of the "inline" system test. Remove all redundant
records from the zones used in the "inline" system test in order to
stabilize it.
If "rndc signing -nsec3param ..." is ran for a zone which has not yet
been loaded or transferred (i.e. its "db" field is NULL), it will be
silently ignored by named despite rndc logging an "nsec3param request
queued" message, which is misleading. Prevent this by keeping a
per-zone queue of NSEC3PARAM change requests which arrive before a zone
is loaded or transferred and processing that queue once the raw version
of an inline-signed zone becomes available.
5006. [cleanup] Code preparing a delegation response was extracted from
query_delegation() and query_zone_delegation() into a
separate function in order to decrease code
duplication. [GL #431]
Changes introduced by the previous two commits make the parts of
query_delegation() and query_zone_delegation() which prepare a
delegation response functionally equivalent. Extract this code into a
separate function, query_prepare_delegation_response(), and then call
the latter from both query_delegation() and query_zone_delegation() in
order to reduce code duplication. Add a comment describing the purpose
of the extracted code. Fix coding style issues.
When query processing hits a delegation from a locally configured zone,
an attempt may be made to look for a better answer in the cache. In
such a case, the zone-sourced delegation data is set aside and the
lookup is retried using the cache database. When that lookup is
completed, a decision is made whether the answer found in the cache is
better than the answer found in the zone.
Currently, if the zone-sourced answer turns out to be better than the
one found in the cache:
- qctx->zdb is not restored into qctx->db,
- qctx->node, holding the zone database node found, is not even saved.
Thus, in such a case both qctx->db and qctx->node will point at cache
data. This is not an issue for BIND versions which do not support
mirror zones because in these versions non-recursive queries always
cause the zone-sourced delegation to be returned and thus the
non-recursive part of query_delegation() is never reached if the
delegation is coming from a zone. With mirror zones, however,
non-recursive queries may cause cache lookups even after a zone
delegation is found. Leaving qctx->db assigned to the cache database
when query_delegation() determines that the zone-sourced delegation is
the best answer to the client's query prevents DS records from being
added to delegations coming from mirror zones. Fix this issue by
keeping the zone database and zone node in qctx while the cache is
searched for an answer and then restoring them into qctx->db and
qctx->node, respectively, if the zone-sourced delegation turns out to be
the best answer. Since this change means that qctx->zdb cannot be used
as the glue database any more as it will be reset to NULL by RESTORE(),
ensure that qctx->db is not a cache database before attaching it to
qctx->client->query.gluedb.
Furthermore, current code contains a conditional statement which
prevents a mirror zone from being used as a source of glue records.
Said statement was added to prevent assertion failures caused by
attempting to use a zone database's glue cache for finding glue for an
NS RRset coming from a cache database. However, that check is overly
strict since it completely prevents glue from being added to delegations
coming from mirror zones. With the changes described above in place,
the scenario this check was preventing can no longer happen, so remove
the aforementioned check.
If qctx->zdb is not NULL, qctx->zfname will also not be NULL;
qctx->zsigrdataset may be NULL in such a case, but query_putrdataset()
handles pointers to NULL pointers gracefully. Remove redundant
conditional expressions to make the cleanup code in query_freedata()
match the corresponding sequences of SAVE() / RESTORE() macros more
closely.
Make will choose modified manual from build directory or original from source
directory automagically. Take advantage of install tool feature.
Install all files in single command instead of iterating on each of them.
Resolve "isc_buffer_printf() grows buffer without autorealloc being set + nit in isc_buffer_realloc()"
Closes#443
See merge request isc-projects/bind9!559
Fix ax_check_openssl to accept "--with-openssl" or "--with-openssl=yes",
and improve it to modern autotools standard
See merge request isc-projects/bind9!550
dns_view_zonecut() may associate the dns_rdataset_t structure passed to
it even if it returns a result different then ISC_R_SUCCESS. Not
handling this properly may cause a reference leak. Fix by ensuring
'nameservers' is cleaned up in all relevant failure modes.
Prevent ans2.pl from responding authoritatively for any name at or below
example.net.
Make ans3.pl properly answer example.net/NS queries. Use string
comparisons instead of regular expressions where possible.
lo0 and lo0:0 are the same interface on Solaris. Make sure
bin/tests/system/ifconfig.sh does not touch lo0:0 in order to prevent it
from changing the address of the loopback interface on Solaris.
Modify .gitlab-ci.yml so that every CI pipeline also builds and tests
BIND on CentOS versions 6 and 7. Use --disable-warn-error on CentOS 6
since it uses GCC 4.4.7 which suffers from bugs causing bogus warnings
to be generated, e.g.:
sigs_test.c: In function 'compare_tuples':
sigs_test.c:75: warning: declaration of 'index' shadows a global declaration
/usr/include/string.h:489: warning: shadowed declaration is here
sigs_test.c: In function 'updatesigs_test':
sigs_test.c:193: warning: declaration of 'index' shadows a global declaration
/usr/include/string.h:489: warning: shadowed declaration is here
The "git status" command in Git versions before 1.7.2 does not support
the "--ignored" option. Prevent spamming the console when running
system tests from a Git repository on a host with an ancient Git version
installed.
The output of certain "dig +idnout" invocations may be locale-dependent.
Remove the "dig +idnout" subtest from the "digdelv" system test as IDN
support is already thoroughly tested by the "idna" system test.
Every prereq.sh script must include bin/tests/system/conf.sh, otherwise
if some prerequisite is not met, errors about echo_i not being found
will be printed instead of actual error messages.
The Docker images used for CI install ATF to /usr, not /usr/local.
Update the ./configure invocation in .gitlab-ci.yml accordingly in order
to prevent confusion.
Depending on tool versions used, "autoreconf -i" may not update all
Autoconf-generated files, which in turn may result in build errors.
Make autogen.sh call autoreconf with the "-f" command line argument to
ensure all Autoconf-generated files are updated when autogen.sh is run.
Trying to resolve a trust anchor telemetry query for a locally served
zone does not cause upstream queries to be sent as the response is
determined just by consulting local data. Work around this issue by
calling dns_view_findzonecut() first in order to determine the NS RRset
for a given domain name and then passing the zone cut found to
dns_resolver_createfetch().
Note that this change only applies to TAT queries generated by the
resolver itself, not to ones received from downstream resolvers.
Extract the part of dotat() reponsible for preparing the QNAME for a TAT
query to a separate function in order to limit the number of local
variables used by each function and improve code readability.
Rename 'name' to 'origin' to better convey the purpose of that variable.
Also mark it with the const qualifier.
Update named_zone_reusable() so that it does not consider a zone to be
eligible for reuse if its old value of the "mirror" option differs from
the new one. This causes "rndc reconfig" to create a new zone structure
whenever the value of the "mirror" option is changed, which ensures that
the previous zone database is not reused and that flags are properly set
in responses sourced from zones whose "mirror" setting was changed at
runtime.
Net::DNS versions older than 0.67 respond to queries sent to a
Net::DNS::Nameserver even if its ReplyHandler returns undef. This makes
the "serve-stale" system test fail as it takes advantage of the newer
behavior. Since the latest Net::DNS version available with stock
RHEL/CentOS 6 packages is 0.65 and we officially support that operating
system, bin/tests/system/serve-stale/ans2/ans.pl should behave
consistently for various Net::DNS versions. Ensure that by reworking it
so that it does not use Net::DNS::Nameserver.
Net::DNS versions older than 0.68 insert a ./ANY RR into the QUESTION
section if the latter is empty. Since the latest Net::DNS version
available with stock RHEL/CentOS 6 packages is 0.65 and we officially
support that operating system, bin/tests/system/resolver/ans8/ans.pl
should behave consistently for various Net::DNS versions. Ensure that
by making handleUDP() return the query ID and flags generated by
Net::DNS with 8 zero bytes appended.
While idn2_to_unicode_8zlz() takes a 'flags' argument, it is ignored and
thus cannot be used to perform IDN checks on the output string.
The bug in libidn2 versions before 2.0.5 was not that a call to
idn2_to_unicode_8zlz() with certain flags set did not cause IDN checks
to be performed. The bug was that idn2_to_unicode_8zlz() did not check
whether a conversion can be performed between UTF-8 and the current
locale's character encoding. In other words, with libidn2 version
2.0.5+, if the current locale's character encoding is ASCII, then
idn2_to_unicode_8zlz() will fail when it is passed any Punycode string
which decodes to a non-ASCII string, even if it is a valid IDNA2008
name.
Rework idn_ace_to_locale() so that invalid IDNA2008 names are properly
and consistently detected for all libidn2 versions and locales.
Update the "idna" system test accordingly. Add checks for processing a
server response containing Punycode which decodes to an invalid IDNA2008
name. Fix invalid subtest description.
Since idn_output_filter() no longer uses its 'absolute' argument and no
other callback is used with dns_name_settotextfilter(), remove the
'absolute' argument from the dns_name_totextfilter_t prototype.
output_filter() does not need to dot-terminate its input name because
libidn2 properly handles both dot-terminated and non-dot-terminated
names. libidn2 also does not implicitly dot-terminate names passed to
it, so parts of output_filter() handling dot termination can simply be
removed.
Fix a logical condition to make sure 'src' can fit the terminating NULL
byte. Replace the MAXDLEN macro with the MXNAME macro used in the rest
of dig source code. Tweak comments and variable names.
Rename output_filter() to idn_output_filter() so that it can be easily
associated with IDN and other idn_*() functions.
idn_ace_to_locale() may return a string longer than MAXDLEN because it
is using the current locale's character encoding. Rather then imposing
an arbitrary limit on the length of the string that function can return,
make it pass the string prepared by libidn2 back to the caller verbatim,
making the latter responsible for freeing that string. In conjunction
with the fact that libidn2 errors are considered fatal, this makes
returning an isc_result_t from idn_ace_to_locale() unnecessary.
Do not process success cases in conditional branches for improved
consistency with the rest of BIND source code. Add a comment explaining
the purpose of idn_ace_to_locale(). Rename that function's parameters
to match common BIND naming pattern.
idn_locale_to_ace() is a static function which is always used with a
buffer of size MXNAME, i.e. one that can fit any valid domain name.
Since libidn2 detects invalid domain names and libidn2 errors are
considered fatal, remove size checks from idn_locale_to_ace(). This
makes returning an isc_result_t from it unnecessary.
Do not process success cases in conditional branches for improved
consistency with the rest of BIND source code. Add a comment explaining
the purpose of idn_locale_to_ace(). Rename that function's parameters
to match common BIND naming pattern.
Certain characters, like symbols, are allowed by IDNA2003, but not by
IDNA2008. Make dig reject such symbols when IDN input processing is
enabled to ensure BIND only supports IDNA2008. Update the "idna" system
test so that it uses one of such symbols rather than one which is
disallowed by both IDNA2003 and IDNA2008.
There is no need to call dns_name_settotextfilter() in setup_system()
because setup_lookup() determines whether IDN output processing should
be enabled for a specific lookup (taking the global setting into
consideration) and calls dns_name_settotextfilter() anyway if it is.
Remove the dns_name_settotextfilter() call from setup_system().
Clean up the parts of configure.in responsible for handling libidn2
detection and adjust other pieces of the build system to match these
cleanups:
- use pkg-config when --with-libidn2 is used without an explicit path,
- look for idn2_to_ascii_lz() rather than idn2_to_ascii_8z() as the
former is used in BIND while the latter is not,
- do not look for idn2_to_unicode_8zlz() as it is present in all
libidn2 versions which have idn2_to_ascii_lz(),
- check whether the <idn2.h> header is usable,
- set LDFLAGS in the Makefile for dig so that, if specified, the
requested libidn2 path is used when linking with libidn2,
- override CPPFLAGS when looking for libidn2 components so that the
configure script does not produce warnings when libidn2 is not
installed system-wide,
- merge the AS_CASE() call into the AS_IF() call below it to simplify
code,
- indicate the default value of --with-libidn2 in "./configure --help"
output,
- use $with_libidn2 rather than $use_libidn2 to better match the name
of the configure script argument,
- stop differentiating between IDN "in" and "out" support, i.e. make
dig either support libidn2 or not; remove WITH_* Autoconf macros and
use a new one, HAVE_LIBIDN2, to determine whether libidn2 support
should be enabled.
4987. [cleanup] dns_rdataslab_tordataset() and its related
dns_rdatasetmethods_t callbacks were removed as they
were not being used by anything in BIND. [GL #371]
Since BIND libraries are no longer considered public and
dns_rdataslab_tordataset() is not used anywhere in the tree, remove the
latter and its associated dns_rdatasetmethods_t callbacks from
lib/dns/rdataslab.c.
4985. [func] Add a new slave zone option, "mirror", to enable
serving a non-authoritative copy of a zone that
is subject to DNSSEC validation before being
used. For now, this option is only meant to
facilitate deployment of an RFC 7706-style local
copy of the root zone. [GL #33]
Replace "type: slave" with "type: mirror" in "rndc zonestatus" output
for mirror zones in order to enable the user to tell a regular slave
zone and a mirror zone apart.
Since the mirror zone feature is expected to mostly be used for the root
zone, prevent slaves from sending NOTIFY messages for mirror zones by
default. Retain the possibility to use "also-notify" as it might be
useful in certain cases.
As mirror zone data should be treated the way validated, cached DNS
responses are, outgoing mirror zone transfers should be disabled unless
they are explicitly enabled by zone configuration.
As mirror zone data should be treated the way validated, cached DNS
responses are, it should not be used when responding to clients who are
not allowed cache access. Reuse code responsible for determining cache
database access for evaluating mirror zone access.
Modify query_checkcacheaccess() so that it only contains a single return
statement rather than three and so that the "check_acl" variable is no
longer needed. Tweak and expand comments. Fix coding style issues.
Modify query_getcachedb() so that it uses a common return path for both
success and failure. Remove a redundant NULL check since 'db' will
never be NULL after being passed as a target pointer to dns_db_attach().
Fix coding style issues.
Extract the parts of query_getcachedb() responsible for checking whether
the client is allowed to access the cache to a separate function, so
that it can be reused for determining mirror zone access.
If transferring or loading a mirror zone fails, resolution should still
succeed by means of falling back to regular recursive queries.
Currently, though, if a slave zone is present in the zone table and not
loaded, a SERVFAIL response is generated. Thus, mirror zones need
special handling in this regard.
Add a new dns_zt_find() flag, DNS_ZTFIND_MIRROR, and set it every time a
domain name is looked up rather than a zone itself. Handle that flag in
dns_zt_find() in such a way that a mirror zone which is expired or not
yet loaded is ignored when looking up domain names, but still possible
to find when the caller wants to know whether the zone is configured.
This causes a fallback to recursion when mirror zone data is unavailable
without making unloaded mirror zones invisible to code checking a zone's
existence.
Zone RRsets are assigned trust level "ultimate" upon load, which causes
the AD bit to not be set in responses coming from slave zones, including
mirror zones. Make dns_zoneverify_dnssec() update the trust level of
verified RRsets to "secure" so that the AD bit is set in such responses.
No rollback mechanism is implemented as dns_zoneverify_dnssec() fails in
case of any DNSSEC failure, which causes the mirror zone version being
verified to be discarded.
Section 4 of RFC 7706 suggests that responses sourced from a local copy
of a zone should not have the AA bit set. Follow that recommendation by
setting 'qctx->authoritative' to ISC_FALSE when a response to a query is
coming from a mirror zone.
When a resolver is a regular slave (i.e. not a mirror) for some zone,
non-recursive queries for names below that slaved zone will return a
delegation sourced from it. This behavior is suboptimal for mirror
zones as their contents should rather be treated as validated, cached
DNS responses. Modify query_delegation() and query_zone_delegation() to
permit clients allowed cache access to check its contents for a better
answer when responding to non-recursive queries.
Make ns3 mirror the "root" zone from ns1 and query the former for a
properly signed record below the root. Ensure ns1 is not queried during
resolution and that the AD bit is set in the response.
As mirror zone files are verified when they are loaded from disk, verify
journal files as well to ensure invalid data is not used. Reuse the
journals generated during IXFR tests to test this.
Update axfr_commit() so that all incoming versions of a mirror zone
transferred using AXFR are verified before being used. If zone
verification fails, discard the received version of the zone, wait until
the next refresh and retry.
Add a function for determining whether the supplied version of a mirror
zone passes DNSSEC validation and is signed using a trusted key. Define
a new libdns result signifying a zone verification failure.
Extend check_dnskey_sigs() so that, if requested, it checks whether the
DNSKEY RRset at zone apex is signed by at least one trust anchor. The
trust anchor table is passed as an argument to dns_zoneverify_dnssec()
and passed around in the verification context structure. Neither
dnssec-signzone nor dnssec-verify are yet modified to make use of that
feature, though.
The system test helper function nextpart() always updates the "lines
read so far" marker ("<file>.prev") when it is called, which somewhat
limits its flexibility. Add two new helper functions, nextpartpeek()
and nextpartreset(), so that certain parts of log files can be easily
examined more than once. Add some documentation to help understand the
purpose of each function in the nextpart*() family.
Add a new slave-only boolean configuration option, "mirror", along with
its corresponding dns_zoneopt_t enum and a helper function for checking
whether that option was set for a given zone. This commit does not
introduce any behavior changes yet.
With "dnssec-validation" now defaulting to "auto", it needs to be
explicitly set to "yes" (the previous default value) for all validating
resolvers used in system tests. Ensure that requirement is satisfied by
the resolvers used in the "rpz" system test.
Change 4897 modified the way the $DNSRPS_TEST_MODE variable is used in
bin/tests/system/rpz/tests.sh without updating all references to it,
which i.a. causes the $native and $dnsrps variables to not be set in the
default testing mode, effectively preventing failed checks from being
propagated to the final result of the test. Use $mode instead of
$DNSRPS_TEST_MODE where appropriate to fix error handling in the "rpz"
system test.
4973. [func] verifyzone() and the functions it uses were moved to
libdns and refactored to prevent exit() from being
called upon failure. A side effect of that is that
dnssec-signzone and dnssec-verify now check for memory
leaks upon shutdown. [GL #266]
Since exit() is no longer called upon any dns_zoneverify_dnssec() error,
verification failures should be signalled to callers. Make
dns_zoneverify_dnssec() return an isc_result_t and handle both success
and error appropriately in bin/dnssec/dnssec-signzone.c and
bin/dnssec/dnssec-verify.c. This enables memory leak detection during
shutdown of these tools and causes dnssec-signzone to print signing
statistics even when zone verification fails.
record_found() returns an isc_result_t, but its value is not checked.
Modify the only call site of record_found() so that its errors are
properly handled.
Replace the fprintf() call inside record_nsec3() with a
zoneverify_log_error() call. Remove the "mctx" argument of
record_nsec3() as it can be extracted from "vctx".
Modify one of the record_nsec3() call sites so that its errors are
properly handled.
Make match_nsec3() return the verification result through a separate
pointer, thus making it possible to signal errors using function
return value. Replace all check_result() and fprintf() calls inside
match_nsec3() with zoneverify_log_error() calls and error handling code.
Modify all call sites of match_nsec3() so that its errors are properly
handled.
Replace all check_result() calls inside isoptout() with
zoneverify_log_error() calls and error handling code. Enable isoptout()
to signal errors to the caller using its return value.
Modify the call site of isoptout() so that its errors are properly
handled.
Make verifynsec3(), verifynsec3s(), and verifyemptynodes() return the
verification result through a separate pointer, thus making it possible
to signal errors using function return values. Replace all
check_result() and fprintf() calls inside these functions with
zoneverify_log_error() calls and error handling code.
Modify all call sites of verifynsec3(), verifynsec3s(), and
verifyemptynodes() so that their errors are properly handled.
Make verifynsec() return the verification result through a separate
pointer, thus making it possible to signal errors using function
return value. Replace all check_result() and fprintf() calls inside
verifynsec() with zoneverify_log_error() calls and error handling code.
Modify the call site of verifynsec() so that its errors are properly
handled.
Rename "tresult" to "tvresult" in order to improve variable naming
consistency between functions.
Replace all check_result() and fprintf() calls inside check_no_rrsig()
with zoneverify_log_error() calls and error handling code. Enable
check_no_rrsig() to signal errors to the caller using its return
value.
Modify the call site of check_no_rrsig() so that its errors are properly
handled.
Define buffer size using a named constant rather than a plain integer.
Replace all check_result() and fprintf() calls inside verifyset() with
zoneverify_log_error() calls and error handling code. Enable
verifyset() to signal errors to the caller using its return value.
Modify the call site of verifyset() so that its errors are properly
handled.
Define buffer sizes using named constants rather than plain integers.
Make verifynode() return the verification result through a separate
pointer, thus making it possible to signal errors using function
return value. Replace all fatal() and check_result() calls inside
verifynode() with zoneverify_log_error() calls and error handling code.
Add a REQUIRE assertion to emphasize verifynode() may be called with
some of its arguments set to NULL.
Modify all call sites of verifynode() so that its errors are properly
handled.
Replace the check_result() call inside is_empty() with a
zoneverify_log_error() call and error handling code. Enable is_empty()
to signal errors to the caller using its return value.
Modify the call site of is_empty() so that its errors are properly
handled.
Replace the fatal() call inside check_no_nsec() with a
zoneverify_log_error() call. Enable check_no_nsec() to signal errors to
the caller using its return value.
Modify all call sites of check_no_nsec() so that its errors are properly
handled.
Replace all fatal(), check_result(), and check_dns_dbiterator_current()
calls inside verify_nodes() with zoneverify_log_error() calls and error
handling code. Enable verify_nodes() to signal errors to the caller
using its return value.
Modify the call site of verify_nodes() so that its errors are properly
handled.
Free all heap elements upon verification context cleanup as a
verification error may prevent them from being freed elsewhere.
Remove the check_dns_dbiterator_current() macro as it is no longer used
anywhere in lib/dns/zoneverify.c.
Replace all fatal() and fprintf() calls inside check_bad_algorithms()
with zoneverify_print() calls and error handling code. Enable
check_bad_algorithms() to signal errors to the caller using its return
value.
Modify the call site of check_bad_algorithms() so that its errors are
properly handled.
Replace all fatal() and check_result() calls inside check_dnskey() with
zoneverify_log_error() calls and error handling code. Enable
check_dnskey() to signal errors to the caller using its return value.
Modify the call site of check_dnskey() so that its errors are properly
handled.
Replace all fatal() calls inside check_apex_rrsets() with
zoneverify_log_error() calls and error handling code. Enable
check_apex_rrsets() to signal errors to the caller using its return
value.
Modify the call site of check_apex_rrsets() so that its errors are
properly handled.
Replace calls to check_result() with RUNTIME_CHECK assertions for all
dns_rdata_tostruct() calls in lib/dns/zoneverify.c as this function
cannot fail when the "mctx" argument is NULL (and that is the case for
all call sites of this function throughout lib/dns/zoneverify.c).
Extract the part of dns_zoneverify_dnssec() responsible for checking
whether the zone is fully signed using all active algorithms to a
separate function.
Extract the part of dns_zoneverify_dnssec() responsible for verifying
DNSSEC signatures against the DNSKEY RRset at zone apex and checking
consistency of NSEC/NSEC3 chains to a separate function.
Extract the part of dns_zoneverify_dnssec() responsible for determining
and printing a list of DNSSEC algorithms active in the verified zone to
a separate function.
Extract the part of check_dnskey() responsible for determining active
algorithms in the verified zone based on the signatures at zone apex to
a separate function.
Extract the part of dns_zoneverify_dnssec() responsible for fetching and
preliminarily checking DNSKEY, SOA, NSEC, and NSEC3PARAM RRsets from
zone apex to a separate function.
These functions will be used in the process of replacing fatal(),
check_result(), and fprintf() calls throughout lib/dns/zoneverify.c with
code that does not call exit(). They are intended for:
- zoneverify_log_error(): logging problems encountered while
performing zone verification,
- zoneverify_print(): printing status messages and reports which are
only useful in standalone tools.
To make using dns_zone_logv() possible, add a new "zone" argument to
dns_zoneverify_dnssec() that standalone tools are expected to set to
NULL.
Tables representing algorithm use in the verified zone are commonly
accessed throughout dns_zoneverify_dnssec(). Move them into the
structure representing a verification context. While this does not
really simplify currently existing code, it will facilitate passing data
around between smaller functions that dns_zoneverify_dnssec() is about
to get split into.
Eight structures representing four RRsets and their signatures are
commonly accessed throughout dns_zoneverify_dnssec(). Move them into
the structure representing a verification context. While this does not
really simplify currently existing code, it will facilitate passing data
around between smaller functions that dns_zoneverify_dnssec() is about
to get split into.
Move variables commonly used throughout dns_zoneverify_dnssec() and its
helper functions to the structure representing a verification context in
order to reduce the number of arguments passed between functions.
Make dns_zoneverify_dnssec() eligible for multithreaded use by replacing
the static variables it accesses with a stack-allocated structure
containing these variables. Implement setup and cleanup routines for
that structure, ensuring no error in these routines causes exit() to be
called any more. Pass a pointer to that structure to functions
requiring access to variables which were previously static.
This commit only moves code around, with the following exceptions:
- the check_dns_dbiterator_current() macro and functions
is_delegation() and has_dname() were removed from
bin/dnssec/dnssectool.{c,h} and duplicated in two locations:
bin/dnssec/dnssec-signzone.c and lib/dns/zoneverify.c; these
functions are used both by the code in bin/dnssec/dnssec-signzone.c
and verifyzone(), but are not a good fit for being exported by a
code module responsible for zone verification,
- fatal() and check_result() were duplicated in lib/dns/zoneverify.c
as static functions which do not use the "program" variable any more
(as it is only set by the tools in bin/dnssec/); this is a temporary
step which only aims to prevent compilation from breaking - these
duplicate functions will be removed once lib/dns/zoneverify.c is
refactored not to use them,
- the list of header files included by lib/dns/zoneverify.c was
expanded to encompass all header files that are actually used by the
code in that file,
- a description of the purpose of the commented out "fields" inside
struct nsec3_chain_fixed was added.
DNAME records indicate bottom of zone and thus no records below a DNAME
should be DNSSEC-signed or included in NSEC(3) chains. Add a helper
function, has_dname(), for detecting DNAME records at a given node.
Prevent signing DNAME-obscured records. Check that DNAME-obscured
records are not signed.
The keyfile and key ID for the original managed key do not change
throughout the mkeys system test. Keep them in helper variables to
prevent calling "cat" multiple times and improve code readability.
Reduce code duplication by replacing a code snippet repeated throughout
system tests using "trusted-keys" and/or "managed-keys" configuration
sections with calls to keyfile_to_{managed,trusted}_keys() helper
functions.
Add a set of helper functions for system test scripts which enable
converting key data from a set of keyfiles to either a "trusted-keys"
section or a "managed-keys" section suitable for including in a
resolver's configuration file.
- make qname-minimization option tristate {strict,relaxed,disabled}
- go straight for the record if we hit NXDOMAIN in relaxed mode
- go straight for the record after 3 labels without new delegation or 7 labels total
- use start of fetch (and not time of response) as 'now' time for querying cache for
zonecut when following delegation.
Add a new libdns function, dns_zone_logv(), which takes a single va_list
argument rather than a variable number of arguments and can be used as a
base for implementing more specific zone logging functions.
Resolve "Multiple RRSIGs on some records in signed zone even though only one key is ever active at a time"
Closes#240
See merge request isc-projects/bind9!231
Resolve "9.11.3-S1 totext_nsec3 inserts a redundant white space between next hash and type map [ISC-support #12887]"
See merge request isc-projects/bind9!313
- all tests with "recursion yes" now also specify "dnssec-validation yes",
and all tests with "recursion no" also specify "dnssec-validation no".
this must be maintained in all new tests, or else validation will fail
when we use local root zones for testing.
- clean.sh has been modified where necessary to remove managed-keys.bind
and viewname.mkeys files.
- the default setting for dnssec-validation is now "auto", which
activates DNSSEC validation using the IANA root key. The old behavior
can be restored by explicitly setting "dnssec-validation yes", which
"yes", which activates DNSSEC validation only if keys are explicitly
configured in named.conf.
- the ARM has been updated to describe the new behavior
This commit reverts the previous change to use system provided
entropy, as (SYS_)getrandom is very slow on Linux because it is
a syscall.
The change introduced in this commit adds a new call isc_nonce_buf
that uses CSPRNG from cryptographic library provider to generate
secure data that can be and must be used for generating nonces.
Example usage would be DNS cookies.
The isc_random() API has been changed to use fast PRNG that is not
cryptographically secure, but runs entirely in user space. Two
contestants have been considered xoroshiro family of the functions
by Villa&Blackman and PCG by O'Neill. After a consideration the
xoshiro128starstar function has been used as uint32_t random number
provider because it is very fast and has good enough properties
for our usage pattern.
The other change introduced in the commit is the more extensive usage
of isc_random_uniform in places where the usage pattern was
isc_random() % n to prevent modulo bias. For usage patterns where
only 16 or 8 bits are needed (DNS Message ID), the isc_random()
functions has been renamed to isc_random32(), and isc_random16() and
isc_random8() functions have been introduced by &-ing the
isc_random32() output with 0xffff and 0xff. Please note that the
functions that uses stripped down bit count doesn't pass our
NIST SP 800-22 based random test.
- added a 1-second floor to max-stale-ttl similar to stale-answer-ttl;
if set to 0, it will be silently updated to 1.
- fixed the ARM entry on max-stale-ttl, which incorrectly suggested that
the default was 0 instead of 1 week.
- clarified rndc serve-stale documentation.
- this was a compile time option to disable the use of a hash table in
the RBTDB. the code path without the hash table was buggy and
untested, and unlikely to be needed by anyone anyway.
- mark the 'geoip-use-ecs' option obsolete; warn when it is used
in named.conf
- prohibit 'ecs' ACL tags in named.conf; note that this is a fatal error
since simply ignoring the tags could make ACLs behave unpredictably
- re-simplify the radix and iptable code
- clean up dns_acl_match(), dns_aclelement_match(), dns_acl_allowed()
and dns_geoip_match() so they no longer take ecs options
- remove the ECS-specific unit and system test cases
- remove references to ECS from the ARM
- [ ]***(SwEng)*** Build documentation on `docs.isc.org`.
- [ ]***(QA)*** Check that all the above steps were performed correctly.
- [ ]***(QA)*** Check that the formatting is correct for text, PDF, and HTML versions of release notes.
- [ ]***(SwEng)*** Tag the releases[^2]. (Tags may only be pushed to the public repository for releases which are *not* security releases.)
- [ ]***(SwEng)*** If this is the first tag for a release (e.g. beta), create a release branch named `release_v9_X_Y` to allow development to continue on the maintenance branch whilst release engineering continues.
### Before the ASN Deadline (for ASN Releases) or the Public Release Date (for Regular Releases)
- [ ]***(QA)*** Verify GitLab CI results for the tags created and prepare a QA report for the releases to be published.
- [ ]***(QA)*** Request signatures for the tarballs, providing their location and checksums.
- [ ]***(Support)*** Pre-publish ASN and/or Subscription Edition tarballs so that packages can be built.
- [ ]***(QA)*** Build and test ASN and/or Subscription Edition packages.
- [ ]***(QA)*** Notify Support that the releases have been prepared.
- [ ]***(Support)*** Send out ASNs (if applicable).
### On the Day of Public Release
- [ ]***(Support)*** Wait for clearance from Security Officer to proceed with the public release (if applicable).
- [ ]***(Support)*** Place tarballs in public location on FTP site.
- [ ]***(Support)*** Publish links to downloads on ISC website.
- [ ]***(Support)*** Write release email to *bind-announce*.
- [ ]***(Support)*** Write email to *bind-users* (if a major release).
- [ ]***(Support)*** Update tickets in case of waiting support customers.
- [ ]***(QA)*** Build and test any outstanding private packages.
- [ ]***(QA)*** Build public packages (`*.deb`, RPMs).
- [ ]***(QA)*** Inform Marketing of the release.
- [ ]***(QA)*** Update the internal [BIND release dates wiki page](https://wiki.isc.org/bin/view/Main/BindReleaseDates) when public announcement has been made.
- [ ]***(Marketing)*** Post short note to Twitter.
- [ ]***(Marketing)*** Update [Wikipedia entry for BIND](https://en.wikipedia.org/wiki/BIND).
- [ ]***(Marketing)*** Write blog article (if a major release).
- [ ]***(QA)*** Ensure all new tags are annotated and signed.
- [ ]***(SwEng)*** Merge the automatically prepared `prep 9.X.Y` commit which updates `version` and documentation on the release branch into the relevant maintenance branch (`v9_X`).
- [ ]***(SwEng)*** Push tags for the published releases to the public repository.
- [ ]***(QA)*** For each maintained branch, update the `BIND_BASELINE_VERSION` variable for the `abi-check` job in `.gitlab-ci.yml` to the latest published BIND version tag for a given branch.
- [ ]***(QA)*** Prepare empty release notes for the next set of releases.
- [ ]***(QA)*** Sanitize all confidential issues assigned to the release milestone and make them public.
- [ ]***(QA)*** Update QA tools used in GitLab CI (e.g. Flake8, PyLint) by modifying the relevant `Dockerfile`.
[^1]: If not, use the time remaining until the tagging deadline to ensure all outstanding issues are either resolved or moved to a different milestone.
[^2]: Preferred command line: `git tag -u <DEVELOPER_KEYID> -a -s -m "BIND 9.X.Y[alphatag]" v9_X_Y[alphatag]`, where `[alphatag]` is an optional string such as `b1`, `rc1`, etc.
- See the COPYRIGHT file distributed with this work for additional
- information regarding copyright ownership.
-->
## BIND Source Access and Contributor Guidelines
*Feb 22, 2018*
## BIND 9 Source Access and Contributor Guidelines
*May 28, 2020*
### Contents
@@ -19,12 +19,12 @@
### Introduction
Thank you for using BIND!
Thank you for using BIND 9!
BIND is open source software that implements the Domain Name System (DNS)
protocols for the Internet. It is a reference implementation of those
protocols, but it is also production-grade software, suitable for use in
high-volume and high-reliability applications. It is by far the most
high-volume and high-reliability applications. It is very
widely used DNS software, providing a robust and stable platform on top of
which organizations can build distributed computing systems with the
knowledge that those systems are fully compliant with published DNS
@@ -33,14 +33,22 @@ standards.
BIND is and will always remain free and openly available. It can be
used and modified in any way by anyone.
BIND is maintained by the [Internet Systems Consortium](https://www.isc.org),
BIND is maintained by [Internet Systems Consortium](https://www.isc.org),
a public-benefit 501(c)(3) nonprofit, using a "managed open source" approach:
anyone can see the source, but only ISC employees have commit access.
Until recently, the source could only be seen once ISC had published
a release: read access to the source repository was restricted just
as commit access was. That's now changing, with the opening of a
In the past, the source could only be seen once ISC had published
a release; read access to the source repository was restricted just
as commit access was. That has changed, as ISC now provides a
public git mirror to the BIND source tree (see below).
At ISC, we're committed to
building communities that are welcoming and inclusive: environments where people
are encouraged to share ideas, treat each other with respect, and collaborate
towards the best solutions. To reinforce our commitment, ISC
has adopted a slightly modified version of the Django
[Code of Conduct](https://gitlab.isc.org/isc-projects/bind9/-/blob/master/CODE_OF_CONDUCT.md) for the BIND 9 project, as well as for the conduct of our
developers throughout the industry.
### <a name="access"></a>Access to source code
Public BIND releases are always available from the
@@ -68,7 +76,7 @@ branch, use:
> $ git checkout v9_12
Whenever a branch is ready for publication, a tag will be placed of the
Whenever a branch is ready for publication, a tag is placed of the
form `v9_X_Y`. The 9.12.0 release, for instance, is tagged as `v9_12_0`.
The branch in which the next major release is being developed is called
@@ -78,16 +86,16 @@ The branch in which the next major release is being developed is called
Reports of flaws in the BIND package, including software bugs, errors
in the documentation, missing files in the tarball, suggested changes
or requests for new features, etc, can be filed using
or requests for new features, etc., can be filed using
Due to a large ticket backlog, we are sometimes slow to respond,
especially if a bug is cosmetic or if a feature request is vague or
low in priority, but we will try at least to acknowledge legitimate
low in priority, but we try at least to acknowledge legitimate
bug reports within a week.
ISC's ticketing system is publicly readable; however, you must have
an account to file a new issue. You can either register locally or
ISC's GitLab system is publicly readable; however, you must have
an account to create a new issue. You can either register locally or
use credentials from an existing account at GitHub, GitLab, Google,
Twitter, or Facebook.
@@ -97,26 +105,26 @@ If you think you may be seeing a potential security vulnerability in BIND
report it immediately by emailing to security-officer@isc.org. Plain-text
e-mail is not a secure choice for communications concerning undisclosed
security issues so please encrypt your communications to us if possible,
using the [ISC Security Officer public key](https://www.isc.org/downloads/software-support-policy/openpgp-key/).
using the [ISC Security Officer public key](https://www.isc.org/pgpkey/).
Do not discuss undisclosed security vulnerabilites on any public mailing list.
Do not discuss undisclosed security vulnerabilities on any public mailing list.
ISC has a long history of handling reported vulnerabilities promptly and
effectively and we respect and acknowledge responsible reporters.
ISC's Security Vulnerability Disclosure Policy is documented at [https://kb.isc.org/article/AA-00861/0](https://kb.isc.org/article/AA-00861/0).
ISC's Security Vulnerability Disclosure Policy is documented at [https://kb.isc.org/docs/aa-00861](https://kb.isc.org/docs/aa-00861).
If you have a crash, you may want to consult
[‘What to do if your BIND or DHCP server has crashed.’](https://kb.isc.org/article/AA-00340/89/What-to-do-if-your-BIND-or-DHCP-server-has-crashed.html)
["What to do if your BIND or DHCP server has crashed."](https://kb.isc.org/docs/aa-00340)
### <a name="bugs"></a>Contributing code
### <a name="contrib"></a>Contributing code
BIND is licensed under the
[Mozilla Public License 2.0](http://www.isc.org/downloads/software-support-policy/isc-license/).
Earier versions (BIND 9.10 and earlier) were licensed under the [ISC License](http://www.isc.org/downloads/software-support-policy/isc-license/)
[Mozilla Public License 2.0](https://www.mozilla.org/en-US/MPL/2.0/).
Earlier versions (BIND 9.10 and earlier) were licensed under the [ISC License](https://www.isc.org/licenses/)
ISC does not require an explicit copyright assignment for patch
contributions. However, by submitting a patch to ISC, you implicitly
certify that you are the author of the code, that you intend to reliquish
certify that you are the author of the code, that you intend to relinquish
exclusive copyright, and that you grant permission to publish your work
under the open source license used for the BIND version(s) to which your
patch will be applied.
@@ -124,7 +132,7 @@ patch will be applied.
#### <a name="bind"></a>BIND code
Patches for BIND may be submitted directly via merge requests in
|`CC`|The C compiler to use. `configure` tries to figure out the right one for supported systems.|
|`CFLAGS`|C compiler flags. Defaults to include -g and/or -O2 as supported by the compiler. Please include '-g' if you need to set `CFLAGS`. |
|`STD_CINCLUDES`|System header file directories. Can be used to specify where add-on thread or IPv6 support is, for example. Defaults to empty string.|
|`STD_CDEFINES`|Any additional preprocessor symbols you want defined. Defaults to empty string. For a list of possible settings, see the file [OPTIONS](OPTIONS.md).|
|`LDFLAGS`|Linker flags. Defaults to empty string.|
|`BUILD_CC`|Needed when cross-compiling: the native C compiler to use when building for the target system.|
|`BUILD_CFLAGS`|Optional, used for cross-compiling|
|`BUILD_CPPFLAGS`||
|`BUILD_LDFLAGS`||
|`BUILD_LIBS`||
Additional environment variables affecting the build are listed at the
end of the `configure` help text, which can be obtained by running the
command:
$ ./configure --help
#### <a name="macos"> macOS
Building on macOS assumes that the "Command Tools for Xcode" is installed.
This can be downloaded from https://developer.apple.com/download/more/
or if you have Xcode already installed you can run "xcode-select --install".
This will add /usr/include to the system and install the compiler and other
tools so that they can be easily found.
Building on macOS assumes that the "Command Tools for Xcode" are installed.
and BIND must be configured with "--enable-dnstap".
and BIND must be configured with `--enable-dnstap`.
Portions of BIND that are written in Python, including
`dnssec-keymgr`, `dnssec-coverage`, `dnssec-checkds`, and some of the
system tests, require the 'argparse' and 'ply' modules to be available.
'argparse' is a standard module as of Python 2.7 and Python 3.2.
'ply' is available from [https://pypi.python.org/pypi/ply](https://pypi.python.org/pypi/ply).
Certain compiled-in constants and default settings can be decreased to
values better suited to small machines, e.g. OpenWRT boxes, by specifying
`--with-tuning=small` on the `configure` command line. This decreases
memory usage by using smaller structures, but degrades performance.
On Linux, process capabilities are managed in user space using
the `libcap` library, which can be installed on most Linux systems via
the `libcap-dev` or `libcap-devel` package. Process capability support can
also be disabled by configuring with `--disable-linux-caps`.
On some platforms it is necessary to explicitly request large file support
to handle files bigger than 2GB. This can be done by using
@@ -237,54 +260,54 @@ to handle files bigger than 2GB. This can be done by using
Support for the "fixed" rrset-order option can be enabled or disabled by
specifying `--enable-fixed-rrset` or `--disable-fixed-rrset` on the
configure command line. By default, fixed rrset-order is disabled to
configure command line. By default, fixed rrset-order is disabled to
reduce memory footprint.
If your operating system has integrated support for IPv6, it will be used
automatically. If you have installed KAME IPv6 separately, use
`--with-kame[=PATH]` to specify its location.
The `--enable-querytrace` option causes `named` to log every step of
processing every query. The `--enable-singletrace` option turns on the
same verbose tracing, but allows an individual query to be separately
traced by setting its query ID to 0. These options should only be enabled
when debugging, because they have a significant negative impact on query
performance.
`make install` will install `named` and the various BIND 9 libraries. By
`make install` installs`named` and the various BIND 9 libraries. By
default, installation is into /usr/local, but this can be changed with the
`--prefix` option when running `configure`.
You may specify the option `--sysconfdir` to set the directory where
configuration files like `named.conf` go by default, and `--localstatedir`
to set the default parent directory of `run/named.pid`. For backwards
compatibility with BIND 8, `--sysconfdir` defaults to `/etc` and
`--localstatedir` defaults to `/var` if no `--prefix` option is given. If
there is a `--prefix` option, sysconfdir defaults to `$prefix/etc` and
localstatedir defaults to `$prefix/var`.
to set the default parent directory of `run/named.pid`. `--sysconfdir`
defaults to `$prefix/etc` and `--localstatedir` defaults to `$prefix/var`.
### <a name="testing"/> Automated testing
A system test suite can be run with `make test`. The system tests require
A system test suite can be run with `make check`. The system tests require
you to configure a set of virtual IP addresses on your system (this allows
multiple servers to run locally and communicate with one another). These
multiple servers to run locally and communicate with each other). These
IP addresses can be configured by running the command
`bin/tests/system/ifconfig.sh up` as root.
Some tests require Perl and the Net::DNS and/or IO::Socket::INET6 modules,
and will be skipped if these are not available. Some tests require Python
and the 'dnspython' module and will be skipped if these are not available.
Some tests require Perl and the `Net::DNS` and/or `IO::Socket::INET6` modules,
and are skipped if these are not available. Some tests require Python
and the `dnspython` module and are skipped if these are not available.
See bin/tests/system/README for further details.
Unit tests are implemented using Automated Testing Framework (ATF).
To run them, use `configure --with-atf`, then run `make test` or
`make unit`.
Unit tests are implemented using the CMocka unit testing framework. To build
them, use `configure --with-cmocka`. Execution of tests is done by the automake
parallel test driver; unit tests are also run by `make check`.
### <a name="doc"/> Documentation
The *BIND 9 Administrator Reference Manual* is included with the source
distribution, in DocBook XML, HTML and PDF format, in the `doc/arm`
directory.
The *BIND 9 Administrator Reference Manual* (ARM) is included with the source
distribution, and in .rst format, in the `doc/arm`
directory. HTML and PDF versions are automatically generated and can
be viewed at [https://bind9.readthedocs.io/en/latest/index.html](https://bind9.readthedocs.io/en/latest/index.html).
Some of the programs in the BIND 9 distribution have man pages in their
directories. In particular, the command line options of `named` are
documented in `bin/named/named.8`.
Man pages for some of the programs in the BIND 9 distribution
are also included in the BIND ARM.
Frequently (and not-so-frequently) asked questions and their answers
can be found in the ISC Knowledge Base at
can be found in the ISC Knowledgebase at
[https://kb.isc.org](https://kb.isc.org).
Additional information on various subjects can be found in other
@@ -293,8 +316,8 @@ Additional information on various subjects can be found in other
### <a name="changes"/> Change log
A detailed list of all changes that have been made throughout the
development BIND 9 is included in the file CHANGES, with the most recent
changes listed first. Change notes include tags indicating the category of
development of BIND 9 is included in the file CHANGES, with the most recent
changes listed first. Change notes include tags indicating the category of
the change that was made; these categories are:
|Category |Description |
@@ -312,12 +335,31 @@ the change that was made; these categories are:
| [cleanup] | Minor corrections and refactoring |
| [doc] | Documentation |
| [contrib] | Changes to the contributed tools and libraries in the 'contrib' subdirectory |
| [placeholder] | Used in the master development branch to reserve change numbers for use in other branches, e.g. when fixing a bug that only exists in older releases |
| [placeholder] | Used in the master development branch to reserve change numbers for use in other branches, e.g., when fixing a bug that only exists in older releases |
In general, [func] and [experimental] tags will only appear in new-feature
releases (i.e., those with version numbers ending in zero). Some new
In general, [func] and [experimental] tags only appear in new-feature
releases (i.e., those with version numbers ending in zero). Some new
functionality may be backported to older releases on a case-by-case basis.
All other change types may be applied to all currently-supported releases.
All other change types may be applied to all currentlysupported releases.
#### Bug report identifiers
Most notes in the CHANGES file include a reference to a bug report or
issue number. Prior to 2018, these were usually of the form `[RT #NNN]`
and referred to entries in the "bind9-bugs" RT database, which was not open
to the public. More recent entries use the form `[GL #NNN]` or, less often,
`[GL !NNN]`, which, respectively, refer to issues or merge requests in the
GitLab database. Most of these are publicly readable, unless they include
information which is confidential or security-sensitive.
To look up a GitLab issue by its number, use the URL
configuration file\&. The file is parsed and checked for syntax errors, along with all files included by it\&. If no file is specified,
/etc/named\&.conf
is read by default\&.
.PP
Note: files that
\fBnamed\fR
reads in separate parser contexts, such as
rndc\&.key
and
bind\&.keys, are not automatically read by
\fBnamed\-checkconf\fR\&. Configuration errors in these files may cause
\fBnamed\fR
to fail to run, even if
\fBnamed\-checkconf\fR
was successful\&.
\fBnamed\-checkconf\fR
can be run on these files explicitly, however\&.
.SH"OPTIONS"
.PP
\-h
.RS4
Print the usage summary and exit\&.
.RE
.PP
\-j
.RS4
When loading a zonefile read the journal if it exists\&.
.RE
.PP
\-l
.RS4
List all the configured zones\&. Each line of output contains the zone name, class (e\&.g\&. IN), view, and type (e\&.g\&. master or slave)\&.
.RE
.PP
\-p
.RS4
Print out the
named\&.conf
and included files in canonical form if no errors were detected\&. See also the
\fB\-x\fR
option\&.
.RE
.PP
\-t \fIdirectory\fR
.RS4
Chroot to
directory
so that include directives in the configuration file are processed as if run by a similarly chrooted
\fBnamed\fR\&.
.RE
.PP
\-v
.RS4
Print the version of the
\fBnamed\-checkconf\fR
program and exit\&.
.RE
.PP
\-x
.RS4
When printing the configuration files in canonical form, obscure shared secrets by replacing them with strings of question marks (\*(Aq?\*(Aq)\&. This allows the contents of
named\&.conf
and related files to be shared \(em for example, when submitting bug reports \(em without compromising private data\&. This option cannot be used without
\fB\-p\fR\&.
.RE
.PP
\-z
.RS4
Perform a test load of all master zones found in
named\&.conf\&.
.RE
.PP
filename
.RS4
The name of the configuration file to be checked\&. If not specified, it defaults to
/etc/named\&.conf\&.
.RE
.SH"RETURN VALUES"
.PP
\fBnamed\-checkconf\fR
returns an exit status of 1 if errors were detected and 0 otherwise\&.
.SH"SEE ALSO"
.PP
\fBnamed\fR(8),
\fBnamed-checkzone\fR(8),
BIND 9 Administrator Reference Manual\&.
.SH"AUTHOR"
.PP
\fBInternet Systems Consortium, Inc\&.\fR
.SH"COPYRIGHT"
.br
Copyright \(co 2000-2002, 2004, 2005, 2007, 2009, 2014-2016, 2018 Internet Systems Consortium, Inc. ("ISC")
checks the syntax and integrity of a zone file\&. It performs the same checks as
\fBnamed\fR
does when loading a zone\&. This makes
\fBnamed\-checkzone\fR
useful for checking zone files before configuring them into a name server\&.
.PP
\fBnamed\-compilezone\fR
is similar to
\fBnamed\-checkzone\fR, but it always dumps the zone contents to a specified file in a specified format\&. Additionally, it applies stricter check levels by default, since the dump output will be used as an actual zone file loaded by
\fBnamed\fR\&. When manually specified otherwise, the check levels must at least be as strict as those specified in the
\fBnamed\fR
configuration file\&.
.SH"OPTIONS"
.PP
\-d
.RS4
Enable debugging\&.
.RE
.PP
\-h
.RS4
Print the usage summary and exit\&.
.RE
.PP
\-q
.RS4
Quiet mode \- exit code only\&.
.RE
.PP
\-v
.RS4
Print the version of the
\fBnamed\-checkzone\fR
program and exit\&.
.RE
.PP
\-j
.RS4
When loading a zone file, read the journal if it exists\&. The journal file name is assumed to be the zone file name appended with the string
\&.jnl\&.
.RE
.PP
\-J \fIfilename\fR
.RS4
When loading the zone file read the journal from the given file, if it exists\&. (Implies \-j\&.)
.RE
.PP
\-c \fIclass\fR
.RS4
Specify the class of the zone\&. If not specified, "IN" is assumed\&.
.RE
.PP
\-i \fImode\fR
.RS4
Perform post\-load zone integrity checks\&. Possible modes are
\fB"full"\fR
(default),
\fB"full\-sibling"\fR,
\fB"local"\fR,
\fB"local\-sibling"\fR
and
\fB"none"\fR\&.
.sp
Mode
\fB"full"\fR
checks that MX records refer to A or AAAA record (both in\-zone and out\-of\-zone hostnames)\&. Mode
\fB"local"\fR
only checks MX records which refer to in\-zone hostnames\&.
.sp
Mode
\fB"full"\fR
checks that SRV records refer to A or AAAA record (both in\-zone and out\-of\-zone hostnames)\&. Mode
\fB"local"\fR
only checks SRV records which refer to in\-zone hostnames\&.
.sp
Mode
\fB"full"\fR
checks that delegation NS records refer to A or AAAA record (both in\-zone and out\-of\-zone hostnames)\&. It also checks that glue address records in the zone match those advertised by the child\&. Mode
\fB"local"\fR
only checks NS records which refer to in\-zone hostnames or that some required glue exists, that is when the nameserver is in a child zone\&.
.sp
Mode
\fB"full\-sibling"\fR
and
\fB"local\-sibling"\fR
disable sibling glue checks but are otherwise the same as
\fB"full"\fR
and
\fB"local"\fR
respectively\&.
.sp
Mode
\fB"none"\fR
disables the checks\&.
.RE
.PP
\-f \fIformat\fR
.RS4
Specify the format of the zone file\&. Possible formats are
\fB"text"\fR
(default),
\fB"raw"\fR, and
\fB"map"\fR\&.
.RE
.PP
\-F \fIformat\fR
.RS4
Specify the format of the output file specified\&. For
\fBnamed\-checkzone\fR, this does not cause any effects unless it dumps the zone contents\&.
.sp
Possible formats are
\fB"text"\fR
(default), which is the standard textual representation of the zone, and
\fB"map"\fR,
\fB"raw"\fR, and
\fB"raw=N"\fR, which store the zone in a binary format for rapid loading by
\fBnamed\fR\&.
\fB"raw=N"\fR
specifies the format version of the raw zone file: if N is 0, the raw file can be read by any version of
\fBnamed\fR; if N is 1, the file can be read by release 9\&.9\&.0 or higher; the default is 1\&.
.RE
.PP
\-k \fImode\fR
.RS4
Perform
\fB"check\-names"\fR
checks with the specified failure mode\&. Possible modes are
\fB"fail"\fR
(default for
\fBnamed\-compilezone\fR),
\fB"warn"\fR
(default for
\fBnamed\-checkzone\fR) and
\fB"ignore"\fR\&.
.RE
.PP
\-l \fIttl\fR
.RS4
Sets a maximum permissible TTL for the input file\&. Any record with a TTL higher than this value will cause the zone to be rejected\&. This is similar to using the
\fBmax\-zone\-ttl\fR
option in
named\&.conf\&.
.RE
.PP
\-L \fIserial\fR
.RS4
When compiling a zone to "raw" or "map" format, set the "source serial" value in the header to the specified serial number\&. (This is expected to be used primarily for testing purposes\&.)
.RE
.PP
\-m \fImode\fR
.RS4
Specify whether MX records should be checked to see if they are addresses\&. Possible modes are
\fB"fail"\fR,
\fB"warn"\fR
(default) and
\fB"ignore"\fR\&.
.RE
.PP
\-M \fImode\fR
.RS4
Check if a MX record refers to a CNAME\&. Possible modes are
\fB"fail"\fR,
\fB"warn"\fR
(default) and
\fB"ignore"\fR\&.
.RE
.PP
\-n \fImode\fR
.RS4
Specify whether NS records should be checked to see if they are addresses\&. Possible modes are
\fB"fail"\fR
(default for
\fBnamed\-compilezone\fR),
\fB"warn"\fR
(default for
\fBnamed\-checkzone\fR) and
\fB"ignore"\fR\&.
.RE
.PP
\-o \fIfilename\fR
.RS4
Write zone output to
filename\&. If
filename
is
\-
then write to standard out\&. This is mandatory for
\fBnamed\-compilezone\fR\&.
.RE
.PP
\-r \fImode\fR
.RS4
Check for records that are treated as different by DNSSEC but are semantically equal in plain DNS\&. Possible modes are
\fB"fail"\fR,
\fB"warn"\fR
(default) and
\fB"ignore"\fR\&.
.RE
.PP
\-s \fIstyle\fR
.RS4
Specify the style of the dumped zone file\&. Possible styles are
\fB"full"\fR
(default) and
\fB"relative"\fR\&. The full format is most suitable for processing automatically by a separate script\&. On the other hand, the relative format is more human\-readable and is thus suitable for editing by hand\&. For
\fBnamed\-checkzone\fR
this does not cause any effects unless it dumps the zone contents\&. It also does not have any meaning if the output format is not text\&.
.RE
.PP
\-S \fImode\fR
.RS4
Check if a SRV record refers to a CNAME\&. Possible modes are
\fB"fail"\fR,
\fB"warn"\fR
(default) and
\fB"ignore"\fR\&.
.RE
.PP
\-t \fIdirectory\fR
.RS4
Chroot to
directory
so that include directives in the configuration file are processed as if run by a similarly chrooted
\fBnamed\fR\&.
.RE
.PP
\-T \fImode\fR
.RS4
Check if Sender Policy Framework (SPF) records exist and issues a warning if an SPF\-formatted TXT record is not also present\&. Possible modes are
\fB"warn"\fR
(default),
\fB"ignore"\fR\&.
.RE
.PP
\-w \fIdirectory\fR
.RS4
chdir to
directory
so that relative filenames in master file $INCLUDE directives work\&. This is similar to the directory clause in
named\&.conf\&.
.RE
.PP
\-D
.RS4
Dump zone file in canonical format\&. This is always enabled for
\fBnamed\-compilezone\fR\&.
.RE
.PP
\-W \fImode\fR
.RS4
Specify whether to check for non\-terminal wildcards\&. Non\-terminal wildcards are almost always the result of a failure to understand the wildcard matching algorithm (RFC 1034)\&. Possible modes are
\fB"warn"\fR
(default) and
\fB"ignore"\fR\&.
.RE
.PP
zonename
.RS4
The domain name of the zone being checked\&.
.RE
.PP
filename
.RS4
The name of the zone file\&.
.RE
.SH"RETURN VALUES"
.PP
\fBnamed\-checkzone\fR
returns an exit status of 1 if errors were detected and 0 otherwise\&.
.SH"SEE ALSO"
.PP
\fBnamed\fR(8),
\fBnamed-checkconf\fR(8),
RFC 1035,
BIND 9 Administrator Reference Manual\&.
.SH"AUTHOR"
.PP
\fBInternet Systems Consortium, Inc\&.\fR
.SH"COPYRIGHT"
.br
Copyright \(co 2000-2002, 2004-2007, 2009-2016, 2018 Internet Systems Consortium, Inc. ("ISC")
are invocation methods for a utility that generates keys for use in TSIG signing\&. The resulting keys can be used, for example, to secure dynamic DNS updates to a zone or for the
\fBrndc\fR
command channel\&.
.PP
When run as
\fBtsig\-keygen\fR, a domain name can be specified on the command line which will be used as the name of the generated key\&. If no name is specified, the default is
\fBtsig\-key\fR\&.
.PP
When run as
\fBddns\-confgen\fR, the generated key is accompanied by configuration text and instructions that can be used with
\fBnsupdate\fR
and
\fBnamed\fR
when setting up dynamic DNS, including an example
\fBupdate\-policy\fR
statement\&. (This usage similar to the
\fBrndc\-confgen\fR
command for setting up command channel security\&.)
.PP
Note that
\fBnamed\fR
itself can configure a local DDNS key for use with
\fBnsupdate \-l\fR: it does this when a zone is configured with
\fBupdate\-policy local;\fR\&.
\fBddns\-confgen\fR
is only needed when a more elaborate configuration is required: for instance, if
\fBnsupdate\fR
is to be used from a remote system\&.
.SH"OPTIONS"
.PP
\-a \fIalgorithm\fR
.RS4
Specifies the algorithm to use for the TSIG key\&. Available choices are: hmac\-md5, hmac\-sha1, hmac\-sha224, hmac\-sha256, hmac\-sha384 and hmac\-sha512\&. The default is hmac\-sha256\&. Options are case\-insensitive, and the "hmac\-" prefix may be omitted\&.
.RE
.PP
\-h
.RS4
Prints a short summary of options and arguments\&.
.RE
.PP
\-k \fIkeyname\fR
.RS4
Specifies the key name of the DDNS authentication key\&. The default is
\fBddns\-key\fR
when neither the
\fB\-s\fR
nor
\fB\-z\fR
option is specified; otherwise, the default is
\fBddns\-key\fR
as a separate label followed by the argument of the option, e\&.g\&.,
\fBddns\-key\&.example\&.com\&.\fR
The key name must have the format of a valid domain name, consisting of letters, digits, hyphens and periods\&.
.RE
.PP
\-q
.RS4
(\fBddns\-confgen\fR
only\&.) Quiet mode: Print only the key, with no explanatory text or usage examples; This is essentially identical to
\fBtsig\-keygen\fR\&.
.RE
.PP
\-s \fIname\fR
.RS4
(\fBddns\-confgen\fR
only\&.) Generate configuration example to allow dynamic updates of a single hostname\&. The example
\fBnamed\&.conf\fR
text shows how to set an update policy for the specified
\fIname\fR
using the "name" nametype\&. The default key name is ddns\-key\&.\fIname\fR\&. Note that the "self" nametype cannot be used, since the name to be updated may differ from the key name\&. This option cannot be used with the
\fB\-z\fR
option\&.
.RE
.PP
\-z \fIzone\fR
.RS4
(\fBddns\-confgen\fR
only\&.) Generate configuration example to allow dynamic updates of a zone: The example
\fBnamed\&.conf\fR
text shows how to set an update policy for the specified
\fIzone\fR
using the "zonesub" nametype, allowing updates to all subdomain names within that
\fIzone\fR\&. This option cannot be used with the
\fB\-s\fR
option\&.
.RE
.SH"SEE ALSO"
.PP
\fBnsupdate\fR(1),
\fBnamed.conf\fR(5),
\fBnamed\fR(8),
BIND 9 Administrator Reference Manual\&.
.SH"AUTHOR"
.PP
\fBInternet Systems Consortium, Inc\&.\fR
.SH"COPYRIGHT"
.br
Copyright \(co 2009, 2014-2016, 2018 Internet Systems Consortium, Inc. ("ISC")
\fBrndc\fR\&. It can be used as a convenient alternative to writing the
rndc\&.conf
file and the corresponding
\fBcontrols\fR
and
\fBkey\fR
statements in
named\&.conf
by hand\&. Alternatively, it can be run with the
\fB\-a\fR
option to set up a
rndc\&.key
file and avoid the need for a
rndc\&.conf
file and a
\fBcontrols\fR
statement altogether\&.
.SH"OPTIONS"
.PP
\-a
.RS4
Do automatic
\fBrndc\fR
configuration\&. This creates a file
rndc\&.key
in
/etc
(or whatever
\fIsysconfdir\fR
was specified as when
BIND
was built) that is read by both
\fBrndc\fR
and
\fBnamed\fR
on startup\&. The
rndc\&.key
file defines a default command channel and authentication key allowing
\fBrndc\fR
to communicate with
\fBnamed\fR
on the local host with no further configuration\&.
.sp
Running
\fBrndc\-confgen \-a\fR
allows BIND 9 and
\fBrndc\fR
to be used as drop\-in replacements for BIND 8 and
\fBndc\fR, with no changes to the existing BIND 8
named\&.conf
file\&.
.sp
If a more elaborate configuration than that generated by
\fBrndc\-confgen \-a\fR
is required, for example if rndc is to be used remotely, you should run
\fBrndc\-confgen\fR
without the
\fB\-a\fR
option and set up a
rndc\&.conf
and
named\&.conf
as directed\&.
.RE
.PP
\-A \fIalgorithm\fR
.RS4
Specifies the algorithm to use for the TSIG key\&. Available choices are: hmac\-md5, hmac\-sha1, hmac\-sha224, hmac\-sha256, hmac\-sha384 and hmac\-sha512\&. The default is hmac\-sha256\&.
.RE
.PP
\-b \fIkeysize\fR
.RS4
Specifies the size of the authentication key in bits\&. Must be between 1 and 512 bits; the default is the hash size\&.
.RE
.PP
\-c \fIkeyfile\fR
.RS4
Used with the
\fB\-a\fR
option to specify an alternate location for
rndc\&.key\&.
.RE
.PP
\-h
.RS4
Prints a short summary of the options and arguments to
\fBrndc\-confgen\fR\&.
.RE
.PP
\-k \fIkeyname\fR
.RS4
Specifies the key name of the rndc authentication key\&. This must be a valid domain name\&. The default is
\fBrndc\-key\fR\&.
.RE
.PP
\-p \fIport\fR
.RS4
Specifies the command channel port where
\fBnamed\fR
listens for connections from
\fBrndc\fR\&. The default is 953\&.
.RE
.PP
\-s \fIaddress\fR
.RS4
Specifies the IP address where
\fBnamed\fR
listens for command channel connections from
\fBrndc\fR\&. The default is the loopback address 127\&.0\&.0\&.1\&.
.RE
.PP
\-t \fIchrootdir\fR
.RS4
Used with the
\fB\-a\fR
option to specify a directory where
\fBnamed\fR
will run chrooted\&. An additional copy of the
rndc\&.key
will be written relative to this directory so that it will be found by the chrooted
\fBnamed\fR\&.
.RE
.PP
\-u \fIuser\fR
.RS4
Used with the
\fB\-a\fR
option to set the owner of the
rndc\&.key
file generated\&. If
\fB\-t\fR
is also specified only the file in the chroot area has its owner changed\&.
.RE
.SH"EXAMPLES"
.PP
To allow
\fBrndc\fR
to be used with no manual configuration, run
.PP
\fBrndc\-confgen \-a\fR
.PP
To print a sample
rndc\&.conf
file and corresponding
\fBcontrols\fR
and
\fBkey\fR
statements to be manually inserted into
named\&.conf, run
.PP
\fBrndc\-confgen\fR
.SH"SEE ALSO"
.PP
\fBrndc\fR(8),
\fBrndc.conf\fR(5),
\fBnamed\fR(8),
BIND 9 Administrator Reference Manual\&.
.SH"AUTHOR"
.PP
\fBInternet Systems Consortium, Inc\&.\fR
.SH"COPYRIGHT"
.br
Copyright \(co 2001, 2003-2005, 2007, 2009, 2013-2018 Internet Systems Consortium, Inc. ("ISC")
is a tool for sending DNS queries and validating the results, using the same internal resolver and validator logic as
\fBnamed\fR\&.
.PP
\fBdelv\fR
will send to a specified name server all queries needed to fetch and validate the requested data; this includes the original requested query, subsequent queries to follow CNAME or DNAME chains, and queries for DNSKEY, DS and DLV records to establish a chain of trust for DNSSEC validation\&. It does not perform iterative resolution, but simulates the behavior of a name server configured for DNSSEC validating and forwarding\&.
.PP
By default, responses are validated using built\-in DNSSEC trust anchor for the root zone ("\&.")\&. Records returned by
\fBdelv\fR
are either fully validated or were not signed\&. If validation fails, an explanation of the failure is included in the output; the validation process can be traced in detail\&. Because
\fBdelv\fR
does not rely on an external server to carry out validation, it can be used to check the validity of DNS responses in environments where local name servers may not be trustworthy\&.
.PP
Unless it is told to query a specific name server,
\fBdelv\fR
will try each of the servers listed in
/etc/resolv\&.conf\&. If no usable server addresses are found,
\fBdelv\fR
will send queries to the localhost addresses (127\&.0\&.0\&.1 for IPv4, ::1 for IPv6)\&.
.PP
When no command line arguments or options are given,
\fBdelv\fR
will perform an NS query for "\&." (the root zone)\&.
.SH"SIMPLE USAGE"
.PP
A typical invocation of
\fBdelv\fR
looks like:
.sp
.ifn\{\
.RS4
.\}
.nf
delv @server name type
.fi
.ifn\{\
.RE
.\}
.sp
where:
.PP
\fBserver\fR
.RS4
is the name or IP address of the name server to query\&. This can be an IPv4 address in dotted\-decimal notation or an IPv6 address in colon\-delimited notation\&. When the supplied
\fIserver\fR
argument is a hostname,
\fBdelv\fR
resolves that name before querying that name server (note, however, that this initial lookup is
\fInot\fR
validated by DNSSEC)\&.
.sp
If no
\fIserver\fR
argument is provided,
\fBdelv\fR
consults
/etc/resolv\&.conf; if an address is found there, it queries the name server at that address\&. If either of the
\fB\-4\fR
or
\fB\-6\fR
options are in use, then only addresses for the corresponding transport will be tried\&. If no usable addresses are found,
\fBdelv\fR
will send queries to the localhost addresses (127\&.0\&.0\&.1 for IPv4, ::1 for IPv6)\&.
.RE
.PP
\fBname\fR
.RS4
is the domain name to be looked up\&.
.RE
.PP
\fBtype\fR
.RS4
indicates what type of query is required \(em ANY, A, MX, etc\&.
\fItype\fR
can be any valid query type\&. If no
\fItype\fR
argument is supplied,
\fBdelv\fR
will perform a lookup for an A record\&.
.RE
.SH"OPTIONS"
.PP
\-a \fIanchor\-file\fR
.RS4
Specifies a file from which to read DNSSEC trust anchors\&. The default is
/etc/bind\&.keys, which is included with
BIND
9 and contains one or more trust anchors for the root zone ("\&.")\&.
.sp
Keys that do not match the root zone name are ignored\&. An alternate key name can be specified using the
\fB+root=NAME\fR
options\&. DNSSEC Lookaside Validation can also be turned on by using the
\fB+dlv=NAME\fR
to specify the name of a zone containing DLV records\&.
.sp
Note: When reading the trust anchor file,
\fBdelv\fR
treats
\fBmanaged\-keys\fR
statements and
\fBtrusted\-keys\fR
statements identically\&. That is, for a managed key, it is the
\fIinitial\fR
key that is trusted; RFC 5011 key management is not supported\&.
\fBdelv\fR
will not consult the managed\-keys database maintained by
\fBnamed\fR\&. This means that if either of the keys in
/etc/bind\&.keys
is revoked and rolled over, it will be necessary to update
/etc/bind\&.keys
to use DNSSEC validation in
\fBdelv\fR\&.
.RE
.PP
\-b \fIaddress\fR
.RS4
Sets the source IP address of the query to
\fIaddress\fR\&. This must be a valid address on one of the host\*(Aqs network interfaces or "0\&.0\&.0\&.0" or "::"\&. An optional source port may be specified by appending "#<port>"
.RE
.PP
\-c \fIclass\fR
.RS4
Sets the query class for the requested data\&. Currently, only class "IN" is supported in
\fBdelv\fR
and any other value is ignored\&.
.RE
.PP
\-d \fIlevel\fR
.RS4
Set the systemwide debug level to
\fBlevel\fR\&. The allowed range is from 0 to 99\&. The default is 0 (no debugging)\&. Debugging traces from
\fBdelv\fR
become more verbose as the debug level increases\&. See the
\fB+mtrace\fR,
\fB+rtrace\fR, and
\fB+vtrace\fR
options below for additional debugging details\&.
.RE
.PP
\-h
.RS4
Display the
\fBdelv\fR
help usage output and exit\&.
.RE
.PP
\-i
.RS4
Insecure mode\&. This disables internal DNSSEC validation\&. (Note, however, this does not set the CD bit on upstream queries\&. If the server being queried is performing DNSSEC validation, then it will not return invalid data; this can cause
\fBdelv\fR
to time out\&. When it is necessary to examine invalid data to debug a DNSSEC problem, use
\fBdig +cd\fR\&.)
.RE
.PP
\-m
.RS4
Enables memory usage debugging\&.
.RE
.PP
\-p \fIport#\fR
.RS4
Specifies a destination port to use for queries instead of the standard DNS port number 53\&. This option would be used with a name server that has been configured to listen for queries on a non\-standard port number\&.
.RE
.PP
\-q \fIname\fR
.RS4
Sets the query name to
\fIname\fR\&. While the query name can be specified without using the
\fB\-q\fR, it is sometimes necessary to disambiguate names from types or classes (for example, when looking up the name "ns", which could be misinterpreted as the type NS, or "ch", which could be misinterpreted as class CH)\&.
.RE
.PP
\-t \fItype\fR
.RS4
Sets the query type to
\fItype\fR, which can be any valid query type supported in BIND 9 except for zone transfer types AXFR and IXFR\&. As with
\fB\-q\fR, this is useful to distinguish query name type or class when they are ambiguous\&. it is sometimes necessary to disambiguate names from types\&.
.sp
The default query type is "A", unless the
\fB\-x\fR
option is supplied to indicate a reverse lookup, in which case it is "PTR"\&.
.RE
.PP
\-v
.RS4
Print the
\fBdelv\fR
version and exit\&.
.RE
.PP
\-x \fIaddr\fR
.RS4
Performs a reverse lookup, mapping an addresses to a name\&.
\fIaddr\fR
is an IPv4 address in dotted\-decimal notation, or a colon\-delimited IPv6 address\&. When
\fB\-x\fR
is used, there is no need to provide the
\fIname\fR
or
\fItype\fR
arguments\&.
\fBdelv\fR
automatically performs a lookup for a name like
11\&.12\&.13\&.10\&.in\-addr\&.arpa
and sets the query type to PTR\&. IPv6 addresses are looked up using nibble format under the IP6\&.ARPA domain\&.
.RE
.PP
\-4
.RS4
Forces
\fBdelv\fR
to only use IPv4\&.
.RE
.PP
\-6
.RS4
Forces
\fBdelv\fR
to only use IPv6\&.
.RE
.SH"QUERY OPTIONS"
.PP
\fBdelv\fR
provides a number of query options which affect the way results are displayed, and in some cases the way lookups are performed\&.
.PP
Each query option is identified by a keyword preceded by a plus sign (+)\&. Some keywords set or reset an option\&. These may be preceded by the string
no
to negate the meaning of that keyword\&. Other keywords assign values to options like the timeout interval\&. They have the form
\fB+keyword=value\fR\&. The query options are:
.PP
\fB+[no]cdflag\fR
.RS4
Controls whether to set the CD (checking disabled) bit in queries sent by
\fBdelv\fR\&. This may be useful when troubleshooting DNSSEC problems from behind a validating resolver\&. A validating resolver will block invalid responses, making it difficult to retrieve them for analysis\&. Setting the CD flag on queries will cause the resolver to return invalid responses, which
\fBdelv\fR
can then validate internally and report the errors in detail\&.
.RE
.PP
\fB+[no]class\fR
.RS4
Controls whether to display the CLASS when printing a record\&. The default is to display the CLASS\&.
.RE
.PP
\fB+[no]ttl\fR
.RS4
Controls whether to display the TTL when printing a record\&. The default is to display the TTL\&.
.RE
.PP
\fB+[no]rtrace\fR
.RS4
Toggle resolver fetch logging\&. This reports the name and type of each query sent by
\fBdelv\fR
in the process of carrying out the resolution and validation process: this includes including the original query and all subsequent queries to follow CNAMEs and to establish a chain of trust for DNSSEC validation\&.
.sp
This is equivalent to setting the debug level to 1 in the "resolver" logging category\&. Setting the systemwide debug level to 1 using the
\fB\-d\fR
option will product the same output (but will affect other logging categories as well)\&.
.RE
.PP
\fB+[no]mtrace\fR
.RS4
Toggle message logging\&. This produces a detailed dump of the responses received by
\fBdelv\fR
in the process of carrying out the resolution and validation process\&.
.sp
This is equivalent to setting the debug level to 10 for the "packets" module of the "resolver" logging category\&. Setting the systemwide debug level to 10 using the
\fB\-d\fR
option will produce the same output (but will affect other logging categories as well)\&.
.RE
.PP
\fB+[no]vtrace\fR
.RS4
Toggle validation logging\&. This shows the internal process of the validator as it determines whether an answer is validly signed, unsigned, or invalid\&.
.sp
This is equivalent to setting the debug level to 3 for the "validator" module of the "dnssec" logging category\&. Setting the systemwide debug level to 3 using the
\fB\-d\fR
option will produce the same output (but will affect other logging categories as well)\&.
.RE
.PP
\fB+[no]short\fR
.RS4
Provide a terse answer\&. The default is to print the answer in a verbose form\&.
.RE
.PP
\fB+[no]comments\fR
.RS4
Toggle the display of comment lines in the output\&. The default is to print comments\&.
.RE
.PP
\fB+[no]rrcomments\fR
.RS4
Toggle the display of per\-record comments in the output (for example, human\-readable key information about DNSKEY records)\&. The default is to print per\-record comments\&.
.RE
.PP
\fB+[no]crypto\fR
.RS4
Toggle the display of cryptographic fields in DNSSEC records\&. The contents of these field are unnecessary to debug most DNSSEC validation failures and removing them makes it easier to see the common failures\&. The default is to display the fields\&. When omitted they are replaced by the string "[omitted]" or in the DNSKEY case the key id is displayed as the replacement, e\&.g\&. "[ key id = value ]"\&.
.RE
.PP
\fB+[no]trust\fR
.RS4
Controls whether to display the trust level when printing a record\&. The default is to display the trust level\&.
.RE
.PP
\fB+[no]split[=W]\fR
.RS4
Split long hex\- or base64\-formatted fields in resource records into chunks of
\fIW\fR
characters (where
\fIW\fR
is rounded up to the nearest multiple of 4)\&.
\fI+nosplit\fR
or
\fI+split=0\fR
causes fields not to be split at all\&. The default is 56 characters, or 44 characters when multiline mode is active\&.
.RE
.PP
\fB+[no]all\fR
.RS4
Set or clear the display options
\fB+[no]comments\fR,
\fB+[no]rrcomments\fR, and
\fB+[no]trust\fR
as a group\&.
.RE
.PP
\fB+[no]multiline\fR
.RS4
Print long records (such as RRSIG, DNSKEY, and SOA records) in a verbose multi\-line format with human\-readable comments\&. The default is to print each record on a single line, to facilitate machine parsing of the
\fBdelv\fR
output\&.
.RE
.PP
\fB+[no]dnssec\fR
.RS4
Indicates whether to display RRSIG records in the
\fBdelv\fR
output\&. The default is to do so\&. Note that (unlike in
\fBdig\fR) this does
\fInot\fR
control whether to request DNSSEC records or whether to validate them\&. DNSSEC records are always requested, and validation will always occur unless suppressed by the use of
\fB\-i\fR
or
\fB+noroot\fR
and
\fB+nodlv\fR\&.
.RE
.PP
\fB+[no]root[=ROOT]\fR
.RS4
Indicates whether to perform conventional (non\-lookaside) DNSSEC validation, and if so, specifies the name of a trust anchor\&. The default is to validate using a trust anchor of "\&." (the root zone), for which there is a built\-in key\&. If specifying a different trust anchor, then
\fB\-a\fR
must be used to specify a file containing the key\&.
.RE
.PP
\fB+[no]dlv[=DLV]\fR
.RS4
Indicates whether to perform DNSSEC lookaside validation, and if so, specifies the name of the DLV trust anchor\&. The
\fB\-a\fR
option must also be used to specify a file containing the DLV key\&.
.RE
.PP
\fB+[no]tcp\fR
.RS4
Controls whether to use TCP when sending queries\&. The default is to use UDP unless a truncated response has been received\&.
.RE
.PP
\fB+[no]unknownformat\fR
.RS4
Print all RDATA in unknown RR type presentation format (RFC 3597)\&. The default is to print RDATA for known types in the type\*(Aqs presentation format\&.
.RE
.SH"FILES"
.PP
/etc/bind\&.keys
.PP
/etc/resolv\&.conf
.SH"SEE ALSO"
.PP
\fBdig\fR(1),
\fBnamed\fR(8),
RFC4034,
RFC4035,
RFC4431,
RFC5074,
RFC5155\&.
.SH"AUTHOR"
.PP
\fBInternet Systems Consortium, Inc\&.\fR
.SH"COPYRIGHT"
.br
Copyright \(co 2014-2018 Internet Systems Consortium, Inc. ("ISC")
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.