The hash table rework MRs (!3865, !3871) increased the default RBT hash
table size from 64 to 65,536 entries (for 64-bit architectures, that is
512 bytes before vs. 524,288 bytes after). This works fine for RBTs
used for cache databases, but since three separate RBT databases are
created for every zone loaded (RRs, NSEC, NSEC3), memory usage would
skyrocket when BIND 9 is used as an authoritative DNS server with many
zones.
The default RBT hash table size before the rework was 64 entries, this
commit reduces it to 16 entries because our educated guess is that most
zones are just couple of entries (SOA, NS, A, AAAA, MX) and rehashing
small hash tables is actually cheap. The rework we did in the previous
MRs tries to avoid growing the hash tables for big-to-huge caches where
growing the hash table comes at a price because the whole cache needs to
be locked.
(cherry picked from commit 1e043a011b)
(cherry picked from commit f0ccc17f30)
When pk11_numbits() is passed a user provided input that contains all
zeroes (via crafted DNS message), it would crash with assertion
failure. Fix that by properly handling such input.
QNAME minimization is normally disabled when forwarding. if, in the
course of processing a fetch, we switch back to normal recursion at
some point, we can't safely start minimizing because we may have
been left in an inconsistent state.
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.
This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
The only arm64 runner we have at our disposal is suffering from
intermittent connectivity issues which make it unusable for extended
periods of time. Remove arm64 jobs from GitLab CI until we manage to
set up an arm64 runner with more reliable connectivity.
(cherry picked from commit 49f245f7c0)
The named configuration files used in the "geoip2" system test cause a
rather large number of views (6-8) to be set up in each tested named
instance. Each view has its own cache.
Commit aa72c31422 caused the RBT hash
table to be pre-allocated to a size derived from "max-cache-size", so
that it never needs to be rehashed. The size of that hash table is not
expected to be significant enough to cause memory use issues in typical
conditions even for large "max-cache-size" settings.
However, these two factors combined can cause memory exhaustion issues
in GitLab CI, where we run multiple "instances" of the test suite in
parallel on the same runner, each test suite executes multiple system
tests concurrently, and each system test may potentially start multiple
named instances at the same time. In practice, this problem currently
only seems to be affecting the "geoip2" system test, which is failing
intermittently due to named instances used by that test getting killed
by oom-killer.
Prevent the "geoip2" system test from failing intermittently by setting
"max-cache-size" in named configuration files used in that test to a low
value in order to keep memory usage at bay even with a large number of
views configured.
(cherry picked from commit 4292d5bdfe)
In 9.17 we introduced 'primaries' as a synonym for 'masters' in the
configuration file. This synonym has not been backported so change
the serve-stale test to make use of the 'masters' keyword.
Add a fifth named (ns5) that runs with `stale-cache-enable no;` and
check that there are no stale records in the cache.
(cherry picked from commit abc2ab9223)
When a received RRSet has TTL 0, they would be preserved for
serve-stale (default `max-stale-cache` is 12 hours) rather than expiring
them quickly from the cache database.
This commit makes sure the RRSet didn't have TTL 0 before marking the
entry in the database as "stale".
(cherry picked from commit 6ffa2ddae0)
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).
This allows DNS operators to turn the serve-stale answers in an event of
large authoritative DNS outage. The caching of the stale answers needs
to be enabled before the outage happens or the feature would be
otherwise useless.
The negative consequence of the default setting is the inevitable
cache-bloat that happens for every and each DNS operator running named.
In this MR, a new configuration option `stale-cache-enable` is
introduced that allows the operators to selectively enable or disable
the serve-stale feature of BIND 9 based on their decision.
The newly introduced option has been disabled by default,
e.g. serve-stale is disabled in the default configuration and has to be
enabled if required.
(cherry picked from commit ce53db34d6)
Now that the log message has been printed set the result code to
DNS_R_FORMERR. We don't do this via dns_result_torcode() as we
don't want upstream errors to produce FORMERR if that processing
end with DNS_R_BADTSIG.
(cherry picked from commit 20488d6ad3)
The basic scenario for the problem was that in the process of
resolving a query, if any rrset was eligible for prefetching, then it
would trigger a call to query_prefetch(), this call would run in
parallel to the normal query processing.
The problem arises due to the fact that both query_prefetch(), and,
in the original thread, a call to ns_query_recurse(), try to attach
to the recursionquota, but recursing client stats counter is only
incremented if ns_query_recurse() attachs to it first.
Conversely, if fetch_callback() is called before prefetch_done(),
it would not only detach from recursionquota, but also decrement
the stats counter, if query_prefetch() attached to te quota first
that would result in a decrement not matched by an increment, as
expected.
To solve this issue an atomic bool was added, it is set once in
ns_query_recurse(), allowing fetch_callback() to check for it
and decrement stats accordingly.
For a more compreensive explanation check the thread comment below:
https://gitlab.isc.org/isc-projects/bind9/-/issues/1719#note_145857
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.
(cherry picked from commit a0f7d28967)
Running system tests with root privileges is potentially dangerous.
Only allow it when explicitly requested (by building with
--enable-developer).
(cherry picked from commit 3ef106f69d)
Created isc_refcount_decrement_expect macro to test conditionally
the return value to ensure it is in expected range. Converted
unchecked isc_refcount_decrement to use isc_refcount_decrement_expect.
Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect.
(cherry picked from commit bde5c7632a)
When silencing the Coverity warning in remove_old_tsversions(), the code
was refactored to reduce the indentation levels and break down the long
code into individual functions. This improve fix for [GL #1989].
(cherry picked from commit aca18b8b5b)
Change the dns_hash_name() and dns_hash_fullname() functions to use
isc_hash32() as the maximum hashtable size in rbt is 0..UINT32_MAX
large.
(cherry picked from commit a9182c89a6)
As the names suggest the original isc_hash64 function returns 64-bit
long hash values and the isc_hash32() returns 32-bit values.
(cherry picked from commit f59fd49fd8)
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, so until we fix the code
to reuse the OpenSSL contexts and keys we'll use our own implementation of
siphash instead of trying to integrate with OpenSSL.
(cherry picked from commit 21d751dfc7)
There were several problems with rbt hashtable implementation:
1. Our internal hashing function returns uint64_t value, but it was
silently truncated to unsigned int in dns_name_hash() and
dns_name_fullhash() functions. As the SipHash 2-4 higher bits are
more random, we need to use the upper half of the return value.
2. The hashtable implementation in rbt.c was using modulo to pick the
slot number for the hash table. This has several problems because
modulo is: a) slow, b) oblivious to patterns in the input data. This
could lead to very uneven distribution of the hashed data in the
hashtable. Combined with the single-linked lists we use, it could
really hog-down the lookup and removal of the nodes from the rbt
tree[a]. The Fibonacci Hashing is much better fit for the hashtable
function here. For longer description, read "Fibonacci Hashing: The
Optimization that the World Forgot"[b] or just look at the Linux
kernel. Also this will make Diego very happy :).
3. The hashtable would rehash every time the number of nodes in the rbt
tree would exceed 3 * (hashtable size). The overcommit will make the
uneven distribution in the hashtable even worse, but the main problem
lies in the rehashing - every time the database grows beyond the
limit, each subsequent rehashing will be much slower. The mitigation
here is letting the rbt know how big the cache can grown and
pre-allocate the hashtable to be big enough to actually never need to
rehash. This will consume more memory at the start, but since the
size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
will only consume maximum of 32GB of memory for hashtable in the
worst case (and max-cache-size would need to be set to more than
4TB). Calling the dns_db_adjusthashsize() will also cap the maximum
size of the hashtable to the pre-computed number of bits, so it won't
try to consume more gigabytes of memory than available for the
database.
FIXME: What is the average size of the rbt node that gets hashed? I
chose the pagesize (4k) as initial value to precompute the size of
the hashtable, but the value is based on feeling and not any real
data.
For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.
Notes:
a. A doubly linked list should be used here to speedup the removal of
the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
(cherry picked from commit e24bc324b4)
When named acting as a resolver connects to an authoritative server over
TCP, it sets the idle timeout for that connection to 20 seconds. This
fixed timeout was picked back when the default processing timeout for
each client query was hardcoded to 30 seconds. Commit
000a8970f8 made this processing timeout
configurable through "resolver-query-timeout" and decreased its default
value to 10 seconds, but the idle TCP timeout was not adjusted to
reflect that change. As a result, with the current defaults in effect,
a single hung TCP connection will consistently cause the resolution
process for a given query to time out.
Set the idle timeout for connected TCP sockets to half of the client
query processing timeout configured for a resolver. This allows named
to handle hung TCP connections more robustly and prevents the timeout
mismatch issue from resurfacing in the future if the default is ever
changed again.
(cherry picked from commit 953d704bd2)
Whenever an exact match is found by dns_rbt_findnode(),
the highest level node in the chain will not be put into
chain->levels[] array, but instead the chain->end
pointer will be adjusted to point to that node.
Suppose we have the following entries in a rpz zone:
example.com CNAME rpz-passthru.
*.example.com CNAME rpz-passthru.
A query for www.example.com would result in the
following chain object returned by dns_rbt_findnode():
chain->level_count = 2
chain->level_matches = 2
chain->levels[0] = .
chain->levels[1] = example.com
chain->levels[2] = NULL
chain->end = www
Since exact matches only care for testing rpz set bits,
we need to test for rpz wild bits through iterating the nodechain, and
that includes testing the rpz wild bits in the highest level node found.
In the case of an exact match, chain->levels[chain->level_matches]
will be NULL, to address that we must use chain->end as the start point,
then iterate over the remaining levels in the chain.
server might be created, but not yet fully initialized, when fatal
function is called. Check both server and task before attaching
exclusive task.
(cherry picked from commit c5e7152cf0)
Fix Coverity CHECKED_RETURN reports for dst_key_getbool(). In most
cases we do not really care about its return value, but it is prudent
to check it.
In one case, where a dst_key_getbool() error should be treated
identically as success, cast the return value to void and add a relevant
comment.
(cherry picked from commit e645d2ef1e)
Our GitLab Runner Custom executor scripts now use the "image" key
instead of the job name for determining the QCOW2 image to use for a
given CI job. Update .gitlab-ci.yml to reflect that change.
(cherry picked from commit 72201badf0)
Check that resign interval is actually in days rather than hours
by checking that RRSIGs are all within the allowed day range.
(cherry picked from commit 11ecf7901b)
Since October 2019 I have had complaints from `dnssec-cds` reporting
that the signatures on some of my test zones had expired. These were
zones signed by BIND 9.15 or 9.17, with a DNSKEY TTL of 24h and
`sig-validity-interval 10 8`.
This is the same setup we have used for our production zones since
2015, which is intended to re-sign the zones every 2 days, keeping
at least 8 days signature validity. The SOA expire interval is 7
days, so even in the presence of zone transfer problems, no-one
should ever see expired signatures. (These timers are a bit too
tight to be completely correct, because I should have increased
the expiry timers when I increased the DNSKEY TTLs from 1h to 24h.
But that should only matter when zone transfers are broken, which
was not the case for the error reports that led to this patch.)
For example, this morning my test zone contained:
dev.dns.cam.ac.uk. 86400 IN RRSIG DNSKEY 13 5 86400 (
20200701221418 20200621213022 ...)
But one of my resolvers had cached:
dev.dns.cam.ac.uk. 21424 IN RRSIG DNSKEY 13 5 86400 (
20200622063022 20200612061136 ...)
This TTL was captured at 20200622105807 so the resolver cached the
RRset 64976 seconds previously (18h02m56s), at 20200621165511
only about 12h before expiry.
The other symptom of this error was incorrect `resign` times in
the output from `rndc zonestatus`.
For example, I have configured a test zone
zone fast.dotat.at {
file "../u/z/fast.dotat.at";
type primary;
auto-dnssec maintain;
sig-validity-interval 500 499;
};
The zone is reset to a minimal zone containing only SOA and NS
records, and when `named` starts it loads and signs the zone. After
that, `rndc zonestatus` reports:
next resign node: fast.dotat.at/NS
next resign time: Fri, 28 May 2021 12:48:47 GMT
The resign time should be within the next 24h, but instead it is
near the signature expiry time, which the RRSIG(NS) says is
20210618074847. (Note 499 hours is a bit more than 20 days.)
May/June 2021 is less than 500 days from now because expiry time
jitter is applied to the NS records.
Using this test I bisected this bug to 09990672d which contained a
mistake leading to the resigning interval always being calculated in
hours, when days are expected.
This bug only occurs for configurations that use the two-argument form
of `sig-validity-interval`.
(cherry picked from commit 030674b2a3)
it was possible for the count_newzones() function to try to
unlock view->new_zone_lock on return before locking it, which
caused a crash on shutdown.
(cherry picked from commit ed37c63e2b)
If too many versions of log / dnstap files to be saved where requests
the memory after to_keep could be overwritten. Force the number of
versions to be saved to a save level. Additionally the memmove length
was incorrect.
(cherry picked from commit 6ca78bc57d)
When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views. Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:
1. mdb_env_open(envA) is called when the view is first created.
2. "rndc reconfig" is called.
3. mdb_env_open(envB) is called for the new instance of the view.
4. mdb_env_close(envA) is called for the old instance of the view.
This seems to have worked so far. However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).
Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.
To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead. Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps). Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.
[1] 2fd44e3251
(cherry picked from commit 53120279b5)
The ThreadSanitizer found a data race when updating the stale header.
Instead of trying to acquire the write lock and failing occasionally
which would skew the statistics, the dns_rdatasetheader_t.attributes
field has been promoted to use stdatomics. Updating the attributes in
the mark_header_ancient() and mark_header_stale() now uses the cmpxchg
to update the attributes forfeiting the need to hold the write lock on
the tree. Please note that mark_header_ancient() still needs to hold
the lock because .dirty is being updated in the same go.
(cherry picked from commit 81d4230e60)
The stdatomic shims for non-C11 compilers (Windows, old gcc, ...) and
mutexatomic implemented only and minimal subset of the atomic types.
This commit adds 16-bit operations for Windows and all atomic types as
defined in standard.
(cherry picked from commit bccea5862d)
BUFSIZ (512 bytes on Windows) may not be enough to fit the status of a
DNSSEC policy and three DNSSEC keys.
Set the size of the relevant buffer to a hardcoded value of 4096 bytes,
which should be enough for most scenarios.
(cherry picked from commit 9347e7db7e)
While the creation and publication times of the various keys
in this policy are nearly at the same time there is a chance that
one key is created a second later than the other.
The `set_keytimes_algorithm_policy` mistakenly set the keytimes
for KEY3 based of the "published" time from KEY2.
(cherry picked from commit 24e07ae98e)
Commit b580eb2fb3 inadvertently caused the
man pages for symlinked BIND tools (named-compilezone, tsig-keygen) to
no longer be installed by "make install". Fix by restoring the commands
which ensure that.
Commit b580eb2fb3 inadvertently caused
dnstap-related man pages to be installed unconditionally. Ensure they
are only installed for dnstap-enabled builds.
When we're coming back from recursion fetch_callback does not accept
DNS_R_NXDOMAIN as an rcode - query_gotanswer calls query_nxdomain in
which an assertion fails on qctx->is_zone. Yet, under some
circumstances, qname minimization will return an DNS_R_NXDOMAIN - when
root zone mirror is not yet loaded. The fix changes the DNS_R_NXDOMAIN
answer to DNS_R_SERVFAIL.
This test ensures that named will correctly shutdown
when receiving multiple control connections after processing
of either "rncd stop" or "kill -SIGTERM" commands.
Before the fix, named was crashing due to a race condition happening
between two threads, one running shutdown logic in named/server.c
and other handling control logic in controlconf.c.
This test tries to reproduce the above scenario by issuing multiple
queries to a target named instance, issuing either rndc stop or kill
-SIGTERM command to the same named instance, then starting multiple rndc
status connections to ensure it is not crashing anymore.
(cherry picked from commit 042e509753)
Due to lack of synchronization, whenever named was being requested to
stop using rndc, controlconf.c module could be trying to access an already
released pointer through named_g_server->interfacemgr in a separate
thread.
The race could only be triggered if named was being shutdown and more
rndc connections were ocurring at the same time.
This fix correctly checks if the server is shutting down before opening
a new rndc connection.
(cherry picked from commit be6cc53ec2)
When a library is examined, an object file within it can be left out
of the link if it does not provide symbols that the symbol table
needs. Introducing `isc_stdtime_tostring` caused a build failure for
`update_test` because it now requires `libisc.a(stdtime.o)` and that
also exports the `isc_stdtime_get` symbol, meaning we have a
multiple definition error.
Add a local version of `isc_stdtime_tostring`, so that the linker
will not search for it in available object files.
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.
Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.
(cherry picked from commit 19ce9ec1d4)
Add the code and documentation required to provide DNSSEC signing
status through rndc. This does not yet show any useful information,
just provide the command that will output some dummy string.
(cherry picked from commit e1ba1bea7c)
I'd like to use the same functionality (pretty print the datetime
of keytime metadata) in the 'rndc dnssec -status' command. So it is
better that this logic is done in a separate function.
Since the stdtime.c code have differernt files for unix and win32,
I think the "#ifdef WIN32" define can be dropped.
(cherry picked from commit 9e03f8e8fe)
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.
in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().
(cherry picked from commit 23c7373d68)
Make yaml.load_all() use yaml.SafeLoader to address a warning currently
emitted when bin/tests/system/dnstap/ydump.py is run:
ydump.py:28: YAMLLoadWarning: calling yaml.load_all() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
for l in yaml.load_all(f.stdout):
The `rndc` argument was always overridden by the static configuration,
because the logic for handling the number of dnstap files to retain
was both backwards and a bit redundant.
(cherry picked from commit 7c07129a51)
The rndc.conf main header was missing the header markup and that was
breaking the TOC for all manpages in the ARM because sphinx-build
incorrectly remembered the markup for subheader to be ~~~~ instead of
----.
(cherry picked from commit 5c56a0ddbc)
The "krb5-devel" package on openSUSE Tumbleweed installs the
"krb5-config" binary into a custom prefix, which prevents BIND's
"configure" script from autodetecting it. Fix by specifying the path to
the "krb5-config" binary using --with-gssapi.
(cherry picked from commit 1be15f5900)
With Clang 10.0.0 on FreeBSD 11.4, compiling lib/dns/spnego.c triggers
the following warnings:
spnego.c:361:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
return (GSS_S_DEFECTIVE_TOKEN);
^
/usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
#define GSS_S_DEFECTIVE_TOKEN (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
^
spnego.c:366:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
return (GSS_S_DEFECTIVE_TOKEN);
^
/usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
#define GSS_S_DEFECTIVE_TOKEN (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
^
spnego.c:371:12: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
return (GSS_S_DEFECTIVE_TOKEN);
^
/usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
#define GSS_S_DEFECTIVE_TOKEN (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
^
spnego.c:376:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
return (GSS_S_DEFECTIVE_TOKEN);
^
/usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
#define GSS_S_DEFECTIVE_TOKEN (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
^
spnego.c:380:11: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
return (GSS_S_DEFECTIVE_TOKEN);
^
/usr/include/gssapi/gssapi.h:423:41: note: expanded from macro 'GSS_S_DEFECTIVE_TOKEN'
#define GSS_S_DEFECTIVE_TOKEN (9ul << GSS_C_ROUTINE_ERROR_OFFSET)
^
5 errors generated.
Address by replacing all instances of the GSS_S_DEFECTIVE_TOKEN constant
with a boolean value. Invert the values returned by cmp_gss_type() so
that its only call site reads more naturally in the context of the
comment preceding it.
The wait until zones are signed after rndc reconfig is broken
because the zones are already signed before the reconfig. Fix
by having a different way to ensure the signing of the zone is
complete. This does require a call to the "wait_for_done_signing"
function after each "check_keys" call after the ns6 reconfig.
The "wait_for_done_signing" looks for a (newly added) debug log
message that named will output if it is done signing with a certain
key.
(cherry picked from commit a47192ed5b)
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process - otherwise we might destroy the
callback argument before actually stopping listening, and call
the callback on a bad memory.
isc__nm_tcpdns_send() was not asynchronous and accessed socket
internal fields in an unsafe manner, which could lead to a race
condition and subsequent crash. Fix it by moving the whole tcpdns
processing to a proper netmgr thread.
Add a note why we don't have a test case for the issue.
It is tricky to write a good test case for this if our tools are
not allowed to create signatures for unsupported algorithms.
(cherry picked from commit c6345fffe9)
Assign and then check node for NULL to address another thread
changing radix->head in the meantime.
Move 'node != NULL' check into while loop test to silence cppcheck
false positive.
Fix pointer != NULL style.
(cherry picked from commit 51f08d2095)
enough buffer space. Also change named_os_uname() prototype so
that it is now returning (const char *) rather than (char *). If
uname() is not supported on a UNIX build prepopulate unamebuf[]
with "unknown architecture".
(cherry picked from commit 4bc3de070f)
these keywords were added to the parser as synonyms for "master"
and "slave" but were never hooked in to the configuration of named,
so they were ignored. this has been fixed and the option is now
checked for correctness.
(cherry picked from commit ba31b189b4)
RBTDB node can now appear on the deadnodes lists following the changes
to decrement_reference in 176b23b6cd to
defer checking of node->down when the tree write lock is not held. The
node should be unlinked instead.
(cherry picked from commit 569cc155b8680d8ed12db1fabbe20947db24a0f9)
NS_CLIENT_TCP_BUFFER_SIZE was 2 byte too large following the
move to netmgr add associated changes to lib/ns/client.c and
as a result an INSIST could be trigger if the DNS message being
constructed had a checkpoint stage that fell in those two extra
bytes. Adjusted NS_CLIENT_TCP_BUFFER_SIZE and cleaned up
client_allocsendbuf now that the previously reserved 2 bytes
are no longer used.
(cherry picked from commit 5a92af19b7dce684b0e6670ae6ec1c4c58613263)
The unittest.sh script tried to execute the unit tests when cmocka
development libraries was available, but kyua, the execution engine,
was not. Now, both need to be installed in the system.
The ThreadSanitizer uses system synchronization primitives to check for
data race. The netmgr handle->references was missing acquire memory
barrier before resetting and reusing the memory occupied by isc_nmhandle_t.
(cherry picked from commit 1013c0930e)
- clone keynode->dsset rather than return a pointer so that thread
use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
while in use.
- create a new keynode when removing DS records so that dangling
pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
when DS records are added.
(cherry picked from commit e5b2eca1d3)
There's a possibility of a race in TCP accepting code:
T1 accepts a connection C1
T2 accepts a connection C2
T1 tries to accept a connection C3, but we hit a quota,
isc_quota_cb_init() sets quota_accept_cb for the socket,
we return from accept_connection
T2 drops C2, but we race in quota_release with accepting C3 so
we don't see quota->waiting is > 0, we don't launch the callback
T1 accepts a connection C4, we are able to get the quota we clear
the quota_accept_cb from sock->quotacb
T1 drops C1, tries to call the callback which is zeroed, sigsegv.
Adjust the script for the GitLab CI job building release tarballs to
account for the changes in the documentation building process introduced
by the migration to Sphinx.
In process_fd we lock sock->lock and then internal_accept locks mgr->lock,
in isc_sockmgr_render* functions we lock mgr->lock and then lock sock->lock,
that can cause a deadlock when accessing stats. Unlock sock->lock early in
all the internal_{send,recv,connect,accept} functions instead of late
in process_fd.
We were passing client address to dns_resolver_createfetch as a pointer
and it was saved as a pointer. The client (with its address) could be
gone before the fetch is finished, and in a very odd scenario
log_formerr would call isc_sockaddr_format() which first checks if the
address family is valid (and at this point it still is), then the
sockaddr is cleared, and then isc_netaddr_fromsockaddr is called which
fails an assertion as the address family is now invalid.
(cherry picked from commit 175c4d9055)
DS records only belong at delegation points and if present
at the zone apex are invariably the result of administrative
errors. Additionally they can't be queried for with modern
resolvers as the parent servers will be queried.
(cherry picked from commit 35a58d30c9)
With netmgr we're creating separate socket for each IPv6 interface,
just as with IPv4 - update documentation accordingly.
(cherry picked from commit 6a2100034b)
To indicate the SoftHSM version used in each CI job while avoiding the
need to add another token to job names, replace "pkcs11" with
"softhsm2.4" and "fedora31:amd64" with "softhsm2.6".
(cherry picked from commit c7169c4ab0)
Various SoftHSM versions differ in algorithm support. Since Fedora
tends to have the latest SoftHSM version available in its stock package
repositories, enable PKCS#11 support in Fedora jobs to test multiple
SoftHSM versions in GitLab CI.
(cherry picked from commit 3ecb202ba3)
It might be possible some pending task would run when kserver is already
cleaned up. Postpone gsstsig structures cleanup after task and timer
managers are destroyed. No pending threads are possible after it.
Make action in maybeshutdown only if doshutdown was not already called.
Might be called from getinput event.
(cherry picked from commit 2685e69be8)
The release notes were previously built as a separate document
(including the PDF version). It was agreed that this doesn't make much
sense, so the release notes are now included only as an appendix to the
BIND 9 ARM.
(cherry picked from commit 8eb2323ec3)
When we made BIND 9 libraries private to BIND 9, we forgot to remove the
libdns section on "export" libraries from the ARM.
(cherry picked from commit 3637c466c9)
This includes reorganization of the lists of RFCs supported by BIND 9.
I included all the RFCs and notes from the list identified by Vicky in
any DNS-related RFCs written by current ISC engineers, on the assumption
that BIND would comply with them.
(cherry picked from commit 8ca7f22671)
As a leftover from old TCP accept code isc_uv_import passed TCP_SERVER
flag when importing a socket on Windows.
Since now we're importing/exporting accepted connections it needs to
pass TCP_CONNECTION flag.
(cherry picked from commit 801f7af6e9)
Instead of using bind() and passing the listening socket to the children
threads using uv_export/uv_import use one thread that does the accepting,
and then passes the connected socket using uv_export/uv_import to a random
worker. The previous solution had thundering herd problems (all workers
waking up on one connection and trying to accept()), this one avoids this
and is simpler.
The tcp clients quota is simplified with isc_quota_attach_cb - a callback
is issued when the quota is available.
(cherry picked from commit 60629e5b0b)
The Danger script inspects differences between the current version of a
given merge request's target branch and the merge request branch. If
the latter falls behind the former, the Danger script will wrongly warn
about missing GitLab/RT identifiers because it incorrectly treats the
"+++" diff marker as an indication of the merge request adding new lines
to a file. Tweak the relevant conditional expression to prevent such
invalid warnings from being raised.
(cherry picked from commit e062812c38)
As GitLab Runner Docker executor caches Git repositories between jobs,
prevent the Danger script from attempting to update local refs to ensure
"git fetch" returns with an exit code of 0. Use the FETCH_HEAD ref for
determining the differences between the merge request branch and its
target branch.
(cherry picked from commit d558c4cb78)
Commits adding CHANGES entries and/or release notes do not need a commit
log message. Do not warn about a missing commit log message for such
commits to make the warning more meaningful.
(cherry picked from commit c13944ca46)
In case of a test failure we weren't tearing down sockets and tasks
properly, causing the test to hang instead of failing nicely.
(cherry picked from commit 4a8d9250cf)
The SO_INCOMING_CPU is available since Linux 3.19 for getting the value,
but only since Linux 4.4 for setting the value (see below for a full
description). BIND 9 should not fail when setting the option on the
socket fails, as this is only an optimization and not hard requirement
to run BIND 9.
SO_INCOMING_CPU (gettable since Linux 3.19, settable since Linux 4.4)
Sets or gets the CPU affinity of a socket. Expects an integer flag.
int cpu = 1;
setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, sizeof(cpu));
Because all of the packets for a single stream (i.e., all
packets for the same 4-tuple) arrive on the single RX queue that
is associated with a particular CPU, the typical use case is to
employ one listening process per RX queue, with the incoming
flow being handled by a listener on the same CPU that is
handling the RX queue. This provides optimal NUMA behavior and
keeps CPU caches hot.
(cherry picked from commit 4ec357da0a)
Originally, the default value for max-stale-ttl was 1 week, which could
and in some scenarios lead to cache exhaustion on a busy resolvers.
Picking the default value will always be juggling between value that's
useful (e.g. keeping the already cached records after they have already
expired and the upstream name servers are down) and not bloating the
cache too much (e.g. keeping everything for a very long time). The new
default reflects what we think is a reasonable to time to react on both
sides (upstream authoritative and downstream recursive).
(cherry picked from commit 13fd3ecfab)
When creating the successor, the current active key (predecessor)
should change its goal state to HIDDEN.
Also add two useful debug logs in the keymgr_key_rollover function.
(cherry picked from commit e71d60299f)
Catch a case where if the prepublication time of the successor key
is later than the retire time of the predecessor. If that is the
case we should prepublish as soon as possible, a.k.a. now.
(cherry picked from commit c08d0f7dd6)
The `dns_keymgr_run()` function became quite long, put the logic
that looks if a new key needs to be created (start a key rollover)
in a separate function.
(cherry picked from commit bcf8192438)
The logic in `keymgr_key_has_successor(key, keyring)` is flawed, it
returns true if there is any key in the keyring that has a successor,
while what we really want here is to make sure that the given key
has a successor in the given keyring.
Rather than relying on `keymgr_key_exists_with_state`, walk the
list of keys in the keyring and check if the key is a successor of
the given predecessor key.
(cherry picked from commit 0d578097ef)
The usage of 'date -d' in the kasp system test is not portable,
replace with a python script. Also remove some leftover
"set_keytime 'yes'" calls.
(cherry picked from commit 5b3decaf48)
This improves keytime testing on algorithm rollover. It now
tests for specific times, and also tests for SyncPublish and
Removed keytimes.
(cherry picked from commit 61c1040ae5)
This improves keytime testing on CSK rollover. It now
tests for specific times, and also tests for SyncPublish and
Removed keytimes.
Since an "active key" for ZSK and KSK means something
different, this makes it tricky to decide when a CSK is
active. An "active key" intuitively means the key is signing
so we say a CSK is active when it is creating zone signatures.
This change means a lot of timings for the CSK rollover tests
need to be adjusted.
The keymgr code needs a slight change on calculating the
prepublication time: For a KSK we need to include the parent
registration delay, but for CSK we look at the zone signing
property and stick with the ZSK prepublication calculation.
(cherry picked from commit e233433772)
This improves keytime testing on KSK rollover. It now
tests for specific times, and also tests for SyncPublish and
Removed keytimes.
(cherry picked from commit 649d0833ce)
Registration delay is not part of the Iret retire interval, thus
removed from the calculation when setting the Delete time metadata.
Include the registration delay in prepublication time, because
we need to prepublish the key sooner than just the Ipub
publication interval.
(cherry picked from commit 50bbbb76a8)
This improves keytime testing on ZSK rollover. It now
tests for specific times, and also tests for SyncPublish and
Removed keytimes.
(cherry picked from commit e01fcbbaf8)
This improves keytime testing for enabling DNSSEC. It now
tests for specific times, and also tests for SyncPublish.
(cherry picked from commit cf51c87fad)
This commit adds testing keytiming metadata. In order to facilitate
this, the kasp system test undergoes a few changes:
1. When finding a key file, rather than only saving the key ID,
also save the base filename and creation date with `key_save`.
These can be used later to set expected key times.
2. Add a test function `set_addkeytime` that takes a key, which
keytiming to update, a datetime in keytiming format, and a number
(seconds) to add, and sets the new time in the given keytime
parameter of the given key. This is used to set the expected key
times.
3. Split `check_keys` in `check_keys` and `check_keytimes`. First we
need to find the keyfile before we can check the keytimes.
We need to retrieve the creation date (and sometimes other
keytimes) to determine the other expected key times.
4. Add helper functions to set the expected key times per policy.
This avoids lots of duplication.
Check for keytimes for the first test cases (all that do not cover
rollovers).
(cherry picked from commit f8e34b57b4)
After removing dnssec-settime calls that set key rollover
relationship, we can adjust the counts in test output filenames.
Also fix a couple of more wrong counts in output filenames.
(cherry picked from commit 8204e31f0e)
Using dnssec-setttime after dnssec-keygen in the kasp system test
can lead to off by one second failures, so reduce the usage of
dnssec-settime in the setup scripts. This commit deals with
setting the key rollover relationship (predecessor/successor).
(cherry picked from commit 5a590c47a5)
In the kasp system test, we are going to set the keytimes on
dnssec-keygen so we can test them against the key creation time.
This prevents off by one second in the test, something that can
happen if you set those times with dnssec-settime after
dnssec-keygen.
Also fix some test output filenames.
(cherry picked from commit 637d5f9a68)
While kasp relies on key states to determine when a key needs to
be published or be used for signing, the keytimes are used by
operators to get some expectation of key publication and usage.
Update the code such that these keytimes are set appropriately.
That means:
- Print "PublishCDS" and "DeleteCDS" times in the state files.
- The keymgr sets the "Removed" and "PublishCDS" times and derives
those from the dnssec-policy.
- Tweak setting of the "Retired" time, when retiring keys, only
update the time to now when the retire time is not yet set, or is
in the future.
This also fixes a bug in "keymgr_transition_time" where we may wait
too long before zone signatrues become omnipresent or hidden. Not
only can we skip waiting the sign delay Dsgn if there is no
predecessor, we can also skip it if there is no successor.
Finally, this commit moves setting the lifetime, reducing two calls
to one.
(cherry picked from commit 18dc27afd3)
For testing purposes mainly, we want to allow set keytimings on
generated keys, such that we don't have to "keygen/settime" which
can result in one second off times.
(cherry picked from commit 1c21631730)
Certain rules of the BIND development process are not codified anywhere
and/or are used inconsistently. In an attempt to improve this
situation, add a GitLab CI job which uses Danger Python to add comments
to merge requests when certain expectations are not met. Two categories
of feedback are used, only one of which - fail() - causes the GitLab CI
job to fail. Exclude dangerfile.py from Python QA checks as the way the
contents of that file are evaluated triggers a lot of Flake8 and PyLint
warnings.
(cherry picked from commit 36bb45a8b6)
The ARM and the manpages have been converted into Sphinx documentation
format.
Sphinx uses reStructuredText as its markup language, and many of its
strengths come from the power and straightforwardness of
reStructuredText and its parsing and translating suite, the Docutils.
(cherry picked from commit 9fb6d11abb)
While harmless on Linux, missing isc_{mutex,conditional}_destroy
causes a memory leak on *BSD. Missing calls were added.
(cherry picked from commit a8807d9a7b)
in addition to being more efficient, this prevents a possible crash by
looking up the node name before the tree sructure can be changed when
cleaning up dead nodes in addrdataset().
(cherry picked from commit db9d10e3c1)
Some operating systems (e.g. CentOS, OpenBSD) install the main pytest
script as "py.test-3". Add that name to the list of names passed to
AC_PATH_PROGS() in order for pytest to be properly detected on a broader
range of operating systems.
(cherry picked from commit d5562a3e7e)
Use str.format() instead of f-strings in Python system tests to enable
them to work on Python 3 versions older than 3.6 as the latter is not
available on some operating systems used in GitLab CI that are still
actively supported (CentOS 6, Debian 9, Ubuntu 16.04).
(cherry picked from commit 5562c38ffb)
As Python QA tools, BIND system test prerequisites, and documentation
building utilities are now all included in operating system images used
in GitLab CI, do not use pip for installing them in each CI job any
more.
(cherry picked from commit e3c217296d)
Add a system test that counts how many address fetches are made
for different numbers of NS records and checks that the number
are successfully limited.
If there are more that 5 NS record for a zone only perform a
maximum of 4 address lookups for all the name servers. This
limits the amount of remote lookup performed for server
addresses at each level for a given query.
as the update triggers by the rndc command to clear the signing records
may not have completed by the time the subsequent rndc command to test
that the records have been removed is commenced. Loop several times to
prevent false negative.
(cherry picked from commit 353018c0e5)
cppcheck 2.0 reports false positives about uninitialized variables in a
lot of places throughout BIND source code, e.g.:
bin/dnssec/dnssec-cds.c:282:6: error: Uninitialized variable: length [uninitvar]
if (isc_buffer_availablelength(&buf) <= len) {
^
Apparently cppcheck 2.0 has issues with processing (&var)->field syntax,
which is what the macros from lib/isc/include/isc/buffer.h are evaluated
to. This issue was reported upstream [1] and will hopefully be
addressed in a future cppcheck release.
In the meantime, to avoid modifying BIND source code in multiple places
just because of a static checker false positive, work around the issue
by adding intermediate variables to buffer macro definitions using a sed
invocation in the cppcheck job script.
[1] https://sourceforge.net/p/cppcheck/discussion/general/thread/122153e3c1/
(cherry picked from commit 481fa34e50)
Add whitespace to the regular expression used for extracting the GCC
version from "gcc --version" output so that it works properly with
multi-digit major version numbers.
(cherry picked from commit 3b48eec79f)
Commit 691c8f6828 broke the cppcheck job
in GitLab CI: when cppcheck fails, the script is immediately
interrupted, preventing cppcheck-htmlreport from being run. To ensure
the HTML report is generated when cppcheck fails, revert to invoking
cppcheck-htmlreport in the "after_script" part of the job.
(cherry picked from commit cb2037ee9d)
There a race between when the delta is logged and when the
server returns signed record. Retry the queries if the
lookups fail to meet expectations.
(cherry picked from commit 46c4e5d96f)
Although in util/api-checker.sh we create textual reports, we don't
preserve them in job artifacts, but we should.
We don't want to keep all HTML pages present in the project root, but
just those produced by ABI checker.
(cherry picked from commit b5ccf95b0a)
The `statschannel/ns2/` was missing `manykeys.db.in`, but the test
succeeded even when `setup.sh` (or `clean.sh`) failed to execute. This
commit makes run.sh to run in stricter mode and fail the test
immediately when `clean.sh` or `setup.sh` fails.
(cherry picked from commit 8b357a35d2)
Originally, every library and binaries got linked to everything, which
creates unnecessary overlinking. This wasn't as straightforward as it
should be as we still support configuration without libtool for 9.16.
Couple of smaller issues related to include headers and an issue where
sanitizer overload dlopen and dlclose symbols, so we were getting false
negatives in the autoconf test.
Underlinking states for the situation when a binary uses a symbol not provided
by libraries it is directly linked to. The libns was not linked to libisc and
libdns, and libirs was not linked to libisc, libdns and libisccfg) while using
symbols from these libraries directly.
a4f0281962 is a flawed backport of
cf5105939c - it retained the original
invocation of bin/tests/system/stop.pl in bin/tests/system/run.sh. This
results in the former script being called twice for each system test,
which does not cause problems on Unix systems, but triggers false
positives about named instances dying prematurely on Windows. Fix by
removing the offending invocation of bin/tests/system/stop.pl from
bin/tests/system/run.sh.
the CHECK() macro resets result, so an error code from an earlier
view could be erased if the last view loaded had no errors.
(cherry picked from commit 7e73660206)
The SO_REUSEPORT socket option on Linux means something else on BSD
based systems. On FreeBSD there's 1:1 option SO_REUSEPORT_LB, so we can
use that.
(cherry picked from commit 09ba47b067)
The introduction of netmgr doubled the number of threads from which
dnstap data may be logged: previously, it could only happen from within
taskmgr worker threads; with netmgr, it can happen both from taskmgr
worker threads and from network threads. Since the argument passed to
fstrm_iothr_options_set_num_input_queues() was not updated to reflect
this change, some calls to fstrm_iothr_get_input_queue() can now return
NULL, effectively preventing some dnstap data from being logged.
Whether this bug is triggered or not depends on thread scheduling order
and packet distribution between network threads, but will almost
certainly be triggered on any recursive resolver sooner or later. Fix
by requesting the correct number of dnstap input queues to be allocated.
(cherry picked from commit 77dc091855)
Per Current Mechanisms 2.3.5, the curve name is DER-encoded in the
EC_PARAMS attribute, and the public key value is DER-encoded in the
EC_POINT attribute.
(cherry picked from commit 2e6b7a56cc)
Previously, the code would do:
REQUIRE(alg == CURVE1 || alg == CURVE2);
[...]
if (alg == CURVE1) { /* code for CURVE1 */ }
else { /* code for CURVE2 */ }
This approach is less extensible and also more prone to errors in case
the initial REQUIRE() is forgotten. The code has been refactored to
use:
REQUIRE(alg == CURVE1 || alg == CURVE2);
[...]
switch (alg) {
case CURVE1: /* code for CURVE1 */; break;
case CURVE2: /* code for CURVE2 */; break;
default: INSIST(0);
}
(cherry picked from commit cf30e7d0d1)
The pk11/constants.h header contained static CK_BYTE arrays and
we had to use #defines to pull only those we need. This commit
changes the constants to only define byte arrays with the content
and either use them directly or define the CK_BYTE arrays locally
where used.
(cherry picked from commit da38bd0e1d)
nzf_append is conditionally compiled and this is intended to
catch error introduced by changes to the called functions on all
systems before the changes are run through the CI.
(cherry picked from commit a66c6fc883)
Test zones with various escape sequences and filesystem seperator
characters.
* escaped double quote (\")
* escaped escape (\\)
* escaped decimal byte value (\032)
* slash seperator (/)
(cherry picked from commit 5ab9b5b1e6)
When we were printing quoted string, the double quotes where unescaped
leading to prematurely ending the quoted string.
(cherry picked from commit b02081d423)
The system tests currently uses patchwork of shell scripts which doesn't
offer proper error handling.
This commit introduced option to write new tests in pytest framework
that also allows easier manipulation of DNS traffic (using dnspython),
native XML and JSON manipulation and proper error reporting.
(cherry picked from commit cf5105939c)
named_os_openfile was being called with switch_user set to true
unconditionally leading to log messages about being unable to
switch user identity from named when regenerating the key.
(cherry picked from commit 071bc29962)
When running on Linux and system capabilities are available, named will
drop the extra capabilities before loading the configuration. This led
to spurious warnings from `seteuid()` because named already dropped
CAP_SETUID and CAP_GETUID capabilities.
The fix removes setting the effective uid/gid when capabilities are
available, and adds a check that we are running under the user we were
requested to run.
(cherry picked from commit 6c82e2af92)
The <isc/md.h> header directly included <openssl/hmac.h> header which
enforced all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace, we no longer enforce this.
In the long run, this might also allow us to switch cryptographic
library implementation without affecting the downstream users.
(cherry picked from commit 70100c664a)
The two "functions" that isc/safe.h declared before were actually simple
defines to matching OpenSSL functions. The downside of the approach was
enforcing all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace changing the defines into
simple functions, we no longer enforce this. In the long run, this
might also allow us to switch cryptographic library implementation
without affecting the downstream users.
(cherry picked from commit ab827ab5bf)
The <isc/md.h> header directly included <openssl/evp.h> header which
enforced all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace, we no longer enforce this.
In the long run, this might also allow us to switch cryptographic
library implementation without affecting the downstream users.
While making the isc_md_type_t type opaque, the API using the data type
was changed to use the pointer to isc_md_type_t instead of using the
type directly.
(cherry picked from commit 4e114f8ed6)
Right before the release API version (LIBINTERFACE, LIBREVISION, LIBAGE)
for older and newer libraries tends to be the same. Given that, commit
hash can't be the determining factor here, Unix time of the commit
should suit us better and is placed after the API version. The commit
hash is preserved as it's useful to see it in the actual report.
(cherry picked from commit 8e3e2836b0)
this addresses a race that could occur during shutdown or when
reconfiguring to remove RPZ zones.
this change should ensure that the rpzs structure and the incremental
updates don't interfere with each other: rpzs->zones entries cannot
be set to NULL while an update quantum is running, and the
task should be destroyed and its queue purged so that no subsequent
quanta will run.
(cherry picked from commit 286e8cd7ea)
Coverity showed that the return value of `dst_key_gettime` was
unchecked in INITIALIZE_STATE. If DST_TIME_CREATED was not set we
would set the state to be initialized to a weird last changed time.
This would normally not happen because DST_TIME_CREATED is always
set. However, we would rather set the time to now (as the comment
also indicates) not match the creation time.
The comment on INITIALIZE_STATE also needs updating as we no
longer always initialize to HIDDEN.
(cherry picked from commit 564f9dca35)
Revert the change from ad03c22e97 as
further testing has shown that with hyper-threading disabled, named with
ISC rwlocks outperforms named with pthread rwlocks in cold cache testing
scenarios. Since building named with pthread rwlocks might still be a
better choice for some workloads, keep the compile-time option which
enables that.
(cherry picked from commit 17101fd093)
Add two tests that checks that dynamic zones
can be updated and will be signed appropriately.
One zone covers an update with freeze/thaw, the
other covers an update through nsupdate.
(cherry picked from commit e3aa12fc0a)
When dnssec-policy was introduced, it implicitly set inline-signing.
But DNSSEC maintenance required either inline-signing to be enabled,
or a dynamic zone. In other words, not in all cases you want to
DNSSEC maintain your zone with inline-signing.
Change the behavior and determine whether inline-signing is
required: if the zone is dynamic, don't use inline-signing,
otherwise implicitly set it.
You can also explicitly set inline-signing to yes with dnssec-policy,
the restriction that both inline-signing and dnssec-policy cannot
be set at the same time is now lifted.
However, 'inline-signing no;' on a non-dynamic zone with a
dnssec-policy is not possible.
(cherry picked from commit 644f0d958a)
Change 5332 renamed "dnssec-keys" configuration statement to the
more descriptive "trust-anchors". Not all occurrences in the
documentation had been updated.
(cherry picked from commit 7c6dde024155585008e9bfd09c03722d69211d02)
The yamlget.py file was changed in !3311 as part of making the
python code pylint and flake8 compliant. This omitted setting
'item' to 'item[key]' which caused the digdelv yaml tests to fail.
Also, the pretty printing is not really necessary, so remove
the "if key not in item; print error" logic.
(cherry picked from commit 464d0417d1)
All our MSVS Project files share the same intermediate directory. We
know that this doesn't cause any problems, so we can just disable the
detection in the project files.
Example of the warning:
warning MSB8028: The intermediate directory (.\Release\) contains files shared from another project (dnssectool.vcxproj). This can lead to incorrect clean and rebuild behavior.
(cherry picked from commit b6c2012d93)
There was a missing indirection for the pluginlist_cb_t *callback in the
declaration of the cfg_pluginlist_foreach function. Reported by MSVC as:
lib\isccfg\parser.c(4057): warning C4028: formal parameter 4 different from declaration
(cherry picked from commit 4ffe725585)
Due to a way the stdatomic.h shim is implemented on Windows, the MSVC
always things that the outside type is the largest - atomic_(u)int_fast64_t.
This can lead to false positives as this one:
lib\dns\adb.c(3678): warning C4477: 'fprintf' : format string '%u' requires an argument of type 'unsigned int', but variadic argument 2 has type 'unsigned __int64'
We workaround the issue by loading the value in a scoped local variable
with correct type first.
(cherry picked from commit 60c632ab91)
MSVC documentation states: "This warning can be caused when a pointer to
a const or volatile item is assigned to a pointer not declared as
pointing to const or volatile."
Unfortunately, this happens when we dynamically allocate and deallocate
block of atomic variables using isc_mem_get and isc_mem_put.
Couple of examples:
lib\isc\hp.c(134): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
lib\isc\hp.c(144): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
lib\isc\stats.c(55): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
lib\isc\stats.c(87): warning C4090: 'function': different 'volatile' qualifiers [C:\builds\isc-projects\bind9\lib\isc\win32\libisc.vcxproj]
(cherry picked from commit 063e05491b)
The InterlockedOr8() and InterlockedAnd8() first argument was cast
to (atomic_int_fast8_t) instead of (atomic_int_fast8_t *), this was
reported by MSVC as:
warning C4024: '_InterlockedOr8': different types for formal and actual parameter 1
warning C4024: '_InterlockedAnd8': different types for formal and actual parameter 1
(cherry picked from commit 54168d55c0)
Our vcxproj files set the WarningLevel to Level3, which is too verbose
for a code that needs to be portable. That basically leads to ignoring
all the errors that MSVC produces. This commits downgrades the
WarningLevel to Level1 and enables treating warnings as errors for
Release builds. For the Debug builds the WarningLevel got upgraded to
Level4, and treating warnings as errors is explicitly disabled.
We should eventually make the code clean of all MSVC warnings, but it's
a long way to go for Level4, so it's more reasonable to start at Level1.
For reference[1], these are the warning levels as described by MSVC
documentation:
* /W0 suppresses all warnings. It's equivalent to /w.
* /W1 displays level 1 (severe) warnings. /W1 is the default setting
in the command-line compiler.
* /W2 displays level 1 and level 2 (significant) warnings.
* /W3 displays level 1, level 2, and level 3 (production quality)
warnings. /W3 is the default setting in the IDE.
* /W4 displays level 1, level 2, and level 3 warnings, and all level 4
(informational) warnings that aren't off by default. We recommend
that you use this option to provide lint-like warnings. For a new
project, it may be best to use /W4 in all compilations. This option
helps ensure the fewest possible hard-to-find code defects.
* /Wall displays all warnings displayed by /W4 and all other warnings
that /W4 doesn't include — for example, warnings that are off by
default.
* /WX treats all compiler warnings as errors. For a new project, it
may be best to use /WX in all compilations; resolving all warnings
ensures the fewest possible hard-to-find code defects.
1. https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level?view=vs-2019
(cherry picked from commit 789d253e3d)
Windows BIND releases produced by GitLab CI are built from Git
repositories, not from release tarballs, which means the "srcid" file is
not present in the top source directory when MSBuild is invoked. This
causes the Git commit hash for such builds to be set to "unset_id".
Enable win32utils/Configure to try determining the commit hash for a
build by invoking Git on the build host if the "srcid" file is not
present (which is what its Unix counterpart does).
(cherry picked from commit 05e13e7caf)
Our python code didn't adhere to any coding standard. In this commit, we add
flame8 (https://pypi.org/project/flake8/), and pylint (https://www.pylint.org/).
There's couple of exceptions:
- ans.py scripts are not checked, nor fixed as part of this MR
- pylint's missing-*-docstring and duplicate-code checks have
been disabled via .pylintrc
Both exceptions should be removed in due time.
(cherry picked from commit ee534592e3)
The assembly code generated by MSVC for at least some signed comparisons
involving atomic variables incorrectly uses unsigned conditional jumps
instead of signed ones. In particular, the checks in isc_log_wouldlog()
are affected in a way which breaks logging on Windows and thus also all
system tests involving a named instance. Work around the issue by
assigning the values returned by atomic_load_acquire() calls in
isc_log_wouldlog() to local variables before performing comparisons.
(cherry picked from commit 4c4f5cccaa)
Increate the DNSKEY TTL of the migrate.kasp zone for the following
reason: The key states are initialized depending on the timing
metadata. If a key is present long enough in the zone it will be
initialized to OMNIPRESENT. Long enough here is the time when it
was published (when the setup script was run) plus DNSKEY TTL.
Otherwise it is set to RUMOURED, or to HIDDEN if no timing metadata
is set or the time is still in the future.
Since the TTL is "only" 5 minutes, the DNSKEY state may be
initialized to OMNIPRESENT if the test is slow, but we expect it
to be in RUMOURED state. If we increase the TTL to a couple of
hours it is very unlikely that it will be initialized to something
else than RUMOURED.
(cherry picked from commit 04e6711029)
This fixes another intermittent failure in the kasp system test.
It does not happen often, except for in the Windows platform tests
where it takes a long time to run the tests.
In the "kasp" system test, there is an "rndc reconfig" call which
triggers a new rekey event. check_next_key_event() verifies the time
remaining from the moment "rndc reconfig" is called until the next key
event. However, the next key event time is calculated from the key
times provided during key creation (i.e. during test setup). Given
this, if "rndc reconfig" is called a significant amount of time after
the test is started, some check_next_key_event() checks will fail.
Fix by calculating the time passed since the start of the test and
when 'rndc reconfig' happens. Substract this time from the
calculated next key event.
This only needs to be done after an "rndc reconfig" on zones where
the keymgr needs to wait for a period of time (for example for keys
to become OMNIPRESENT, or HIDDEN). This is on step 2 and step 5 of
the algorithm rollover. In step 2 there is a waiting period before
the DNSKEY is OMNIPRESENT, in step 5 there is a waiting period
before the DNSKEY is HIDDEN.
In step 1 new keys are created, in step 3 and 4 key states just
entered OMNIPRESENT, and in step 6 we no longer care because the
key lifetime is unlimited and we default to checking once per hour.
Regardless of our indifference about the next key event after step 6,
change some of the key timings in the setup script to better
reflect reality: DNSKEY is in HIDDEN after step 5, DS times have
changed when the new DS became active.
(cherry picked from commit 62a97570b8)
This test asserts that option "deny-answer-aliases" works correctly
when forwarding requests.
As a matter of example, the behavior expected for a forwarder BIND
instance, having an option such as deny-answer-aliases { "domain"; }
is that when forwarding a request for *.anything-but-domain, it is
expected that it will return SERVFAIL if any answer received has a CNAME
for "*.domain".
(cherry picked from commit 9bdb960a16a69997b08746e698b6b02c8dc6c795)
BIND wasn't honoring option "deny-answer-aliases" when configured to
forward queries.
Before the fix it was possible for nameservers listed in "forwarders"
option to return CNAME answers pointing to unrelated domains of the
original query, which could be used as a vector for rebinding attacks.
The fix ensures that BIND apply filters even if configured as a forwarder
instance.
(cherry picked from commit af6a4de3d5ad6c1967173facf366e6c86b3ffc28)
In case of normal fetch, the .recursionquota is attached and
ns_statscounter_recursclients is incremented when the fetch is created. Then
the .recursionquota is detached and the counter decremented in the
fetch_callback().
In case of prefetch or rpzfetch, the quota is attached, but the counter is not
incremented. When we reach the soft-quota, the function returns early but don't
detach from the quota, and it gets destroyed during the ns_client_endrequest(),
so no memory was leaked.
But because the ns_statscounter_recursclients is only incremented during the
normal fetch the counter would be incorrectly decremented on two occassions:
1) When we reached the softquota, because the quota was not properly detached
2) When the prefetch or rpzfetch was cancelled mid-flight and the callback
function was never called.
(cherry picked from commit 78886d4bed)
tcpdns used transport-specific functions to operate on the outer socket.
Use generic ones instead, and select the proper call in netmgr.c.
Make the missing functions (e.g. isc_nm_read) generic and add type-specific
calls (isc__nm_tcp_read). This is the preparation for netmgr TLS layer.
(cherry picked from commit 5fedd21e16)
Rather than group key ids together, group key id with its
corresponding counters. This should make growing / shrinking easier
than having keyids then counters.
(cherry picked from commit eb6a8b47d7)
Add a statschannel test case for DNSSEC sign metrics that has more
keys than there are allocated stats counters for. This will produce
gibberish, but at least it should not crash.
(cherry picked from commit 31e8b2b13c)
The first attempt to add DNSSEC sign statistics was naive: for each
zone we allocated 64K counters, twice. In reality each zone has at
most four keys, so the new approach only has room for four keys per
zone. If after a rollover more keys have signed the zone, existing
keys are rotated out.
The DNSSEC sign statistics has three counters per key, so twelve
counters per zone. First counter is actually a key id, so it is
clear what key contributed to the metrics. The second counter
tracks the number of generated signatures, and the third tracks
how many of those are refreshes.
This means that in the zone structure we no longer need two separate
references to DNSSEC sign metrics: both the resign and refresh stats
are kept in a single dns_stats structure.
Incrementing dnssecsignstats:
Whenever a dnssecsignstat is incremented, we look up the key id
to see if we already are counting metrics for this key. If so,
we update the corresponding operation counter (resign or
refresh).
If the key is new, store the value in a new counter and increment
corresponding counter.
If all slots are full, we rotate the keys and overwrite the last
slot with the new key.
Dumping dnssecsignstats:
Dumping dnssecsignstats is no longer a simple wrapper around
isc_stats_dump, but uses the same principle. The difference is that
rather than dumping the index (key tag) and counter, we have to look
up the corresponding counter.
(cherry picked from commit 705810d577)
The rwlock introduced to protect the .logconfig member of isc_log_t
structure caused a significant performance drop because of the rwlock
contention. It was also found, that the debug_level member of said
structure was not protected from concurrent read/writes.
The .dynamic and .highest_level members of isc_logconfig_t structure
were actually just cached values pulled from the assigned channels.
We introduced an even higher cache level for .dynamic and .highest_level
members directly into the isc_log_t structure, so we don't have to
access the .logconfig member in the isc_log_wouldlog() function.
(cherry picked from commit 3a24eacbb6)
Add a test to ensure migration from 'auto-dnssec maintain;' to
dnssec-policy works even if the algorithm is changed. The existing
keys should not be removed immediately, but their goal should be
changed to become hidden, and the new keys with the different
algorithm should be introduced immediately.
(cherry picked from commit 551acb44f4)
If we initialize goals on all keys, superfluous keys that match
the policy all desire to be active. For example, there are six
keys available for a policy that needs just two, we only want to
set the goal state to OMNIPRESENT on two keys, not six.
(cherry picked from commit 2389fcb4dc)
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly. Earlier commit deals with migration where existing keys
match the policy. This commit deals with migration where existing
keys do not match the policy. In that case, named must not
immediately delete the existing keys, but gracefully roll to the
dnssec-policy.
However, named did remove the existing keys immediately. This is
because the legacy key states were initialized badly. Because
those keys had their states initialized to HIDDEN or RUMOURED, the
keymgr decides that they can be removed (because only when the key
has its states in OMNIPRESENT it can be used safely).
The original thought to initialize key states to HIDDEN (and
RUMOURED to deal with existing keys) was to ensure that those keys
will go through the required propagation time before the keymgr
decides they can be used safely. However, those keys are already
in the zone for a long time and making the key states represent
otherwise is dangerous: keys may be pulled out of the zone while
in fact they are required to establish the chain of trust.
Fix initializing key states for existing keys by looking more closely
at the time metadata. Add TTL and propagation delays to the time
metadata and see if the DNSSEC records have been propagated.
Initialize the state to OMNIPRESENT if so, otherwise initialize to
RUMOURED. If the time metadata is in the future, or does not exist,
keep initializing the state to HIDDEN.
The added test makes sure that new keys matching the policy are
introduced, but existing keys are kept in the zone until the new
keys have been propagated.
(cherry picked from commit 7f43520893)
A few kasp system test tweaks to improve test failure debugging and
deal with tests related to migration to dnssec-policy.
1. When clearing a key, set lifetime to "none". If "none", skip
expect no lifetime set in the state file. Legacy keys that
are migrated but don't match the dnssec-policy will not have a
lifetime.
2. The kasp system test prints which key id and file it is checking.
Log explicitly if we are checking the id or a file.
3. Add quotes around "ID" when setting the key id, for consistency.
4. Fix a typo (non -> none).
5. Print which key ids are found, this way it is easier to see what
KEY[1-4] failed to match one of the key files.
(cherry picked from commit a224754d59)
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly. Several adjustments in the keymgr are required to get it right:
- Set published time on keys when we calculate prepublication time.
This is not strictly necessary, but it is weird to have an active
key without the published time set.
- Initalize key states also before matching keys. Determine the
target state by looking at existing time metadata: If the time
data is set and is in the past, it is a hint that the key and
its corresponding records have been published in the zone already,
and the state is initialized to RUMOURED. Otherwise, initialize it
as HIDDEN. This fixes migration to dnssec-policy from existing
keys.
- Initialize key goal on keys that match key policy to OMNIPRESENT.
These may be existing legacy keys that are being migrated.
- A key that has its goal to OMNIPRESENT *or* an active key can
match a kasp key. The code was changed with CHANGE 5354 that
was a bugfix to prevent creating new KSK keys for zones in the
initial stage of signing. However, this caused problems for
restarts when rollovers are in progress, because an outroducing
key can still be an active key.
The test for this introduces a new KEY property 'legacy'. This is
used to skip tests related to .state files.
(cherry picked from commit 6801899134)
After an RPZ zone is updated via zone transfer, the RPZ summary
database is updated, inserting the newly added names in the policy
zone and deleting the newly removed ones. The first part of this
was quantized so it would not run too long and starve other tasks
during large updates, but the second part was not quantized, so
that an update in which a large number of records were deleted
could cause named to become briefly unresponsive.
(cherry picked from commit 32da119ed8)
We could have a race between handle closing and processing async
callback. Deactivate the handle before issuing the callback - we
have the socket referenced anyway so it's not a problem.
We introduce a isc_quota_attach_cb function - if ISC_R_QUOTA is returned
at the time the function is called, then a callback will be called when
there's quota available (with quota already attached). The callbacks are
organized as a LIFO queue in the quota structure.
It's needed for TCP client quota - with old networking code we had one
single place where tcp clients quota was processed so we could resume
accepting when the we had spare slots, but it's gone with netmgr - now
we need to notify the listener/accepter that there's quota available so
that it can resume accepting.
Remove unused isc_quota_force() function.
The isc_quote_reserve and isc_quota_release were used only internally
from the quota.c and the tests. We should not expose API we are not
using.
(cherry picked from commit d151a10f30)
ORACLE MySQL 8.0 has dropped the my_bool type, so we need to reinstate
it back when compiling with that version or higher. MariaDB is still
keeping the my_bool type. The numbering between the two (MariaDB 5.x
jumped to MariaDB 10.x) doesn't make the life of the developer easy.
(cherry picked from commit c6d5d5c88f)
Most build/test job names already contain a "clang", "gcc", or "msvc"
prefix which indicates the compiler used for a given job. Apply that
naming convention to all build/test job names.
(cherry picked from commit 0c898084cd)
Multiple YAML keys have identical values for both TSAN unit test job
definitions. Extract these common keys to a YAML anchor and use it in
TSAN unit test job definitions to reduce code duplication.
(cherry picked from commit 84463f33bf)
Definitions of jobs running unit tests under TSAN contain an
"after_script" YAML key. Since the "unit_test_job" anchor is included
in those job definitions before "after_script" is defined, the
job-specific value of that key overrides the one defined in the included
anchor. This prevents "kyua report-html" from being run for TSAN unit
test jobs. Moving the invocation of "kyua report-html" to the "script"
key in the "unit_test_job" anchor is not acceptable as it would cause
the exit code of that command to determine the result of all unit test
jobs and we need that to be the exit code of "make unit". Instead, add
"kyua report-html" invocations to the "after_script" key of TSAN unit
test job definitions to address the problem without affecting other job
definitions.
(cherry picked from commit 6ebce9425e)
Multiple YAML keys have identical values for both TSAN system test job
definitions. Extract these common keys to a YAML anchor and use it in
TSAN system test job definitions to reduce code duplication.
(cherry picked from commit a9aa295f1f)
Both "system_test_job" and "unit_test_job" YAML anchors contain a
"before_script" key. TSAN job definitions first specify their own value
of the "before_script" key and then include the aforementioned YAML
anchors, which results in the value of the "before_script" key being
overridden with the value specified by the included anchor. Given this,
remove "before_script" definitions specific to TSAN jobs as they serve
no practical purpose.
(cherry picked from commit 8ef01c7b50)
All assignments for the TSAN_OPTIONS variable are identical across the
entire .gitlab-ci.yml file. Define a global TSAN_OPTIONS_COMMON
variable and use it in job definitions to reduce code duplication.
(cherry picked from commit 6325c0993a)
The custom builds (oot, asan, tsan) were mostly built using Debian sid
amd64 image. The problem was that this image broke too easily, because
it's Debian "unstable" after all.
This commit introduces "base_image" that should be most stable with
extra bits on top (clang, coccinelle, cppcheck, ...). Currently, that
would be Debian buster amd64.
Other changes introduced by this commit:
* Change the default clang version to 10
* Run both ASAN and TSAN with both gcc and clang compilers
* Remove Clang Debian stretch i386 job
(cherry picked from commit 5f5721aa11)
These are mostly false positives, the clang-analyzer FAQ[1] specifies
why and how to fix it:
> The reason the analyzer often thinks that a pointer can be null is
> because the preceding code checked compared it against null. So if you
> are absolutely sure that it cannot be null, remove the preceding check
> and, preferably, add an assertion as well.
The 4 warnings reported are:
dnssec-cds.c:781:4: warning: Access to field 'base' results in a dereference of a null pointer (loaded from variable 'buf')
isc_buffer_availableregion(buf, &r);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:996:36: note: expanded from macro 'isc_buffer_availableregion'
^
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:821:16: note: expanded from macro 'ISC__BUFFER_AVAILABLEREGION'
(_r)->base = isc_buffer_used(_b); \
^~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:152:29: note: expanded from macro 'isc_buffer_used'
((void *)((unsigned char *)(b)->base + (b)->used)) /*d*/
^~~~~~~~~
1 warning generated.
--
byname_test.c:308:34: warning: Access to field 'fwdtable' results in a dereference of a null pointer (loaded from variable 'view')
RUNTIME_CHECK(dns_fwdtable_add(view->fwdtable, dns_rootname,
^~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/util.h:318:52: note: expanded from macro 'RUNTIME_CHECK'
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/error.h:50:21: note: expanded from macro 'ISC_ERROR_RUNTIMECHECK'
((void)(ISC_LIKELY(cond) || \
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/likely.h:23:43: note: expanded from macro 'ISC_LIKELY'
^
1 warning generated.
--
./rndc.c:255:6: warning: Dereference of null pointer (loaded from variable 'host')
if (*host == '/') {
^~~~~
1 warning generated.
--
./main.c:1254:9: warning: Access to field 'sctx' results in a dereference of a null pointer (loaded from variable 'named_g_server')
sctx = named_g_server->sctx;
^~~~~~~~~~~~~~~~~~~~
1 warning generated.
References:
1. https://clang-analyzer.llvm.org/faq.html#null_pointer
(cherry picked from commit ddd0d356e5)
The 3 warnings reported are:
os.c:872:7: warning: Although the value stored to 'ptr' is used in the enclosing expression, the value is never actually read from 'ptr'
if ((ptr = strtok_r(command, " \t", &last)) == NULL) {
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
--
rpz.c:1117:10: warning: Although the value stored to 'zbits' is used in the enclosing expression, the value is never actually read from 'zbits'
return (zbits &= x);
^ ~
1 warning generated.
--
openssleddsa_link.c:532:10: warning: Although the value stored to 'err' is used in the enclosing expression, the value is never actually read from 'err'
while ((err = ERR_get_error()) != 0) {
^ ~~~~~~~~~~~~~~~
1 warning generated.
(cherry picked from commit 262f087bcf)
There are several reason why remove Debian 8 from the CI:
* Debian 8 ("jessie") has been superseded by Debian 9 ("stretch").
* Regular security support updates have been discontinued as of
June 17th, 2018.
* Jessie LTS is supported from 17th June 2018 to June 30, 2020.
In other words, it's no longer officially supported by Debian security
team, but by the volunteer/paid contributor composed LTS team. And the
release will be discontinued in three months from now. We can use the
freed CI resources to bring new platforms or just to make the jobs run a
bit faster.
(cherry picked from commit 75f46cc3d1)
The environment variable MAKE has been replaced with MAKE_COMMAND,
because overriding MAKE variable also changed the definition of the MAKE
inside the Makefiles, and we want only a single wrapper around the whole
build process.
Previously, setting `MAKE` to `bear make` meant that `bear make` would
be run at every nested make invocation, which messed up the upcoming
automake transition as compile_commands.json would be generated in every
subdirectory instead of just having one central file at the top of the
build tree.
(cherry picked from commit de1a637a69)
All *:sid:amd64 jobs were errorneously copied to *:sid:arm64 including
the extra cppcheck run. Remove the extra definitions from arm64 jobs.
(cherry picked from commit 99f9e2c53e)
All jobs now use solely the newer needs configuration to declare
dependencies between jobs:
needs:
- job: <foo>
artifacts: true
instead of combination of dependencies and needs which is deprecated.
This change completely unbundles the stages (alas the stages still needs
to stay because the job graph has to stay acyclic between the stages).
(cherry picked from commit 66ba808c1b)
Updated version and CHANGES files with new release number.
Check the API files:
- lib/bind9/api:
Source code changes, but no interface changes: increment
LIBREVISION.
- lib/dns/api:
Function dns_acl_match changed, struct dns_badcache changed,
function dns_badcache_add changed, function dns_clent_startupdate
changed, struct dns_compress changed, struct dns_resolver changed,
rwlock size changed. This means a LIBINTERFACE increment.
- lib/irs/api:
Source code changes, but no interface changes: increment
LIBREVISION.
- lib/isc/api:
The structs isc__networker and isc_nmsocket changed. This means
increment LIBINTERFACE. The functions isc_uv_export and
isc_uv_import are removed, so LIBAGE must beq zero.
- lib/isccc/api:
Source code changes, but no interface changes: increment
LIBREVISION.
- lib/isccfg/api:
Source code changes, but no interface changes: increment
LIBREVISION.
- lib/ns/api:
Function ns_clientmgr_create, ns_interfacemgr_create, and
structs ns_clientmgr, ns_interface, ns_interfacemgr changed:
increment LIBINTERFACE.
No need to update README or release notes.
Updated CHANGES: Add GitLab MR reference to entry 5357. Remove
merge conflict gone wrong ("max-ixfr-ratio" is not in 9.16).
Add /util/check-make-install.in to .gitattributes.
When unit test fails, core file is created. Kyua's 'debug' command can
run GDB on it and provide backtrace. Unfortunately Kyua is picky about
location of these core files we opt to use custom Kyua fork and copy
core files from Kyua working directory to source tree and make it
available in GitLab.
(cherry picked from commit 8fad74e0e5)
In isc_log_woudlog() the .logconfig member of isc_log_t structure was
accessed unlocked on the merit that there could be just a race when
.logconfig would be NULL, so the message would not be logged. This
turned not to be true, as there's also data race deeper. The accessed
isc_logconfig_t object could be in the middle of destruction, so the
pointer would be still non-NULL, but the structure members could point
to a chunk of memory no longer belonging to the object. Since we are
only accessing integer types (the log level), this would never lead to
a crash, it leads to memory access to memory area no longer belonging to
the object and this a) wrong, b) raises a red flag in thread-safety tools.
(cherry picked from commit 4d58856ff7)
The isc_mem API now crashes on memory allocation failure, and this is
the next commit in series to cleanup the code that could fail before,
but cannot fail now, e.g. isc_result_t return type has been changed to
void for the isc_log API functions that could only return ISC_R_SUCCESS.
(cherry picked from commit 0b793166d0)
On Windows, C11 localtime_r() and gmtime_r() functions are not
available. While localtime() and gmtime() functions are already thread
safe because they use Thread Local Storage, it's quite ugly to #ifdef
around every localtime_r() and gmtime_r() usage to make the usage also
thread-safe on POSIX platforms.
The commit adds wrappers around Windows localtime_s() and gmtime_s()
functions.
NOTE: The implementation of localtime_s and gmtime_s in Microsoft CRT
are incompatible with the C standard since it has reversed parameter
order and errno_t return type.
(cherry picked from commit 08f4c7d6c0)
some empty conditional branches which contained a semicolon were
"fixed" by clang-format to contain nothing. add comments to prevent this.
(cherry picked from commit 735be3b816)
To get rid of the currently used FreeBSD-specific executor, move FreeBSD
CI jobs to libvirt-based executors. Make the necessary tag and variable
adjustments.
(cherry picked from commit 80618b5378)
Since FreeBSD 12.1 is the current FreeBSD 12.x release, replace FreeBSD
12.0 GitLab CI jobs with their up-to-date counterparts.
(cherry picked from commit 4c68b56246)
Waiting for the reply message will ensure that all messages being
looked for exist in the logs at the time of checking. When the
test was only waiting for the send message there was a race between
grep and the ns1 instance of named logging that it had seen the
request.
(cherry picked from commit a38a324442)
Save 'i' to 'locknum' and use that rather than using
'header->node->locknum' when performing the deferred
unlock as 'header->node->locknum' can theoretically be
different to 'i'.
(cherry picked from commit 8dd8d48c9f)
"max-journal-size" is set by default to twice the size of the zone
database. however, the calculation of zone database size was flawed.
- change the size calculations in dns_db_getsize() to more accurately
represent the space needed for a journal file or *XFR message to
contain the data in the database. previously we returned the sizes
of all rdataslabs, including header overhead and offset tables,
which resulted in the database size being reported as much larger
than the equivalent journal transactions would have been.
- map files caused a particular problem here: the full name can't be
determined from the node while a file is being deserialized, because
the uppernode pointers aren't set yet. so we store "full name length"
in the dns_rbtnode structure while serializing, and clear it after
deserialization is complete.
The kasp system test is timing critical. The test passes on all
Linux based machines, but fails frequently on Windows. The test
takes a lot more time on Windows and at the final checks fail
because the expected next key event is too far off. For example:
I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
I:kasp:error: bad next key event time 20909 for zone \
step2.algorithm-roll.kasp (expect 21600)
I:kasp:failed
This is because the kasp system test calculates the time when the
next key event should occur based on the policy. This assumes that
named is able to do key management within a minute. But starting,
named, doing key management for other zones, and reconfiguring takes
much more time on Windows and thus the next key event on Windows is
much shorter than anticipated.
That this happens is a good thing because this means that the
correct next key event is used, but is not so nice for testing, as
it is hard to determine how much time named needed before finishing
the current key event.
Disable the kasp test on Windows now because it is blocking the
release. We know the cause of these test failures, and it is clear
that this is a fault in the test, not the code. Therefore we feel
comfortable disabling the test right now and work on a fix while
unblocking the release.
(cherry picked from commit 4e610b7f6b)
ABI checker tools generate HTML and TXT API compatibility reports of
BIND libraries. Comparison is being done between two bind source trees
which hold built BIND.
In the CI one version is the reference version defined by
BIND_BASELINE_VERSION variable, the latter one is the HEAD of branch
under test.
(cherry picked from commit 49bc08e612)
When configuring the same dnssec-policy for two zones with the same
name but in different views, there is a race condition for who will
run the keymgr first. If running sequential only one set of keys will
be created, if running parallel two set of keys will be created.
Lock the kasp when running looking for keys and running the key
manager. This way, for the same zone in different views only one
keyset will be created.
The dnssec-policy does not implement sharing keys between different
zones.
(cherry picked from commit e0bdff7ecd)
The test case for zsk-retired was missing the actual checks. Add
them and fix the set_policy call to expect three keys.
(cherry picked from commit 2e4b55de85)
Some comments started with a lowercased letter. Capitalized them to
be more consistent with the rest of the comments.
Add some newlines between `set_*` calls and check calls, also to be
more consistent with the other test cases.
(cherry picked from commit 7e54dd74f9)
We may be checking the algorithm steps too fast: the reconfig
command may still be in progress. Make sure the zones are signed
and loaded by digging the NSEC records for these zones.
(cherry picked from commit d16520532f)
Algorithm rollover waited too long before introducing zone
signatures. It waited to make sure all signatures were resigned,
but when introducing a new algorithm, all signatures are resigned
immediately. Only add the sign delay if there is a predecessor key.
(cherry picked from commit 28506159f0)
Algorithm rollover was stuck on submitting DS because keymgr thought
it would move to an invalid state. It did not match the current
key because it checked it against the current key in the next state.
Fixed by when checking the current key, check it against the desired
state, not the existing state.
(cherry picked from commit a8542b8cab)
Add a test case for algorithm rollover. This is triggered by
changing the dnssec-policy. A new nameserver ns6 is introduced
for tests related to dnssec-policy changes.
This requires a slight change in check_next_key_event to only
check the last occurrence. Also, change the debug log message in
lib/dns/zone.c to deal with checks when no next scheduled key event
exists (and default to loadkeys interval 3600).
(cherry picked from commit 88ebe9581b)
The zone 'step6.ksk-doubleksk.autosign' is configured but is not
set up nor tested. Remove the unneeded configured zone.
(cherry picked from commit cc2afe853b)
Algorithm rollover will require four keys so introduce KEY4.
Also it requires to look at key files for multiple algorithms so
change getting key ids to be algorithm rollover agnostic (adjusting
count checks). The algorithm will be verified in check_key so
relaxing 'get_keyids' is fine.
Replace '${_alg_num}' with '$(key_get KEY[1-4] ALG_NUM)' in checks
to deal with multiple algorithms.
(cherry picked from commit 00ced2d2e7)
OpenBSD virtual machines seem to affected particularly badly by other
activity happening on the host. This causes trouble around release
time: when multiple tags are pushed to the repository, a large number of
jobs is started concurrently on all CI runners. In extreme cases, this
causes the system test suite to run for about an hour (!) on OpenBSD
VMs, with multiple tests failing. We investigated the test artifacts
for all such cases in the past and the outcome was always the same: test
failures were caused by extremely slow I/O on the guest. We tried
various tricks to work around this problem, but nothing helped.
Given the above, stop running OpenBSD system test jobs for pending BIND
releases to prevent the results of these jobs from affecting the
assessment of a given release's readiness for publication. This change
does not affect OpenBSD build jobs. OpenBSD system test jobs will still
be run for scheduled and web-requested pipelines, to make sure we catch
any severe issues with test code on that platform sooner or later.
(cherry picked from commit 7b002cea83)
There is a failure mode which gets triggered on heavily loaded
systems. A key change is scheduled in 5 seconds to make ZSK2 inactive
and ZSK3 active, but `named` takes more than 5 seconds to progress
from `rndc loadkeys` to the query check. At this time the SOA RRset
is already signed by the new ZSK which is not expected to be active
at that point yet.
Split up the checks to test the case where RRsets are signed
correctly with the offline KSK (maintained the signature) and
the active ZSK. First run, RRsets should be signed with the still
active ZSK2, second run RRsets should be signed with the new active
ZSK3.
(cherry picked from commit aebb2aaa0f)
This commit simplifies a bit the lock management within dns_resolver_prime()
and prime_done() functions by means of turning resolver's attribute
"priming" into an atomic_bool and by creating only one dependent object on the
lock "primelock", namely the "primefetch" attribute.
By having the attribute "priming" as an atomic type, it save us from having to
use a lock just to test if priming is on or off for the given resolver context
object, within "dns_resolver_prime" function.
The "primelock" lock is still necessary, since dns_resolver_prime() function
internally calls dns_resolver_createfetch(), and whenever this function
succeeds it registers an event in the task manager which could be called by
another thread, namely the "prime_done" function, and this function is
responsible for disposing the "primefetch" attribute in the resolver object,
also for resetting "priming" attribute to false.
It is important that the invariant "priming == false AND primefetch == NULL"
remains constant, so that any thread calling "dns_resolver_prime" knows for sure
that if the "priming" attribute is false, "primefetch" attribute should also be
NULL, so a new fetch context could be created to fulfill this purpose, and
assigned to "primefetch" attribute under the lock protection.
To honor the explanation above, dns_resolver_prime is implemented as follow:
1. Atomically checks the attribute "priming" for the given resolver context.
2. If "priming" is false, assumes that "primefetch" is NULL (this is
ensured by the "prime_done" implementation), acquire "primelock"
lock and create a new fetch context, update "primefetch" pointer to
point to the newly allocated fetch context.
3. If "priming" is true, assumes that the job is already in progress,
no locks are acquired, nothing else to do.
To keep the previous invariant consistent, "prime_done" is implemented as follow:
1. Acquire "primefetch" lock.
2. Keep a reference to the current "primefetch" object;
3. Reset "primefetch" attribute to NULL.
4. Release "primefetch" lock.
5. Atomically update "priming" attribute to false.
6. Destroy the "primefetch" object by using the temporary reference.
This ensures that if "priming" is false, "primefetch" was already reset to NULL.
It doesn't make any difference in having the "priming" attribute not protected
by a lock, since the visible state of this variable would depend on the calling
order of the functions "dns_resolver_prime" and "prime_done".
As an example, suppose that instead of using an atomic for the "priming" attribute
we employed a lock to protect it.
Now suppose that "prime_done" function is called by Thread A, it is then preempted
before acquiring the lock, thus not reseting "priming" to false.
In parallel to that suppose that a Thread B is scheduled and that it calls
"dns_resolver_prime()", it then acquires the lock and check that "priming" is true,
thus it will consider that this resolver object is already priming and it won't do
any more job.
Conversely if the lock order was acquired in the other direction, Thread B would check
that "priming" is false (since prime_done acquired the lock first and set "priming" to false)
and it would initiate a priming fetch for this resolver.
An atomic variable wouldn't change this behavior, since it would behave exactly the
same, depending on the function call order, with the exception that it would avoid
having to use a lock.
There should be no side effects resulting from this change, since the previous
implementation employed use of the more general resolver's "lock" mutex, which
is used in far more contexts, but in the specifics of the "dns_resolver_prime"
and "prime_done" it was only used to protect "primefetch" and "priming" attributes,
which are not used in any of the other critical sections protected by the same lock,
thus having zero dependency on those variables.
HAVE_UV_IMPORT and other config.h macros must not be set unconditionally
because no existing libuv release exposes uv_import() and/or uv_export()
yet. Windows builds not passing an explicit path to libuv to
win32utils/Configure are currently broken because of this, so comment
out the offending lines and describe when the aforementioned config.h
macros should be set.
(cherry picked from commit 57b430b8ca)
Fixes a race between ns_client_killoldestquery and ns_client_endrequest -
killoldestquery takes a client from `recursing` list while endrequest
destroys client object, then killoldestquery works on a destroyed client
object. Prevent it by holding reclist lock while cancelling query.
(cherry picked from commit df3dbdff81)
- Define the SLOT environment variable before starting the test. This
variable defaults to 0 and that does not work with SoftHSM 2.
- The system test expects the PIN environment variable to be set to
"1234" while bin/tests/prepare-softhsm2.sh sets it to "0000".
Update bin/tests/prepare-softhsm2.sh so that it sets the PIN to
"1234".
- Move contents of bin/tests/system/pkcs11/prereq.sh to
bin/tests/system/pkcs11/setup.sh as the former was creating a file
called "supported" that was getting removed by the latter before
bin/tests/system/pkcs11/tests.sh could access it.
- Fix typo in "have_ecx".
(cherry picked from commit 100a230e80f01a777b917b135b4bae9a4ac0e8ae)
Previously badcache used one single mutex for everything, which
was causing performance issues. Use one global rwlock for the whole
hashtable and per-bucket mutexes.
(cherry picked from commit 47e5f5564c)
There was a very slim chance of a race between isc_socket_detach and
process_fd: isc_socket_detach decrements references to 0, and before it
calls destroy gets preempted. Second thread calls process_fd, increments
socket references temporarily to 1, and then gets preempted, first thread
then hits assertion in destroy() as the reference counter is now 1 and
not 0.
(cherry picked from commit 81ba0fe0e6)
zone_needdump() could potentially not call zone_settimer() so
explitly call zone_settimer() as zone->resigntime could have
gone backward.
(cherry picked from commit 5ec57f31b0)
With RRSIG records no longer being signed with the full
sig-validity-interval we need to ensure the zone->resigntime
as it may need to be set to a earlier time.
(cherry picked from commit 5d1611afdc)
When --with-zlib is passed to ./configure (or when the latter
autodetects zlib's presence), libisc uses certain zlib functions and
thus libisc's users should be linked against zlib in that case. Adjust
Makefile variables appropriately to prevent shared build failures caused
by underlinking.
(cherry picked from commit fc967ba092)
Updating LRU requires write-locking the node, which causes contention.
Update LRU only if time difference is large enough.
(cherry picked from commit fe584c01cc)
mem.c:add_trace_entry() -> isc_hash_function() -> isc_siphash24()
129 for (; in != end; in += 8) {
6. byte_swapping: Performing a byte swapping operation on
in implies that it came from an external source, and is
therefore tainted.
130 uint64_t m = U8TO64_LE(in);
(cherry picked from commit 8c983a7ebd)
Fix a potential assertion failure on shutdown in ns__client_endrequest.
Scenario:
1. We are shutting down, interface->clientmgr is gone.
2. We receive a packet, it gets through ns__client_request
3. mgr == NULL, return
4. isc_nmhandle_detach calls ns_client_reset_cb
5. ns_client_reset_cb calls ns_client_endrequest
6. INSIST(client->state == NS_CLIENTSTATE_WORKING ||
client->state == NS_CLIENTSTATE_RECURSING) is not met
- we haven't started processing this packet so
client->state == NS_CLIENTSTATE_READY.
As a solution - don't do anything in ns_client_reset_cb if the client
is still in READY state.
(cherry picked from commit b0888ff039)
sending each group of queries simultaneously, and then checking the
output after the last one finishes, reduces the runtime of the
serve-stale test by about six minutes.
(cherry picked from commit 195d25b222)
"yes" and "no" are permissible synonyms for "on" and "off", which
use exactly the same code paths. making sure they work isn't a good
use of 80 seconds of test time.
(cherry picked from commit 027601cd3e)
- the configuration summary reported zlib compression was not
supported even when it was.
- when bind.keys.h was regenerated it violated clang-format style.
(cherry picked from commit beda680f90)
The change introduced by commit be159f5565
was not fully complete. Adjust ./configure summary so that it reflects
the new way the --with-tuning switch works, fixing the Autoconf variable
used for determining the value of that switch. Fix win32utils/Configure
so that it behaves the same way as its Unix counterpart.
(cherry picked from commit a5fc3a6364)
* ctx needs to be destroyed before it is regenerated.
* emit the name of the signature to be replaced.
* cleanup memory before asserting so post longjump doesn't detect a
memory leak.
* comment code.
(cherry picked from commit 3a8c8a2a31)
Fix crash on arm64 from using atomic_compare_exchange_weak outside of the loop
See merge request isc-projects/bind9!3042
(cherry picked from commit e4671ef2fa)
fa68a0d8 Added atomic_compare_exchange_strong_acq_rel macro
4cf275ba Replace non-loop usage of atomic_compare_exchange_weak with strong variant
4ff887db Add arm64 to GitLab CI
BSD sed does not recognize \s as a whitespace matching token. Make the
sed script in doc/arm/Makefile.in which ensures GitLab identifiers are
not split across lines portable by replacing \s with [[:space:]].
(cherry picked from commit b25e6b51f6)
Artifacts generated by the docs:sid:amd64 job need to be retained longer
than for other jobs as they are used for building bind.isc.org contents.
If these artifacts are removed too quickly, pipelines in the pages/bind
GitLab project start failing, preventing content updates from being
published. Increase lifetime of the relevant job artifacts to prevent
this from happening.
(cherry picked from commit 9751ba5a75)
We were using our own versions of isc_uv_{export,import} functions
for multithreaded TCP listeners. Upcoming libuv version will
contain proper uv_{export,import} functions - use them if they're
available.
Upcoming version of libuv will suport uv_recvmmsg and uv_sendmmsg. To
use uv_recvmmsg we need to provide a larger buffer and be able to
properly free it.
isc_task_pause/unpause were inherently thread-unsafe - a task
could be paused only once by one thread, if the task was running
while we paused it it led to races. Fix it by making sure that
the task will pause if requested to, and by using a 'pause reference
counter' to count task pause requests - a task will be unpaused
iff all threads unpause it.
Don't remove from queue when pausing task - we lock the queue lock
(expensive), while it's unlikely that the task will be running -
and we'll remove it anyway in dispatcher
this corrects some style glitches such as:
```
long_function_call(arg, arg2, arg3, arg4, arg5, "str"
"ing");
```
...by adjusting the penalties for breaking strings and call
parameter lists.
(cherry picked from commit 0002377dca)
Start enforcing the clang-format rules on changed files
Closes#46
See merge request isc-projects/bind9!3063
(cherry picked from commit a04cdde45d)
d2b5853b Start enforcing the clang-format rules on changed files
618947c6 Switch AlwaysBreakAfterReturnType from TopLevelDefinitions to All
654927c8 Add separate .clang-format files for headers
5777c44a Reformat using the new rules
60d29f69 Don't enforce copyrights on .clang-format
adjust clang-format options to get closer to ISC style
See merge request isc-projects/bind9!3061
(cherry picked from commit d3b49b6675)
0255a974 revise .clang-format and add a C formatting script in util
e851ed0b apply the modified style
Add curly braces using uncrustify and then reformat with clang-format back
Closes#46
See merge request isc-projects/bind9!3057
(cherry picked from commit 67b68e06ad)
36c6105e Use coccinelle to add braces to nested single line statement
d14bb713 Add copy of run-clang-tidy that can fixup the filepaths
056e133c Use clang-tidy to add curly braces around one-line statements
- Merge release notes from all 9.15.x releases, leaving only those
which do not apply to BIND 9.14.
- Add missing GitLab/RT issue identifiers.
- Update "Introduction", "Note on Version Numbering", and "End of
Life" sections with BIND 9.16 information.
Reformat source code with clang-format
Closes#46
See merge request isc-projects/bind9!2156
(cherry picked from commit 7099e79a9b)
4c3b063e Import Linux kernel .clang-format with small modifications
f50b1e06 Use clang-format to reformat the source files
11341c76 Update the definition files for Windows
df6c1f76 Remove tkey_test (which is no-op anyway)
Submissions to Coverity Scan should be limited to those originated from
release branches and only from a specific schedule which holds
COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN variables.
(cherry picked from commit 48530aa21395414b0f9788ea5ab158b2b09ab977)
2020-02-12 14:47:52 +00:00
3249 changed files with 135266 additions and 103245 deletions
- [ ] Ensure the merge request from the previous step is reviewed by SWENG staff and has no outstanding discussions
- [ ] Ensure the documentation changes introduced by the merge request addressing the problem are reviewed by Support and Marketing staff
- [ ] Prepare backports of the merge request addressing the problem for all affected (and still maintained) BIND branches (backporting might affect the issue's scope and/or description)
- [ ] Prepare a standalone patch for the last stable release of each affected (and still maintained) BIND branch
### Release-specific actions
- [ ] Create/update the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle
- [ ] Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined
- [ ] Ensure the merge requests containing CVE fixes are merged into `security-*` branches in CVE identifier order
- [ ]***(SwEng)*** Build documentation on `docs.isc.org`.
- [ ]***(QA)*** Check that all the above steps were performed correctly.
- [ ]***(QA)*** Check that the contents of release notes match the merge requests comprising the releases.
- [ ]***(QA)*** Check that the formatting is correct for text, PDF, and HTML versions of release notes.
- [ ]***(QA)*** Check that the formatting of the generated man pages is correct.
- [ ]***(QA)*** Tag the releases in the private repository (`git tag -s -m "BIND 9.x.y" v9_x_y`).
- [ ]***(SwEng)*** Tag the releases[^2]. (Tags may only be pushed to the public repository for releases which are *not* security releases.)
- [ ]***(SwEng)*** If this is the first tag for a release (e.g. beta), create a release branch named `release_v9_X_Y` to allow development to continue on the maintenance branch whilst release engineering continues.
### Before the ASN Deadline (for ASN Releases) or the Public Release Date (for Regular Releases)
## Before the ASN Deadline (for ASN Releases) or the Public Release Date (for Regular Releases)
- [ ]***(QA)*** Verify GitLab CI results for the tags created and prepare a QA report for the releases to be published.
- [ ]***(QA)*** Request signatures for the tarballs, providing their location and checksums.
@@ -64,14 +41,13 @@
- [ ]***(QA)*** Notify Support that the releases have been prepared.
- [ ]***(Support)*** Send out ASNs (if applicable).
### On the Day of Public Release
## On the Day of Public Release
- [ ]***(Support)*** Wait for clearance from Security Officer to proceed with the public release (if applicable).
- [ ]***(Support)*** Place tarballs in public location on FTP site.
- [ ]***(Support)*** Publish links to downloads on ISC website.
- [ ]***(Support)*** Write release email to *bind-announce*.
- [ ]***(Support)*** Write email to *bind-users* (if a major release).
- [ ]***(Support)*** Send eligible customers updated links to the Subscription Edition (update the -S edition delivery tickets, even if those links were provided earlier via an ASN ticket).
- [ ]***(Support)*** Update tickets in case of waiting support customers.
- [ ]***(QA)*** Build and test any outstanding private packages.
- [ ]***(QA)*** Build public packages (`*.deb`, RPMs).
@@ -81,13 +57,9 @@
- [ ]***(Marketing)*** Update [Wikipedia entry for BIND](https://en.wikipedia.org/wiki/BIND).
- [ ]***(Marketing)*** Write blog article (if a major release).
- [ ]***(QA)*** Ensure all new tags are annotated and signed.
- [ ]***(QA)*** Push tags for the published releases to the public repository.
- [ ]***(QA)*** Merge the automatically prepared `prep 9.x.y` commit which updates `version` and documentation on the release branch into the relevant maintenance branch (`v9_x`).
- [ ]***(QA)*** For each maintained branch, update the `BIND_BASELINE_VERSION` variable for the `abi-check` job in `.gitlab-ci.yml` to the latest published BIND version tag for a given branch.
- [ ]***(QA)*** Prepare empty release notes for the next set of releases.
- [ ]***(QA)*** Sanitize confidential issues which are assigned to the current release milestone and do not describe a security vulnerability, then make them public.
- [ ]***(QA)*** Sanitize confidential issues which are assigned to older release milestones and describe security vulnerabilities, then make them public if appropriate[^2].
- [ ]***(QA)*** Update QA tools used in GitLab CI (e.g. Flake8, PyLint) by modifying the relevant `Dockerfile`.
- [ ]***(SwEng)*** Push tags for the published releases to the public repository.
- [ ]***(SwEng)*** Merge the automatically prepared `prep 9.X.Y` commit which updates `version` and documentation on the release branch into the relevant maintenance branch (`v9_X`).
[^1]: If not, use the time remaining until the tagging deadline to ensure all outstanding issues are either resolved or moved to a different milestone.
[^2]: As a rule of thumb, security vulnerabilities which have reproducers merged to the public repository are considered okay for full disclosure.
[^2]: Preferred command line: `git tag -u <DEVELOPER_KEYID> -a -s -m "BIND 9.X.Y[alphatag]" v9_X_Y[alphatag]`, where `[alphatag]` is an optional string such as `b1`, `rc1`, etc.
@@ -39,28 +39,29 @@ anyone can see the source, but only ISC employees have commit access.
In the past, the source could only be seen once ISC had published
a release; read access to the source repository was restricted just
as commit access was. That has changed, as ISC now provides a
public git repository of the BIND source tree (see below).
public git mirror to the BIND source tree (see below).
At ISC, we're committed to
building communities that are welcoming and inclusive: environments where people
are encouraged to share ideas, treat each other with respect, and collaborate
towards the best solutions. To reinforce our commitment, ISC
has adopted a slightly modified version of the Django
[Code of Conduct](https://gitlab.isc.org/isc-projects/bind9/-/blob/main/CODE_OF_CONDUCT.md)
for the BIND 9 project, as well as for the conduct of our developers throughout
the industry.
[Code of Conduct](https://gitlab.isc.org/isc-projects/bind9/-/blob/master/CODE_OF_CONDUCT.md) for the BIND 9 project, as well as for the conduct of our
developers throughout the industry.
### <a name="access"></a>Access to source code
Public BIND releases are always available from the
[ISC FTP site](ftp://ftp.isc.org/isc/bind9).
A public-access git repository is also available at
[https://gitlab.isc.org](https://gitlab.isc.org). This repository
contains all public release branches. Upcoming releases can be viewed in
their current state at any time. Short-lived development branches
contain unreviewed work in progress. Commits which address security
vulnerablilities are withheld until after public disclosure.
A public-access GIT repository is also available at
[https://gitlab.isc.org](https://gitlab.isc.org).
This repository is a mirror, updated several times per day, of the
source repository maintained by ISC. It contains all the public release
branches; upcoming releases can be viewed in their current state at any
time. It does *not* contain development branches or unreviewed work in
progress. Commits which address security vulnerablilities are withheld
|`-DCHECK_LOCAL=0` | Don't check out-of-zone addresses in `named-checkzone`|
|`-DCHECK_SIBLING=0` | Don't check sibling glue in `named-checkzone`|
|`-DISC_FACILITY=LOG_LOCAL0` | Change the default syslog facility for `named`|
|`-DISC_HEAP_CHECK` | Test heap consistency after every heap operation; used when debugging |
|`-DISC_MEM_DEFAULTFILL=1` | Overwrite memory with tag values when allocating or freeing it; this impairs performance but makes debugging of memory problems easier |
|`-DISC_MEM_TRACKLINES=0` | Don't track memory allocations by file and line number; this improves performance but makes debugging more difficult |
|`-DNAMED_RUN_PID_DIR=0` | Create default PID files in `${localstatedir}/run` rather than `${localstatedir}/run/named/`|
|`-DNS_CLIENT_DROPPORT=0` | Disable dropping queries from particular well-known ports |
|`-DISC_MEM_DEFAULTFILL=1`|Overwrite memory with tag values when allocating or freeing it; this impairs performance but makes debugging of memory problems easier.|
|`-DISC_MEM_TRACKLINES=0`|Don't track memory allocations by file and line number; this improves performance but makes debugging more difficult.|
|<nobr>`-DISC_FACILITY=LOG_LOCAL0`</nobr>|Change the default syslog facility for `named`|
|`-DNS_CLIENT_DROPPORT=0`|Disable dropping queries from particular well-known ports:|
|`-DCHECK_SIBLING=0`|Don't check sibling glue in `named-checkzone`|
|`-DCHECK_LOCAL=0`|Don't check out-of-zone addresses in `named-checkzone`|
|`-DNS_RUN_PID_DIR=0`|Create default PID files in `${localstatedir}/run` rather than `${localstatedir}/run/named/`|
|`-DISC_BUFFER_USEINLINE=0`|Disable the use of inline functions to implement the `isc_buffer` API: this reduces performance but may be useful when debugging |
|`-DISC_HEAP_CHECK`|Test heap consistency after every heap operation; used when debugging|
|`CC`|The C compiler to use. `configure` tries to figure out the right one for supported systems.|
|`CFLAGS`|C compiler flags. Defaults to include -g and/or -O2 as supported by the compiler. Please include '-g' if you need to set `CFLAGS`. |
|`STD_CINCLUDES`|System header file directories. Can be used to specify where add-on thread or IPv6 support is, for example. Defaults to empty string.|
|`STD_CDEFINES`|Any additional preprocessor symbols you want defined. Defaults to empty string. For a list of possible settings, see the file [OPTIONS](OPTIONS.md).|
|`LDFLAGS`|Linker flags. Defaults to empty string.|
|`BUILD_CC`|Needed when cross-compiling: the native C compiler to use when building for the target system.|
|`BUILD_CFLAGS`|`CFLAGS` for the target system during cross-compiling.|
|`BUILD_CPPFLAGS`|`CPPFLAGS` for the target system during cross-compiling.|
|`BUILD_LDFLAGS`|`LDFLAGS` for the target system during cross-compiling.|
|`BUILD_LIBS`|`LIBS` for the target system during cross-compiling.|
Additional environment variables affecting the build are listed at the
end of the `configure` help text, which can be obtained by running the
@@ -170,32 +193,32 @@ command:
#### <a name="macos"> macOS
Building on macOS assumes that the "Command Tools for Xcode" are installed.
These can be downloaded from
Building on macOS assumes that the "Command Tools for Xcode" is installed.
If these are installed at a nonstandard location, then:
* for `libxml2`, specify the prefix using `--with-libxml2=/prefix`.
* for `libxml2`, specify the prefix using `--with-libxml2=/prefix`,
* for `json-c`, adjust `PKG_CONFIG_PATH`.
To support compression on the HTTP statistics channel, the server must be
linked against `libzlib`. If this is installed in a nonstandard location,
linked against `libzlib`. If this is installed in a nonstandard location,
specify the prefix using `--with-zlib=/prefix`.
To support storing configuration data for runtime-added zones in an LMDB
database, the server must be linked with `liblmdb`. If this is installed in a
database, the server must be linked with liblmdb. If this is installed in a
nonstandard location, specify the prefix using `with-lmdb=/prefix`.
To support MaxMind GeoIP2 location-based ACLs, the server must be linked
@@ -233,8 +256,8 @@ and BIND must be configured with `--enable-dnstap`.
Certain compiled-in constants and default settings can be decreased to
values better suited to small machines, e.g. OpenWRT boxes, by specifying
`--with-tuning=small` on the `configure` command line. This decreases
memory usage by using smaller structures, but degrades performance.
`--with-tuning=small` on the `configure` command line. This will decrease
memory usage by using smaller structures, but will degrade performance.
On Linux, process capabilities are managed in user space using
the `libcap` library, which can be installed on most Linux systems via
@@ -247,54 +270,53 @@ to handle files bigger than 2GB. This can be done by using
Support for the "fixed" rrset-order option can be enabled or disabled by
specifying `--enable-fixed-rrset` or `--disable-fixed-rrset` on the
configure command line. By default, fixed rrset-order is disabled to
configure command line. By default, fixed rrset-order is disabled to
reduce memory footprint.
The `--enable-querytrace` option causes `named` to log every step of
processing every query. The `--enable-singletrace` option turns on the
same verbose tracing, but allows an individual query to be separately
traced by setting its query ID to 0. These options should only be enabled
when debugging, because they have a significant negative impact on query
performance.
processing every query. This should only be enabled when debugging, because
it has a significant negative impact on query performance.
`make install` installs`named` and the various BIND 9 libraries. By
`make install` will install `named` and the various BIND 9 libraries. By
default, installation is into /usr/local, but this can be changed with the
`--prefix` option when running `configure`.
You may specify the option `--sysconfdir` to set the directory where
configuration files like `named.conf` go by default, and `--localstatedir`
to set the default parent directory of `run/named.pid`. `--sysconfdir`
to set the default parent directory of `run/named.pid`.`--sysconfdir`
defaults to `$prefix/etc` and `--localstatedir` defaults to `$prefix/var`.
### <a name="testing"/> Automated testing
A system test suite can be run with `make check`. The system tests require
A system test suite can be run with `make test`. The system tests require
you to configure a set of virtual IP addresses on your system (this allows
multiple servers to run locally and communicate with each other). These
multiple servers to run locally and communicate with one another). These
IP addresses can be configured by running the command
`bin/tests/system/ifconfig.sh up` as root.
Some tests require Perl and the `Net::DNS` and/or `IO::Socket::INET6` modules,
and are skipped if these are not available. Some tests require Python
and the `dnspython` module and are skipped if these are not available.
and will be skipped if these are not available. Some tests require Python
and the `dnspython` module and will be skipped if these are not available.
See bin/tests/system/README for further details.
Unit tests are implemented using the CMocka unit testing framework. To build
them, use `configure --with-cmocka`. Execution of tests is done by the automake
parallel test driver; unit tests are also run by `make check`.
Unit tests are implemented using the [CMocka unit testing framework](https://cmocka.org/).
To build them, use `configure --with-cmocka`. Execution of tests is done
by the [Kyua test execution engine](https://github.com/jmmv/kyua); if the
`kyua` command is available, then unit tests can be run via `make test`
or `make unit`.
### <a name="doc"/> Documentation
The *BIND 9 Administrator Reference Manual* (ARM) is included with the source
distribution, and in .rst format, in the `doc/arm`
directory. HTML and PDF versions are automatically generated and can
be viewed at [https://bind9.readthedocs.io/en/latest/index.html](https://bind9.readthedocs.io/en/latest/index.html).
The *BIND 9 Administrator Reference Manual* is included with the source
distribution, in DocBook XML, HTML, and PDF format, in the `doc/arm`
directory.
Man pages for some of the programs in the BIND 9 distribution
are also included in the BIND ARM.
Some of the programs in the BIND 9 distribution have man pages in their
directories. In particular, the command line options of `named` are
documented in `bin/named/named.8`.
Frequently (and not-so-frequently) asked questions and their answers
can be found in the ISC Knowledgebase at
can be found in the ISC Knowledge Base at
[https://kb.isc.org](https://kb.isc.org).
Additional information on various subjects can be found in other
@@ -303,8 +325,8 @@ Additional information on various subjects can be found in other
### <a name="changes"/> Change log
A detailed list of all changes that have been made throughout the
development of BIND 9 is included in the file CHANGES, with the most recent
changes listed first. Change notes include tags indicating the category of
development BIND 9 is included in the file CHANGES, with the most recent
changes listed first. Change notes include tags indicating the category of
the change that was made; these categories are:
|Category |Description |
@@ -322,12 +344,12 @@ the change that was made; these categories are:
| [cleanup] | Minor corrections and refactoring |
| [doc] | Documentation |
| [contrib] | Changes to the contributed tools and libraries in the 'contrib' subdirectory |
| [placeholder] | Used in the main development branch to reserve change numbers for use in other branches, e.g., when fixing a bug that only exists in older releases |
| [placeholder] | Used in the master development branch to reserve change numbers for use in other branches, e.g. when fixing a bug that only exists in older releases |
In general, [func] and [experimental] tags only appear in new-feature
releases (i.e., those with version numbers ending in zero). Some new
In general, [func] and [experimental] tags will only appear in new-feature
releases (i.e., those with version numbers ending in zero). Some new
functionality may be backported to older releases on a case-by-case basis.
All other change types may be applied to all currentlysupported releases.
All other change types may be applied to all currently-supported releases.
#### Bug report identifiers
@@ -337,7 +359,7 @@ and referred to entries in the "bind9-bugs" RT database, which was not open
to the public. More recent entries use the form `[GL #NNN]` or, less often,
`[GL !NNN]`, which, respectively, refer to issues or merge requests in the
GitLab database. Most of these are publicly readable, unless they include
information which is confidential or security-sensitive.
information which is confidential or securitysensitive.
To look up a GitLab issue by its number, use the URL
fprintf(stderr," -l label: label of the key pair\n");
fprintf(stderr," name: owner of the key\n");
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.