When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views. Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:
1. mdb_env_open(envA) is called when the view is first created.
2. "rndc reconfig" is called.
3. mdb_env_open(envB) is called for the new instance of the view.
4. mdb_env_close(envA) is called for the old instance of the view.
This seems to have worked so far. However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).
Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.
To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead. Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps). Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.
[1] 2fd44e3251
BUFSIZ (512 bytes on Windows) may not be enough to fit the status of a
DNSSEC policy and three DNSSEC keys.
Set the size of the relevant buffer to a hardcoded value of 4096 bytes,
which should be enough for most scenarios.
While the creation and publication times of the various keys
in this policy are nearly at the same time there is a chance that
one key is created a second later than the other.
The `set_keytimes_algorithm_policy` mistakenly set the keytimes
for KEY3 based of the "published" time from KEY2.
this changes most visble uses of master/slave terminology in tests.sh
and most uses of 'type master' or 'type slave' in named.conf files.
files in the checkconf test were not updated in order to confirm that
the old syntax still works. rpzrecurse was also left mostly unchanged
to avoid interference with DNSRPS.
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.
added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
This test ensures that named will correctly shutdown
when receiving multiple control connections after processing
of either "rncd stop" or "kill -SIGTERM" commands.
Before the fix, named was crashing due to a race condition happening
between two threads, one running shutdown logic in named/server.c
and other handling control logic in controlconf.c.
This test tries to reproduce the above scenario by issuing multiple
queries to a target named instance, issuing either rndc stop or kill
-SIGTERM command to the same named instance, then starting multiple rndc
status connections to ensure it is not crashing anymore.
Due to lack of synchronization, whenever named was being requested to
stop using rndc, controlconf.c module could be trying to access an already
released pointer through named_g_server->interfacemgr in a separate
thread.
The race could only be triggered if named was being shutdown and more
rndc connections were ocurring at the same time.
This fix correctly checks if the server is shutting down before opening
a new rndc connection.
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.
Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.
Add the code and documentation required to provide DNSSEC signing
status through rndc. This does not yet show any useful information,
just provide the command that will output some dummy string.
The rndc.conf main header was missing the header markup and that was
breaking the TOC for all manpages in the ARM because sphinx-build
incorrectly remembered the markup for subheader to be ~~~~ instead of
----.
The wait until zones are signed after rndc reconfig is broken
because the zones are already signed before the reconfig. Fix
by having a different way to ensure the signing of the zone is
complete. This does require a call to the "wait_for_done_signing"
function after each "check_keys" call after the ns6 reconfig.
The "wait_for_done_signing" looks for a (newly added) debug log
message that named will output if it is done signing with a certain
key.
Add a note why we don't have a test case for the issue.
It is tricky to write a good test case for this if our tools are
not allowed to create signatures for unsupported algorithms.
enough buffer space. Also change named_os_uname() prototype so
that it is now returning (const char *) rather than (char *). If
uname() is not supported on a UNIX build prepopulate unamebuf[]
with "unknown architecture".
if tests that take a particularly long time to complete
(serve-stale, dnssec, rpzrecurse) are run first, a parallel
run of the system tests can finish 1-2 minutes faster.
these keywords were added to the parser as synonyms for "master"
and "slave" but were never hooked in to the configuration of named,
so they were ignored. this has been fixed and the option is now
checked for correctness.
- clone keynode->dsset rather than return a pointer so that thread
use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
while in use.
- create a new keynode when removing DS records so that dangling
pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
when DS records are added.
Due to the changes introduced by the Automake migration, system tests
requiring Python (chain, pipelined, qmin, tcp), dynamic loading of
shared objects (dlzexternal, dyndb, filter-aaaa), or LMDB (nzd2nzf)
currently do not work on Windows. Temporarily disable them on that
platform by moving them from the PARALLEL_COMMON list to the
PARALLEL_UNIX list until the situation is rectified.
Without SYSTEMTESTTOP=.. lines in tests.sh scripts, SYSTEMTESTTOP is
being set to an absolute path. On Windows, this means that an absolute
Cygwin path gets passed as a command line argument to native Windows
binaries, which cannot work and causes system tests to break. Fix by
passing SYSTEMTESTTOP through cygpath on Windows, which causes that
variable to be set to an absolute "mixed mode" path (Windows path with
forward slashes).
Make various adjustments necessary to enable "make dist" to build a BIND
source tarball whose contents are complete enough to build binaries, run
unit & system tests, and generate documentation on Unix systems.
Known outstanding issues:
- "make distcheck" does not work yet.
- Tests do not work for out-of-tree source-tarball-based builds.
- Source tarballs are not complete enough for building on Windows.
All of the above will be addressed in due course.
DS records only belong at delegation points and if present
at the zone apex are invariably the result of administrative
errors. Additionally they can't be queried for with modern
resolvers as the parent servers will be queried.
When ./run.sh <test> is invoked, it acts as a wrapper around
`env - TESTS="<test>" make -e check` to preserve the ability to build
files defined only in the `check` target. Unfortunately, cleaning the
full environment had a side-effect of some tests failing due to missing
binaries and libraries. We now preserve the two most important
variables - PATH and LD_LIBRARY_PATH.
Move BIND binaries which are neither daemons nor administrative programs
to $bindir. This results in only the following binaries being left in
$sbindir:
- ddns-confgen
- named
- rndc
- rndc-confgen
- tsig-confgen
It might be possible some pending task would run when kserver is already
cleaned up. Postpone gsstsig structures cleanup after task and timer
managers are destroyed. No pending threads are possible after it.
Make action in maybeshutdown only if doshutdown was not already called.
Might be called from getinput event.