The new command is `rndc memprof`. The memory profiling status is also
reported inside `rndc status`. The status also shows whether named can
toggle memory profiling or not and if the server is built with jemalloc.
(cherry picked from commit b495e9918e)
Log to the querylog the rcode of a previous query using
the identifier 'response:' to diffenciate queries from
responses.
(cherry picked from commit 5fad79c92f)
Add the code and documentation required to provide KSR import using
rndc. This is just the command, and the feature is at this point in
time still not implemented.
(cherry picked from commit edbb219fda)
With this new optional argument if there is an ongoing zone
transfer it will be aborted before a new zone transfer is scheduled.
(cherry picked from commit 402ca316ae)
Previously, only a single controlconf message would be processed from a
single TCP read even if the TCP read buffer contained multiple messages.
Refactor the isccc_ccmsg unit to store the extra buffer in the internal
buffer and use the already read data first before reading from the
network again.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Dominik Thalhammer <dominik@thalhammer.it>
With _exit() instead of exit() in place, we don't need
isc__tls_setfatalmode() mechanism as the atexit() calls will not be
executed including OpenSSL atexit hooks.
Since the fatal() isn't a correct but rather abrupt termination of the
program, we want to skip the various atexit() calls because not all
memory might be freed during fatal() call, etc. Using _exit() instead
of exit() has this effect - the program will end, but no destructors or
atexit routines will be called.
The conn_shutdown() function is called whenever a control channel
connection is supposed to be closed, e.g. after a response to the client
is sent or when named is being shut down. That function calls
isccc_ccmsg_invalidate(), which resets the magic number in the structure
holding the messages exchanged over a given control channel connection
(isccc_ccmsg_t). The expectation here is that all operations related to
the given control channel connection will have been completed by the
time the connection needs to be shut down.
However, if named shutdown is initiated while a control channel message
is still in flight, some netmgr callbacks might still be pending when
conn_shutdown() is called and isccc_ccmsg_t invalidated. This causes
the REQUIRE assertion checking the magic number in ccmsg_senddone() to
fail when the latter function is eventually called, resulting in a
crash.
Fix by splitting up isccc_ccmsg_invalidate() into two separate
functions:
- isccc_ccmsg_disconnect(), which initiates TCP connection shutdown,
- isccc_ccmsg_invalidate(), which cleans up magic number and buffer,
and then:
- replacing all existing uses of isccc_ccmsg_invalidate() with calls
to isccc_ccmsg_disconnect(),
- only calling isccc_ccmsg_invalidate() when all netmgr callbacks are
guaranteed to have been run.
Adjust function comments accordingly.
The Unix Domain Sockets support in BIND 9 has been completely disabled
since BIND 9.18 and it has been a fatal error since then. Cleanup the
code and the documentation that suggest that Unix Domain Sockets are
supported.
Allow an arbitrary TCP timeout value to be specified when running
rndc, so that commands that take a long time to execute (for example,
reloading a very large configuration) can be given time to do so.
While the connect timeout was set to 60 seconds in rndc, the
idle read timeout was left at the default value of 30 seconds.
This commit sets it back to 60, to match the behavior in 9.16
and earlier.
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra. Fix the
logic in the way:
1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
the handle, the read callback will be called with ISC_R_CANCELED to
cancel active reading from the socket/handle.
2. When isc_nm_read() has been called and isc_nm_read_stop() has been
called on the on the handle, the read callback will be called with
ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
is being shut down.
3. The .reading and .recv_read flags are little bit tricky. The
.reading flag indicates if the outer layer is reading the data (that
would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
the .recv_read flag indicates whether somebody is interested in the
data read from the socket.
Usually, you would expect that the .reading should be false when
.recv_read is false, but it gets even more tricky with TLSStream as
the TLS protocol might need to read from the socket even when sending
data.
Fix the usage of the .recv_read and .reading flags in the TLSStream
to their true meaning - which mostly consist of using .recv_read
everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
with the .reading flag.
4. The TLS failed read helper has been modified to resemble the TCP code
as much as possible, clearing and re-setting the .recv_read flag in
the TCP timeout code has been fixed and .recv_read is now cleared
when isc_nm_read_stop() has been called on the streaming socket.
5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
isc_httpd units have been greatly simplified due to the improved design.
6. More unit tests for TCP and TLS testing the shutdown conditions have
been added.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
When fatal is called we may be holding memory allocated by OpenSSL.
This may result in the reference count for the FIPS provider not
going to zero and the shared library not being unloaded during
OPENSSL_cleanup. When the shared library is ultimately unloaded,
when all remaining dynamically loaded libraries are freed, we have
already destroyed the memory context we where using to track memory
leaks / late frees resulting in INSIST being called.
Disable triggering the INSIST when fatal has being called.
This is a simple replacement using the semantic patch from the previous
commit and as added bonus, one removal of previously undetected unused
variable in named/server.c.
without diffie-hellman TKEY negotiation, some other code is
now effectively dead or unnecessary, and can be cleaned up:
- the rndc tsig-list and tsig-delete commands.
- a nonoperational command-line option to dnssec-keygen that
was documented as being specific to DH.
- the section of the ARM that discussed TKEY/DH.
- the functions dns_tkey_builddeletequery(), processdeleteresponse(),
and tkey_processgssresponse(), which are unused.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
In rndc_recvdone(), if 'sends' was not 0, then 'recvs' was not
decremented, in which case isc_loopmgr_shutdown() was never reached,
which could cause a hang. (This has not been observed to happen, but
the code was incorrect on examination.)
Previously:
* applications were using isc_app as the base unit for running the
application and signal handling.
* networking was handled in the netmgr layer, which would start a
number of threads, each with a uv_loop event loop.
* task/event handling was done in the isc_task unit, which used
netmgr event loops to run the isc_event calls.
In this refactoring:
* the network manager now uses isc_loop instead of maintaining its
own worker threads and event loops.
* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
and every isc_task runs on a specific isc_loop bound to the specific
thread.
* applications have been updated as necessary to use the new API.
* new ISC_LOOP_TEST macros have been added to enable unit tests to
run isc_loop event loops. unit tests have been updated to use this
where needed.
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
completely removed; isc_timer objects are now directly created on the
isc_loop event loops.
* the isc_timer API has been simplified. the "inactive" timer type has
been removed; timers are now stopped by calling isc_timer_stop()
instead of resetting to inactive.
* isc_manager now creates a loop manager rather than a timer manager.
* modules and applications using isc_timer have been updated to use the
new API.
"rndc fetchlimit" now also prints a list of domain names that are
currently rate-limited by "fetches-per-zone".
The "fetchlimit" system test has been updated to use this feature
to check that domain limits are applied correctly.
this command runs dns_adb_dumpquota() to display all servers
in the ADB that are being actively fetchlimited by the
fetches-per-server controls (i.e, servers with a nonzero average
timeout ratio or with the quota having been reduced from the
default value).
the "fetchlimit" system test has been updated to use the
new command to check quota values instead of "rndc dumpdb".
Previously, tasks could be created either unbound or bound to a specific
thread (worker loop). The unbound tasks would be assigned to a random
thread every time isc_task_send() was called. Because there's no logic
that would assign the task to the least busy worker, this just creates
unpredictability. Instead of random assignment, bind all the previously
unbound tasks to worker 0, which is guaranteed to exist.
Sphinx "standard domain" provides directive types ".. program::" and
".. option::" to create link anchor for a program name + option combination.
These can be referenced using :ref:`program option` syntax.
The problem is that Sphinx 1.8.5 (e.g. in Ubuntu 18.04) generates
conflicting link targets if a page contains two option directives
starting with the same word, e.g.:
.. program:: dnssec-settime
.. option:: -P date
.. option:: -P ds date
The reason is that option directive consumes only first word as "option
name" (-P) and all the rest is considered "option argument" (date, ds
date). Newer versions of Sphinx (e.g. 4.5.0) handle this by creating
numbered link anchors, but older versions warn and BIND build system
turns the warning into a hard error.
To handle that we use method recommended by Sphinx maintainer:
https://github.com/sphinx-doc/sphinx/issues/10218#issuecomment-1059925508
As a bonus it provides more accurate link anchors for sub-options.
Alternatives considered:
- Replacing standard domain definition of .. option - causes more
problems, see BIND issue #3294.
- Removing hyperlinks for options - that would be a step back.
Fixes: #3295
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data. This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).
Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.
Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.
Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.
In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
Replace :manpage: with :iscman: to generate internal hyperlinks. That
way reader can use links even when offline, and jumps to man pages
for the same version.
Formerly HTML version of man pages did not have links in See Also
section because :manpage: role in Sphinx can generate only external
hyperlinks - and we do not have that enabled.
Enabling the Sphinx :manpage: linking could reliably create hyperlinks
only to external URLs, but that would take users to another version
of docs.
Generated by:
find bin -name '*.rst' | xargs sed -i -e 's/:manpage:`\([^(]\+\)(\([0-9]\))`/:iscman:`\1(\2) <\1>`/g'
+ hand-edit to revert change for mmencode reference which is
not provided in our source tree.
Use the new role :iscman: to replace all occurences or ``binary``
with :iscman:`binary`, creating a hyperlink to the manual page.
Generated using:
find bin -name *.rst | xargs fgrep --files-with-matches '.. iscman' | xargs -I{} -n1 basename {} .rst > /tmp/progs
for PROG in $(cat /tmp/progs); do find -name '*.rst' | xargs sed -i -e "s/\`\`$PROG\`\`/:iscman:\`$PROG\`/g"; done
Additional hand-edits were done mainly around filter-aaaa and
filter-a which are program names and and option names at the
same time. Couple more edits was neede to fix .rst syntax broken by
automatic replacement.
Sphinx has it's own :program: syntax for refering to program names.
Use it for self-references in manual pages. These self-references are
not clickable and not as eye-cathing as links, which is a good thing.
There is no point in attracting attention to ``dig`` several times on a
single page dedicated to dig itself.
Substituted automatically using:
find bin -name *.rst | xargs fgrep --files-with-matches '.. program' | xargs -n1 bash /tmp/repl.sh
With /tmp/repl.sh being:
BASE=$(basename "$1" .rst)
sed -i -e "s/\`\`$BASE\`\`/:program:\`$BASE\`/g" "$1"
The new directive and role "iscman" allow to tag & reference man pages in
our source tree. Essentially it is just namespacing for ISC man pages,
but it comes with couple benefits.
Differences from .. _man_program label we formerly used:
- Does not expand :ref:`man_program` into full text of the page header.
- Generates index entry with category "manual page".
- Rendering style is closer to ubiquitous to the one produced
by ``named`` syntax.
Differences from Sphinx built-in :manpage: role:
- Supports all builders with support for cross-references.
- Generates internal links (unlike :manpage: which generates external
URLs).
- Checks that target exists withing our source tree.
Side-effect of hyperlinking is that typos in program and option names
are now detected by Sphinx.
Candidate -options were detected using:
find -name *.rst | xargs grep '``-[^`]'
and then modified from ``-o`` to :option:`-o` using regex
s/``\(-[^`]\+\)``/:option:`\1`/
+ manual modifications where necessary.
Non-hyphenated options were detected by looking at context around
program names:
find bin -name *.rst | xargs -I{} -n1 basename {} .rst | sort -u
and grepping for program name with trailing whitespace.
Stand-alone program names like ``named`` are not hyperlinked in this
commit.