This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers. While an effort has been put into keeping the
taskmgr interface intact, couple of changes have been made:
* The taskmgr has no concept of universal privileged mode - rather the
tasks are either privileged or unprivileged (normal). The privileged
tasks are run as a first thing when the netmgr is unpaused. There
are now four different queues in in the netmgr:
1. priority queue - netievent on the priority queue are run even when
the taskmgr enter exclusive mode and netmgr is paused. This is
needed to properly start listening on the interfaces, free
resources and resume.
2. privileged task queue - only privileged tasks are queued here and
this is the first queue that gets processed when network manager
is unpaused using isc_nm_resume(). All netmgr workers need to
clean the privileged task queue before they all proceed normal
operation. Both task queues are processed when the workers are
finished.
3. task queue - only (traditional) task are scheduled here and this
queue along with privileged task queues are process when the
netmgr workers are finishing. This is needed to process the task
shutdown events.
4. normal queue - this is the queue with netmgr events, e.g. reading,
sending, callbacks and pretty much everything is processed here.
* The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t)
object.
* The isc_nm_destroy() function now waits for indefinite time, but it
will print out the active objects when in tracing mode
(-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been
made a little bit more asynchronous and it might take longer time to
shutdown all the active networking connections.
* Previously, the isc_nm_stoplistening() was a synchronous operation.
This has been changed and the isc_nm_stoplistening() just schedules
the child sockets to stop listening and exits. This was needed to
prevent a deadlock as the the (traditional) tasks are now executed on
the netmgr threads.
* The socket selection logic in isc__nm_udp_send() was flawed, but
fortunatelly, it was broken, so we never hit the problem where we
created uvreq_t on a socket from nmhandle_t, but then a different
socket could be picked up and then we were trying to run the send
callback on a socket that had different threadid than currently
running.
The isc_nm_*connect() functions were refactored to always return the
connection status via the connect callback instead of sometimes returning
the hard failure directly (for example, when the socket could not be
created, or when the network manager was shutting down).
This commit changes the connect functions in all the network manager
modules, and also makes the necessary refactoring changes in places
where the connect functions are called.
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
- isc_nm_tcpdnsconnect() sets up up an outgoing TCP DNS connection.
- isc_nm_tcpconnect(), _udpconnect() and _tcpdnsconnect() now take a
timeout argument to ensure connections time out and are correctly
cleaned up on failure.
- isc_nm_read() now supports UDP; it reads a single datagram and then
stops until the next time it's called.
- isc_nm_cancelread() now runs asynchronously to prevent assertion
failure if reading is interrupted by a non-network thread (e.g.
a timeout).
- isc_nm_cancelread() can now apply to UDP sockets.
- added shim code to support UDP connection in versions of libuv
prior to 1.27, when uv_udp_connect() was added
all these functions will be used to support outgoing queries in dig,
xfrin, dispatch, etc.
1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking
whether the socket was still alive and scheduling reads/sends on
closed socket.
2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
changed to always return the error conditions via the callbacks, so
they always succeed. This applies to all protocols (UDP, TCP and
TCPDNS).
While working on 'rndc dnssec -rollover' I noticed the following
(small) issues:
- The key files where updated with hints set to "-when" and that
should always be "now.
- The kasp system test did not properly update the test number when
calling 'rndc dnssec -checkds' (and ensuring that works).
- There was a missing ']' in the rndc.c help output.
This command is similar in arguments as -checkds so refactor the
'named_server_dnssec' function accordingly. The only difference
are that:
- It does not take a "publish" or "withdrawn" argument.
- It requires the key id to be set (add a check to make sure).
Add tests that will trigger rollover immediately and one that
schedules a test in the future.
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.
A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:
- reqhandle: held while waiting for a request callback (query,
notify, update)
- sendhandle: held while waiting for a send callback
- fetchhandle: held while waiting for a recursive fetch to
complete
- updatehandle: held while waiting for an update-forwarding
task to complete
control channel connection objects now contain:
- readhandle: held while waiting for a read callback
- sendhandle: held while waiting for a send callback
- cmdhandle: held while an rndc command is running
httpd connections contain:
- readhandle: held while waiting for a read callback
- sendhandle: held while waiting for a send callback
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
indicate whether it is to be used as a client or server socket
Add a new 'rndc' command 'dnssec -checkds' that allows the user to
signal named that a new DS record has been seen published in the
parent, or that an existing DS record has been withdrawn from the
parent.
Upon the 'checkds' request, 'named' will write out the new state for
the key, updating the 'DSPublish' or 'DSRemoved' timing metadata.
This replaces the "parent-registration-delay" configuration option,
this was unreliable because it was purely time based (if the user
did not actually submit the new DS to the parent for example, this
could result in an invalid DNSSEC state).
Because we cannot rely on the parent registration delay for state
transition, we need to replace it with a different guard. Instead,
if a key wants its DS state to be moved to RUMOURED, the "DSPublish"
time must be set and must not be in the future. If a key wants its
DS state to be moved to UNRETENTIVE, the "DSRemoved" time must be set
and must not be in the future.
By default, with '-checkds' you set the time that the DS has been
published or withdrawn to now, but you can set a different time with
'-when'. If there is only one KSK for the zone, that key has its
DS state moved to RUMOURED. If there are multiple keys for the zone,
specify the right key with '-key'.
- using an isc_task to execute all rndc functions makes it relatively
simple for them to acquire task exclusive mode when needed
- control_recvmessage() has been separated into two functions,
control_recvmessage() and control_respond(). the respond function
can be called immediately from control_recvmessage() when processing
a nonce, or it can be called after returning from the task event
that ran the rndc command function.
- updated libisccc to use netmgr events
- updated rndc to use isc_nm_tcpconnect() to establish connections
- updated control channel to use isc_nm_listentcp()
open issues:
- the control channel timeout was previously 60 seconds, but it is now
overridden by the TCP idle timeout setting, which defaults to 30
seconds. we should add a function that sets the timeout value for
a specific listener socket, instead of always using the global value
set in the netmgr. (for the moment, since 30 seconds is a reasonable
timeout for the control channel, I'm not prioritizing this.)
- the netmgr currently has no support for UNIX-domain sockets; until
this is addressed, it will not be possible to configure rndc to use
them. we will need to either fix this or document the change in
behavior.
Add the code and documentation required to provide DNSSEC signing
status through rndc. This does not yet show any useful information,
just provide the command that will output some dummy string.
The rndc.conf main header was missing the header markup and that was
breaking the TOC for all manpages in the ARM because sphinx-build
incorrectly remembered the markup for subheader to be ~~~~ instead of
----.
The ARM and the manpages have been converted into Sphinx documentation
format.
Sphinx uses reStructuredText as its markup language, and many of its
strengths come from the power and straightforwardness of
reStructuredText and its parsing and translating suite, the Docutils.
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
All our MSVS Project files share the same intermediate directory. We
know that this doesn't cause any problems, so we can just disable the
detection in the project files.
Example of the warning:
warning MSB8028: The intermediate directory (.\Release\) contains files shared from another project (dnssectool.vcxproj). This can lead to incorrect clean and rebuild behavior.
Our vcxproj files set the WarningLevel to Level3, which is too verbose
for a code that needs to be portable. That basically leads to ignoring
all the errors that MSVC produces. This commits downgrades the
WarningLevel to Level1 and enables treating warnings as errors for
Release builds. For the Debug builds the WarningLevel got upgraded to
Level4, and treating warnings as errors is explicitly disabled.
We should eventually make the code clean of all MSVC warnings, but it's
a long way to go for Level4, so it's more reasonable to start at Level1.
For reference[1], these are the warning levels as described by MSVC
documentation:
* /W0 suppresses all warnings. It's equivalent to /w.
* /W1 displays level 1 (severe) warnings. /W1 is the default setting
in the command-line compiler.
* /W2 displays level 1 and level 2 (significant) warnings.
* /W3 displays level 1, level 2, and level 3 (production quality)
warnings. /W3 is the default setting in the IDE.
* /W4 displays level 1, level 2, and level 3 warnings, and all level 4
(informational) warnings that aren't off by default. We recommend
that you use this option to provide lint-like warnings. For a new
project, it may be best to use /W4 in all compilations. This option
helps ensure the fewest possible hard-to-find code defects.
* /Wall displays all warnings displayed by /W4 and all other warnings
that /W4 doesn't include — for example, warnings that are off by
default.
* /WX treats all compiler warnings as errors. For a new project, it
may be best to use /WX in all compilations; resolving all warnings
ensures the fewest possible hard-to-find code defects.
1. https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level?view=vs-2019
These are mostly false positives, the clang-analyzer FAQ[1] specifies
why and how to fix it:
> The reason the analyzer often thinks that a pointer can be null is
> because the preceding code checked compared it against null. So if you
> are absolutely sure that it cannot be null, remove the preceding check
> and, preferably, add an assertion as well.
The 4 warnings reported are:
dnssec-cds.c:781:4: warning: Access to field 'base' results in a dereference of a null pointer (loaded from variable 'buf')
isc_buffer_availableregion(buf, &r);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:996:36: note: expanded from macro 'isc_buffer_availableregion'
^
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:821:16: note: expanded from macro 'ISC__BUFFER_AVAILABLEREGION'
(_r)->base = isc_buffer_used(_b); \
^~~~~~~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/buffer.h:152:29: note: expanded from macro 'isc_buffer_used'
((void *)((unsigned char *)(b)->base + (b)->used)) /*d*/
^~~~~~~~~
1 warning generated.
--
byname_test.c:308:34: warning: Access to field 'fwdtable' results in a dereference of a null pointer (loaded from variable 'view')
RUNTIME_CHECK(dns_fwdtable_add(view->fwdtable, dns_rootname,
^~~~~~~~~~~~~~
/builds/isc-projects/bind9/lib/isc/include/isc/util.h:318:52: note: expanded from macro 'RUNTIME_CHECK'
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/error.h:50:21: note: expanded from macro 'ISC_ERROR_RUNTIMECHECK'
((void)(ISC_LIKELY(cond) || \
^~~~
/builds/isc-projects/bind9/lib/isc/include/isc/likely.h:23:43: note: expanded from macro 'ISC_LIKELY'
^
1 warning generated.
--
./rndc.c:255:6: warning: Dereference of null pointer (loaded from variable 'host')
if (*host == '/') {
^~~~~
1 warning generated.
--
./main.c:1254:9: warning: Access to field 'sctx' results in a dereference of a null pointer (loaded from variable 'named_g_server')
sctx = named_g_server->sctx;
^~~~~~~~~~~~~~~~~~~~
1 warning generated.
References:
1. https://clang-analyzer.llvm.org/faq.html#null_pointer
The isc_mem API now crashes on memory allocation failure, and this is
the next commit in series to cleanup the code that could fail before,
but cannot fail now, e.g. isc_result_t return type has been changed to
void for the isc_log API functions that could only return ISC_R_SUCCESS.
When --with-zlib is passed to ./configure (or when the latter
autodetects zlib's presence), libisc uses certain zlib functions and
thus libisc's users should be linked against zlib in that case. Adjust
Makefile variables appropriately to prevent shared build failures caused
by underlinking.
The isc_buffer_allocate() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_buffer_allocate() now returns void.
Previously, the dns_name API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables.
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.