Commit Graph

116 Commits

Author SHA1 Message Date
Ondřej Surý
db49ffca20 Change the isc_nm_(get|set)timeouts() to work with milliseconds
The RFC7828 specifies the keepalive interval to be 16-bit, specified in
units of 100 milliseconds and the configuration options tcp-*-timeouts
are following the suit.  The units of 100 milliseconds are very
unintuitive and while we can't change the configuration and presentation
format, we should not follow this weird unit in the API.

This commit changes the isc_nm_(get|set)timeouts() functions to work
with milliseconds and convert the values to milliseconds before passing
them to the function, not just internally.
2021-03-18 15:16:13 +01:00
Ondřej Surý
5d0647e067 Merge the common parts between udp, tcpdns and tlsdns protocol
The udp, tcpdns and tlsdns contained lot of cut&paste code or code that
was very similar making the stack harder to maintain as any change to
one would have to be copied to the the other protocols.

In this commit, we merge the common parts into the common functions
under isc__nm_<foo> namespace and just keep the little differences based
on the socket type.
2021-03-18 15:16:13 +01:00
Ondřej Surý
a017ba2615 Fix TCPDNS and TLSDNS timers
After the TCPDNS refactoring the initial and idle timers were broken and
only the tcp-initial-timeout was always applied on the whole TCP
connection.

This broke any TCP connection that took longer than tcp-initial-timeout,
most often this would affect large zone AXFRs.

This commit changes the timeout logic in this way:

  * On TCP connection accept the tcp-initial-timeout is applied
    and the timer is started
  * When we are processing and/or sending any DNS message the timer is
    stopped
  * When we stop processing all DNS messages, the tcp-idle-timeout
    is applied and the timer is started again
2021-03-18 15:16:13 +01:00
Ondřej Surý
2f0f531ee8 Use library constructor/destructor to initialize OpenSSL
Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use
gcc/clang attributes on POSIX and DLLMain on Windows to initialize and
shutdown OpenSSL library.

This resolves the issue when isc_nm_create() / isc_nm_destroy() was
called multiple times and it would call OpenSSL library destructors from
isc_nm_destroy().

At the same time, since we now have introduced the ctor/dtor for libisc,
this commit moves the isc_mem API initialization (the list of the
contexts) and changes the isc_mem_checkdestroyed() to schedule the
checking of memory context on library unload instead of executing the
code immediately.
2021-02-26 17:18:06 +01:00
Ondřej Surý
effe3ee595 Refactor TLSDNS module to work with libuv/ssl directly
* Following the example set in 634bdfb16d, the tlsdns netmgr
  module now uses libuv and SSL primitives directly, rather than
  opening a TLS socket which opens a TCP socket, as the previous
  model was difficult to debug.  Closes #2335.

* Remove the netmgr tls layer (we will have to re-add it for DoH)

* Add isc_tls API to wrap the OpenSSL SSL_CTX object into libisc
  library; move the OpenSSL initialization/deinitialization from dstapi
  needed for OpenSSL 1.0.x to the isc_tls_{initialize,destroy}()

* Add couple of new shims needed for OpenSSL 1.0.x

* When LibreSSL is used, require at least version 2.7.0 that
  has the best OpenSSL 1.1.x compatibility and auto init/deinit

* Enforce OpenSSL 1.1.x usage on Windows

(cherry picked from commit e493e04c0f)
2021-02-26 16:14:50 +01:00
Ondřej Surý
d7b3a6a016 Rollback setting IP_DONTFRAG option on the UDP sockets
In DNS Flag Day 2020, the development branch started setting the
IP_DONTFRAG option on the UDP sockets.  It turned out, that this
code was incomplete leading to dropping the outgoing UDP packets.
Henceforth this commit rolls back this setting until we have a
proper fix that would send back empty response with TC flag set.

(cherry picked from commit 66eefac78c)
2021-02-17 14:41:56 +01:00
Ondřej Surý
90979a79e2 Sync the func() -> func(void) in netmgr 2020-12-09 10:46:16 +01:00
Ondřej Surý
bb9b55dfba Use sock->nchildren instead of mgr->nworkers when initializing NM
On Windows, we were limiting the number of listening children to just 1,
but we were then iterating on mgr->nworkers.  That lead to scheduling
more async_*listen() than actually allocated and out-of-bound read-write
operation on the heap.

(cherry picked from commit 87c5867202)
2020-12-09 10:46:16 +01:00
Ondřej Surý
90a9b0611a Add FreeBSD connection timeout socket option
On FreeBSD, the option to configure connection timeout is called
TCP_KEEPINIT, use it to configure the connection timeout there.

This also fixes the dangling socket problems in the unit test, so
re-enable them.
2020-12-09 10:46:16 +01:00
Ondřej Surý
0ee8672692 Distribute queries among threads even on platforms without lb sockets
On platforms without load-balancing socket all the queries would be
handle by a single thread.  Currently, the support for load-balanced
sockets is present in Linux with SO_REUSEPORT and FreeBSD 12 with
SO_REUSEPORT_LB.

This commit adds workaround for such platforms that:

1. setups single shared listening socket for all listening nmthreads for
   UDP, TCP and TCPDNS netmgr transports

2. Calls uv_udp_bind/uv_tcp_bind on the underlying socket just once and
   for rest of the nmthreads only copy the internal libuv flags (should
   be just UV_HANDLE_BOUND and optionally UV_HANDLE_IPV6).

3. start reading on UDP socket or listening on TCP socket

The load distribution among the nmthreads is uneven, but it's still
better than utilizing just one thread for processing all the incoming
queries
2020-12-09 10:46:16 +01:00
Michał Kępień
12fa8a7aed Make netmgr initialize and cleanup Winsock itself
On Windows, WSAStartup() needs to be called to initialize Winsock before
any sockets are created or else socket() calls will return error code
10093 (WSANOTINITIALISED).  Since BIND's Network Manager is intended to
work as a reusable networking library, it should take care of calling
WSAStartup() - and its cleanup counterpart, WSACleanup() - itself rather
than relying on external code to do it.  Add the necessary WSAStartup()
and WSACleanup() calls to isc_nm_start() and isc_nm_destroy(),
respectively.

(cherry picked from commit 88f96faba8)
2020-12-09 10:46:16 +01:00
Michał Kępień
216fc34490 Extend log message for unexpected socket() errors
Make sure the error code is included in the message logged for
unexpected socket creation errors in order to facilitate troubleshooting
on Windows.

(cherry picked from commit dc2e1dea86)
2020-12-09 10:46:16 +01:00
Ondřej Surý
48759bd047 Fix the data race in accessing the isc_nm_t timers
The following TSAN report about accessing the mgr timers (mgr->init,
mgr->idle, mgr->keepalive and mgr->advertised) has been fixed in this
commit:

    ==================
    WARNING: ThreadSanitizer: data race (pid=2746)
    Read of size 4 at 0x7b440008a948 by thread T18:
    #0 isc__nm_tcpdns_read /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 (libisc.so.1706+0x2ba0f)
    #1 isc_nm_read /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1679:3 (libisc.so.1706+0x22258)
    #2 tcpdns_connect_connect_cb /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:363:2 (tcpdns_test+0x4bc5fb)
    #3 isc__nm_async_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1816:2 (libisc.so.1706+0x228c9)
    #4 isc__nm_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1791:3 (libisc.so.1706+0x22713)
    #5 tcpdns_connect_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:343:2 (libisc.so.1706+0x2d89d)
    #6 uv__stream_connect /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1381:5 (libuv.so.1+0x27c18)
    #7 uv__stream_io /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1298:5 (libuv.so.1+0x25977)
    #8 uv__io_poll /home/ondrej/Projects/tsan/libuv/src/unix/linux-core.c:462:11 (libuv.so.1+0x2e795)
    #9 uv_run /home/ondrej/Projects/tsan/libuv/src/unix/core.c:385:5 (libuv.so.1+0x158ec)
    #10 nm_thread /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:530:11 (libisc.so.1706+0x1c94a)

    Previous write of size 4 at 0x7b440008a948 by main thread:
    #0 isc_nm_settimeouts /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:490:12 (libisc.so.1706+0x1dda5)
    #1 tcpdns_recv_two /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:601:2 (tcpdns_test+0x4bad0e)
    #2 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70be)
    #3 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Location is heap block of size 281 at 0x7b440008a840 allocated by main thread:
    #0 malloc <null> (tcpdns_test+0x42864b)
    #1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:713:8 (libisc.so.1706+0x6d261)
    #2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:622:8 (libisc.so.1706+0x69b9c)
    #3 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1044:9 (libisc.so.1706+0x6d379)
    #4 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2432:10 (libisc.so.1706+0x6889e)
    #5 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:203:8 (libisc.so.1706+0x1c219)
    #6 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #7 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #8 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Thread T18 'isc-net-0000' (tid=3513, running) created by main thread at:
    #0 pthread_create <null> (tcpdns_test+0x429e7b)
    #1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:73:8 (libisc.so.1706+0x8476a)
    #2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:271:3 (libisc.so.1706+0x1c66a)
    #3 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #4 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #5 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 in isc__nm_tcpdns_read
    ==================
    ThreadSanitizer: reported 1 warnings

(cherry picked from commit 2e1dd56d0b)
2020-12-09 10:46:16 +01:00
Ondřej Surý
a61b7294c2 Avoid netievent allocations when the callbacks can be called directly
After turning the users callbacks to be asynchronous, there was a
visible performance drop.  This commit prevents the unnecessary
allocations while keeping the code paths same for both asynchronous and
synchronous calls.

The same change was done to the isc__nm_udp_{read,send} as those two
functions are in the hot path.

(cherry picked from commit d6d2fbe0e9)
2020-12-09 10:46:16 +01:00
Ondřej Surý
7b9c8b9781 Refactor netmgr and add more unit tests
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested.  It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.

NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.

The changes that are included in this commit are listed here
(extensively, but not exclusively):

* The netmgr_test unit test was split into individual tests (udp_test,
  tcp_test, tcpdns_test and newly added tcp_quota_test)

* The udp_test and tcp_test has been extended to allow programatic
  failures from the libuv API.  Unfortunately, we can't use cmocka
  mock() and will_return(), so we emulate the behaviour with #define and
  including the netmgr/{udp,tcp}.c source file directly.

* The netievents that we put on the nm queue have variable number of
  members, out of these the isc_nmsocket_t and isc_nmhandle_t always
  needs to be attached before enqueueing the netievent_<foo> and
  detached after we have called the isc_nm_async_<foo> to ensure that
  the socket (handle) doesn't disappear between scheduling the event and
  actually executing the event.

* Cancelling the in-flight TCP connection using libuv requires to call
  uv_close() on the original uv_tcp_t handle which just breaks too many
  assumptions we have in the netmgr code.  Instead of using uv_timer for
  TCP connection timeouts, we use platform specific socket option.

* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}

  When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
  waiting for socket to either end up with error (that path was fine) or
  to be listening or connected using condition variable and mutex.

  Several things could happen:

    0. everything is ok

    1. the waiting thread would miss the SIGNAL() - because the enqueued
       event would be processed faster than we could start WAIT()ing.
       In case the operation would end up with error, it would be ok, as
       the error variable would be unchanged.

    2. the waiting thread miss the sock->{connected,listening} = `true`
       would be set to `false` in the tcp_{listen,connect}close_cb() as
       the connection would be so short lived that the socket would be
       closed before we could even start WAIT()ing

* The tcpdns has been converted to using libuv directly.  Previously,
  the tcpdns protocol used tcp protocol from netmgr, this proved to be
  very complicated to understand, fix and make changes to.  The new
  tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
  Closes: #2194, #2283, #2318, #2266, #2034, #1920

* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
  pass accepted TCP sockets between netthreads, but instead (similar to
  UDP) uses per netthread uv_loop listener.  This greatly reduces the
  complexity as the socket is always run in the associated nm and uv
  loops, and we are also not touching the libuv internals.

  There's an unfortunate side effect though, the new code requires
  support for load-balanced sockets from the operating system for both
  UDP and TCP (see #2137).  If the operating system doesn't support the
  load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
  on FreeBSD 12+), the number of netthreads is limited to 1.

* The netmgr has now two debugging #ifdefs:

  1. Already existing NETMGR_TRACE prints any dangling nmsockets and
     nmhandles before triggering assertion failure.  This options would
     reduce performance when enabled, but in theory, it could be enabled
     on low-performance systems.

  2. New NETMGR_TRACE_VERBOSE option has been added that enables
     extensive netmgr logging that allows the software engineer to
     precisely track any attach/detach operations on the nmsockets and
     nmhandles.  This is not suitable for any kind of production
     machine, only for debugging.

* The tlsdns netmgr protocol has been split from the tcpdns and it still
  uses the old method of stacking the netmgr boxes on top of each other.
  We will have to refactor the tlsdns netmgr protocol to use the same
  approach - build the stack using only libuv and openssl.

* Limit but not assert the tcp buffer size in tcp_alloc_cb
  Closes: #2061

(cherry picked from commit 634bdfb16d)
2020-12-09 10:46:16 +01:00
Ondřej Surý
fa9ca83862 Turn all the callback to be always asynchronous
When calling the high level netmgr functions, the callback would be
sometimes called synchronously if we catch the failure directly, or
asynchronously if it happens later.  The synchronous call to the
callback could create deadlocks as the caller would not expect the
failed callback to be executed directly.

(cherry picked from commit a49d88568f)
2020-12-09 10:46:16 +01:00
Witold Kręcicki
4a854da141 netmgr: server-side TLS support
Add server-side TLS support to netmgr - that includes moving some of the
isc_nm_ functions from tcp.c to a wrapper in netmgr.c calling a proper
tcp or tls function, and a new isc_nm_listentls() function.

Add DoT support to tcpdns - isc_nm_listentlsdns().

(cherry picked from commit b2ee0e9dc3)
2020-12-09 10:46:16 +01:00
Ondřej Surý
c4dcedd2dc netmgr: Don't crash if socket() returns an error in udpconnect
socket() call can return an error - e.g. EMFILE, so we need to handle
this nicely and not crash.

Additionally wrap the socket() call inside a platform independent helper
function as the Socket data type on Windows is unsigned integer:

> This means, for example, that checking for errors when the socket and
> accept functions return should not be done by comparing the return
> value with –1, or seeing if the value is negative (both common and
> legal approaches in UNIX). Instead, an application should use the
> manifest constant INVALID_SOCKET as defined in the Winsock2.h header
> file.

(cherry picked from commit 8af7f81d6c)
2020-12-09 10:46:16 +01:00
Evan Hunt
4598d7b30d add isc_nmhandle_settimeout() function
this function sets the read timeout for the socket associated
with a netmgr handle and, if the timer is running, resets it.
for TCPDNS sockets it also sets the read timeout and resets the
timer on the outer TCP socket.

(cherry picked from commit 4be63c5b00)
2020-12-09 10:46:16 +01:00
Ondřej Surý
5877befb51 fix nmhandle attach/detach errors in tcpdnsconnect_cb()
we need to attach to the statichandle when connecting TCPDNS sockets,
same as with UDP.

(cherry picked from commit 2191d2bf44)
2020-12-09 10:46:16 +01:00
Ondřej Surý
e35b8db249 Fix more races between connect and shutdown
There were more races that could happen while connecting to a
socket while closing or shutting down the same socket.  This
commit introduces a .closing flag to guard the socket from
being closed twice.

(cherry picked from commit ed3ab63f74)
2020-12-09 10:46:16 +01:00
Ondřej Surý
d8c3e48970 Fix a race between isc__nm_async_shutdown() and new sends/reads
There was a data race where a new event could be scheduled after
isc__nm_async_shutdown() had cleaned up all the dangling UDP/TCP
sockets from the loop.

(cherry picked from commit 6cfadf9db0)
2020-12-09 10:46:16 +01:00
Ondřej Surý
7945fb0c90 Fix netmgr read/connect timeout issues
- don't bother closing sockets that are already closing.
- UDP read timeout timer was not stopped after reading.
- improve handling of TCP connection failures.

(cherry picked from commit cdccac4993)
2020-12-09 10:46:16 +01:00
Ondřej Surý
e9354e7bfe Add isc__nm_udp_shutdown() function
This function will be called during isc_nm_closedown() to ensure
that all UDP sockets are closed and detached.

(cherry picked from commit 7a6056bc8f)
2020-12-09 10:46:16 +01:00
Evan Hunt
c919a3338f add netmgr functions to support outgoing DNS queries
- isc_nm_tcpdnsconnect() sets up up an outgoing TCP DNS connection.
- isc_nm_tcpconnect(), _udpconnect() and _tcpdnsconnect() now take a
  timeout argument to ensure connections time out and are correctly
  cleaned up on failure.
- isc_nm_read() now supports UDP; it reads a single datagram and then
  stops until the next time it's called.
- isc_nm_cancelread() now runs asynchronously to prevent assertion
  failure if reading is interrupted by a non-network thread (e.g.
  a timeout).
- isc_nm_cancelread() can now apply to UDP sockets.
- added shim code to support UDP connection in versions of libuv
  prior to 1.27, when uv_udp_connect() was added

all these functions will be used to support outgoing queries in dig,
xfrin, dispatch, etc.

(cherry picked from commit 5dcdc00b93)
2020-12-09 10:46:16 +01:00
Ondřej Surý
bca8604bf3 Fix the data race when read-writing sock->active by using cmpxchg
(cherry picked from commit 8797e5efd5)
2020-10-22 15:00:07 -07:00
Ondřej Surý
301e4145de Fix the isc_nm_closedown() to actually close the pending connections
1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking
   whether the socket was still alive and scheduling reads/sends on
   closed socket.

2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
   changed to always return the error conditions via the callbacks, so
   they always succeed.  This applies to all protocols (UDP, TCP and
   TCPDNS).

(cherry picked from commit f7c82e406e)
2020-10-22 15:00:00 -07:00
Mark Andrews
da0a7a34ec Complete the isc_nmhandle_detach() in the worker thread.
isc_nmhandle_detach() needs to complete in the same thread
as shutdown_walk_cb() to avoid a race.  Clear the caller's
pointer then pass control to the worker if necessary.

    WARNING: ThreadSanitizer: data race
    Write of size 8 at 0x000000000001 by thread T1:
    #0 isc_nmhandle_detach lib/isc/netmgr/netmgr.c:1258:15
    #1 control_command bin/named/controlconf.c:388:3
    #2 dispatch lib/isc/task.c:1152:7
    #3 run lib/isc/task.c:1344:2

    Previous read of size 8 at 0x000000000001 by thread T2:
    #0 isc_nm_pauseread lib/isc/netmgr/netmgr.c:1449:33
    #1 recv_data lib/isccc/ccmsg.c:109:2
    #2 isc__nm_tcp_shutdown lib/isc/netmgr/tcp.c:1157:4
    #3 shutdown_walk_cb lib/isc/netmgr/netmgr.c:1515:3
    #4 uv_walk <null>
    #5 process_queue lib/isc/netmgr/netmgr.c:659:4
    #6 process_normal_queue lib/isc/netmgr/netmgr.c:582:10
    #7 process_queues lib/isc/netmgr/netmgr.c:590:8
    #8 async_cb lib/isc/netmgr/netmgr.c:548:2
    #9 <null> <null>

(cherry picked from commit f95ba8aa20)
2020-10-15 11:03:47 +11:00
Ondřej Surý
53d6a11a0e Clone the csock in accept_connection(), not in callback
If we clone the csock (children socket) in TCP accept_connection()
instead of passing the ssock (server socket) to the call back and
cloning it there we unbreak the assumption that every socket is handled
inside it's own worker thread and therefore we can get rid of (at least)
callback locking.

(cherry picked from commit e8b56acb49)
2020-10-08 08:16:54 +02:00
Ondřej Surý
69e6c8467c Change the isc__nm_tcpdns_stoplistening() to be asynchronous event
The isc__nm_tcpdns_stoplistening() would call isc__nmsocket_clearcb()
that would clear the .accept_cb from non-netmgr thread.  Change the
tcpdns_stoplistening to enqueue ievent that would get processed in the
right netmgr thread to avoid locking.

(cherry picked from commit d86a74d8a4)
2020-10-08 08:16:53 +02:00
Ondřej Surý
ccd2902a02 Split reusing the addr/port and load-balancing socket options
The SO_REUSEADDR, SO_REUSEPORT and SO_REUSEPORT_LB has different meaning
on different platform. In this commit, we split the function to set the
reuse of address/port and setting the load-balancing into separate
functions.

The libuv library already have multiplatform support for setting
SO_REUSEADDR and SO_REUSEPORT that allows binding to the same address
and port, but unfortunately, when used after the load-balancing socket
options have been already set, it overrides the previous setting, so we
need our own helper function to enable the SO_REUSEADDR/SO_REUSEPORT
first and then enable the load-balancing socket option.

(cherry picked from commit fd975a551d)
2020-10-05 16:19:23 +02:00
Ondřej Surý
5cff119533 Use uv_os_sock_t instead of uv_os_fd_t for sockets
On POSIX based systems both uv_os_sock_t and uv_os_fd_t are both typedef
to int.  That's not true on Windows, where uv_os_sock_t is SOCKET and
uv_os_fd_t is HANDLE and they differ in level of indirection.

(cherry picked from commit acb6ad9e3c)
2020-10-05 16:19:23 +02:00
Ondřej Surý
601fc37efe Refactor isc__nm_socket_freebind() to take fd and sa_family as args
The isc__nm_socket_freebind() has been refactored to match other
isc__nm_socket_...() helper functions and take uv_os_fd_t and
sa_family_t as function arguments.

(cherry picked from commit 9dc01a636b)
2020-10-05 16:19:23 +02:00
Ondřej Surý
c3d721b13b Add helper function to enable DF (don't fragment) flag on UDP sockets
This commits add isc__nm_socket_dontfrag() helper functions.

(cherry picked from commit d685bbc822)
2020-10-05 16:19:23 +02:00
Ondřej Surý
72b85a4100 Add SO_REUSEPORT and SO_INCOMING_CPU helper functions
The setting of SO_REUSE**** and SO_INCOMING_CPU have been moved into a
separate helper functions.

(cherry picked from commit 5daaca7146)
2020-10-05 16:19:23 +02:00
Ondřej Surý
1126fe3b5b Refactor the pausing/unpausing and finishing the nm_thread
The isc_nm_pause(), isc_nm_resume() and finishing the nm_thread() from
nm_destroy() has been refactored, so all use the netievents instead of
directly touching the worker structure members.  This allows us to
remove most of the locking as the .paused and .finished members are
always accessed from the matching nm_thread.

When shutting down the nm_thread(), instead of issuing uv_stop(), we
just shutdown the .async handler, so all uv_loop_t events are properly
finished first and uv_run() ends gracefully with no outstanding active
handles in the loop.

(cherry picked from commit e5ab137ba3)
2020-10-01 18:09:35 +02:00
Witold Kręcicki
4a7dfd69ac tracing of active sockets and handles
If NETMGR_TRACE is defined, we now maintain a list of active sockets
in the netmgr object and a list of active handles in each socket
object; by walking the list and printing `backtrace` in a debugger
we can see where they were created, to assist in in debugging of
reference counting errors.

On shutdown, if netmgr finds there are still active sockets after
waiting, isc__nm_dump_active() will be called to log the list of
active sockets and their underlying handles, along with some details
about them.

(cherry picked from commit 00e04a86c8)
2020-10-01 18:09:35 +02:00
Evan Hunt
686b73ae25 limit the time we wait for netmgr to be destroyed
if more than 10 seconds pass while we wait for netmgr events to
finish running on shutdown, something is almost certainly wrong
and we should assert and crash.

(cherry picked from commit 2f2d60a989)
2020-10-01 18:09:35 +02:00
Ondřej Surý
5a92958fba properly lock the setting/unsetting of callbacks in isc_nmsocket_t
changes to socket callback functions were not thread safe.

(cherry picked from commit 89c534d3b9)
2020-10-01 18:09:35 +02:00
Evan Hunt
ba2e9dfb99 change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.

A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:

        - reqhandle:    held while waiting for a request callback (query,
                        notify, update)
        - sendhandle:   held while waiting for a send callback
        - fetchhandle:  held while waiting for a recursive fetch to
                        complete
        - updatehandle: held while waiting for an update-forwarding
                        task to complete

(cherry picked from commit 57b4dde974)
2020-10-01 18:09:35 +02:00
Witold Kręcicki
0202b289c2 assorted small netmgr-related changes
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
  indicate whether it is to be used as a client or server socket

(cherry picked from commit 7eb4564895)
2020-10-01 16:44:43 +02:00
Evan Hunt
7a4e97ef50 Use different allocators for UDP and TCP
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.

This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.

(cherry picked from commit 38264b6a4d)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
f0b089d922 netmgr: retry binding with IP_FREEBIND when EADDRNOTAVAIL is returned.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.

(cherry picked from commit a0f7d28967)
2020-10-01 16:44:43 +02:00
Evan Hunt
bc5ea9d65e use handles for isc_nm_pauseread() and isc_nm_resumeread()
by having these functions act on netmgr handles instead of socket
objects, they can be used in callback functions outside the netgmr.

(cherry picked from commit 55896df79d)
2020-10-01 16:44:43 +02:00
Evan Hunt
6b77bd309a Don't destroy a non-closed socket, wait for all the callbacks.
We erroneously tried to destroy a socket after issuing
isc__nm_tcp{,dns}_close. Under some (race) circumstances we could get
nm_socket_cleanup to be called twice for the same socket, causing an
access to a dead memory.

(cherry picked from commit 233f134a4f)
2020-10-01 16:44:43 +02:00
Evan Hunt
80569bf977 Make netmgr tcpdns send calls asynchronous
isc__nm_tcpdns_send() was not asynchronous and accessed socket
internal fields in an unsafe manner, which could lead to a race
condition and subsequent crash. Fix it by moving tcpdns processing
to a proper netmgr thread.

(cherry picked from commit 591b79b597)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
3942b226b8 Fix a shutdown race in netmgr udp
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process; otherwise we might destroy the
callback argument before we actually stop listening, and call
the callback on bad memory.

(cherry picked from commit 1cf65cd882)
2020-10-01 16:44:43 +02:00
Evan Hunt
ca39572e5d clean up outerhandle when a tcpdns socket is disconnected
this prevents a crash when some non-netmgr thread, such as a
recursive lookup, times out after the TCP socket is already
disconnected.

(cherry picked from commit 3704c4fff2)
2020-10-01 16:44:43 +02:00
Evan Hunt
d9d482e9e2 implement isc_nm_cancelread()
The isc_nm_cancelread() function cancels reading on a connected
socket and calls its read callback function with a 'result'
parameter of ISC_R_CANCELED.

(cherry picked from commit 5191ec8f86)
2020-10-01 16:44:43 +02:00
Evan Hunt
e1ebbaacea shorten the sleep in isc_nm_destroy()
when isc_nm_destroy() is called, there's a loop that waits for
other references to be detached, pausing and unpausing the netmgr
to ensure that all the workers' events are run, followed by a
1-second sleep. this caused a delay on shutdown which will be
noticeable when netmgr is used in tools other than named itself,
so the delay has now been reduced to a hundredth of a second.

(cherry picked from commit 870204fe47)
2020-10-01 16:44:43 +02:00