Commit Graph

48 Commits

Author SHA1 Message Date
Ondřej Surý
bca8604bf3 Fix the data race when read-writing sock->active by using cmpxchg
(cherry picked from commit 8797e5efd5)
2020-10-22 15:00:07 -07:00
Ondřej Surý
301e4145de Fix the isc_nm_closedown() to actually close the pending connections
1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking
   whether the socket was still alive and scheduling reads/sends on
   closed socket.

2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
   changed to always return the error conditions via the callbacks, so
   they always succeed.  This applies to all protocols (UDP, TCP and
   TCPDNS).

(cherry picked from commit f7c82e406e)
2020-10-22 15:00:00 -07:00
Ondřej Surý
81085bbeca Fix the way udp_send_direct() is used
There were two problems how udp_send_direct() was used:

1. The udp_send_direct() can return ISC_R_CANCELED (or translated error
   from uv_udp_send()), but the isc__nm_async_udpsend() wasn't checking
   the error code and not releasing the uvreq in case of an error.

2. In isc__nm_udp_send(), when the UDP send is already in the right
   netthread, it uses udp_send_direct() to send the UDP packet right
   away.  When that happened the uvreq was not freed, and the error code
   was returned to the caller.  We need to return ISC_R_SUCCESS and
   rather use the callback to report an error in such case.

(cherry picked from commit afca2e3b21)
2020-10-22 14:59:00 -07:00
Ondřej Surý
53d6a11a0e Clone the csock in accept_connection(), not in callback
If we clone the csock (children socket) in TCP accept_connection()
instead of passing the ssock (server socket) to the call back and
cloning it there we unbreak the assumption that every socket is handled
inside it's own worker thread and therefore we can get rid of (at least)
callback locking.

(cherry picked from commit e8b56acb49)
2020-10-08 08:16:54 +02:00
Ondřej Surý
ccd2902a02 Split reusing the addr/port and load-balancing socket options
The SO_REUSEADDR, SO_REUSEPORT and SO_REUSEPORT_LB has different meaning
on different platform. In this commit, we split the function to set the
reuse of address/port and setting the load-balancing into separate
functions.

The libuv library already have multiplatform support for setting
SO_REUSEADDR and SO_REUSEPORT that allows binding to the same address
and port, but unfortunately, when used after the load-balancing socket
options have been already set, it overrides the previous setting, so we
need our own helper function to enable the SO_REUSEADDR/SO_REUSEPORT
first and then enable the load-balancing socket option.

(cherry picked from commit fd975a551d)
2020-10-05 16:19:23 +02:00
Ondřej Surý
601fc37efe Refactor isc__nm_socket_freebind() to take fd and sa_family as args
The isc__nm_socket_freebind() has been refactored to match other
isc__nm_socket_...() helper functions and take uv_os_fd_t and
sa_family_t as function arguments.

(cherry picked from commit 9dc01a636b)
2020-10-05 16:19:23 +02:00
Ondřej Surý
72b85a4100 Add SO_REUSEPORT and SO_INCOMING_CPU helper functions
The setting of SO_REUSE**** and SO_INCOMING_CPU have been moved into a
separate helper functions.

(cherry picked from commit 5daaca7146)
2020-10-05 16:19:23 +02:00
Ondřej Surý
5a92958fba properly lock the setting/unsetting of callbacks in isc_nmsocket_t
changes to socket callback functions were not thread safe.

(cherry picked from commit 89c534d3b9)
2020-10-01 18:09:35 +02:00
Evan Hunt
ba2e9dfb99 change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.

A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:

        - reqhandle:    held while waiting for a request callback (query,
                        notify, update)
        - sendhandle:   held while waiting for a send callback
        - fetchhandle:  held while waiting for a recursive fetch to
                        complete
        - updatehandle: held while waiting for an update-forwarding
                        task to complete

(cherry picked from commit 57b4dde974)
2020-10-01 18:09:35 +02:00
Witold Kręcicki
0202b289c2 assorted small netmgr-related changes
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
  indicate whether it is to be used as a client or server socket

(cherry picked from commit 7eb4564895)
2020-10-01 16:44:43 +02:00
Evan Hunt
7a4e97ef50 Use different allocators for UDP and TCP
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.

This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.

(cherry picked from commit 38264b6a4d)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
f0b089d922 netmgr: retry binding with IP_FREEBIND when EADDRNOTAVAIL is returned.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.

(cherry picked from commit a0f7d28967)
2020-10-01 16:44:43 +02:00
Witold Kręcicki
3942b226b8 Fix a shutdown race in netmgr udp
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process; otherwise we might destroy the
callback argument before we actually stop listening, and call
the callback on bad memory.

(cherry picked from commit 1cf65cd882)
2020-10-01 16:44:43 +02:00
Evan Hunt
f64a881a30 change the signature of recv callbacks to include a result code
this will allow recv event handlers to distinguish between cases
in which the region is NULL because of error, shutdown, or cancelation.

(cherry picked from commit 75c985c07f)
2020-10-01 16:44:43 +02:00
Evan Hunt
4209f051e9 modify reference counting within netmgr
- isc__nmhandle_get() now attaches to the sock in the nmhandle object.
  the caller is responsible for dereferencing the original socket
  pointer when necessary.
- tcpdns listener sockets attach sock->outer to the outer tcp listener
  socket. tcpdns connected sockets attach sock->outerhandle to the handle
  for the tcp connected socket.
- only listener sockets need to be attached/detached directly. connected
  sockets should only be accessed and reference-counted via their
  associated handles.

(cherry picked from commit 5ea26ee1f1)
2020-10-01 16:44:43 +02:00
Evan Hunt
573bcdf932 make isc_nmsocket_{attach,detach}{} functions private
there is no need for a caller to reference-count socket objects.
they need tto be able tto close listener sockets (i.e., those
returned by isc_nm_listen{udp,tcp,tcpdns}), and an isc_nmsocket_close()
function has been added for that. other sockets are only accessed via
handles.

(cherry picked from commit 9e740cad21)
2020-10-01 16:44:43 +02:00
Ondřej Surý
826ddb246e Revert the tree to allow cherry-picking netmgr changes from main
The following reverted changes will be picked again as part of the
netmgr sync with main branch.

Revert "Merge branch '1996-confidential-issue-v9_16' into 'security-v9_16'"

This reverts commit e160b1509f, reversing
changes made to c01e643715.

Revert "Merge branch '2038-use-freebind-when-bind-fails-v9_16' into 'v9_16'"

This reverts commit 5f8ecfb918, reversing
changes made to 23021385d5.

Revert "Merge branch '1936-blackhole-fix-v9_16' into 'v9_16'"

This reverts commit f20bc90a72, reversing
changes made to 490016ebf1.

Revert "Merge branch '1938-fix-udp-race' into 'v9_16'"

This reverts commit 0a6c7ab2a9, reversing
changes made to 4ea84740e6.

Revert "Merge branch '1947-fix-tcpdns-race' into 'v9_16'"

This reverts commit 4ea84740e6, reversing
changes made to d761cd576b.
2020-10-01 16:44:43 +02:00
Evan Hunt
df698d73f4 update all copyright headers to eliminate the typo 2020-09-14 16:50:58 -07:00
Evan Hunt
9a372f2bce Use different allocators for UDP and TCP
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.

This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
2020-08-05 12:57:58 +02:00
Witold Kręcicki
a12076cc52 netmgr: retry binding with IP_FREEBIND when EADDRNOTAVAIL is returned.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.

(cherry picked from commit a0f7d28967)
2020-07-31 13:33:06 +02:00
Witold Kręcicki
4582ef3bb2 Fix a shutdown race in netmgr udp.
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process - otherwise we might destroy the
callback argument before actually stopping listening, and call
the callback on a bad memory.
2020-06-26 01:44:03 -07:00
Ondřej Surý
1217916c1e Don't check the result of setting SO_INCOMING_CPU
The SO_INCOMING_CPU is available since Linux 3.19 for getting the value,
but only since Linux 4.4 for setting the value (see below for a full
description).  BIND 9 should not fail when setting the option on the
socket fails, as this is only an optimization and not hard requirement
to run BIND 9.

    SO_INCOMING_CPU (gettable since Linux 3.19, settable since Linux 4.4)
        Sets or gets the CPU affinity of a socket.  Expects an integer flag.

            int cpu = 1;
            setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, sizeof(cpu));

        Because all of the packets for a single stream (i.e., all
	packets for the same 4-tuple) arrive on the single RX queue that
	is associated with a particular CPU, the typical use case is to
	employ one listening process per RX queue, with the incoming
	flow being handled by a listener on the same CPU that is
	handling the RX queue.  This provides optimal NUMA behavior and
	keeps CPU caches hot.

(cherry picked from commit 4ec357da0a)
2020-06-03 12:47:21 +02:00
Witold Kręcicki
444a16bff9 Don't set UDP recv/send buffer sizes - use system defaults (unless explicitly defined)
(cherry picked from commit fa02f6438b)
2020-05-01 17:47:19 +02:00
Ondřej Surý
c56cd29bbb Use SO_REUSEPORT only on Linux, use SO_REUSEPORT_LB on FreeBSD
The SO_REUSEPORT socket option on Linux means something else on BSD
based systems.  On FreeBSD there's 1:1 option SO_REUSEPORT_LB, so we can
use that.

(cherry picked from commit 09ba47b067)
2020-05-01 16:50:06 +02:00
Witold Kręcicki
786a289dfb Don't free udp recv buffer if UV_UDP_MMSG_CHUNK is set
(cherry picked from commit 83049ceabf)
2020-05-01 11:27:46 +02:00
Ondřej Surý
cf7975400e Use UV_UDP_RECVMMSG to enable mmsg support in libuv if available
(cherry picked from commit d5356a40ff)
2020-05-01 11:27:46 +02:00
Witold Kręcicki
365636dbc9 netmgr refactoring: use generic functions when operating on sockets.
tcpdns used transport-specific functions to operate on the outer socket.
Use generic ones instead, and select the proper call in netmgr.c.
Make the missing functions (e.g. isc_nm_read) generic and add type-specific
calls (isc__nm_tcp_read). This is the preparation for netmgr TLS layer.

(cherry picked from commit 5fedd21e16)
2020-04-03 13:44:28 +02:00
Evan Hunt
d794d85ce1 comments
(cherry picked from commit 0b76d8a490)
2020-02-28 10:05:25 +01:00
Witold Kręcicki
bd33adfb67 use SO_INCOMING_CPU for UDP sockets
(cherry picked from commit 517e6eccdf)
2020-02-28 10:05:25 +01:00
Witold Kręcicki
4e422b3f10 We don't need to fill udp local address every time since we are bound to it.
(cherry picked from commit a658f7976c)
2020-02-28 10:05:25 +01:00
Witold Kręcicki
f7039eb27e Use the original threadid when sending a UDP packet to decrease probability of context switching
(cherry picked from commit eb874608c1)
2020-02-28 10:05:25 +01:00
Ondřej Surý
829b461c54 Merge branch '46-enforce-clang-format-rules' into 'master'
Start enforcing the clang-format rules on changed files

Closes #46

See merge request isc-projects/bind9!3063

(cherry picked from commit a04cdde45d)

d2b5853b Start enforcing the clang-format rules on changed files
618947c6 Switch AlwaysBreakAfterReturnType from TopLevelDefinitions to All
654927c8 Add separate .clang-format files for headers
5777c44a Reformat using the new rules
60d29f69 Don't enforce copyrights on .clang-format
2020-02-14 08:45:59 +00:00
Ondřej Surý
cdef20bb66 Merge branch 'each-style-tweak' into 'master'
adjust clang-format options to get closer to ISC style

See merge request isc-projects/bind9!3061

(cherry picked from commit d3b49b6675)

0255a974 revise .clang-format and add a C formatting script in util
e851ed0b apply the modified style
2020-02-14 05:35:29 +00:00
Ondřej Surý
2e55baddd8 Merge branch '46-add-curly-braces' into 'master'
Add curly braces using uncrustify and then reformat with clang-format back

Closes #46

See merge request isc-projects/bind9!3057

(cherry picked from commit 67b68e06ad)

36c6105e Use coccinelle to add braces to nested single line statement
d14bb713 Add copy of run-clang-tidy that can fixup the filepaths
056e133c Use clang-tidy to add curly braces around one-line statements
2020-02-13 21:28:35 +00:00
Ondřej Surý
c931d8e417 Merge branch '46-just-use-clang-format-to-reformat-sources' into 'master'
Reformat source code with clang-format

Closes #46

See merge request isc-projects/bind9!2156

(cherry picked from commit 7099e79a9b)

4c3b063e Import Linux kernel .clang-format with small modifications
f50b1e06 Use clang-format to reformat the source files
11341c76 Update the definition files for Windows
df6c1f76 Remove tkey_test (which is no-op anyway)
2020-02-12 14:51:18 +00:00
Witold Kręcicki
42f0e25a4c calling isc__nm_udp_send() on a non-udp socket is not 'unexpected', it's a critical failure 2020-01-20 22:28:36 +01:00
Witold Kręcicki
8d6dc8613a clean up some handle/client reference counting errors in error cases.
We weren't consistent about who should unreference the handle in
case of network error. Make it consistent so that it's always the
client code responsibility to unreference the handle - either
in the callback or right away if send function failed and the callback
will never be called.
2020-01-20 22:28:36 +01:00
Witold Kręcicki
f75a9e32be netmgr: fix a non-thread-safe access to libuv structures
In tcp and udp stoplistening code we accessed libuv structures
from a different thread, which caused a shutdown crash when named
was under load. Also added additional DbC checks making sure we're
in a proper thread when accessing uv_ functions.
2020-01-20 22:28:36 +01:00
Witold Kręcicki
16908ec3d9 netmgr: don't send to an inactive (closing) udp socket
We had a race in which n UDP socket could have been already closing
by libuv but we still sent data to it. Mark socket as not-active
when stopping listening and verify that socket is not active when
trying to send data to it.
2020-01-20 22:28:36 +01:00
Evan Hunt
90a1dabe74 count statistics in netmgr UDP code
- also restored a test in the statistics test which was changed when
  the netmgr was introduced because active sockets were not being
  counted.
2020-01-13 14:09:37 -08:00
Evan Hunt
80a5c9f5c8 associate socket stats counters with netmgr socket objects
- the socket stat counters have been moved from socket.h to stats.h.
- isc_nm_t now attaches to the same stats counter group as
  isc_socketmgr_t, so that both managers can increment the same
  set of statistics
- isc__nmsocket_init() now takes an interface as a paramter so that
  the address family can be determined when initializing the socket.
- based on the address family and socket type, a group of statistics
  counters will be associated with the socket - for example, UDP4Active
  with IPv4 UDP sockets and TCP6Active with IPv6 TCP sockets.  note
  that no counters are currently associated with TCPDNS sockets; those
  stats will be handled by the underlying TCP socket.
- the counters are not actually used by netmgr sockets yet; counter
  increment and decrement calls will be added in a later commit.
2020-01-13 14:05:02 -08:00
Evan Hunt
e38004457c netmgr fixes:
- use UV_{TC,UD}P_IPV6ONLY for IPv6 sockets, keeping the pre-netmgr
   behaviour.
 - add a new listening_error bool flag which is set if the child
   listener fails to start listening. This fixes a bug where named would
   hang if, e.g.,  we failed to bind to a TCP socket.
2020-01-13 10:54:17 -08:00
Evan Hunt
31b3980ef0 shorten some names
reduce line breaks and general unwieldiness by changing some
function, type, and parameter names.
2019-12-09 21:44:04 +01:00
Evan Hunt
b05194160b style, comments 2019-12-09 11:15:27 -08:00
Witold Kręcicki
5a65ec0aff Add uv_handle_{get,set}_data functions that's absent in pre-1.19 libuv to make code clearer.
This might be removed when we stop supporting older libuv versions.
2019-12-09 11:15:27 -08:00
Witold Kręcicki
bc5aae1579 netmgr: make tcp listening multithreaded.
When listening for TCP connections we create a socket, bind it
and then pass it over IPC to all threads - which then listen on
in and accept connections. This sounds broken, but it's the
official way of dealing with multithreaded TCP listeners in libuv,
and works on all platforms supported by libuv.
2019-12-09 11:15:27 -08:00
Evan Hunt
73cafd9d57 clean up comments 2019-11-17 18:59:40 -08:00
Witold Kręcicki
70397f9d92 netmgr: libuv-based network manager
This is a replacement for the existing isc_socket and isc_socketmgr
implementation. It uses libuv for asynchronous network communication;
"networker" objects will be distributed across worker threads reading
incoming packets and sending them for processing.

UDP listener sockets automatically create an array of "child" sockets
so each worker can listen separately.

TCP sockets are shared amongst worker threads.

A TCPDNS socket is a wrapper around a TCP socket, which handles the
the two-byte length field at the beginning of DNS messages over TCP.

(Other wrapper socket types can be implemented in the future to handle
DNS over TLS, DNS over HTTPS, etc.)
2019-11-07 11:55:37 -08:00