Commit Graph

3713 Commits

Author SHA1 Message Date
Ondřej Surý
4459745ff2 isc_buffer_allocate() can't fail now, change the return type to void 2020-02-03 08:29:00 +01:00
Ondřej Surý
5eb3f71a3e Refactor the isc_mempool_create() usage using the semantic patch
The isc_mempool_create() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_mempool_create() now returns void.
2020-02-03 08:27:16 +01:00
Ondřej Surý
de123a67d6 isc_mempool_create cannot fail, change the return type to void 2020-02-02 08:39:45 +01:00
Ondřej Surý
5b448996e5 Clean the ENTER/EXIT/NOTICE debugging from production code 2020-01-22 11:13:53 +11:00
Ondřej Surý
9643a62dd5 Refactor parts of isc_httpd and isc_httpd for better readability and safety 2020-01-22 11:13:53 +11:00
Mark Andrews
7c3f419d66 add ISC_MAGIC and reference counting to httpd and httpdmgr 2020-01-22 11:13:53 +11:00
Witold Kręcicki
1beba0fa59 Unit test for the taskmgr pause/unpause race 2020-01-21 10:06:19 +01:00
Witold Kręcicki
e1c4a69197 Fix a race in taskmgr between worker and task pausing/unpausing.
To reproduce the race - create a task, send two events to it, first one
must take some time. Then, from the outside, pause(), unpause() and detach()
the task.
When the long-running event is processed by the task it is in
task_state_running state. When we called pause() the state changed to
task_state_paused, on unpause we checked that there are events in the task
queue, changed the state to task_state_ready and enqueued the task on the
workers readyq. We then detach the task.
The dispatch() is done with processing the event, it processes the second
event in the queue, and then shuts down the task and frees it (as it's not
referenced anymore). Dispatcher then takes the, already freed, task from
the queue where it was wrongly put, causing an use-after free and,
subsequently, either an assertion failure or a segmentation fault.
The probability of this happening is very slim, yet it might happen under a
very high load, more probably on a recursive resolver than on an
authoritative.
The fix introduces a new 'task_state_pausing' state - to which tasks
are moved if they're being paused while still running. They are moved
to task_state_paused state when dispatcher is done with them, and
if we unpause a task in paused state it's moved back to task_state_running
and not requeued.
2020-01-21 10:06:19 +01:00
Witold Kręcicki
fd8788eb94 Fix possible race in socket destruction.
When two threads unreferenced handles coming from one socket while
the socket was being destructed we could get a use-after-free:
Having handle H1 coming from socket S1, H2 coming from socket S2,
S0 being a parent socket to S1 and S2:

Thread A                             Thread B
Unref handle H1                      Unref handle H2
Remove H1 from S1 active handles     Remove H2 from S2 active handles
nmsocket_maybe_destroy(S1)           nmsocket_maybe_destroy(S2)
nmsocket_maybe_destroy(S0)           nmsocket_maybe_destroy(S0)
LOCK(S0->lock)
Go through all children, figure
out that we have no more active
handles:
sum of S0->children[i]->ah == 0
UNLOCK(S0->lock)
destroy(S0)
                                     LOCK(S0->lock)
                                      - but S0 is already gone
2020-01-20 22:28:36 +01:00
Witold Kręcicki
42f0e25a4c calling isc__nm_udp_send() on a non-udp socket is not 'unexpected', it's a critical failure 2020-01-20 22:28:36 +01:00
Witold Kręcicki
8d6dc8613a clean up some handle/client reference counting errors in error cases.
We weren't consistent about who should unreference the handle in
case of network error. Make it consistent so that it's always the
client code responsibility to unreference the handle - either
in the callback or right away if send function failed and the callback
will never be called.
2020-01-20 22:28:36 +01:00
Witold Kręcicki
f75a9e32be netmgr: fix a non-thread-safe access to libuv structures
In tcp and udp stoplistening code we accessed libuv structures
from a different thread, which caused a shutdown crash when named
was under load. Also added additional DbC checks making sure we're
in a proper thread when accessing uv_ functions.
2020-01-20 22:28:36 +01:00
Witold Kręcicki
16908ec3d9 netmgr: don't send to an inactive (closing) udp socket
We had a race in which n UDP socket could have been already closing
by libuv but we still sent data to it. Mark socket as not-active
when stopping listening and verify that socket is not active when
trying to send data to it.
2020-01-20 22:28:36 +01:00
Tinderbox User
05f2241fcb prep 9.15.8 2020-01-16 08:01:20 +00:00
Witold Kręcicki
eda4300bbb netmgr: have a single source of truth for tcpdns callback
We pass interface as an opaque argument to tcpdns listening socket.
If we stop listening on an interface but still have in-flight connections
the opaque 'interface' is not properly reference counted, and we might
hit a dead memory. We put just a single source of truth in a listening
socket and make the child sockets use that instead of copying the
value from listening socket. We clean the callback when we stop listening.
2020-01-15 17:22:13 +01:00
Witold Kręcicki
0d637b5985 netmgr: we can't uv_close(sock->timer) when in sock->timer close callback 2020-01-15 14:56:40 +01:00
Witold Kręcicki
525c583145 netmgr:
- isc__netievent_storage_t was to small to contain
   isc__netievent__socket_streaminfo_t on Windows
 - handle isc_uv_export and isc_uv_import errors properly
 - rewrite isc_uv_export and isc_uv_import on Windows
2020-01-15 14:08:44 +01:00
Witold Kręcicki
493b6a9f33 Make hazard pointers max_threads configurable at runtime.
hp implementation requires an object for each thread accessing
a hazard pointer. previous implementation had a hardcoded
HP_MAX_THREAD value of 128, which failed on machines with lots of
CPU cores (named uses 3n threads). We make isc__hp_max_threads
configurable at startup, with the value set to 4*named_g_cpus.
It's also important for this value not to be too big as we do
linear searches on a list.
2020-01-14 21:26:57 +01:00
Ondřej Surý
3000f14eba Use isc_refcount_increment0() when reusing handle or socket; remove extra DbC checks 2020-01-14 13:12:13 +01:00
Ondřej Surý
4d1e3b1e10 Move the NO_SANITIZE attribute to a correct place (gcc is picky) 2020-01-14 13:12:13 +01:00
Ondřej Surý
c4aec79079 When compiling with MSVC, use inline functions for isc_refcount_increment/decrement 2020-01-14 13:12:13 +01:00
Ondřej Surý
49976947ab Restore DbC checks in isc_refcount API
The isc_refcount API that provides reference counting lost DbC checks for
overflows and underflows in the isc_refcount_{increment,decrement} functions.

The commit restores the overflow check in the isc_refcount_increment and
underflows check in the isc_refcount_decrement by checking for the previous
value to not be on the boundary.
2020-01-14 13:12:13 +01:00
Ondřej Surý
6afa99362a Remove duplicate INSIST checks for isc_refcount API
This commits removes superfluous checks when using the isc_refcount API.

Examples of superfluous checks:

1. The isc_refcount_decrement function ensures there was not underflow,
   so this check is superfluous:

    INSIST(isc_refcount_decrement(&r) > 0);

2 .The isc_refcount_destroy() includes check whether the counter
   is zero, therefore this is superfluous:

    INSIST(isc_refcount_decrement(&r) == 1 && isc_refcount_destroy(&r));
2020-01-14 13:12:13 +01:00
Ondřej Surý
e711b0304f Convert more reference counting to isc_refcount API 2020-01-14 13:12:13 +01:00
Ondřej Surý
7c3e342935 Use isc_refcount_increment0() where appropriate 2020-01-14 13:12:13 +01:00
Ondřej Surý
fbf9856f43 Add isc_refcount_destroy() as appropriate 2020-01-14 13:12:13 +01:00
Witold Krecicki
6ee1461cc3 netmgr: handle errors properly in accept_connection.
If a connection was closed early (right after accept()) an assertion
that assumed that the connection was still alive could be triggered
in accept_connection. Handle those errors properly and not with
assertions, free all the resources afterwards.
2020-01-14 11:03:06 +01:00
Evan Hunt
5234a8e00a count statistics in netmgr TCP code 2020-01-13 14:09:42 -08:00
Evan Hunt
90a1dabe74 count statistics in netmgr UDP code
- also restored a test in the statistics test which was changed when
  the netmgr was introduced because active sockets were not being
  counted.
2020-01-13 14:09:37 -08:00
Evan Hunt
80a5c9f5c8 associate socket stats counters with netmgr socket objects
- the socket stat counters have been moved from socket.h to stats.h.
- isc_nm_t now attaches to the same stats counter group as
  isc_socketmgr_t, so that both managers can increment the same
  set of statistics
- isc__nmsocket_init() now takes an interface as a paramter so that
  the address family can be determined when initializing the socket.
- based on the address family and socket type, a group of statistics
  counters will be associated with the socket - for example, UDP4Active
  with IPv4 UDP sockets and TCP6Active with IPv6 TCP sockets.  note
  that no counters are currently associated with TCPDNS sockets; those
  stats will be handled by the underlying TCP socket.
- the counters are not actually used by netmgr sockets yet; counter
  increment and decrement calls will be added in a later commit.
2020-01-13 14:05:02 -08:00
Witold Kręcicki
20c077afc5 Disable pktinfo for ipv6 on all unices
If pktinfo were supported then we could listen on :: for ipv6 and get
the information about the destination address from pktinfo structure passed
in recvmsg but this method is not portable and libuv doesn't support it - so
we need to listen on all interfaces.
We should verify that this doesn't impact performance (we already do it for
ipv4) and either remove all the ipv6pktinfo detection code or think of fixing
libuv.
2020-01-13 22:00:20 +01:00
Evan Hunt
e38004457c netmgr fixes:
- use UV_{TC,UD}P_IPV6ONLY for IPv6 sockets, keeping the pre-netmgr
   behaviour.
 - add a new listening_error bool flag which is set if the child
   listener fails to start listening. This fixes a bug where named would
   hang if, e.g.,  we failed to bind to a TCP socket.
2020-01-13 10:54:17 -08:00
Witold Kręcicki
67c1ca9a79 Use isc_uv_export() to pass bound TCP listening socket to child listeners.
For multithreaded TCP listening we need to pass a bound socket to all
listening threads. Instead of using uv_pipe handle passing method which
is quite complex (lots of callbacks, each of them with its own error
handling) we now use isc_uv_export() to export the socket, pass it as a
member of the isc__netievent_tcpchildlisten_t structure, and then
isc_uv_import() it in the child thread, simplifying the process
significantly.
2020-01-13 10:53:44 -08:00
Witold Kręcicki
c6c0a9fdba Add isc_uv_export()/isc_uv_import() functions to libuv compatibility layer.
These functions can be used to pass a uv handle between threads in a
safe manner. The other option is to use uv_pipe and pass the uv_handle
via IPC, which is way more complex.  uv_export() and uv_import() functions
existed in libuv at some point but were removed later. This code is
based on the original removed code.

The Windows version of the code uses two functions internal to libuv;
a patch for libuv is attached for exporting these functions.
2020-01-13 10:52:07 -08:00
Ondřej Surý
afc4867e99 Remove use of PTHREAD_MUTEX_INITIALIZER in tests
Remove the pthread specific static initializer in favor of dynamic
initialization.
2020-01-13 09:09:03 +01:00
Ondřej Surý
4f7d1298a8 Use isc_threadresult_t instead of pthread specific void * return type
The ISC thread API already defines isc_threadresult_t type,
but we are using a pthread specific return type (void *).
2020-01-13 09:08:48 +01:00
Michal Nowak
640dd566e9 Add out-of-tree build to the CI
Fixes #1546.
2020-01-09 10:16:06 +01:00
Ondřej Surý
17deac8b8e Remove unused isc_log_get() function 2020-01-08 11:53:04 +01:00
Ondřej Surý
91e1981988 Add missing locks to isc_logconfig_get and disable thread sanitizer for isc_log_wouldlog 2020-01-08 11:53:04 +01:00
Ondřej Surý
255134166c Add conditional ISC_NO_SANITIZE macro to disable TSAN for function 2020-01-08 11:53:04 +01:00
Ondřej Surý
5746172da3 Convert task flags to C11 atomics 2019-12-13 07:10:25 +01:00
Tinderbox User
e088272172 prep 9.15.7 2019-12-12 23:59:39 +00:00
Diego Fronza
ed9853e739 Fix tcp-highwater stats updating
After the network manager rewrite, tcp-higwater stats was only being
updated when a valid DNS query was received over tcp.

It turns out tcp-quota is updated right after a tcp connection is
accepted, before any data is read, so in the event that some client
connect but don't send a valid query, it wouldn't be taken into
account to update tcp-highwater stats, that is wrong.

This commit fix tcp-highwater to update its stats whenever a tcp connection
is established, independent of what happens after (timeout/invalid
request, etc).
2019-12-12 11:23:10 -08:00
Ondřej Surý
d5b6db3b09 Additionally lock accessing the ISC_LISTs in free_socket() 2019-12-12 13:08:34 +01:00
Ondřej Surý
d35739d516 Add missing isc_refcount_destroy and lock the socket ISC_LISTS in destroy() 2019-12-12 12:59:39 +01:00
Mark Andrews
ad12c2f3b0 address lock order inversion 2019-12-12 17:43:03 +11:00
Ondřej Surý
1fa0deb4ea Add isc_refcount_destroy() call to nm_handle_free() 2019-12-10 13:43:18 +01:00
Ondřej Surý
71fe7d3c25 Add isc_refcount_destroy() call to nm_destroy() 2019-12-10 13:43:18 +01:00
Ondřej Surý
3248de7785 Correct the DbC check order in isc__nm_async_tcpchildstop() 2019-12-10 13:43:18 +01:00
Witold Kręcicki
ccd44b69e5 Fix a potential lock-order-inversion in tcp listening code 2019-12-10 10:05:15 +01:00