Commit Graph

3755 Commits

Author SHA1 Message Date
Witold Kręcicki
eb874608c1 Use the original threadid when sending a UDP packet to decrease probability of context switching 2020-02-28 08:46:16 +01:00
Mark Andrews
8c983a7ebd Simplify hash computation to prevent pointer being classed as tainted.
mem.c:add_trace_entry() -> isc_hash_function() -> isc_siphash24()

129        for (; in != end; in += 8) {

	6. byte_swapping: Performing a byte swapping operation on
	in implies that it came from an external source, and is
	therefore tainted.

130                uint64_t m = U8TO64_LE(in);
2020-02-27 19:41:36 +00:00
Witold Kręcicki
00f2146265 Use isc_rwlock for isc_result tables 2020-02-27 07:58:48 +00:00
Matthijs Mekking
5cc33084af Make clang-format happy 2020-02-25 09:07:45 +01:00
Diego Fronza
9b4e28e155 Added a isc_glob() function that wraps glob() calls for POSIX systems
and implement a custom glob() function on Windows systems.
2020-02-24 13:46:39 -03:00
Michał Kępień
9f34e0d5af Bump library API versions for BIND 9.17 2020-02-24 10:56:47 +01:00
Evan Hunt
ba0313e649 fix spelling errors reported by Fossies. 2020-02-21 15:05:08 +11:00
Witold Krecicki
0fe149b2fa Fix lib/isc/tests/socket_test hangs 2020-02-20 11:39:15 +01:00
Witold Kręcicki
093af1a609 Use libuv-provided uv_{export,import} if available.
We were using our own versions of isc_uv_{export,import} functions
for multithreaded TCP listeners. Upcoming libuv version will
contain proper uv_{export,import} functions - use them if they're
available.
2020-02-18 12:17:55 +01:00
Witold Kręcicki
a0d36d7601 Make nm->recvbuf larger and heap allocated, to allow uv_recvmmsg usage.
Upcoming version of libuv will suport uv_recvmmsg and uv_sendmmsg. To
use uv_recvmmsg we need to provide a larger buffer and be able to
properly free it.
2020-02-18 12:17:55 +01:00
Witold Kręcicki
23bd04d2f1 Make isc_task_pause/isc_task_unpause thread safe.
isc_task_pause/unpause were inherently thread-unsafe - a task
could be paused only once by one thread, if the task was running
while we paused it it led to races. Fix it by making sure that
the task will pause if requested to, and by using a 'pause reference
counter' to count task pause requests - a task will be unpaused
iff all threads unpause it.

Don't remove from queue when pausing task - we lock the queue lock
(expensive), while it's unlikely that the task will be running -
and we'll remove it anyway in dispatcher
2020-02-18 09:22:04 +00:00
Evan Hunt
0002377dca adjust the clang-format penalties to reduce string breaking
this corrects some style glitches such as:
```
        long_function_call(arg, arg2, arg3, arg4, arg5, "str"
                                                        "ing");
```
...by adjusting the penalties for breaking strings and call
parameter lists.
2020-02-17 14:23:58 -08:00
Ondřej Surý
4cf275ba8a Replace non-loop usage of atomic_compare_exchange_weak with strong variant
While testing BIND 9 on arm64 8+ core machine, it was discovered that
the weak variants in fact does spuriously fail, we haven't observed that
on other architectures.

This commit replaces all non-loop usage of atomic_compare_exchange_weak
with atomic_compare_exchange_strong.
2020-02-16 18:09:19 +01:00
Diego Fronza
fa68a0d869 Added atomic_compare_exchange_strong_acq_rel macro
It is much better to read than:
atomic_compare_exchange_strong_explicit() with 5 arguments.
2020-02-16 18:09:19 +01:00
Ondřej Surý
3832e3ecc9 Fixup the missing clang-format bits 2020-02-16 17:34:24 +01:00
Diego Fronza
e7b36924e2 Fixed potential-lock-inversion
This commit simplifies a bit the lock management within dns_resolver_prime()
and prime_done() functions by means of turning resolver's attribute
"priming" into an atomic_bool and by creating only one dependent object on the
lock "primelock", namely the "primefetch" attribute.

By having the attribute "priming" as an atomic type, it save us from having to
use a lock just to test if priming is on or off for the given resolver context
object, within "dns_resolver_prime" function.

The "primelock" lock is still necessary, since dns_resolver_prime() function
internally calls dns_resolver_createfetch(), and whenever this function
succeeds it registers an event in the task manager which could be called by
another thread, namely the "prime_done" function, and this function is
responsible for disposing the "primefetch" attribute in the resolver object,
also for resetting "priming" attribute to false.

It is important that the invariant "priming == false AND primefetch == NULL"
remains constant, so that any thread calling "dns_resolver_prime" knows for sure
that if the "priming" attribute is false, "primefetch" attribute should also be
NULL, so a new fetch context could be created to fulfill this purpose, and
assigned to "primefetch" attribute under the lock protection.

To honor the explanation above, dns_resolver_prime is implemented as follow:
	1. Atomically checks the attribute "priming" for the given resolver context.
	2. If "priming" is false, assumes that "primefetch" is NULL (this is
           ensured by the "prime_done" implementation), acquire "primelock"
	   lock and create a new fetch context, update "primefetch" pointer to
	   point to the newly allocated fetch context.
	3. If "priming" is true, assumes that the job is already in progress,
	   no locks are acquired, nothing else to do.

To keep the previous invariant consistent, "prime_done" is implemented as follow:
	1. Acquire "primefetch" lock.
	2. Keep a reference to the current "primefetch" object;
	3. Reset "primefetch" attribute to NULL.
	4. Release "primefetch" lock.
	5. Atomically update "priming" attribute to false.
	6. Destroy the "primefetch" object by using the temporary reference.

This ensures that if "priming" is false, "primefetch" was already reset to NULL.

It doesn't make any difference in having the "priming" attribute not protected
by a lock, since the visible state of this variable would depend on the calling
order of the functions "dns_resolver_prime" and "prime_done".

As an example, suppose that instead of using an atomic for the "priming" attribute
we employed a lock to protect it.
Now suppose that "prime_done" function is called by Thread A, it is then preempted
before acquiring the lock, thus not reseting "priming" to false.
In parallel to that suppose that a Thread B is scheduled and that it calls
"dns_resolver_prime()", it then acquires the lock and check that "priming" is true,
thus it will consider that this resolver object is already priming and it won't do
any more job.
Conversely if the lock order was acquired in the other direction, Thread B would check
that "priming" is false (since prime_done acquired the lock first and set "priming" to false)
and it would initiate a priming fetch for this resolver.

An atomic variable wouldn't change this behavior, since it would behave exactly the
same, depending on the function call order, with the exception that it would avoid
having to use a lock.

There should be no side effects resulting from this change, since the previous
implementation employed use of the more general resolver's "lock" mutex, which
is used in far more contexts, but in the specifics of the "dns_resolver_prime"
and "prime_done" it was only used to protect "primefetch" and "priming" attributes,
which are not used in any of the other critical sections protected by the same lock,
thus having zero dependency on those variables.
2020-02-14 14:28:31 -03:00
Diego Fronza
c210413a8a Added atomic_compare_exchange_strong_acq_rel macro
It is much better to read than:
atomic_compare_exchange_strong_explicit() with 5 arguments.
2020-02-14 11:41:36 -03:00
Ondřej Surý
5777c44ad0 Reformat using the new rules 2020-02-14 09:31:05 +01:00
Ondřej Surý
654927c871 Add separate .clang-format files for headers 2020-02-14 09:31:05 +01:00
Evan Hunt
e851ed0bb5 apply the modified style 2020-02-13 15:05:06 -08:00
Ondřej Surý
056e133c4c Use clang-tidy to add curly braces around one-line statements
The command used to reformat the files in this commit was:

./util/run-clang-tidy \
	-clang-tidy-binary clang-tidy-11
	-clang-apply-replacements-binary clang-apply-replacements-11 \
	-checks=-*,readability-braces-around-statements \
	-j 9 \
	-fix \
	-format \
	-style=file \
	-quiet
clang-format -i --style=format $(git ls-files '*.c' '*.h')
uncrustify -c .uncrustify.cfg --replace --no-backup $(git ls-files '*.c' '*.h')
clang-format -i --style=format $(git ls-files '*.c' '*.h')
2020-02-13 22:07:21 +01:00
Ondřej Surý
36c6105e4f Use coccinelle to add braces to nested single line statement
Both clang-tidy and uncrustify chokes on statement like this:

for (...)
	if (...)
		break;

This commit uses a very simple semantic patch (below) to add braces around such
statements.

Semantic patch used:

@@
statement S;
expression E;
@@

while (...)
- if (E) S
+ { if (E) { S } }

@@
statement S;
expression E;
@@

for (...;...;...)
- if (E) S
+ { if (E) { S } }

@@
statement S;
expression E;
@@

if (...)
- if (E) S
+ { if (E) { S } }
2020-02-13 21:58:55 +01:00
Ondřej Surý
11341c7688 Update the definition files for Windows 2020-02-12 15:04:17 +01:00
Ondřej Surý
f50b1e0685 Use clang-format to reformat the source files 2020-02-12 15:04:17 +01:00
Witold Kręcicki
a133239698 Don't limit the size of uvreq/nmhandle pool artificially.
There was a hard limit set on number of uvreq and nmhandles
that can be allocated by a pool, but we don't handle a situation
where we can't get an uvreq. Don't limit the number at all,
let the OS deal with it.
2020-02-11 12:10:57 +00:00
Ondřej Surý
b43f5e0238 Convert all atomic operations in isc_rwlock to release-acquire memory ordering
The memory ordering in the rwlock was all wrong, I am copying excerpts
from the https://en.cppreference.com/w/c/atomic/memory_order#Relaxed_ordering
for the convenience of the reader:

  Relaxed ordering

  Atomic operations tagged memory_order_relaxed are not synchronization
  operations; they do not impose an order among concurrent memory
  accesses. They only guarantee atomicity and modification order
  consistency.

  Release-Acquire ordering

  If an atomic store in thread A is tagged memory_order_release and an
  atomic load in thread B from the same variable is tagged
  memory_order_acquire, all memory writes (non-atomic and relaxed atomic)
  that happened-before the atomic store from the point of view of thread
  A, become visible side-effects in thread B. That is, once the atomic
  load is completed, thread B is guaranteed to see everything thread A
  wrote to memory.

  The synchronization is established only between the threads releasing
  and acquiring the same atomic variable. Other threads can see different
  order of memory accesses than either or both of the synchronized
  threads.

Which basically means that we had no or weak synchronization between
threads using the same variables in the rwlock structure.  There should
not be a significant performance drop because the critical sections were
already protected by:

  while(1) {
    if (relaxed_atomic_operation) {
      break;
    }
    LOCK(lock);
    if (!relaxed_atomic_operation) {
      WAIT(sem, lock);
    }
    UNLOCK(lock)l
  }

I would add one more thing to "Don't do your own crypto, folks.":

  - Also don't do your own locking, folks.
2020-02-11 11:10:55 +01:00
Ondřej Surý
bc1d4c9cb4 Clear the pointer to destroyed object early using the semantic patch
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
2020-02-09 18:00:17 -08:00
Witold Kręcicki
d708370db4 Fix atomics usage for mutexatomics 2020-02-08 12:34:19 -08:00
Ondřej Surý
41fe9b7a14 Formatting issues found by local coccinelle run 2020-02-08 03:12:09 -08:00
Ondřej Surý
0dfec4eef7 Remove #include <config.h> from netmgr.h 2020-02-08 03:12:09 -08:00
Witold Kręcicki
9371bad268 Disable OpenSSL siphash.
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, until
we fix the code to reuse the context and key we'll use our own
implementation of siphash.
2020-02-07 11:55:17 +00:00
Michal Nowak
7f0fcb8a3e Windows: Prevent tools from clashing with named in system tests
In system tests on Windows tool's local port can sometimes clash with
'named'. On Unix the system is poked for the minimal local port,
otherwise is set to 32768 as a sane minimum. For Windows we don't
poke but set a hardcoded limit; this change aligns the limit with
Unix and changes it to 32768.
2020-02-05 10:03:09 +00:00
Mark Andrews
7ba1af0280 'lcfg' must be non NULL, remove test.
389        else

	CID 1452695 (#1 of 1): Dereference before null check (REVERSE_INULL)
	check_after_deref: Null-checking lcfg suggests that it may
	be null, but it has already been dereferenced on all paths
	leading to the check.

390                if (lcfg != NULL)
391                        isc_logconfig_destroy(&lcfg);
2020-02-05 18:37:17 +11:00
Mark Andrews
0be2dc9f22 break was on wrong line.
959                break;

	CID 1457872 (#1 of 1): Structurally dead code (UNREACHABLE)
	unreachable: This code cannot be reached:
	isc__nm_incstats(sock->mgr,....

 960                isc__nm_incstats(sock->mgr, sock->statsindex[STATID_ACTIVE]);
 961        default:
2020-02-05 18:37:17 +11:00
Matthijs Mekking
b8be29fee6 Add a note on memory allocation
isc__memalloc_t must deal with memory allocation failure
and must never return NULL.
2020-02-04 11:09:22 +01:00
Ondřej Surý
05ae2e48ab Change pk11_mem_get() so it cannot soft-fail 2020-02-04 11:09:22 +01:00
Ondřej Surý
478e4ac201 Make the DbC checks to be consistent and cppcheck clean 2020-02-04 11:09:22 +01:00
Mark Andrews
c65c06301c delay assignment until after REQUIRE 2020-02-04 11:09:22 +01:00
Mark Andrews
7b948c7335 remove brackets 2020-02-04 11:09:22 +01:00
Mark Andrews
6c2e138d7a simplify ISC_LIKELY/ISC_UNLIKELY for CPPCHECK 2020-02-04 11:09:22 +01:00
Mark Andrews
668a972d1e simplify RUNTIME_CHECK for cppcheck 2020-02-04 11:09:22 +01:00
Ondřej Surý
c73e5866c4 Refactor the isc_buffer_allocate() usage using the semantic patch
The isc_buffer_allocate() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_buffer_allocate() now returns void.
2020-02-03 08:29:00 +01:00
Ondřej Surý
4459745ff2 isc_buffer_allocate() can't fail now, change the return type to void 2020-02-03 08:29:00 +01:00
Ondřej Surý
5eb3f71a3e Refactor the isc_mempool_create() usage using the semantic patch
The isc_mempool_create() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_mempool_create() now returns void.
2020-02-03 08:27:16 +01:00
Ondřej Surý
de123a67d6 isc_mempool_create cannot fail, change the return type to void 2020-02-02 08:39:45 +01:00
Ondřej Surý
5b448996e5 Clean the ENTER/EXIT/NOTICE debugging from production code 2020-01-22 11:13:53 +11:00
Ondřej Surý
9643a62dd5 Refactor parts of isc_httpd and isc_httpd for better readability and safety 2020-01-22 11:13:53 +11:00
Mark Andrews
7c3f419d66 add ISC_MAGIC and reference counting to httpd and httpdmgr 2020-01-22 11:13:53 +11:00
Witold Kręcicki
1beba0fa59 Unit test for the taskmgr pause/unpause race 2020-01-21 10:06:19 +01:00
Witold Kręcicki
e1c4a69197 Fix a race in taskmgr between worker and task pausing/unpausing.
To reproduce the race - create a task, send two events to it, first one
must take some time. Then, from the outside, pause(), unpause() and detach()
the task.
When the long-running event is processed by the task it is in
task_state_running state. When we called pause() the state changed to
task_state_paused, on unpause we checked that there are events in the task
queue, changed the state to task_state_ready and enqueued the task on the
workers readyq. We then detach the task.
The dispatch() is done with processing the event, it processes the second
event in the queue, and then shuts down the task and frees it (as it's not
referenced anymore). Dispatcher then takes the, already freed, task from
the queue where it was wrongly put, causing an use-after free and,
subsequently, either an assertion failure or a segmentation fault.
The probability of this happening is very slim, yet it might happen under a
very high load, more probably on a recursive resolver than on an
authoritative.
The fix introduces a new 'task_state_pausing' state - to which tasks
are moved if they're being paused while still running. They are moved
to task_state_paused state when dispatcher is done with them, and
if we unpause a task in paused state it's moved back to task_state_running
and not requeued.
2020-01-21 10:06:19 +01:00