this corrects some style glitches such as:
```
long_function_call(arg, arg2, arg3, arg4, arg5, "str"
"ing");
```
...by adjusting the penalties for breaking strings and call
parameter lists.
While testing BIND 9 on arm64 8+ core machine, it was discovered that
the weak variants in fact does spuriously fail, we haven't observed that
on other architectures.
This commit replaces all non-loop usage of atomic_compare_exchange_weak
with atomic_compare_exchange_strong.
This commit simplifies a bit the lock management within dns_resolver_prime()
and prime_done() functions by means of turning resolver's attribute
"priming" into an atomic_bool and by creating only one dependent object on the
lock "primelock", namely the "primefetch" attribute.
By having the attribute "priming" as an atomic type, it save us from having to
use a lock just to test if priming is on or off for the given resolver context
object, within "dns_resolver_prime" function.
The "primelock" lock is still necessary, since dns_resolver_prime() function
internally calls dns_resolver_createfetch(), and whenever this function
succeeds it registers an event in the task manager which could be called by
another thread, namely the "prime_done" function, and this function is
responsible for disposing the "primefetch" attribute in the resolver object,
also for resetting "priming" attribute to false.
It is important that the invariant "priming == false AND primefetch == NULL"
remains constant, so that any thread calling "dns_resolver_prime" knows for sure
that if the "priming" attribute is false, "primefetch" attribute should also be
NULL, so a new fetch context could be created to fulfill this purpose, and
assigned to "primefetch" attribute under the lock protection.
To honor the explanation above, dns_resolver_prime is implemented as follow:
1. Atomically checks the attribute "priming" for the given resolver context.
2. If "priming" is false, assumes that "primefetch" is NULL (this is
ensured by the "prime_done" implementation), acquire "primelock"
lock and create a new fetch context, update "primefetch" pointer to
point to the newly allocated fetch context.
3. If "priming" is true, assumes that the job is already in progress,
no locks are acquired, nothing else to do.
To keep the previous invariant consistent, "prime_done" is implemented as follow:
1. Acquire "primefetch" lock.
2. Keep a reference to the current "primefetch" object;
3. Reset "primefetch" attribute to NULL.
4. Release "primefetch" lock.
5. Atomically update "priming" attribute to false.
6. Destroy the "primefetch" object by using the temporary reference.
This ensures that if "priming" is false, "primefetch" was already reset to NULL.
It doesn't make any difference in having the "priming" attribute not protected
by a lock, since the visible state of this variable would depend on the calling
order of the functions "dns_resolver_prime" and "prime_done".
As an example, suppose that instead of using an atomic for the "priming" attribute
we employed a lock to protect it.
Now suppose that "prime_done" function is called by Thread A, it is then preempted
before acquiring the lock, thus not reseting "priming" to false.
In parallel to that suppose that a Thread B is scheduled and that it calls
"dns_resolver_prime()", it then acquires the lock and check that "priming" is true,
thus it will consider that this resolver object is already priming and it won't do
any more job.
Conversely if the lock order was acquired in the other direction, Thread B would check
that "priming" is false (since prime_done acquired the lock first and set "priming" to false)
and it would initiate a priming fetch for this resolver.
An atomic variable wouldn't change this behavior, since it would behave exactly the
same, depending on the function call order, with the exception that it would avoid
having to use a lock.
There should be no side effects resulting from this change, since the previous
implementation employed use of the more general resolver's "lock" mutex, which
is used in far more contexts, but in the specifics of the "dns_resolver_prime"
and "prime_done" it was only used to protect "primefetch" and "priming" attributes,
which are not used in any of the other critical sections protected by the same lock,
thus having zero dependency on those variables.
Both clang-tidy and uncrustify chokes on statement like this:
for (...)
if (...)
break;
This commit uses a very simple semantic patch (below) to add braces around such
statements.
Semantic patch used:
@@
statement S;
expression E;
@@
while (...)
- if (E) S
+ { if (E) { S } }
@@
statement S;
expression E;
@@
for (...;...;...)
- if (E) S
+ { if (E) { S } }
@@
statement S;
expression E;
@@
if (...)
- if (E) S
+ { if (E) { S } }
There was a hard limit set on number of uvreq and nmhandles
that can be allocated by a pool, but we don't handle a situation
where we can't get an uvreq. Don't limit the number at all,
let the OS deal with it.
The memory ordering in the rwlock was all wrong, I am copying excerpts
from the https://en.cppreference.com/w/c/atomic/memory_order#Relaxed_ordering
for the convenience of the reader:
Relaxed ordering
Atomic operations tagged memory_order_relaxed are not synchronization
operations; they do not impose an order among concurrent memory
accesses. They only guarantee atomicity and modification order
consistency.
Release-Acquire ordering
If an atomic store in thread A is tagged memory_order_release and an
atomic load in thread B from the same variable is tagged
memory_order_acquire, all memory writes (non-atomic and relaxed atomic)
that happened-before the atomic store from the point of view of thread
A, become visible side-effects in thread B. That is, once the atomic
load is completed, thread B is guaranteed to see everything thread A
wrote to memory.
The synchronization is established only between the threads releasing
and acquiring the same atomic variable. Other threads can see different
order of memory accesses than either or both of the synchronized
threads.
Which basically means that we had no or weak synchronization between
threads using the same variables in the rwlock structure. There should
not be a significant performance drop because the critical sections were
already protected by:
while(1) {
if (relaxed_atomic_operation) {
break;
}
LOCK(lock);
if (!relaxed_atomic_operation) {
WAIT(sem, lock);
}
UNLOCK(lock)l
}
I would add one more thing to "Don't do your own crypto, folks.":
- Also don't do your own locking, folks.
The code for specifying OpenSSL PKCS#11 engine as part of the label
(e.g. -l "pkcs11:token=..." instead of -E pkcs11 -l "token=...")
was non-functional. This commit just cleans the related code.
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
The key-directory keyword actually does nothing right now but may
be useful in the future if we want to differentiate between key
directories or HSM keys, or if we want to speficy different
directories for different keys or policies. Make it optional for
the time being.
The keyword 'unlimited' can be used instead of PT0S which means the
same but is more comprehensible for users.
Also fix some redundant "none" parameters in the kasp test.
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, until
we fix the code to reuse the context and key we'll use our own
implementation of siphash.
Add checks to the kasp system test to verify CDNSKEY publication.
This test is not entirely complete, because when there is a CDNSKEY
available but there should not be one for KEY N, it is hard to tell
whether the existing CDNSKEY actually belongs to KEY N or another
key.
The check works if we expect a CDNSKEY although we cannot guarantee
that the CDNSKEY is correct: The test verifies existence, not
correctness of the record.
When you do a restart or reconfig of named, or rndc loadkeys, this
triggers the key manager to run. The key manager will check if new
keys need to be created. If there is an active key, and key rollover
is scheduled far enough away, no new key needs to be created.
However, there was a bug that when you just start to sign your zone,
it takes a while before the KSK becomes an active key. An active KSK
has its DS submitted or published, but before the key manager allows
that, the DNSKEY needs to be omnipresent. If you restart named
or rndc loadkeys in quick succession when you just started to sign
your zone, new keys will be created because the KSK is not yet
considered active.
Fix is to check for introducing as well as active keys. These keys
all have in common that their goal is to become omnipresent.
In system tests on Windows tool's local port can sometimes clash with
'named'. On Unix the system is poked for the minimal local port,
otherwise is set to 32768 as a sane minimum. For Windows we don't
poke but set a hardcoded limit; this change aligns the limit with
Unix and changes it to 32768.
1549 cleanup:
1550 if (dctx->dbiter != NULL)
1551 dns_dbiterator_destroy(&dctx->dbiter);
1552 if (dctx->db != NULL)
1553 dns_db_detach(&dctx->db);
CID 1452686 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking dctx suggests that it may
be null, but it has already been dereferenced on all paths
leading to the check.
1554 if (dctx != NULL)
1555 isc_mem_put(mctx, dctx, sizeof(*dctx));
389 else
CID 1452695 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking lcfg suggests that it may
be null, but it has already been dereferenced on all paths
leading to the check.
390 if (lcfg != NULL)
391 isc_logconfig_destroy(&lcfg);
6412 cleanup:
6413 dns_rdataset_disassociate(&neg);
6414 dns_rdataset_disassociate(&negsig);
CID 1452700 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking closest suggests that it
may be null, but it has already been dereferenced on all
paths leading to the check.
6415 if (closest != NULL)
6416 free_noqname(mctx, &closest);
13429 cleanup:
13430 cancel_refresh(zone);
CID 1452702 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking stub suggests that it may
be null, but it has already been dereferenced on all paths
leading to the check.
13431 if (stub != NULL) {
13432 stub->magic = 0;
6367cleanup:
6368 dns_rdataset_disassociate(&neg);
6369 dns_rdataset_disassociate(&negsig);
CID 1452704 (#1 of 1): Dereference before null check
(REVERSE_INULL) check_after_deref: Null-checking noqname
suggests that it may be null, but it has already been
dereferenced on all paths leading to the check.
6370 if (noqname != NULL)
6371 free_noqname(mctx, &noqname);
1401 }
CID 1453455 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking event suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.
1402 if (event != NULL)
1403 isc_event_free(ISC_EVENT_PTR(&event));
128 return (ISC_R_SUCCESS);
129
CID 1456146 (#1 of 1): Structurally dead code (UNREACHABLE)
unreachable: This code cannot be reached: {
if (dst->labels[i] != N....
130 do {
402 ctx->serve_stale_ttl = 0;
notnull: At condition indentctx, the value of indentctx
cannot be NULL. dead_error_condition: The condition indentctx
must be true.
CID 1456147 (#1 of 1): Logically dead code (DEADCODE)
dead_error_line: Execution cannot reach the expression
default_indent inside this statement: ctx->indent = (indentctx
? ....
403 ctx->indent = indentctx ? *indentctx : default_indent;
1636 cleanup:
CID 1458130 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking buffer suggests that it may be
null, but it has already been dereferenced on all paths leading to
the check.
1637 if (buffer != NULL)
1638 isc_buffer_free(&buffer);
Found by LGTM.com (see below for description), and while it should not
happen as EDNS OPT RDLEN is uint16_t, the fix is easy. A little bit
of cleanup is included too.
> In a loop condition, comparison of a value of a narrow type with a value
> of a wide type may result in unexpected behavior if the wider value is
> sufficiently large (or small). This is because the narrower value may
> overflow. This can lead to an infinite loop.
This commit simplifies the cachedb rrset statistics in two ways:
- Introduce new rdtypecounter arithmetics, allowing bitwise
operations.
- Remove the special DLV statistic counter.
New rdtypecounter arithmetics
-----------------------------
"The rdtypecounter arithmetics is a brain twister". Replace the
enum counters with some defines. A rdtypecounter is now 8 bits for
RRtypes and 3 bits for flags:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | | S |NX| RRType |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
If the 8 bits for RRtype are all zero, this is an Other RRtype.
Bit 7 is the NXRRSET (NX) flag and indicates whether this is a
positive (0) or a negative (1) RRset.
Then bit 5 and 6 mostly tell you if this counter is for an active,
stale, or ancient RRtype:
S = 0x00 means Active
S = 0x01 means Stale
S = 0x10 means Ancient
Since a counter cannot be stale and ancient at the same time, we
treat S = 0x11 as a special case to deal with NXDOMAIN counters.
S = 0x11 indicates an NXDOMAIN counter and in this case the RRtype
field signals the expiry of this cached item:
RRType = 0 means Active
RRType = 1 means Stale
RRType = 2 means Ancient