Replace the ISC_LIST based deadnodes implementation with isc_queue which
is wait-free and we don't have to acquire neither the tree nor node lock
to append nodes to the queue and the cleaning process can also
copy (splice) the list into a local copy without acquiring the list.
Currently, there's little benefit to this as we need to hold those
locks anyway, but in the future as we move to RCU based implementation,
this will be ready.
To align the cleaning with our event loop based model, remove the
hardcoded count for the node locks and use the number of the event loops
instead. This way, each event loop can have its own cleaning as part of
the process. Use uniform random numbers to spread the nodes evenly
between the buckets (instead of hashing the domain name).
Add an isc_queue implementation that hides the gory details of cds_wfcq
into more neat API. The same caveats as with cds_wfcq.
TODO: Add documentation to the API.
When the cache's memory context was in over memory state when the
cache was flushed it resulted in LRU cleaning removing newly entered
data in the new cache straight away until the old cache had been
destroyed enough to take it out of over memory state. When flushing
the cache create a new memory context for the new db to prevent this.
The `axfr_makedb()` didn't set the loop on the newly created database,
effectively killing delayed cleaning on such database. Move the
database creation into dns_zone API that knows all the gory details of
creating new database suitable for the zone.
The rdataslab.c was full of code like this:
length = raw[0] * 256 + raw[1];
and
count2 = *current2++ * 256;
count2 += *current2++;
Refactor code like this into peek_uint16() and get_uint16 macros
to prevent code repetition and possible mistakes when copy and
pasting the same code over and over.
As a side note for an entertainment of a careful reader of the commit
messages: The byte manipulation was changed from multiplication and
addition to shift with or.
The difference in the assembly looks like this:
MUL and ADD:
movzx eax, BYTE PTR [rdi]
movzx edi, BYTE PTR [rdi+1]
sal eax, 8
or edi, eax
SHIFT and OR:
movzx edi, WORD PTR [rdi]
rol di, 8
movzx edi, di
If the result and/or buffer is then being used after the macro call,
there's more differences in favor of the SHIFT+OR solution.
The detach function declaration in `ISC__REFCOUNT_TRACE_DECL` had an
returned an accidental implicit int. While not allowed since C99, it
became an error by default in GCC 14.
`ISC_REFCOUNT_TRACE_IMPL` and `ISC_REFCOUNT_STATIC_TRACE_IMPL` expanded
into the wrong macros, trying to declare it again with the wrong number
of parameters.
- duplicated question
- duplicated answer
- qtype as an answer
- two question types
- question names
- nsec3 bad owner name
- short record
- short question
- mismatching question class
- bad record owner name
- mismatched class in record
- mismatched KEY class
- OPT wrong owner name
- invalid RRSIG "covers" type
- UPDATE malformed delete type
- TSIG wrong class
- TSIG not the last record
there were TSAN error reports because of conflicting uses of
node->dirty and node->nsec, which were in the same qword.
this could be resolved by separating them, but we could also
make them into atomic values and remove some node locking.
The fix_iterator() function had a lot of bugs in it and while fixing
them, the number of corner cases and the complexity of the function
got out of hand. Rewrite the function with the following modifications:
The function now requires that the iterator is pointing to a leaf node.
This removes the cases we have to deal when the iterator was left on a
dead branch.
From the leaf node, pop up the iterator stack until we encounter the
branch where the offset point is before the point where the search key
differs. This will bring us to the right branch, or at the first
unmatched node, in which case we pop up to the parent branch. From
there it is easier to retrieve the predecessor.
Once we are at the right branch, all we have to do is find the right
twig (which is either the twig for the character at the position where
the search key differs, or the previous twig) and walk down from there
to the greatest leaf or, in case there is no good twig, get the
previous twig from the successor and get the greatest leaf from there.
If there is no previous twig to select in this branch, because every
leaf from this branch node is greater than the one we wanted, we need
to pop up the stack again and resume at the parent branch. This is
achieved by calling prevleaf().
Move the fix_iterator out of the loop and only call it when we found
a leaf node. This leaf node may be the wrong leaf node, but fix_iterator
should correct that.
Also, when we don't need to set the iterator, just get any leaf. We
only need to have a leaf for the qpkey_compare and the end result does
not matter if compare was against an ancestor leaf or any leaf below
that point.
An assertion failure would be triggered when sending the TCP data ends
after the TCP reading gets closed. Implement proper reference counting
for the isc_httpd object.
When searching for a requested name in dns_qp_lookup(), we may add
a leaf node to the QP chain, then subsequently determine that the
branch we were on was a dead end. When that happens, the chain can be
left holding a pointer to a node that is *not* an ancestor of the
requested name.
We correct for this by unwinding any chain links with an offset
value greater or equal to that of the node we found.
The code below the if/else construction could only be run if the 'if'
code path was taken. Move the code into the 'if' code block so that
it is more easier to read.
The high-water allows administrators to better tune the recursive
clients limit without having to to poll the statistics channel in high
rates to get this number.
Returning the value allows for better high-water tracking without
running into edge cases like the following:
0. The counter is at value X
1. Increment the value (X+1)
2. The value is decreased multiple times in another threads (X+1-Y)
3. Get the value (X+1-Y)
4. Update-if-greater misses the X+1 value which should have been the
high-water
When parsing a zonefile named-checkzone (and others) could loop
infinitely if a directory was $INCLUDED. Record the error and treat
as EOF when looking for multiple errors.
This was found by Eric Sesterhenn from X41.
In nibble mode if the value to be converted was negative the parser
would loop forever. Process the value as an unsigned int instead
of as an int to prevent sign extension when shifting.
This was found by Eric Sesterhenn from X41.
An RPZ response's SOA record TTL is set to 1 instead of the SOA TTL,
a boolean value is passed on to query_addsoa, which is supposed to be
a TTL value. I don't see what value is appropriate to be used for
overriding, so we will pass UINT32_MAX.
The expireheader() call in the expire_ttl_headers() function
is erroneous as it passes the 'nlocktypep' and 'tlocktypep'
arguments in a wrong order, which then causes an assertion
failure.
Fix the order of the arguments so it corresponds to the function's
prototype.
in QP keys, characters that are not common in DNS names are
encoded as two-octet sequences. this caused a glitch in iterator
positioning when some lookups failed.
consider the case where we're searching for "\009" (represented
in a QP key as {0x03, 0x0c}) and a branch exists for "\000"
(represented as {0x03, 0x03}). we match on the 0x03, and continue
to search down. at the point where we find we have no match,
we need to pop back up to the branch before the 0x03 - which may
be multiple levels up the stack - before we position the iterator.
there were some structure names used in qpcache.c and qpzone.c that
were too similar to each other and could be confusing when debugging.
they have been changed as follows:
in qcache.c:
- changed_t was unused, and has been removed
- search_t -> qpc_search_t
- qpdb_rdatasetiter_t -> qpc_rditer_t
- qpdb_dbiterator_t -> qpc_dbiter_t
in qpzone.c:
- qpdb_changed_t -> qpz_changed_t
- qpdb_changedlist_t -> qpz_changedlist_t
- qpdb_version_t -> qpz_version_t
- qpdb_versionlist_t -> qpz_versionlist_t
- qpdb_search_t -> qpz_search_t
- qpdb_load_t -> qpz_search_t
when calling dns_qp_lookup() from qpcache, instead of passing
'foundname' so that a name would be constructed from the QP key,
we now just use the name field in the node data. this makes
dns_qp_lookup() run faster.
the same optimization has also been added to qpzone.
the documentation for dns_qp_lookup() has been updated to
discuss this performance consideration.
when the cache is over memory, we purge from the LRU list until
we've freed the approximate amount of memory to be added. this
approximation could fail because the memory allocated for nodenames
wasn't being counted.
add a dns_name_size() function so we can look up the size of nodenames,
then add that to the purgesize calculation.
in a cache database, unlike zones, NSEC3 records are stored in
the main tree. it is not necessary to maintain a separate 'nsec3'
tree, nor to have code in the dbiterator implementation to traverse
from one tree to another.
(if we ever implement synth-from-dnssec using NSEC3 records, we'll
need to revert this change. in the meantime, simpler code is better.)
the QP database doesn't support relative names as the RBTDB did, so
there's no need for a 'new_origin' flag or to handle `DNS_R_NEWORIGIN`
result codes.
- remove unneeded struct members and misleading comments.
- remove unused parameters for static functions.
- rename 'find_callback' to 'delegating' for consistency with qpzone;
the find callback mechanism is not used in QP databases.
- change dns_qpdata_t to qpcnode_t (QP cache node), and dns_qpdb_t to
qpcache_t, as these types are only accessed locally.
- also change qpdata_t in qpzone.c to qpznode_t (QP zone node), for
consistency.
- make the refcount declarations for qpcnode_t and qpznode_t static,
using the new ISC_REFCOUNT_STATIC macros.