The call to dns_view_flushcache() is done under exclusive mode, but we
still need to check if view->adb is still attached before calling
dns_adb_flush() because the shutdown might have been already
initialized. This most likely only a theoretical problem on shutdown
because there's either no way how to initiate cache flush when shutting
down or very slim window where the `rndc flush` would have to hit the
slim time during named shutdown.
When starting priming from dns_view_find(), the dns_view shutdown could
be initiated by different thread, detaching from the resolver. Use
dns_view_getresolver() to attach to the resolver under view->lock, so we
don't try to call dns_resolver_prime() with NULL pointer.
There are more accesses to view->resolver, (and also view->adb and
view->requestmgr that suffer from the same problem) in the dns_view
module, but they are all done in exclusive mode or under a view->lock.
Replace the use of isc_ht API with isc_hashmap API in the dns_resolver
implementation. This requires extending the fctxbucket_t structure to
include keysize and copy of the key because the isc_hashmap API needs
the raw key in case of resizing the hashmap table.
Replace the use of isc_ht API with isc_hashmap API in the dns_adb
database implementation. This requires extending the
dns_adbnamebucket_t and dns_adbentrybucket_t structures to include
keysize and copy of the key because the isc_hashmap API needs the raw
key in case of resizing the hashmap table.
Previously, the tree read lock could be upgraded to a write lock in
decrement_reference() and then downgraded back to read lock in
dereference_iter_node(). When the use of isc_rwlock_downgrade() was
removed, the downgrade was changed to a simple unlock+lock. This allows
some delete operations to sneak in and delete nodes that the iterator
expects to be in place.
Expand decrement_reference() so the caller can indicate whether the
tree read lock should be upgraded, and disallow the upgrade when
calling from dereference_iter_node(), so there will be no need to
release the lock afterward.
The zone_refreshkeys() could run before the zone_shutdown(), but after
the last .erefs has been "detached" causing assertion failure when doing
dns_zone_attach(). Remove the use of .erefs (dns_zone_attach/detach)
and replace it with using the .irefs and additional checks whether the
zone is exiting in the callbacks.
There was an exception for dnssec-policy that allowed DNSSEC in the
unsigned version of the zone. This however causes a crash if the
zone switches from dynamic to inline-signing in the case of NSEC3,
because we are now trying to add an NSEC3 record to a non-NSEC3 node.
This is because BIND expects none of the records in the unsigned
version of the zone to be NSEC3.
Remove the exception for dnssec-policy when copying non DNSSEC
records, but do allow for DNSKEY as this may be a published DNSKEY
from a different provider.
The dead nodes might get reactivated during the db iterator walks the
version of the tree, so we can't cleanup the dead nodes while the db
version is open. Restore the previous behaviour that cleaned up the
dead nodes when we are closing the version.
The dns_rbtdb unit already tracks the state of the node and tree rwlocks
during the top level function and passes the states of the locks to the
called functions.
Add the tree locking family of macros modeled after node locking macros,
and expand both to track the state of the lock in an external variable.
Additionally, in developer mode, add precondition to the macros, so the
lock is in required state - this should cause an assertion failure on
double locking instead of the thread getting stuck.
The only place where isc_rwlock_downgrade was being used was the
decrement_reference() where the code tries either relocks the node
rwlock to write and then tries to upgrade the tree lock. When returning
from the function it tries to restore the locks into a previous state
which is nice, but kind of moot, because at every use of
decrement_reference() the node locks is immediately or almost
immeditately unlocked, and same holds for the tree lock.
Instead of trying to restore the node and tree lock into the initial
state, the decrement_reference now returns the state of the locks, so
the caller can then use the right unlock operation (read or write).
Only when the tree lock was originally unlocked, the decrement_reference
unlocks the tree lock before returning to the caller.
When named starts it creates an empty KEYDATA record in the managed-keys
zone as a placeholder, then schedules a key refresh. If key refresh
fails for some reason (e.g. connectivity problems), named will load the
placeholder key into secroots as a trusted key during the next startup,
which will break the chain of trust, and named will never recover from
that state until managed-keys.bind and managed-keys.bind.jnl files are
manually deleted before (re)starting named again.
Before calling load_secroots(), check that we are not dealing with a
placeholder.
Because dns_resolver_createfetch() locks the view, it was necessary
to unlock the zone in zone_refreshkeys() before calling it in order
to maintain the lock order, and relock afterward. this permitted a race
with dns_zone_synckeyzone().
This commit moves the call to dns_resolver_createfetch() into a separate
function which is called asynchronously after the zone has been
unlocked.
The keyfetch object now attaches to the zone to ensure that
it won't be shut down before the asynchronous call completes.
This necessitated refactoring dns_zone_detach() so it always runs
unlocked. For managed zones it now schedules zone_shutdown() to
run asynchronously, and for unmanaged zones, it requires the last
dns_zone_detach() to be run without loopmgr running.
Instead of doing incremental zone loading with fixed quantum - 100
loaded lines per event, move the zone loading process to the offloaded
libuv threads using isc_work_enqueue() API.
This has the advantage that the thread scheduling is given back to the
operating system that understands blocking operations, and the zone
loading operation doesn't block the networking threads directly.
Incremental file loads now use loopmgr events instead of task events.
The dns_master_loadstreaminc(), _loadbufferinc(), _loadlexer() and
_loadlexerinc() functions were not used in BIND, and have been removed.
dns_rdata_checksvcb performs data entry checks on SVCB records.
In particular that _dns SVBC record have an 'alpn' and if that 'alpn'
parameter indicates HTTP is in use that 'dophath' is present.
The wrong tls configuration was picked here. It should be of the
primary that is selected by forward->which, not zone->curprimary.
This bug may cause BIND to select the wrong primary when retrieving
the TLS settings, or cause a crash in case the wrongly selected primary
has no TLS settings.
ARM states that the "eligibility" TTL is the smallest original TTL
value that is accepted for a record to be eligible for prefetching,
but the code, which implements the condition doesn't behave in that
manner for the edge case when the TTL is equal to the configured
eligibility value.
Fix the code to check that the TTL is greater than, or equal to the
configured eligibility value, instead of just greater than it.
For UDP queries, after calling dns_adb_beginudpfetch() in fctx_query(),
make sure that dns_adb_endudpfetch() is also called on error path, in
order to adjust the quota back.
It is currently possible that dns_adb_endudpfetch() is not
called in fctx_cancelquery() for a UDP query, which results
in quotas not being adjusted back.
Always call dns_adb_endudpfetch() for UDP queries.
In the cleanup code of fctx_query() function there is a code path
where 'query' is linked to 'fctx' and it is being destroyed.
Make sure that 'query' is unlinked before destroying it.
This log happens when BIND checks the parental-agents if the DS has
been published. But if you don't have parental-agents set up, the list
of keys to check will be empty and the result will be ISC_R_NOTFOUND.
This is not an error, so change the log level to debug in this case.
Mostly generated automatically with the following semantic patch,
except where coccinelle was confused by #ifdef in lib/isc/net.c
@@ expression list args; @@
- UNEXPECTED_ERROR(__FILE__, __LINE__, args)
+ UNEXPECTED_ERROR(args)
@@ expression list args; @@
- FATAL_ERROR(__FILE__, __LINE__, args)
+ FATAL_ERROR(args)
All we need for compression is a very small hash set of compression
offsets, because most of the information we need (the previously added
names) can be found in the message using the compression offsets.
This change combines dns_compress_find() and dns_compress_add() into
one function dns_compress_name() that both finds any existing suffix,
and adds any new prefix to the table. The old split led to performance
problems caused by duplicate names in the compression context.
Compression contexts are now either small or large, which the caller
chooses depending on the expected size of the message. There is no
dynamic resizing.
There is a behaviour change: compression now acts on all the labels in
each name, instead of just the last few.
A small benchmark suggests this is about 2x faster.
Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/once.h with
PTHEADS_RUNTIME_CHECK(), in order to improve error reporting for any
once-related run-time failures (by augmenting error messages with
file/line/caller information and the error string corresponding to
errno).
sizeof(dns_name_t) did not change but the boolean attributes are now
separated as one-bit structure members. This allows debuggers to
pretty-print dns_name_t attributes without any special hacks, plus we
got rid of manual bit manipulation code.
Originally RBT node stored three lowest bits from dns_name_t attributes.
This had a curious side-effect noticed by Tony Finch:
If you create an rbt node from a DYNAMIC name then the flag will be
propagated through dns_rbt_namefromnode() ... if you subsequently call
dns_name_free() it will try to isc_mem_put() a piece of an rbt node ...
but dns_name_free() REQUIRE()s that the name is dynamic so in the usual
case where rbt nodes are created from non-dynamic names, this kind of
code will fail an assertion.
This is a bug it dates back to june 1999 when NAMEATTR_DYNAMIC was
invented.
Apparently it does not happen often :-)
I'm planning to get rid of DNS_NAMEATTR_ definitions and bit operations,
so removal of this "three-bit-subset" assignment is a first step.
We can keep only the ABSOLUTE flag in RBT node and nothing else because
names attached to rbt nodes are always readonly: The internal node_name()
function always sets the NAMEATTR_READONLY when making a dns_name that
refers to the node's name, so the READONLY flag will be set in the name
returned by dns_rbt_namefromnode().
Co-authored-by: Tony Finch <fanf@isc.org>
Getting the recorded value of 'edns-udp-size' from the resolver requires
strong attach to the dns_view because we are accessing `view->resolver`.
This is not the case in places (f.e. dns_zone unit) where `.udpsize` is
accessed. By moving the .udpsize field from `struct dns_resolver` to
`struct dns_view`, we can access the value directly even with weakly
attached dns_view without the need to lock the view because `.udpsize`
can be accessed after the dns_view object has been shut down.
The dns_view implements weak and strong reference counting. When strong
reference counting reaches zero, the adb, ntatable and resolver objects
are shut down and detached.
In dns_zone and dns_nta the dns_view was weakly attached, but the
view->resolver reference was accessed directly leading to dereferencing
the NULL pointer.
Add dns_view_getresolver() method which attaches to view->resolver
object under the lock (if it still exists) ensuring the dns_resolver
will be kept referenced until not needed.
While refactoring the isc_mem_getx(...) usage, couple places were
identified where the memory was resized manually. Use the
isc_mem_reget(...) that was introduced in [GL !5440] to resize the
arrays via function rather than a custom code.
In several places, the structures were cleaned with memset(...)) and
thus the semantic patch converted the isc_mem_get(...) to
isc_mem_getx(..., ISC_MEM_ZERO). Use the designated initializer to
initialized the structures instead of zeroing the memory with
ISC_MEM_ZERO flag as this better matches the intended purpose.
Add new semantic patch to replace the straightfoward uses of:
ptr = isc_mem_{get,allocate}(..., size);
memset(ptr, 0, size);
with the new API call:
ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);
Formerly, the isc_hash32() would have to change the key in a local copy
to make it case insensitive. Change the isc_siphash24() and
isc_halfsiphash24() functions to lowercase the input directly when
reading it from the memory and converting the uint8_t * array to
64-bit (respectively 32-bit numbers).
dohpath is specfied in draft-ietf-add-svcb-dns and has a value
of 7. It must be a relative path (start with a /), be encoded
as UTF8 and contain the variable dns ({?dns}).
The dns__nta_shutdown() could be run from different threads and it was
accessing nta->timer unlocked. Don't check and stop the timer from
dns__nta_shutdown() directly, but leave it for the async callback.
Because the dns_zonemgr_create() was run before the loopmgr was started,
the isc_ratelimiter API was more complicated that it had to be. Move
the dns_zonemgr_create() to run_server() task which is run on the main
loop, and simplify the isc_ratelimiter API implementation.
The isc_timer is now created in the isc_ratelimiter_create() and
starting the timer is now separate async task as is destroying the timer
in case it's not launched from the loop it was created on. The
ratelimiter tick now doesn't have to create and destroy timer logic and
just stops the timer when there's no more work to do.
This should also solve all the races that were causing the
isc_ratelimiter to be left dangling because the timer was stopped before
the last reference would be detached.
Now that the 'dns_request' supports using TLS transport, implement
dynamic update forwarding using DoT when the primary server is
configured to use a TLS transport.
Previously, when using such configuration, the dynamic update forwarding
feature was broken.