Commit Graph

13896 Commits

Author SHA1 Message Date
Mark Andrews
648ee54752 Remove dead code, result cannot be ISC_R_SUSPEND
*** CID 351290:  Control flow issues  (DEADCODE)
    /lib/dns/client.c: 1027 in dns_client_resolve()
    1021     	if (!client->readydone) {
    1022     		WAIT(&client->ready, &client->readylock);
    1023     	}
    1024     	UNLOCK(&client->readylock);
    1025
    1026     	LOCK(&resarg->lock);
    >>>     CID 351290:  Control flow issues  (DEADCODE)
    >>>     Execution cannot reach the expression "result == ISC_R_SUSPEND" inside this statement: "if (result == ISC_R_SUCCESS...".
    1027     	if (result == ISC_R_SUCCESS || result == ISC_R_SUSPEND) {
    1028     		result = resarg->result;
    1029     	}
    1030     	if (result != ISC_R_SUCCESS && resarg->vresult != ISC_R_SUCCESS) {
    1031     		/*
    1032     		 * If this lookup failed due to some error in DNSSEC
2022-04-29 12:25:25 +10:00
Petr Menšík
656a0f076f Additional safety check for negative array index
inet_ntop result should always protect against empty string accepted
without an error. Make additional check to satisfy coverity scans.
2022-04-29 11:22:40 +10:00
Petr Menšík
67e773c93c Ensure diff variable is not read uninitialized
Coverity detected issues:
- var_decl: Declaring variable "diff" without initializer.
- uninit_use_in_call: Using uninitialized value "diff.tuples.head" when
  calling "dns_diff_clear".
2022-04-29 11:22:40 +10:00
Mark Andrews
3e857065de Check that SIG and RRSIG records for private algorithms are valid
SIG and RRSIG records for private algorithms are supposed to contain
the name / OID of the algorithm used to generate them at the start
of the signature field.
2022-04-28 15:54:27 -07:00
Aram Sargsyan
2f2e02ff0c Document catalog zones member zone reset by change of unique label
The DNS catalog zones draft version 5 document requires that catalog
zones consumers must reset the member zone's internal zone state when
its unique label changes (either within the same catalog zone or
during change of ownership performed using the "coo" property).

BIND already behaves like that, and, in fact, doesn't support keeping
the zone state during change of ownership even if the unique label
has been kept the same, because BIND always removes the member zone
and adds it back during unique label renaming or change of ownership.

Document the described behavior and add a log message to inform when
unique label renaming occurs.

Add a system test case with unique label renaming.
2022-04-28 14:04:28 +00:00
Aram Sargsyan
84d3aba4f3 Remove reduntant checks of 'rdclass' in catz.c
We check the `rdclass` to be of type IN in `dns_catz_update_process()`
function, and all the other static functions where similar checks exist
are called after (and in the result of) that function being called,
so they are effectively redundant.
2022-04-28 12:40:03 +00:00
Aram Sargsyan
a8228d5f19 Introduce the concept of broken catalog zones
The DNS catalog zones draft version 5 document describes various
situations when a catalog zones must be considered as "broken" and
not be processed.

Implement those checks in catz.c and add corresponding system tests.
2022-04-28 12:36:58 +00:00
Matthijs Mekking
c66b9abc0b Add stale answer extended errors
Add DNS extended errors 3 (Stale Answer) and 19 (Stale NXDOMAIN Answer)
to responses. Add extra text with the reason why the stale answer was
returned.

To test, we need to change the configuration such that for the first
set of tests the stale-refresh-time window does not interfer with the
expected extended errors.
2022-04-28 09:58:25 +02:00
Ondřej Surý
196ec365c7 In zone.c, use __func__ instead of hand-crafted me strings
In zone.c, the "me" strings were defined for functions that could be
traced with "ENTER" macro.

Use the __func__ that's defined by the compiler and is less prone to
copy&paste errors.
2022-04-28 09:18:05 +02:00
Evan Hunt
5c4cf3fcc4 prevent a deadlock in the shutdown system test
The shutdown test sends 'rdnc status' commands in parallel with
'rndc stop' A new rndc connection arriving will reference the ACL
environment to see whether the client is allowed to connect.
Commit c0995bc380 added a mutex lock to ns_interfacemgr_getaclenv(),
but if the new connection arrives while the interfaces are being
purged during shutdown, that lock is already being held. If the
the connection event slips in ahead of one of the netmgr's "stop
listening" events on a worker thread, a deadlock can occur.

The fix is not to hold the interfacemgr lock while shutting down
interfaces; only while actually traversing the interface list to
identify interfaces needing shutdown.
2022-04-27 23:25:57 -07:00
Evan Hunt
7b2ea97e46 refactor resume_dsfetch()
clean up resume_dsfetch() so that the fctx reference counting is
saner and easier to follow.
2022-04-27 10:54:28 -07:00
Evan Hunt
d2f407cca3 refactor validated()
minor changes to ensure that fctx reference counting is clear and correct.
2022-04-27 10:54:28 -07:00
Evan Hunt
7c5afebcdc rename maybe_destroy() to maybe_cancel_validators()
the maybe_destroy() function no longer destroys the fctx,
so rename it and update the comments.
2022-04-27 10:54:28 -07:00
Evan Hunt
b4592d02a1 refactor fctx_done() to set fctx to NULL
previously fctx_done() detached the fctx but did not clear the pointer
passed into it from the caller.  in some conditions, when rctx_done()
was reached while waiting for a validator to complete, fctx_done()
could be called twice on the same fetch, causing a double detach.

fctx_done() now clears the fctx pointer, to reduce the chances of
such mistakes.
2022-04-27 10:54:28 -07:00
Aram Sargsyan
e3a88862c0 Handle ISC_R_SUCCESS on a deactivated response in udp_recv()
There is a possibility for `udp_recv()` to be called with `eresult`
being `ISC_R_SUCCESS`, but nevertheless with already deactivated `resp`,
which can happen when the request has been canceled in the meantime.
2022-04-27 15:53:14 +00:00
Artem Boldariev
978f97dcdd TLSDNS: call send callbacks after only the data was sent
This commit ensures that write callbacks are getting called only after
the data has been sent via the network.

Without this fix, a situation could appear when a write callback could
get called before the actual encrypted data would have been sent to
the network. Instead, it would get called right after it would have
been passed to the OpenSSL (i.e. encrypted).

Most likely, the issue does not reveal itself often because the
callback call was asynchronous, so in most cases it should have been
called after the data has been sent, but that was not guaranteed by
the code logic.

Also, this commit removes one memory allocation (netievent) from a hot
path, as there is no need to call this callback asynchronously
anymore.
2022-04-27 17:44:23 +03:00
Tony Finch
72b23aafd2 Apply clang-format to rbt.c
Giving the code a proper spring cleaning
2022-04-27 11:05:05 +01:00
Tony Finch
b0bf49726e Clean up a few rbt comments
Avoid HTML entities, and describe what a function does
instead of explaining why it used to be a macro.
2022-04-27 11:05:05 +01:00
Tony Finch
084f146946 Fix style of a function name in rbt.c
Mechanically generated with:

:; spatch --no-show-diff --in-place --sp-file <<END lib/dns/rbt.c
@@ expression node, name; @@
- NODENAME(node, name)
+ node_name(node, name)
@@ parameter list params; @@
  static void
- NODENAME(params)
+ node_name(params)
  { ... }
END
2022-04-27 11:05:05 +01:00
Tony Finch
8adae2d813 Remove redundant rbt macro definitions
After the previous commit, these macros are no longer used.
2022-04-27 11:05:05 +01:00
Tony Finch
bee1c91b0a Remove do-nothing rbt macro calls
Pointer chasing reads better like left->right instead of RIGHT(left)

Mechanically generated with:

:; spatch --no-show-diff --in-place --sp-file <<END lib/dns/rbt.c
@@ expression node; @@
- PARENT(node)
+ node->parent
@@ expression node; @@
- LEFT(node)
+ node->left
@@ expression node; @@
- RIGHT(node)
+ node->right
@@ expression node; @@
- DOWN(node)
+ node->down
@@ expression node; @@
- UPPERNODE(node)
+ node->uppernode
@@ expression node; @@
- DATA(node)
+ node->data
@@ expression node; @@
- IS_EMPTY(node)
+ node->data == NULL
@@ expression node; @@
- HASHNEXT(node)
+ node->hashnext
@@ expression node; @@
- HASHVAL(node)
+ node->hashval
@@ expression node; @@
- COLOR(node)
+ node->color
@@ expression node; @@
- NAMELEN(node)
+ node->namelen
@@ expression node; @@
- OLDNAMELEN(node)
+ node->oldnamelen
@@ expression node; @@
- OFFSETLEN(node)
+ node->offsetlen
@@ expression node; @@
- ATTRS(node)
+ node->attributes
@@ expression node; @@
- IS_ROOT(node)
+ node->is_root
@@ expression node; @@
- FINDCALLBACK(node)
+ node->find_callback
@@ expression node; @@
- DIRTY(node)
+ node->dirty
@@ expression node; @@
- WILD(node)
+ node->wild
@@ expression node; @@
- LOCKNUM(node)
+ node->locknum
@@ expression node; @@
- MAKE_RED(node)
+ node->color = RED
@@ expression node; @@
- MAKE_BLACK(node)
+ node->color = BLACK
END
2022-04-27 11:05:05 +01:00
Evan Hunt
a1e9a59e2b lock find when unlinking adbname->finds in dns_adb_cancelfind()
In dns_adb_cancelfind(), we need to release the find lock and
then acquire the bucket and find locks in that order, for
consistency with locking hierarchy elsehwere. Previously we
were only acquiring the bucket lock.

Also rewrote the function for better readability.
2022-04-26 12:59:59 +02:00
Ondřej Surý
407b37c3f2 Set IP(V6)_RECVERR on connect UDP sockets (via libuv)
The connect()ed UDP socket provides feedback on a variety of ICMP
errors (eg port unreachable) which bind can then use to decide what to
do with errors (report them to the client, try again with a different
nameserver etc).  However, Linux's implementation does not report what
it considers "transient" conditions, which is defined as Destination
host Unreachable, Destination network unreachable, Source Route Failed
and Message Too Big.

Explicitly enable IP_RECVERR / IPV6_RECVERR (via libuv uv_udp_bind()
flag) to learn about ICMP destination network/host unreachable.
2022-04-26 12:22:18 +02:00
Ondřej Surý
eb8f2974b1 Abort when libuv at runtime mismatches libuv at compile time
When we compile with libuv that has some capabilities via flags passed
to f.e. uv_udp_listen() or uv_udp_bind(), the call with such flags would
fail with invalid arguments when older libuv version is linked at the
runtime that doesn't understand the flag that was available at the
compile time.

Enforce minimal libuv version when flags have been available at the
compile time, but are not available at the runtime.  This check is less
strict than enforcing the runtime libuv version to be same or higher
than compile time libuv version.
2022-04-26 11:40:40 +02:00
Ondřej Surý
9ae34a04e8 The route socket and its storage was detached while still reading
The interfacemgr and the .route was being detached while the network
manager had pending read from the socket.  Instead of detaching from the
socket, we need to cancel the read which in turn will detach the route
socket and the associated interfacemgr.
2022-04-25 17:19:33 +02:00
Tony Finch
b2950c96de Revert "Move random number re-seeding out of the hot path"
This reverts commit b1bb41603e.
2022-04-25 15:18:58 +01:00
Tony Finch
b1bb41603e Move random number re-seeding out of the hot path
Instead of checking if we need to re-seed for every isc_random call,
seed the random number generator in the libisc global initializer
and the per-thread initializer.
2022-04-22 16:40:37 +01:00
Tony Finch
254d2abafb Clean up isc_random
Remove redundant comments and avoid implicit casts.
2022-04-22 16:40:37 +01:00
Tony Finch
d20ea4a703 Make isc_random_uniform() nearly divisionless
It used to require two 32-bit integer divisions to get a random number
less than some limit. Now we use Daniel Lemire's "nearly-divisionless"
algorithm for unbiased bounded random numbers, which requires one
64-bit integer multiply in the usual case, and one 32-bit integer
division in rare slow cases. Even the slow cases are faster than
before; there are also fewer branches.

I think this algorithm is exceptionally beautiful. It also has more
clever tricks than lines of code, so I have done my best to explain
how it works.
2022-04-22 16:40:37 +01:00
Ondřej Surý
b55e8a959f Allow attaching to dns_adb which is shutting down
The dns__adb_attach() had an assertion failure that prevented to attach
to dns_adb if the dns_adb was shutting down.  There was a race between
checking for .exiting in dns_adb_createfind and creating new_adbfind() -
other thread could have set the .exiting to true between the check.

Remove the assertion failure and allow attaching to dns_adb even while
shutting down.  The process of dns_adb shutting down would be noticed
only a moments later when any other callback is called.
2022-04-22 16:48:37 +02:00
Ondřej Surý
741a7096fc Run resume_dslookup() from the correct task
The rctx_chaseds() function calls dns_resolver_createfetch(), passing
fctx->task as the target task to run resume_dslookup() from.  This
breaks task-based serialization of events as fctx->task is the task that
the dns_resolver_createfetch() caller wants to receive its fetch
completion event in; meanwhile, intermediate fetches started by the
resolver itself (e.g. related to QNAME minimization) must use
res->buckets[bucketnum].task instead.  This discrepancy may cause
trouble if the resume_dslookup() callback happens to be run concurrently
with e.g. fctx_doshutdown().

Fix by passing the correct task to dns_resolver_createfetch() in
rctx_chaseds().
2022-04-22 14:25:32 +02:00
Michał Kępień
5065c4686e Fix loading plugins using just their filenames
BIND 9 plugins are installed using Automake's pkglib_LTLIBRARIES stanza,
which causes the relevant shared objects to be placed in the
$(libdir)/@PACKAGE@/ directory, where @PACKAGE@ is expanded to the
lowercase form of the first argument passed to AC_INIT(), i.e. "bind".
Meanwhile, NAMED_PLUGINDIR - the preprocessor macro that the
ns_plugin_expandpath() function uses for determining the absolute path
to a plugin for which only a filename has been provided (rather than a
path) - is set to $(libdir)/named.  This discrepancy breaks loading
plugins using just their filenames.  Fix the issue (and also prevent it
from reoccurring) by setting NAMED_PLUGINDIR to $(pkglibdir).
2022-04-22 13:27:12 +02:00
Michał Kępień
7aa7b6474b Prevent memory bloat caused by a jemalloc quirk
Since version 5.0.0, decay-based purging is the only available dirty
page cleanup mechanism in jemalloc.  It relies on so-called tickers,
which are simple data structures used for ensuring that certain actions
are taken "once every N times".  Ticker data (state) is stored in a
thread-specific data structure called tsd in jemalloc parlance.  Ticks
are triggered when extents are allocated and deallocated.  Once every
1000 ticks, jemalloc attempts to release some of the dirty pages hanging
around (if any).  This allows memory use to be kept in check over time.

This dirty page cleanup mechanism has a quirk.  If the first
allocator-related action for a given thread is a free(), a
minimally-initialized tsd is set up which does not include ticker data.
When that thread subsequently calls *alloc(), the tsd transitions to its
nominal state, but due to a certain flag being set during minimal tsd
initialization, ticker data remains unallocated.  This prevents
decay-based dirty page purging from working, effectively enabling memory
exhaustion over time. [1]

The quirk described above has been addressed (by moving ticker state to
a different structure) in jemalloc's development branch [2], but not in
any numbered jemalloc version released to date (the latest one being
5.2.1 as of this writing).

Work around the problem by ensuring that every thread spawned by
isc_thread_create() starts with a malloc() call.  Avoid immediately
calling free() for the dummy allocation to prevent an optimizing
compiler from stripping away the malloc() + free() pair altogether.

An alternative implementation of this workaround was considered that
used a pair of isc_mem_create() + isc_mem_destroy() calls instead of
malloc() + free(), enabling the change to be fully contained within
isc__trampoline_run() (i.e. to not touch struct isc__trampoline), as the
compiler is not allowed to strip away arbitrary function calls.
However, that solution was eventually dismissed as it triggered
ThreadSanitizer reports when tools like dig, nsupdate, or rndc exited
abruptly without waiting for all worker threads to finish their work.

[1] https://github.com/jemalloc/jemalloc/issues/2251
[2] c259323ab3
2022-04-21 14:19:39 +02:00
Mark Andrews
d4892f7cdc Tighten DBC restrictions on message sections
dns_message_findname and dns_message_sectiontotext incorrectly accepted
DNS_SECTION_ANY.  If DNS_SECTION_ANY was passed the section array could
be incorrectly accessed at (-1).

dns_message_pseudosectiontotext and dns_message_pseudosectiontoyaml
incorrectly accepted DNS_PSEUDOSECTION_ANY.  These functions are
designed to process a single section.
2022-04-19 22:12:38 +00:00
Ondřej Surý
d1d88a2895 Add detailed tracing when TASKMGR_TRACE is defined
When TASKMGR_TRACE=1 is defined, the task and event objects have
detailed tracing information about function, file, line, and
backtrace (to the extent tracked by gcc) where it was created.

At exit, when there are unfinished tasks, they will be printed along
with the detailed information.
2022-04-19 14:25:23 +02:00
Ondřej Surý
f0feaa3305 Remove isc_task_sendto(anddetach) functions
The only place where isc_task_sendto() was used was in dns_resolver
unit, where the "sendto" part was actually no-op, because dns_resolver
uses bound tasks.  Remove the isc_task_sendto() and
isc_task_sendtoanddetach() functions in favor of using bound tasks
create with isc_task_create_bound().

Additionally, cache the number of running netmgr threads (nworkers)
locally to reduce the number of function calls.
2022-04-19 14:24:36 +02:00
Ondřej Surý
1eeb4c1121 Remove isc_event_constallocate()
The isc_event_constallocate() function was not used anywhere, thus
remove the isc_event_constallocate() macro, declaration and definition.
2022-04-19 13:46:26 +02:00
Ondřej Surý
f55a4d3e55 Allow listening on less than nworkers threads
For some applications, it's useful to not listen on full battery of
threads.  Add workers argument to all isc_nm_listen*() functions and
convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.
2022-04-19 11:08:13 +02:00
Mark Andrews
69d30f8974 Check PRIVATEDNS and PRIVATEOID key identifiers
dns_rdata_fromtext and dns_rdata_fromwire now checks that there is
a valid name or oid at the start of the keydata when the key algorithm
is PRIVATEDNS and PRIVATEOID respectively.

dns_rdata_totext now prints out the oid if the algorithm is PRIVATEOID.
2022-04-19 14:32:56 +10:00
Mark Andrews
d043a41499 Update the rdataset->trust field in ncache.c:rdataset_settrust
Both the trust recorded in the slab stucture and the trust on
rdataset need to be updated.
2022-04-19 08:38:26 +10:00
Aram Sargsyan
99d1ec6c4b Do not use REQUIRE in dns_catz_entry_detach() after other code
The REQUIRE checks should be at the top of the function before
any assignments or code.

Move the REQUIRE check to the top.
2022-04-14 20:41:52 +00:00
Aram Sargsyan
59c486391d Replace CATZ_OPT_MASTERS with CATZ_OPT_PRIMARIES
Update the enum entry in the continued effort of replacing some
DNS terminology.
2022-04-14 20:41:52 +00:00
Aram Sargsyan
bb837db4ee Implement catalog zones change of ownership (coo) support
Catalog zones change of ownership is special mechanism to facilitate
controlled migration of a member zone from one catalog to another.

It is implemented using catalog zones property named "coo" and is
documented in DNS catalog zones draft version 5 document.

Implement the feature using a new hash table in the catalog zone
structure, which holds the added "coo" properties for the catalog zone
(containing the target catalog zone's name), and the key for the hash
table being the member zone's name for which the "coo" property is being
created.

Change some log messages to have consistent zone name quoting types.

Update the ARM with change of ownership documentation and usage
examples.

Add tests which check newly the added features.
2022-04-14 20:41:52 +00:00
Aram Sargsyan
0b2d5490cd Do not cancel processing record datasets in catalog zone after an error
When there are multiple record datasets in a database node of a catalog
zone, and BIND encounters a soft error during processing of a dataset,
it breaks from the loop and doesn't process the other datasets in the
node.

There are cases when this is not desired. For example, the catalog zones
draft version 5 states that there must be a TXT RRset named
`version.$CATZ` with exactly one RR, but it doesn't set a limitation
on possible non-TXT RRsets named `version.$CATZ` existing alongside
with the TXT one. In case when one exists, we will get a processing
error and will not continue the loop to process the TXT RRset coming
next.

Remove the "break" statement to continue processing all record datasets.
2022-04-14 10:56:24 +00:00
Aram Sargsyan
6035980bb1 Process the 'version' record of the catalog zone first
When processing a new or updated catalog zone, the record datasets
from the database are being processed in order. This creates a
problem because we need to know the version of the catalog zone
schema to process some of the records differently, but we do not
know the version until the 'version' record gets processed.

Find the 'version' record and process it first, only then iterate over
the database to process the rest, making sure not to process the
'version' record twice.
2022-04-14 10:56:24 +00:00
Aram Sargsyan
cedfebc64a Implement catalog zones options new syntax based on custom properties
According to DNS catalog zones draft version 5 document, catalog
zone custom properties must be placed under the "ext" label.

Make necessary changes to support the new custom properties syntax in
catalog zones with version "2" of the schema.

Change the default catalog zones schema version from "1" to "2" in
ARM to prepare for the new features and changes which come starting
from this commit in order to support the latest DNS catalog zones draft
document.

Make some restructuring in ARM and rename the term catalog zone "option"
to "custom property" to better reflect the terms used in the draft.

Change the version of 'catalog1.zone.' catalog zone in the "catz" system
test to "2", and leave the version of 'catalog2.zone.' catalog zone at
version "1" to test both versions.

Add tests to check that the new syntax works only with the new schema
version, and that the old syntax works only with the legacy schema
version catalog zones.
2022-04-14 10:53:52 +00:00
Matthijs Mekking
3d05c99abb Update dns_dnssec_syncdelete() function
Update the function that synchronizes the CDS and CDNSKEY DELETE
records. It now allows for the possibility that the CDS DELETE record
is published and the CDNSKEY DELETE record is not, and vice versa.

Also update the code in zone.c how 'dns_dnssec_syncdelete()' is called.

With KASP, we still maintain the DELETE records our self. Otherwise,
we publish the CDS and CDNSKEY DELETE record only if they are added
to the zone. We do still check if these records can be signed by a KSK.

This change will allow users to add a CDS and/or CDNSKEY DELETE record
manually, without BIND removing them on the next zone sign.

Note that this commit removes the check whether the key is a KSK, this
check is redundant because this check is also made in
'dst_key_is_signing()' when the role is set to DST_BOOL_KSK.
2022-04-13 13:26:59 +02:00
Evan Hunt
73ff8850bf ADB entries could be unlinked too soon
due to a typo in the code, ADB entries were unlinked from their entry
buckets during shutdown if they had a nonzero reference count. they
were only supposed to be unlinked if the reference count was exactly
one (that being the reference held by the bucket itself).
2022-04-11 17:29:03 -07:00
Ondřej Surý
f981b52793 Don't destroy mctx and task pools until we are destroying zonemgr
The mctx, zonetask and loadtask pools were being destroyed in the
shutdown function where in theory a dangling zone could be still
attached to it.

Move the isc_mem_put() on the pools to the destroy() function.
2022-04-07 18:12:03 +02:00
Tony Finch
71ce8b0a51 Ensure that dns_request_createvia() has a retry limit
There are a couple of problems with dns_request_createvia(): a UDP
retry count of zero means unlimited retries (it should mean no
retries), and the overall request timeout is not enforced. The
combination of these bugs means that requests can be retried forever.

This change alters calls to dns_request_createvia() to avoid the
infinite retry bug by providing an explicit retry count. Previously,
the calls specified infinite retries and relied on the limit implied
by the overall request timeout and the UDP timeout (which did not work
because the overall timeout is not enforced). The `udpretries`
argument is also changed to be the number of retries; previously, zero
was interpreted as infinity because of an underflow to UINT_MAX, which
appeared to be a mistake. And `mdig` is updated to match the change in
retry accounting.

The bug could be triggered by zone maintenance queries, including
NOTIFY messages, DS parental checks, refresh SOA queries and stub zone
nameserver lookups. It could also occur with `nsupdate -r 0`.
(But `mdig` had its own code to avoid the bug.)
2022-04-06 17:12:48 +01:00