Commit Graph

14037 Commits

Author SHA1 Message Date
Michał Kępień
1a79aeab44 Stop resolving invalid names in resume_dslookup()
Commit 7b2ea97e46 introduced a logic bug
in resume_dslookup(): that function now only conditionally checks
whether DS chasing can still make progress.  Specifically, that check is
only performed when the previous resume_dslookup() call invokes
dns_resolver_createfetch() with the 'nameservers' argument set to
something else than NULL, which may not always be the case.  Failing to
perform that check may trigger assertion failures as a result of
dns_resolver_createfetch() attempting to resolve an invalid name.

Example scenario that leads to such outcome:

 1. A validating resolver is configured to forward all queries to
    another resolver.  The latter returns broken DS responses that
    trigger DS chasing.

 2. rctx_chaseds() calls dns_resolver_createfetch() with the
    'nameservers' argument set to NULL.

 3. The fetch fails, so resume_dslookup() is called.  Due to
    fevent->result being set to e.g. DNS_R_SERVFAIL, the default branch
    is taken in the switch statement.

 4. Since 'nameservers' was set to NULL for the fetch which caused the
    resume_dslookup() callback to be invoked
    (fctx->nsfetch->private->nameservers), resume_dslookup() chops off
    one label off fctx->nsname and calls dns_resolver_createfetch()
    again, for a name containing one label less than before.

 5. Steps 3-4 are repeated (i.e. all attempts to find the name servers
    authoritative for the DS RRset being chased fail) until fctx->nsname
    becomes stripped down the the root name.

 6. Since resume_dslookup() does not check whether DS chasing can still
    make progress, it strips off a label off the root name and continues
    its attempts at finding the name servers authoritative for the DS
    RRset being chased, passing an invalid name to
    dns_resolver_createfetch().

Fix by ensuring resume_dslookup() always checks whether DS chasing can
still make progress when a name server fetch fails.  Update code
comments to ensure the purpose of the relevant dns_name_equal() check is
clear.
2022-07-13 10:31:16 +02:00
Mark Andrews
5b51610174 Update libdns_la_LIBADD rather than libdns_la_LDFLAGS
the wrong macro was being update with MAXMINDDB_LIBS making
it difficult to adjust link order.
2022-07-13 00:14:13 +00:00
Evan Hunt
5ec077e6aa clear fctx->magic and fetch->magic when destroying
fctx_destroy() and dns_resolver_destroyfetch() did not clear the
'magic' field during destruction.
2022-07-12 23:40:47 +00:00
Evan Hunt
df1d81cf96 log the reason for falling back to AXFR from IXFR at level info
messages indicating the reason for a fallback to AXFR (i.e, because
the requested serial number is not present in the journal, or because
the size of the IXFR response would exceeed "max-ixfr-ratio") are now
logged at level info instead of debug(4).
2022-07-12 16:02:54 -07:00
Artem Boldariev
ffcb54211e TLS: do not ignore accept callback result
Before this change the TLS code would ignore the accept callback result,
and would not try to gracefully close the connection. This had not been
noticed, as it is not really required for DoH. Now the code tries to
shut down the TLS connection gracefully when accepting it is not
successful.
2022-07-12 14:40:22 +03:00
Artem Boldariev
8585b92f98 TLSDNS: try pass incoming data to OpenSSL if there are any
Otherwise the code path will lead to a call to SSL_get_error()
returning SSL_ERROR_SSL, which in turn might lead to closing
connection to early in an unexpected way, as it is clearly not what is
intended.

The issue was found when working on loppmgr branch and appears to
be timing related as well. Might be responsible for some unexpected
transmission failures e.g. on zone transfers.
2022-07-12 14:40:22 +03:00
Artem Boldariev
fc74b15e67 TLS: bail out earlier when NM is stopping
In some operations - most prominently when establishing connection -
it might be beneficial to bail out earlier when the network manager
is stopping.

The issue is backported from loopmgr branch, where such a change is
not only beneficial, but required.
2022-07-12 14:40:22 +03:00
Artem Boldariev
ac4fb34f18 TLS: sometimes TCP conn. handle might be NULL on when connecting
In some cases - in particular, in case of errors, NULL might be passed
to a connection callback instead of a handle that could have led to
an abort. This commit ensures that such a situation will not occur.

The issue was found when working on the loopmgr branch.
2022-07-12 14:40:22 +03:00
Artem Boldariev
88524e26ec TLS: try to close sockets whenever there are no pending operations
This commit ensures that the underlying TCP socket of a TLS connection
gets closed earlier whenever there are no pending operations on it.

In the loop-manager branch, in some circumstances the connection
could have remained opened for far too long for no reason. This
commit ensures that will not happen.
2022-07-12 14:40:22 +03:00
Artem Boldariev
237ce05b89 TLS: Implement isc_nmhandle_setwritetimeout()
This commit adds a proper implementation of
isc_nmhandle_setwritetimeout() for TLS connections. Now it passes the
value to the underlying TCP handle.
2022-07-12 14:40:22 +03:00
Mark Andrews
7a42417d61 Clone the message buffer before forwarding UPDATE messages
this prevents named forwarding a buffer that may have been over
written.
2022-07-12 17:13:24 +10:00
Ondřej Surý
ddad205092 Don't compress in the rrset if compression was disabled
Currently, when rrset is being compressed, the optimization has been put
in place to reuse offset to the previous name in the same rrset.  This
skips the check for non-improving compression and thus compresses the
root zone making the wireformat worse by one byte.

Additionally, when the compression has been disabled for the name, it
would be repeatedly added to the compression table because we act as if
the name was not found and the dns_compress_add() doesn't check for the
existing entry.

Change the dns_name_towire2() to always lookup the name in the
compression table to prevent adding duplicates, but don't use it neither
in the wireformat nor in the rrset cache.
2022-07-11 12:26:15 +02:00
Evan Hunt
549cf0f3e6 "rndc fetchlimit" now also lists rate-limited domains
"rndc fetchlimit" now also prints a list of domain names that are
currently rate-limited by "fetches-per-zone".

The "fetchlimit" system test has been updated to use this feature
to check that domain limits are applied correctly.
2022-07-06 19:46:23 -07:00
Evan Hunt
6175897478 add "rndc fetchlimit" to show fetchlimited servers
this command runs dns_adb_dumpquota() to display all servers
in the ADB that are being actively fetchlimited by the
fetches-per-server controls (i.e, servers with a nonzero average
timeout ratio or with the quota having been reduced from the
default value).

the "fetchlimit" system test has been updated to use the
new command to check quota values instead of "rndc dumpdb".
2022-07-06 19:46:20 -07:00
Evan Hunt
7cac4ca03c clean up unused API
the dns_adb_dumpfind() function was only used inside adb.c and
can be static. dns_view_dumpdbtostream() was not used anywhere.
2022-07-06 19:36:54 -07:00
Evan Hunt
f6abb80746 try other servers when receiving FORMERR
previously, when an iterative query returned FORMERR, resolution
would be stopped under the assumption that other servers for
the same domain would likely have the same capabilities. this
assumption is not correct; some domains have been reported for
which some but not all servers will return FORMERR to a given
query; retrying allows recursion to succeed.
2022-07-06 14:15:32 -07:00
Evan Hunt
a499794984 REQUIRE should not have side effects
it's a style violation to have REQUIRE or INSIST contain code that
must run for the server to work. this was being done with some
atomic_compare_exchange calls. these have been cleaned up.  uses
of atomic_compare_exchange in assertions have been replaced with
a new macro atomic_compare_exchange_enforced, which uses RUNTIME_CHECK
to ensure that the exchange was successful.
2022-07-05 12:22:55 -07:00
Mark Andrews
7be64c0e94 Tighten $GENERATE directive parsing
The original sscanf processing allowed for a number of syntax errors
to be accepted.  This included missing the closing brace in
${modifiers}

Look for both comma and right brace as intermediate seperators as
well as consuming the final right brace in the sscanf processing
for ${modifiers}.  Check when we got right brace to determine if
the sscanf consumed more input than expected and if so behave as
if it had stopped at the first right brace.
2022-07-05 09:41:33 -07:00
Mark Andrews
5327b9708f Check for overflow in $GENERATE computations
$GENERATE uses 'int' for its computations and some constructions
can overflow values that can be represented by an 'int' resulting
in undefined behaviour.  Detect these conditions and return a
range error.
2022-07-05 09:41:29 -07:00
Mark Andrews
a5b57ed293 Add synth-from-dnssec namespaces for keytable entries
We do this by adding callbacks for when a node is added or deleted
from the keytable.  dns_keytable_add and dns_keytable_delete where
extended to take a callback.  dns_keytable_deletekey does not remove
the node so it was not extended.
2022-07-05 12:29:01 +10:00
Mark Andrews
f716bd68d4 Add entries to the synth-from-dnssec namespace tree for zones
When a zone is attached or detached from the view (zone->view is
updated) update the synth-from-dnssec namespace tree.
2022-07-05 12:29:01 +10:00
Mark Andrews
228dadb026 Check the synth-form-dnssec namespace when synthesising responses
Call dns_view_sfd_find to find the namespace to be used to verify
the covering NSEC records returned for the given QNAME.  Check that
the NSEC owner names are within that namespace.
2022-07-05 12:29:01 +10:00
Mark Andrews
3619cad141 Add a mechanism to record namespaces for synth-from-dnssec
When namespace is grafted on, the DNSSEC proofs for non existance
need to come from that namespace and not a higher namespace.  We
add 3 function dns_view_sfd_add, dns_view_sfd_del and dns_view_sfd_find
to add, remove and find the namespace that should be used when
checking NSEC records.

dns_view_sfd_add adds a name to a tree, creating the tree if needed.
If the name already existed in the tree the reference count is
increased otherwise it is initalised to 1.

dns_view_sfd_del removes a reference to a name in the tree, if the
count goes to 0 the node is removed.

dns_view_sfd_find returns the namespace to be used to entered name.
If there isn't an enclosing name in the tree, or the tree does not
yet exist, the root name is returned.

Access to the tree is controlled by a read/write lock.
2022-07-05 12:29:01 +10:00
Evan Hunt
975a5a98cf Add missing isc_refcount_*() calls
Commits 76bcb4d16b and
d48d8e1cf0 did not include
isc_refcount_destroy() calls that would be logical counterparts of the
isc_refcount_init() calls these commits added.  Add the missing
isc_refcount_destroy() calls to destroy().

Adding these calls (which ensure a given structure's reference count
equals 0 when it is destroyed, therefore detecting reference counting
issues) uncovered another flaw in the commits mentioned above: missing
isc_refcount_decrement() calls that would be logical counterparts of the
isc_refcount_increment*() calls these commits added.  Add the missing
isc_refcount_decrement() calls to unlink_name() and unlink_entry().
2022-07-04 16:02:12 +02:00
Michał Kępień
ef86653d80 Add missing invocations of pthreads destructors
Add isc_mutex_destroy() and isc_rwlock_destroy() calls missing from the
commits that introduced the relevant isc_mutex_init() and
isc_rwlock_init() calls:

  - 76bcb4d16b
  - 1595304312
  - 857f3bede3

None of these omissions affect any hot paths, so they are not expected
to cause operational issues; correctness is the only concern here.
2022-07-04 16:02:12 +02:00
Michał Kępień
887c666caf Obsolete the "glue-cache" option
The "glue-cache" option was marked as deprecated by commit
5ae33351f2 (first released in BIND 9.17.6,
back in October 2020), so now obsolete that option, removing all code
and documentation related to it.

Note: this causes the glue cache feature to be permanently enabled, not
disabled.
2022-06-30 15:24:08 +02:00
Artem Boldariev
d2e13ddf22 Update the set of HTTP endpoints on reconfiguration
This commit ensures that on reconfiguration the set of HTTP
endpoints (=paths) is being updated within HTTP listeners.
2022-06-28 15:42:38 +03:00
Artem Boldariev
e72962d5f1 Update max concurrent streams limit in HTTP listeners on reconfig
This commit ensures that HTTP listeners concurrent streams limit gets
updated properly on reconfiguration.
2022-06-28 15:42:38 +03:00
Artem Boldariev
a2379135fa Update HTTP listeners quotas on reconfiguration
This commit ensures that on reconfiguration a proper value for HTTP
connections limit is picked up.

The commit also refactors how listeners settings are updated so that
there is less code duplication.
2022-06-28 15:42:38 +03:00
Artem Boldariev
3f0b310772 Store HTTP quota size inside a listenlist instead of the quota
This way only quota size is passed to the interface/listener
management code instead of a quota object. Thus, we can implement
updating the quota object size instead of recreating the object.
2022-06-28 15:42:38 +03:00
Matthijs Mekking
d8dae61832 Add isccfg duration utility functions
Add function isccfg_duration_toseconds and isccfg_parse_duration to get
rid of code duplication.
2022-06-28 11:56:31 +02:00
Matthijs Mekking
8e18fa5874 Fix a bug in the duration_fromtext function
The function actually did not enforce that the duration string starts
with a P (or p), just that there is a P (or p) in the string.
2022-06-28 11:56:31 +02:00
Matthijs Mekking
c2a7950417 Also inherit from "default" for "insecure" policy
Remove the duplication from the defaultconf and inherit the values
not set in the "insecure" policy from the "default" policy. Therefore,
we must insist that the first read built-in policy is the default one.
2022-06-28 11:56:31 +02:00
Matthijs Mekking
5d6f0de84b Nit changes in keymgr and kasp
Use the ISC_MAX define instead of "x = a > b ? a : b" paradigm.

Remove an unneeded include.
2022-06-28 11:56:31 +02:00
Matthijs Mekking
20acb8d3a3 When loading dnssec-policies, inherit from default
Most of the settings (durations) are already inheriting from the default
because they use the constants from lib/dns/kasp.h. We need them as
constants so we can use them in named-checkconf to verify the policy
parameters.

The NSEC(3) parameters and keys should come from the actual default
policy. Change the call to cfg_kasp_fromconfig() to include the default
kasp. We also no longer need to corner case where config is NULL we load
the built-in policy: the built-in policies are now loaded when config is
set to named_g_config.

Finally, add a debug log (it is useful to see which policies are being
loaded).
2022-06-28 11:56:31 +02:00
Matthijs Mekking
5ff414e986 Store built-in dnssec-policies in defaultconf
Update the defaultconf with the built-in policies. These will now be
printed with "named -C".

Change the defines in kasp.h to be strings, so they can be concatenated
in the defaultconf. This means when creating a kasp structure, we no
longer initialize the defaults (this is fine because only kaspconf.c
uses dns_kasp_create() and it inherits from the default policy).

In kaspconf.c, the default values now need to be parsed from string.

Introduce some variables so we don't need to do get_duration multiple
times on the same configuration option.

Finally, clang-format-14 decided to do some random formatting changes.
2022-06-28 11:56:31 +02:00
Matthijs Mekking
a28d919503 Move duration structure to libisccfg/duration
Having the duration structure and parsing code here, it becomes
more accessible to be used in other places.
2022-06-28 11:56:31 +02:00
Michał Kępień
2f945703f2 Fix destination port extraction for client queries
The current logic for determining the address of the socket to which a
client sent its query is:

 1. Get the address:port tuple from the netmgr handle using
    isc_nmhandle_localaddr().

 2. Convert the address:port tuple from step 1 into an isc_netaddr_t
    using isc_netaddr_fromsockaddr().

 3. Convert the address from step 2 back into a socket address with the
    port set to 0 using isc_sockaddr_fromnetaddr().

Note that the port number (readily available in the netmgr handle) is
needlessly lost in the process, preventing it from being recorded in
dnstap captures of client traffic produced by named.

Fix by first storing the address:port tuple returned by
isc_nmhandle_localaddr() in client->destsockaddr and then creating an
isc_netaddr_t from that structure.  This allows the port number to be
retained in client->destsockaddr, which is what subsequently gets passed
to dns_dt_send().
2022-06-22 13:45:46 +02:00
Michal Nowak
1c45a9885a Update clang to version 14 2022-06-16 17:21:11 +02:00
Artem Boldariev
e616d7f240 TLS DNS: do not call accept callback twice
Before the changes from this commit were introduced, the accept
callback function will get called twice when accepting connection
during two of these stages:

* when accepting the TCP connection;
* when handshake has completed.

That is clearly an error, as it should have been called only once. As
far as I understand it the mistake is a result of TLS DNS transport
being essentially a fork of TCP transport, where calling the accept
callback immediately after accepting TCP connection makes sense.

This commit fixes this mistake. It did not have any very serious
consequences because in BIND the accept callback only checks an ACL
and updates stats.
2022-06-15 14:21:11 +03:00
Aram Sargsyan
f6e729635f Fix a race condition between shutdown and route_connected()
When shutting down, the interface manager can be destroyed
before the `route_connected()` callback is called, which is
unexpected for the latter and can cause a crash.

Move the interface manager attachment code from the callback
to the place before the callback is registered using
`isc_nm_routeconnect()` function, which will make sure that
the interface manager will live at least until the callback
is called.

Make sure to detach the interface manager if the
`isc_nm_routeconnect()` function is not implemented, or when
the callback is called with a result value which differs from
`ISC_R_SUCCESS`.
2022-06-14 14:31:24 +00:00
Aram Sargsyan
1d93fe973b Do not use the interface manager until it is ready
The `ns_interfacemgr_create()` function, when calling
`isc_nm_routeconnect()`, uses the newly created `ns_interfacemgr_t`
instance before initializing its reference count and the magic value.

Defer the `isc_nm_routeconnect()` call until the initializations
are complete.
2022-06-14 14:31:24 +00:00
Michał Kępień
0d7e5d513c Assert on unknown isc_quota_attach() return values
The only values that the isc_quota_attach() function (called from
check_recursionquota() via recursionquotatype_attach_soft()) can
currently return are: ISC_R_SUCCESS, ISC_R_SOFTQUOTA, and ISC_R_QUOTA.
Instead of just propagating any other (unexpected) error up the call
stack, assert immediately, so that if the isc_quota_* API gets updated
in the future to return values currently matching the "default"
statement, check_recursionquota() can be promptly updated to handle such
new return values as desired.
2022-06-14 13:13:32 +02:00
Michał Kępień
8d64beb06f Use a switch statement in check_recursionquota()
Improve readability of the check_recursionquota() function by replacing
a sequence of conditional statements with a switch statement.
2022-06-14 13:13:32 +02:00
Michał Kępień
a06cdfc7b7 Add helper function for recursive-clients logging
Reduce code duplication in check_recursionquota() by extracting its
parts responsible for logging to a separate helper function.

Remove result text from the "no more recursive clients" message because
it always says "quota reached" (as the relevant branch is only evaluated
when 'result' is set to ISC_R_QUOTA) and therefore brings no additional
value.
2022-06-14 13:13:32 +02:00
Michał Kępień
7bc4425e2a Remove redundant recursion quota pointer checks
When the client->recursionquota pointer was overloaded by different
features, each of those features had to be aware of that fact and handle
any updates of that pointer gracefully.  Example: prefetch code
initiates recursion, attaching to client->recursionquota, then query
processing restarts due to a CNAME being encountered, then that CNAME is
not found in the cache, so another recursion is triggered, but
client->recursionquota is already attached to; even though it is not
CNAME chaining code that attached to that pointer, that code still has
to handle such a situation gracefully.

However, all features that can initiate recursion have now been updated
to use separate slots in the 'recursions' array, so keeping the old
checks in place means masking future programming errors that could
otherwise be caught - and should be caught because each feature needs to
properly maintain its own quota reference.

Remove outdated recursion quota pointer checks to enable the assertions
in isc_quota_*() functions to detect programming errors in code paths
that can start recursion.  Remove an outdated comment to prevent
confusion.
2022-06-14 13:13:32 +02:00
Michał Kępień
ccee3507c2 Check recursions at the end of request processing
ns_client_endrequest() currently contains code that looks for
outstanding quota references and cleans them up if necessary.  This
approach masks programming errors because ns_client_endrequest() is only
called from ns__client_reset_cb(), which in turn is only called when all
references to the client's netmgr handle are released, which in turn
only happens after all recursion completion callbacks are invoked
(because isc_nmhandle_attach() is called before every call to
dns_resolver_createfetch() in lib/ns/query.c and the completion callback
is expected to detach from the handle), which in turn is expected to
happen for all recursions attempts, even those that get canceled.

Furthermore, declaring the prototype of ns_client_endrequest() at the
top of lib/ns/client.c is redundant because the definition of that
function is placed before its first use in that file.  Remove the
redundant function prototype.

Finally, remove INSIST assertions ensuring quota pointers are NULL in
ns__client_reset_cb() because the latter calls ns_client_endrequest() a
few lines earlier.
2022-06-14 13:13:32 +02:00
Michał Kępień
e09b36f2cc Adjust recursion quota when starting a fetch fails
Some functions fail to detach from the recursion quota if an error
occurs while initiating recursion.  This causes the recursive client
counter to be off.  Add missing recursionquota_detach() calls, reworking
cleanup code where appropriate.
2022-06-14 13:13:32 +02:00
Michał Kępień
172e15f7ad Attach to separate recursion quota pointers
Similarly to how different code paths reused common client handle
pointers and fetch references despite being logically unrelated, they
also reuse client->recursionquota, the field in which a reference to the
recursion quota is stored.  This unnecessarily forces all code using
that field to be aware of the fact that it is overloaded by different
features.

Overloading client->recursionquota also causes inconsistent behavior.
For example, if prefetch code triggers recursion and then delegation
handling code also triggers recursion, only one of these code paths will
be able to attach to the recursion quota, but both recursions will be
started anyway.  In other words, each code path only checks whether the
recursion quota has not been exceeded if the quota has not yet been
attached to by another code path.  This behavior theoretically allows
the configured recursion quota to be slightly exceeded; while it is not
expected to be a real-world operational issue, it is still confusing and
should therefore be fixed.

Extend the structures comprising the 'recursions' array with a new field
holding a pointer to the recursion quota that a given recursion process
attached to.  Update all code paths using client->recursionquota so that
they use the appropriate slot in the 'recursions' array.  Drop the
'recursionquota' field from ns_client_t.
2022-06-14 13:13:32 +02:00
Michał Kępień
95e703121d Ensure ns_query_cancel() handles all recursions
Previously, multiple code paths reused client->query.fetch, so it was
enough for ns_query_cancel() to issue a single call to
dns_resolver_cancelfetch() with that fetch as an argument.  Now, since
each slot in the 'recursions' array can hold a reference to a separate
resolver fetch, ns_query_cancel() needs to handle all of them, so that
all recursion callbacks get a chance to clean up the associated
resources when a query is canceled.
2022-06-14 13:13:32 +02:00