When pk11_numbits() is passed a user provided input that contains all
zeroes (via crafted DNS message), it would crash with assertion
failure. Fix that by properly handling such input.
QNAME minimization is normally disabled when forwarding. if, in the
course of processing a fetch, we switch back to normal recursion at
some point, we can't safely start minimizing because we may have
been left in an inconsistent state.
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.
This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
The only arm64 runner we have at our disposal is suffering from
intermittent connectivity issues which make it unusable for extended
periods of time. Remove arm64 jobs from GitLab CI until we manage to
set up an arm64 runner with more reliable connectivity.
(cherry picked from commit 49f245f7c0)
The named configuration files used in the "geoip2" system test cause a
rather large number of views (6-8) to be set up in each tested named
instance. Each view has its own cache.
Commit aa72c31422 caused the RBT hash
table to be pre-allocated to a size derived from "max-cache-size", so
that it never needs to be rehashed. The size of that hash table is not
expected to be significant enough to cause memory use issues in typical
conditions even for large "max-cache-size" settings.
However, these two factors combined can cause memory exhaustion issues
in GitLab CI, where we run multiple "instances" of the test suite in
parallel on the same runner, each test suite executes multiple system
tests concurrently, and each system test may potentially start multiple
named instances at the same time. In practice, this problem currently
only seems to be affecting the "geoip2" system test, which is failing
intermittently due to named instances used by that test getting killed
by oom-killer.
Prevent the "geoip2" system test from failing intermittently by setting
"max-cache-size" in named configuration files used in that test to a low
value in order to keep memory usage at bay even with a large number of
views configured.
(cherry picked from commit 4292d5bdfe)
In 9.17 we introduced 'primaries' as a synonym for 'masters' in the
configuration file. This synonym has not been backported so change
the serve-stale test to make use of the 'masters' keyword.
Add a fifth named (ns5) that runs with `stale-cache-enable no;` and
check that there are no stale records in the cache.
(cherry picked from commit abc2ab9223)
When a received RRSet has TTL 0, they would be preserved for
serve-stale (default `max-stale-cache` is 12 hours) rather than expiring
them quickly from the cache database.
This commit makes sure the RRSet didn't have TTL 0 before marking the
entry in the database as "stale".
(cherry picked from commit 6ffa2ddae0)
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).
This allows DNS operators to turn the serve-stale answers in an event of
large authoritative DNS outage. The caching of the stale answers needs
to be enabled before the outage happens or the feature would be
otherwise useless.
The negative consequence of the default setting is the inevitable
cache-bloat that happens for every and each DNS operator running named.
In this MR, a new configuration option `stale-cache-enable` is
introduced that allows the operators to selectively enable or disable
the serve-stale feature of BIND 9 based on their decision.
The newly introduced option has been disabled by default,
e.g. serve-stale is disabled in the default configuration and has to be
enabled if required.
(cherry picked from commit ce53db34d6)
Now that the log message has been printed set the result code to
DNS_R_FORMERR. We don't do this via dns_result_torcode() as we
don't want upstream errors to produce FORMERR if that processing
end with DNS_R_BADTSIG.
(cherry picked from commit 20488d6ad3)
The basic scenario for the problem was that in the process of
resolving a query, if any rrset was eligible for prefetching, then it
would trigger a call to query_prefetch(), this call would run in
parallel to the normal query processing.
The problem arises due to the fact that both query_prefetch(), and,
in the original thread, a call to ns_query_recurse(), try to attach
to the recursionquota, but recursing client stats counter is only
incremented if ns_query_recurse() attachs to it first.
Conversely, if fetch_callback() is called before prefetch_done(),
it would not only detach from recursionquota, but also decrement
the stats counter, if query_prefetch() attached to te quota first
that would result in a decrement not matched by an increment, as
expected.
To solve this issue an atomic bool was added, it is set once in
ns_query_recurse(), allowing fetch_callback() to check for it
and decrement stats accordingly.
For a more compreensive explanation check the thread comment below:
https://gitlab.isc.org/isc-projects/bind9/-/issues/1719#note_145857
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.
(cherry picked from commit a0f7d28967)
Running system tests with root privileges is potentially dangerous.
Only allow it when explicitly requested (by building with
--enable-developer).
(cherry picked from commit 3ef106f69d)
Created isc_refcount_decrement_expect macro to test conditionally
the return value to ensure it is in expected range. Converted
unchecked isc_refcount_decrement to use isc_refcount_decrement_expect.
Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect.
(cherry picked from commit bde5c7632a)
When silencing the Coverity warning in remove_old_tsversions(), the code
was refactored to reduce the indentation levels and break down the long
code into individual functions. This improve fix for [GL #1989].
(cherry picked from commit aca18b8b5b)