32995 Commits

Author SHA1 Message Date
Matthijs Mekking
bb90cb2619 Merge branch '2289-cache-dump-stale-ttl-weird-values-9_16' into 'v9_16'
Fix nonsensical stale TTL values in cache dump (9.16)

See merge request isc-projects/bind9!4889
2021-04-13 09:43:07 +00:00
Matthijs Mekking
0d47f9f20f Use stale TTL as RRset TTL in dumpdb
It is more intuitive to have the countdown 'max-stale-ttl' as the
RRset TTL, instead of 0 TTL. This information was already available
in a comment "; stale (will be retained for x more seconds)", but
Support suggested to put it in the TTL field instead.

(cherry picked from commit a83c8cb0af)
2021-04-13 10:59:17 +02:00
Matthijs Mekking
7b17cc080e Check staleness in bind_rdataset
Before binding an RRset, check the time and see if this record is
stale (or perhaps even ancient). Marking a header stale or ancient
happens only when looking up an RRset in cache, but binding an RRset
can also happen on other occasions (for example when dumping the
database).

Check the time and compare it to the header. If according to the
time the entry is stale, but not ancient, set the STALE attribute.
If according to the time the entry is ancient, set the ANCIENT attribute.

We could mark the header stale or ancient here, but that requires
locking, so that's why we only compare the current time against
the rdh_ttl.
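
The comparison described above can be sketched as a pure function of the clock and the header's expiry time. This is only an illustration: the type and function names are hypothetical, and the exact boundary conditions are an assumption rather than a copy of the BIND 9 code.

```c
#include <stdint.h>

/* Hypothetical names; these are not the real BIND 9 types or attributes. */
typedef enum { RRSET_ACTIVE, RRSET_STALE, RRSET_ANCIENT } rrset_state_t;

/*
 * Classify an entry from the current time alone. The header itself is
 * not modified, so no lock is needed. rdh_ttl is the absolute expiry
 * time; serve_stale_ttl is how long a stale entry may be retained.
 */
static rrset_state_t
classify_rrset(uint32_t now, uint32_t rdh_ttl, uint32_t serve_stale_ttl) {
    if (rdh_ttl > now) {
        return RRSET_ACTIVE;
    }
    /* Widen to 64 bits so the sum cannot wrap around. */
    if ((uint64_t)rdh_ttl + serve_stale_ttl > now) {
        return RRSET_STALE;
    }
    return RRSET_ANCIENT;
}
```

A caller such as a database dump would then set the STALE or ANCIENT attribute on the bound rdataset according to the returned state.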

Adjust the test to check the dump-db before querying for data. In the
dumped file the entry should be marked as stale, even though no cache
lookup has happened since the initial query.

(cherry picked from commit debee6157b)
2021-04-13 10:59:10 +02:00
Matthijs Mekking
dcf6e3e58a Fix nonsensical stale TTL values in cache dump
When introducing change 5149, "rndc dumpdb" started to print a line
above a stale RRset, indicating how long the data will be retained.

At that time, I thought it should also be possible to load
a cache from file. But if a TTL has a value of 0 (because it is stale),
stale entries wouldn't be loaded from file. So, I added the
'max-stale-ttl' to TTL values, and adjusted the $DATE accordingly.

Since we actually don't have a "load cache from file" feature, this
is premature and is causing confusion among operators. This commit
changes the 'max-stale-ttl' adjustments.

A check in the serve-stale system test is added for a non-stale
RRset (longttl.example) to make sure the TTL in cache is sensible.

Also, the comment above stale RRsets could have nonsensical
values. A possible reason why this may happen is when the RRset was
marked as stale but the 'max-stale-ttl' has passed (and is actually an
RRset awaiting cleanup). This would lead to the "will be retained"
value being negative, but since it is stored in a uint32_t, you would
get a nonsensical value (e.g. 4294362497).

To mitigate against this, we now also check if the header is not
ancient. In addition we check if the stale_ttl would be negative, and
if so we set it to 0. Most likely this will not happen because the
header would already have been marked ancient, but there is a possible
race condition where the 'rdh_ttl + serve_stale_ttl' has passed,
but the header has not been checked for staleness.
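
The wraparound and the clamp can be illustrated with a small sketch. The function names are hypothetical and the arithmetic is a simplified stand-in for the real dump code; the input values are chosen only so that the naive result matches the example value quoted above.

```c
#include <stdint.h>

/* Naive computation: wraps around once the retention window has passed. */
static uint32_t
stale_ttl_naive(uint32_t rdh_ttl, uint32_t serve_stale_ttl, uint32_t now) {
    return rdh_ttl + serve_stale_ttl - now;
}

/* Clamped computation: never reports less than 0 seconds remaining. */
static uint32_t
stale_ttl_clamped(uint32_t rdh_ttl, uint32_t serve_stale_ttl, uint32_t now) {
    uint64_t retain_until = (uint64_t)rdh_ttl + serve_stale_ttl;
    uint64_t left;

    if (retain_until <= now) {
        return 0; /* retention window already over */
    }
    left = retain_until - now;
    return left > UINT32_MAX ? UINT32_MAX : (uint32_t)left;
}
```

With rdh_ttl = 1000, serve_stale_ttl = 3600 and now = 609399, the true remainder is -604799 seconds; the naive version returns 4294362497, while the clamped version returns 0.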

(cherry picked from commit 2a5e0232ed)
2021-04-13 10:59:00 +02:00
Mark Andrews
7c2b5495e0 Merge branch '2597-make-calling-generic-rdata-methods-consistent-v9_16' into 'v9_16'
Make calling generic rdata methods consistent

See merge request isc-projects/bind9!4843
2021-04-13 03:04:20 +00:00
Mark Andrews
f4331a48fa Make calling generic rdata methods consistent
Add matching macros to pass arguments from called methods
to generic methods.  This will reduce the amount of work
required when extending methods.

Also clean up unnecessary UNUSED declarations.

(cherry picked from commit a88d3963e2)
2021-04-13 01:54:29 +00:00
Mark Andrews
2adee41fce Merge branch '2622-command-line-option-l-not-shown-with-usage-message-v9_16' into 'v9_16'
Update named's usage description

See merge request isc-projects/bind9!4887
2021-04-13 01:53:26 +00:00
Mark Andrews
4864b69e95 Update named's usage description
(cherry picked from commit 38449de93b)
2021-04-13 11:35:13 +10:00
Michal Nowak
b3bebad281 Merge branch 'mnowak/gdb-for-killed-named-v9_16' into 'v9_16'
Run GDB for crashed named servers

See merge request isc-projects/bind9!4848
2021-04-08 10:28:19 +00:00
Michal Nowak
45bb2ae5f6 Run GDB for crashed named servers
When a core file was generated after named crashed during a system test
on 9.16, it wasn't processed by GDB, and no backtrace report was
created. This is now fixed. There are also a few white-space changes.
2021-04-08 11:53:32 +02:00
Michal Nowak
e00b69d2ee Merge branch 'mnowak/fix-missing-fromhex.pl-in-out-of-tree-v9_16' into 'v9_16'
[v9_16] Move fromhex.pl script to bin/tests/system/

See merge request isc-projects/bind9!4877
2021-04-08 09:51:14 +00:00
Michal Nowak
98d91e3024 Move fromhex.pl script to bin/tests/system/
The fromhex.pl script needs to be copied from the source directory to
the build directory before any test is run, otherwise the out-of-tree
build fails to find it. Given that the script is used only in system tests,
move it to bin/tests/system/.

(cherry picked from commit cd0a34df1b)
2021-04-08 11:11:23 +02:00
Michał Kępień
e02df06d8e Merge branch '2620-free-resources-when-gss_accept_sec_context-fails-v9_16' into 'v9_16'
[v9_16] Free resources when gss_accept_sec_context() fails

See merge request isc-projects/bind9!4874
2021-04-08 09:04:09 +00:00
Michał Kępień
ef4460949f Add CHANGES entry
(cherry picked from commit 7eb87270a4)
2021-04-08 10:41:09 +02:00
Michał Kępień
363902ce2c Free resources when gss_accept_sec_context() fails
Even if a call to gss_accept_sec_context() fails, it might still cause a
GSS-API response token to be allocated and left for the caller to
release.  Make sure the token is released before an early return from
dst_gssapi_acceptctx().

(cherry picked from commit d954e152d9)
2021-04-08 10:41:08 +02:00
Michał Kępień
21b0eac026 Merge branch 'michal/fix-triggering-rules-for-the-tarball-create-job' into 'v9_16'
Fix triggering rules for the "tarball-create" job

See merge request isc-projects/bind9!4871
2021-04-07 20:34:01 +00:00
Michał Kępień
233294d750 Fix triggering rules for the "tarball-create" job
Commit fd8ce68189 (a backport of commit
4d5d3b75da) did not account for the fact
that the "tarball-create" GitLab CI job is not created for manually
triggered pipelines.  This prevents manual pipeline creation from
succeeding as it causes the "gcc:tarball" job to have unsatisfied
dependencies.  Make sure the "tarball-create" job is created for
manually triggered pipelines to allow such pipelines to be started
again.
2021-04-07 22:31:09 +02:00
Ondřej Surý
66e243e64d Merge branch '2600-general-error-managed-keys-zone-dns_journal_compact-failed-no-more-v9_16' into 'v9_16'
Resolve "general: error: managed-keys-zone: dns_journal_compact failed: no more" (v9.16)

See merge request isc-projects/bind9!4870
2021-04-07 20:00:32 +00:00
Mark Andrews
2840fca4c5 Add CHANGES and release note for [GL #2600]
(cherry picked from commit 0174098aca)
2021-04-07 21:30:01 +02:00
Mark Andrews
dd2c7a3c8e Check that upgrade of managed-keys.bind.jnl succeeded
Update the system test to include a recoverable managed.keys journal created
with <size,serial0,serial1,0> transactions and test that it has been
updated as part of the start up process.

(cherry picked from commit bb6f0faeed)
2021-04-07 21:29:07 +02:00
Mark Andrews
7b93ff93d6 Rewrite managed-key journal immediately
Both managed keys and regular zone journals need to be updated
immediately when a recoverable error is discovered.

(cherry picked from commit 0fbdf189c7)
2021-04-07 21:29:07 +02:00
Mark Andrews
511ea2d3f3 Update dns_journal_compact() to handle bad transaction headers
Previously, dns_journal_begin_transaction() could reserve the wrong
amount of space.  We now check that the transaction is internally
consistent when upgrading / downgrading a journal and we also handle the
bad transaction headers.

(cherry picked from commit 83310ffd92)
2021-04-07 21:29:06 +02:00
Mark Andrews
6da2e05df9 Compute transaction size based on journal/transaction type
Previously, the code assumed that it was a new transaction.

(cherry picked from commit 520509ac7e)
2021-04-07 21:29:06 +02:00
Mark Andrews
d9ad7ccf2d Use journal_write_xhdr() to write the dummy transaction header
Instead of journal_write(), call journal_write_xhdr() to write the dummy
transaction header in the correct format.  It looks at j->header_ver1 to
determine which transaction header to write, instead of always writing a
zero-filled journal_rawxhdr_t header.

(cherry picked from commit 5a6112ec8f)
2021-04-07 21:29:06 +02:00
Diego dos Santos Fronza
25750e6436 Merge branch '2582-threadsanitizer-data-race-lib-dns-zone-c-10272-7-in-zone_maintenance-v9_16' into 'v9_16'
Resolve TSAN data race in zone_maintenance

See merge request isc-projects/bind9!4866
2021-04-07 13:25:17 +00:00
Diego Fronza
5d391f07c0 Resolve TSAN data race in zone_maintenance
Fix a race between the zone_maintenance and dns_zone_notifyreceive
functions: zone_maintenance was attempting to read a zone flag by calling
DNS_ZONE_FLAG(zone, flag) while dns_zone_notifyreceive was updating
a flag in the same zone by calling DNS_ZONE_SETFLAG(zone, ...).

The code reading the flag in zone_maintenance was not protected by the
zone's lock. To avoid a race, the zone's lock is now acquired
before an attempt to read the zone flag is made.
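
The locking pattern can be sketched with a plain POSIX mutex. The struct, flag bit, and function names below are hypothetical stand-ins, not the real dns_zone_t or the DNS_ZONE_FLAG / DNS_ZONE_SETFLAG macros, and the sketch assumes that the writer already holds the same lock while updating the flags.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define MINI_ZONEFLG_REFRESH 0x01U /* hypothetical flag bit */

/* Hypothetical miniature zone; not the real dns_zone_t. */
typedef struct {
    pthread_mutex_t lock;
    uint32_t flags;
} mini_zone_t;

/* Writer side (notify-receive path): update a flag under the lock. */
static void
mini_zone_setflag(mini_zone_t *zone, uint32_t flag) {
    pthread_mutex_lock(&zone->lock);
    zone->flags |= flag;
    pthread_mutex_unlock(&zone->lock);
}

/* Reader side (maintenance path): take the same lock before reading. */
static bool
mini_zone_flag(mini_zone_t *zone, uint32_t flag) {
    bool set;

    pthread_mutex_lock(&zone->lock);
    set = (zone->flags & flag) != 0;
    pthread_mutex_unlock(&zone->lock);
    return set;
}
```

Because both paths now serialize on one mutex, a tool such as ThreadSanitizer no longer sees an unsynchronized read racing with the write.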
2021-04-07 13:22:36 +00:00
Matthijs Mekking
834379b807 Merge branch '2608-stale-answer-client-timeout-default-off-v9_16' into 'v9_16'
Change default stale-answer-client-timeout to off (9.16)

See merge request isc-projects/bind9!4867
2021-04-07 13:16:03 +00:00
Matthijs Mekking
c63b533690 Change default stale-answer-client-timeout to off
Using "stale-answer-client-timeout" turns out to have unforeseen
negative consequences, and thus it is better to disable the feature
by default for the time being.

(cherry picked from commit e443279bbf)
2021-04-07 14:46:55 +02:00
Matthijs Mekking
cae34f759a Merge branch '2594-servestale-staleonly-recursion-race-v9_16' into 'v9_16'
Serve-stale "staleonly" recursion race condition

See merge request isc-projects/bind9!4860
2021-04-02 12:05:27 +00:00
Matthijs Mekking
194a72b3f1 If RPZ config'd, bail stale-answer-client-timeout
When we are recursing, RPZ processing is not allowed. But when we are
performing a lookup due to "stale-answer-client-timeout", we are still
recursing. This effectively means that RPZ processing is disabled on
such a lookup.

In this case, bail the "stale-answer-client-timeout" lookup and wait
for recursion to complete, as we can't perform the RPZ rewrite
rules reliably.

(cherry picked from commit 3d3a6415f7)
2021-04-02 13:29:27 +02:00
Matthijs Mekking
29bcd113ea Rename "staleonly"
The dboption DNS_DBFIND_STALEONLY caused confusion because it implies
we are looking for stale data **only** and ignore any active RRsets in
the cache. Rename it to DNS_DBFIND_STALETIMEOUT as it is more clear
the option is related to a lookup due to "stale-answer-client-timeout".

Rename other usages of "staleonly", instead use "lookup due to...".
Also rename related function and variable names.

(cherry picked from commit 839df94190)
2021-04-02 13:29:17 +02:00
Matthijs Mekking
34dd6521b1 Restore the RECURSIONOK attribute after staleonly
When doing a staleonly lookup we don't want to fall back to recursion.
After all, there are obviously problems with recursion, otherwise we
wouldn't do a staleonly lookup.

When resuming from recursion however, we should restore the
RECURSIONOK flag, allowing future required lookups for this client
to recurse.

(cherry picked from commit 3f81d79ffb)
2021-04-02 13:29:09 +02:00
Matthijs Mekking
114dc7888a Remove result exception on staleonly lookup
When implementing "stale-answer-client-timeout", we decided that
we should only return positive answers prematurely to clients. A
negative response is not useful, and in that case it is better to
wait for the recursion to complete.

To do so, we check the result and if it is not ISC_R_SUCCESS, we
decide that it is not good enough. However, there are more return
codes that could lead to a positive answer (e.g. CNAME chains).

This commit removes the exception and now uses the same logic that
other stale lookups use to determine if we found a useful stale
answer (stale_found == true).

This means we can simplify two test cases in the serve-stale system
test: nodata.example is no longer treated differently than data.example.

(cherry picked from commit aaed7f9d8c)
2021-04-02 13:28:59 +02:00
Matthijs Mekking
4b25333037 Add notes and changes for [#2594]
Pretty newsworthy.

(cherry picked from commit e44bcc6f53)
2021-04-02 13:28:48 +02:00
Matthijs Mekking
06823aa255 Remove INSIST on NS_QUERYATTR_ANSWERED
The NS_QUERYATTR_ANSWERED attribute is to prevent sending a response
twice. Without the attribute, this may happen if a staleonly lookup
found a useful answer and sends a response to the client, and later
recursion ends and also tries to send a response.

The attribute was also used to mask adding a duplicate RRset. This is
considered harmful. When we created a response to the client with a
stale only lookup (regardless of whether we have actually sent the response),
we should clear the rdatasets that were added during that lookup.

Mark such rdatasets with a new attribute,
DNS_RDATASETATTR_STALE_ADDED. Set a query attribute
NS_QUERYATTR_STALEOK if we may have added rdatasets during a stale
only lookup. Before creating a response on a normal lookup, check if
we can expect rdatasets to have been added during a staleonly lookup.
If so, clear the rdatasets from the message with the attribute
DNS_RDATASETATTR_STALE_ADDED set.
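
The mark-and-clear idea can be sketched on a flat array. The struct and the ATTR_STALE_ADDED bit are hypothetical stand-ins for dns_rdataset_t and DNS_RDATASETATTR_STALE_ADDED; the real message sections are linked lists, so this only illustrates the filtering step.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_STALE_ADDED 0x01U /* stand-in for DNS_RDATASETATTR_STALE_ADDED */

/* Hypothetical miniature rdataset. */
typedef struct {
    const char *owner;
    uint32_t attributes;
} mini_rdataset_t;

/*
 * Drop every rdataset that was added during a stale lookup, compacting
 * the array in place. Returns the number of rdatasets that remain.
 */
static size_t
clear_stale_added(mini_rdataset_t *sets, size_t count) {
    size_t kept = 0;

    for (size_t i = 0; i < count; i++) {
        if ((sets[i].attributes & ATTR_STALE_ADDED) == 0) {
            sets[kept++] = sets[i];
        }
    }
    return kept;
}
```

Running this before building the normal response means the answer from recursion can be added without colliding with an RRset left over from the stale lookup.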

(cherry picked from commit 3d5429f61f)
2021-04-02 13:28:08 +02:00
Matthijs Mekking
33d61b9651 Simplify when to detach the client
With stale-answer-client-timeout, we may send a response to the client,
but we may want to hold on to the network manager handle, because
recursion is going on in the background, or we need to refresh a
stale RRset.

Simplify the setting of 'nodetach':
* During a staleonly lookup we should not detach the nmhandle, so just
  set it prior to 'query_lookup()'.
* During a staleonly "stalefirst" lookup set the 'nodetach' to true
  if we are going to refresh the RRset.

Now there is no longer the need to clear the 'nodetach' if we go
through the "dbfind_stale", "stale_refresh_window", or "stale_only"
paths.

(cherry picked from commit 48b0dc159b)
2021-04-02 13:28:01 +02:00
Matthijs Mekking
b1496d19d5 Refactor stale lookups, ignore active RRsets
When doing a staleonly lookup, ignore active RRsets from cache. If we
don't, we may add a duplicate RRset to the message, and hit an
assertion failure in query.c because adding the duplicate RRset to the
ANSWER section failed.

This can happen on a race condition. When a client query is received,
the recursion is started. When 'stale-answer-client-timeout' triggers
around the same time the recursion completes, the following sequence
of events may happen:
1. Queue the "try stale" fetch_callback() event to the client task.
2. Add the RRsets from the authoritative response to the cache.
3. Queue the "fetch complete" fetch_callback() event to the client task.
4. Execute the "try stale" fetch_callback(), which retrieves the
   just-inserted RRset from the database.
5. In "ns_query_done()" we are still recursing, but the "staleonly"
   query attribute has already been cleared. In other words, the
   query will resume when recursion ends (it already has ended but is
   still on the task queue).
6. Execute the "fetch complete" fetch_callback(). It finds the answer
   from recursion in the cache again and tries to add the duplicate to
   the answer section.

This commit changes the logic for finding stale answers in the cache,
such that on "stale_only" lookups actually only stale RRsets are
considered. It refactors the code so that code paths for "dbfind_stale",
"stale_refresh_window", and "stale_only" are more clear.

First we call some generic code that applies in all three cases,
formatting the domain name for logging purposes, increment the
trystale stats, and check if we actually found stale data that we can
use.

The "dbfind_stale" lookup will return SERVFAIL if we didn't find a
usable answer, otherwise we will continue with the lookup
(query_gotanswer()). This is no different from before the introduction of
"stale-answer-client-timeout" and "stale-refresh-time".

The "stale_refresh_window" lookup is similar to the "dbfind_stale"
lookup: return SERVFAIL if we didn't find a usable answer, otherwise
continue with the lookup (query_gotanswer()).

Finally the "stale_only" lookup.

If the "stale_only" lookup was triggered because of an actual client
timeout (stale-answer-client-timeout > 0), and if database lookup
returned a stale usable RRset, trigger a response to the client.
Otherwise return and wait until the recursion completes (or the
resolver query times out).

If the "stale_only" lookup is a "stale-answer-client-timeout 0" lookup,
we prefer stale data over a lookup. In this case if there was no stale
data, or the data was not a positive answer, retry the lookup with the
stale options cleared, a.k.a. a normal lookup. Otherwise, continue
with the lookup (query_gotanswer()) and refresh the stale RRset. This
will trigger a response to the client, but will not detach the handle
because a fetch will be created to refresh the RRset.
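
The three code paths described above can be summarized as a small decision function. This is a sketch of the described behavior only: the enum and function names are hypothetical, and stale_found is assumed to mean "a usable, positive stale answer was found".

```c
#include <stdbool.h>

/* Hypothetical names for the three lookup kinds and possible outcomes. */
typedef enum {
    LOOKUP_DBFIND_STALE,
    LOOKUP_STALE_REFRESH_WINDOW,
    LOOKUP_STALE_ONLY
} stale_lookup_t;

typedef enum {
    ACT_SERVFAIL,             /* no usable stale answer */
    ACT_CONTINUE,             /* continue with query_gotanswer() */
    ACT_RESPOND_STALE,        /* answer the client, recursion goes on */
    ACT_WAIT_FOR_RECURSION,   /* return and wait for the fetch */
    ACT_RETRY_NORMAL,         /* retry with stale options cleared */
    ACT_CONTINUE_AND_REFRESH  /* answer and refresh the stale RRset */
} stale_action_t;

static stale_action_t
stale_next_action(stale_lookup_t kind, bool stale_found,
                  unsigned int client_timeout) {
    switch (kind) {
    case LOOKUP_DBFIND_STALE:
    case LOOKUP_STALE_REFRESH_WINDOW:
        return stale_found ? ACT_CONTINUE : ACT_SERVFAIL;
    case LOOKUP_STALE_ONLY:
        if (client_timeout > 0) {
            /* Triggered by an actual client timeout. */
            return stale_found ? ACT_RESPOND_STALE
                               : ACT_WAIT_FOR_RECURSION;
        }
        /* "stale-answer-client-timeout 0": prefer stale data. */
        return stale_found ? ACT_CONTINUE_AND_REFRESH : ACT_RETRY_NORMAL;
    }
    return ACT_SERVFAIL; /* not reached for valid input */
}
```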

(cherry picked from commit 92f7a67892)
2021-04-02 13:27:52 +02:00
Matthijs Mekking
fcf8fb4f39 Keep track of allow client detach
The stale-answer-client-timeout feature introduced a dependency on
when a client may be detached from the handle. The dboption
DNS_DBFIND_STALEONLY was reused to track this attribute. This overloads
the meaning of this database option, and actually introduced a bug
because the option was checked in other places. In particular, in
'ns_query_done()' there is a check for 'RECURSING(qctx->client) &&
(!QUERY_STALEONLY(&qctx->client->query) || ...' and the condition is
satisfied because recursion has not completed yet and
DNS_DBFIND_STALEONLY is already cleared by that time (in
query_lookup()), because we found a useful answer and we should detach
the client from the handle after sending the response.

Add a new boolean to the client structure to keep track of whether client
detach from the handle is allowed. It is only disallowed if we are
in a staleonly lookup and we didn't find a useful answer.

(cherry picked from commit fee164243f)
2021-04-02 13:27:43 +02:00
Ondřej Surý
bcae8ec0ef Merge branch '2607-remove-custom-spnego-v9_16' into 'v9_16'
Remove custom ISC SPNEGO implementation (v9.16)

See merge request isc-projects/bind9!4855
2021-04-01 14:14:13 +00:00
Mark Andrews
99132eda0e Add CHANGES and release note for GL #2607
2021-04-01 16:11:25 +02:00
Ondřej Surý
565a6a5679 Move the dummy shims to single ifndef GSSAPI block
Previously, every function had its own #ifdef GSSAPI #else #endif block
that defined a shim function in case GSSAPI was not being used.  Now the
dummy shim functions have been split out into a single #else #endif block
at the end of the file.

This makes the gssapictx.c similar to 9.17.x code, making the backports
and reviews easier.
2021-04-01 10:42:32 +02:00
Mark Andrews
3fd30e1634 Add Heimdal compatibility support
The Heimdal Kerberos library handles the OID sets in a different manner.
Unify the handling of the OID sets between MIT and Heimdal
implementations by dynamically creating the OID sets instead of using
a static predefined set.  This is how upstream recommends handling the
OID sets.
2021-04-01 10:42:32 +02:00
Mark Andrews
6b0b0c6aba Request krb5 CFLAGS and LIBS from $KRB5_CONFIG
The GSSAPI now needs both gssapi and krb5 libraries, so we need to
request both CFLAGS and LIBS from the configure script.
2021-04-01 10:42:32 +02:00
Mark Andrews
a875dcc669 Remove custom ISC SPNEGO implementation
The custom ISC SPNEGO mechanism implementation is no longer needed on
the basis that all major Kerberos 5/GSSAPI (mit-krb5, heimdal and
Windows) implementations have supported the SPNEGO mechanism since 2006.

This commit removes the custom ISC SPNEGO implementation, and removes
the option from both autoconf and win32 Configure script.  Unknown
options are being ignored, so this doesn't require any special handling.
2021-04-01 10:42:32 +02:00
Mark Andrews
216a97188d Handle expected signals in tsiggss authsock.pl script
When the authsock.pl script was terminated by a signal,
it would leave the pidfile around.  This commit adds a signal
handler that cleans up the pidfile on signals that are expected.
2021-04-01 09:58:19 +02:00
Michal Nowak
f8c6872beb Merge branch 'mnowak/web-run-gcc-tarball-ci-job-v9_16' into 'v9_16'
[v9_16] Run gcc:tarball CI job in web-triggered pipelines

See merge request isc-projects/bind9!4852
2021-03-31 15:03:57 +00:00
Michal Nowak
fd8ce68189 Run gcc:tarball CI job in web-triggered pipelines
The gcc:tarball CI job may identify problems with tarballs created by
"make dist" of the tarball-create CI job. Enabling the gcc:tarball CI
job in web-triggered pipelines provides developers with a test vector.

(cherry picked from commit 4d5d3b75da)
2021-03-31 16:53:51 +02:00
Ondřej Surý
6e4eaa780d Merge branch 'cherry-pick-19b69e9a' into 'v9_16'
Do not require config.h to use isc/util.h (v9.16)

See merge request isc-projects/bind9!4842
2021-03-26 19:00:03 +00:00
Ondřej Surý
ee7283b3ee Merge branch 'bind-dyndb-ldap-v9.16.13' into 'main'
Do not require config.h to use isc/util.h

See merge request isc-projects/bind9!4840

(cherry picked from commit 19b69e9a3b)

81eb3396 Do not require config.h to use isc/util.h
2021-03-26 18:48:06 +00:00
Diego dos Santos Fronza
1c7b15151b Merge branch '2490-dig-tcp-does-not-honor-tries-1-nor-retry-0-v9_16' into 'v9_16'
Resolve "dig +tcp does not honor +tries=1 nor +retry=0"

See merge request isc-projects/bind9!4839
2021-03-25 17:59:37 +00:00