Commit Graph

9983 Commits

Author SHA1 Message Date
Mark Andrews
4864b69e95 Update named's usage description
(cherry picked from commit 38449de93b)
2021-04-13 11:35:13 +10:00
Michal Nowak
45bb2ae5f6 Run GDB for crashed named servers
When a core file was generated after named crashed during a system test
on 9.16, it wasn't processed by GDB, and no backtrace report was
created. This is now fixed. There are also a few white-space changes.
2021-04-08 11:53:32 +02:00
Michal Nowak
98d91e3024 Move fromhex.pl script to bin/tests/system/
The fromhex.pl script needs to be copied from the source directory to
the build directory before any test is run, otherwise the out-of-tree
fails to find it. Given that the script is used only in system test,
move it to bin/tests/system/.

(cherry picked from commit cd0a34df1b)
2021-04-08 11:11:23 +02:00
Mark Andrews
dd2c7a3c8e Check that upgrade of managed-keys.bind.jnl succeeded
Update the system to include a recoverable managed.keys journal created
with <size,serial0,serial1,0> transactions and test that it has been
updated as part of the start up process.

(cherry picked from commit bb6f0faeed)
2021-04-07 21:29:07 +02:00
Matthijs Mekking
c63b533690 Change default stale-answer-client-timeout to off
Using "stale-answer-client-timeout" turns out to have unforeseen
negative consequences, and thus it is better to disable the feature
by default for the time being.

(cherry picked from commit e443279bbf)
2021-04-07 14:46:55 +02:00
Matthijs Mekking
114dc7888a Remove result exception on staleonly lookup
When implementing "stale-answer-client-timeout", we decided that
we should only return positive answers prematurely to clients. A
negative response is not useful, and in that case it is better to
wait for the recursion to complete.

To do so, we check the result and if it is not ISC_R_SUCCESS, we
decide that it is not good enough. However, there are more return
codes that could lead to a positive answer (e.g. CNAME chains).

This commit removes the exception and now uses the same logic that
other stale lookups use to determine if we found a useful stale
answer (stale_found == true).

This means we can simplify two test cases in the serve-stale system
test: nodata.example is no longer treated differently than data.example.

(cherry picked from commit aaed7f9d8c)
2021-04-02 13:28:59 +02:00
Mark Andrews
216a97188d Handle expected signals in tsiggss authsock.pl script
When the authsock.pl script would be terminated with a signal,
it would leave the pidfile around.  This commit adds a signal
handler that cleanups the pidfile on signals that are expected.
2021-04-01 09:58:19 +02:00
Diego Fronza
db2c180feb Update dig's man page
Adjusted man page entries for +tries and +retry options to reflect the
fact that now those options apply to TCP as well.
2021-03-25 14:33:50 -03:00
Diego Fronza
9c2d52bcdb Added tests for tries=1 and retry=0 on TCP EOF
Added tests to ensure that dig won't retry sending a query over tcp
(+tcp) when a TCP connection is closed prematurely (EOF is read) if
either +tries=1 or retry=0 is specified on the command line.
2021-03-25 14:33:08 -03:00
Diego Fronza
d299f0e9ab Adjusted dig system tests
Now that premature EOF on tcp connections take +tries and +retry into
account, the dig system tests handling TCP EOF with +tries=1 were
expecting dig to do a second attempt in handling the tcp query, which
doesn't happen anymore.

To make the test work as expected +tries value was adjusted to 2, to
make it behave as before after the new update on dig.
2021-03-25 14:32:46 -03:00
Diego Fronza
1e5f3e6fa3 Don't retry +tcp queries on failure if tries=1 or retries=0
Before this commit, a premature EOF (connection closed) on tcp queries
was causing dig to automatically attempt to send the query again, even
if +tries=1 or +retries=0 was provided on command line.

This commit fix the problem by taking into account the no. of retries
specified by the user when processing a premature EOF on tcp
connections.
2021-03-25 14:32:37 -03:00
Matthijs Mekking
a36257ea60 Fix some intermittent kasp failures
When calling "rndc dnssec -checkds", it may take some milliseconds
before the appropriate changes have been written to the state file.
Add retry_quiet mechanisms to allow the write operation to finish.

Also retry_quiet the check for the next key event. A "rndc dnssec"
command may trigger a zone_rekey event and this will write out
a new "next key event" log line, but it may take a bit longer than
than expected in the tests.

(cherry picked from commit 82d667e1d5)
2021-03-22 15:35:22 +01:00
Matthijs Mekking
d12b40f6fb Rekey immediately after rndc checkds/rollover
Call 'dns_zone_rekey' after a 'rndc dnssec -checkds' or 'rndc dnssec
-rollover' command is received, because such a command may influence
the next key event. Updating the keys immediately avoids unnecessary
rollover delays.

The kasp system test no longer needs to call 'rndc loadkeys' after
a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command.

(cherry picked from commit 82f72ae249)
2021-03-22 15:35:22 +01:00
Matthijs Mekking
1f8c5786f8 Delete CDS/CDNSKEY records when zone is unsigned
CDS/CDNSKEY DELETE records are only useful if they are signed,
otherwise the parent cannot verify these RRsets anyway. So once the DS
has been removed (and signaled to BIND), we can remove the DNSKEY and
RRSIG records, and at this point we can also remove the CDS/CDNSKEY
records.

(cherry picked from commit 6f31f62d69)
2021-03-22 13:57:10 +01:00
Matthijs Mekking
7882c7fbea Allow CDS/CDNSKEY DELETE records in unsigned zone
While not useful, having a CDS/CDNSKEY DELETE record in an unsigned
zone is not an error and "named-checkzone" should not complain.

(cherry picked from commit f211c7c2a1)
2021-03-22 13:31:02 +01:00
Matthijs Mekking
fe09becc7e Retry quiet check keys
Change the 'check_keys' function to try three times. Some intermittent
kasp test failures are because we are inspecting the key files
before the actual change has happen. The 'retry_quiet' approach allows
for a bit more time to let the write operation finish.

(cherry picked from commit d5531df79a)
2021-03-22 11:24:55 +01:00
Matthijs Mekking
68e9603ed8 Test keymgr2kasp state from timing metadata
Add two test zones that migrate to dnssec-policy. Test if the key
states are set accordingly given the timing metadata.

The rumoured.kasp zone has its Publish/Active/SyncPublish times set
not too long ago so the key states should be set to RUMOURED. The
omnipresent.kasp zone has its Publish/Active/SyncPublish times set
long enough to set the key states to OMNIPRESENT.

Slightly change the init_migration_keys function to set the
key lifetime to "none" (legacy keys don't have lifetime). Then in the
test case set the expected key lifetime explicitly.

(cherry picked from commit c40c1ebcb1)
2021-03-22 11:24:55 +01:00
Matthijs Mekking
177ceb6cda Editorial commit keymgr2kasp test
This commit is somewhat editorial as it does not introduce something
new nor fixes anything.

The layout in keymgr2kasp/tests.sh has been changed, with the
intention to make more clear where a test scenario ends and begins.

The publication time of some ZSKs has been changed. It makes a more
clear distinction between publication time and activation time.

(cherry picked from commit f6fa254256)
2021-03-22 11:24:55 +01:00
Matthijs Mekking
e91f53cc6e Introduce kasp.sh
Add a script similar to conf.sh to include common functions and
variables for testing KASP. Currently used in kasp, keymgr2kasp, and
nsec3.

(cherry picked from commit ecb073bdd6)
2021-03-22 11:24:55 +01:00
Matthijs Mekking
2fa68d985f Move kasp migration tests to different directory
The kasp system test was getting pretty large, and more tests are on
the way. Time to split up. Move tests that are related to migrating
to dnssec-policy to a separate directory 'keymgr2kasp'.

(cherry picked from commit 5389172111)
2021-03-22 11:24:55 +01:00
Patrick McLean
702edde73a dig: Use high resolution clocks when microsecond accuracy is requested
The TIME_NOW macro calls isc_time_now which uses CLOCK_REALTIME_COARSE
for getting the current time. This is perfectly fine for millisecond,
however when the user request microsecond resolutiuon, they are going
to get very inaccurate results. This is especially true on a server
class machine where the clock ticks may be set to 100HZ.

This changes dig to use the new TIME_NOW_HIRES macro that uses the
CLOCK_MONOTONIC_RAW that is more expensive, but gets the *actual*
current time rather than the at the last kernel time tick.

(cherry picked from commit 56cef1495f)
2021-03-20 12:00:59 -07:00
Ondřej Surý
db49ffca20 Change the isc_nm_(get|set)timeouts() to work with milliseconds
The RFC7828 specifies the keepalive interval to be 16-bit, specified in
units of 100 milliseconds and the configuration options tcp-*-timeouts
are following the suit.  The units of 100 milliseconds are very
unintuitive and while we can't change the configuration and presentation
format, we should not follow this weird unit in the API.

This commit changes the isc_nm_(get|set)timeouts() functions to work
with milliseconds and convert the values to milliseconds before passing
them to the function, not just internally.
2021-03-18 15:16:13 +01:00
Ondřej Surý
94a32eaf29 Add TCP timeouts system test
The system tests were missing a test that would test tcp-initial-timeout
and tcp-idle-timeout.

This commit adds new "timeouts" system test that adds:

  * Test that waits longer than tcp-initial-timeout and then checks
    whether the socket was closed

  * Test that sends and receives DNS message then waits longer than
    tcp-initial-timeout but shorter time than tcp-idle-timeout than
    sends DNS message again than waits longer than tcp-idle-timeout
    and checks whether the socket was closed

  * Similar test, but bursting 25 DNS messages than waiting longer than
    tcp-initial-timeout and shorter than tcp-idle-timeout than do second
    25 DNS message burst

  * Check whether transfer longer than tcp-initial-timeout succeeds
2021-03-18 15:16:13 +01:00
Matthijs Mekking
937e10a5f4 Add test for thaw dynamic kasp zone
Add a test for freezing, manually updating, and then thawing a dynamic
zone with "dnssec-policy". In the kasp system test we add parameters
to the "update_is_signed" check to signal the indicated IP addresses
for the labels "a" and "d". If set to '-', the test is skipped.

After nsupdating the dynamic.kasp zone, we revert the update (with
nsupdate) and update the zone again, but now with the freeze/thaw
approach.

(cherry picked from commit 0cae3249e3)
2021-03-17 11:12:48 +01:00
Mark Andrews
8dc5d63e1d Ignore the actual error code returned by getaddrinfo
when testing if interactive mode continues or not on
invalid hostname.  We only need to detect that getaddrinfo
failed and that we continued or not.

(cherry picked from commit 25d1276170)
2021-03-16 11:12:47 +11:00
Matthijs Mekking
96953fc293 Fix servestale fetchlimits crash
When we query the resolver for a domain name that is in the same zone
for which is already one or more fetches outstanding, we could
potentially hit the fetch limits. If so, recursion fails immediately
for the incoming query and if serve-stale is enabled, we may try to
return a stale answer.

If the resolver is also is authoritative for the parent zone (for
example the root zone), first a delegation is found, but we first
check the cache for a better response.

Nothing is found in the cache, so we try to recurse to find the
answer to the query.

Because of fetch-limits 'dns_resolver_createfetch()' returns an error,
which 'ns_query_recurse()' propagates to the caller,
'query_delegation_recurse()'.

Because serve-stale is enabled, 'query_usestale()' is called,
setting 'qctx->db' to the cache db, but leaving 'qctx->version'
untouched. Now 'query_lookup()' is called to search for stale data
in the cache database with a non-NULL 'qctx->version'
(which is set to a zone db version), and thus we hit an assertion
in rbtdb.

This crash was introduced in 'v9_16' by commit
2afaff75ed.

(cherry picked from commit 87591de6f7)
2021-03-11 13:47:20 +01:00
Mark Andrews
ce5dc75e48 Move cleanup of queries to later in the shutdown sequence
to avoid TSAN report

    WARNING: ThreadSanitizer: data race
      Write of size 8 at 0x000000000001 by main thread:
        #0 free <null>
        #1 default_memfree lib/isc/mem.c:440
        #2 mem_put lib/isc/mem.c:363
        #3 isc__mem_free lib/isc/mem.c:1012
        #4 main bin/tools/mdig.c:2231

      Previous read of size 1 at 0x000000000005 by thread T1:
        #0 dns_name_fromtext lib/dns/name.c:1121
        #1 sendquery bin/tools/mdig.c:596
        #2 sendqueries bin/tools/mdig.c:779
        #3 dispatch lib/isc/task.c:1153
        #4 run lib/isc/task.c:1345
        #5 isc__trampoline_run lib/isc/trampoline.c:184
        #6 <null> <null>

      Thread T1 (running) created by main thread at:
        #0 pthread_create <null>
        #1 isc_thread_create pthreads/thread.c:79
        #2 isc_taskmgr_create lib/isc/task.c:1435
        #3 main bin/tools/mdig.c:2148

    SUMMARY: ThreadSanitizer: data race in __interceptor_free

(cherry picked from commit 4015af02d8)
2021-03-04 15:02:07 +01:00
Ondřej Surý
7a8193efba Use int type to store result from isc_commandline_parse()
The C standard actually doesn't define char as signed or unsigned, and
it could be either according to underlying architecture.  It turns out
that while it's usually signed type, it isn't on arm64 where it's
unsigned.

isc_commandline_parse() return int, just use that instead of the char.

(cherry picked from commit 8153729d3a)
2021-03-04 11:21:26 +01:00
Evan Hunt
bda028e0ee create 'journal' system test
tests that version 1 journal files containing version 1 transaction
headers are rolled forward correctly on server startup, then updated
into version 2 journals. also checks journal file consistency and
'max-journal-size' behavior.

(cherry picked from commit a0aefa1de6)
2021-03-03 19:21:16 -08:00
Mark Andrews
5aea511e1b extend named-journalprint to be able to force the journal version
named-journalprint can now upgrade or downgrade a journal file
in place; the '-u' option upgrades and the '-d' option downgrades.

(cherry picked from commit fb2d0e2897)
2021-03-03 19:19:50 -08:00
Evan Hunt
47a274e9f1 allow dns_journal_rollforward() to read old journal files
when the 'max-ixfr-ratio' option was added, journal transaction
headers were revised to include a count of RR's in each transaction.
this made it impossible to read old journal files after an upgrade.

this branch restores the ability to read version 1 transaction
headers. when rolling forward, printing journal contents, if
the wrong transaction header format is found, we can switch.

when dns_journal_rollforward() detects a version 1 transaction
header, it returns DNS_R_RECOVERABLE.  this triggers zone_postload()
to force a rewrite of the journal file in the new format, and
also to schedule a dump of the zone database with minimal delay.
journal repair is done by dns_journal_compact(), which rewrites
the entire journal, ignoring 'max-journal-size'. journal size is
corrected later.

newly created journal files now have "BIND LOG V9.2" in their headers
instead of "BIND LOG V9". files with the new version string cannot be
read using the old transaction header format. note that this means
newly created journal files will be rejected by older versions of named.

named-journalprint now takes a "-x" option, causing it to print
transaction header information before each delta, including its
format version.

(cherry picked from commit ee19966326)
2021-03-03 19:19:50 -08:00
Matthijs Mekking
acc95d4e1d Don't servfail on staleonly lookups
When a staleonly lookup doesn't find a satisfying answer, it should
not try to respond to the client.

This is not true when the initial lookup is staleonly (that is when
'stale-answer-client-timeout' is set to 0), because no resolver fetch
has been created at this point. In this case continue with the lookup
normally.

(cherry picked from commit f8b7b597e9)
2021-02-25 12:07:34 +01:00
Matthijs Mekking
84deb57bc3 Don't allow recursion on staleonly lookups
Fix a crash that can happen in the following scenario:

A client request is received. There is no data for it in the cache,
(not even stale data). A resolver fetch is created as part of
recursion.

Some time later, the fetch still hasn't completed, and
stale-answer-client-timeout is triggered. A staleonly lookup is
started. It will also find no data in the cache.

So 'query_lookup()' will call 'query_gotanswer()' with ISC_R_NOTFOUND,
so this will call 'query_notfound()' and this will start recursion.

We will eventually end up in 'ns_query_recurse()' and that requires
the client query fetch to be NULL:

    REQUIRE(client->query.fetch == NULL);

If the previously started fetch is still running this assertion
fails.

The crash is easily prevented by not requiring recursion for
staleonly lookups.

Also remove a redundant setting of the staleonly flag at the end of
'query_lookup_staleonly()' before destroying the query context.

Add a system test to catch this case.

(cherry picked from commit 9e061faaae)
2021-02-25 12:07:27 +01:00
Matthijs Mekking
ddfb9ea8c1 Add tests for NSEC3 on dynamic zones
GitLab issue #2498 is a bug report on NSEC3 with dynamic zones. Tests
for it in the nsec3 system test directory were missing.

(cherry picked from commit 0c0f10b53f)
2021-02-25 10:55:51 +01:00
Mark Andrews
c5ad174129 Silence CID 320481: Null pointer dereferences
*** CID 320481:  Null pointer dereferences  (REVERSE_INULL)
    /bin/tests/wire_test.c: 261 in main()
    255     			process_message(input);
    256     		}
    257     	} else {
    258     		process_message(input);
    259     	}
    260
       CID 320481:  Null pointer dereferences  (REVERSE_INULL)
       Null-checking "input" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
    261     	if (input != NULL) {
    262     		isc_buffer_free(&input);
    263     	}
    264
    265     	if (printmemstats) {
    266     		isc_mem_stats(mctx, stdout);

(cherry picked from commit 658c950d7b)
2021-02-24 00:08:57 +11:00
Matthijs Mekking
9b4e067206 Minor kasp test fixes
Two minor fixes in the kasp system test:

1. A wrong comment in ns3/setup.sh (we are subtracting 2 hours, not
   adding them).
2. 'get_keyids' used bad parameters "$1" "$2" when 'check_numkeys'
   failed. Also, 'check_numkeys' can use $DIR, $ZONE, and $NUMKEYS
   directly, no need to pass them.

(cherry picked from commit 5be26898c0)
2021-02-23 09:19:23 +01:00
Matthijs Mekking
fc9dcbf419 Test purge-keys option
Add some more zones to the kasp system test to test the 'purge-keys'
option. Three zones test that the predecessor key files are removed
after the purge keys interval, one test checks that the key files
are retained if 'purge-keys' is disabled. For that, we change the
times to 90 days in the past (the default value for 'purge-keys').

(cherry picked from commit 6333ff15f0)
2021-02-23 09:19:11 +01:00
Matthijs Mekking
45dcabf411 Add purge-keys config option
Add a new option 'purge-keys' to 'dnssec-policy' that will purge key
files for deleted keys. The option determines how long key files
should be retained prior to removing the corresponding files from
disk.

If set to 0, the option is disabled and 'named' will not remove key
files from disk.

(cherry picked from commit 313de3a7e2)
2021-02-23 09:18:55 +01:00
Michal Nowak
5ded078daa Use FEATURETEST variable instead of a path
feature-test tool location needs to be determined by its associated
variable; otherwise, the tool is not found on Windows:

    setup.sh: line 22: ../feature-test: No such file or directory

(cherry picked from commit 102f012631)
2021-02-18 15:47:57 +01:00
Matthijs Mekking
1ffe0accf5 Fix eddsa system test
Use the correct conf.sh setup commands in ns3/sign.sh
2021-02-18 08:37:48 +00:00
Michal Nowak
c65c3d5153 Prevent Git to expand $systest
CentOS 8 "git status" unexpectedly expands search directory "tsig" to
also search in the "tsiggss" directory, thus incorrectly identifying
files as "not removed" in the "tsig" directory:

$ git status -su --ignored tsig
$ touch tsiggss/ns1/{named.run,named.memstats}
$ git status -su --ignored tsig
!! tsiggss/ns1/named.memstats
!! tsiggss/ns1/named.run

(cherry picked from commit f310b75250)
2021-02-18 08:20:54 +01:00
Michal Nowak
78c5a80817 Clean omitted files from system tests
Any CI job:
- I:dnssec:file dnssec/ns1/trusted.keys not removed
- I:rpzrecurse:file rpzrecurse/ns3/named.run.prev not removed

system:gcc:sid:amd64:
- I:mirror:file mirror/ns3/_default.nzf not removed

system:gcc:xenial:amd64:
- I:shutdown:file shutdown/.cache/v/cache/lastfailed not removed

(cherry picked from commit 14a104d121)
2021-02-18 08:20:54 +01:00
Michal Nowak
382ace6db6 Add system test name to "file not removed" info
(cherry picked from commit 10bf725ee2)
2021-02-18 08:20:54 +01:00
Michal Nowak
59499ecc3a Merge UNTESTED and SKIPPED system test results
Descriptions of UNTESTED and SKIPPED system test results are very
similar to one another and it may be confusing when to pick one and
when the other. Merging these two system test results removes the
confusion.

(cherry picked from commit 29d7c6e449)
2021-02-17 12:06:33 +01:00
Mark Andrews
d51b78c85b Stop including <gssapi.h> from <dst/gssapi.h> header
The only reason for including the gssapi.h from the dst/gssapi.h header
was to get the typedefs of gss_cred_id_t and gss_ctx_id_t.  Instead of
using those types directly this commit introduces dns_gss_cred_id_t and
dns_gss_ctx_id_t types that are being used in the public API and
privately retyped to their counterparts when we actually call the gss
api.

This also conceals the gssapi headers, so users of the libdns library
doesn't have to add GSSAPI_CFLAGS to the Makefile when including libdns
dst API.
2021-02-16 12:08:21 +11:00
Ondřej Surý
4bbe3e75de Stop including dnstap headers from <dns/dnstap.h>
The <fstrm.h> and <protobuf-c/protobuf-c.h> headers are only directly
included where used and we stopped exposing those headers from libdns
headers.
2021-02-16 12:08:21 +11:00
Mark Andrews
bf5aac225b Stop including <lmdb.h> from <dns/lmdb.h>
The lmdb.h header doesn't have to be included from the dns/lmdb.h
header as it can be separately included where used.  This stops
exposing the inclusion of lmdb.h from the libdns headers.
2021-02-16 12:08:21 +11:00
Mark Andrews
b8fc8742e5 Re-order include directories
${FSTRM_CFLAGS} ${PROTOBUF_C_CFLAGS} ${OPENSSL_CFLAGS} ${LMDB_CFLAGS}
need to appear after all directories in the build tree.
2021-02-16 12:08:21 +11:00
Diego Fronza
80c1a44643 Test reconfig after adding inline signed zones won't crash named
This test ensures that named won't crash after many inline-signed zones
are added to configurarion, followed by a rndc reconfig.
2021-02-15 11:53:24 -03:00
Diego Fronza
d89a8bf696 Fix dangling references to outdated views after reconfig
This commit fix a leak which was happening every time an inline-signed
zone was added to the configuration, followed by a rndc reconfig.

During the reconfig process, the secure version of every inline-signed
zone was "moved" to a new view upon a reconfig and it "took the raw
version along", but only once the secure version was freed (at shutdown)
was prev_view for the raw version detached from, causing the old view to
be released as well.

This caused dangling references to be kept for the previous view, thus
keeping all resources used by that view in memory.
2021-02-15 11:52:50 -03:00