Commit Graph

10592 Commits

Author SHA1 Message Date
Mark Andrews
f23e86b96b Add test configurations with invalid dnssec-policy clauses
bad-ksk-without-zsk.conf only has a ksk defined without a
matching zsk for the same algorithm.

bad-zsk-without-ksk.conf only has a zsk defined without a
matching ksk for the same algorithm.

bad-unpaired-keys.conf has two keys of different algorithms
one ksk only and the other zsk only
2022-03-08 13:23:14 +11:00
Ondřej Surý
9d8e8a4fcc Fix null pointer dereferences in udp_ready()
The query pointer was detached too early leading to null pointer
reference.  Move the query_detach() after the query->canceled check.
2022-03-06 10:18:20 +01:00
Ondřej Surý
6bd025942c Replace netievent lock-free queue with simple locked queue
The current implementation of isc_queue uses Michael-Scott lock-free
queue that in turn uses hazard pointers.  It was discovered that the way
we use the isc_queue, such complicated mechanism isn't really needed,
because most of the time, we either execute the work directly when on
nmthread (in case of UDP) or schedule the work from the matching
nmthreads.

Replace the current implementation of the isc_queue with a simple locked
ISC_LIST.  There's a slight improvement - since copying the whole list
is very lightweight - we move the queue into a new list before we start
the processing and locking just for moving the queue and not for every
single item on the list.

NOTE: There's a room for future improvements - since we don't guarantee
the order in which the netievents are processed, we could have two lists
- one unlocked that would be used when scheduling the work from the
matching thread and one locked that would be used from non-matching
thread.
2022-03-04 13:49:51 +01:00
Ondřej Surý
be34b1c535 Reorder the nsupdate shutdown code to shutdown managers early
If the dns_request send callback is delayed, the dst API would get
deinitialized and then the detach from the tsig key would cause an
assertion failure.

Shutdown the isc_managers early, and only then dereference the dst
objects when cleaning up the resources used by nsupdate.
2022-03-04 13:47:59 +01:00
Aram Sargsyan
4043fe9090 Fix query context management issues in dighost.c
For the reference, the _cancel_lookup() function iterates through
the lookup's queries list and detaches them. In the ideal scenario,
that should be the last reference and the query will be destroyed
after that, but it is also possible that we are still expecting a
callback, which also holds a reference (for example, _cancel_lookup()
could have been called from recv_done(), when send_done() was still
not executed).

The start_udp() and start_tcp() functions are currently designed in
slightly different ways: start_udp() creates a new query attachment
`connectquery`, to be called in the callback function, while
start_tcp() does not, which is a bug, but is hidden by the fact
that when the query is being erroneously destroyed prematurely (before
_cancel_lookup() is called) in the result of that, it also gets
de-listed from the lookup's queries' list, so _cancel_lookup() doesn't
even try to detach it.

For better understanding, here's an illustration of the query's
references count changes, and from where it was changed:

UDP
---
 1. _new_query()        -> refcount = 1 (initial)
 2. start_udp()         -> refcount = 2 (lookup->current_query)
 3. start_udp()         -> refcount = 3 (connectquery)
 4. udp_ready()         -> refcount = 4 (readquery)
 5. udp_ready()         -> refcount = 5 (sendquery)
 6. udp_ready()         -> refcount = 4 (lookup->current_query)
 7. udp_ready()         -> refcount = 3 (connectquery)
 8. send_done()         -> refcount = 2 (sendquery)
 9. recv_done()         -> refcount = 1 (readquery)
10. _cancel_lookup()    -> refcount = 0 (initial)
11. the query gets destroyed and removed from `lookup->q`

TCP, fortunate scenario
-----------------------

 1. _new_query()        -> refcount = 1 (initial)
 2. start_tcp()         -> refcount = 2 (lookup->current_query)
 3. launch_next_query() -> refcount = 3 (readquery)
 4. launch_next_query() -> refcount = 4 (sendquery)
 5. tcp_connected()     -> refcount = 3 (lookup->current_query)
 6. tcp_connected()     -> refcount = 2 (bug, there was no connectquery)
 7. send_done()         -> refcount = 1 (sendquery)
 8. recv_done()         -> refcount = 0 (readquery)
 9. the query gets prematurely destroyed and removed from `lookup->q`
10. _cancel_lookup()    -> the query is not in `lookup->q`

TCP, unfortunate scenario, revealing the bug
--------------------------------------------

 1. _new_query()        -> refcount = 1 (initial)
 2. start_tcp()         -> refcount = 2 (lookup->current_query)
 3. launch_next_query() -> refcount = 3 (readquery)
 4. launch_next_query() -> refcount = 4 (sendquery)
 5. tcp_connected()     -> refcount = 3 (lookup->current_query)
 6. tcp_connected()     -> refcount = 2 (bug, there was no connectquery)
 7. recv_done()         -> refcount = 1 (readquery)
 8. _cancel_lookup()    -> refcount = 0 (the query was in `lookup->q`)
 9. we hit an assertion here when trying to destroy the query, because
    sendhandle is not detached (which is done by send_done()).
10. send_done()         -> this never happens

This commit does the following:

1. Add a `connectquery` attachment in start_tcp(), like done in
   start_udp().
2. Add missing _cancel_lookup() calls for error scenarios, which
   were possibly missing because before fixing the bug, calling
   _cancel_lookup() and then calling query_detach() would cause
   an assertion.
3. Log a debug message and call isc_nm_cancelread(query->readhandle)
   for every query in the lookup from inside the _cancel_lookup()
   function, like it is done in _cancel_all().
4. Add a `canceled` property for the query which becomes `true` when
   the lookup (and subsequently, its queries) are canceled.
5. Use the `canceled` property in the network manager callbacks to
   know that the query was canceled, and act like `eresult` was equal
   to `ISC_R_CANCELED`.
2022-03-03 11:10:52 -08:00
Aram Sargsyan
98820aef7e Add a missing UNLOCK_LOOKUP
There was a missing UNLOCK_LOOKUP in the recv_done() callback when
the operation had been canceled. That omission could result in a
deadlock situation.
2022-03-03 11:10:52 -08:00
Evan Hunt
4ca74eee49 document zone grammar more correctly
the "zone" clause can be documented using, for instance,
`cfg_test --zonegrammar primary", which prints only
options that are valid in primary zones. this was not
the method being used when generating the named.conf
man page; instead, "zone" was documented with all possible
options, and no zone types at all.

this commit removes "zone" from the generic documentation
and adds include statements in named.conf.rst so that
correct zone grammars will be included in the man page.
2022-03-02 01:53:24 -08:00
Evan Hunt
0e57fc160e add a CFG_CLAUSEFLAG_NODOC flag for use with outdated terms
"masters" and "default-masters" are now flagged so they will
not be included in the named.conf man page, despite being
accepted as valid options by the parser for backward
compatibiility.
2022-02-25 16:33:30 -08:00
Petr Špaček
653db956f0 Remove last .bat file from the source tree
This fixes an omission in !5739, "Remove leftover test code for Windows".
2022-02-22 15:53:25 +01:00
Ondřej Surý
30fda4cb52 Remove the keep-response-order system test
Remove the keep-response-order from the system test and cleanup the
pipelined system test to be shell check clean and use the helper
functions.
2022-02-18 09:16:03 +01:00
Ondřej Surý
d01562f22b Remove the keep-response-order ACL map
The keep-response-order option has been obsoleted, and in this commit,
remove the keep-response-order ACL map rendering the option no-op, the
call the isc_nm_sequential() and the now unused isc_nm_sequential()
function itself.
2022-02-18 09:16:03 +01:00
Ondřej Surý
30f4bdb17e Declare the keep-response-order obsolete
The keep-response-order option has been introduced when TCP pipelining
has been introduced to BIND 9 as a failsafe for possibly non-compliant
clients.

Declare the keep-response-order obsolete as all DNS clients should
either support out-of-order processing or don't send more DNS queries
until the DNS response for the previous one has been received.
2022-02-17 16:49:56 -08:00
Ondřej Surý
8fed1b6461 Add XFR max-transfer-time-out and max-tranfer-idle-out system tests
Extend the timeouts system test to ensure that the maximum outgoing
transfer time (max-transfer-time-out) and maximum outgoing transfer idle
time (max-transfer-idle-out) works as expected.  This is done by
lowering the limits to 5/1 minutes and testing that the connection has
been dropped while sleeping between the individual XFR messages.
2022-02-17 21:38:17 +01:00
Evan Hunt
08c2728ed1 add a test for dnssec-signzone -J
generate a journal file, and load it in dnssec-signzone.
2022-02-17 12:03:05 -08:00
Evan Hunt
4d2f5754af add a test for dnssec-verify -J
generate a journal file and confirm that dnssec-verify is able
to load it.
2022-02-17 12:03:05 -08:00
Evan Hunt
d2597e3496 support $INCLUDE in makejournal
bin/tests/system/makejournal needs to ignore DNS_R_SEENINCLUDE
when calling dns_db_load(), otherwise it cannot generate a journal
for a zone file with a $INCLUDE statement.
2022-02-17 12:03:05 -08:00
Evan Hunt
c3fd94cd4d make dnssec-verify and dnssec-signzone read journal files
add a -J option to dnssec-verify and dnssec-signzone to read
a specified journal file when loading a zone.
2022-02-17 12:03:01 -08:00
Ondřej Surý
ebfdb50ac7 Add TCP garbage system test
Test if the TCP connection gets reset when garbage instead of DNS
message is sent.

I'm only happy when it rains
Pour some misery down on me
- Garbage
2022-02-17 20:39:55 +01:00
Ondřej Surý
b735182ae0 Add TCP write timeout system test
Extend the timeouts system test that bursts the queries for large TXT
record and never read any responses back filling up the server TCP write
buffer.  The test should work with the default wmem_max value on
Linux (208k).
2022-02-17 09:06:58 +01:00
Evan Hunt
4444b168db negative 'blackhole' ACL match could be treated as positive
There was a bug in the checking of the "blackhole" ACL in
dns_request_create*(), causing an address to be treated as included
in the ACL if it was explicitly *excluded*. Thus, leaving "blackhole"
unset had no effect, but setting it to "none" would cause any
destination addresses to be rejected for dns_request purposes. This
would cause zone transfer requests and SOA queries to fail, among
other things.

The bug has been fixed, and "blackhole { none; };" was added to the
xfer system test as a regression test.
2022-02-16 19:05:06 -08:00
Ondřej Surý
b42681c4e9 Use compile-time paths in the manual pages
Replace the hard-coded paths for various BIND 9 files (configuration,
pid, etc.) in the man pages and ARM with compile-time values using the
sphinx-build replace system.

This is more complicated, because the restructured text specification
doesn't allow |substitions| inside ``code-blocks``, so for each specific
file we had to create own substition which is sub-optimal, but it is
only way how to do this without adding Sphinx extension.
2022-02-10 16:50:22 +01:00
Ondřej Surý
0500345513 Remove unused functions from isc_thread API
The isc_thread_setaffinity call was removed in !5265 and we are not
going to restore it because it was proven that the performance is better
without it.  Additionally, remove the already disabled cpu system test.

The isc_thread_setconcurrency function is unused and also calling
pthread_setconcurrency() on Linux has no meaning, formerly it was
added because of Solaris in 2001 and it was removed when taskmgr was
refactored to run on top of netmgr in !4918.
2022-02-09 17:22:06 +01:00
Matthijs Mekking
7845f51178 Fix keyfromlabel test, missing status update
Fix a missing status=$((status+ret)) in the keyfromlabel system test,
which would ignore the error if ZSK key creation failed.
2022-02-04 13:40:18 +01:00
Aram Sargsyan
a449709441 Use unique SoftHSMv2 token label for the "keyfromlabel" test
When there are more than one tokens initialized in SoftHSMv2,
care must be taken to correctly identify them.

Use a SoftHSMv2 token label which will uniquely identify the
token used for this test.

Use the "--token-label" parameter for the `pkcs11-tool` program
to make sure that it finds and uses the correct token.
2022-02-04 13:40:18 +01:00
Matthijs Mekking
468cf3cdc2 Fix keyfromlabel echo output
The 'id' variable is either keyfromlabel-ksk or keyfromlabel-zsk and is
set in the 'keygen' and 'keyfromlabel' functions. It should not be used
outside these functions.
2022-02-04 13:40:18 +01:00
Matthijs Mekking
bfe287f4a4 Add test for assertion failure in pk11_numbits
This test was originally in the pkcs11 system test. While this crash
happened in the native pkcs11 of BIND 9, and that code has been
removed in 9.17, there is no need for this test. Nevertheless, it
doesn't hurt having the test case persist.
2022-02-04 13:40:18 +01:00
Matthijs Mekking
11a0b41370 Add system test for engine_pkcs11
Add a system test for engine_pkcs11 interactions that replaces the
tests that are done in the native PKCS#11 system test.

The native PKCS#11 code was removed in 9.17 but without copying the
pkcs11 system test.
2022-02-04 13:40:18 +01:00
Mark Andrews
123b57db36 Check that no debugging / errors are reported normally 2022-01-31 14:18:55 -08:00
Evan Hunt
6de4dfcc8c make nslookup test shellcheck safe 2022-01-31 14:17:23 -08:00
Mark Andrews
c068c3c771 Remove spurious 'debugging = true;'
This appears to be left over from the developement phase while
adding reference counting to the lookup structure.
2022-01-31 13:55:00 -08:00
Evan Hunt
e8ac7cf6ec remove error handling code around dns_dnsseckey_create()
this function can no longer fail, so error checking is not necessary.
2022-01-31 10:39:04 -08:00
Evan Hunt
79ddedabf8 test ECS information is passed in dlzexternal
the dlzexternal test driver now includes ECS, if present in the
query, in the TXT record returned for QNAME "source-addr".
2022-01-27 13:53:59 -08:00
Petr Špaček
f81debe1c8 extend DLZ interface and example with ECS support
Apparently we forgot about DLZ when updating DNS_CLIENTINFO_VERSION
constant for ECS, which is at value "3" since ECS was introduced.

The code in example drivers and tests now hardcodes version numbers
2 (without ECS) and 3 (with ECS) depending on what a given code path
requires.
2022-01-27 13:53:59 -08:00
Michal Nowak
e97ed8d9b6 Clean up test.output.* references
test.output.* files are no longer created by the system test framework.
Remove all references to these files from the source tree.
2022-01-27 15:32:28 +01:00
Michal Nowak
f6b996f6fc Drop systests.output references from system test
Since "runall.sh" script removal systests.output file is not being
created and its references are useless.
2022-01-27 15:32:28 +01:00
Michal Nowak
8109e924b5 Drop support for sequential system tests
System test used to have sequential system tests, which can't run in
parallel with the rest of system tests. As there are no such tests
anymore the underlying infrastructure can be dropped.
2022-01-27 15:32:28 +01:00
Michal Nowak
9d398572f0 Drop bin/tests/system/parallel.sh
"parallel.sh" script was used on Windows to run system tests in
parallel. Since Windows support was removed from BIND 9, the script is
not needed anymore.
2022-01-27 15:32:28 +01:00
Michal Nowak
986b364fe6 Drop testsummary.sh
testsummary.sh was not updated after build system rewrite to Autotools,
and needs to be fixed to produce test summary and core dump, assertion
failures, and ThreadSanitizer reports.

Given that all of this is provided by Autotools and run.sh already,
there's little use to testsummary.sh script and should be dropped.
2022-01-27 15:32:27 +01:00
Michal Nowak
5d2dd94cf8 Drop runall.sh
runall.sh was mainly used on Windows and as it's support was removed
from the "main" branch the script is not needed anymore.

Also, remove bin/tests/system/README text on running multiple system
test suites simultaneously with runall.sh as that support was not
present in the script anyway.
2022-01-27 11:58:17 +01:00
Michal Nowak
b983df403a Drop unused @DNSTAP@ label in conf.sh.in
@DNSTAP@ label does not have adjacent AC_SUBST() call and is therefore
unused.
2022-01-27 11:57:27 +01:00
Michal Nowak
67092442d6 rrsetorder should use stop_server() in tests.sh 2022-01-27 11:57:27 +01:00
Michal Nowak
7ba786dedb Drop bin/tests/system/setup.sh
bin/tests/system/setup.sh just executes setup.sh script of a particular
system test in the directory of the system test. This does not seems to
be useful enough to maintain it.
2022-01-27 11:57:27 +01:00
Michal Nowak
4c03d814ed Drop stopall.sh
stopall.sh script takes almost 2 minutes to go thru all test
subdirectories (due to a sleep in stop.pl) and does not seems to be
efficient way to stop manually started tests.
2022-01-27 11:57:26 +01:00
Matthijs Mekking
0af8bbd49b Create keys with pkcs11-tool --id
The keyfromlabel system ECDSA tests sometimes fail. When this happens
the ZSK and KSK key id values differ by 1, which is an indication that
the same key is used for both DNSKEY records.

When the private key is retrieved with 'ENGINE_load_private_key()', the
public key is already set. But sometimes that key differs from the key
which was retrieved with 'ENGINE_load_public_key()'.

The libp11 source code uses id to find the key and without IDs all the
keys are "equal", so it is returning the first key in the array of the
enumerated keys instead of the matching key. In our test we didn't use
'--id', just '--label'. With this change, the system test should no
longer fail intermittently.

Note this is only an issue for ECDSA keys, not RSA keys.
2022-01-27 10:49:47 +01:00
Matthijs Mekking
eba66665a5 Add system test for dnssec-keyfromlabel
Add missing system test for dnssec-keyfromlabel. Test for various
algorithms that we can generate key files from a key that is stored in a
HSM, and that those keys can be used for signing with dnssec-signzone.
2022-01-27 10:49:46 +01:00
Matthijs Mekking
221e1bc2a3 Update .gitlab-ci.yml with openssl setup
GitLab CI needs to know about some environment variables that will
tell where OpenSSL and SoftHSM2 is installed. This is done in the
image, making the prepare-softhsm2.sh script obsolete.

The SoftHSM2 module location is system specific.
2022-01-27 10:46:58 +01:00
Matthijs Mekking
0725fcad38 Remove prepare-softhsm2.sh from runtime test
This script is obsoleted because SoftHSM2 is now installed in the
image.
2022-01-27 10:46:58 +01:00
Evan Hunt
1d706f328c Remove leftover test code for Windows
- Removed all code that only runs under CYGWIN, and made all
  code that doesn't run under CYGWIN non-optional.
- Removed the $TP variable which was used to add optional
  trailing dots to filenames; they're no longer optional.
- Removed references to pssuspend and dos2unix.
- No need to use environment variables for diff and kill.
- Removed uses of "tr -d '\r'"; this was a workaround for
  a cygwin regex bug that is no longer needed.
2022-01-27 09:08:29 +01:00
Michał Kępień
a938db2170 Fix waiting for lock file removal upon exit
Commit c787a539d2 fixed a certain class of
intermittent system test failures caused by named instances unable to
restart.  The root cause was bin/tests/system/stop.pl returning without
waiting for a named instance to remove its lock file.

Later on, it turned out that the above change causes other issues on
Windows due to the way named handles signals on that platform.  Commit
761ba4514f intended to address those
issues by making the server_lock_file() subroutine in
bin/tests/system/stop.pl return an empty value on Windows, in order to
prevent the script for waiting for lock file cleanup on that platform.
Note, however, that Windows detection in that subroutine is limited to
checking whether the CYGWIN environment variable is set.

While that environment variable was not set on Unix-like systems before
commit 761ba4514f, another commit
(a33237f070, merged a few weeks later)
changed that by setting the CYGWIN environment variable to an empty
value on Unix-like systems.  This made the defined($ENV{'CYGWIN'}) check
in server_lock_file() return true, inadvertently preventing
bin/tests/system/stop.pl from waiting for lock file removal before
exiting on Unix-like systems and therefore reintroducing the original
issue.

Fix by making server_lock_file() only return an empty value when the
CYGWIN environment variable is set to a non-empty value (which is what
bin/tests/system/conf.sh.win32 does).  Adjust a similar check in the
pid_file_exists() subroutine in the same way for consistency.
2022-01-26 15:18:43 +01:00
Michał Kępień
fb87022115 Do not strip leading whitespace from test output
The echo_*() and cat_*() functions in bin/tests/system/conf.sh.common
call the "read" builtin command without specifying the field separator
to use.  This results in leading whitespace getting stripped from each
line of the texts passed to those functions, which mangles e.g. pytest
output, hindering test failure troubleshooting.

Address by setting IFS to an empty value for the "read" calls used in
the aforementioned helper functions.
2022-01-26 15:18:43 +01:00