Commit Graph

9890 Commits

Author SHA1 Message Date
Matthijs Mekking
cf0439cd5f Treat dnssec-policy "none" as a builtin zone
Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig'
api so that the 'name' will determine what policy needs to be
configured.

When transitioning a zone from secure to insecure, there will be
cases when a zone with no DNSSEC policy (dnssec-policy none) should
be using KASP. When there are key state files available, this is an
indication that the zone once was DNSSEC signed but is reconfigured
to become insecure.

If we would not run the keymgr, named would abruptly remove the
DNSSEC records from the zone, making the zone bogus. Therefore,
change the code such that a zone will use kasp if there is a valid
dnssec-policy configured, or if there are state files available.

(cherry picked from commit cf420b2af0)
2020-12-23 11:56:33 +01:00
Matthijs Mekking
63b72ad5e9 Small adjustments to kasp rndc_checkds function
Slightly better test output, and only call 'load keys' if the
'rndc checkds' call succeeded.

(cherry picked from commit 756674f6d1)
2020-12-23 11:56:16 +01:00
Matthijs Mekking
c3d2843915 Add tests for going from secure to insecure
Add two test zones that will be reconfigured to go insecure, by
setting the 'dnssec-policy' option to 'none'.

One zone was using inline-signing (implicitly through dnssec-policy),
the other is a dynamic zone.

Two tweaks to the kasp system test are required: we need to set
when to except the CDS/CDS Delete Records, and we need to know
when we are dealing with a dynamic zone (because the logs to look for
are slightly different, inline-signing prints "(signed)" after the
zone name, dynamic zones do not).

(cherry picked from commit fa2e4e66b0)
2020-12-23 11:56:07 +01:00
Matthijs Mekking
ba75744331 Add test for cpu affinity
Add a test to check BIND 9 honors CPU affinity mask. This requires
some changes to the start script, to construct the named command.

(cherry picked from commit f1a097964c)
2020-12-23 09:25:48 +11:00
Mark Andrews
b278d91680 Handle shared library platforms that don't support inter library dependancies 2020-12-20 21:36:09 +00:00
Mark Andrews
afbc6f41d9 Fixup library link lists 2020-12-20 21:36:09 +00:00
Michal Nowak
5f0e9c8645 Fix program name reference in dnssec-keymgr(8) 2020-12-14 13:17:27 +01:00
Michal Nowak
c77c96133d Fix a reference to rndc(8) in named(8) manual page
(cherry picked from commit befcbcac28)
2020-12-14 13:17:27 +01:00
Mark Andrews
151500b522 Update dnssec-signzone -N soa-serial-format description
document the autoincrement when the serial would go backwards.

(cherry picked from commit eb1b29b19e)
2020-12-12 08:07:51 +01:00
Ondřej Surý
04f9f45c54 Print warning when falling back to increment soa serial method
When using the `unixtime` or `date` method to update the SOA serial,
`named` and `dnssec-signzone` would silently fallback to `increment`
method to prevent the new serial number to be smaller than the old
serial number (using the serial number arithmetics).  Add a warning
message when such fallback happens.

(cherry picked from commit ef685bab5c)
2020-12-12 07:55:29 +01:00
Ondřej Surý
7fc62f829d Add libssl libraries to Windows build
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
2020-12-09 10:46:16 +01:00
Ondřej Surý
7b9c8b9781 Refactor netmgr and add more unit tests
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested.  It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.

NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.

The changes that are included in this commit are listed here
(extensively, but not exclusively):

* The netmgr_test unit test was split into individual tests (udp_test,
  tcp_test, tcpdns_test and newly added tcp_quota_test)

* The udp_test and tcp_test has been extended to allow programatic
  failures from the libuv API.  Unfortunately, we can't use cmocka
  mock() and will_return(), so we emulate the behaviour with #define and
  including the netmgr/{udp,tcp}.c source file directly.

* The netievents that we put on the nm queue have variable number of
  members, out of these the isc_nmsocket_t and isc_nmhandle_t always
  needs to be attached before enqueueing the netievent_<foo> and
  detached after we have called the isc_nm_async_<foo> to ensure that
  the socket (handle) doesn't disappear between scheduling the event and
  actually executing the event.

* Cancelling the in-flight TCP connection using libuv requires to call
  uv_close() on the original uv_tcp_t handle which just breaks too many
  assumptions we have in the netmgr code.  Instead of using uv_timer for
  TCP connection timeouts, we use platform specific socket option.

* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}

  When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
  waiting for socket to either end up with error (that path was fine) or
  to be listening or connected using condition variable and mutex.

  Several things could happen:

    0. everything is ok

    1. the waiting thread would miss the SIGNAL() - because the enqueued
       event would be processed faster than we could start WAIT()ing.
       In case the operation would end up with error, it would be ok, as
       the error variable would be unchanged.

    2. the waiting thread miss the sock->{connected,listening} = `true`
       would be set to `false` in the tcp_{listen,connect}close_cb() as
       the connection would be so short lived that the socket would be
       closed before we could even start WAIT()ing

* The tcpdns has been converted to using libuv directly.  Previously,
  the tcpdns protocol used tcp protocol from netmgr, this proved to be
  very complicated to understand, fix and make changes to.  The new
  tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
  Closes: #2194, #2283, #2318, #2266, #2034, #1920

* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
  pass accepted TCP sockets between netthreads, but instead (similar to
  UDP) uses per netthread uv_loop listener.  This greatly reduces the
  complexity as the socket is always run in the associated nm and uv
  loops, and we are also not touching the libuv internals.

  There's an unfortunate side effect though, the new code requires
  support for load-balanced sockets from the operating system for both
  UDP and TCP (see #2137).  If the operating system doesn't support the
  load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
  on FreeBSD 12+), the number of netthreads is limited to 1.

* The netmgr has now two debugging #ifdefs:

  1. Already existing NETMGR_TRACE prints any dangling nmsockets and
     nmhandles before triggering assertion failure.  This options would
     reduce performance when enabled, but in theory, it could be enabled
     on low-performance systems.

  2. New NETMGR_TRACE_VERBOSE option has been added that enables
     extensive netmgr logging that allows the software engineer to
     precisely track any attach/detach operations on the nmsockets and
     nmhandles.  This is not suitable for any kind of production
     machine, only for debugging.

* The tlsdns netmgr protocol has been split from the tcpdns and it still
  uses the old method of stacking the netmgr boxes on top of each other.
  We will have to refactor the tlsdns netmgr protocol to use the same
  approach - build the stack using only libuv and openssl.

* Limit but not assert the tcp buffer size in tcp_alloc_cb
  Closes: #2061

(cherry picked from commit 634bdfb16d)
2020-12-09 10:46:16 +01:00
Ondřej Surý
a35a666a7c Reformat sources using clang-format-11
(cherry picked from commit 7ba18870dc)
2020-12-08 19:34:05 +01:00
Ondřej Surý
5d34daaf78 Change the default value for nocookie-udp-size back to 4096
The DNS Flag Day 2020 reduced all the EDNS buffer sizes to 1232.  In
this commit, we revert the default value for nocookie-udp-size back to
4096 because the option is too obscure and most people don't realize
that they also need to change this configuration option in addition to
max-udp-size.

(cherry picked from commit 79c196fc77)
2020-12-02 12:01:50 +01:00
Mark Andrews
5c10b5a4e8 Adjust default value of "max-recursion-queries"
Since the queries sent towards root and TLD servers are now included in
the count (as a result of the fix for CVE-2020-8616),
"max-recursion-queries" has a higher chance of being exceeded by
non-attack queries.  Increase its default value from 75 to 100.

(cherry picked from commit ab0bf49203)
2020-12-02 00:53:49 +11:00
Mark Andrews
2898f530cd Check that missing cookies are handled
(cherry picked from commit bd9155590e)
2020-11-27 08:44:00 +11:00
Michal Nowak
f48b221ffa Write traceback file to the same directory as core file
The traceback files could overwrite each other on systems which do not
use different core dump file names for different processes.  Prevent
that by writing the traceback file to the same directory as the core
dump file.

These changes still do not prevent the operating system from overwriting
a core dump file if the same binary crashes multiple times in the same
directory and core dump files are named identically for different
processes.

(cherry picked from commit 6428fc26af)
2020-11-26 18:29:41 +01:00
Mark Andrews
a50ce1bd43 Unify whitespace in bin/tests/system/run.sh
Replace tabs with spaces to make whitespace consistent across the entire
bin/tests/system/run.sh script.

(cherry picked from commit 0f0a006c7e)
2020-11-26 18:29:33 +01:00
Matthijs Mekking
6db879160f Detect NSEC3 salt collisions
When generating a new salt, compare it with the previous NSEC3
paremeters to ensure the new parameters are different from the
previous ones.

This moves the salt generation call from 'bin/named/*.s' to
'lib/dns/zone.c'. When setting new NSEC3 parameters, you can set a new
function parameter 'resalt' to enforce a new salt to be generated. A
new salt will also be generated if 'salt' is set to NULL.

Logging salt with zone context can now be done with 'dnssec_log',
removing the need for 'dns_nsec3_log_salt'.

(cherry picked from commit 6b5d7357df)
2020-11-26 14:15:05 +00:00
Matthijs Mekking
734865e110 Add zone context to "generated salt" logs
(cherry picked from commit 3b4c764b43)
2020-11-26 14:15:05 +00:00
Matthijs Mekking
93f9d3b812 Move logging of salt in separate function
There may be a desire to log the salt without losing the context
of log module, level, and category.

(cherry picked from commit 7878f300ff)
2020-11-26 14:15:04 +00:00
Matthijs Mekking
52d3bf5f31 Change nsec3param salt config to saltlen
Upon request from Mark, change the configuration of salt to salt
length.

Introduce a new function 'dns_zone_checknsec3aram' that can be used
upon reconfiguration to check if the existing NSEC3 parameters are
in sync with the configuration. If a salt is used that matches the
configured salt length, don't change the NSEC3 parameters.

(cherry picked from commit 6f97bb6b1f)
2020-11-26 14:15:04 +00:00
Matthijs Mekking
d35dab3db8 Add check for NSEC3 and key algorithms
NSEC3 is not backwards compatible with key algorithms that existed
before the RFC 5155 specification was published.

(cherry picked from commit 00c5dabea3)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
00d7cc5144 Disable one nsec3 test due to GL #2216
This known bug makes the test fail. There is no trivial fix so disable
test case for now.

(cherry picked from commit f10790b02d)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
cf79e6ccc1 Add some NSEC3 optout tests
Make sure that just changing the optout value recreates the chain.

(cherry picked from commit a5b45bdd03)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
2a1793a2be Check nsec3param configuration values
Check 'nsec3param' configuration for the number of iterations.  The
maximum number of iterations that are allowed are based on the key
size (see https://tools.ietf.org/html/rfc5155#section-10.3).

Check 'nsec3param' configuration for correct salt. If the string is
not "-" or hex-based, this is a bad salt.

(cherry picked from commit 7039c5f805)
2020-11-26 14:15:03 +00:00
Matthijs Mekking
b6cf88333a Don't use 'rndc signing' with kasp
The 'rndc signing' command allows you to manipulate the private
records that are used to store signing state. Don't use these with
'dnssec-policy' as such manipulations may violate the policy (if you
want to change the NSEC3 parameters, change the policy and reconfig).

(cherry picked from commit eae9a6d297)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
d13786d583 Fix a reconfig bug wrt inline-signing
When doing 'rndc reconfig', named may complain about a zone not being
reusable because it has a raw version of the zone, and the new
configuration has not set 'inline-signing'. However, 'inline-signing'
may be implicitly true if a 'dnssec-policy' is used for the zone, and
the zone is not dynamic.

Improve the check in 'named_zone_reusable'.  Create a new function for
checking 'inline-signing' configuration that matches existing code in
'bin/named/server.c'.

(cherry picked from commit ba8128ea00)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
008e84e965 Support for NSEC3 in dnssec-policy
Implement support for NSEC3 in dnssec-policy.  Store the configuration
in kasp objects. When configuring a zone, call 'dns_zone_setnsec3param'
to queue an nsec3param event. This will ensure that any previous
chains will be removed and a chain according to the dnssec-policy is
created.

Add tests for dnssec-policy zones that uses the new 'nsec3param'
option, as well as changing to new values, changing to NSEC, and
changing from NSEC.

(cherry picked from commit 114af58ee2)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
5dfd3b2d7b Add kasp nsec3param configuration
Add configuration and documentation on how to enable NSEC3 when
using dnssec-policy for signing your zones.

(cherry picked from commit f7ca96c805)
2020-11-26 14:15:02 +00:00
Matthijs Mekking
9b9ac92fd0 Move generate_salt function to lib/dns/nsec3
We will be using this function also on reconfig, so it should have
a wider availability than just bin/named/server.

(cherry picked from commit 84a4273074)
2020-11-26 14:14:56 +00:00
Michał Kępień
e395ff54e5 Teach cppcheck that fatal() does not return
cppcheck is not aware that the bin/dnssec/dnssectool.c:fatal() function
does not return.  This triggers certain cppcheck 2.2 false positives,
for example:

    bin/dnssec/dnssec-signzone.c:3470:13: warning: Either the condition 'ndskeys==8' is redundant or the array 'dskeyfile[8]' is accessed at index 8, which is out of bounds. [arrayIndexOutOfBoundsCond]
       dskeyfile[ndskeys++] = isc_commandline_argument;
                ^
    bin/dnssec/dnssec-signzone.c:3467:16: note: Assuming that condition 'ndskeys==8' is not redundant
       if (ndskeys == MAXDSKEYS) {
                   ^
    bin/dnssec/dnssec-signzone.c:3470:13: note: Array index out of bounds
       dskeyfile[ndskeys++] = isc_commandline_argument;
                ^

    bin/dnssec/dnssec-signzone.c:771:20: warning: Either the condition 'l->hashbuf==NULL' is redundant or there is pointer arithmetic with NULL pointer. [nullPointerArithmeticRedundantCheck]
     memset(l->hashbuf + l->entries * l->length, 0, l->length);
                       ^
    bin/dnssec/dnssec-signzone.c:767:18: note: Assuming that condition 'l->hashbuf==NULL' is not redundant
      if (l->hashbuf == NULL) {
                     ^
    bin/dnssec/dnssec-signzone.c:771:20: note: Null pointer addition
     memset(l->hashbuf + l->entries * l->length, 0, l->length);
                       ^

Instead of suppressing all such warnings individually, conditionally
define a preprocessor macro which prevents them from being triggered.

(cherry picked from commit d9701e22b5)
2020-11-25 13:21:58 +01:00
Matthijs Mekking
259db79579 Fix syntax in echo_i messages
It's either "record returns" or "records return".

(cherry picked from commit 53188daf5b)
2020-11-20 10:43:21 +11:00
Mark Andrews
b3d259107f Fix DNAME when QTYPE is CNAME or ANY
The synthesised CNAME is not supposed to be followed when the
QTYPE is CNAME or ANY as the lookup is satisfied by the CNAME
record.

(cherry picked from commit e980affba0)
2020-11-19 10:52:29 +11:00
Diego Fronza
10860b09be Update ARM and other documents 2020-11-12 10:13:04 +01:00
Diego Fronza
f321d95464 Adjusted test to match new rndc serve-stale status output 2020-11-11 16:06:36 -03:00
Diego Fronza
4905c2e24a Output 'stale-refresh-time' value on rndc serve-stale status 2020-11-11 16:06:30 -03:00
Diego Fronza
73c199dec7 Check 'stale-refresh-time' when sharing cache between views
This commit ensures that, along with previous restrictions, a cache is
shareable between views only if their 'stale-refresh-time' value are
equal.
2020-11-11 16:06:23 -03:00
Matthijs Mekking
4d52ddbd15 Add two more system tests for stale-refresh-time
Add one test that checks the behavior when serve-stale is enabled
via configuration (as opposed to enabled via rndc).

Add one test that checks the behavior when stale-refresh-time is
disabled (set to 0).
2020-11-11 16:06:16 -03:00
Matthijs Mekking
276c912953 Change serve-stale test stale-answer-ttl
Using a 'stale-answer-ttl' the same value as the authoritative ttl
value makes it hard to differentiate between a response from the
stale cache and a response from the authoritative server.

Change the stale-answer-ttl from 2 to 4, so that it differs from the
authoritative ttl.
2020-11-11 16:06:07 -03:00
Diego Fronza
8383621ba8 Wait for multiple parallel dig commands to fully finish
The strategy of running many dig commands in parallel and
waiting for the respective output files to be non empty was
resulting in random test failures, hard to reproduce, where
it was possible that the subsequent reading of the files could
have been failing due to the file's content not being fully flushed.

Instead of checking if output files are non empty, we now wait
for the dig processes to finish.
2020-11-11 16:05:30 -03:00
Diego Fronza
0123a0f44f Added system test for stale-refresh-time
This test works as follow:
- Query for data.example rrset.
- Sleep until its TTL expires (2 secs).
- Disable authoritative server.
- Query for data.example again.
- Since server is down, answer come from stale cache, which has
  a configured stale-answer-ttl of 3 seconds.
- Enable authoritative server.
- Query for data.example again
- Since last query before activating authoritative server failed, and
  since 'stale-refresh-time' seconds hasn't elapsed yet, answer should
  come from stale cache and not from the authoritative server.
2020-11-11 16:01:59 -03:00
Diego Fronza
e8c3d538d5 Adjusted ancient rrset system test
Before the stale-refresh-time feature, the system test for ancient rrset
was somewhat based on the average time the previous tests and queries
were taking, thus not very precise.

After the addition of stale-refresh-time the system test for ancient
rrset started to fail since the queries for stale records (low
max-stale-ttl) were not taking the time to do a full resolution
anymore, since the answers now were coming from the cache (because the
rrset were stale and within stale-refresh-time window after the
previous resolution failure).

To handle this, the correct time to wait before rrset become ancient is
calculated from max-stale-ttl configuration plus the TTL set in the
rrset used in the tests (ans2/ans.pl).

Then before sending queries for ancient rrset, we check if we need to
sleep enough to ensure those rrset will be marked as ancient.
2020-11-11 16:01:51 -03:00
Diego Fronza
24ec021e50 Warn if 'stale-refresh-time' < 30 (default)
RFC 8767 recommends that attempts to refresh to be done no more
frequently than every 30 seconds.

Added check into named-checkconf, which will warn if values below the
default are found in configuration.

BIND will also log the warning during loading of configuration in the
same fashion.
2020-11-11 16:00:22 -03:00
Diego Fronza
8cc5abff23 Add stale-refresh-time option
Before this update, BIND would attempt to do a full recursive resolution
process for each query received if the requested rrset had its ttl
expired. If the resolution fails for any reason, only then BIND would
check for stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' is on).

The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed in the next seconds.

A better approach to improve performance in those cases, is to mark the
moment in which a resolution failed, and if new queries arrive for that
same rrset, try to respond directly from the stale cache, and do that
for a window of time configured via 'stale-refresh-time'.

Only when this interval expires we then try to do a normal refresh of
the rrset.

The logic behind this commit is as following:

- In query.c / query_gotanswer(), if the test of 'result' variable falls
  to the default case, an error is assumed to have happened, and a call
  to 'query_usestale()' is made to check if serving of stale rrset is
  enabled in configuration.

- If serving of stale answers is enabled, a flag will be turned on in
  the query context to look for stale records:
  query.c:6839
  qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;

- A call to query_lookup() will be made again, inside it a call to
  'dns_db_findext()' is made, which in turn will invoke rbdb.c /
  cache_find().

- In rbtdb.c / cache_find() the important bits of this change is the
  call to 'check_stale_header()', which is a function that yields true
  if we should skip the stale entry, or false if we should consider it.

- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
  is set, if that is the case we know that this new search for stale
  records was made due to a failure in a normal resolution, so we keep
  track of the time in which the failured occured in rbtdb.c:4559:
  header->last_refresh_fail_ts = search->now;

- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
  know this is a normal lookup, if the record is stale and the query
  time is between last failure time + stale-refresh-time window, then
  we return false so cache_find() knows it can consider this stale
  rrset entry to return as a response.

The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh

Those were added so rbtdb can be aware of the value set in configuration
option, since in that level we have no access to the view object.
2020-11-11 15:59:56 -03:00
Mark Andrews
23d2d95d28 Check that DNSTAP captures forwarded UPDATE responses
(cherry picked from commit 2b7128fede)
2020-11-10 17:59:04 +11:00
Michał Kępień
64ca2e0061 Wait for the "fast-expire" zone to be transferred
In order for a "fast-expire/IN: response-policy zone expired" message to
be logged in ns3/named.run, the "fast-expire" zone must first be
transferred in by that server.  However, with unfavorable timing, ns3
may be stopped before it manages to fetch the "fast-expire" zone from
ns5 and after the latter has been reconfigured to no longer serve that
zone.  In such a case, the "rpz" system test will report a false
positive for the relevant check.  Prevent that from happening by
ensuring ns3 manages to transfer the "fast-expire" zone before getting
shut down.

(cherry picked from commit 39191052ad)
2020-11-05 07:55:36 +01:00
Matthijs Mekking
67b9e80b1e kasp test: Use DEFAULT_ALGORITHM in tests.sh
Some setup scripts uses DEFAULT_ALGORITHM in their dnssec-policy
and/or initial signing. The tests still used the literal values
13, ECDSAP256SHA256, and 256. Replace those occurrences where
appropriate.

(cherry picked from commit 518dd0bb17)
2020-11-04 14:28:19 +01:00
Matthijs Mekking
a0a4f7e318 Add a test for RFC 8901 signer model 2
The new 'dnssec-policy' was already compatible with multi-signer
model 2, now we also have a test for it.

(cherry picked from commit 7e0ec9f624)
2020-11-04 14:28:10 +01:00
Mark Andrews
939e735e2c Check that a zone in the process of being signed resolves
ans10 simulates a local anycast server which has both signed and
unsigned instances of a zone.  'A' queries get answered from the
signed instance.  Everything else gets answered from the unsigned
instance.  The resulting answer should be insecure.

(cherry picked from commit d7840f4b93)
2020-10-30 09:19:12 +11:00