Commit Graph

13093 Commits

Author SHA1 Message Date
Matthijs Mekking
d565dd6190 Add checkds code
Similar to notify, add code to send and keep track of checkds requests.

On every zone_rekey event, we will check the DS at parental agents
(but we will only actually query parental agents if theree is a DS
scheduled to be published/withdrawn).

On a zone_rekey event, we will first clear the ongoing checkds requests.
Reset the counter, to avoid continuing KSK rollover premature.

This has the risk that if zone_rekey events happen too soon after each
other, there are redundant DS queries to the parental agents. But
if TTLs and the configured durations in the dnssec-policy are sane (as
in not ridiculous short) the chance of this happening is low.

Update: Remove the TLS bits as this is not supported in 9.16

(cherry picked from commit f7872dbd20)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
70cee781a1 Add checkds log notice
When the checkds published/withdrawn is activated, log a notice. Can
be used for testing, but also operationally useful.

(cherry picked from commit 1a50554963)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
96d4f99a8f Add key metadata for DS published/withdrawn
In order to keep track of how many parents have the DS for a given key
published or withdrawn, keep a counter.

(cherry picked from commit 6e2c24be7c)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
850aed0219 Add helpful function 'dns_zone_getdnsseckeys'
This code gathers DNSSEC keys from key files and from the DNSKEY RRset.
It is used for the 'rndc dnssec -status' command, but will also be
needed for "checkds". Turn it into a function.

(cherry picked from commit 40331a20c4)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
9c0e252e2b Add "parental-source[-v6]" config option
Similar to "notify-source" and "transfer-source", add options to
set the source address when querying parental agents for DS records.

(manually picked from commit 2872d6a12e)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
884750b66d Add dst_key_role function
Change the static function 'get_ksk_zsk' to a library function that
can be used to determine the role of a dst_key. Add checks if the
boolean parameters to store the role are not NULL. Rename to
'dst_key_role'.

(cherry picked from commit c9b7f62767)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
63582dc778 Parse "parental-agents" configuration
Parse the new "parental-agents" configuration and store it in the zone
structure.

(cherry picked from commit 6f92d4b9a5)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
72d97df483 Make "primaries" config parsing generic
Make the code to parse "primaries" configuration more generic so
it can be reused for "parental-agents".

(cherry picked from commit 6040c71478)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
ab26fc2d66 Check parental-agents config
Add checks for "parental-agents" configuration, checking for the option
being at wrong type of zone (only allowed for primaries and
secondaries), duplicate definitions, duplicate references, and
undefined parental clauses (the name referenced in the zone clause
does not have a matching "parental-agent" clause).

(cherry picked from commit 1e763e582b)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
90ef2b9c81 Add parental-agents configuration
Introduce a way to configure parental agents that can be used to
query DS records to be used in automatic key rollovers.

(manually picked from commit 0311705d4b)
2021-07-01 14:48:23 +02:00
Matthijs Mekking
42da0e7790 Change primaries objects to remote-servers
Change the primaries configuration objects to the more generic
remote-servers, that we can reuse for other purposes (such as
parental-agents).

(manually picked from commit 39a961112f)
2021-07-01 14:48:21 +02:00
Mark Andrews
8be9a67aec Handle placeholder KEYDATA record
A placeholder keydata record can appear in a zone file.  Allow them
to be read back in.

(cherry picked from commit c6fa8a1d45)
2021-07-01 15:01:05 +10:00
Matthijs Mekking
37db953d9d Fix setnsec3param hang on shutdown
When performing the 'setnsec3param' task, zones that are not loaded
will have their task rescheduled. We should do this only if the zone
load is still pending, this prevents zones that failed to load get
stuck in a busy wait and causing a hang on shutdown.

(cherry picked from commit 10055d44e3)
2021-06-28 11:07:31 +02:00
Matthijs Mekking
068a978ae9 Fix checkconf dnssec-policy inheritance bug
Similar to #2778, the check for 'dnssec-policy' failed to account for
it being inheritable.

(cherry picked from commit 75ec7d1d9f)
2021-06-24 10:41:28 +02:00
Ondřej Surý
d115a9ae2a Disable the PMTUD also on the old socket UDP code
Instead of just disabling the PMTUD mechanism on the UDP sockets, we
now set IP_DONTFRAG (IPV6_DONTFRAG) flag.  That means that the UDP
packets won't get ever fragmented.  If the ICMP packets are lost the
UDP will just timeout and eventually be retried over TCP.
2021-06-23 21:06:05 +02:00
Ondřej Surý
66a058838c Disable IP fragmentation on the UDP sockets
In DNS Flag Day 2020, we started setting the DF (Don't Fragment socket
option on the UDP sockets.  It turned out, that this code was incomplete
leading to dropping the outgoing UDP packets.

This has been now remedied, so it is possible to disable the
fragmentation on the UDP sockets again as the sending error is now
handled by sending back an empty response with TC (truncated) bit set.

This reverts commit 66eefac78c.

(cherry picked from commit b941411072)
2021-06-23 17:58:27 +02:00
Evan Hunt
82a81287f9 Handle UDP send errors when sending DNS message larger than MTU
When the fragmentation is disabled on UDP sockets, the uv_udp_send()
call can fail with UV_EMSGSIZE for messages larger than path MTU.
Previously, this error would end with just discarding the response.  In
this commit, a proper handling of such case is added and on such error,
a new DNS response with truncated bit set is generated and sent to the
client.

This change allows us to disable the fragmentation on the UDP
sockets again.

(cherry picked from commit a3ba95116e)
2021-06-23 17:58:27 +02:00
Ondřej Surý
a12938e183 Add rbtdb setownercase/getownercase unit test
This commit adds a unittest that tests private rdataset_getownercase()
and rdataset_setownercase() methods from rbtdb.c.  The test setups
minimal mock dns_rbtdb_t and dns_rbtdbnode_t data structures.

As the rbtdb methods are generally hidden behind layers and layers, we
include the "rbtdb.c" directly from rbtdb_test.c, and thus we can use
the private methods and data structures directly.  This also opens up
opportunity to add more unittest for the rbtdb private functions without
going through all the layers.

(cherry picked from commit c7a11bd5b4)
2021-06-23 17:31:13 +02:00
Ondřej Surý
0167c4a898 Use POSIX tolower(), toupper() and isupper() functions
In the code that rdataset_setownercase() and rdataset_getownercase() we
now use tolower()/toupper()/isupper() functions appropriately instead of
rolling our own code.

(cherry picked from commit 7ccbe52060)
2021-06-23 11:50:11 +02:00
Matthijs Mekking
bb1f0404ab Fix deadlock issue with key-directory and in-view
When locking key files for a zone, we iterate over all the views and
lock a mutex inside the zone structure. However, if we envounter an
in-view zone, we will try to lock the key files twice, one time for
the home view and one time for the in-view view. This will lead to
a deadlock because one thread is trying to get the same lock twice.

(cherry picked from commit 42c601ae14)
2021-06-22 09:25:46 +02:00
Mark Andrews
efbf4ed5e1 Checking of key-directory and dnssec-policy was broken
the checks failed to account for key-directory being inheritable.

(cherry picked from commit d1e283ede1)
2021-06-18 17:29:41 +10:00
Mark Andrews
52cc9ff372 Add w and W to maptoupper and maptolower tables
(cherry picked from commit 08eeebb6a7)
2021-06-18 16:35:19 +10:00
Michał Kępień
c745b14203 Allow resetting hash table size limits for DNS DBs
When "max-cache-size" is changed to "unlimited" (or "0") for a running
named instance (using "rndc reconfig"), the hash table size limit for
each affected cache DB is not reset to the maximum possible value,
preventing those hash tables from being allowed to grow as a result of
new nodes being added.

Extend dns_rbt_adjusthashsize() to interpret "size" set to 0 as a signal
to remove any previously imposed limits on the hash table size.  Adjust
API documentation for dns_db_adjusthashsize() accordingly.  Move the
call to dns_db_adjusthashsize() from dns_cache_setcachesize() so that it
also happens when "size" is set to 0.

(cherry picked from commit 6b77583f54)
2021-06-17 17:17:37 +02:00
Michał Kępień
c2d9c14354 Allow hash tables for cache RBTs to be grown
Upon creation, each dns_rbt_t structure has its "maxhashbits" field
initialized to the value of the RBT_HASH_MAX_BITS preprocessor macro,
i.e. 32.  When the dns_rbt_adjusthashsize() function is called for the
first time for a given RBT (for cache RBTs, this happens when they are
first created, i.e. upon named startup), it lowers the value of the
"maxhashbits" field to the number of bits required to index the
requested number of hash table slots.  When a larger hash table size is
subsequently requested, the value of the "maxhashbits" field should be
increased accordingly, up to RBT_HASH_MAX_BITS.  However, the loop in
the rehash_bits() function currently ensures that the number of bits
necessary to index the resized hash table will not be larger than
rbt->maxhashbits instead of RBT_HASH_MAX_BITS, preventing the hash table
from being grown once the "maxhashbits" field of a given dns_rbt_t
structure is set to any value lower than RBT_HASH_MAX_BITS.

Fix by tweaking the loop guard condition in the rehash_bits() function
so that it compares the new number of bits used for indexing the hash
table against RBT_HASH_MAX_BITS rather than rbt->maxhashbits.

(cherry picked from commit c096f91451)
2021-06-17 17:17:37 +02:00
Mark Andrews
2c38ba4670 Lock access to task->threadid
(cherry picked from commit 234ad2d075)
2021-06-15 12:53:13 +10:00
Ondřej Surý
b0e7511001 Update the source code formatting using clang-format-12
clang-format now tries to keep the type-cast on the same line as the
variable.  Update the formatting.
2021-06-13 08:19:44 +02:00
Michał Kępień
f88c90f47f Fix "no DS" proofs for wildcard+CNAME delegations
When answering a query requires wildcard expansion, the AUTHORITY
section of the response needs to include NSEC(3) record(s) proving that
the QNAME does not exist.

When a response to a query is an insecure delegation, the AUTHORITY
section needs to include an NSEC(3) proof that no DS record exists at
the parent side of the zone cut.

These two conditions combined trip up the NSEC part of the logic
contained in query_addds(), which expects the NS RRset to be owned by
the first name found in the AUTHORITY section of a delegation response.
This may not always be true, for example if wildcard expansion causes an
NSEC record proving QNAME nonexistence to be added to the AUTHORITY
section before the delegation is added to the response.  In such a case,
named incorrectly omits the NSEC record proving nonexistence of QNAME
from the AUTHORITY section.

The same block of code is affected by another flaw: if the same NSEC
record proves nonexistence of both the QNAME and the DS record at the
parent side of the zone cut, this NSEC record will be added to the
AUTHORITY section twice.

Fix by looking for the NS RRset in the entire AUTHORITY section and
adding the NSEC record to the delegation using query_addrrset() (which
handles duplicate RRset detection).

(cherry picked from commit 7a87bf468b)
2021-06-10 10:26:51 +02:00
Mark Andrews
3593651559 Fix the variable checked by a post-load assertion
Instead of checking the value of the variable modified two lines earlier
(the number of SOA records present at the apex of the old version of the
zone), one of the RUNTIME_CHECK() assertions in zone_postload() checks
the number of SOA records present at the apex of the new version of the
zone, which is already checked before.  Fix the assertion by making it
check the correct variable.

(cherry picked from commit 098639dc59)
2021-06-10 10:04:21 +02:00
Mark Andrews
c7216ae382 Adjust acceptable count values
usleep(100000) can be slightly less than 10ms so allow the count
to reach 11.

(cherry picked from commit 2bc454dc2d)
2021-06-10 08:33:46 +10:00
Mark Andrews
edd0fe1dca Address race between zone_settimer and set_key_expiry_warning by
adding missing lock.

    WARNING: ThreadSanitizer: data race
    Read of size 4 at 0x000000000001 by thread T1 (mutexes: read M1, write M2):
    #0 isc_time_isepoch lib/isc/unix/time.c:110
    #1 zone_settimer lib/dns/zone.c:14649
    #2 dns_zone_maintenance lib/dns/zone.c:6281
    #3 dns_zonemgr_forcemaint lib/dns/zone.c:18190
    #4 view_loaded server.c:9654
    #5 call_loaddone lib/dns/zt.c:301
    #6 doneloading lib/dns/zt.c:575
    #7 zone_asyncload lib/dns/zone.c:2259
    #8 task_run lib/isc/task.c:845
    #9 isc_task_run lib/isc/task.c:938
    #10 isc__nm_async_task lib/isc/netmgr/netmgr.c:855
    #11 process_netievent lib/isc/netmgr/netmgr.c:934
    #12 process_queue lib/isc/netmgr/netmgr.c:1003
    #13 process_all_queues lib/isc/netmgr/netmgr.c:775
    #14 async_cb lib/isc/netmgr/netmgr.c:804
    #15 <null> <null>
    #16 isc__trampoline_run lib/isc/trampoline.c:191
    #17 <null> <null>

    Previous write of size 4 at 0x000000000001 by thread T2:
    #0 isc_time_set lib/isc/unix/time.c:93
    #1 set_key_expiry_warning lib/dns/zone.c:6430
    #2 del_sigs lib/dns/zone.c:6711
    #3 zone_resigninc lib/dns/zone.c:7113
    #4 zone_maintenance lib/dns/zone.c:11111
    #5 zone_timer lib/dns/zone.c:14588
    #6 task_run lib/isc/task.c:845
    #7 isc_task_run lib/isc/task.c:938
    #8 isc__nm_async_task lib/isc/netmgr/netmgr.c:855
    #9 process_netievent lib/isc/netmgr/netmgr.c:934
    #10 process_queue lib/isc/netmgr/netmgr.c:1003
    #11 process_all_queues lib/isc/netmgr/netmgr.c:775
    #12 async_cb lib/isc/netmgr/netmgr.c:804
    #13 <null> <null>
    #14 isc__trampoline_run lib/isc/trampoline.c:191
    #15 <null> <null>

    SUMMARY: ThreadSanitizer: data race lib/isc/unix/time.c:110 in isc_time_isepoch

(cherry picked from commit 3d66e97a28)
2021-06-09 23:56:47 +10:00
Matthijs Mekking
7893064f2e Fix NSEC3 resalting upon restart
When named restarts, it will examine signed zones and checks if the
current denial of existence strategy matches the dnssec-policy. If not,
it will schedule to create a new NSEC(3) chain.

However, on startup the zone database may not be read yet, fooling
BIND that the denial of existence chain needs to be created. This
results in a replacement of the previous NSEC(3) chain.

Change the code such that if the NSEC3PARAM lookup failed (the result
did not return in ISC_R_SUCCESS or ISC_R_NOTFOUND), we will try
again later. The nsec3param structure has additional variables to
signal if the lookup is postponed. We also need to save the signal
if an explicit resalt was requested.

In addition to the two added boolean variables, we add a variable to
store the NSEC3PARAM rdata. This may have a yet to be determined salt
value. We can't create the private data yet because there may be a
mismatch in salt length and the NULL salt value.

(cherry picked from commit 0ae3ffdc1c)
2021-06-09 09:18:44 +02:00
Ondřej Surý
6a43b1e711 Pause the dbiterator when dumping the zone to the disk
When we rewrote the zone dumping to use the separate threadpool, the
dumping would acquire the read lock for the whole time the zone dumping
process is dumping the zone.

When combined with incoming IXFR that tries to acquire the write lock on
the same rwlock, we would end up blocking all the other readers.

In this commit, we pause the dbiterator every time we get next record
and before start dumping it to the disk.

(cherry picked from commit 7e59b8a4a1)
2021-06-04 11:32:31 +02:00
Mark Andrews
a74b2a4448 Report which assertion failed when calling set_global_error
(cherry picked from commit 66d1df57cb)
2021-06-03 17:36:51 +10:00
Ondřej Surý
e528b3241d Fix copy&paste error in setsockopt_off
Because of copy&paste error the setsockopt_off macro would enable
the socket option instead of disabling it.

(cherry picked from commit f14d870d15)
2021-06-02 18:10:44 +02:00
Ondřej Surý
ce0083474e Cleanup the remaining of HAVE_UV_<func> macros
While cleaning up the usage of HAVE_UV_<func> macros, we forgot to
cleanup the HAVE_UV_UDP_CONNECT in the actual code and
HAVE_UV_TRANSLATE_SYS_ERROR and this was causing Windows build to fail
on uv_udp_send() because the socket was already connected and we were
falsely assuming that it was not.

The platforms with autoconf support were not affected, because we were
still checking for the functions from the configure.

(cherry picked from commit 67afea6cfc)
2021-06-02 12:01:29 +02:00
Ondřej Surý
e95dadb40d Indicate to the kernel that we won't be needing the zone dumps
Add a call to posix_fadvise() to indicate to the kernel, that `named`
won't be needing the dumped zone files any time soon with:

 * POSIX_FADV_DONTNEED - The specified data will not be accessed in the
   near future.

Notes:

 POSIX_FADV_DONTNEED attempts to free cached pages associated with the
 specified region. This is useful, for example, while streaming large
 files. A program may periodically request the kernel to free cached
 data that has already been used, so that more useful cached pages are
 not discarded instead.

(cherry picked from commit e83b6569da)
2021-05-31 16:57:20 +02:00
Ondřej Surý
c8eddf4f33 Refactor zone dumping code to use netmgr async threadpools
Previously, dumping the zones to the files were quantized, so it doesn't
slow down network IO processing.  With the introduction of network
manager asynchronous threadpools, we can move the IO intensive work to
use that API and we don't have to quantize the work anymore as it the
file IO won't block anything except other zone dumping processes.

(cherry picked from commit 8a5c62de83)
2021-05-31 16:57:19 +02:00
Ondřej Surý
2e849353b3 Add isc_task_getnetmgr() function
Add a function to pull the attached netmgr from inside the executed
task.  This is needed for any task that needs to call the netmgr API.

(cherry picked from commit 7670f98377)
2021-05-31 16:57:19 +02:00
Ondřej Surý
1417e39055 Add asynchronous work API to the network manager
The libuv has a support for running long running tasks in the dedicated
threadpools, so it doesn't affect networking IO.

This commit adds isc_nm_work_enqueue() wrapper that would wraps around
the libuv API and runs it on top of associated worker loop.

The only limitation is that the function must be called from inside
network manager thread, so the call to the function should be wrapped
inside a (bound) task.

(cherry picked from commit 87fe97ed91)
2021-05-31 16:57:19 +02:00
Ondřej Surý
c1703f5ce6 Use UV_VERSION_HEX to decide whether we need libuv shim functions
Instead of having a configure check for every missing function that has
been added in later version of libuv, we now use UV_VERSION_HEX to
decide whether we need the shim or not.

(cherry picked from commit 211bfefbaa)
2021-05-31 16:57:19 +02:00
Ondřej Surý
4db28d79b1 Add uv_os_getenv() and uv_os_setenv() compatibility shims
The uv_os_getenv() and uv_os_setenv() functions were introduced in the
libuv >= 1.12.0.  Add simple compatibility shims for older versions.

(cherry picked from commit 7477d1b2ed)
2021-05-31 16:57:19 +02:00
Ondřej Surý
0ce462ab8e Add uv_req_get_data() and uv_req_set_data() compatibility shims
The uv_req_get_data() and uv_req_set_data() functions were introduced in
libuv >= 1.19.0, so we need to add compatibility shims with older libuv
versions.

(cherry picked from commit f752840db3)
2021-05-31 16:57:19 +02:00
Matthijs Mekking
89b0a0aa52 Reuse rdatset->ttl when dumping ancient RRsets
Rather than having an expensive 'expired' (fka 'stale_ttl') in the
rdataset structure, that is only used to be printed in a comment on
ancient RRsets, reuse the TTL field of the RRset.

(cherry picked from commit f7f543d99b)
2021-05-30 12:30:36 -07:00
Kevin Chen
3924f78748 Several serve-stale improvements
Commit a83c8cb0af updated masterdump so
that stale records in "rndc dumpdb" output no longer shows 0 TTLs.  In
this commit we change the name of the `rdataset->stale_ttl` field to
`rdataset->expired` to make its purpose clearer, and set it to zero in
cases where it's unused.

Add 'rbtdb->serve_stale_ttl' to various checks so that stale records
are not purged from the cache when they've been stale for RBTDB_VIRTUAL
(300) seconds.

Increment 'ns_statscounter_usedstale' when a stale answer is used.

Note: There was a question of whether 'overmem_purge' should be
purging ancient records, instead of stale ones.  It is left as purging
stale records, since stale records could take up the majority of the
cache.

This submission is copyrighted Akamai Technologies, Inc. and provided
under an MPL 2.0 license.

This commit was originally authored by Kevin Chen, and was updated by
Matthijs Mekking to match recent serve-stale developments.

(cherry picked from commit 0cdf85d204)
2021-05-30 12:30:36 -07:00
Matthijs Mekking
a2ec3a1e4c Reset DNS_FETCHOPT_TRYSTALE_ONTIMEOUT on resume
Once we resume a query, we should clear DNS_FETCHOPT_TRYSTALE_ONTIMEOUT
from the options to prevent triggering the stale-answer-client-timeout
on subsequent fetches.

If we don't this may cause a crash when for example when prefetch is
triggered after a query restart.

(cherry picked from commit c0dc5937c7)
2021-05-30 00:33:42 -07:00
Evan Hunt
85ca29e5e2 clean up query correctly if already answered by serve-stale
when a serve-stale answer has been sent, the client continues waiting
for a proper answer. if a final completion event for the client does
arrive, it can just be cleaned up without sending a response, similar
to a canceled fetch.

(cherry picked from commit 8bd8e995f1)
2021-05-27 12:09:43 -07:00
Diego Fronza
4ed22937d8 Add malloc attribute to memory allocation functions
The malloc attribute allows compiler to do some optmizations on
functions that behave like malloc/calloc, like assuming that the
returned pointer do not alias other pointers.
2021-05-27 15:42:36 +02:00
Mark Andrews
0b8cd8f19d inline-signing should have been in zone_only_clauses
(cherry picked from commit b3301da262)
2021-05-27 15:27:03 +02:00
Mark Andrews
0340df46ec Remove priority from attribute constructor/destructor
On some platforms, the __attribute__ constructor and destructor won't
take priorities and the compilation failed.  On such platform would be
macOS.  For this reason, the constructor/destructor in the libisc was
reworked to not use priorities, but have a single constructor and
destructor that calls the appropriate routines in correct order.

This commit removes the extra priority because it's now not needed and
it also breaks a compilation on macOS with GCC 10.

(cherry picked from commit d68b009cfe)
2021-05-27 08:26:20 +02:00
Mark Andrews
9694554b88 Add missing initialisations
configuring with --enable-mutex-atomics flagged these incorrectly
initialised variables on systems where pthread_mutex_init doesn't
just zero out the structure.

(cherry picked from commit 715a2c7fc1)
2021-05-26 17:19:06 +02:00