Commit Graph

35896 Commits

Author SHA1 Message Date
Tony Finch
71ce8b0a51 Ensure that dns_request_createvia() has a retry limit
There are a couple of problems with dns_request_createvia(): a UDP
retry count of zero means unlimited retries (it should mean no
retries), and the overall request timeout is not enforced. The
combination of these bugs means that requests can be retried forever.

This change alters calls to dns_request_createvia() to avoid the
infinite retry bug by providing an explicit retry count. Previously,
the calls specified infinite retries and relied on the limit implied
by the overall request timeout and the UDP timeout (which did not work
because the overall timeout is not enforced). The `udpretries`
argument is also changed to be the number of retries; previously, zero
was interpreted as infinity because of an underflow to UINT_MAX, which
appeared to be a mistake. And `mdig` is updated to match the change in
retry accounting.

The bug could be triggered by zone maintenance queries, including
NOTIFY messages, DS parental checks, refresh SOA queries and stub zone
nameserver lookups. It could also occur with `nsupdate -r 0`.
(But `mdig` had its own code to avoid the bug.)
2022-04-06 17:12:48 +01:00
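
The retry accounting described in 71ce8b0a51 can be illustrated with a small sketch. It is not the code from lib/dns/request.c: the helper names (retries_left_old, retries_left_new, simulate_sends) are hypothetical, and the "count minus one" arithmetic is only an assumption about how a zero could underflow to UINT_MAX.

```c
#include <assert.h>
#include <limits.h>

/*
 * Assumed old accounting: the argument is treated as a number of tries,
 * so the retries left are "tries - 1". For 0 this wraps to UINT_MAX.
 */
static unsigned int
retries_left_old(unsigned int udpcount) {
	return udpcount - 1; /* unsigned wrap-around when udpcount == 0 */
}

/* New accounting as described above: the argument is the retry count. */
static unsigned int
retries_left_new(unsigned int udpretries) {
	return udpretries;
}

/*
 * Count the datagrams sent when no answer ever arrives. The cap exists
 * only so that the simulation terminates.
 */
static unsigned int
simulate_sends(unsigned int retries, unsigned int cap) {
	assert(cap > 0);
	unsigned int sent = 1; /* the initial send */
	while (retries > 0 && sent < cap) {
		retries--;
		sent++;
	}
	return sent;
}
```

With the old accounting, simulate_sends(retries_left_old(0), 1000) runs into the cap, while simulate_sends(retries_left_new(0), 1000) sends exactly one datagram.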
Tony Finch
5867c1b727 Make notify test shellcheck clean
Use POSIX shell syntax, and use functions to reduce repetition.
2022-04-06 17:12:08 +01:00
Artem Boldariev
a671fb34f6 Merge branch 'artem-tls-ctx-refcount' into 'main'
Implement reference counting for TLS contexts, Resolve #3122 DoT stops working after "rndc reconfigure" when running named as non-root

Closes #3122

See merge request isc-projects/bind9!6087
2022-04-06 16:09:04 +00:00
Artem Boldariev
8bec4a6bf6 Extend the doth system test
This commit adds simple checks that the TLS contexts in question are
indeed being updated on DoT and DoH listeners.
2022-04-06 18:45:57 +03:00
Artem Boldariev
a100c1ff7c Update CHANGES [GL #3122]
Add an entry about reloading TLS certificates without destroying the
underlying TCP listening sockets.
2022-04-06 18:45:57 +03:00
Artem Boldariev
77b2db8246 Replace listener TLS contexts on reconfiguration
This commit makes use of isc_nmsocket_set_tlsctx(). Now, instead of
recreating TLS-enabled listeners (including the underlying TCP
listener sockets), only the TLS context in use is replaced.
2022-04-06 18:45:57 +03:00
Artem Boldariev
df317184eb Add isc_nmsocket_set_tlsctx()
This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function
that replaces the TLS context within a given TLS-enabled listener
socket object. It is based on the newly added reference counting
functionality.

This function makes it possible to replace a TLS context without
recreating the whole socket object, including the underlying TCP
listener socket, as a BIND process might not have sufficient
permissions to re-create it fully on reconfiguration.
2022-04-06 18:45:57 +03:00
Artem Boldariev
25609156a5 Maintain a per-thread TLS ctx reference in TLS stream code
This commit changes the generic TLS stream code to maintain a
per-worker thread TLS context reference.
2022-04-06 18:45:57 +03:00
Artem Boldariev
9256026d18 Use isc_tlsctx_attach() in TLS DNS code
This commit adds proper reference counting for TLS contexts into
generic TLS DNS (DoT) code.
2022-04-06 18:45:57 +03:00
Artem Boldariev
b52d46612f Use isc_tlsctx_attach() in TLS stream code
This commit adds proper reference counting for TLS contexts into
generic TLS stream code.
2022-04-06 18:45:57 +03:00
Artem Boldariev
a7a482c1b1 Add isc_tlsctx_attach()
The implementation is built on top of the reference counting
functionality found in OpenSSL/LibreSSL, which avoids having to wrap
the object.

Adding this function allows using reference counting for TLS contexts
in BIND 9's codebase.
2022-04-06 18:45:57 +03:00
Ondřej Surý
09dccf29b4 Merge branch '3249-rename-configuration-option-to-reuseport' into 'main'
Rename the configuration option to load balance sockets to reuseport

Closes #3249

See merge request isc-projects/bind9!6093
2022-04-06 15:23:16 +00:00
Ondřej Surý
7e71c4d0cc Rename the configuration option to load balance sockets to reuseport
After some back and forth, it was decided to match the configuration
option with unbound ("so-reuseport"), PowerDNS ("reuseport") and/or
nginx ("reuseport").
2022-04-06 17:03:57 +02:00
Mark Andrews
4216c72d13 Merge branch '3259-cid-351372-concurrent-data-access-violations-atomicity' into 'main'
Resolve "CID 351372:  Concurrent data access violations  (ATOMICITY)"

Closes #3259

See merge request isc-projects/bind9!6090
2022-04-06 07:53:59 +00:00
Mark Andrews
98718b3b4b Unlink the timer event before trying to purge it
As far as I can determine, the order of operations is not important.

    *** CID 351372:  Concurrent data access violations  (ATOMICITY)
    /lib/isc/timer.c: 227 in timer_purge()
    221     		LOCK(&timer->lock);
    222     		if (!purged) {
    223     			/*
    224     			 * The event has already been executed, but not
    225     			 * yet destroyed.
    226     			 */
    >>>     CID 351372:  Concurrent data access violations  (ATOMICITY)
    >>>     Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect.
    227     			timerevent_unlink(timer, event);
    228     		}
    229     	}
    230     }
    231
    232     void
2022-04-06 07:33:41 +00:00
Mark Andrews
6d94ac9f96 Merge branch '3258-cid-351370-cid-351371-after-adb-refactoring' into 'main'
Resolve "CID 351370 & CID 351371 after ADB refactoring"

Closes #3258

See merge request isc-projects/bind9!6089
2022-04-06 07:33:19 +00:00
Mark Andrews
ed1e480c53 Move lock to before label to prevent duplicate lock
*** CID 351370:  Program hangs  (LOCK)
    /lib/dns/adb.c: 2699 in dns_adb_cancelfind()
    2693
    2694     	LOCK(&nbucket->lock);
    2695     	ISC_LIST_UNLINK(adbname->finds, find, plink);
    2696     	UNLOCK(&nbucket->lock);
    2697
    2698     cleanup:
    >>>     CID 351370:  Program hangs  (LOCK)
    >>>     "pthread_mutex_lock" locks "find->lock" while it is locked.
    2699     	LOCK(&find->lock);
    2700     	if (!FIND_EVENTSENT(find)) {
    2701     		ev = &find->event;
    2702     		task = ev->ev_sender;
    2703     		ev->ev_sender = find;
    2704     		ev->ev_type = DNS_EVENT_ADBCANCELED;
2022-04-06 12:56:17 +10:00
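
The control flow behind ed1e480c53 can be sketched as follows. This is a simplified illustration, not the code from lib/dns/adb.c: toy_lock_t, toy_find_t and toy_cancelfind() are hypothetical, and the toy lock simply asserts on a double acquire instead of hanging the way a real mutex would.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock that traps double locking. */
typedef struct toy_lock {
	bool held;
} toy_lock_t;

static void
toy_lock_acquire(toy_lock_t *lock) {
	assert(!lock->held); /* a real mutex would deadlock here */
	lock->held = true;
}

static void
toy_lock_release(toy_lock_t *lock) {
	assert(lock->held);
	lock->held = false;
}

typedef struct toy_find {
	toy_lock_t lock;
	bool has_name;
	bool canceled;
} toy_find_t;

static void
toy_cancelfind(toy_find_t *find) {
	toy_lock_acquire(&find->lock);
	if (!find->has_name) {
		goto cleanup; /* early exit: find->lock is still held */
	}
	toy_lock_release(&find->lock);

	/* ... unlink the find from its name under the bucket lock ... */

	/*
	 * Re-acquire before the label, so that both paths reach the
	 * cleanup code holding the lock exactly once.
	 */
	toy_lock_acquire(&find->lock);
cleanup:
	find->canceled = true;
	toy_lock_release(&find->lock);
}
```

If the second toy_lock_acquire() call were placed after the cleanup label instead, the early-exit path would try to take a lock it already holds, which is the situation the Coverity report above describes.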
Mark Andrews
05e08a21d1 Remove unnecessary NULL test leading to REVERSE_INULL false positive
*** CID 351371:  Null pointer dereferences  (REVERSE_INULL)
    /lib/dns/adb.c: 2615 in dns_adb_createfind()
    2609     	/*
    2610     	 * Copy out error flags from the name structure into the find.
    2611     	 */
    2612     	find->result_v4 = find_err_map[adbname->fetch_err];
    2613     	find->result_v6 = find_err_map[adbname->fetch6_err];
    2614
    >>>     CID 351371:  Null pointer dereferences  (REVERSE_INULL)
    >>>     Null-checking "find" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
    2615     	if (find != NULL) {
    2616     		if (want_event) {
    2617     			INSIST((find->flags & DNS_ADBFIND_ADDRESSMASK) != 0);
    2618     			isc_task_attach(task, &(isc_task_t *){ NULL });
    2619     			find->event.ev_sender = task;
    2620     			find->event.ev_action = action;
2022-04-06 12:54:08 +10:00
Artem Boldariev
4d57ef0c49 Merge branch 'artem-fix-return-value-x509_store_up_ref' into 'main'
Change X509_STORE_up_ref() shim return value

See merge request isc-projects/bind9!6084
2022-04-05 12:51:00 +00:00
Artem Boldariev
f0ac4c47b0 Change X509_STORE_up_ref() shim return value
X509_STORE_up_ref() must return 1 on success, while the previous
implementation returned the reference count. This commit fixes
that.
2022-04-05 15:03:27 +03:00
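
The return-value convention fixed in f0ac4c47b0 can be illustrated with a toy reference-counted object. This is only a sketch: toy_store_t and both functions are hypothetical stand-ins, not the shim from the BIND 9 source tree, and the failure condition in the fixed variant is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted store object. */
typedef struct toy_store {
	atomic_int references;
} toy_store_t;

/*
 * Old shape: returns the new reference count. A caller that checks
 * "result == 1" treats every call after the first as a failure.
 */
static int
toy_store_up_ref_old(toy_store_t *store) {
	assert(store != NULL);
	return atomic_fetch_add(&store->references, 1) + 1;
}

/* Fixed shape: 1 on success, 0 on failure. */
static int
toy_store_up_ref(toy_store_t *store) {
	if (store == NULL) {
		return 0;
	}
	/* Success only if the object was still alive before the increment. */
	return (atomic_fetch_add(&store->references, 1) > 0) ? 1 : 0;
}
```

A caller written against the OpenSSL convention can then test the result with "!= 1" regardless of how many references the object already has.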
Aram Sargsyan
0130ff96d5 Merge branch '3244-dig-use-after-free' into 'main'
Resolve "use-after-free in dighost.c/dig.c"

Closes #3244

See merge request isc-projects/bind9!6052
2022-04-05 11:52:00 +00:00
Aram Sargsyan
ef9bd8533a Add CHANGES note for [GL #3244]
2022-04-05 11:21:11 +00:00
Aram Sargsyan
5b2b3e589c Fix using unset pointer when printing a debug message in dighost.c
The used `query->handle` is always `NULL` at this point.

Change the code to use `handle` instead.
2022-04-05 11:20:42 +00:00
Aram Sargsyan
2771a5b64d Add a missing clear_current_lookup() call in recv_done()
The error code path handling the `ISC_R_CANCELED` code lacks a
`clear_current_lookup()` call, without which dig hangs indefinitely
when handling the error.

Add the missing call to account for all references of the lookup so
it can be destroyed.
2022-04-05 11:20:42 +00:00
Aram Sargsyan
f831e758d1 When using +qr in dig print the data of the current query
In `send_udp()` and `launch_next_query()` functions, when calling
`dighost_printmessage()` to print detailed information about the
sent query, dig always prints the data of the first query in the
lookup's queries list.

The first query in the list may already be finished, with its handles
freed, and accessing this information results in an assertion failure.

Print the current query's information instead.
2022-04-05 11:20:41 +00:00
Michal Nowak
04e9b6060c Merge branch '3158-only-set-foundname-on-success-test' into 'main'
[CVE-2022-0635] Add regression test

Closes #3158

See merge request isc-projects/bind9!6060
2022-04-05 09:02:09 +00:00
Mark Andrews
56fbed2f0f Add regression test for CVE-2022-0635
2022-04-05 09:54:45 +02:00
Mark Andrews
ed9a4d9d71 Merge branch '3220-digdelv-test-uses-address-outside-of-our-control' into 'main'
Handle "network unreachable" error messages in digdelv system test

See merge request isc-projects/bind9!6010
2022-04-05 04:22:11 +00:00
Mark Andrews
9ef4d2b583 Use multiple fixed expressions for portable grep usage
Additionally add "network unreachable" as an expected error message.
2022-04-05 03:55:13 +00:00
Ondřej Surý
4bbc245e7e Merge branch 'ondrej-dont-use-shutdown-function-name' into 'main'
Rename shutdown() to test_shutdown() in timer_test.c

See merge request isc-projects/bind9!6078
2022-04-04 23:54:13 +00:00
Ondřej Surý
7868d8145b Rename shutdown() to test_shutdown() in timer_test.c
shutdown() is part of the standard library (POSIX.1), so don't use that
name in timer_test.c; rename the function to test_shutdown() instead.
2022-04-05 01:49:04 +02:00
Ondřej Surý
141da70898 Merge branch '3249-add-configuration-option-to-disable-SO_REUSEPORT_LB-fix' into 'main'
Enable the load-balance-sockets configuration

Closes #3249

See merge request isc-projects/bind9!6076
2022-04-04 23:37:32 +00:00
Ondřej Surý
142c63dda8 Enable the load-balance-sockets configuration
Previously, HAVE_SO_REUSEPORT_LB was defined only in the private
netmgr-int.h header file, making the configuration of load balanced
sockets inoperable.

Move the missing HAVE_SO_REUSEPORT_LB define to the isc/netmgr.h header
and add the missing isc_nm_getloadbalancesockets() implementation.
2022-04-05 01:30:58 +02:00
Ondřej Surý
9217f1e200 Merge branch '3249-add-configuration-option-to-disable-SO_REUSEPORT_LB' into 'main'
Add option to configure load balance sockets

Closes #3249

See merge request isc-projects/bind9!6059
2022-04-04 21:37:54 +00:00
Ondřej Surý
855f49cfba Add CHANGES and release note for [GL #3249]
2022-04-04 23:10:04 +02:00
Ondřej Surý
85c6e797aa Add option to configure load balance sockets
Previously, the option to enable kernel load balancing of the sockets
was always enabled when supported by the operating system (SO_REUSEPORT
on Linux and SO_REUSEPORT_LB on FreeBSD).

It was reported that in scenarios where the networking threads are also
responsible for processing long-running tasks (like RPZ processing, CATZ
processing or large zone transfers), this could lead to intermittent
brownouts for some clients, because the thread assigned by the operating
system might be busy.  In such scenarios, the overall performance would
be better served by threads competing over the sockets because the idle
threads can pick up the incoming traffic.

Add a new configuration option (`load-balance-sockets`) to allow enabling
or disabling the load balancing of the sockets.
2022-04-04 23:10:04 +02:00
Ondřej Surý
38f8716b1c Merge branch '3182-placeholder' into 'main'
Add placeholder for [GL #3182]

Closes #3182

See merge request isc-projects/bind9!6071
2022-04-04 19:47:27 +00:00
Ondřej Surý
910c6b9cef Add placeholder for [GL #3182]
2022-04-04 21:45:09 +02:00
Ondřej Surý
59f04a5d09 Merge branch '3190-offload-rpz-updates' into 'main'
Run the RPZ update as offloaded work

Closes #3190

See merge request isc-projects/bind9!5938
2022-04-04 19:44:15 +00:00
Ondřej Surý
23a4559b34 Add CHANGES and release note for [GL #3190]
2022-04-04 21:20:05 +02:00
Ondřej Surý
f106d0ed2b Run the RPZ update as offloaded work
Previously, the RPZ updates ran quantized on the main nm_worker loops.
As the quantum was set to 1024, this might lead to service
interruptions when a large RPZ update was processed.

Change the RPZ update process to run as offloaded work.  The update
and cleanup loops were refactored to do as little locking of the
maintenance lock as possible, for the shortest periods of time, and the
db iterator is paused on every iteration, so we don't hold the rbtdb
tree lock for prolonged periods of time.
2022-04-04 21:20:05 +02:00
Ondřej Surý
b6e885c97f Refactor the dns_rpz_add/delete to use local rpz copy
Previously, dns_rpz_add() was passed a dns_rpz_zones_t and an index into
the .zones array.  Because we actually attach to the dns_rpz_zone_t, we
should be using the local pointer instead of passing the index and
"finding" the dns_rpz_zone_t again.

Additionally, dns_rpz_add() and dns_rpz_delete() were used only inside
rpz.c, so make them static.
2022-04-04 21:20:05 +02:00
Ondřej Surý
840179a247 General cleanup of dns_rpz implementation
Do a general cleanup of lib/dns/rpz.c style:

 * Removed deprecated and unused functions
 * Unified dns_rpz_zone_t naming to rpz
 * Unified dns_rpz_zones_t naming to rpzs
 * Add and use rpz_attach() and rpz_attach_rpzs() functions
 * Shuffled variables to be more local (cppcheck cleanup)
2022-04-04 21:19:48 +02:00
Ondřej Surý
cadd1a0ab3 Merge branch '3229-remove-exclusive-mode-from-ns_interfacemgr' into 'main'
Remove exclusive mode from ns_interfacemgr

Closes #3229

See merge request isc-projects/bind9!6023
2022-04-04 19:16:50 +00:00
Ondřej Surý
70e58897c7 Add CHANGES note for [GL #3229]
2022-04-04 19:27:18 +02:00
Ondřej Surý
c0995bc380 Remove exclusive mode from ns_interfacemgr
Now that the dns_aclenv_t has properly rwlocked .localhost and
.localnets members, we can remove the task exclusive mode use from the
ns_interfacemgr.  Some light related cleanup has also been done.
2022-04-04 19:27:00 +02:00
Ondřej Surý
8138a595d9 Add isc_rwlock around dns_aclenv .localhost and .localnets members
In order to modify the .localhost and .localnets members of the
dns_aclenv, all other processing on the netmgr loops needed to be
stopped using the task exclusive mode.  Add the isc_rwlock to the
dns_aclenv, so any modifications to the .localhost and .localnets can be
done under the write lock.
2022-04-04 19:27:00 +02:00
Aram Sargsyan
bd9707464c Merge branch '3248-dig-stuck-using-a-server-with-a-mapped-ip-address' into 'main'
Fix dig hanging issue in cases when the lookup's next query can't start

Closes #3248

See merge request isc-projects/bind9!6061
2022-04-04 09:37:40 +00:00
Aram Sargsyan
438e9b5587 Add CHANGES note for [GL #3248]
2022-04-04 09:16:15 +00:00
Aram Sargsyan
7e2f50c369 Fix dig hanging issue in cases when the lookup's next query can't start
In recv_done(), when dig decides to start the lookup's next query in
the line using `start_udp()` or `start_tcp()`, and for some reason,
no queries get started, dig doesn't cancel the lookup.

This can occur, for example, when there are two queries in the lookup,
one with a regular IP address, and another with an IPv4-mapped IPv6
address. When the regular IP address fails to serve the query, its
`recv_done()` callback starts the next query in the line (in this
case the one with a mapped IP address), but because `dig` doesn't
connect to such IP addresses, and there are no other queries in the
list, no new queries are started, and the lookup keeps hanging.

After calling `start_udp()` or `start_tcp()` in `recv_done()`, check
whether any pending or working queries remain; if there are none,
cancel the lookup instead of only detaching from the current query.
2022-04-04 09:15:56 +00:00
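
The check added by 7e2f50c369 can be sketched with toy data structures. None of this is the dighost.c code: toy_query_t, toy_lookup_t, toy_start_next() and toy_recv_failed() are hypothetical, and a single "connectable" flag stands in for the rule that dig does not connect to IPv4-mapped IPv6 addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct toy_query {
	bool connectable; /* false for, e.g., a mapped address */
	bool started;
	struct toy_query *next;
} toy_query_t;

typedef struct toy_lookup {
	toy_query_t *queries;
	size_t pending; /* queries currently in flight */
	bool canceled;
} toy_lookup_t;

/* Start the first usable query at or after "query"; may start none. */
static void
toy_start_next(toy_lookup_t *lookup, toy_query_t *query) {
	for (; query != NULL; query = query->next) {
		if (query->connectable) {
			query->started = true;
			lookup->pending++;
			return;
		}
	}
}

/* Called when the current query has failed to get an answer. */
static void
toy_recv_failed(toy_lookup_t *lookup, toy_query_t *current) {
	assert(lookup->pending > 0);
	lookup->pending--; /* the current query is finished */
	toy_start_next(lookup, current->next);
	if (lookup->pending == 0) {
		/* Nothing is left to wait for: cancel instead of hanging. */
		lookup->canceled = true;
	}
}
```

Without the final check, a lookup whose remaining queries are all unusable would be left with no query in flight and nothing that could ever complete it.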