Commit Graph

35878 Commits

Author SHA1 Message Date
Artem Boldariev
3a75b33287 Add isc_nmsocket_set_tlsctx()
This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function
that replaces the TLS context within a given TLS-enabled listener
socket object. It is based on the newly added reference counting
functionality.

The intention of adding this function is to add functionality to
replace a TLS context without recreating the whole socket object,
including the underlying TCP listener socket, as a BIND process might
not have enough permissions to re-create it fully on reconfiguration.
2022-04-27 23:58:38 +03:00
Artem Boldariev
f52c06054b Maintain a per-thread TLS ctx reference in TLS stream code
This commit changes the generic TLS stream code to maintain a
per-worker thread TLS context reference.
2022-04-27 23:58:38 +03:00
Artem Boldariev
28460151ca Use isc_tlsctx_attach() in TLS DNS code
This commit adds proper reference counting for TLS contexts into
generic TLS DNS (DoT) code.
2022-04-27 23:58:38 +03:00
Artem Boldariev
ff987957e7 Use isc_tlsctx_attach() in TLS stream code
This commit adds proper reference counting for TLS contexts into
generic TLS stream code.
2022-04-27 23:58:38 +03:00
Artem Boldariev
677819d22d Add isc_tlsctx_attach()
The implementation is done on top of the reference counting
functionality found in OpenSSL/LibreSSL, which allows for avoiding
wrapping the object.

Adding this function allows using reference counting for TLS contexts
in BIND 9's codebase.
2022-04-27 23:58:38 +03:00
Arаm Sаrgsyаn
95d1e9ee62 Merge branch '3300-dispatch-udp_recv-handle-deactivated-resp-returning-success-v9_18' into 'v9_18'
[v9_18] Handle ISC_R_SUCCESS on a deactivated response in udp_recv()

See merge request isc-projects/bind9!6200
2022-04-27 19:07:45 +00:00
Aram Sargsyan
e1aca8d575 Add CHANGES note for [GL #3300]
(cherry picked from commit bbdd139e20)
2022-04-27 18:08:42 +00:00
Aram Sargsyan
4de1f65e4d Handle ISC_R_SUCCESS on a deactivated response in udp_recv()
There is a possibility for `udp_recv()` to be called with `eresult`
being `ISC_R_SUCCESS`, but nevertheless with already deactivated `resp`,
which can happen when the request has been canceled in the meantime.

(cherry picked from commit e3a88862c0)
2022-04-27 18:08:42 +00:00
Artem Boldariev
f46e46e730 Merge branch '3274-fix-test-server-for-solaris-backport-v9_18' into 'v9_18'
Rename yield() to the test_server_yield() (backport to 9.18)

See merge request isc-projects/bind9!6201
2022-04-27 17:23:22 +00:00
Artem Boldariev
f83d128ece Rename yield() to the test_server_yield()
This commit ensures that the test_server binary builds on Solaris,
which already declares yield() in 'unistd.h'.
2022-04-27 20:13:24 +03:00
Artem Boldariev
65d929e0d3 Merge branch '3271-tlsdns-call-write-callbacks-after-send-backport-v9_18' into 'v9_18'
TLSDNS: call send callbacks only after the data was sent (backport to 9.18)

See merge request isc-projects/bind9!6198
2022-04-27 17:06:52 +00:00
Artem Boldariev
8b19f62ac5 TLSDNS: call send callbacks only after the data was sent
This commit ensures that write callbacks are getting called only after
the data has been sent via the network.

Without this fix, a write callback could be called before the
encrypted data had actually been sent to the network: it was called
right after the data had been passed to OpenSSL (i.e. encrypted).

Most likely, the issue did not reveal itself often because the
callback call was asynchronous, so in most cases it was called after
the data had been sent, but that was not guaranteed by the code
logic.

Also, this commit removes one memory allocation (netievent) from a hot
path, as there is no need to call this callback asynchronously
anymore.
2022-04-27 17:57:11 +03:00
Petr Špaček
66080a6d91 Merge branch 'pspacek/pin-sphinx-packages-for-rtd-v9_18' into 'v9_18'
Pin Sphinx related package versions to match ReadTheDocs and our CI [v9_18]

See merge request isc-projects/bind9!6192
2022-04-27 12:35:36 +00:00
Petr Špaček
77873b1a5a Pin Sphinx related package versions to match ReadTheDocs and our CI
This seems to be the most appropriate way to ensure consistency between
release tarballs and the public presentation on ReadTheDocs.

The previous attempt, which removed the docutils constraint and relied on
the pip dependency solver to pick the same packages as in CI, was flawed.
RTD installs a slightly different set of packages, so it was inherently
unreliable.

As a result, RTD pulled in sphinx-rtd-theme==0.4.3 while CI
had 1.0.0, and this inconsistency caused the Table of Contents in the
Release Notes to render incorrectly. The previous solution was to
downgrade docutils to < 0.17, but I think we should rather pin exact
versions.

For the long history of messing with versions read also
isc-projects/bind9@2a8eda0084
isc-projects/images@d4435b97be
isc-projects/bind9@6a2daddf5b

(cherry picked from commit 6088ba3837)
2022-04-27 14:34:56 +02:00
Ondřej Surý
9fd2c7e66a Merge branch 'ondrej-fix-route_recv-use-after-free-v9_18' into 'v9_18'
The route socket and its storage was detached while still reading [v9_18]

See merge request isc-projects/bind9!6181
2022-04-26 14:41:39 +00:00
Ondřej Surý
192df8d2f1 The route socket and its storage was detached while still reading
The interfacemgr and the .route were being detached while the network
manager still had a pending read from the socket.  Instead of detaching
from the socket, we need to cancel the read, which in turn will detach
the route socket and the associated interfacemgr.

(cherry picked from commit 9ae34a04e8)
2022-04-26 16:41:24 +02:00
Ondřej Surý
0cdb2f497a Merge branch '3230-remove-task-exclusive-mode-from-ns_clientmgr-v9_18' into 'v9_18'
Remove task exclusive mode from ns_clientmgr [v9.18]

See merge request isc-projects/bind9!6187
2022-04-26 14:40:42 +00:00
Ondřej Surý
4520ecc471 Add CHANGES note for [GL #3230]
(cherry picked from commit a243860562)
2022-04-26 15:57:03 +02:00
Ondřej Surý
8beaee0b08 Remove task exclusive mode from ns_clientmgr
The .lock, .exiting and .excl members were not used for anything other
than starting task exclusive mode, setting .exiting to true and ending
exclusive mode.

Remove all the stray members and dead code, eliminating the use of task
exclusive mode from ns_clientmgr.

(cherry picked from commit 4f74e1010e)
2022-04-26 15:56:30 +02:00
Ondřej Surý
bc36f3e723 Merge branch '3299-fix-AX_PROG_CC_FOR_BUILD-macro-v9_18' into 'v9_18'
Fix the cached value of ac_cv_c_compiler_gnu [v9.18]

See merge request isc-projects/bind9!6185
2022-04-26 13:49:31 +00:00
Ondřej Surý
1bcd20d4bb Fix the cached value of ac_cv_c_compiler_gnu
There was an error in AX_PROG_CC_FOR_BUILD macro that cached literal
name of the cache variable `saved_ac_cv_c_compiler_gnu` instead of the
value of said variable breaking the consecutive runs of ./configure
script with caching enabled.

(cherry picked from commit 4a9f899b5c)
2022-04-26 15:49:16 +02:00
Petr Špaček
d5fd2a53ef Merge branch 'pspacek/rtd-requirements-update-v9_18' into 'v9_18'
Fix mismatch between docutils version in CI and ReadTheDocs [v9_18]

See merge request isc-projects/bind9!6184
2022-04-26 13:48:24 +00:00
Petr Špaček
243cd069fc Fix mismatch between docutils version in CI and ReadTheDocs
Currently, the CI images we use to build docs (which subsequently get
into release tarballs) use docutils 0.17.1, which is the latest version
that fulfills the Sphinx 4.5.0 requirement for docutils < 0.18.

The old requirement for docutils < 0.17 was causing a discrepancy between
the way we build release artifacts and the docs on ReadTheDocs.org, which
uses doc/arm/requirements.txt from our repo.

Remove the limit for RTD in the hope that it will pull the latest
permissible version of docutils.

For the long history of messing with docutils version read also
isc-projects/images@d4435b97be
isc-projects/bind9@6a2daddf5b

(cherry picked from commit 2a8eda0084)
2022-04-26 15:46:55 +02:00
Ondřej Surý
77fde4a112 Merge branch '3229-remove-exclusive-mode-from-ns_interfacemgr-v9_18' into 'v9_18'
Remove exclusive mode from ns_interfacemgr [v9.18]

See merge request isc-projects/bind9!6179
2022-04-26 12:22:19 +00:00
Ondřej Surý
95a55d0968 Add CHANGES note for [GL #3229]
(cherry picked from commit 70e58897c7)
2022-04-26 14:21:57 +02:00
Ondřej Surý
ce8ffdda69 Remove exclusive mode from ns_interfacemgr
Now that dns_aclenv_t has properly rwlocked .localhost and .localnets
members, we can remove the task exclusive mode use from
ns_interfacemgr.  Some light related cleanup has also been done.

(cherry picked from commit c0995bc380)
2022-04-26 14:21:57 +02:00
Ondřej Surý
ab528a0fcb Add isc_rwlock around dns_aclenv .localhost and .localnets members
In order to modify the .localhost and .localnets members of the
dns_aclenv, all other processing on the netmgr loops needed to be
stopped using the task exclusive mode.  Add the isc_rwlock to the
dns_aclenv, so any modifications to the .localhost and .localnets can be
done under the write lock.

(cherry picked from commit 8138a595d9)
2022-04-26 14:21:57 +02:00
Petr Špaček
a180f66b06 Merge branch '3301-support-sphinx-149-v9_18' into 'v9_18'
Split negative and positive dig/mdig/delv options to support Sphinx 1.4.9 [v9_18]

See merge request isc-projects/bind9!6180
2022-04-26 12:18:38 +00:00
Petr Špaček
a84871ccca Add hyperlinks to dig/mdig/delv +options
(cherry picked from commit ac0c2378ca)
2022-04-26 14:06:33 +02:00
Petr Špaček
4c21534009 Split negative and positive dig/mdig/delv options to support Sphinx 1.4.9
Man pages for dig/mdig/delv used `.. option:: +[no]bla` to describe two
options at once, and very old Sphinx does not support the [] in option
names.

The solution is to split negative and positive options into the
`+bla, +nobla` form. In the end it improves readability because it
transforms hard-to-read strings with double brackets from
`+[no]subnet=addr[/prefix-length]` to
`+subnet=addr[/prefix-length], +nosubnet`.

As a side-effect it also allows easier linking to dig/mdig/delv options
using their name directly instead of always overriding the link target
to `+[no]bla` form.

The transformation was done using the regex:
    s/:: +\[no\]\(.*\)/:: +\1, +no\1/
... and manual review around occurrences matching the regex
    +no.*=
Fixes: #3301
(cherry picked from commit 0342dddce7)
2022-04-26 14:00:38 +02:00
Ondřej Surý
d751514215 Merge branch 'ondrej-enforce-minimal-libuv-version-v9_18' into 'v9_18'
Abort when libuv at runtime mismatches libuv at compile time [v9.18]

See merge request isc-projects/bind9!6177
2022-04-26 10:12:08 +00:00
Ondřej Surý
2a648b9078 Abort when libuv at runtime mismatches libuv at compile time
When we compile with a libuv that provides some capabilities via flags
passed to e.g. uv_udp_listen() or uv_udp_bind(), a call with such flags
fails with invalid arguments if an older libuv version that does not
understand the flag is linked at runtime.

Enforce a minimal libuv version when flags were available at compile
time but are not available at runtime.  This check is less strict than
requiring the runtime libuv version to be the same as or higher than
the compile-time libuv version.
2022-04-26 12:11:51 +02:00
Petr Špaček
b3e1c9060b Merge branch '3295-support-sphinx-185-v9_18' into 'v9_18'
Use unique program + option names for link anchors to support Sphinx 1.8.5 [v9_18]

See merge request isc-projects/bind9!6170
2022-04-26 10:07:32 +00:00
Petr Špaček
355aebc6df Use unique program + option names for link anchors to support Sphinx 1.8.5
The Sphinx "standard domain" provides directive types ".. program::" and
".. option::" to create a link anchor for a program name + option combination.
These can be referenced using :ref:`program option` syntax.

The problem is that Sphinx 1.8.5 (e.g. in Ubuntu 18.04) generates
conflicting link targets if a page contains two option directives
starting with the same word, e.g.:

.. program:: dnssec-settime
.. option:: -P date
.. option:: -P ds date

The reason is that the option directive consumes only the first word as
the "option name" (-P) and all the rest is considered the "option
argument" (date, ds date). Newer versions of Sphinx (e.g. 4.5.0) handle
this by creating numbered link anchors, but older versions warn and the
BIND build system turns the warning into a hard error.
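
The collision described above can be pictured with a toy model. The sketch below is not Sphinx code; it only assumes, as the message states, that the first word of an option signature is treated as the option name, and the target format `"<program> <name>"` is an illustrative simplification.

```python
from collections import defaultdict


def option_name(signature):
    """Return the first word of an option signature, e.g. '-P' for '-P ds date'."""
    return signature.split()[0]


def find_conflicts(program, signatures):
    """Group option signatures that would share one (program, name) target."""
    targets = defaultdict(list)
    for sig in signatures:
        targets[f"{program} {option_name(sig)}"].append(sig)
    # Only targets claimed by more than one directive are conflicts.
    return {t: sigs for t, sigs in targets.items() if len(sigs) > 1}


conflicts = find_conflicts("dnssec-settime", ["-P date", "-P ds date", "-A date"])
for target, sigs in conflicts.items():
    print(f"duplicate target {target!r}: {sigs}")
```

In this model both `-P date` and `-P ds date` map to the same target, while `-A date` does not conflict with anything.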

To handle that, we use the method recommended by the Sphinx maintainer:
https://github.com/sphinx-doc/sphinx/issues/10218#issuecomment-1059925508
As a bonus it provides more accurate link anchors for sub-options.

Alternatives considered:
- Replacing standard domain definition of .. option - causes more
  problems, see BIND issue #3294.
- Removing hyperlinks for options - that would be a step back.

Fixes: #3295
(cherry picked from commit bbb24264bb)
2022-04-25 14:46:37 +02:00
Ondřej Surý
08feb4c23e Merge branch 'ondrej-use-correct-task-for-resume_dslookup-v9_18' into 'v9_18'
Run resume_dslookup() from the correct task [v9.18]

See merge request isc-projects/bind9!6164
2022-04-22 14:57:40 +00:00
Ondřej Surý
7e72c55ff9 Run resume_dslookup() from the correct task
The rctx_chaseds() function calls dns_resolver_createfetch(), passing
fctx->task as the target task to run resume_dslookup() from.  This
breaks task-based serialization of events as fctx->task is the task that
the dns_resolver_createfetch() caller wants to receive its fetch
completion event in; meanwhile, intermediate fetches started by the
resolver itself (e.g. related to QNAME minimization) must use
res->buckets[bucketnum].task instead.  This discrepancy may cause
trouble if the resume_dslookup() callback happens to be run concurrently
with e.g. fctx_doshutdown().

Fix by passing the correct task to dns_resolver_createfetch() in
rctx_chaseds().

(cherry picked from commit 741a7096fc)
2022-04-22 15:57:22 +02:00
Michał Kępień
aa3a3e7cda Merge branch 'michal/fix-loading-plugins-using-just-their-filenames-v9_18' into 'v9_18'
[v9_18] Fix loading plugins using just their filenames

See merge request isc-projects/bind9!6162
2022-04-22 11:37:07 +00:00
Michał Kępień
4ac4640c40 Fix loading plugins using just their filenames
BIND 9 plugins are installed using Automake's pkglib_LTLIBRARIES stanza,
which causes the relevant shared objects to be placed in the
$(libdir)/@PACKAGE@/ directory, where @PACKAGE@ is expanded to the
lowercase form of the first argument passed to AC_INIT(), i.e. "bind".
Meanwhile, NAMED_PLUGINDIR - the preprocessor macro that the
ns_plugin_expandpath() function uses for determining the absolute path
to a plugin for which only a filename has been provided (rather than a
path) - is set to $(libdir)/named.  This discrepancy breaks loading
plugins using just their filenames.  Fix the issue (and also prevent it
from reoccurring) by setting NAMED_PLUGINDIR to $(pkglibdir).
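
The mismatch can be sketched as follows. This is a simplified model of the lookup described above, not the C implementation of ns_plugin_expandpath(); the function name `expand_plugin_path`, the `/usr/lib` libdir and the plugin filename are assumptions made for illustration.

```python
import posixpath


def expand_plugin_path(name, plugin_dir):
    """Model of the lookup: a bare filename is resolved against plugin_dir,
    anything containing a '/' is treated as a path and used as given."""
    if "/" in name:
        return name
    return posixpath.join(plugin_dir, name)


LIBDIR = "/usr/lib"                                # assumed $(libdir)
INSTALL_DIR = posixpath.join(LIBDIR, "bind")       # $(libdir)/@PACKAGE@/
OLD_PLUGINDIR = posixpath.join(LIBDIR, "named")    # previous NAMED_PLUGINDIR
NEW_PLUGINDIR = INSTALL_DIR                        # $(pkglibdir) after the fix

installed = posixpath.join(INSTALL_DIR, "filter-aaaa.so")

# Before the fix the bare filename resolves to a directory nothing is
# installed in; after the fix it resolves to the installed file.
print(expand_plugin_path("filter-aaaa.so", OLD_PLUGINDIR) == installed)
print(expand_plugin_path("filter-aaaa.so", NEW_PLUGINDIR) == installed)
```

Deriving the lookup directory from the same variable that drives installation is what keeps the two from drifting apart again.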

(cherry picked from commit 5065c4686e)
2022-04-22 13:29:10 +02:00
Michał Kępień
83eaff2851 Merge branch 'michal/regenerate-man-pages-with-sphinx-4.5.0-v9_18' into 'v9_18'
[v9_18] Regenerate man pages with Sphinx 4.5.0

See merge request isc-projects/bind9!6160
2022-04-22 11:21:11 +00:00
Michał Kępień
c1ba7c685d Regenerate man pages with Sphinx 4.5.0
The Debian 11 (bullseye) Docker image, which GitLab CI uses for building
documentation, currently contains the following package versions:

  - Sphinx 4.5.0
  - sphinx-rtd-theme 1.0.0
  - docutils 0.17.1

Regenerate the man pages to match contents produced in a Sphinx
environment using the above package versions.  This is necessary to
prevent the "docs" GitLab CI job from failing.

(cherry picked from commit e80ce6cfe2)
2022-04-22 13:11:35 +02:00
Michał Kępień
ccea861632 Merge branch '3297-fix-a-pylint-2.13.7-error-v9_18' into 'v9_18'
[v9_18] Fix a PyLint 2.13.7 error

See merge request isc-projects/bind9!6151
2022-04-22 10:34:57 +00:00
Michał Kępień
fd1f39fe59 Fix a PyLint 2.13.7 error
PyLint 2.13.7 reports the following error:

    bin/tests/system/doth/conftest.py:34:28: E0601: Using variable 'stderr' before assignment (used-before-assignment)

The reason the current code has not caused problems before is that
invoking gnutls-cli with just the --logfile=/dev/null argument causes it
to always return with a non-zero exit code, either due to the option not
being supported or due to the hostname argument not being provided.  In
other words, the 'except' branch has always been taken.  PyLint is
obviously right on a syntactical level, though.

Instead of relying on a less than obvious code flow (where the 'except'
branch is always taken), rework the flagged code by employing
subprocess.run(..., check=False) instead of subprocess.check_output(),
making exception handling redundant.

While this issue was investigated, it was also noticed that
subprocess.check_output() was incorrectly used as a context manager:
Popen objects are context managers, but subprocess.check_output() and
subprocess.run() are not.  Fix by dropping the relevant 'with'
statement.
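
The difference between the two calls can be seen in a small standalone example. This is a sketch, not the code from conftest.py: the child command is a stand-in that always exits with a non-zero status, where the real test invokes gnutls-cli.

```python
import subprocess
import sys

# Stand-in for a command that always fails and writes to stderr.
cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('unsupported'); sys.exit(1)",
]

# check_output() raises CalledProcessError on a non-zero exit status, so
# the output is only reachable through the exception object.
try:
    subprocess.check_output(cmd, stderr=subprocess.PIPE)
    raised = False
except subprocess.CalledProcessError as exc:
    raised = True
    stderr_from_exception = exc.stderr

# run(..., check=False) returns a CompletedProcess regardless of the exit
# status, so no exception handling is needed. Neither call is a context
# manager, hence no 'with' statement.
result = subprocess.run(
    cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=False
)
print(raised, result.returncode, result.stderr)
```

With `subprocess.run()` the variable holding stderr is always assigned, which is the property PyLint could not establish for the original code.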

(cherry picked from commit 3f5318f094)
2022-04-22 12:14:50 +02:00
Michał Kępień
f8d17c6263 Fix "digdelv" system test requirements
Commit f64cd23e7b added a Python-based
name server (bin/tests/system/digdelv/ans8/ans.py) to the "digdelv"
system test, but did not update bin/tests/system/Makefile.am to ensure
Python is present in the test environment before the "digdelv" system
test is run.  Update bin/tests/system/Makefile.am to enforce that
requirement.

(cherry picked from commit aaa0223752)
2022-04-22 12:14:50 +02:00
Michał Kępień
1735e589d1 Require Python 3.6+ for running Python-based tests
configure.ac currently requires Python 3.4 for running Python-based
system tests.  Meanwhile, there are some features in Python 3.6+ that we
would like to use for making our Python code cleaner (e.g. f-strings).
Update the minimum Python version required for running Python-based
system tests to 3.6, noting that:

  - Python 3.4 reached end-of-life on March 18th, 2019.
  - Python 3.5 reached end-of-life on September 5th, 2020.
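
For illustration, a version gate of this kind and the f-string syntax it unlocks might look like the sketch below. The real check lives in configure.ac; the names `MINIMUM_PYTHON`, `python_is_recent_enough` and `describe` are hypothetical.

```python
import sys

MINIMUM_PYTHON = (3, 6)


def python_is_recent_enough(version_info=sys.version_info):
    """Return True if the interpreter satisfies the 3.6+ requirement."""
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


def describe(server, port):
    # f-strings are only available in Python 3.6 and newer.
    return f"{server} listening on port {port}"


print(python_is_recent_enough((3, 4, 0)))
print(python_is_recent_enough((3, 6, 0)))
print(describe("ans8", 5300))
```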

(cherry picked from commit beaaa7f4e2)
2022-04-22 12:14:50 +02:00
Michał Kępień
0f59e1e270 Merge branch '3287-prevent-memory-bloat-caused-by-a-jemalloc-quirk-v9_18' into 'v9_18'
[v9_18] Prevent memory bloat caused by a jemalloc quirk

See merge request isc-projects/bind9!6153
2022-04-21 12:42:06 +00:00
Michał Kępień
5e4855a25d Add CHANGES entry for GL #3287
(cherry picked from commit e33aef4e39)
2022-04-21 14:22:13 +02:00
Michał Kępień
2da371d005 Prevent memory bloat caused by a jemalloc quirk
Since version 5.0.0, decay-based purging is the only available dirty
page cleanup mechanism in jemalloc.  It relies on so-called tickers,
which are simple data structures used for ensuring that certain actions
are taken "once every N times".  Ticker data (state) is stored in a
thread-specific data structure called tsd in jemalloc parlance.  Ticks
are triggered when extents are allocated and deallocated.  Once every
1000 ticks, jemalloc attempts to release some of the dirty pages hanging
around (if any).  This allows memory use to be kept in check over time.

This dirty page cleanup mechanism has a quirk.  If the first
allocator-related action for a given thread is a free(), a
minimally-initialized tsd is set up which does not include ticker data.
When that thread subsequently calls *alloc(), the tsd transitions to its
nominal state, but due to a certain flag being set during minimal tsd
initialization, ticker data remains unallocated.  This prevents
decay-based dirty page purging from working, effectively enabling memory
exhaustion over time. [1]
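
A toy model may help picture the mechanism and the quirk. The code below is not jemalloc's implementation; the `Ticker` class and `simulate` function are hypothetical and only mirror the behaviour described above: an action fires once every N ticks, and a thread whose ticker data was never allocated never fires it.

```python
class Ticker:
    """Fire an action once every `nticks` calls to tick()."""

    def __init__(self, nticks):
        self.nticks = nticks
        self.remaining = nticks

    def tick(self):
        """Count one event; return True on every `nticks`-th call."""
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self.nticks
            return True
        return False


def simulate(events, ticker):
    """Count purge attempts for a thread performing `events` allocator events.

    `ticker` is None for a thread whose ticker data was never allocated,
    which models the minimally-initialized state described above.
    """
    purges = 0
    for _ in range(events):
        if ticker is not None and ticker.tick():
            purges += 1
    return purges


print(simulate(3500, Ticker(1000)))  # thread that started with an allocation
print(simulate(3500, None))          # thread that started with free()
```

In this model the second thread never attempts a purge no matter how many events it generates, which corresponds to the unbounded growth of dirty pages.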

The quirk described above has been addressed (by moving ticker state to
a different structure) in jemalloc's development branch [2], but not in
any numbered jemalloc version released to date (the latest one being
5.2.1 as of this writing).

Work around the problem by ensuring that every thread spawned by
isc_thread_create() starts with a malloc() call.  Avoid immediately
calling free() for the dummy allocation to prevent an optimizing
compiler from stripping away the malloc() + free() pair altogether.

An alternative implementation of this workaround was considered that
used a pair of isc_mem_create() + isc_mem_destroy() calls instead of
malloc() + free(), enabling the change to be fully contained within
isc__trampoline_run() (i.e. to not touch struct isc__trampoline), as the
compiler is not allowed to strip away arbitrary function calls.
However, that solution was eventually dismissed as it triggered
ThreadSanitizer reports when tools like dig, nsupdate, or rndc exited
abruptly without waiting for all worker threads to finish their work.

[1] https://github.com/jemalloc/jemalloc/issues/2251
[2] c259323ab3

(cherry picked from commit 7aa7b6474b)
2022-04-21 14:22:13 +02:00
Michał Kępień
0deec48487 Merge tag 'v9_18_2' into v9_18
BIND 9.18.2
2022-04-21 09:44:56 +02:00
Tony Finch
05c88b18da Merge branch '3275-notify-test-fix-v9_18' into 'v9_18'
Avoid timeouts in the notify system test (backport to 9.18)

See merge request isc-projects/bind9!6143
2022-04-20 17:12:36 +00:00
Tony Finch
037223211c Use wait_for_log_re in the autosign system test
Fix another occurrence of the mistake of passing a regex to
wait_for_log by using the new wait_for_log_re instead.
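
The mistake is easy to reproduce in miniature. The sketch below is not the shell helpers themselves; it assumes, as the message implies, that wait_for_log looks for a fixed string while wait_for_log_re matches a regular expression. The function names and the log line are made up for illustration.

```python
import re


def log_contains(text, needle):
    """Fixed-string search: the needle is taken literally."""
    return needle in text


def log_matches(text, pattern):
    """Regex search: the pattern is interpreted as a regular expression."""
    return re.search(pattern, text) is not None


log = "zone example/IN: sending notifies (serial 2)\n"
pattern = r"sending notifies \(serial [0-9]+\)"

# Passed to the fixed-string helper, the regex is searched for verbatim
# and never found, so a wait loop built on it would run until it times out.
print(log_contains(log, pattern))
print(log_matches(log, pattern))
```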

(cherry picked from commit f4c2909353)
2022-04-20 17:51:40 +01:00