Commit Graph

36885 Commits

Author SHA1 Message Date
Matthijs Mekking
2655ee4902 Add release note and change entry for [GL #3627]
(cherry picked from commit 5585256bf6)
2022-10-27 12:18:27 +02:00
Matthijs Mekking
9a05448f13 Fix config bug related to port setting
There are three levels for the port value, with increasing
priority:

1. The default ports, defined by 'port' and 'tls-port' config options.
2. The primaries-level default port: primaries port <number>  { ... };
3. The primaries element-level port: primaries { <address> port
   <number>; ... };

In 'named_config_getipandkeylist()', the 'def_port' and 'def_tlsport'
variables are extracted from level 1. The 'port' variable is extracted
from level 2. Currently, if that is unset, it defaults to the
default port ('def_port' or 'def_tlsport' depending on the transport
used), but overrides the level 2 port setting for the next primaries in
the list.

Update the code such that we inherit the port only if the level 3 port
is not set, and inherit from the default ports if the level 2 port is
also not set.
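
The intended lookup order can be sketched as follows. This is an
illustrative Python model, not the C code in
'named_config_getipandkeylist()'; the function names, the tuple layout
and the port numbers 53 and 853 are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def effective_port(element_port: Optional[int],
                   list_port: Optional[int],
                   default_port: int) -> int:
    """Pick the port for one primary, highest priority first."""
    if element_port is not None:   # level 3: primaries { <address> port N; }
        return element_port
    if list_port is not None:      # level 2: primaries port N { ... }
        return list_port
    return default_port            # level 1: 'port' / 'tls-port'


def resolve_ports(primaries: List[Tuple[str, Optional[int], bool]],
                  list_port: Optional[int],
                  def_port: int = 53,        # assumed example value
                  def_tlsport: int = 853     # assumed example value
                  ) -> List[Tuple[str, int]]:
    """Resolve every (address, element_port, uses_tls) entry independently."""
    resolved = []
    for address, element_port, uses_tls in primaries:
        default = def_tlsport if uses_tls else def_port
        # The result for one entry is never carried over to the next one.
        resolved.append((address,
                         effective_port(element_port, list_port, default)))
    return resolved
```

Because each entry is resolved from the three levels on its own, a
default chosen for one primary cannot leak into the next primary in the
list.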

(cherry picked from commit 72d3bf8e4e)
2022-10-27 12:18:18 +02:00
Matthijs Mekking
bf6961c551 Add xfer system test case
Add a test case that if the first primary fails, the fallback of a
second primary on plain DNS works. This is mainly to test that the port
configuration inheritance works correctly.

(cherry picked from commit 622a499027)
2022-10-27 12:18:12 +02:00
Tom Krizek
222a1fc4eb Merge branch '3517-serve-stale-cache-timeout-0-test-v9_18' into 'v9_18'
[v9_18] [CVE-2022-3080] Test serve stale cache with timeout 0 and CNAME

See merge request isc-projects/bind9!6976
2022-10-24 13:00:50 +00:00
Tom Krizek
9a159fc4c4 Remove misleading comment from serve-stale test
The stale-answer-client-timeout option is not set to 0 in the config,
nor is that the default value. This was probably caused by a
copy-paste error.
2022-10-24 14:30:43 +02:00
Tom Krizek
3a9ae0249d Test serve stale cache with timeout 0 and CNAME
Add a couple of tests that verify the serve-stale behavior when
stale-answer-client-timeout is set to 0 and a (stale) CNAME record is
queried.

Related #3517
2022-10-24 14:30:41 +02:00
Michał Kępień
b272185c38 Merge branch 'michal/bump-sphinx-version-to-5.3.0-v9_18' into 'v9_18'
[v9_18] Bump Sphinx version to 5.3.0

See merge request isc-projects/bind9!6973
2022-10-24 09:57:41 +00:00
Michał Kępień
da0cd8c6db Bump Sphinx version to 5.3.0
Make the Sphinx version listed in doc/arm/requirements.txt match the
version currently used in GitLab CI, so that Read the Docs builds the
documentation using the same Python software versions as those used in
GitLab CI.

(cherry picked from commit a8f0ab7df6)
2022-10-24 11:45:11 +02:00
Aram Sargsyan
65467dfcff Merge branch '3603-resolver-prefetch-eligibility-edge-case-bug-v9_18' into 'v9_18'
[v9_18] Synchronize prefetch "trigger" and "eligibility" code and documentation

See merge request isc-projects/bind9!6969
2022-10-21 11:29:08 +00:00
Aram Sargsyan
840cad93c7 Getting the "prefetch" setting from the configuration cannot fail
The "prefetch" setting is in "defaultconf" so it cannot fail, use
INSIST to confirm that.

The 'trigger' and 'eligible' variables are now prefixed with
'prefetch_' and their declaration moved to an upper level, because
there is no more additional code block after this change.

(cherry picked from commit 0227565cf1)
2022-10-21 10:22:51 +00:00
Aram Sargsyan
6d64f9e4ec Fix prefetch "trigger" value's documentation in ARM
For the prefetch "trigger" parameter ARM states that when a cache
record with a lower TTL value is encountered during query processing,
it is refreshed. But in reality, the record is refreshed when the TTL
value is lower or equal to the configured "trigger" value.

Fix the documentation to make it match the code.

(cherry picked from commit ef344b1f52)
2022-10-21 10:22:44 +00:00
Aram Sargsyan
b7149536ee Add a CHANGES note for [GL #3603]
(cherry picked from commit 041ffac0d7)
2022-10-21 10:22:37 +00:00
Aram Sargsyan
bb9cc81dd4 Match prefetch eligibility behavior with ARM
ARM states that the "eligibility" TTL is the smallest original TTL
value that is accepted for a record to be eligible for prefetching,
but the code that implements the condition doesn't behave in that
manner for the edge case when the TTL is equal to the configured
eligibility value.

Fix the code to check that the TTL is greater than or equal to the
configured eligibility value, instead of just greater than it.
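
Together with the "trigger" documentation fix, this makes both prefetch
conditions inclusive comparisons. A minimal Python sketch of the two
checks, using hypothetical names (this is not the resolver's C code):

```python
def should_prefetch(remaining_ttl: int, original_ttl: int,
                    trigger: int, eligibility: int) -> bool:
    """Return True when a cached record qualifies for prefetching."""
    # Eligible when the original TTL is at least the eligibility value
    # (equality included).
    eligible = original_ttl >= eligibility
    # Refreshed when the remaining TTL has dropped to the trigger value
    # or below (equality included).
    triggered = remaining_ttl <= trigger
    return eligible and triggered
```

With a trigger of 2 and an eligibility of 9 (example values), a record
with an original TTL of exactly 9 and a remaining TTL of exactly 2
qualifies.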

(cherry picked from commit 863f51466e)
2022-10-21 10:22:29 +00:00
Aram Sargsyan
9000b43d46 Add another prefetch check in the resolver system test
The test triggers a prefetch, but fails to check if it actually
happened, which prevented it from catching a bug when the record's
TTL value matches the configured prefetch eligibility value.

Check that prefetch happened by comparing the TTL values.

(cherry picked from commit 89fa9a6592)
2022-10-21 10:22:23 +00:00
Aram Sargsyan
ccbd389ed2 Merge branch '3598-adb-quota-might-not-be-decremented-v9_18' into 'v9_18'
[v9_18] Resolve "ADB quota might not be decremented"

See merge request isc-projects/bind9!6967
2022-10-21 10:09:09 +00:00
Aram Sargsyan
192373a26e Add CHANGES and release notes for [GL #3598]
(cherry picked from commit 6f50972e5f)
2022-10-21 09:04:51 +00:00
Aram Sargsyan
64feeba60f Call dns_adb_endudpfetch() on error path, if required
For UDP queries, after calling dns_adb_beginudpfetch() in fctx_query(),
make sure that dns_adb_endudpfetch() is also called on error path, in
order to adjust the quota back.
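
The pairing requirement can be illustrated with a toy model. The class
and function below are hypothetical stand-ins written in Python; they
are not the ADB API and only show the "undo the begin call on the error
path" pattern.

```python
from typing import Callable


class UdpFetchQuota:
    """Toy counter standing in for a per-server UDP fetch quota."""

    def __init__(self) -> None:
        self.active = 0

    def begin(self) -> None:
        self.active += 1

    def end(self) -> None:
        self.active -= 1


def start_udp_query(quota: UdpFetchQuota, send: Callable[[], None]) -> None:
    """Account for a UDP fetch, then try to send the query."""
    quota.begin()
    try:
        send()
    except Exception:
        # Error path: the query never started, so give the slot back.
        quota.end()
        raise
    # On success the slot stays taken until the query finishes or is
    # cancelled, at which point end() must be called exactly once.
```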

(cherry picked from commit 5da79e2be0)
2022-10-21 08:36:34 +00:00
Aram Sargsyan
a83a58467d Always call dns_adb_endudpfetch() in fctx_cancelquery() for UDP queries
It is currently possible that dns_adb_endudpfetch() is not
called in fctx_cancelquery() for a UDP query, which results
in quotas not being adjusted back.

Always call dns_adb_endudpfetch() for UDP queries.

(cherry picked from commit e4569373ca)
2022-10-21 08:36:34 +00:00
Aram Sargsyan
4a311b9bb4 Unlink the query under cleanup_query
In the cleanup code of fctx_query() function there is a code path
where 'query' is linked to 'fctx' and it is being destroyed.

Make sure that 'query' is unlinked before destroying it.

(cherry picked from commit ac889684c7)
2022-10-21 08:36:34 +00:00
Ondřej Surý
b55f4068ff Merge branch '3270-use-curl-in-statschannel-system-test-v9_18' into 'v9_18'
Replace some raw nc usage in statschannel system test with curl [v9.18]

See merge request isc-projects/bind9!6966
2022-10-20 16:28:29 +00:00
Ondřej Surý
a06bd51bd7 Replace some raw nc usage in statschannel system test with curl
For tests where the TCP connection might get interrupted abruptly,
replace nc with curl, as the data sent from server to client might
get lost when the TCP connection is terminated abruptly.  This happens
when the TCP connection gets closed while the large request is being
sent to the server.

As we already require curl for other system tests, replace the nc usage
in the statschannel test with curl that actually understands the
HTTP/1.1 protocol, so the same connection is reused for sending the
consecutive requests, but without client-side "pipelining".

For the record, the server doesn't support parallel processing of the
pipelined requests, so it's a bit of a misnomer here, because what we are
actually testing is that we process all requests received in a single
TCP read callback.

(cherry picked from commit cd0e5c5784)
2022-10-20 18:06:48 +02:00
Ondřej Surý
ce4528940b Merge branch '3270-serialize-statschannel-http-requests-v9_18' into 'v9_18'
Serialize the HTTP/1.1 statschannel requests [v9.18]

See merge request isc-projects/bind9!6965
2022-10-20 15:57:53 +00:00
Ondřej Surý
9274876dec Serialize the HTTP/1.1 statschannel requests
The statschannel truncated test still terminates abruptly sometimes and
it doesn't return the answer for the first query.  This might happen
when the second process_request() discovers there's not enough space
before the sending is complete and the connection is terminated before
the client gets the data.

Change the isc_http so that it pauses reading when it receives data
and resumes it only after the sending has completed or there's an
incomplete request waiting for more data.

This makes the request processing slightly less efficient, but also less
taxing for the server, because previously all requests that had been
received via a single TCP read would be processed in the loop and the
sends would be queued after the read callback has processed a full
buffer.

(cherry picked from commit 13959781cb)
2022-10-20 17:23:36 +02:00
Ondřej Surý
f3847437b2 Merge branch 'ondrej-refactor-isc_httpd-v9_18' into 'v9_18'
Rewrite isc_httpd using picohttpparser and isc_url_parse [v9.18]

See merge request isc-projects/bind9!6964
2022-10-20 15:14:45 +00:00
Ondřej Surý
da1e7a7ba2 Replace the statschannel truncated tests with two new tests
Now that the artificial limit on the recv buffer has been removed, the
current system test always fails because it tests if the truncation has
happened.

Add a test that sending more than 10 headers causes the connection to
be closed; and add a test that sending a huge HTTP request causes the
connection to be closed.

(cherry picked from commit cad2706cce)
2022-10-20 16:13:10 +02:00
Ondřej Surý
067502a16e Rewrite isc_httpd using picohttpparser and isc_url_parse
Rewrite the isc_httpd to be more robust.

1. Replace the hand-crafted HTTP request parser with picohttpparser for
   parsing the whole HTTP/1.0 and HTTP/1.1 requests.  Limit the number
   of allowed headers to 10 (arbitrary number).

2. Replace the hand-crafted URL parser with isc_url_parse for parsing
   the URL from the HTTP request.

3. Increase the receive buffer to match the isc_netmgr buffers, so we
   can at least receive two full isc_nm_read()s.  This makes the
   truncation processing much simpler.

4. Process the received buffer from single isc_nm_read() in a single
   loop and schedule the sends to be independent of each other.

The first two changes make the code simpler and rely on libraries that
we already had (isc_url based on nodejs) or that are used elsewhere
(picohttpparser).

The second two changes remove the artificial "truncation" limit on
parsing multiple requests.  Now only a request that has too many
headers (currently 10) or is too big (so, the receive buffer fills up
without reaching end of the request) will end the connection.
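
The two closing conditions can be sketched in Python as below. This is
a simplified model, not the isc_httpd code: the function name, the
return strings and the buffer size are assumptions, and only the limit
of 10 headers comes from the text above.

```python
MAX_HEADERS = 10                 # limit named in the commit message
RECV_BUFFER_SIZE = 128 * 1024    # assumed size, for illustration only


def classify_request(buf: bytes,
                     buffer_size: int = RECV_BUFFER_SIZE) -> str:
    """Decide what to do with the bytes received so far."""
    end = buf.find(b"\r\n\r\n")  # blank line that ends the header section
    if end == -1:
        # No complete request yet: wait for more data unless the
        # receive buffer is already full.
        return "close" if len(buf) >= buffer_size else "need-more-data"
    # Everything after the request line is a header line.
    header_lines = buf[:end].split(b"\r\n")[1:]
    if len(header_lines) > MAX_HEADERS:
        return "close"
    return "process"
```

A real parser also has to validate the request line and each header;
the sketch only models the two limits.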

We can be benevolent here with the limits, because the statschannel
is by definition private and access must be allowed only to
administrators of the server.  There are no timers, no rate-limiting, no
upper limit on the number of requests that can be served, etc.

(cherry picked from commit beecde7120)
2022-10-20 16:10:21 +02:00
Ondřej Surý
944ddd0fb2 Add picohttpparser.{c,h} from https://github.com/h2o/picohttpparser
PicoHTTPParser is a tiny, primitive, fast HTTP request/response parser.

Unlike most parsers, it is stateless and does not allocate memory by
itself. All it does is accept a pointer to a buffer and the output
structure, and set up the pointers in the latter to point at the
necessary portions of the buffer.

(cherry picked from commit 3a8884f024)
2022-10-20 15:49:27 +02:00
Artem Boldariev
8e6721fe9e Merge branch '3563-fix-named-startup-on-manycore-solaris-systems-v9-18' into 'v9_18'
[Backport to v9.18] Fix named failing to start on Solaris systems with hundreds of CPUs

See merge request isc-projects/bind9!6962
2022-10-20 13:14:46 +00:00
Artem Boldariev
c3ce67f994 Modify release notes [GL #3563]
Mention that a startup problem on manycore Solaris systems is fixed.

(cherry picked from commit 2c9400f116)
2022-10-20 15:15:52 +03:00
Artem Boldariev
acb431b5c3 Modify CHANGES [GL #3563]
Mention that a startup problem on manycore Solaris systems is fixed.

(cherry picked from commit 03ee132e28)
2022-10-20 15:15:51 +03:00
Artem Boldariev
43c8e8b9d6 Fix named failing to start on Solaris systems with hundreds of CPUs
This commit fixes a startup issue on Solaris systems with
many (reportedly > 510) CPUs by bumping RLIMIT_NOFILE. This appears to
be a regression from 9.11.
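
The commit message does not say how the limit is bumped. As a hedged
illustration of the general technique, the Python sketch below raises
the soft RLIMIT_NOFILE limit toward a wanted value without exceeding
the hard limit. The function name and the policy are assumptions; named
itself does this in C.

```python
import resource


def raise_nofile_limit(wanted: int) -> int:
    """Raise the soft open-file limit to at least 'wanted' if allowed.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # already high enough, nothing to do
    # An unprivileged process may raise the soft limit only up to the
    # hard limit.
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Some platforms impose further caps on the soft limit, so a production
version would also need to handle a failing setrlimit() call.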

(cherry picked from commit fff01fe7eb)
2022-10-20 15:15:10 +03:00
Michal Nowak
d11843bdfc Merge tag 'v9_18_8' into v9_18
BIND 9.18.8
2022-10-20 11:47:43 +02:00
Matthijs Mekking
be33035f26 Merge branch 'matthijs-fix-dnssec-signing-log-lovel-v9_18' into 'v9_18'
[v9_18] Change log level when doing rekey

See merge request isc-projects/bind9!6939
2022-10-20 08:20:49 +00:00
Matthijs Mekking
6af9d0088b Change log level when doing rekey
This log happens when BIND checks the parental-agents if the DS has
been published. But if you don't have parental-agents set up, the list
of keys to check will be empty and the result will be ISC_R_NOTFOUND.
This is not an error, so change the log level to debug in this case.

(cherry picked from commit a1d57fc8cb)
2022-10-20 10:20:14 +02:00
Evan Hunt
8138ab4611 Merge branch '3247-rpz-ip-cd-v9_18' into 'v9_18'
ensure RPZ lookups handle CD=1 correctly

See merge request isc-projects/bind9!6957
2022-10-19 20:38:34 +00:00
Evan Hunt
777aa045fc CHANGES for [GL #3247]
(cherry picked from commit 3676f6394b)
2022-10-19 13:12:52 -07:00
Evan Hunt
5c44d63979 add a test with CD=1 query for pending data
this is a regression test for [GL #3247].

(cherry picked from commit 575a924b1a)
2022-10-19 13:12:32 -07:00
Evan Hunt
2cc8874c90 ensure RPZ lookups handle CD=1 correctly
RPZ rewrites called dns_db_findext() without passing through the
client database options; as a result, if the client set CD=1,
DNS_DBFIND_PENDINGOK was not used as it should have been, and
cache lookups failed, resulting in failure of the rewrite.

(cherry picked from commit 305a50dbe1)
2022-10-19 13:12:31 -07:00
Tom Krizek
ff5823fa12 Merge branch 'tkrizek/system-tests-fixes-v9_18' into 'v9_18'
Update various system tests and add them to default test suite [v9_18]

See merge request isc-projects/bind9!6949
2022-10-19 14:58:13 +00:00
Tom Krizek
ba7ea2dfac Remove generated controls.conf file from system tests
The controls.conf file shouldn't be used directly without templating it
first. Remove this no longer used hard-coded file to avoid confusion.

(cherry picked from commit cbd0355328)
2022-10-19 15:32:46 +02:00
Tom Krizek
5db5f20985 Revive dupsigs system test
Correctly source conf.sh in dupsigs test scripts (fix issue introduced
by 093af1c00a).

Update dupsigs test for dnssec-dnskey-kskonly default. Since v9.17.20,
the dnssec-dnskey-kskonly is set to yes. Update the test to not expect
the additional RRSIG with ZSK for DNSKEY.

Speed up the test from 20 minutes to 2.5 minutes and make it part of the
default test suite executed in CI.
- decrease number of records to sign from 2000 to 500
- decrease the signing interval by a factor of 6
- shorten the final part of the test after last signing (since nothing
  new happens there)

Finally, clarify misleading comments about (in)sufficient time for zone
re-signing. The time used in the test is in fact sufficient for the
re-signing to happen. If it wasn't, the previous ZSK would end up being
deleted while its signatures would still be present, which is a
situation where duplicate signatures can still happen.

(cherry picked from commit cb0a2ae1dd)
2022-10-19 15:32:44 +02:00
Tom Krizek
ef0eadf864 Revive the stress system test
Ensure the port numbers are dynamically filled in with copy_setports.

Clarify test fail condition.

Make the stress test part of the default test suite since it doesn't
seem to run too long or interfere with other tests any more (the
original note claiming so is more than 20 years old).

Related !6883

(cherry picked from commit 7495deea3e)
2022-10-19 15:32:44 +02:00
Tom Krizek
bd8262dc35 Revive dialup system test
Properly template the port number in config files with copy_setports.

The test takes two minutes on my machine which doesn't seem like a
proper justification to exclude it from the test suite, especially
considering we run these tests in parallel nowadays. The resource usage
doesn't seem significantly increased so it shouldn't interfere with
other system tests.

There also exists a precedent for longer running system tests that are
already part of the default system test suite (e.g. serve-stale takes
almost three minutes on the same machine).

(cherry picked from commit 235ae5f344)
2022-10-19 15:32:44 +02:00
Tom Krizek
25d2d7e46e Make digdelv test work in different network envs
When a target server is unreachable, the varying network conditions may
cause different ICMP messages (or no message). The host unreachable
message was discovered when attempting to run the test locally while
connected to a VPN network which handles all traffic.

Extend the dig output check with "host unreachable" message to avoid a
false negative test result in certain network environments.
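
The idea of accepting several failure messages can be sketched like
this. The actual test is a shell script; the Python below is only a
model, and apart from "host unreachable" the listed patterns are
assumed examples rather than the strings the test really checks.

```python
# Only "host unreachable" is taken from the commit message; the other
# entries are assumed examples of messages for an unreachable server.
UNREACHABLE_PATTERNS = (
    "connection refused",
    "timed out",
    "network unreachable",
    "host unreachable",
)


def indicates_unreachable(dig_output: str) -> bool:
    """Return True if the output reports any known unreachable condition."""
    text = dig_output.lower()
    return any(pattern in text for pattern in UNREACHABLE_PATTERNS)
```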

(cherry picked from commit 1e7d832342)
2022-10-19 15:32:44 +02:00
Ondřej Surý
10a43eba02 Merge branch '3270-remove-time-requirement-for-statschannel-truncated-test-v9_18' into 'v9_18'
Remove the time requirement for the statschannel truncated test [v9.18]

See merge request isc-projects/bind9!6953
2022-10-19 13:31:23 +00:00
Ondřej Surý
6261ada8c2 Remove the time requirement for the statschannel truncated test
The 5 seconds requirement to finish the 'pipelined with truncated
stream' was causing spurious failures in the CI because the job runners
might be very busy and sending 128k of data might simply take some time.

Remove the time requirement altogether, there's actually no reason why
the test SHOULD or even MUST finish under 5 seconds.

(cherry picked from commit 0f56a53d66)
2022-10-19 15:30:44 +02:00
Michal Nowak
6a3d92a98c Merge branch '3394-cve-2022-2795-test-v9_18' into 'v9_18'
[v9_18] Add tests for CVE-2022-2795

See merge request isc-projects/bind9!6948
2022-10-19 13:05:22 +00:00
Michał Kępień
9c2714e27f Add tests for CVE-2022-2795
Add a test ensuring that the amount of work fctx_getaddresses() performs
for any encountered delegation is limited: delegate example.net to a set
of 1,000 name servers in the redirect.com zone, the names of which all
resolve to IP addresses that nothing listens on, and query for a name in
the example.net domain, checking the number of times the findname()
function gets executed in the process; fail if that count is excessively
large.

Since the size of the referral response sent by ans3 is about 20 kB, it
cannot be sent back over UDP (EMSGSIZE) on some operating systems in
their default configuration (e.g. FreeBSD - see the
net.inet.udp.maxdgram sysctl).  To enable reliable reproduction of
CVE-2022-2795 (retry patterns vary across BIND 9 versions) and avoid
false positives at the same time (thread scheduling - and therefore the
number of fetch context restarts - varies across operating systems and
across test runs), extend bin/tests/system/resolver/ans3/ans.pl so that
it also listens on TCP and make "ns1" in the "resolver" system test
always use TCP when communicating with "ans3".

Also add a test (foo.bar.sub.tld1/TXT) that ensures the new limitations
imposed on the resolution process by the mitigation for CVE-2022-2795 do
not prevent valid, glueless delegation chains from working properly.

(cherry picked from commit 604d8f0b96)
2022-10-19 12:36:20 +02:00
Michal Nowak
09ea1f9b3b Merge branch '3493-compression-buffer-reuse-test-v9_18' into 'v9_18'
[CVE-2022-2881] test for growth of compressed pipelined responses

See merge request isc-projects/bind9!6941
2022-10-19 08:19:06 +00:00
Evan Hunt
b42dfd01f1 test for growth of compressed pipelined responses
add a test to compare the Content-Length of successive compressed
messages on a single HTTP connection that should contain the same
data; fail if the size grows by more than 100 bytes from one query
to the next.
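
The comparison described above can be sketched in Python as follows.
The function name is hypothetical and the real test is part of the
system test scripts; only the 100-byte threshold comes from the commit
message.

```python
from typing import Optional, Sequence


def first_excessive_growth(content_lengths: Sequence[int],
                           max_growth: int = 100) -> Optional[int]:
    """Return the index of the first response that grew too much.

    Each Content-Length is compared with the previous one on the same
    connection; None means no response grew by more than 'max_growth'.
    """
    pairs = zip(content_lengths, content_lengths[1:])
    for index, (previous, current) in enumerate(pairs, start=1):
        if current - previous > max_growth:
            return index
    return None
```

Comparing each response only with its predecessor catches steady
growth across queries while tolerating small differences between
individual responses.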

(cherry picked from commit 3c11fafadf)
2022-10-18 17:28:45 +02:00