This converts a special characters test case, a max-zone-ttl error
check, and two cases of insecure zones.
We no longer assert for having more than one DNSKEY and/or RRSIG
records. If the zone is insecure, this is no longer always true. And
we already check for the expected number of records in the
check_dnskeys/check_signatures functions.
This commit deals with converting the dynamic zone test cases to
pytest. The tests for 'inline-signing.kasp' are similar to the default
case, so these are added to 'test_kasp_default'.
Unfortunately I need to add sleep calls in between freezing, updating,
and thawing a zone. Without it the intermittent failures are too
frequent.
This commit deals with converting the test cases related to the default
dnssec-policy.
This requires a new method 'check_update_is_signed'. This method will
be used in future tests as well, and checks if an expected record is
in the zone and is properly signed.
Remove the counterparts for the newly added test from the kasp shell
tests script.
Convert the first couple of tests from 'kasp/tests.sh' to
'kasp/tests_kasp.py', those are test cases related to 'dnssec-keygen'
and 'dnssec-settime'.
For this, we also add a new KeyProperties method,
'policy_to_properties', that takes a list of strings which represent
the keys according to the dnssec-policy and the expected key states.
For the kasp tests we need a new utility that can retrieve a list of
Keys from a given directory, belonging to a specific zone. This is
'keydir_to_keylist' and is the replacement of 'kasp.sh:get_keyids()'.
'check_next_key_event' is a method to check when the next key event is
scheduled, needed for the rollover tests.
This commit introduces replacements for the 'check_keys' and
'check_keytimes' from the shell test library. For that, we introduce
more functions for the class Key. The 'match_properties' function is
used in 'check_keys' to see if a set of KeyProperties match the Key.
This speficially ignores timing metadata. The function resembles what
is in 'kasp.sh:check_key()'.
The 'match_timingmetadata' function is used in 'check_keytimes' to see
if the timing metadata of a set of KeyProperties match the Key. The
values are checked in all three key files (except if the private key is
not available (set with properties["private"]), or if it is a legacy key
(set with properties["legacy"]).
An additional check function is added, to check if the key relationships
are set correctly. It follows a similar pattern as 'check_keytimes'. If
"Predecessor" and/or "Successor" are expected to be set in the state
file, this function checks so, and also verifies that they are not set
if they should not be.
Because we want to check the metadata in all three files, a new
value in the Key class is added: 'privatefile'. The 'get_metadata'
function is adapted so that we can also check metadata in other files.
Introduce methods to easily retrieve the TTL and public DNSKEY record
from the keyfile.
When checking if the CDS is equal to the expected value, use the DNSKEY
TTL instead of hardcoded 3600.
In isctest.kasp, introduce a new class 'KeyProperties' that can be used
to check if a Key matches expected properties. Properties are for the
time being divided in three parts: 'properties' that contain some
attributes of the expected properties (such as are we dealing with a
legacy key, is the private key available, and other things that do not
fit the metadata exactly), 'metadata' that contains expected metadata
(such as 'Algorithm', 'Lifetime', 'Length'), and 'timing', which is
metadata of the class KeyTimingMetadata.
The 'default()' method fills in the expected properties for the default
DNSSEC policy.
The 'set_expected_times()' sets the expected timing metadata, derived
from when the key was created. This method can take an offset to push
the expected timing metadata a duration in the future or back into the
past. If 'pregenerated=True', derive the expected timing metadata from
the 'Publish' metadata derived from the keyfile, rather than from the
'Created' metadata.
The calculations in the 'Ipub', 'IpubC' and 'Iret' methods are derived
from RFC 7583 DNSSEC Key Rollover Timing Considerations.
This is the first step of converting the kasp system test to pytest.
Well, perhaps not the first, because earlier the ksr system test was
already converted to pytest and then the `isctest/kasp.py` library
was already introduced. Lots of this code can be reused for the kasp
pytest code.
First of all, 'check_file_contents_equal' is moved out of the ksr test
and into the 'check' library. This feels the most appropriate place
for this function to be reused in other tests. Then, 'keystr_to_keylist'
is moved to the 'kasp' library.
Introduce two new methods that are unused in this point of time, but
we are going to need them for the kasp system test. 'zone_contains'
will be used to check if a signature exists in the zonefile. This way
we can tell whether the signature has been reused or refreshed.
'file_contents_contain' will be used to check if the comment and public
DNSKEY record in the keyfile is correct.
The dnssec-keygen command for the ZSK generation for the zone
multisigner-model2.kasp was wrong (no ZSK was generated in the setup
script, but when 'named' is started, the missing ZSK was created
anyway by 'dnssec-policy'.
Only set the next time the keymgr should run if the value is non zero.
Otherwise we default back to one hour. This may happen if there is one
or more key with an unlimited lifetime.
The keymgr never set the expected timing metadata when CDS/CDNSKEY
records for the corresponding key will be removed from the zone. This
is not troublesome, as key states dictate when this happens, but with
the new pytest we use the timing metadata to determine if the CDS and/or
CDNSKEY for the given key needs to be published.
There are a couple of cases where the safety intervals are added
inappropriately:
1. When setting the PublishCDS/SyncPublish timing metadata, we don't
need to add the publish-safety value if we are calculating the time
when the zone is completely signed for the first time. This value
is for when the DNSKEY has been published and we add a safety
interval before considering the DNSKEY omnipresent.
2. The retire-safety value should only be added to ZSK rollovers if
there is an actual rollover happening, similar to adding the sign
delay.
3. The retire-safety value should only be added to KSK rollovers if
there is an actual rollover happening. We consider the new DS
omnipresent a bit later, so that we are forced to keep the old DS
a bit longer.
While converting the kasp system test to pytest, I encountered a small
bug in the keymgr code. We retire keys when there is more than one
key matching a 'keys' line from the dnssec-policy. But if there are
multiple identical 'keys' lines, as is the case for the test zone
'checkds-doubleksk.kasp', we retire one of the two keys that have the
same properties.
Fix this by checking if there are double matches. This is not fool proof
because there may be many keys for a few identical 'keys' lines, but it
is good enough for now. In practice it makes no sense to have a policy
that dictates multiple keys with identical properties.
When `dns_remote_done()` is true, calling `dns_remote_curraddr()` asserts.
Add a `dns_remote_curraddr()` check before calling `dns_remote_curraddr()`.
Closes#5215
Merge branch '5215-assert-in-dns_remote_curraddr-fix' into 'main'
See merge request isc-projects/bind9!10222
This MR changes custom attach/detach implementation with refcount macros, replaces isc_mem_destroy() with isc_mem_detach(), and does various small cleanups.
Merge branch 'ondrej/cleanup-isc_mem-api' into 'main'
See merge request isc-projects/bind9!9456
The isc_mem API is one of the most commonly used APIs that didn't
used ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros. Replace the
implementation of isc_mem_attach(), isc_mem_detach() and
isc_mem_destroy() with the respective macros.
This also removes the legacy isc_mem_destroy() functionality that would
check whether all references had been detached from the memory context
as it doesn't work reliably when using the call_rcu() API. Instead of
doing this individually, call isc_mem_checkdestroyed(stderr) from the
isc_mem_destroy() macro to keep the extra check that all contexts were
freed when the program is exiting.
ZONEMD digests RRSIG records and potentially digests SIG record. Add digests
methods for both record types.
Closes#5219
Merge branch '5219-add-digest-methods-for-sig-and-rrsig' into 'main'
See merge request isc-projects/bind9!10217
ZONEMD needs to be able to digest SIG and RRSIG records. The signer
field can be compressed in SIG so we need to call dns_name_digest().
While for RRSIG the records the signer field is not compressed the
canonical form has the signer field downcased (RFC 4034, 6.2). This
also implies that compare_rrsig needs to downcase the signer field
during comparison.
The qpcache_findzonecut() accepts two "foundnames": 'foundname' and
'dcname' could be NULL. Originally, when 'dcname' would be NULL, the
'dcname' would be set to 'foundname' which basically means that we were
copying the .ndata over itself for no apparent reason.
Merge branch 'ondrej/refactor-qpcache_findzonecut' into 'main'
See merge request isc-projects/bind9!10049
The qpcache_findzonecut() accepts two "foundnames": 'foundname' and
'dcname' could be NULL. Originally, when 'dcname' would be NULL, the
'dcname' would be set to 'foundname'. Then code like this was present:
result = find_deepest_zonecut(&search, node, nodep, foundname,
rdataset,
sigrdataset DNS__DB_FLARG_PASS);
dns_name_copy(foundname, dcname);
Which basically means that we are copying the .ndata over itself for no
apparent reason. Cleanup the dcname vs foundname usage.
Co-authored-by: Evan Hunt <each@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
isc_iterated_hash didn't work in offloaded threads as the per thread
initialisation has not been done. This has been fixed.
Closes#5214
Merge branch '5214-call-isc__iterated_hash_initialize-in-isc__work_cb' into 'main'
See merge request isc-projects/bind9!10206
When a system test is run with the `USE_RR` environment variable set to 1, an `rr` trace is now correctly generated for each instance of `named`.
Closes#5079
Merge branch '5079-fix-rr' into 'main'
See merge request isc-projects/bind9!10197
when running a system test with the USE_RR environment
variable set to 1, an rr trace is generated for named.
because rr wasn't run using libtool --mode=execute, the
trace would actually be generated for the wrapper script
generated by libtool, not for the actual named binary.
IPv6 Advanced Socket API (:rfc:`3542`) is a hard requirement, remove the
autoconf check to speed up the ./configure run a little bit.
Merge branch 'ondrej/remove-check-for-mandatory-IPv6' into 'main'
See merge request isc-projects/bind9!10201
This merge request addresses several key performance bottlenecks in the DoH (DNS over HTTPS) implementation by introducing significant optimizations and improvements.
### Key Improvements
1. **Simplification and Optimisation of `http_do_bio()` Function**:
- The code flow in the `http_do_bio()` function has been significantly simplified.
2. **Flushing HTTP Write Buffer on Outgoing DNS Messages**:
- The buffer is flushed and a send operation is performed when there is an outgoing DNS message.
3. **Bumping Active Streams Processing Limit**:
- The total number of active streams has been increased to 60% of the total streams limit.
These changes collectively enhance the performance and reliability of the DoH implementation, making it more efficient and robust for handling high-load scenarios, particularly noticeable in long runs (>= 1h) of `stress:long:rpz:doh+udp:linux:*` tests. It improves perf. for tests for BIND 9.18, but it likely will have a positive but less pronounced effect on newer versions as well.
In essence, the merge request fixes three bottlenecks stacked upon each other.
*It is a logical continuation of the merge requests !10109.* !10109, unfortunately, did not completely [address the performance drop in 9.18](https://gitlab.isc.org/isc-projects/bind9/-/pipelines/221545) for longer runs of the stress test. This merge request [addresses that](https://gitlab.isc.org/isc-projects/bind9/-/pipelines/223661).
**P.S.**
The origin of the fixes is, in fact, the branch in !10193. So this MR is a ... *forward port* of them.
Merge branch 'artem-doh-performance-drop-post-fix' into 'main'
See merge request isc-projects/bind9!10192
This commit bumps the total number of active streams (= the opened
streams for which a request is received, but response is not ready) to
60% of the total streams limit.
The previous limit turned out to be too tight as revealed by
longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.
Previously, the code would try to avoid sending any data regardless of
what it is unless:
a) The flush limit is reached;
b) There are no sends in flight.
This strategy is used to avoid too numerous send requests with little
amount of data. However, it has been proven to be too aggressive and,
in fact, harms performance in some cases (e.g., on longer (≥1h) runs
of "stress:long:rpz:doh+udp:linux:*").
Now, additionally to the listed cases, we also:
c) Flush the buffer and perform a send operation when there is an
outgoing DNS message passed to the code (which is indicated by the
presence of a send callback).
That helps improve performance for "stress:long:rpz:doh+udp:linux:*"
tests.
Previously, a function for continuing IO processing on the next UV
tick was introduced (http_do_bio_async()). The intention behind this
function was to ensure that http_do_bio() is eventually called at
least once in the future. However, the current implementation allows
queueing multiple such delayed requests needlessly. There is currently
no need for these excessive requests as http_do_bio() can requeue them
if needed. At the same time, each such request can lead to a memory
allocation, particularly in BIND 9.18.
This commit ensures that the number of enqueued delayed IO processing
requests never exceeds one in order to avoid potentially bombarding IO
threads with the delayed requests needlessly.
This commit significantly simplifies the code flow in the
http_do_bio() function, which is responsible for processing incoming
and outgoing HTTP/2 data. It seems that the way it was structured
before was indirectly caused by the presence of the missing callback
calls bug, fixed in 8b8f4d500d.
The change introduced by this commit is known to remove a bottleneck
and allows reproducible and measurable performance improvement for
long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests.
Additionally, it fixes a similar issue with potentially missing send
callback calls processing and hardens the code against use-after-free
errors related to the session object (they can potentially occur).
Cleanup short list macros from <isc/util.h>, remove two unused headers, move locking macros to respective headers and use only the C11 static assertion.
Merge branch 'ondrej/cleanup-short-macros' into 'main'
See merge request isc-projects/bind9!10196
Previously, a gcc < 4.6 shim for _Static_assert() was included. Such an
old compiler is not supported now anyway, so the macro variant has been
removed in favor of a single definition using _Static_assert().