Commit Graph

31140 Commits

Author SHA1 Message Date
Evan Hunt
c5405c2700 improve calculation of database size
"max-journal-size" is set by default to twice the size of the zone
database. however, the calculation of zone database size was flawed.

- change the size calculations in dns_db_getsize() to more accurately
  represent the space needed for a journal file or *XFR message to
  contain the data in the database. previously we returned the sizes
  of all rdataslabs, including header overhead and offset tables,
  which resulted in the database size being reported as much larger
  than the equivalent journal transactions would have been.
- map files caused a particular problem here: the full name can't be
  determined from the node while a file is being deserialized, because
  the uppernode pointers aren't set yet. so we store "full name length"
  in the dns_rbtnode structure while serializing, and clear it after
  deserialization is complete.
2020-03-12 00:38:37 -07:00
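The size calculation described in commit c5405c2700 can be illustrated with a minimal sketch. It assumes the estimate is the uncompressed wire size of each record (owner name, fixed fields, RDATA); the `rr_t` type and `xfr_size_estimate()` function are hypothetical and are not the actual dns_db_getsize() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed per-record fields in DNS wire format:
 * TYPE (2) + CLASS (2) + TTL (4) + RDLENGTH (2). */
#define RR_FIXED_LEN 10

/* Hypothetical, simplified view of one resource record. */
typedef struct rr {
	size_t namelen; /* wire length of the full owner name */
	size_t rdlen;	/* length of the RDATA */
} rr_t;

/*
 * Estimate the bytes needed to carry the records in a journal or
 * *XFR message. In-memory overhead (slab headers, offset tables)
 * is deliberately not counted, and name compression is ignored.
 */
size_t
xfr_size_estimate(const rr_t *rrs, size_t count) {
	assert(rrs != NULL || count == 0);

	size_t total = 0;
	for (size_t i = 0; i < count; i++) {
		total += rrs[i].namelen + RR_FIXED_LEN + rrs[i].rdlen;
	}
	return total;
}
```

For an owner name of 13 bytes on the wire (such as example.com.) with one 4-byte and one 16-byte RDATA, the estimate is 27 + 39 = 66 bytes.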
Ondřej Surý
c99f7cf9bd Merge branch 'ondrej/fix-clang-format-headers-symlinks-v9_16' into 'v9_16'
Fix .clang-format.headers symlinks (v9.16)

See merge request isc-projects/bind9!3213
2020-03-11 09:24:21 +00:00
Ondřej Surý
67464af0bb Fixup the headers formatting 2020-03-11 10:23:35 +01:00
Ondřej Surý
60c6ff4ece Fix the deeper symlinks to .clang-format.headers 2020-03-11 10:21:54 +01:00
Ondřej Surý
ff60a59b7f Merge branch 'ondrej/clang-format-improve-includes-v9_16' into 'v9_16'
Improve #include block sorting and grouping in clang-format (v9_16)

See merge request isc-projects/bind9!3194
2020-03-11 08:55:38 +00:00
Ondřej Surý
f3c2274479 Use the new sorting rules to regroup #include headers 2020-03-11 08:55:12 +00:00
Ondřej Surý
ba0aff0d59 Improve the #include block sorting
The IncludeCategories list was incomplete: it missed the pk11/ and dst/ headers,
and the rule that put "" headers after all <> headers was broken.
2020-03-11 08:55:12 +00:00
Michał Kępień
164109087e Merge branch 'michal/minor-release-note-tweaks-v9_16' into 'v9_16'
[v9_16] Minor release note tweaks

See merge request isc-projects/bind9!3211
2020-03-11 08:54:48 +00:00
Michał Kępień
f483c4a6bb Add GitLab identifier to rwlock release note
(cherry picked from commit 3e6ef80706)
2020-03-11 09:52:51 +01:00
Michał Kępień
45170c828e Merge branch '1636-add-release-note-about-controlling-source-ports-v9_16' into 'v9_16'
[v9_16] Add release note about controlling source ports

See merge request isc-projects/bind9!3209
2020-03-11 08:32:58 +00:00
Michał Kępień
e6d4da4080 Add release note about controlling source ports
(cherry picked from commit 384b413dc5)
2020-03-11 09:30:42 +01:00
Michał Kępień
43bea15a7a Release note wording tweaks
(cherry picked from commit 2283d38ac2)
2020-03-11 09:29:44 +01:00
Michał Kępień
c7dee5dd84 Move pthread rwlocks release note to a section
(cherry picked from commit f8a8eaba8b)
2020-03-11 09:28:01 +01:00
Michał Kępień
1d1605fb57 Merge branch 'matthijs-disable-mscv-kasp-system-test-v9_16' into 'v9_16'
[v9_16] Disable kasp test on Windows

See merge request isc-projects/bind9!3208
2020-03-11 07:20:54 +00:00
Matthijs Mekking
e58c1cfe1a Remove leftover set_keydir
(cherry picked from commit 2094e5ed4d)
2020-03-10 16:04:13 +01:00
Matthijs Mekking
a22e881a97 Disable kasp test on Windows
The kasp system test is timing critical.  The test passes on all
Linux based machines, but fails frequently on Windows.  The test
takes a lot more time on Windows and the final checks fail
because the expected next key event is too far off.  For example:

I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
I:kasp:error: bad next key event time 20909 for zone \
  step2.algorithm-roll.kasp (expect 21600)
I:kasp:failed

This is because the kasp system test calculates the time when the
next key event should occur based on the policy.  This assumes that
named is able to do key management within a minute.  But starting
named, doing key management for other zones, and reconfiguring takes
much more time on Windows and thus the next key event on Windows is
much sooner than anticipated.

That this happens is a good thing because this means that the
correct next key event is used, but is not so nice for testing, as
it is hard to determine how much time named needed before finishing
the current key event.

Disable the kasp test on Windows now because it is blocking the
release.  We know the cause of these test failures, and it is clear
that this is a fault in the test, not the code.  Therefore we feel
comfortable disabling the test right now and work on a fix while
unblocking the release.

(cherry picked from commit 4e610b7f6b)
2020-03-10 16:04:13 +01:00
Michal Nowak
6d8f8abe49 Merge branch 'mnowak/abi-tracker-helper-v9_16' into 'v9_16'
[v9_16] Add API Checker

See merge request isc-projects/bind9!3203
2020-03-10 08:56:20 +00:00
Michal Nowak
0cff4c4e4f Add API Checker
ABI checker tools generate HTML and TXT API compatibility reports of
BIND libraries. Comparison is being done between two bind source trees
which hold built BIND.

In the CI, one version is the reference version defined by the
BIND_BASELINE_VERSION variable; the other is the HEAD of the branch
under test.

(cherry picked from commit 49bc08e612)
2020-03-10 09:53:44 +01:00
Evan Hunt
15090b4dea Merge branch '1664-double-unlock-v9_16' into 'v9_16'
remove redundant ZONEDB_UNLOCK

See merge request isc-projects/bind9!3198
2020-03-10 00:08:51 +00:00
Evan Hunt
2db2a22f28 remove redundant ZONEDB_UNLOCK
(cherry picked from commit b54454b7c6)
2020-03-09 16:47:44 -07:00
Matthijs Mekking
44680ad1cf Merge branch '1653-dnssec-policy-view-race-v9_16' into 'v9_16'
Resolve "Race condition with dnssec-policy, same zone in different views"

See merge request isc-projects/bind9!3195
2020-03-09 15:59:51 +00:00
Matthijs Mekking
33ceecdde7 Update changes, documentation
(cherry picked from commit 47e42d5750)
2020-03-09 16:25:46 +01:00
Matthijs Mekking
29cde9e990 Fix race condition dnssec-policy with views
When configuring the same dnssec-policy for two zones with the same
name but in different views, there is a race condition for who will
run the keymgr first. If run sequentially, only one set of keys will
be created; if run in parallel, two sets of keys will be created.

Lock the kasp while looking for keys and running the key
manager. This way, for the same zone in different views only one
keyset will be created.

The dnssec-policy does not implement sharing keys between different
zones.

(cherry picked from commit e0bdff7ecd)
2020-03-09 16:25:35 +01:00
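The fix in commit 29cde9e990 can be sketched as follows: looking for keys and running the key manager form one critical section guarded by a lock on the shared policy. The `kasp_t`, `keystore_t`, and `zone_rekey()` names are hypothetical stand-ins for illustration, not BIND's actual structures or functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dnssec-policy shared by several views. */
typedef struct kasp {
	pthread_mutex_t lock;
} kasp_t;

/* Hypothetical stand-in for the keys stored for one zone name. */
typedef struct keystore {
	int nkeys;	     /* keys currently present for the zone */
	int keysets_created; /* how often a fresh key set was generated */
} keystore_t;

/*
 * Look for existing keys and, if none are found, create them, all
 * while holding the policy lock. Returns true if this call created
 * the keys, false if another caller had already done so.
 */
bool
zone_rekey(kasp_t *kasp, keystore_t *store) {
	assert(kasp != NULL && store != NULL);

	bool created = false;

	pthread_mutex_lock(&kasp->lock);
	if (store->nkeys == 0) {
		store->nkeys = 2; /* for example, one KSK and one ZSK */
		store->keysets_created++;
		created = true;
	}
	pthread_mutex_unlock(&kasp->lock);

	return created;
}
```

Because the check and the creation happen under the same lock, a second view calling `zone_rekey()` for the same zone finds the keys created by the first and does not generate another set.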
Matthijs Mekking
7508598b8d Merge branch 'matthijs-refactor-kasp-test-v9_16' into 'v9_16'
Refactor kasp test (backport v9_16)

See merge request isc-projects/bind9!3191
2020-03-09 15:21:57 +00:00
Matthijs Mekking
da9a1bc5f3 Add check calls to kasp zsk-retired test
The test case for zsk-retired was missing the actual checks.  Add
them and fix the set_policy call to expect three keys.

(cherry picked from commit 2e4b55de85)
2020-03-09 15:43:38 +01:00
Matthijs Mekking
44bacf33fc More consistent spacing and comments
Some comments started with a lowercased letter. Capitalized them to
be more consistent with the rest of the comments.

Add some newlines between `set_*` calls and check calls, also to be
more consistent with the other test cases.

(cherry picked from commit 7e54dd74f9)
2020-03-09 15:43:29 +01:00
Matthijs Mekking
c73cca2622 Replace key_states
(cherry picked from commit f500b16f83)
2020-03-09 15:43:17 +01:00
Matthijs Mekking
406f27ebae Replace key_timings
(cherry picked from commit 32e4916c59)
2020-03-09 15:43:10 +01:00
Matthijs Mekking
581e184a21 Replace key_properties
(cherry picked from commit 628e09a423)
2020-03-09 15:43:02 +01:00
Matthijs Mekking
0d9fef7768 Replace zone_properties
(cherry picked from commit 8a4787d585)
2020-03-09 15:42:54 +01:00
Matthijs Mekking
be84cc82af Merge branch 'matthijs-kasp-test-algoroll-v9_16' into 'v9_16'
Backport kasp algorithm rollover test plus bugfixes to v9_16

See merge request isc-projects/bind9!3187
2020-03-09 14:24:57 +00:00
Matthijs Mekking
bc02baa045 Add additional wait period for algorithm rollover
We may be checking the algorithm steps too fast: the reconfig
command may still be in progress. Make sure the zones are signed
and loaded by digging the NSEC records for these zones.

(cherry picked from commit d16520532f)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
01098fb81e Make clang-format happy
(cherry picked from commit 53bd81ad19)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
4e8ffc4ed8 update CHANGES
(cherry picked from commit 6ddfed3de0)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
b59dc6f89e Add CSK algorithm rollover test
(cherry picked from commit 917cf5f86f)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
c20ac664dd [#1624] dnssec-policy change retire unwanted keys
When changing a dnssec-policy, existing keys with properties that no
longer match were not being retired.

(cherry picked from commit 3905a03205)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
4bbefa8514 [#1625] Algorithm rollover waited too long
Algorithm rollover waited too long before introducing zone
signatures.  It waited to make sure all signatures were resigned,
but when introducing a new algorithm, all signatures are resigned
immediately.  Only add the sign delay if there is a predecessor key.

(cherry picked from commit 28506159f0)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
150464e719 [#1626] Fix stuck algorithm rollover
Algorithm rollover was stuck on submitting DS because keymgr thought
it would move to an invalid state.  It did not match the current
key because it checked it against the current key in the next state.
Fixed by checking the current key against the desired state, not
the existing state.

(cherry picked from commit a8542b8cab)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
f8b555a3a2 Add algorithm rollover test case
Add a test case for algorithm rollover.  This is triggered by
changing the dnssec-policy.  A new nameserver ns6 is introduced
for tests related to dnssec-policy changes.

This requires a slight change in check_next_key_event to only
check the last occurrence.  Also, change the debug log message in
lib/dns/zone.c to deal with checks when no next scheduled key event
exists (and default to loadkeys interval 3600).

(cherry picked from commit 88ebe9581b)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
08ed7461af Remove unneeded step6 zone
The zone 'step6.ksk-doubleksk.autosign' is configured but is not
set up nor tested.  Remove the unneeded configured zone.

(cherry picked from commit cc2afe853b)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
9dc207a363 Introduce enable dnssec test case
(cherry picked from commit fdb3f6f400)
2020-03-09 14:42:53 +01:00
Matthijs Mekking
5e3bad7c95 Prepare kasp for algorithm rollover test
Algorithm rollover will require four keys so introduce KEY4.
It also requires looking at key files for multiple algorithms, so
change getting key ids to be algorithm rollover agnostic (adjusting
count checks).  The algorithm will be verified in check_key so
relaxing 'get_keyids' is fine.

Replace '${_alg_num}' with '$(key_get KEY[1-4] ALG_NUM)' in checks
to deal with multiple algorithms.

(cherry picked from commit 00ced2d2e7)
2020-03-09 14:42:53 +01:00
Michał Kępień
a8563d7fdd Merge branch 'michal/do-not-run-openbsd-system-test-jobs-for-tags-v9_16' into 'v9_16'
[v9_16] Do not run OpenBSD system test jobs for tags

See merge request isc-projects/bind9!3188
2020-03-09 13:35:33 +00:00
Michał Kępień
2c645e10f0 Do not run OpenBSD system test jobs for tags
OpenBSD virtual machines seem to be affected particularly badly by other
activity happening on the host.  This causes trouble around release
time: when multiple tags are pushed to the repository, a large number of
jobs is started concurrently on all CI runners.  In extreme cases, this
causes the system test suite to run for about an hour (!) on OpenBSD
VMs, with multiple tests failing.  We investigated the test artifacts
for all such cases in the past and the outcome was always the same: test
failures were caused by extremely slow I/O on the guest.  We tried
various tricks to work around this problem, but nothing helped.

Given the above, stop running OpenBSD system test jobs for pending BIND
releases to prevent the results of these jobs from affecting the
assessment of a given release's readiness for publication.  This change
does not affect OpenBSD build jobs.  OpenBSD system test jobs will still
be run for scheduled and web-requested pipelines, to make sure we catch
any severe issues with test code on that platform sooner or later.

(cherry picked from commit 7b002cea83)
2020-03-09 14:34:18 +01:00
Matthijs Mekking
c33930f99a Merge branch '1413-fix-dnssec-test-v9_16' into 'v9_16'
Fix dnssec test

See merge request isc-projects/bind9!3186
2020-03-09 12:09:47 +00:00
Matthijs Mekking
3b7bfa807f Fix dnssec test
There is a failure mode which gets triggered on heavily loaded
systems. A key change is scheduled in 5 seconds to make ZSK2 inactive
and ZSK3 active, but `named` takes more than 5 seconds to progress
from `rndc loadkeys` to the query check. At this time the SOA RRset
is already signed by the new ZSK which is not expected to be active
at that point yet.

Split up the checks to test the case where RRsets are signed
correctly with the offline KSK (maintained the signature) and
the active ZSK.  First run, RRsets should be signed with the still
active ZSK2, second run RRsets should be signed with the new active
ZSK3.

(cherry picked from commit aebb2aaa0f)
2020-03-09 12:04:12 +01:00
Diego dos Santos Fronza
213defd9e8 Merge branch '1472-threadsanitizer-lock-order-inversion-potential-deadlock-dns_resolver_createfetch-vs_v9_16' into 'v9_16'
Resolve "ThreadSanitizer: lock-order-inversion (potential deadlock) - dns_resolver_createfetch vs. dns_resolver_prime"

See merge request isc-projects/bind9!3157
2020-03-06 17:31:47 +00:00
Diego Fronza
277581c5a1 Fixed disposing of resolver->references in destroy() function 2020-03-06 13:37:07 -03:00
Diego Fronza
341b69aa7e Fixed potential-lock-inversion
This commit slightly simplifies the lock management within the dns_resolver_prime()
and prime_done() functions by turning the resolver's "priming" attribute
into an atomic_bool and by making only one object depend on the
"primelock" lock, namely the "primefetch" attribute.

Making the "priming" attribute an atomic type saves us from having to
use a lock just to test whether priming is on or off for the given resolver context
object within the "dns_resolver_prime" function.

The "primelock" lock is still necessary, since dns_resolver_prime() function
internally calls dns_resolver_createfetch(), and whenever this function
succeeds it registers an event in the task manager which could be called by
another thread, namely the "prime_done" function, and this function is
responsible for disposing the "primefetch" attribute in the resolver object,
also for resetting "priming" attribute to false.

It is important that the invariant "priming == false AND primefetch == NULL"
remains constant, so that any thread calling "dns_resolver_prime" knows for sure
that if the "priming" attribute is false, "primefetch" attribute should also be
NULL, so a new fetch context could be created to fulfill this purpose, and
assigned to "primefetch" attribute under the lock protection.

To honor the explanation above, dns_resolver_prime is implemented as follows:
	1. Atomically check the "priming" attribute for the given resolver context.
	2. If "priming" is false, assume that "primefetch" is NULL (this is
	   ensured by the "prime_done" implementation), acquire the "primelock"
	   lock, create a new fetch context, and update the "primefetch" pointer
	   to point to the newly allocated fetch context.
	3. If "priming" is true, assume that the job is already in progress;
	   no locks are acquired and there is nothing else to do.

To keep the previous invariant consistent, "prime_done" is implemented as follows:
	1. Acquire the "primelock" lock.
	2. Keep a reference to the current "primefetch" object.
	3. Reset the "primefetch" attribute to NULL.
	4. Release the "primelock" lock.
	5. Atomically update the "priming" attribute to false.
	6. Destroy the "primefetch" object by using the temporary reference.

This ensures that if "priming" is false, "primefetch" was already reset to NULL.

It doesn't make any difference in having the "priming" attribute not protected
by a lock, since the visible state of this variable would depend on the calling
order of the functions "dns_resolver_prime" and "prime_done".

As an example, suppose that instead of using an atomic for the "priming" attribute
we employed a lock to protect it.
Now suppose that the "prime_done" function is called by Thread A, which is then preempted
before acquiring the lock, thus not resetting "priming" to false.
In parallel, suppose that Thread B is scheduled and calls
"dns_resolver_prime()"; it then acquires the lock and sees that "priming" is true,
so it considers this resolver object to be already priming and does no
further work.
Conversely, if the lock were acquired in the other order, Thread B would see
that "priming" is false (since prime_done acquired the lock first and set "priming" to false)
and it would initiate a priming fetch for this resolver.

An atomic variable wouldn't change this behavior, since it would behave exactly the
same, depending on the function call order, with the exception that it would avoid
having to use a lock.

There should be no side effects resulting from this change, since the previous
implementation employed use of the more general resolver's "lock" mutex, which
is used in far more contexts, but in the specifics of the "dns_resolver_prime"
and "prime_done" it was only used to protect "primefetch" and "priming" attributes,
which are not used in any of the other critical sections protected by the same lock,
thus having zero dependency on those variables.
2020-03-06 13:37:07 -03:00
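The scheme described in commit 341b69aa7e can be sketched with C11 atomics and a POSIX mutex. This is a simplified illustration, not the resolver code: `resolver_t`, `fetch_t`, `resolver_init()`, and `resolver_prime()` are hypothetical, the fetch is modeled as a plain allocation, and the use of a compare-and-exchange for the atomic check is an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not BIND's real types. */
typedef struct fetch {
	int unused;
} fetch_t;

typedef struct resolver {
	atomic_bool priming;	   /* true while a priming fetch is active */
	pthread_mutex_t primelock; /* protects primefetch only */
	fetch_t *primefetch;
} resolver_t;

void
resolver_init(resolver_t *res) {
	assert(res != NULL);
	atomic_init(&res->priming, false);
	pthread_mutex_init(&res->primelock, NULL);
	res->primefetch = NULL;
}

/* Returns true if this call started priming, false otherwise. */
bool
resolver_prime(resolver_t *res) {
	assert(res != NULL);

	/* Atomically test "priming" and claim it; no lock needed. */
	bool expected = false;
	if (!atomic_compare_exchange_strong_explicit(
		    &res->priming, &expected, true, memory_order_acq_rel,
		    memory_order_acquire))
	{
		return false; /* already priming, nothing else to do */
	}

	/* "priming" was false, so "primefetch" is NULL by the invariant. */
	pthread_mutex_lock(&res->primelock);
	res->primefetch = calloc(1, sizeof(*res->primefetch));
	bool ok = (res->primefetch != NULL);
	pthread_mutex_unlock(&res->primelock);

	if (!ok) {
		atomic_store_explicit(&res->priming, false,
				      memory_order_release);
	}
	return ok;
}

void
prime_done(resolver_t *res) {
	assert(res != NULL);

	/* Detach "primefetch" under the lock, keeping a reference. */
	pthread_mutex_lock(&res->primelock);
	fetch_t *fetch = res->primefetch;
	res->primefetch = NULL;
	pthread_mutex_unlock(&res->primelock);

	/* Only now clear "priming": false implies primefetch == NULL. */
	atomic_store_explicit(&res->priming, false, memory_order_release);

	/* Destroy the fetch through the temporary reference. */
	free(fetch);
}
```

The ordering in `prime_done()` is what preserves the invariant: "primefetch" is reset before "priming" becomes false, so any caller that observes "priming == false" can safely install a new fetch.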
Diego Fronza
84d6896661 Added atomic_compare_exchange_strong_acq_rel macro
It is much better to read than:
atomic_compare_exchange_strong_explicit() with 5 arguments.
2020-03-06 13:37:07 -03:00
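A wrapper of the kind commit 84d6896661 describes might look like the sketch below. The expansion shown (acquire-release on success, acquire on failure) is an assumption based on the macro's name, and `try_claim()` is a hypothetical helper added only to show a call site.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumed expansion: three arguments at the call site instead of the
 * five taken by atomic_compare_exchange_strong_explicit().
 */
#define atomic_compare_exchange_strong_acq_rel(obj, expected, desired) \
	atomic_compare_exchange_strong_explicit(                        \
		(obj), (expected), (desired), memory_order_acq_rel,     \
		memory_order_acquire)

/*
 * Hypothetical helper: flip a flag from false to true exactly once.
 * Returns true for the caller that performed the flip.
 */
bool
try_claim(atomic_bool *flag) {
	assert(flag != NULL);

	bool expected = false;
	return atomic_compare_exchange_strong_acq_rel(flag, &expected, true);
}
```

With the memory orders fixed inside the macro, a call site only states what is being compared and what it should become.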