Commit Graph

31252 Commits

Author SHA1 Message Date
Ondřej Surý
cacaa94350 Fix the statistic counter underflow in ns_client_t
In case of normal fetch, the .recursionquota is attached and
ns_statscounter_recursclients is incremented when the fetch is created.  Then
the .recursionquota is detached and the counter decremented in the
fetch_callback().

In case of prefetch or rpzfetch, the quota is attached, but the counter is not
incremented.  When we reach the soft-quota, the function returns early but doesn't
detach from the quota, and it gets destroyed during the ns_client_endrequest(),
so no memory was leaked.

But because the ns_statscounter_recursclients is only incremented during the
normal fetch the counter would be incorrectly decremented on two occasions:

1) When we reached the softquota, because the quota was not properly detached
2) When the prefetch or rpzfetch was cancelled mid-flight and the callback
   function was never called.
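
The invariant behind this fix can be sketched as follows.  This is a
minimal illustration, not the actual patch: sketch_client_t, the
"counted" flag and both functions are hypothetical names, and the plain
integer only stands in for ns_statscounter_recursclients.  The idea is
that a decrement happens only for a fetch that was counted on the way in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real ns_client_t or the isc stats API. */
typedef struct {
	bool quota_attached; /* the fetch holds a recursion quota slot */
	bool counted;        /* this fetch incremented the statistic */
} sketch_client_t;

/* Stands in for the ns_statscounter_recursclients counter. */
static uint64_t recursclients;

/* Start a fetch; only a normal fetch is counted in the statistic. */
static void
sketch_fetch_start(sketch_client_t *c, bool normal_fetch) {
	c->quota_attached = true;
	c->counted = normal_fetch;
	if (normal_fetch) {
		recursclients++;
	}
}

/* Finish or cancel a fetch; decrement only what was incremented. */
static void
sketch_fetch_end(sketch_client_t *c) {
	if (c->counted) {
		recursclients--;
		c->counted = false;
	}
	c->quota_attached = false;
}
```

With this pairing, ending a prefetch-style fetch, or ending the same
fetch twice, leaves the counter untouched instead of wrapping it around.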

(cherry picked from commit 78886d4bed)
2020-04-03 20:22:56 +02:00
Ondřej Surý
0e9b0d79fb Remove the extra decstats on STATID_ACTIVE for children sockets
(cherry picked from commit 26842ac25c)
2020-04-03 20:22:56 +02:00
Witold Kręcicki
3559b32dcc Fix the memory ordering for the isc stats to be acquire-release
(cherry picked from commit 4ffd4cd4f6)
2020-04-03 20:22:55 +02:00
Witold Krecicki
27be7a8bd1 Merge branch 'wpk/tcpdns-refactoring-v9_16' into 'v9_16'
netmgr refactoring: use generic functions when operating on sockets.

See merge request isc-projects/bind9!3331
2020-04-03 12:21:20 +00:00
Witold Kręcicki
365636dbc9 netmgr refactoring: use generic functions when operating on sockets.
tcpdns used transport-specific functions to operate on the outer socket.
Use generic ones instead, and select the proper call in netmgr.c.
Make the missing functions (e.g. isc_nm_read) generic and add type-specific
calls (isc__nm_tcp_read). This is the preparation for netmgr TLS layer.

(cherry picked from commit 5fedd21e16)
2020-04-03 13:44:28 +02:00
Matthijs Mekking
949846399d Merge branch '1179-dnssec-stats-oom-kill-v9_16' into 'v9_16'
Resolve "OOM issue after upgrade from 9.14.3 to 9.14.4"

See merge request isc-projects/bind9!3329
2020-04-03 08:33:28 +00:00
Matthijs Mekking
ed2d3c55c2 Update release notes
(cherry picked from commit 386890a161)
2020-04-03 10:04:32 +02:00
Matthijs Mekking
df16e24d66 Replace hard coded value with constant
(cherry picked from commit c1723b2535)
2020-04-03 10:04:24 +02:00
Matthijs Mekking
f46187bcaa Merge if blocks in statschannel.c
(cherry picked from commit 1596d3b498)
2020-04-03 10:04:16 +02:00
Matthijs Mekking
ae19d0f60a Replace sign operation bool with enum
(cherry picked from commit 44b49955e1)
2020-04-03 10:04:07 +02:00
Matthijs Mekking
c3d738c883 Embed algorithm in key tag counter
Key tags are not unique across algorithms.

(cherry picked from commit b2028e26da)
2020-04-03 10:03:58 +02:00
Matthijs Mekking
facd99fd9c Group the keyid with the counters
Rather than group key ids together, group key id with its
corresponding counters. This should make growing / shrinking easier
than having keyids then counters.

(cherry picked from commit eb6a8b47d7)
2020-04-03 10:03:49 +02:00
Matthijs Mekking
e67490cadb Add test for many keys
Add a statschannel test case for DNSSEC sign metrics that has more
keys than there are allocated stats counters for.  This will produce
gibberish, but at least it should not crash.

(cherry picked from commit 31e8b2b13c)
2020-04-03 10:03:39 +02:00
Matthijs Mekking
f59f446122 Redesign dnssec sign statistics
The first attempt to add DNSSEC sign statistics was naive: for each
zone we allocated 64K counters, twice.  In reality each zone has at
most four keys, so the new approach only has room for four keys per
zone. If after a rollover more keys have signed the zone, existing
keys are rotated out.

The DNSSEC sign statistics have three counters per key, so twelve
counters per zone. The first counter is actually a key id, so it is
clear what key contributed to the metrics.  The second counter
tracks the number of generated signatures, and the third tracks
how many of those are refreshes.

This means that in the zone structure we no longer need two separate
references to DNSSEC sign metrics: both the resign and refresh stats
are kept in a single dns_stats structure.

Incrementing dnssecsignstats:

Whenever a dnssecsignstat is incremented, we look up the key id
to see if we already are counting metrics for this key.  If so,
we update the corresponding operation counter (resign or
refresh).

If the key is new, store the key id in a new counter and increment
the corresponding operation counter.

If all slots are full, we rotate the keys and overwrite the last
slot with the new key.

Dumping dnssecsignstats:

Dumping dnssecsignstats is no longer a simple wrapper around
isc_stats_dump, but uses the same principle.  The difference is that
rather than dumping the index (key tag) and counter, we have to look
up the corresponding counter.
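
The increment path described above can be sketched roughly as follows.
All names here (sketch_slot_t, sketch_find, sketch_increment) are
hypothetical, the real code stores these values in a dns_stats
structure rather than in a struct array, and dropping slot 0 on
rotation is an assumption about which key counts as the oldest.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_KEYS 4 /* "room for four keys per zone" */

typedef enum { SKETCH_OP_SIGN, SKETCH_OP_REFRESH } sketch_op_t;

/* One slot groups the key id with its two operation counters. */
typedef struct {
	bool used;
	uint32_t keyid;
	uint64_t sign;
	uint64_t refresh;
} sketch_slot_t;

typedef struct {
	sketch_slot_t slots[SKETCH_MAX_KEYS];
} sketch_signstats_t;

/* Return the slot counting this key id, or NULL if there is none. */
static sketch_slot_t *
sketch_find(sketch_signstats_t *st, uint32_t keyid) {
	for (size_t i = 0; i < SKETCH_MAX_KEYS; i++) {
		if (st->slots[i].used && st->slots[i].keyid == keyid) {
			return &st->slots[i];
		}
	}
	return NULL;
}

static void
sketch_increment(sketch_signstats_t *st, uint32_t keyid, sketch_op_t op) {
	sketch_slot_t *slot = sketch_find(st, keyid);

	if (slot == NULL) {
		/* New key: take the first free slot, if any. */
		for (size_t i = 0; i < SKETCH_MAX_KEYS && slot == NULL; i++) {
			if (!st->slots[i].used) {
				slot = &st->slots[i];
			}
		}
		if (slot == NULL) {
			/* All full: rotate out slot 0 and reuse the last slot. */
			memmove(&st->slots[0], &st->slots[1],
				sizeof(st->slots[0]) * (SKETCH_MAX_KEYS - 1));
			slot = &st->slots[SKETCH_MAX_KEYS - 1];
		}
		/* Store the key id and start its counters from zero. */
		*slot = (sketch_slot_t){ .used = true, .keyid = keyid };
	}

	if (op == SKETCH_OP_SIGN) {
		slot->sign++;
	} else {
		slot->refresh++;
	}
}
```

Dumping then amounts to walking the used slots and printing each key
id next to its own counters.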

(cherry picked from commit 705810d577)
2020-04-03 10:03:30 +02:00
Ondřej Surý
86933f4a27 Merge branch '1717-rwlock-contention-in-isc_log_wouldlog-api-performance-impact-v9_16' into 'v9_16'
Reduce rwlock contention in isc_log_wouldlog()

See merge request isc-projects/bind9!3327
2020-04-03 08:00:39 +00:00
Ondřej Surý
aec1578620 Reduce rwlock contention in isc_log_wouldlog()
The rwlock introduced to protect the .logconfig member of isc_log_t
structure caused a significant performance drop because of the rwlock
contention.  It was also found that the debug_level member of said
structure was not protected from concurrent read/writes.

The .dynamic and .highest_level members of isc_logconfig_t structure
were actually just cached values pulled from the assigned channels.

We introduced an even higher cache level for .dynamic and .highest_level
members directly into the isc_log_t structure, so we don't have to
access the .logconfig member in the isc_log_wouldlog() function.
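
A minimal sketch of that idea, using C11 atomics: the would-log check
reads only atomically cached values and never takes the rwlock.  The
struct and function names are hypothetical and the comparison logic is
an assumption modelled on the description above, not a copy of
isc_log_wouldlog().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cached fields in the log context. */
typedef struct {
	atomic_int debug_level;
	atomic_int highest_level; /* cached from the active configuration */
	atomic_bool dynamic;      /* cached from the active configuration */
} sketch_log_t;

/* Decide whether a message at this level could be logged, lock-free. */
static bool
sketch_wouldlog(sketch_log_t *lctx, int level) {
	if (lctx == NULL) {
		return false;
	}
	if (level <= atomic_load_explicit(&lctx->highest_level,
					  memory_order_acquire)) {
		return true;
	}
	/* Dynamic channels follow the current debug level. */
	if (atomic_load_explicit(&lctx->dynamic, memory_order_acquire)) {
		return level <= atomic_load_explicit(&lctx->debug_level,
						     memory_order_acquire);
	}
	return false;
}
```

Whoever swaps the configuration would store the new cached values with
release ordering after the new configuration is in place.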

(cherry picked from commit 3a24eacbb6)
2020-04-03 07:59:34 +00:00
Matthijs Mekking
96660671e2 Merge branch '1706-dnssec-policy-migration-v9_16' into 'v9_16'
Resolve "Changing from auto-dnssec maintain to dnssec-policy x immediately deletes existing keys"

See merge request isc-projects/bind9!3328
2020-04-03 07:59:04 +00:00
Matthijs Mekking
3726d7f857 Test migration to dnssec-policy, change algorithm
Add a test to ensure migration from 'auto-dnssec maintain;' to
dnssec-policy works even if the algorithm is changed.  The existing
keys should not be removed immediately, but their goal should be
changed to become hidden, and the new keys with the different
algorithm should be introduced immediately.

(cherry picked from commit 551acb44f4)
2020-04-03 09:17:06 +02:00
Matthijs Mekking
9387729711 Only initialize goal on active keys
If we initialize goals on all keys, superfluous keys that match
the policy all desire to be active.  For example, there are six
keys available for a policy that needs just two, we only want to
set the goal state to OMNIPRESENT on two keys, not six.

(cherry picked from commit 2389fcb4dc)
2020-04-03 09:16:51 +02:00
Matthijs Mekking
1553411d43 Update documentation with !1706 fix
(cherry picked from commit f47e697da3)
2020-04-03 09:16:25 +02:00
Matthijs Mekking
4741f2d07e Test migration to dnssec-policy, retire old keys
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly.  An earlier commit deals with migration where existing keys
match the policy.  This commit deals with migration where existing
keys do not match the policy.  In that case, named must not
immediately delete the existing keys, but gracefully roll to the
dnssec-policy.

However, named did remove the existing keys immediately.  This is
because the legacy key states were initialized badly.  Because
those keys had their states initialized to HIDDEN or RUMOURED, the
keymgr decides that they can be removed (because a key can be used
safely only when its states are OMNIPRESENT).

The original thought to initialize key states to HIDDEN (and
RUMOURED to deal with existing keys) was to ensure that those keys
will go through the required propagation time before the keymgr
decides they can be used safely.  However, those keys are already
in the zone for a long time and making the key states represent
otherwise is dangerous: keys may be pulled out of the zone while
in fact they are required to establish the chain of trust.

Fix initializing key states for existing keys by looking more closely
at the time metadata.  Add TTL and propagation delays to the time
metadata and see if the DNSSEC records have been propagated.
Initialize the state to OMNIPRESENT if so, otherwise initialize to
RUMOURED.  If the time metadata is in the future, or does not exist,
keep initializing the state to HIDDEN.
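
The decision described in this paragraph can be sketched as a small
function.  The names are hypothetical, a timestamp of 0 is assumed to
mean "metadata not set", and treating the exact boundary as already
propagated is also an assumption; the real keymgr code works on key
metadata and policy values rather than bare integers.

```c
#include <stdint.h>

typedef enum {
	SKETCH_HIDDEN,
	SKETCH_RUMOURED,
	SKETCH_OMNIPRESENT
} sketch_state_t;

/*
 * Pick the initial state for an existing key from one piece of time
 * metadata.  "when" is the timestamp from the key file (0 = not set);
 * "ttl" and "propagation_delay" are in seconds.
 */
static sketch_state_t
sketch_initial_state(uint64_t now, uint64_t when, uint32_t ttl,
		     uint32_t propagation_delay) {
	if (when == 0 || when > now) {
		/* No metadata, or it lies in the future. */
		return SKETCH_HIDDEN;
	}
	if (when + ttl + propagation_delay <= now) {
		/* Published long enough ago to have propagated. */
		return SKETCH_OMNIPRESENT;
	}
	/* Published, but caches may still hold older data. */
	return SKETCH_RUMOURED;
}
```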

The added test makes sure that new keys matching the policy are
introduced, but existing keys are kept in the zone until the new
keys have been propagated.

(cherry picked from commit 7f43520893)
2020-04-03 09:16:11 +02:00
Matthijs Mekking
83a00866b0 Tweak kasp system test
A few kasp system test tweaks to improve test failure debugging and
deal with tests related to migration to dnssec-policy.

1. When clearing a key, set lifetime to "none".  If "none", expect
   no lifetime to be set in the state file.  Legacy keys that
   are migrated but don't match the dnssec-policy will not have a
   lifetime.

2. The kasp system test prints which key id and file it is checking.
   Log explicitly if we are checking the id or a file.

3. Add quotes around "ID" when setting the key id, for consistency.

4. Fix a typo (non -> none).

5. Print which key ids are found; this way it is easier to see which
   KEY[1-4] failed to match one of the key files.

(cherry picked from commit a224754d59)
2020-04-03 09:15:51 +02:00
Matthijs Mekking
7aa5a11bdd Fix and test migration to dnssec-policy
Migrating from 'auto-dnssec maintain;' to dnssec-policy did not
work properly, mainly because the legacy keys were initialized
badly. Several adjustments in the keymgr are required to get it right:

- Set published time on keys when we calculate prepublication time.
  This is not strictly necessary, but it is weird to have an active
  key without the published time set.

- Initialize key states also before matching keys. Determine the
  target state by looking at existing time metadata: If the time
  data is set and is in the past, it is a hint that the key and
  its corresponding records have been published in the zone already,
  and the state is initialized to RUMOURED. Otherwise, initialize it
  as HIDDEN. This fixes migration to dnssec-policy from existing
  keys.

- Initialize key goal on keys that match key policy to OMNIPRESENT.
  These may be existing legacy keys that are being migrated.

- A key that has its goal set to OMNIPRESENT *or* an active key can
  match a kasp key.  The code was changed with CHANGE 5354 that
  was a bugfix to prevent creating new KSK keys for zones in the
  initial stage of signing.  However, this caused problems for
  restarts when rollovers are in progress, because an outroducing
  key can still be an active key.

The test for this introduces a new KEY property 'legacy'.  This is
used to skip tests related to .state files.

(cherry picked from commit 6801899134)
2020-04-03 09:15:39 +02:00
Evan Hunt
6c379655d9 Merge branch '1447-incremental-rpz-update-v9_16' into 'v9_16'
incrementally clean up old RPZ records during updates

See merge request isc-projects/bind9!3319
2020-04-01 09:55:26 +00:00
Evan Hunt
5700485c21 CHANGES and release note
(cherry picked from commit 899f9440c0)
2020-04-01 01:32:55 -07:00
Evan Hunt
a288dee81e incrementally clean up old RPZ records during updates
After an RPZ zone is updated via zone transfer, the RPZ summary
database is updated, inserting the newly added names in the policy
zone and deleting the newly removed ones. The first part of this
was quantized so it would not run too long and starve other tasks
during large updates, but the second part was not quantized, so
that an update in which a large number of records were deleted
could cause named to become briefly unresponsive.
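
"Quantized" here means doing a bounded amount of work per run and then
yielding.  A minimal sketch of that pattern, with hypothetical names
and an arbitrary quantum size (the real code walks the RPZ summary
data and reschedules itself through the task system):

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUANTUM 1024 /* hypothetical number of names per run */

/* Hypothetical bookkeeping for one pending cleanup. */
typedef struct {
	size_t next;    /* index of the next name to delete */
	size_t total;   /* number of names that must be deleted */
	size_t deleted; /* names deleted so far */
} sketch_cleanup_t;

/*
 * Delete at most SKETCH_QUANTUM names, then return.  The return value
 * tells the caller whether to schedule another run.
 */
static bool
sketch_cleanup_quantum(sketch_cleanup_t *job) {
	size_t done = 0;

	while (job->next < job->total && done < SKETCH_QUANTUM) {
		/* ... delete name number job->next from the summary data ... */
		job->next++;
		job->deleted++;
		done++;
	}
	return job->next < job->total;
}
```

Between two runs other tasks get a chance to execute, so a very large
deletion no longer monopolizes the server.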

(cherry picked from commit 32da119ed8)
2020-04-01 01:32:55 -07:00
Mark Andrews
4e32fd130f Merge branch 'marka-empty-release-notes-v9_16' into 'v9_16'
add empty release notes for 9.16.2

See merge request isc-projects/bind9!3314
2020-03-31 07:07:20 +00:00
Mark Andrews
657ad6de31 add empty release notes for 9.16.2
(cherry picked from commit 503e2dff64)
2020-03-31 17:12:03 +11:00
Witold Krecicki
df93653818 Merge branch '1700-proper-tcp-resuming-v9_16' into 'v9_16'
Deactivate the handle before sending the async close callback.

See merge request isc-projects/bind9!3310
2020-03-30 12:57:50 +00:00
Witold Kręcicki
3274650123 Deactivate the handle before sending the async close callback.
We could have a race between handle closing and processing async
callback. Deactivate the handle before issuing the callback - we
have the socket referenced anyway so it's not a problem.
2020-03-30 10:54:12 +00:00
Witold Krecicki
52ae7bf603 Merge branch 'wpk/quota-callback-v9_16' into 'v9_16'
Add a quota attach function with a callback, some code cleanups.

See merge request isc-projects/bind9!3309
2020-03-30 10:30:23 +00:00
Witold Kręcicki
7ab77d009d Add a quota attach function with a callback, some code cleanups.
We introduce an isc_quota_attach_cb function - if ISC_R_QUOTA is returned
at the time the function is called, then a callback will be called when
there's quota available (with quota already attached). The callbacks are
organized as a LIFO queue in the quota structure.
It's needed for TCP client quota -  with old networking code we had one
single place where tcp clients quota was processed so we could resume
accepting when we had spare slots, but it's gone with netmgr - now
we need to notify the listener/accepter that there's quota available so
that it can resume accepting.
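
The attach-with-callback idea can be sketched as below.  This is a
single-threaded illustration with hypothetical names and signatures;
the real isc_quota code has to be thread-safe and its API differs.
The point is that a detach hands the freed slot directly to a waiter,
so the callback runs with the quota already attached.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct sketch_quota sketch_quota_t;
typedef void (*sketch_quota_cb_t)(sketch_quota_t *quota, void *arg);

/* A queued request to be notified when quota becomes available. */
typedef struct sketch_waiter {
	sketch_quota_cb_t cb;
	void *arg;
	struct sketch_waiter *next;
} sketch_waiter_t;

struct sketch_quota {
	unsigned int max;
	unsigned int used;
	sketch_waiter_t *waiters; /* LIFO, as in the description */
};

/* Return true if attached now; otherwise queue the waiter. */
static bool
sketch_quota_attach_cb(sketch_quota_t *q, sketch_waiter_t *w) {
	if (q->used < q->max) {
		q->used++;
		return true;
	}
	w->next = q->waiters;
	q->waiters = w;
	return false;
}

/* Release a slot, or pass it straight to the most recent waiter. */
static void
sketch_quota_detach(sketch_quota_t *q) {
	sketch_waiter_t *w = q->waiters;

	if (w == NULL) {
		q->used--;
		return;
	}
	q->waiters = w->next;
	w->next = NULL;
	w->cb(q, w->arg); /* "used" is unchanged: the waiter owns the slot */
}

/* Example callback: count how often a listener was told to resume. */
static void
sketch_count_cb(sketch_quota_t *quota, void *arg) {
	(void)quota;
	(*(int *)arg)++;
}
```

A listener that gets "false" back stops accepting and resumes from its
callback, which is the notification the paragraph above asks for.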

Remove unused isc_quota_force() function.

The isc_quota_reserve and isc_quota_release were used only internally
from the quota.c and the tests.  We should not expose API we are not
using.

(cherry picked from commit d151a10f30)
2020-03-30 10:29:33 +02:00
Mark Andrews
a5ec7f9c83 Merge branch '1678-bind-fails-to-build-with-mysql-support-against-mysql8-mysql-connector-8-v9_16' into 'v9_16'
Resolve "BIND fails to build with MYSQL support against mysql8/mysql-connector-8"

See merge request isc-projects/bind9!3305
2020-03-26 23:21:58 +00:00
Ondřej Surý
2f3272ef86 Use compound literals in mysql_options() call
Makes use of compound literals instead of using extra my_bool
variable just to hold "true/1" value.
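
As an illustration of the technique only: the sketch below does not
include the MySQL headers, so sketch_bool_t stands in for my_bool and
sketch_set_option() is a hypothetical function that, like
mysql_options(), takes its value through a "const void *" argument.

```c
#include <stdbool.h>

typedef bool sketch_bool_t; /* stands in for my_bool */

static bool sketch_reconnect; /* records the option for this example */

/* Hypothetical setter that receives the option value by pointer. */
static int
sketch_set_option(const void *arg) {
	sketch_reconnect = *(const sketch_bool_t *)arg;
	return 0;
}

static int
sketch_configure(void) {
	/*
	 * Without a compound literal an extra variable is needed:
	 *     sketch_bool_t on = 1;
	 *     return sketch_set_option(&on);
	 */
	return sketch_set_option(&(sketch_bool_t){ 1 });
}
```

The compound literal is an unnamed object with automatic storage, so
taking its address for the duration of the call is valid C99.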

(cherry picked from commit 715b7a7cec)
2020-03-27 09:05:46 +11:00
Mark Andrews
3387fa03e4 Typedef my_bool if missing.
Oracle MySQL 8.0 has dropped the my_bool type, so we need to reinstate
it when compiling with that version or higher.  MariaDB is still
keeping the my_bool type.  The numbering between the two (MariaDB 5.x
jumped to MariaDB 10.x) doesn't make the life of the developer easy.

(cherry picked from commit c6d5d5c88f)
2020-03-27 09:05:46 +11:00
Mark Andrews
5f6b54927e remove unused variable
(cherry picked from commit 7af9883b48)
2020-03-27 09:05:46 +11:00
Michał Kępień
4bade7774a Merge branch 'michal/misc-gitlab-ci-yml-cleanups-v9_16' into 'v9_16'
[v9_16] Miscellaneous .gitlab-ci.yml cleanups

See merge request isc-projects/bind9!3300
2020-03-26 10:43:11 +00:00
Michał Kępień
7910702fec Remove unused YAML anchors
Some YAML anchors defined in .gitlab-ci.yml are not subsequently used.
Remove them to prevent confusion.

(cherry picked from commit 3d121ede6c)
2020-03-26 11:41:55 +01:00
Michał Kępień
688b759ed0 Do not install compiledb in cppcheck job
compiledb is already included in the Docker image used by the cppcheck
job.  Do not attempt installing it again.

(cherry picked from commit 3d264dbe81)
2020-03-26 11:41:55 +01:00
Michał Kępień
89ef138ba6 Include compiler name in all build/test job names
Most build/test job names already contain a "clang", "gcc", or "msvc"
prefix which indicates the compiler used for a given job.  Apply that
naming convention to all build/test job names.

(cherry picked from commit 0c898084cd)
2020-03-26 11:41:55 +01:00
Michał Kępień
6044f6d73d Refactor TSAN unit test job definitions
Multiple YAML keys have identical values for both TSAN unit test job
definitions.  Extract these common keys to a YAML anchor and use it in
TSAN unit test job definitions to reduce code duplication.

(cherry picked from commit 84463f33bf)
2020-03-26 11:41:55 +01:00
Michał Kępień
536704c749 Run "kyua report-html" for TSAN unit test jobs
Definitions of jobs running unit tests under TSAN contain an
"after_script" YAML key.  Since the "unit_test_job" anchor is included
in those job definitions before "after_script" is defined, the
job-specific value of that key overrides the one defined in the included
anchor.  This prevents "kyua report-html" from being run for TSAN unit
test jobs.  Moving the invocation of "kyua report-html" to the "script"
key in the "unit_test_job" anchor is not acceptable as it would cause
the exit code of that command to determine the result of all unit test
jobs and we need that to be the exit code of "make unit".  Instead, add
"kyua report-html" invocations to the "after_script" key of TSAN unit
test job definitions to address the problem without affecting other job
definitions.

(cherry picked from commit 6ebce9425e)
2020-03-26 11:41:55 +01:00
Michał Kępień
873cefc8c9 Refactor TSAN system test job definitions
Multiple YAML keys have identical values for both TSAN system test job
definitions.  Extract these common keys to a YAML anchor and use it in
TSAN system test job definitions to reduce code duplication.

(cherry picked from commit a9aa295f1f)
2020-03-26 11:41:54 +01:00
Michał Kępień
0c726127f7 Drop "before_script" key from TSAN job definitions
Both "system_test_job" and "unit_test_job" YAML anchors contain a
"before_script" key.  TSAN job definitions first specify their own value
of the "before_script" key and then include the aforementioned YAML
anchors, which results in the value of the "before_script" key being
overridden with the value specified by the included anchor.  Given this,
remove "before_script" definitions specific to TSAN jobs as they serve
no practical purpose.

(cherry picked from commit 8ef01c7b50)
2020-03-26 11:41:54 +01:00
Michał Kępień
b358cf30b2 Define TSAN options in a global variable
All assignments for the TSAN_OPTIONS variable are identical across the
entire .gitlab-ci.yml file.  Define a global TSAN_OPTIONS_COMMON
variable and use it in job definitions to reduce code duplication.

(cherry picked from commit 6325c0993a)
2020-03-26 11:41:54 +01:00
Ondřej Surý
f24de93e80 Merge branch '1679-fix-the-tv_nsec_check-v9_16' into 'v9_16'
Fix the tv_nsec check in isc_stdtime_get() (v9.16)

See merge request isc-projects/bind9!3293
2020-03-25 22:00:24 +00:00
Ondřej Surý
e017574b74 Correct the typecast of .tv_sec in isc_stdtime_get()
2020-03-25 22:10:10 +01:00
Ondřej Surý
2bb2a10ba4 Fix the tv_nsec check in isc_stdtime_get()
(cherry picked from commit 0d06a62dd1)
2020-03-25 21:19:55 +01:00
Ondřej Surý
7e79134ec0 Merge branch 'ondrej/no-clang-on-debian-sid-v9_16' into 'v9_16'
Rewrite .gitlab-ci.yml to have 'base_image' and other GitLab CI improvements (v9.16)

See merge request isc-projects/bind9!3288
2020-03-25 17:29:25 +00:00
Ondřej Surý
71c5f29573 Replace clang:stretch:amd64 build with clang:buster:amd64 build (+ add missing system test)
(cherry picked from commit 281531d82b)
2020-03-25 18:12:39 +01:00