bind9

Author	SHA1	Message	Date
Tony Finch	a8f1d0c19c	Compress zone transfers properly After change 5995, zone transfers were using a small compression context that only had space for the first few dozen names in each message. They now use a large compression context with enough space for every name.	2022-11-30 12:16:09 +00:00
Ondřej Surý	1816244725	Don't log "final reference detached" on INFO level The "final reference detached" message was meant to be DEBUG(1), but was instead kept at INFO level. Move it to the DEBUG(1) logging level, so it's not printed under normal operations.	2022-11-30 11:04:45 +01:00
Ondřej Surý	35d8d72dd8	Keep the unlinked ADB entries until expiration Currently, the ADB uses a TTL of 0 for ADB names that the server is authoritative for and a TTL of 10 seconds for HINT and GLUE ADB names. This requires the unlinked ADB entries to be kept around, because they would otherwise disappear too quickly. This especially affects the root zone, as the trust level is "ultimate" for the root zone nameservers. This commit restores the ability to keep the unlinked ADB entries in the database for later reuse, restores printing the unlinked entries and adds some extra cleaning of the unlinked ADB entries on the tail of the LRU list (similar to what we are doing for the ADB names).	2022-11-30 10:03:24 +01:00
Ondřej Surý	50f357cb36	Refactor the dns_adb unit The dns_adb unit has been refactored to be much simpler. The following changes have been made: 1. Simplify the ADB to always allow GLUE and hints There were only two places where dns_adb_createfind() was used - in the dns_resolver unit where hints and GLUE addresses were ok, and in the dns_zone where dns_adb_createfind() would be called without DNS_ADBFIND_HINTOK and DNS_ADBFIND_GLUEOK set. Simplify the logic by allowing hint and GLUE addresses when looking up the nameserver addresses to notify. The difference is negligible and would cause a difference in the notified addresses only when there's a mismatch between the parent and child addresses and we haven't cached the child addresses yet. 2. Drop the namebuckets and entrybuckets Formerly, the namebuckets and entrybuckets were used to reduce the lock contention when accessing the double-linked lists stored in each bucket. In the previous refactoring, the custom hashtable for the buckets has been replaced with isc_ht/isc_hashmap, so only a single item (mostly, see below) would end up in each bucket. Removing the entrybuckets has been straightforward, the only matching was done on the isc_sockaddr_t member of the dns_adbentry. Removing the namebuckets required GLUEOK and HINTOK bits to be removed because the find could match entries with-or-without the bits set, and creating a custom key that stores the DNS_ADBFIND_STARTATZONE in the first byte of the key, so we can do a straightforward lookup into the hashtable without traversing a list that contains items with different flags. 3. Remove unassociated entries from ADB database Previously, the adbentries could live in the ADB database even after unlinking them from dns_adbnames. Such entries would show up as "Unassociated entries" in the ADB dump. The benefit of keeping such entries is small - the chance that we link such an entry to an adbname is small, and it's simpler to evict unlinked entries from the ADB cache (and the hashtable) than to create a second LRU cleaning mechanism. Unlinked ADB entries are now directly deleted from the hash table (hashmap) upon destruction. 4. Cleanup expired entries from the hash table When buckets were still in place, the code would keep the buckets always allocated and never shrink the hash table (hashmap). With proper reference counting in place, we can delete the adbnames from the hash table and the LRU list. 5. Stop purging the names early when we hit the time limit Because the LRU list is now time ordered, we can stop purging the names when we find the first entry that doesn't fulfill our time-based eviction criteria because no further entry on the LRU list will meet the criteria. Future work: 1. Lock contention In this commit, the focus was on correctness of the data structure, but in the future, the lock contention in the ADB database needs to be addressed. Currently, we use a simple mutex to lock the hash tables, because we almost always need to use a write lock for properly purging the hashtables. The ADB database needs to be sharded (similar to the effect that buckets had in the past). Each shard would contain its own hashmap and its own LRU list. 2. Time-based purging The ADB names and entries stay intact when there are no lookups. When we add separate shards, a timer needs to be added for time-based cleaning in case there's no traffic hashing to the inactive shard. 3. Revisit the 30 minutes limit The ADB cache is capped at 30 minutes. This needs to be revisited, and at least the limit should be configurable (in both directions).	2022-11-30 10:03:24 +01:00
Ondřej Surý	66d8bb03cb	Create per-thread task for dns_adb resolver fetches The dns_adb would serialize all fetches on a single task. Create a per-thread task, so the fetches will stay local to the thread that initiated the fetch.	2022-11-30 10:03:24 +01:00
Ondřej Surý	0d4ef6fcd7	Expire namehooks when purging stale ADB names Instead of trying to expire entries from adbentrybuckets, expire the namehooks while purging the stale ADB names.	2022-11-30 10:03:23 +01:00
Ondřej Surý	557a71a6f9	Purge stale ADB names globally, not per bucket Before the refactoring, there were only a few buckets with many names in them, so cleaning up stale ADB names per bucket made sense. After the refactoring, each bucket maps directly to an ADB name, so purging has been effectively disabled. Create a global LRU list for ADB names (and ADB entries) and purge the stale ADB names globally.	2022-11-30 10:03:23 +01:00
Ondřej Surý	327768e280	dns_adb: Remove deadnames and deadentries Previously, the name and entry buckets were much larger, so the dead names and entries were moved to a secondary list to be cleaned later (e.g. after the already running fetch has been canceled). After the last refactoring, the bucket now contains only the name (entry) itself and thus the extra list has little use. Remove the .deadnames and .deadentries from dns_adbnamebucket_t and dns_adbentrybucket_t structures.	2022-11-30 10:03:23 +01:00
Ondřej Surý	77659e7392	Refactor dns_rpz unit to use single reference counting The dns_rpz_zones structure was using .refs and .irefs for strong and weak reference counting. Rewrite the unit to use just a single reference counting + shutdown sequence (dns_rpz_destroy_rpzs) that must be called by the creator of the dns_rpz_zones_t object. Remove the reference counting from the dns_rpz_zone structure as it is not needed because the zone objects are fully embedded into the dns_rpz_zones structure and the dns_rpz_zones_t object must never be destroyed before all dns_rpz_zone_t objects. The dns_rpz_zones_t reference counting uses the new ISC_REFCOUNT_TRACE capability - enabled by defining DNS_RPZ_TRACE in the dns/rpz.h header. Additionally, add magic numbers to the dns_rpz_zone and dns_rpz_zones structures.	2022-11-30 09:59:35 +01:00
Ondřej Surý	118ae66976	Add extra set of ISC_REFCOUNT_TRACE_{IMPL,DECL} macros The new ISC_REFCOUNT_TRACE_{IMPL,DECL} macros can be used to add a reference tracing capability to any unit using the reference counting. It requires a little bit of extra work in each header as you can't have a define from inside a define (see rpz.h), but it's fairly easy to add tracing to any struct using reference counting with these macros.	2022-11-29 23:57:40 -08:00
Ondřej Surý	fa275a59da	Remove the unused cache cleaning mechanism from dns_cache API The dns_cache API contained a cache cleaning mechanism that would be disabled for 'rbt' based cache. As named doesn't have any other cache implementations, remove the cache cleaning mechanism from dns_cache API.	2022-11-29 13:48:33 -08:00
Ondřej Surý	5e4a26856c	Remove the dead external cache cleaning mechanism from RBTDB The RBTDB has its own cache cleaning mechanism and therefore the iterator .cleaning member would never be set to true. Remove the code that checks for iterator->cleaning from the RBTDB.	2022-11-29 13:48:33 -08:00
Artem Boldariev	9b1c8c03fd	TCP: use uv_try_write() to optimise sends This commit makes the TCP code use uv_try_write() on a best-effort basis, just like the TCP DNS and TLS DNS code does. This optimisation was added in 'caa5b6548a11da6ca772d6f7e10db3a164a18f8d', but a similar change was mistakenly omitted from the generic TCP code. This commit fixes that.	2022-11-29 13:41:10 +02:00
Michal Nowak	afdb41a5aa	Update sources to Clang 15 formatting	2022-11-29 08:54:34 +01:00
Tony Finch	96b6d78f75	Speed up lib/dns/gen.c The `gen` program was causing a lengthy single-threaded pause in the BIND build. When generating RDATATYPE_FROMTEXT_SW(), `gen` hit the inner loop of `find_typename()` over 1.2 billion times. This change avoids long deeply-nested loops, so `gen` now runs in less than 10ms, about 300x faster. No changes to the output.	2022-11-28 09:44:26 +00:00
Ondřej Surý	d8df29e37d	Be more resilient when destroying the httpd requests Don't restart reading in the send callback after the httpdmgr has been shut down, and call httpd_request(..., ISC_R_SHUTDOWN, ...) when shutting down the httpdmgr to reduce code duplication.	2022-11-25 16:20:34 +01:00
Ondřej Surý	f3004da3a5	Make the netmgr send callback to be asynchronous only when needed Previously, the send callback would be synchronous only on success. Add an option (similar to what other callbacks have) to decide whether we need the asynchronous send callback on a higher level. On a general level, we need the asynchronous callbacks to happen only when we are invoking the callback from the public API. If the path to the callback went through the libuv callback or netmgr callback, we are already on asynchronous path, and there's no need to make the call to the callback asynchronous again. For the send callback, this means we need the asynchronous path for failure paths inside the isc_nm_send() (which calls isc__nm_udp_send(), isc__nm_tcp_send(), etc...) - all other invocations of the send callback could be synchronous, because those are called from the respective libuv send callbacks.	2022-11-25 15:46:25 +01:00
Ondřej Surý	5ca49942a3	Make the netmgr read callback to be asynchronous only when needed Previously, the read callback would be synchronous only on success or timeout. Add an option (similar to what other callbacks have) to decide whether we need the asynchronous read callback on a higher level. On a general level, we need the asynchronous callbacks to happen only when we are invoking the callback from the public API. If the path to the callback went through the libuv callback or netmgr callback, we are already on asynchronous path, and there's no need to make the call to the callback asynchronous again. For the read callback, this means we need the asynchronous path for failure paths inside the isc_nm_read() (which calls isc__nm_udp_read(), isc__nm_tcp_read(), etc...) - all other invocations of the read callback could be synchronous, because those are called from the respective libuv or netmgr read callbacks.	2022-11-25 15:46:15 +01:00
Tony Finch	00307fe318	Deduplicate time unit conversion factors The various factors like NS_PER_MS are now defined in a single place and the names are no longer inconsistent. I chose the _PER_SEC names rather than _PER_S because it is slightly more clear in isolation; but the smaller units are always NS, US, and MS.	2022-11-25 13:23:36 +00:00
Mark Andrews	b95d089751	Fix log messages incorrectly logged at error The log message "got TLS configuration for zone transfer" is not an error, setting to info.	2022-11-25 08:50:36 +11:00
Mark Andrews	65f2512315	TLS settings of primaries with catalog zones were being ignored Extract the tlss values if present from the ipkeylist entry and add the resulting tls setting to the constructed configuration for the primary. When comparing catalog zone entries for reuse also check the masters.tlss values for equality.	2022-11-25 08:50:36 +11:00
Evan Hunt	18606f5276	remove unused 'nupdates' field from client the 'nupdates' field was originally used to track whether a client was ready to shut down, along with other similar counters nreads, nrecvs, naccepts and nsends. this is now tracked differently, but nupdates was overlooked when the other counters were removed.	2022-11-23 23:44:10 +00:00
Matthijs Mekking	f9845dd128	Deprecate auto-dnssec Deprecate auto-dnssec, add specific log warning to migrate to dnssec-policy.	2022-11-23 09:46:16 +01:00
Matthijs Mekking	f71a6692db	Obsolete dnssec-secure-to-insecure option Now that the key management operations using dynamic updates feature has been removed, the 'dnssec-secure-to-insecure' option has become obsoleted.	2022-11-18 11:04:17 +01:00
Matthijs Mekking	b6c2776df5	Remove dynamic update key management code Remove code that triggers key and denial of existence management operations. Dynamic update should no longer be used to do DNSSEC maintenance (other than that of course signatures need to be created for the new zone contents).	2022-11-18 11:04:17 +01:00
Tony Finch	1c0f607811	Simplify and speed up DNS name decompression The aim is to do less work per byte: * Check the bounds for each label, instead of checking the bounds for each character. * Instead of copying one character at a time from the wire to the name, copy entire runs of sequential labels using memmove() to make the most of its fast loop. * To remember where the name ends, we only need to set the end marker when we see a compression pointer or when we reach the root label. There is no need to check if we jumped back and conditionally update the counter for every character. * To parse a compression pointer, we no longer take a diversion around the outer loop in between reading the upper byte of the pointer and the lower byte. * The parser state machine is now implicit in the instruction pointer, instead of being an explicit variable. Similarly, when we reach the root label we break directly out of the loop instead of setting a second state machine variable. * DNS_NAME_DOWNCASE is never used with dns_name_fromwire() so that option is no longer supported. I have removed this comment which dated from January 1999 when dns_name_fromwire() was first introduced: /* * Note: The following code is not optimized for speed, but * rather for correctness. Speed will be addressed in the future. */ No functional change, apart from removing support for the unused DNS_NAME_DOWNCASE option. The new code is about 2x faster than the old code: best case 11x faster, worst case 1.4x faster.	2022-11-17 08:45:15 +00:00
Tony Finch	e0c9692341	Clean up remnants of label types There were a few comments referring obliquely to different kinds of labels, which became obsolete a long time ago.	2022-11-17 08:44:27 +00:00
Mark Andrews	dfbffd77f9	Select the appropriate namespace when using a dual stack server When using dual-stack-servers the covering namespace to check whether answers are in scope or not should be fctx->domain. To do this we need to be able to distinguish forwarding due to forwarders clauses from forwarding due to dual-stack-servers. A new flag FCTX_ADDRINFO_DUALSTACK has been added to signal this.	2022-11-17 12:23:45 +11:00
Ondřej Surý	379929e052	Deprecate setting operating system limits from named.conf It was possible to set operating system limits (RLIMIT_DATA, RLIMIT_STACK, RLIMIT_CORE and RLIMIT_NOFILE) from named.conf. It's better to leave these untouched as setting these is responsibility of the operating system and/or supervisor. Deprecate the configuration options and remove them in future BIND 9 release.	2022-11-14 16:48:52 +01:00
Ondřej Surý	0bf7014f85	Remove the last remnants of --with-tuning=large The small/large tuning has been completely removed from the code with last remnant of the dead code in ns_interfacemgr. Remove the dead code and the configure option.	2022-11-14 10:01:20 +01:00
Mark Andrews	f053d5b414	Have dns_zt_apply lock the zone table There were a number of places where the zone table should have been locked, but wasn't, when dns_zt_apply was called. Added an isc_rwlocktype_t type parameter to dns_zt_apply and adjusted all calls to use it. Removed locks in callers.	2022-11-11 15:26:11 +00:00
Matthijs Mekking	53eab06083	Change default TTL of NSEC3PARAM to SOA MINIMUM Although the RFC says that the NSEC3PARAM is not something that is intended to be cached by the resolver, and thus a TTL of 0 is most logical, a zero TTL RRset can be abused by bad actors. Change the default to SOA MINIMUM.	2022-11-11 12:06:33 +01:00
Ondřej Surý	417097450a	Check view->adb in dns_view_flushcache() The call to dns_view_flushcache() is done under exclusive mode, but we still need to check if view->adb is still attached before calling dns_adb_flush() because the shutdown might have been already initiated. This is most likely only a theoretical problem on shutdown because there's either no way to initiate a cache flush when shutting down or only a very slim window that `rndc flush` would have to hit during named shutdown.	2022-11-11 11:47:44 +01:00
Ondřej Surý	a8ba240325	Don't use view->resolver directly when priming in dns_view_find() When starting priming from dns_view_find(), the dns_view shutdown could be initiated by different thread, detaching from the resolver. Use dns_view_getresolver() to attach to the resolver under view->lock, so we don't try to call dns_resolver_prime() with NULL pointer. There are more accesses to view->resolver, (and also view->adb and view->requestmgr that suffer from the same problem) in the dns_view module, but they are all done in exclusive mode or under a view->lock.	2022-11-11 11:47:44 +01:00
Ondřej Surý	e4654d1a6a	Bump the allowed HTTP headers in statschannel to 100 Firefox 90+ apparently sends more than 10 headers, so we need to bump the number to some higher number. Bump it to 100 just to be on the safe side, this is for internal use only anyway.	2022-11-10 16:34:26 +01:00
Ondřej Surý	b7eabb6394	Use isc_hashmap instead of isc_ht in the dns_resolver API Replace the use of isc_ht API with isc_hashmap API in the dns_resolver implementation. This requires extending the fctxbucket_t structure to include keysize and copy of the key because the isc_hashmap API needs the raw key in case of resizing the hashmap table.	2022-11-10 15:07:19 +01:00
Ondřej Surý	e1220a2d4f	Use isc_hashmap instead of isc_ht in the dns_adb API Replace the use of isc_ht API with isc_hashmap API in the dns_adb database implementation. This requires extending the dns_adbnamebucket_t and dns_adbentrybucket_t structures to include keysize and copy of the key because the isc_hashmap API needs the raw key in case of resizing the hashmap table.	2022-11-10 15:07:19 +01:00
Ondřej Surý	f46ce447a6	Add isc_hashmap API that implements Robin Hood hashing Add new isc_hashmap API that differs from the current isc_ht API in several aspects: 1. It implements Robin Hood hashing, which is an open-addressing hash table algorithm (i.e. no linked lists). 2. No memory allocations - the array to store the nodes is made of isc_hashmap_node_t structures instead of just pointers, so there's only allocation on resize. 3. The key is not copied into the hashmap node and must be also stored externally, either as part of the stored value or in any other location that's valid as long as the value is stored in the hashmap. This makes the isc_hashmap_t a little less universal because of the key storage requirements, but the inserts and deletes are faster because they don't require memory allocation on isc_hashmap_add() and memory deallocation on isc_hashmap_delete().	2022-11-10 15:07:19 +01:00
Ondřej Surý	9d2f22e666	Properly name the loop->mctx The per-loop memory contexts were unnamed, properly name them as 'loop<tid>'.	2022-11-08 13:32:13 +01:00
Mark Andrews	044c3b2bb8	Add missing closing ')' to update-policy documentation The opening '(' before local was not being matched by a closing ')' after the closing '};'.	2022-11-04 10:37:47 +00:00
Ondřej Surý	96e7bf76e7	Don't release the tree read lock in dereference_iter_node() Previously, the tree read lock could be upgraded to a write lock in decrement_reference() and then downgraded back to read lock in dereference_iter_node(). When the use of isc_rwlock_downgrade() was removed, the downgrade was changed to a simple unlock+lock. This allows some delete operations to sneak in and delete nodes that the iterator expects to be in place. Expand decrement_reference() so the caller can indicate whether the tree read lock should be upgraded, and disallow the upgrade when calling from dereference_iter_node(), so there will be no need to release the lock afterward.	2022-11-03 14:07:44 +00:00
Ondřej Surý	80e66fbd2d	Don't use dns_zone_attach() in zone_refreshkeys() The zone_refreshkeys() could run before the zone_shutdown(), but after the last .erefs has been "detached" causing assertion failure when doing dns_zone_attach(). Remove the use of .erefs (dns_zone_attach/detach) and replace it with using the .irefs and additional checks whether the zone is exiting in the callbacks.	2022-11-03 14:29:32 +01:00
Matthijs Mekking	332b98ae49	Don't allow DNSSEC records in the raw zone There was an exception for dnssec-policy that allowed DNSSEC in the unsigned version of the zone. This however causes a crash if the zone switches from dynamic to inline-signing in the case of NSEC3, because we are now trying to add an NSEC3 record to a non-NSEC3 node. This is because BIND expects none of the records in the unsigned version of the zone to be NSEC3. Remove the exception for dnssec-policy when copying non DNSSEC records, but do allow for DNSKEY as this may be a published DNSKEY from a different provider.	2022-11-03 10:20:05 +01:00
Ondřej Surý	c429b52533	Don't cleanup the dead nodes when pruning the tree The dead nodes might get reactivated while the db iterator walks the version of the tree, so we can't cleanup the dead nodes while the db version is open. Restore the previous behaviour that cleaned up the dead nodes when we are closing the version.	2022-11-03 09:06:08 +01:00
Ondřej Surý	be204bf4c7	Cleanup the dead nodes when pruning the tree While sending the node to prune_tree(), we can also cleanup dead nodes because we already hold the tree and node bucket write locks.	2022-11-02 13:06:52 +01:00
Ondřej Surý	0492bbf590	Make the pthread_rwlock implementation header-only macros [2/2] While using mutrace, the pthread_rwlock based isc_rwlock implementation would be all tracked in the rwlock.c unit, losing all useful information as all rwlocks would be traced in a single place. Rewrite the pthread_rwlock based implementation to be header-only macros, so we can use mutrace to properly track the rwlock contention without heavily patching mutrace to understand the libisc synchronization primitives.	2022-11-02 10:34:10 +01:00
Ondřej Surý	6bd201ccec	Remove one level of indirection from isc_rwlock [1/2] Instead of checking the PTHREAD_RUNTIME_CHECK from the header, move it to the pthread_rwlock implementation functions. The internal isc_rwlock actually cannot fail, so the checks in the header were useless anyway.	2022-11-02 10:27:09 +01:00
Ondřej Surý	98b7a93772	Remove isc_rwlock_downgrade() from isc_rwlock The isc_rwlock_downgrade() is not used anywhere, so we can remove it and make the pthread_rwlock implementation simpler.	2022-11-02 09:05:37 +01:00
Ondřej Surý	e5f7fe1f65	Add strong rwlock consistency checks to dns_rbtdb The dns_rbtdb unit already tracks the state of the node and tree rwlocks during the top level function and passes the states of the locks to the called functions. Add the tree locking family of macros modeled after node locking macros, and expand both to track the state of the lock in an external variable. Additionally, in developer mode, add precondition to the macros, so the lock is in required state - this should cause an assertion failure on double locking instead of the thread getting stuck.	2022-11-02 08:45:48 +01:00
Ondřej Surý	006a7f0cb6	Remove isc_rwlock_downgrade usage in rbtdb.c The only place where isc_rwlock_downgrade was being used was decrement_reference(), where the code relocks the node rwlock for writing and then tries to upgrade the tree lock. When returning from the function it tries to restore the locks to their previous state, which is nice, but kind of moot, because at every use of decrement_reference() the node lock is immediately or almost immediately unlocked, and the same holds for the tree lock. Instead of trying to restore the node and tree locks to the initial state, decrement_reference() now returns the state of the locks, so the caller can then use the right unlock operation (read or write). Only when the tree lock was originally unlocked does decrement_reference() unlock the tree lock before returning to the caller.	2022-11-02 08:45:48 +01:00
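
Commit f46ce447a6 above describes isc_hashmap as an open-addressing table using Robin Hood hashing. The following is a minimal sketch of that general technique only, not the isc_hashmap code: every name (rh_map_t, rh_add, rh_find, RH_SIZE) is hypothetical, keys are plain integers stored inline, and the table is fixed-size with no resize or delete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RH_SIZE 16 /* power of two; fixed size, this sketch never resizes */

typedef struct {
	uint32_t key;
	int	 value;
	bool	 used;
} rh_node_t;

typedef struct {
	rh_node_t nodes[RH_SIZE]; /* nodes stored inline, no per-item allocation */
	size_t	  count;
} rh_map_t;

/* Illustrative multiplicative hash; any reasonable hash would do. */
static uint32_t
rh_hash(uint32_t key) {
	return key * 2654435761u;
}

/* Probe sequence length: how far 'pos' is from the key's home slot. */
static size_t
rh_psl(uint32_t key, size_t pos) {
	size_t home = rh_hash(key) & (RH_SIZE - 1);
	return (pos + RH_SIZE - home) & (RH_SIZE - 1);
}

/* Insert or update; returns false when the table is full. */
static bool
rh_add(rh_map_t *map, uint32_t key, int value) {
	assert(map != NULL);
	if (map->count == RH_SIZE) {
		return false; /* sketch: a full table rejects every add */
	}
	rh_node_t cur = { .key = key, .value = value, .used = true };
	size_t	  pos = rh_hash(key) & (RH_SIZE - 1);
	size_t	  dist = 0;
	for (;;) {
		rh_node_t *slot = &map->nodes[pos];
		if (!slot->used) {
			*slot = cur;
			map->count++;
			return true;
		}
		if (slot->key == cur.key) {
			slot->value = cur.value;
			return true;
		}
		size_t slot_dist = rh_psl(slot->key, pos);
		if (slot_dist < dist) {
			/* "Rob the rich": the resident is closer to its home
			 * than we are, so take its slot and keep probing
			 * with the displaced node. */
			rh_node_t tmp = *slot;
			*slot = cur;
			cur = tmp;
			dist = slot_dist;
		}
		pos = (pos + 1) & (RH_SIZE - 1);
		dist++;
	}
}

/* Lookup; stops early once a resident is closer to home than we are. */
static rh_node_t *
rh_find(rh_map_t *map, uint32_t key) {
	size_t pos = rh_hash(key) & (RH_SIZE - 1);
	for (size_t dist = 0; dist < RH_SIZE; dist++) {
		rh_node_t *slot = &map->nodes[pos];
		if (!slot->used || rh_psl(slot->key, pos) < dist) {
			return NULL;
		}
		if (slot->key == key) {
			return slot;
		}
		pos = (pos + 1) & (RH_SIZE - 1);
	}
	return NULL;
}
```

Because colliding items are kept sorted by probe distance, a lookup can give up as soon as it meets a resident that sits closer to its home slot than the probe has travelled, which is what keeps misses cheap without any linked lists.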

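Commit 50f357cb36 above notes that, with a time-ordered LRU list, purging can stop at the first entry that does not meet the time-based eviction criteria. A minimal sketch of that idea follows, assuming a singly linked list ordered from oldest to newest; the types and functions (lru_name_t, lru_list_t, lru_append, lru_purge) are hypothetical and are not taken from the dns_adb code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct lru_name {
	struct lru_name *newer;	    /* next, more recently used, name */
	unsigned int	 last_used; /* timestamp in seconds */
} lru_name_t;

typedef struct {
	lru_name_t *oldest; /* tail of the LRU: least recently used */
	lru_name_t *newest; /* head of the LRU: most recently used */
} lru_list_t;

/* Append a name as the most recently used one. Callers must pass
 * non-decreasing 'now' values, which keeps the list time ordered. */
static void
lru_append(lru_list_t *lru, lru_name_t *name, unsigned int now) {
	assert(lru != NULL && name != NULL);
	name->last_used = now;
	name->newer = NULL;
	if (lru->newest != NULL) {
		lru->newest->newer = name;
	} else {
		lru->oldest = name;
	}
	lru->newest = name;
}

/* Unlink every name older than 'max_age' and return how many were
 * purged. The walk stops at the first name that is still fresh:
 * everything after it on the list was used even more recently. */
static size_t
lru_purge(lru_list_t *lru, unsigned int now, unsigned int max_age) {
	size_t purged = 0;
	while (lru->oldest != NULL) {
		lru_name_t *name = lru->oldest;
		if (now - name->last_used <= max_age) {
			break; /* first fresh entry: stop early */
		}
		lru->oldest = name->newer;
		if (lru->oldest == NULL) {
			lru->newest = NULL;
		}
		name->newer = NULL;
		purged++;
	}
	return purged;
}
```

With a 1800-second (30-minute) limit, a call such as `lru_purge(&lru, now, 1800)` touches only the expired tail of the list plus one fresh entry, regardless of how many names the list holds.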
14274 Commits