5031 Commits

Author SHA1 Message Date
alessio
e1e10adc3a Switch symtab to use fxhash hashing
This merge request resolves some performance regressions introduced
with the change from isc_symtab_t to isc_hashmap_t.

The key improvements are:

1. Using a faster hash function than both isc_hashmap_t and
   isc_symtab_t. The previous implementation used SipHash, but the
   hashflood resistance properties of SipHash are unneeded for config
   parsing.
2. Shrinking the initial size of the isc_hashmap_t used inside
   isc_symtab_t. Symtab is mainly used for config parsing, and the
   when used that way it will have between 1 and ~50 keys, but the
   previous implementation initialized a map with 128 slots.
   By initializing a smaller map, we speed up mallocs and optimize for
   the typical case of few config keys.
3. Slight optimization of the string matching in the hashmap, so that
   the tail is handled in a single load + comparison, instead of byte
   by byte.
   Of the three improvements, this is the least important.
2025-03-20 11:26:09 +01:00
Ondřej Surý
1fae6ccea1 Add the call function tracking to isc_mem API
As we already track __func__, __FILE__, __LINE__ triplet in most places,
add the function tracking to the isc_mem tracking API.
2025-03-05 11:17:17 +01:00
Ondřej Surý
eab9fc22e7 Replace attach/detach in isc_mem with refcount implementation
The isc_mem API is one of the most commonly used APIs that didn't
used ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros.  Replace the
implementation of isc_mem_attach(), isc_mem_detach() and
isc_mem_destroy() with the respective macros.

This also removes the legacy isc_mem_destroy() functionality that would
check whether all references had been detached from the memory context
as it doesn't work reliably when using the call_rcu() API.  Instead of
doing this individually, call isc_mem_checkdestroyed(stderr) from the
isc_mem_destroy() macro to keep the extra check that all contexts were
freed when the program is exiting.
2025-03-05 11:17:17 +01:00
Ondřej Surý
552cf64a70 Replace isc_mem_destroy() with isc_mem_detach()
Remove legacy isc_mem_destroy() and just use isc_mem_detach() as
isc_mem_destroy() doesn't play well with call_rcu API.
2025-03-05 11:17:17 +01:00
Mark Andrews
988dc57c8c Call isc__iterated_hash_initialize
The iterated hash implementation needs to be initialised
on the worker thread.  Also clean it up after we are done.
2025-03-04 12:54:39 +00:00
Artem Boldariev
eaad0aefe6 DoH: Bump the active streams processing limit
This commit bumps the total number of active streams (= the opened
streams for which a request is received, but response is not ready) to
60% of the total streams limit.

The previous limit turned out to be too tight as revealed by
longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.
2025-03-03 11:32:29 +02:00
Artem Boldariev
217a1ebd79 DoH: remove obsolete INSIST() check
The check, while not active by default, is not valid since the commit
8b8f4d500d.

See 'if (total == 0) { ...' below branch to understand why.
2025-03-03 11:32:11 +02:00
Artem Boldariev
c5f7968856 DoH: Flush HTTP write buffer on an outgoing DNS message
Previously, the code would try to avoid sending any data regardless of
what it is unless:

a) The flush limit is reached;
b) There are no sends in flight.

This strategy is used to avoid too numerous send requests with little
amount of data. However, it has been proven to be too aggressive and,
in fact, harms performance in some cases (e.g., on longer (≥1h) runs
of "stress:long:rpz:doh+udp:linux:*").

Now, additionally to the listed cases, we also:

c) Flush the buffer and perform a send operation when there is an
outgoing DNS message passed to the code (which is indicated by the
presence of a send callback).

That helps improve performance for "stress:long:rpz:doh+udp:linux:*"
tests.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0e1b02868a DoH: Limit the number of delayed IO processing requests
Previously, a function for continuing IO processing on the next UV
tick was introduced (http_do_bio_async()). The intention behind this
function was to ensure that http_do_bio() is eventually called at
least once in the future. However, the current implementation allows
queueing multiple such delayed requests needlessly. There is currently
no need for these excessive requests as http_do_bio() can requeue them
if needed. At the same time, each such request can lead to a memory
allocation, particularly in BIND 9.18.

This commit ensures that the number of enqueued delayed IO processing
requests never exceeds one in order to avoid potentially bombarding IO
threads with the delayed requests needlessly.
2025-03-03 11:32:11 +02:00
Artem Boldariev
0956fb9b9e DoH: Simplify http_do_bio()
This commit significantly simplifies the code flow in the
http_do_bio() function, which is responsible for processing incoming
and outgoing HTTP/2 data. It seems that the way it was structured
before was indirectly caused by the presence of the missing callback
calls bug, fixed in 8b8f4d500d.

The change introduced by this commit is known to remove a bottleneck
and allows reproducible and measurable performance improvement for
long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests.

Additionally, it fixes a similar issue with potentially missing send
callback calls processing and hardens the code against use-after-free
errors related to the session object (they can potentially occur).
2025-03-03 11:32:11 +02:00
Ondřej Surý
ce7879c924 Remove STATIC_ASSERT variants in favor of the C11 variant
Previously, a gcc < 4.6 shim for _Static_assert() was included.  Such an
old compiler is not supported now anyway, so the macro variant has been
removed in favor of a single definition using _Static_assert().
2025-03-01 07:33:53 +01:00
Ondřej Surý
534069e048 Move locking macros into individual headers
Previously, the LOCK()/UNLOCK() and friends macros were defined in the
isc/util.h header.  Those macros were moved to their respective headers
as those would have to be included anyway if that particular lock was in
use.
2025-03-01 07:33:51 +01:00
Ondřej Surý
901637c25c Remove superflous header includes from isc/util.h header
Formerly, isc/util.h would pull a few extra headers (isc/list.h,
isc/attributes.h, isc/result.h and errno.h).  These includes were
removed in favor of including them directly when used.
2025-03-01 07:33:40 +01:00
Ondřej Surý
c5075a9a61 Remove convenience list macros from isc/util.h
The short convenience list macros were used very sparingly and
inconsistenly in the code base.  As the consistency is prefered over
the convenience, all shortened list macro were removed in favor of
their ISC_LIST API targets.
2025-03-01 07:33:40 +01:00
Ondřej Surý
2aa70fff76 Remove unused isc_mutexblock and isc_condition units
The isc_mutexblock and isc_condition units were no longer in use and
were removed.
2025-03-01 07:33:09 +01:00
Aydın Mercan
f4ab4f07e3 unify fips handling to isc_crypto and make the toggle one way
Since algorithm fetching is handled purely in libisc, FIPS mode toggling
can be purely done in within the library instead of provider fetching in
the binary for OpenSSL >=3.0.

Disabling FIPS mode isn't a realistic requirement and isn't done
anywhere in the codebase. Make the FIPS mode toggle enable-only to
reflect the situation.
2025-02-27 17:37:43 +03:00
Ondřej Surý
4917ffa61b Explicitly create and shutdown the call_rcu_thread
As the default_call_rcu_thread can't be forced to flush all the work
during the executable shutdown, create one call_rcu_thread explicitly
and assign it to the all created threads.

This allows this explicit call_rcu_thread to be unassociated from the
main thread and freed before the executable destructor exits.
2025-02-22 16:19:01 +01:00
Ondřej Surý
f5c204ac3e Move the library init and shutdown to executables
Instead of relying on unreliable order of execution of the library
constructors and destructors, move them to individual binaries.  The
advantage is that the execution time and order will remain constant and
will not depend on the dynamic load dependency solver.

This requires more work, but that was mitigated by a simple requirement,
any executable using libisc and libdns, must include <isc/lib.h> and
<dns/lib.h> respectively (in this particular order).  In turn, these two
headers must not be included from within any library as they contain
inlined functions marked with constructor/destructor attributes.
2025-02-22 16:19:00 +01:00
Ondřej Surý
b9e3cd5d2a Add isc_timer_running() function to check status of timer
In the next commit, we need to know whether the timer has been started
or stopped.  Add isc_timer_running() function that returns true if the
timer has been started.
2025-02-21 22:05:43 +01:00
Ondřej Surý
77ec2a6c22 Cleanup the isc_counter unit
The isc_counter_create() doesn't need the return value (it was always
ISC_R_SUCCESS), use the macros to implement the reference counting,
little style cleanup, and expand the unit test.
2025-02-21 09:51:42 +00:00
Aram Sargsyan
c6529891bb Fix isc_quota bug
Running jobs which were entered into the isc_quota queue is the
responsibility of the isc_quota_release() function, which, when
releasing a previously acquired quota, checks whether the queue
is empty, and if it's not, it runs a job from the queue without touching
the 'quota->used' counter. This mechanism is susceptible to a possible
hangup of a newly queued job in case when between the time a decision
has been made to queue it (because used >= max) and the time it was
actually queued, the last quota was released. Since there is no more
quotas to be released (unless arriving in the future), the newly
entered job will be stuck in the queue.

Fix the wrong memory ordering for 'quota->used', as the relaxed
ordering doesn't ensure that data modifications made by one thread
are visible in other threads.

Add checks in both isc_quota_release() and isc_quota_acquire_cb()
to make sure that the described hangup does not happen. Also see
code comments.
2025-02-20 10:56:00 +00:00
Artem Boldariev
2adabe835a DoH: http_send_outgoing() return value is not used
The value returned by http_send_outgoing() is not used anywhere, so we
make it not return anything (void). Probably it is an omission from
older times.
2025-02-19 17:52:36 +02:00
Artem Boldariev
8b8f4d500d DoH: Fix missing send callback calls
When handling outgoing data, there were a couple of rarely executed
code paths that would not take into account that the callback MUST be
called.

It could lead to potential memory leaks and consequent shutdown hangs.
2025-02-19 17:52:36 +02:00
Artem Boldariev
a22bc2d7d4 DoH: change how the active streams number is calculated
This commit changes the way how the number of active HTTP streams is
calculated and allows it to scale with the values of the maximum
amount of streams per connection, instead of effectively capping at
STREAM_CLIENTS_PER_CONN.

The original limit, which is intended to define the pipelining limit
for TCP/DoT. However, it appeared to be too restrictive for DoH, as it
works quite differently and implements pipelining at protocol level by
the means of multiplexing multiple streams. That renders each stream
to be effectively a separate connection from the point of view of the
rest of the codebase.
2025-02-19 17:52:36 +02:00
Artem Boldariev
05e8a50818 DoH: Track the amount of in flight outgoing data
Previously we would limit the amount of incoming data to process based
solely on the presence of not completed send requests. That worked,
however, it was found to severely degrade performance in certain
cases, as was revealed during extended testing.

Now we switch to keeping track of how much data is in flight (or ready
to be in flight) and limit the amount of processed incoming data when
the amount of in flight data surpasses the given threshold, similarly
to like we do in other transports.
2025-02-19 17:52:36 +02:00
alessio
53991ecc14 Refactor and simplify isc_symtab
This commit does several changes to isc_symtab:

1. Rewrite the isc_symtab to internally use isc_hashmap instead of
   hand-stiched hashtable.

2. Create a new isc_symtab_define_and_return() api, which returns
   the already defined symvalue on ISC_R_EXISTS; this allows users
   of the API to skip the isc_symtab_lookup()+isc_symtab_define()
   calls and directly call isc_symtab_define_and_return().

3. Merge isccc_symtab into isc_symtab - the only missing function
   was isccc_symtab_foreach() that was merged into isc_symtab API.

4. Add full set of unit tests for the isc_symtab API.
2025-02-17 11:43:19 +01:00
Ondřej Surý
c602d76c1f Reduce false sharing in dns_qpcache
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpcache_bucket_t struct that is cacheline aligned
and have a single array of those.

Additionaly, make both the head and the tail of isc_queue_t padded, not
just the head, to prevent false sharing of the lock-free structure with
the lock that follows it.
2025-02-04 21:37:46 +01:00
Andoni Duarte Pintado
3a64b288c1 Merge tag 'v9.21.4' 2025-01-29 17:17:18 +01:00
Evan Hunt
314741fcd0 deduplicate result codes
ISCCC_R_SYNTAX, ISCCC_R_EXPIRED, and ISCCC_R_CLOCKSKEW have the
same usage and text formats as DNS_R_SYNTAX, DNS_R_EXPIRED and
DNS_R_CLOCKSCREW respectively. this was originally done because
result codes were defined in separate libraries, and some tool
might be linked with libisccc but not libdns. as the result codes
are now defined in only one place, there's no need to retain the
duplicates.
2025-01-23 15:54:57 -08:00
Evan Hunt
a19f6c6654 clean up result codes that are never used
the following result codes are obsolete and have been removed
from result.h and result.c:

        - ISC_R_NOTHREADS
        - ISC_R_BOUND
        - ISC_R_NOTBOUND
        - ISC_R_NOTDIRECTORY
        - ISC_R_EMPTY
        - ISC_R_NOTBLOCKING
        - ISC_R_INPROGRESS
        - ISC_R_WOULDBLOCK

        - DNS_R_TOOMANYHOPS
        - DNS_R_NOREDATA
        - DNS_R_BADCKSUM
        - DNS_R_MOREDATA
        - DNS_R_NOVALIDDS
        - DNS_R_UNKNOWNOPT
        - DNS_R_NOVALIDKEY
        - DNS_R_NTACOVERED

        - DST_R_COMPUTESECRETFAILURE
        - DST_R_NORANDOMNESS
        - DST_R_NOCRYPTO
2025-01-23 15:54:57 -08:00
Evan Hunt
10accd6260 clean up uses of ISC_R_NOMEMORY
the isc_mem allocation functions can no longer fail; as a result,
ISC_R_NOMEMORY is now rarely used: only when an external library
such as libjson-c or libfstrm could return NULL. (even in
these cases, arguably we should assert rather than returning
ISC_R_NOMEMORY.)

code and comments that mentioned ISC_R_NOMEMORY have been
cleaned up, and the following functions have been changed to
type void, since (in most cases) the only value they could
return was ISC_R_SUCCESS:

- dns_dns64_create()
- dns_dyndb_create()
- dns_ipkeylist_resize()
- dns_kasp_create()
- dns_kasp_key_create()
- dns_keystore_create()
- dns_order_create()
- dns_order_add()
- dns_peerlist_new()
- dns_tkeyctx_create()
- dns_view_create()
- dns_zone_setorigin()
- dns_zone_setfile()
- dns_zone_setstream()
- dns_zone_getdbtype()
- dns_zone_setjournal()
- dns_zone_setkeydirectory()
- isc_lex_openstream()
- isc_portset_create()
- isc_symtab_create()

(the exception is dns_view_create(), which could have returned
other error codes in the event of a crypto library failure when
calling isc_file_sanitize(), but that should be a RUNTIME_CHECK
anyway.)
2025-01-23 15:54:57 -08:00
Artem Boldariev
937b5f8349 DoH: reduce excessive bad request logging
We started using isc_nm_bad_request() more actively throughout
codebase. In the case of HTTP/2 it can lead to a large count of
useless "Bad Request" messages in the BIND log, as often we attempt to
send such request over effectively finished HTTP/2 sessions.

This commit fixes that.
2025-01-15 14:09:17 +00:00
Artem Boldariev
4ae4e255cf Do not stop timer in isc_nm_read_stop() in manual timer mode
A call to isc_nm_read_stop() would always stop reading timer even in
manual timer control mode which was added with StreamDNS in mind. That
looks like an omission that happened due to how timers are controlled
in StreamDNS where we always stop the timer before pausing reading
anyway (see streamdns_on_complete_dnsmessage()). That would not work
well for HTTP, though, where we might want pause reading without
stopping the timer in the case we want to split incoming data into
multiple chunks to be processed independently.

I suppose that it happened due to NM refactoring in the middle of
StreamDNS development (at the time isc_nm_cancelread() and
isc_nm_pauseread() were removed), as the StreamDNS code seems to be
written as if timers are not stoping during a call to
isc_nm_read_stop().
2025-01-15 14:09:17 +00:00
Artem Boldariev
609a41517b DoH: introduce manual read timer control
This commit introduces manual read timer control as used by StreamDNS
and its underlying transports. Before that, DoH code would rely on the
timer control provided by TCP, which would reset the timer any time
some data arrived. Now, the timer is restarted only when a full DNS
message is processed in line with other DNS transports.

That change is required because we should not stop the timer when
reading from the network is paused due to throttling. We need a way to
drop timed-out clients, particularly those who refuse to read the data
we send.
2025-01-15 14:09:17 +00:00
Artem Boldariev
3425e4b1d0 DoH: floodding clients detection
This commit adds logic to make code better protected against clients
that send valid HTTP/2 data that is useless from a DNS server
perspective.

Firstly, it adds logic that protects against clients who send too
little useful (=DNS) data. We achieve that by adding a check that
eventually detects such clients with a nonfavorable useful to
processed data ratio after the initial grace period. The grace period
is limited to processing 128 KiB of data, which should be enough for
sending the largest possible DNS message in a GET request and then
some. This is the main safety belt that would detect even flooding
clients that initially behave well in order to fool the checks server.

Secondly, in addition to the above, we introduce additional checks to
detect outright misbehaving clients earlier:

The code will treat clients that open too many streams (50) without
sending any data for processing as flooding ones; The clients that
managed to send 1.5 KiB of data without opening a single stream or
submitting at least some DNS data will be treated as flooding ones.
Of course, the behaviour described above is nothing else but
heuristical checks, so they can never be perfect. At the same time,
they should be reasonable enough not to drop any valid clients,
realatively easy to implement, and have negligible computational
overhead.
2025-01-15 14:09:17 +00:00
Artem Boldariev
9846f395ad DoH: process data chunk by chunk instead of all at once
Initially, our DNS-over-HTTP(S) implementation would try to process as
much incoming data from the network as possible. However, that might
be undesirable as we might create too many streams (each effectively
backed by a ns_client_t object). That is too forgiving as it might
overwhelm the server and trash its memory allocator, causing high CPU
and memory usage.

Instead of doing that, we resort to processing incoming data using a
chunk-by-chunk processing strategy. That is, we split data into small
chunks (currently 256 bytes) and process each of them
asynchronously. However, we can process more than one chunk at
once (up to 4 currently), given that the number of HTTP/2 streams has
not increased while processing a chunk.

That alone is not enough, though. In addition to the above, we should
limit the number of active streams: these streams for which we have
received a request and started processing it (the ones for which a
read callback was called), as it is perfectly fine to have more opened
streams than active ones. In the case we have reached or surpassed the
limit of active streams, we stop reading AND processing the data from
the remote peer. The number of active streams is effectively decreased
only when responses associated with the active streams are sent to the
remote peer.

Overall, this strategy is very similar to the one used for other
stream-based DNS transports like TCP and TLS.
2025-01-15 14:09:17 +00:00
Michał Kępień
d6f9785ac6 Enable extraction of exact local socket addresses
Extracting the exact address that each wildcard/TCP socket is bound to
locally requires issuing the getsockname() system call, which libuv
exposes via its uv_*_getsockname() functions.  This is only required for
detailed logging and comes at a noticeable performance cost, so it
should not happen by default.  However, it is useful for debugging
certain problems (e.g. cryptic system test failures), so a convenient
way of enabling that behavior should exist.

Update isc_nmhandle_localaddr() so that it calls uv_*_getsockname() when
the ISC_SOCKET_DETAILS preprocessor macro is set at compile time.
Ensure proper handling of sockets that wrap other sockets.

Set the new ISC_SOCKET_DETAILS macro by default when --enable-developer
is passed to ./configure.  This enables detailed logging in the system
tests run in GitLab CI without affecting performance in non-development
BIND 9 builds.

Note that setting the ISC_SOCKET_DETAILS preprocessor macro at compile
time enables all callers of isc_nmhandle_localaddr() to extract the
exact address of a given local socket, which results e.g. in dnstap
captures containing more accurate information.

Mention the new preprocessor macro in the section of the ARM that
discusses why exact socket addresses may not be logged by default.
2024-12-29 12:32:05 +01:00
Artem Boldariev
6691a1530d TLS SNI - add low level support for SNI to the networking code
This commit adds support for setting SNI hostnames in outgoing
connections over TLS.

Most of the changes are related to either adapting the code to accept
and extra argument in *connect() functions and a couple of changes to
the TLS Stream to actually make use of the new SNI hostname
information.
2024-12-26 17:23:12 +02:00
Ondřej Surý
7b26becec0 Detect and possibly define constexpr using Autoconf
Previously, we had an ISC_CONSTEXPR macro that was expanded to either
`constexpr` or `static const`, depending on compiler support.  To make
the code cleaner, move `constexpr` support detection to Autoconf; if
`constexpr` support is missing from the compiler, define `constexpr` as
`static const` in config.h.
2024-12-25 15:21:26 +01:00
Ondřej Surý
06f9163d51 Remove C++ support from the public header
Since BIND 9 headers are not longer public, there's no reason to keep
the ISC_LANG_BEGINDECL and ISC_LANG_ENDDECL macros to support including
them from C++ projects.
2024-12-18 13:10:39 +01:00
Evan Hunt
95a0b6f479 clean up log module names
- remove obsolete DNS_LOGMODULE_RBT and DNS_LOGMODULE_RBTDB
- correct the misuse of the wrong log modules in dns/rpz.c and
  dns/catz.c, and add DNS_LOGMODULE_RPZ and DNS_LOGMODULE_CATZ
  to support them.
2024-12-11 17:11:32 +00:00
Pavel Březina
67e21d94d4 mark loop as shuttingdown earlier in shutdown_cb
`shutdown_trigger_close_cb` is not called in the main loop since
queued events in the `loop->async_trigger`, including loop teardown
(shutdown_server) are processed first, before the `uv_close` callback
is executed..

In order to pass the information to the queued events, it is necessary
to set the flag earlier in the process and not wait for the `uv_close`
callback to trigger.
2024-12-10 19:18:49 +00:00
Ondřej Surý
2089996f96 Replace remaining usage of DNS_R_MUSTBESECURE with DNS_R_NOVALIDSIG
The DNS_R_MUSTBESECURE lost its meaning with removal of
dnssec-must-be-secure option, so replace the few remaining (and a bit
confusing) use of this result code with DNS_R_NOVALIDSIG.
2024-12-09 13:10:21 +01:00
Ondřej Surý
d14a76e115 Update picohttpparser.{c,h} with upstream repository
Upstream code doesn't do regular releases, so we need to regularly
sync the code from the upstream repository.  This is synchronization up
to the commit f8d0513 from Jan 29, 2024.
2024-12-08 11:14:37 +00:00
Matthijs Mekking
aa24b77d8b Fix nsupdate hang when processing a large update
The root cause is the fix for CVE-2024-0760 (part 3), which resets
the TCP connection on a failed send. Specifically commit
4b7c61381f stops reading on the socket
because the TCP connection is throttling.

When the tcpdns_send_cb callback thinks about restarting reading
on the socket, this fails because the socket is a client socket.
And nsupdate is a client and is using the same netmgr code.

This commit removes the requirement that the socket must be a server
socket, allowing reading on the socket again after being throttled.
2024-12-05 15:40:48 +01:00
Matthijs Mekking
16b3bd1cc7 Implement global limit for outgoing queries
This global limit is not reset on query restarts and is a hard limit
for any client request.
2024-12-05 14:17:07 +01:00
Matthijs Mekking
ca7d487357 Implement getter function for counter limit 2024-12-05 14:17:07 +01:00
Artem Boldariev
300f05110d Extended TCP accept()/close() logging
This commit adds extra log messages issued when accepting or closing a
TCP connection (provided that debugging logging level >=99 is
enabled).
2024-11-27 21:14:08 +02:00
Ondřej Surý
c18bb5f1f2 Remove unused definition of ISC_CMSG_IP_TOS
The #define was used before, but we forgot to clean it up when we
removed support for dscp.
2024-11-27 15:03:27 +01:00
Ondřej Surý
95a7419c2a Remove the incomplete code for IPv6 pktinfo
The code that listens on individual interfaces is now stable and doesn't
require any changes.  The code that would bind to IPv6 wildcard address
and then use IPv6 pktinfo structure to get the source address is not
going to be completed, so it's better to just remove the dead cruft.
2024-11-27 15:03:27 +01:00