For consistency with similar functions, rename `pcache` to `cachep`,
call a separate destroy function when references reach 0, and add
a missing call to isc_refcount_destroy().
Using the TLS context cache for server-side contexts could reduce the
number of contexts to initialise in the configurations when e.g. the
same 'tls' entry is used in multiple 'listen-on' statements for the
same DNS transport, binding to multiple IP addresses.
In such a case, only one TLS context will be created, instead of a
context per IP address, which could reduce the initialisation time, as
initialising even a non-ephemeral TLS context introduces some delay,
which can be *visually* noticeable by log activity.
Also, this change lays down a foundation for Mutual TLS (when the
server validates a client certificate, additionally to a client
validating the server), as the TLS context cache can be extended to
store additional data required for validation (like intermediates CA
chain).
Additionally to the above, the change ensures that the contexts are
not being changed after initialisation, as such a practice is frowned
upon. Previously we would set the supported ALPN tags within
isc_nm_listenhttp() and isc_nm_listentlsdns(). We do not do that for
client-side contexts, so that appears to be an overlook. Now we set
the supported ALPN tags right after server-side contexts creation,
similarly how we do for client-side ones.
This commit adds a TLS context object cache implementation. The
intention of having this object is manyfold:
- In the case of client-side contexts: allow reusing the previously
created contexts to employ the context-specific TLS session resumption
cache. That will enable XoT connection to be reestablished faster and
with fewer resources by not going through the full TLS handshake
procedure.
- In the case of server-side contexts: reduce the number of contexts
created on startup. That could reduce startup time in a case when
there are many "listen-on" statements referring to a smaller amount of
`tls` statements, especially when "ephemeral" certificates are
involved.
- The long-term goal is to provide in-memory storage for additional
data associated with the certificates, like runtime
representation (X509_STORE) of intermediate CA-certificates bundle for
Strict TLS/Mutual TLS ("ca-file").
Commit 9ee60e7a17 erroneously introduced
duplicate conditions to several existing conditional statements
responsible for determining error codes passed to connection callbacks
upon failure. Fix the affected expressions to ensure connection
callbacks are invoked with:
- the ISC_R_SHUTTINGDOWN error code when a global netmgr shutdown is
in progress,
- the ISC_R_CANCELED error code when a specific operation has been
canceled.
This does not fix any known bugs, it only adjusts the changes introduced
by commit 9ee60e7a17 so that they match
its original intent.
The SSL_CTX_set_keylog_callback() function is a fairly recent OpenSSL
addition, having first appeared in version 1.1.1. Add a configure.ac
check for the availability of that function to prevent build errors on
older platforms. Sort similar checks alphabetically.
This makes the SSLKEYLOGFILE mechanism a silent no-op on unsupported
platforms, which is considered acceptable for a debugging feature.
Generate log messages containing TLS pre-master secrets when the
SSLKEYLOGFILE environment variable is set. This only ensures such
messages are prepared using the right logging category and passed to
libisc for further processing.
The TLS pre-master secret logging callback needs to be set on a
per-context basis, so ensure it happens for both client-side and
server-side TLS contexts.
TLS pre-master secrets will be dumped to disk using the logging
framework provided by libisc. Add a new logging category for this type
of debugging data in order to enable exporting it to a dedicated
channel. Derive the name of the new category from the name of the
relevant environment variable, SSLKEYLOGFILE.
ECDSA P-256 performs considerably better than the previously used
4096-bit RSA (can be observed using `openssl speed`), and, according
to RFC 6605, provides a security level comparable to 3072-bit RSA.
Previously, whole isc_mempool_get() and isc_mempool_set() would be
replaced by simpler version when run with address sanitizer.
Change the code to limit the fillcount to 1 and freemax to 0. This
change will make isc_mempool_get() to always allocate and use a single
new item and isc_mempool_put() will always return the item to the
allocator.
OpenSSL 3.0.1 does not accept 0 as a digest buffer length when
calling EVP_DigestSignFinal as it now checks that the digest buffer
length is large enough for the digest. Pass the digest buffer
length instead.
Surprising error IO error is returned when directory name
is given instead of named.conf file. It can be passed to named-checkconf
or include statement. Make a simple change to return Invalid file
instead. Still not precise, but much better error message is returned.
Fix of rhbz#490837.
Mutex debugging code (used when the ISC_MUTEX_DEBUG preprocessor macro
is set to 1 and PTHREAD_MUTEX_ERRORCHECK is defined) has been broken for
the past 3 years (since commit 2f3eee5a4f)
and nobody complained, which is a strong indication that this code is
not being used these days any more. External tools for detecting
locking issues are already wired into various GitLab CI checks. Drop
all code depending on the ISC_MUTEX_DEBUG preprocessor macro being set.
Mutex profiling code (used when the ISC_MUTEX_PROFILE preprocessor macro
is set to 1) has been broken for the past 3 years (since commit
0bed9bfc28) and nobody complained, which
is a strong indication that this code is not being used these days any
more. External tools for both measuring performance and detecting
locking issues are already wired into various GitLab CI checks. Drop
all code depending on the ISC_MUTEX_PROFILE preprocessor macro being
set.
On FreeBSD, the pthread primitives are not solely allocated on stack,
but part of the object lives on the heap. Missing pthread_*_destroy
causes the heap memory to grow and in case of fast lived object it's
possible to run out-of-memory.
Properly destroy the leaking mutex (worker->lock) and
the leaking condition (sock->cond).
Previously, we set the number of the hazard pointers to be 4 times the
number of workers because the dispatch ran on the old socket code.
Since the old socket code was removed there's a smaller number of
threads, namely:
- 1 main thread
- 1 timer thread
- <n> netmgr threads
- <n> threadpool threads
Set the number of hazard pointers to 2 + 2 * workers.
Previously, the isc_hp_init() could not lower the value of
isc__hp_max_threads, but because of a mistake the isc__hp_max_threads
would be set to HP_MAX_THREADS (e.g. 128 threads) thus it would be
always set to 128. This would result in increased memory usage even
when small number of workers were in use.
Change the default value of isc__hp_max_threads to be 1.
Additionally, enforce the max_hps value in isc_hp_new() to be smaller or
equal to HP_MAX_HPS. The only user is isc_queue which uses just 1
hazard pointer, so it's only theoretical issue.
Previously, when TCP accept failed, we have logged a message with
ISC_LOG_ERROR level. One common case, how this could happen is that the
client hits TCP client quota and is put on hold and when resumed, the
client has already given up and closed the TCP connection. In such
case, the named would log:
TCP connection failed: socket is not connected
This message was quite confusing because it actually doesn't say that
it's related to the accepting the TCP connection and also it logs
everything on the ISC_LOG_ERROR level.
Change the log message to "Accepting TCP connection failed" and for
specific error states lower the severity of the log message to
ISC_LOG_INFO.
There was a logical bug when setting a list of enabled TLS protocols,
which may lead to a crash (an abort()) on systems with ancient OpenSSL
versions.
The problem was due to the fact that we were INSIST()ing on supporting
all of the TLS versions, while checking only for mentioned in the
configuration was implied.
This commit adds an isc_nm_socket_type() function which can be used to
obtain a handle's socket type.
This change obsoletes isc_nm_is_tlsdns_handle() and
isc_nm_is_http_handle(). However, it was decided to keep the latter as
we eventually might end up supporting multiple HTTP versions.
This commit makes the TLS stream code to not issue mostly useless
debug log message on error during TLS I/O. This message was cluttering
logs a lot, as it can be generated on (almost) any non-clean TLS
connection termination, even in the cases when the actual query
completed successfully. Nor does it provide much value for end-users,
yet it can occasionally be seen when using dig and quite often when
running BIND over a publicly available network interface.
This commit removes unneeded isc__nmsocket_prep_destroy() call on ALPN
negotiation failure, which was eventually causing the TLS handle to
leak.
This call is not needed, as not attaching to the transport (TLS)
handle should be enough. At this point it seems like a kludge from
earlier days of the TLS code.
This prevents a direct leak in OPENSSL_init_crypto (called from
OPENSSL_init_ssl).
Add shim version of OPENSSL_cleanup because it is missing in LibreSSL on
OpenBSD.
This commit fixes a peculiar corner case in the client-side DoT code
because of which a crash could occur during a zone transfer. A junk
DNS message should be sent at the end of a zone transfer via TLS to
trigger the crash (abort).
This commit, hopefully, fixes that.
Also, this commit adds similar changes to the TCP DNS code, as it
shares the same origin and most of the logic.
Change 5756 (GL #2854) introduced build errors when using
'configure --disable-doh'. To fix this, isc_nm_is_http_handle() is
now defined in all builds, not just builds that have DoH enabled.
Missing code comments were added both for that function and for
isc_nm_is_tlsdns_handle().
This commit adds an isc_nm_set_min_answer_ttl() function which is
intended to to be used to give a hint to the underlying transport
regarding the answer TTL.
The interface is intentionally kept generic because over time more
transports might benefit from this functionality, but currently it is
intended for DoH to set "max-age" value within "Cache-Control" HTTP
header (as recommended in the RFC8484, section 5.1 "Cache
Interaction").
It is no-op for other DNS transports for the time being.
Check to see whether there are outstanding requests in the
httpd receive buffer after sending the response, and if so,
process them.
Test that pipelined requests are handled by sending multiple
minimal HTTP/1.1 using netcat (nc) and checking that we get
back the same number of responses.
Remember the amount of space consumed by the HTTP headers, then
move any trailing data to the start of the httpd->recvbuf once
we have finished processing the request.
if an incoming HTTP request is incomplete, but nothing else is clearly
wrong with it, the stats channel continues reading to see if there's
more coming. the buffer length was not being processed correctly in
this case. also, the server state was not reset correctly when the
request was complete, so that subsequent requests could be appended to
the first buffer instead of being treated as new.
in addition fixing the above problems, this commit also increases the
size of the httpd request buffer from 1024 to 4096, because some
browsers send a lot of headers.
Unless being configured with the `no-deprecated` option, OpenSSL 3.0.0
still has the deprecated APIs present and will throw warnings during
compilation, when using them.
Make sure that the old APIs are being used only with the older versions
of OpenSSL.
The EVP_MD_CTX_new() and EVP_MD_CTX_free() functions are renamed APIs
which were previously available as EVP_MD_CTX_create() and
EVP_MD_CTX_destroy() respectively, which means that we can use them
instead of providing our own shim functions.
OpenSSL 3.0.0 deprecates the EVP_MD_CTX_md() function.
Use EVP_MD_CTX_md() instead of EVP_MD_CTX_get0_md() and create a shim
to use the old variant for the older OpenSSL versions which don't have
the newer EVP_MD_CTX_get0_md().
The isc_time_add() and isc_time_subtract() didn't have a unit test, add
the unit test with couple of edge case vectors to check whether overflow
and underflow is correctly handled.
Use the __builtin_uadd_overflow() and __builtin_usub_overflow() for
overflow checks in isc_time_add() and isc_time_subtract(). This
generates more efficient and safe code.
The isc_time_add() could overflow when t.seconds + i.seconds == UINT_MAX
and t.nanoseconds + i.nanoseconds >= NS_PER_S.
Fix the overflow in isc_time_add(), and simplify the ISC_R_RANGE checks
both in isc_time_add() and isc_time_subtract() functions.
it is possible for udp_recv_cb() to fire after the socket
is already shutting down and statichandle is NULL; we need to
create a temporary handle in this case.
route/netlink sockets don't have stats counters associated with them,
so it's now necessary to check whether socket stats exist before
incrementing or decrementing them. rather than relying on the caller
for this, we now just pass the socket and an index, and the correct
stats counter will be updated if it exists.
isc_nm_routeconnect() opens a route/netlink socket, then calls a
connect callback, much like isc_nm_udpconnect(), with a handle that
can then be monitored for network changes.
Internally the socket is treated as a UDP socket, since route/netlink
sockets follow the datagram contract.
After support for route/netlink sockets is merged, not all sockets
will have stats counters associated with them, so it's now necessary
to check whether socket stats exist before incrementing or decrementing
them. rather than relying on the caller for this, we now just pass the
socket and an index, and the correct stats counter will be updated if
it exists.
The __builtin_expect() can be used to provide the compiler with branch
prediction information. The Gcc manual says[1] on the subject:
In general, you should prefer to use actual profile feedback for
this (-fprofile-arcs), as programmers are notoriously bad at
predicting how their programs actually perform.
Stop using __builtin_expect() and ISC_LIKELY() and ISC_UNLIKELY() macros
to provide the branch prediction information as the performance testing
shows that named performs better when the __builtin_expect() is not
being used.
1. https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fexpect
due to comparing logfile suffixes as 32 bit rather than 64 bit
integers, logfiles with timestamp suffixes that should have been
removed when rolling could be left in place. this has been fixed.
Unify the header guard style and replace the inconsistent include guards
with #pragma once.
The #pragma once is widely and very well supported in all compilers that
BIND 9 supports, and #pragma once was already in use in several new or
refactored headers.
Using simpler method will also allow us to automate header guard checks
as this is simpler to programatically check.
For reference, here are the reasons for the change taken from
Wikipedia[1]:
> In the C and C++ programming languages, #pragma once is a non-standard
> but widely supported preprocessor directive designed to cause the
> current source file to be included only once in a single compilation.
>
> Thus, #pragma once serves the same purpose as include guards, but with
> several advantages, including: less code, avoidance of name clashes,
> and sometimes improvement in compilation speed. On the other hand,
> #pragma once is not necessarily available in all compilers and its
> implementation is tricky and might not always be reliable.
1. https://en.wikipedia.org/wiki/Pragma_once