the test server running in shutdown/resolver was not logging
any debug info, which made it difficult to diagnose test failures.
(cherry picked from commit cc7ceace7d)
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.
A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:
- reqhandle: held while waiting for a request callback (query,
notify, update)
- sendhandle: held while waiting for a send callback
- fetchhandle: held while waiting for a recursive fetch to
complete
- updatehandle: held while waiting for an update-forwarding
task to complete
(cherry picked from commit 57b4dde974)
- rename isc_nmsocket_t->tcphandle to statichandle
- cancelread functions now take handles instead of sockets
- add a 'client' flag in socket objects, currently unused, to
indicate whether it is to be used as a client or server socket
(cherry picked from commit 7eb4564895)
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.
This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
(cherry picked from commit 38264b6a4d)
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.
(cherry picked from commit a0f7d28967)
by having these functions act on netmgr handles instead of socket
objects, they can be used in callback functions outside the netgmr.
(cherry picked from commit 55896df79d)
"showzone" and "tsig-list" both used exclusive mode unnecessarily;
changing this will simplify future refactoring a bit.
(cherry picked from commit 002c328437)
We erroneously tried to destroy a socket after issuing
isc__nm_tcp{,dns}_close. Under some (race) circumstances we could get
nm_socket_cleanup to be called twice for the same socket, causing an
access to a dead memory.
(cherry picked from commit 233f134a4f)
There's a possibility of race in isc__nm_tcpconnect if the asynchronous
connect operation finishes with all the callbacks before we exit the
isc__nm_tcpconnect itself we might access an already freed memory.
Fix it by creating an additional reference to the socket freed at the
end of isc__nm_tcpconnect.
(cherry picked from commit 896db0f419)
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.
in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().
(cherry picked from commit 23c7373d68)
isc__nm_tcpdns_send() was not asynchronous and accessed socket
internal fields in an unsafe manner, which could lead to a race
condition and subsequent crash. Fix it by moving tcpdns processing
to a proper netmgr thread.
(cherry picked from commit 591b79b597)
We need to mark the socket as inactive early (and synchronously)
in the stoplistening process; otherwise we might destroy the
callback argument before we actually stop listening, and call
the callback on bad memory.
(cherry picked from commit 1cf65cd882)
this prevents a crash when some non-netmgr thread, such as a
recursive lookup, times out after the TCP socket is already
disconnected.
(cherry picked from commit 3704c4fff2)
this will allow recv event handlers to distinguish between cases
in which the region is NULL because of error, shutdown, or cancelation.
(cherry picked from commit 75c985c07f)
The isc_nm_cancelread() function cancels reading on a connected
socket and calls its read callback function with a 'result'
parameter of ISC_R_CANCELED.
(cherry picked from commit 5191ec8f86)
when isc_nm_destroy() is called, there's a loop that waits for
other references to be detached, pausing and unpausing the netmgr
to ensure that all the workers' events are run, followed by a
1-second sleep. this caused a delay on shutdown which will be
noticeable when netmgr is used in tools other than named itself,
so the delay has now been reduced to a hundredth of a second.
(cherry picked from commit 870204fe47)
the isc_nm_tcpconnect() function establishes a client connection via
TCP. once the connection is esablished, a callback function will be
called with a newly created network manager handle.
(cherry picked from commit abbb79f9d1)
A TCPDNS socket creates a handle for each complete DNS message.
Previously, when all the handles were disconnected, the socket
would be closed, but the wrapped TCP socket might still have
more to read.
Now, when a connection is established, the TCPDNS socket creates
a reference to itself by attaching itself to sock->self. This
reference isn't cleared until the connection is closed via
EOF, timeout, or server shutdown. This allows the socket to remain
open even when there are no active handles for it.
(cherry picked from commit cd79b49538)
- isc__nmhandle_get() now attaches to the sock in the nmhandle object.
the caller is responsible for dereferencing the original socket
pointer when necessary.
- tcpdns listener sockets attach sock->outer to the outer tcp listener
socket. tcpdns connected sockets attach sock->outerhandle to the handle
for the tcp connected socket.
- only listener sockets need to be attached/detached directly. connected
sockets should only be accessed and reference-counted via their
associated handles.
(cherry picked from commit 5ea26ee1f1)
there is no need for a caller to reference-count socket objects.
they need tto be able tto close listener sockets (i.e., those
returned by isc_nm_listen{udp,tcp,tcpdns}), and an isc_nmsocket_close()
function has been added for that. other sockets are only accessed via
handles.
(cherry picked from commit 9e740cad21)
The following reverted changes will be picked again as part of the
netmgr sync with main branch.
Revert "Merge branch '1996-confidential-issue-v9_16' into 'security-v9_16'"
This reverts commit e160b1509f, reversing
changes made to c01e643715.
Revert "Merge branch '2038-use-freebind-when-bind-fails-v9_16' into 'v9_16'"
This reverts commit 5f8ecfb918, reversing
changes made to 23021385d5.
Revert "Merge branch '1936-blackhole-fix-v9_16' into 'v9_16'"
This reverts commit f20bc90a72, reversing
changes made to 490016ebf1.
Revert "Merge branch '1938-fix-udp-race' into 'v9_16'"
This reverts commit 0a6c7ab2a9, reversing
changes made to 4ea84740e6.
Revert "Merge branch '1947-fix-tcpdns-race' into 'v9_16'"
This reverts commit 4ea84740e6, reversing
changes made to d761cd576b.
The handling of . (dot) characted at the beginning of the line has
changed between the sphinx-doc versions, and it was constantly giving us
trouble when generating man pages when using different sphinx-doc. This
commit just changes the source rst file, so there's no more . (dot) the
beginning of the line.
(cherry picked from commit a00ca65ae6)
The dns_message_create() function cannot soft fail (as all memory
allocations either succeed or cause abort), so we change the function to
return void and cleanup the calls.
(cherry picked from commit 33eefe9f85)
dns_message_t objects are now being handled using reference counting
semantics, so now dns_message_destroy() is not called directly anymore,
dns_message_detach must be called instead.
(cherry picked from commit 7deaf9a93c)
This commit fix the problems that arose when moving the dns_message_t
object from fetchctx_t to the query structure.
Since the lifetime of query objects are different than that of a
fetchctx and the dns_message_t object held by the query may be being
used by some external module, e.g. validator, even after the query
may have been destroyed, propery handling of the references to the
message were added in this commit to avoid accessing an already
destroyed object.
Specifically, in rctx_done(), a reference to the message is attached
at the beginning of the function and detached at the end, since a
possible call to fctx_cancelquery() would release the dns_message_t
object, and in the next lines of code a call to rctx_nextserver()
or rctx_chaseds() would require a valid pointer to the same object.
In valcreate() a new reference is attached to the message object,
this ensures that if the corresponding query object is destroyed
before the validator attempts to access it, no invalid pointer
access occurs.
In validated() we have to attach a new reference to the message,
since we destroy the validator object at the beginning of the
function, and we need access to the message in the next lines of
the same function.
rctx_nextserver() and rctx_chaseds() functions were adapted to
receive a new parameter of dns_message_t* type, this was so they
could receive a valid reference to a dns_message_t since using the
response context respctx_t to access the message through
rctx->query->rmessage could lead to an already released reference
due to the query being canceled.
(cherry picked from commit cde6227a68)
The assertion failure REQUIRE(msg->state == DNS_SECTION_ANY), caused
by calling dns_message_setclass within function resquery_response()
in resolver.c, was happening due to wrong management of dns message_t
objects used to process responses to the queries issued by the
resolver.
Before the fix, a resolver's fetch context (fetchctx_t) would hold
a pointer to the message, this same reference would then be used
over all the attempts to resolve the query, trying next server,
etc... for this to work the message object would have it's state
reset between each iteration, marking it as ready for a new processing.
The problem arose in a scenario with many different forwarders
configured, managing the state of the dns_message_t object was
lacking better synchronization, which have led it to a invalid
dns_message_t state in resquery_response().
Instead of adding unnecessarily complex code to synchronize the
object, the dns_message_t object was moved from fetchctx_t structure
to the query structure, where it better belongs to, since each query
will produce a response, this way whenever a new query is created
an associated dns_messate_t is also created.
This commit deals mainly with moving the dns_message_t object from
fetchctx_t to the query structure.
(cherry picked from commit 02f9e125c1)
This commit will be used as a base for the next code updates in
order to have a better control of dns_message_t objects' lifetime.
(cherry picked from commit 12d6d13100)