Commit Graph

411 Commits

Author SHA1 Message Date
Diego Fronza
4827ad0ec4 Add stale-refresh-time option
Before this update, BIND would attempt to do a full recursive resolution
process for each query received if the requested rrset had its ttl
expired. If the resolution fails for any reason, only then BIND would
check for stale rrset in cache (if 'stale-cache-enable' and
'stale-answer-enable' is on).

The problem with this approach is that if an authoritative server is
unreachable or is failing to respond, it is very unlikely that the
problem will be fixed in the next seconds.

A better approach to improve performance in those cases, is to mark the
moment in which a resolution failed, and if new queries arrive for that
same rrset, try to respond directly from the stale cache, and do that
for a window of time configured via 'stale-refresh-time'.

Only when this interval expires we then try to do a normal refresh of
the rrset.

The logic behind this commit is as following:

- In query.c / query_gotanswer(), if the test of 'result' variable falls
  to the default case, an error is assumed to have happened, and a call
  to 'query_usestale()' is made to check if serving of stale rrset is
  enabled in configuration.

- If serving of stale answers is enabled, a flag will be turned on in
  the query context to look for stale records:
  query.c:6839
  qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;

- A call to query_lookup() will be made again, inside it a call to
  'dns_db_findext()' is made, which in turn will invoke rbdb.c /
  cache_find().

- In rbtdb.c / cache_find() the important bits of this change is the
  call to 'check_stale_header()', which is a function that yields true
  if we should skip the stale entry, or false if we should consider it.

- In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
  is set, if that is the case we know that this new search for stale
  records was made due to a failure in a normal resolution, so we keep
  track of the time in which the failured occured in rbtdb.c:4559:
  header->last_refresh_fail_ts = search->now;

- In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
  know this is a normal lookup, if the record is stale and the query
  time is between last failure time + stale-refresh-time window, then
  we return false so cache_find() knows it can consider this stale
  rrset entry to return as a response.

The last additions are two new methods to the database interface:
- setservestale_refresh
- getservestale_refresh

Those were added so rbtdb can be aware of the value set in configuration
option, since in that level we have no access to the view object.
2020-11-11 12:53:23 -03:00
Matthijs Mekking
b7856d2675 Cleanup duplicate definitions in query.h 2020-11-10 14:42:47 +00:00
Witold Kręcicki
2cfc8a45a4 Shutdown interface if we can't listen on it to avoid shutdown hang 2020-11-10 14:17:09 +01:00
Witold Kręcicki
f68fe9ff14 Fix a startup/shutdown crash in ns_clientmgr_create 2020-11-10 14:17:05 +01:00
Witold Kręcicki
38b78f59a0 Add DoT support to bind
Parse the configuration of tls objects into SSL_CTX* objects.  Listen on
DoT if 'tls' option is setup in listen-on directive.  Use DoT/DoH ports
for DoT/DoH.
2020-11-10 14:16:55 +01:00
Mark Andrews
b09727a765 Implement DNSTAP support in ns_client_sendraw()
ns_client_sendraw() is currently only used to relay UPDATE
responses back to the client.  dns_dt_send() is called with
this assumption.
2020-11-10 06:15:46 +00:00
Ondřej Surý
127ba7e930 Add libssl libraries to Windows build
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
2020-11-09 16:00:28 +01:00
Evan Hunt
49d53a4aa9 use netmgr for xfrin
Use isc_nm_tcpdnsconnect() in xfrin.c for zone transfers.
2020-11-09 13:45:43 +01:00
Ondřej Surý
37b9511ce1 Use libuv's shared library handling capabilities
While libltdl is a feature-rich library, BIND 9 code only uses its basic
capabilities, which are also provided by libuv and which BIND 9 already
uses for other purposes.  As libuv's cross-platform shared library
handling interface is modeled after the POSIX dlopen() interface,
converting code using the latter to the former is simple.  Replace
libltdl function calls with their libuv counterparts, refactoring the
code as necessary.  Remove all use of libltdl from the BIND 9 source
tree.
2020-10-28 15:48:58 +01:00
Ondřej Surý
e2436159ab Refactor the cleanup code in lt_dl code
The cleanup code that would clean the object after plugin/dlz/dyndb
loading has failed was duplicating the destructor for the object, so
instead of the extra code, we just use the destructor instead.
2020-10-28 15:48:58 +01:00
Ondřej Surý
0f49b02fc5 Remove redundant lt_dlerror() calls
The redundant lt_dlerror() calls were taken from the examples to clean
any previous errors from lt_dl...() calls.  However upon code
inspection, it was discovered there are no such paths that could cause
the lt_dlerror() to return spurious error messages.
2020-10-28 15:48:58 +01:00
Ondřej Surý
f7c82e406e Fix the isc_nm_closedown() to actually close the pending connections
1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking
   whether the socket was still alive and scheduling reads/sends on
   closed socket.

2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
   changed to always return the error conditions via the callbacks, so
   they always succeed.  This applies to all protocols (UDP, TCP and
   TCPDNS).
2020-10-22 11:37:16 -07:00
Michał Kępień
9014ff0cc6 Update library API versions 2020-10-22 08:54:32 +02:00
Diego Fronza
d727eaae6c Always return address records in additional section for NS queries 2020-10-21 12:03:42 -03:00
Ondřej Surý
bb990030d3 Simplify the EDNS buffer size logic for DNS Flag Day 2020
The DNS Flag Day 2020 aims to remove the IP fragmentation problem from
the UDP DNS communication.  In this commit, we implement the required
changes and simplify the logic for picking the EDNS Buffer Size.

1. The defaults for `edns-udp-size`, `max-udp-size` and
   `nocookie-udp-size` have been changed to `1232` (the value picked by
   DNS Flag Day 2020).

2. The probing heuristics that would try 512->4096->1432->1232 buffer
   sizes has been removed and the resolver will always use just the
   `edns-udp-size` value.

3. Instead of just disabling the PMTUD mechanism on the UDP sockets, we
   now set IP_DONTFRAG (IPV6_DONTFRAG) flag.  That means that the UDP
   packets won't get ever fragmented.  If the ICMP packets are lost the
   UDP will just timeout and eventually be retried over TCP.
2020-10-05 16:21:21 +02:00
Ondřej Surý
33eefe9f85 The dns_message_create() cannot fail, change the return to void
The dns_message_create() function cannot soft fail (as all memory
allocations either succeed or cause abort), so we change the function to
return void and cleanup the calls.
2020-09-29 08:22:08 +02:00
Diego Fronza
12d6d13100 Refactored dns_message_t for using attach/detach semantics
This commit will be used as a base for the next code updates in order
to have a better control of dns_message_t objects' lifetime.
2020-09-29 08:22:08 +02:00
Michał Kępień
b60d7345ed Fix function overrides in unit tests on macOS
Since Mac OS X 10.1, Mach-O object files are by default built with a
so-called two-level namespace which prevents symbol lookups in BIND unit
tests that attempt to override the implementations of certain library
functions from working as intended.  This feature can be disabled by
passing the "-flat_namespace" flag to the linker.  Fix unit tests
affected by this issue on macOS by adding "-flat_namespace" to LDFLAGS
used for building all object files on that operating system (it is not
enough to only set that flag for the unit test executables).
2020-09-28 09:09:21 +02:00
Michał Kępień
8bdba2edeb Drop function wrapping as it is redundant for now
As currently used in the BIND source tree, the --wrap linker option is
redundant because:

  - static builds are no longer supported,

  - there is no need to wrap around existing functions - what is
    actually required (at least for now) is to replace them altogether
    in unit tests,

  - only functions exposed by shared libraries linked into unit test
    binaries are currently being replaced.

Given the above, providing the alternative implementations of functions
to be overridden in lib/ns/tests/nstest.c is a much simpler alternative
to using the --wrap linker option.  Drop the code detecting support for
the latter from configure.ac, simplify the relevant Makefile.am, and
remove lib/ns/tests/wrap.c, updating lib/ns/tests/nstest.c accordingly
(it is harmless for unit tests which are not calling the overridden
functions).
2020-09-28 09:09:21 +02:00
Evan Hunt
86eddebc83 Purge memory pool upon plugin destruction
The typical sequence of events for AAAA queries which trigger recursion
for an A RRset at the same name is as follows:

 1. Original query context is created.
 2. An AAAA RRset is found in cache.
 3. Client-specific data is allocated from the filter-aaaa memory pool.
 4. Recursion is triggered for an A RRset.
 5. Original query context is torn down.

 6. Recursion for an A RRset completes.
 7. A second query context is created.
 8. Client-specific data is retrieved from the filter-aaaa memory pool.
 9. The response to be sent is processed according to configuration.
10. The response is sent.
11. Client-specific data is returned to the filter-aaaa memory pool.
12. The second query context is torn down.

However, steps 6-12 are not executed if recursion for an A RRset is
canceled.  Thus, if named is in the process of recursing for A RRsets
when a shutdown is requested, the filter-aaaa memory pool will have
outstanding allocations which will never get released.  This in turn
leads to a crash since every memory pool must not have any outstanding
allocations by the time isc_mempool_destroy() is called.

Fix by creating a stub query context whenever fetch_callback() is called,
including cancellation events. When the qctx is destroyed, it will ensure
the client is detached and the plugin memory is freed.
2020-09-25 13:32:34 -07:00
Mark Andrews
f0d9bf7c30 Clone the saved / query message buffers
The message buffer passed to ns__client_request is only valid for
the life of the the ns__client_request call.  Save a copy of it
when we recurse or process a update as ns__client_request will
return before those operations complete.
2020-09-23 10:37:42 +10:00
Ondřej Surý
d4976e0ebe Add separate prefetch nmhandle to ns_client_t
As the query_prefetch() or query_rpzfetch() could be called during
"regular" fetch, we need to introduce separate storage for attaching
the nmhandle during prefetching the records.  The query_prefetch()
and query_rpzfetch() are guarded for re-entrance by .query.prefetch
member of ns_client_t, so we can reuse the same .prefetchhandle for
both.
2020-09-22 09:56:26 +02:00
Evan Hunt
dcee985b7f update all copyright headers to eliminate the typo 2020-09-14 16:20:40 -07:00
Mark Andrews
baf165ffd0 clear pointer before subtracting reference 2020-09-14 11:02:33 +10:00
Evan Hunt
57b4dde974 change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach
Attaching and detaching handle pointers will make it easier to
determine where and why reference counting errors have occurred.

A handle needs to be referenced more than once when multiple
asynchronous operations are in flight, so callers must now maintain
multiple handle pointers for each pending operation. For example,
ns_client objects now contain:

        - reqhandle:    held while waiting for a request callback (query,
                        notify, update)
        - sendhandle:   held while waiting for a send callback
        - fetchhandle:  held while waiting for a recursive fetch to
                        complete
        - updatehandle: held while waiting for an update-forwarding
                        task to complete

control channel connection objects now contain:

        - readhandle: held while waiting for a read callback
        - sendhandle: held while waiting for a send callback
        - cmdhandle:  held while an rndc command is running

httpd connections contain:

        - readhandle: held while waiting for a read callback
        - sendhandle: held while waiting for a send callback
2020-09-11 12:17:57 -07:00
Michał Kępień
b096a038e3 Update library API versions 2020-08-06 09:10:06 +02:00
Mark Andrews
20488d6ad3 Map DNS_R_BADTSIG to FORMERR
Now that the log message has been printed set the result code to
DNS_R_FORMERR.  We don't do this via dns_result_torcode() as we
don't want upstream errors to produce FORMERR if that processing
end with DNS_R_BADTSIG.
2020-08-04 12:20:37 +00:00
Michał Kępień
97a2733ef9 Update library API versions 2020-07-15 22:54:13 +02:00
Diego Fronza
aab691d512 Fix ns_statscounter_recursclients underflow
The basic scenario for the problem was that in the process of
resolving a query, if any rrset was eligible for prefetching, then it
would trigger a call to query_prefetch(), this call would run in
parallel to the normal query processing.

The problem arises due to the fact that both query_prefetch(), and,
in the original thread, a call to ns_query_recurse(), try to attach
to the recursionquota, but recursing client stats counter is only
incremented if ns_query_recurse() attachs to it first.

Conversely, if fetch_callback() is called before prefetch_done(),
it would not only detach from recursionquota, but also decrement
the stats counter, if query_prefetch() attached to te quota first
that would result in a decrement not matched by an increment, as
expected.

To solve this issue an atomic bool was added, it is set once in
ns_query_recurse(), allowing fetch_callback() to check for it
and decrement stats accordingly.

For a more compreensive explanation check the thread comment below:
https://gitlab.isc.org/isc-projects/bind9/-/issues/1719#note_145857
2020-07-13 11:46:18 -03:00
Evan Hunt
23c7373d68 restore "blackhole" functionality
the blackhole ACL was accidentally disabled with respect to client
queries during the netmgr conversion.

in order to make this work for TCP, it was necessary to add a return
code to the accept callback functions passed to isc_nm_listentcp() and
isc_nm_listentcpdns().
2020-06-30 17:29:09 -07:00
Mark Andrews
abe2c84b1d Suppress cppcheck warnings:
cppcheck-suppress objectIndex
cppcheck-suppress nullPointerRedundantCheck
2020-06-25 12:04:36 +10:00
Evan Hunt
75c985c07f change the signature of recv callbacks to include a result code
this will allow recv event handlers to distinguish between cases
in which the region is NULL because of error, shutdown, or cancelation.
2020-06-19 12:33:26 -07:00
Evan Hunt
9e740cad21 make isc_nmsocket_{attach,detach}{} functions private
there is no need for a caller to reference-count socket objects.
they need tto be able tto close listener sockets (i.e., those
returned by isc_nm_listen{udp,tcp,tcpdns}), and an isc_nmsocket_close()
function has been added for that. other sockets are only accessed via
handles.
2020-06-19 09:39:50 +02:00
Michał Kępień
a8bc003d1b Update library API versions 2020-06-18 10:03:05 +02:00
Mark Andrews
924e141a15 Adjust NS_CLIENT_TCP_BUFFER_SIZE and cleanup client_allocsendbuf
NS_CLIENT_TCP_BUFFER_SIZE was 2 byte too large following the
move to netmgr add associated changes to lib/ns/client.c and
as a result an INSIST could be trigger if the DNS message being
constructed had a checkpoint stage that fell in those two extra
bytes.  Adjusted NS_CLIENT_TCP_BUFFER_SIZE and cleaned up
client_allocsendbuf now that the previously reserved 2 bytes
are no longer used.
2020-06-18 09:59:19 +02:00
Mark Andrews
e5b2eca1d3 The dsset returned by dns_keynode_dsset needs to be thread safe.
- clone keynode->dsset rather than return a pointer so that thread
  use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
  while in use.
- create a new keynode when removing DS records so that dangling
  pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
  when DS records are added.
2020-06-11 16:02:09 +10:00
Michal Nowak
5bbc6dd7f1 Fix "make dist"
Make various adjustments necessary to enable "make dist" to build a BIND
source tarball whose contents are complete enough to build binaries, run
unit & system tests, and generate documentation on Unix systems.

Known outstanding issues:

  - "make distcheck" does not work yet.
  - Tests do not work for out-of-tree source-tarball-based builds.
  - Source tarballs are not complete enough for building on Windows.

All of the above will be addressed in due course.
2020-06-05 13:19:49 +02:00
Mark Andrews
ae55fbbe9c Ignore attempts to add DS records at zone apex
DS records belong in the parent zone at a zone cut and
are not retrievable with modern recursive servers.
2020-06-04 16:00:33 +02:00
Michal Nowak
eddece7841 Associate unit test data dir with a more specific variable
Having 'TESTS', the Automake variable and 'TESTS' the unit test data dir
seems confusing, lets rename the latter to to 'TESTS_DIR'.
2020-06-04 12:56:57 +02:00
Ondřej Surý
4c23724c97 Move the dependencies from sln to vcxproj files 2020-05-28 08:08:30 +02:00
Ondřej Surý
bfd87e453d Restore the GSSAPI compilation on Windows (but we should really switch to SSPI/Kerberos) 2020-05-28 08:07:57 +02:00
Evan Hunt
249184e03e add a quick-and-dirty method of debugging a single query
when built with "configure --enable-singletrace", named will produce
detailed query logging at the highest debug level for any query with
query ID zero.

this enables monitoring of the progress of a single query by specifying
the QID using "dig +qid=0". the "client" logging category should be set
to a low severity level to suppress logging of other queries. (the
chance of another query using QID=0 at the same time is only 1 in 2^16.)

"--enable-singletrace" turns on "--enable-querytrace" as well, so if the
logging severity is not lowered, all other queries will be logged
verbosely as well. compiling with either of these options will impair
query performance; they should only be turned on when testing or
troubleshooting.
2020-05-26 00:47:18 -07:00
Evan Hunt
57e54c46e4 change "expr == false" to "!expr" in conditionals 2020-05-25 16:09:57 -07:00
Evan Hunt
68a1c9d679 change 'expr == true' to 'expr' in conditionals 2020-05-25 16:09:57 -07:00
Michal Nowak
bfa6ecb796 Provide unit test driver
This adds a unit test driver for BIND with Automake.  It runs the unit
test program provided as its sole command line argument and then looks
for a core dump generated by that test program.  If one is found, the
driver prints the backtrace into the test log.
2020-05-21 12:13:01 +02:00
Mark Andrews
c7cdc47cc5 move provide-ixfr testing after the serial has been checked 2020-05-14 16:37:34 +10:00
Mark Andrews
919a9ece25 enforce record count maximums 2020-05-13 15:35:28 +10:00
Mark Andrews
79de6edde8 allow grant rules to be retrieved 2020-05-13 15:35:28 +10:00
Diego Fronza
f2bf7beeb6 Added new logging category rpz-passthru
It is now possible to use the new logging category "rpz-passthru"
to redirect RPZ passthru activity to a dedicate logging channel.
2020-05-07 11:44:48 -03:00
Ondřej Surý
c86ebeebd2 As libltdl is convenience library, link it just into libisc 2020-04-30 15:33:44 +02:00