Commit Graph

691 Commits

Author SHA1 Message Date
Witold Kręcicki
7c960e89ea In resume_qmin check if the fetch context is already shutting down - if so, try to destroy it, don't continue 2019-03-29 14:30:40 +01:00
Mark Andrews
d8d04edfba assert hevent->rdataset is non NULL 2019-03-14 12:47:53 -07:00
Witold Kręcicki
56183a3917 Fix a race in fctx_cancelquery.
When sending an udp query (resquery_send) we first issue an asynchronous
isc_socket_connect and increment query->connects, then isc_socket_sendto2
and increment query->sends.
If we happen to cancel this query (fctx_cancelquery) we need to cancel
all operations we might have issued on this socket. If we are under very high
load the callback from isc_socket_connect (resquery_udpconnected) might have
not yet been fired. In this case we only cancel the CONNECT event on socket,
and ignore the SEND that's waiting there (as there is an `else if`).
Then we call dns_dispatch_removeresponse which kills the dispatcher socket
and calls isc_socket_close - but if system is under very high load, the send
we issued earlier might still not be complete - which triggers an assertion
because we're trying to close a socket that's still in use.

The fix is to always check if we have incomplete sends on the socket and cancel
them if we do.
2019-03-12 18:42:35 +01:00
Ondřej Surý
78d0cb0a7d Use coccinelle to remove explicit '#include <config.h>' from the source files 2019-03-08 15:15:05 +01:00
Witold Kręcicki
b49310ac06 If possible don't use forwarders when priming the resolver.
If we try to fetch a record from cache and need to look into
hints database we assume that the resolver is not primed and
start dns_resolver_prime(). Priming query is supposed to return
NSes for "." in ANSWER section and glue records for them in
ADDITIONAL section, so that we can fill that info in 'regular'
cache and not use hints db anymore.
However, if we're using a forwarder the priming query goes through
it, and if it's configured to return minimal answers we won't get
the addresses of root servers in ADDITIONAL section. Since the
only records for root servers we have are in hints database we'll
try to prime the resolver with every single query.

This patch adds a DNS_FETCHOPT_NOFORWARD flag which avoids using
forwarders if possible (that is if we have forward-first policy).
Using this flag on priming fetch fixes the problem as we get the
proper glue. With forward-only policy the problem is non-existent,
as we'll never ask for root server addresses because we'll never
have a need to query them.

Also added a test to confirm priming queries are not forwarded.
2019-01-16 17:41:13 -05:00
Witold Kręcicki
cfa2804e5a When a forwarder fails and we're not in a forward-only mode we
go back to regular resolution. When this happens the fetch timer is
already running, and we might end up in a situation where we we create
a fetch for qname-minimized query and after that the timer is triggered
and the query is retried (fctx_try) - which causes relaunching of
qname-minimization fetch - and since we already have a qmin fetch
for this fctx - assertion failure.

This fix stops the timer when doing qname minimization - qmin fetch
internal timer should take care of all the possible timeouts.
2019-01-16 11:09:30 -08:00
Mark Andrews
dadb924be7 adjust timeout to allow for ECN negotiation failures 2019-01-15 17:10:41 -08:00
Michał Kępień
33350626f9 Track forwarder timeouts in fetch contexts
Since following a delegation resets most fetch context state, address
marks (FCTX_ADDRINFO_MARK) set inside lib/dns/resolver.c are not
preserved when a delegation is followed.  This is fine for full
recursive resolution but when named is configured with "forward first;"
and one of the specified forwarders times out, triggering a fallback to
full recursive resolution, that forwarder should no longer be consulted
at each delegation point subsequently reached within a given fetch
context.

Add a new badnstype_t enum value, badns_forwarder, and use it to mark a
forwarder as bad when it times out in a "forward first;" configuration.
Since the bad server list is not cleaned when a fetch context follows a
delegation, this prevents a forwarder from being queried again after
falling back to full recursive resolution.  Yet, as each fetch context
maintains its own list of bad servers, this change does not cause a
forwarder timeout to prevent that forwarder from being used by other
fetch contexts.
2019-01-08 08:29:54 +01:00
Witold Kręcicki
d5793ecca2 - isc_task_create_bound - create a task bound to specific task queue
If we know that we'll have a task pool doing specific thing it's better
  to use this knowledge and bind tasks to task queues, this behaves better
  than randomly choosing the task queue.

- use bound resolver tasks - we have a pool of tasks doing resolutions,
  we can spread the load evenly using isc_task_create_bound

- quantum set universally to 25
2018-11-23 04:34:02 -05:00
Witold Kręcicki
929ea7c2c4 - Make isc_mutex_destroy return void
- Make isc_mutexblock_init/destroy return void
- Minor cleanups
2018-11-22 11:52:08 +00:00
Ondřej Surý
2f3eee5a4f isc_mutex_init returns 'void' 2018-11-22 11:51:49 +00:00
Ondřej Surý
6b65a4f86e Fix typo ISC_SHA256_DIGESTLENGHT -> ISC_SHA256_DIGESTLENGTH 2018-11-21 23:34:44 +01:00
Mark Andrews
93776c4c81 errors initalizing badcaches were not caught or cleaned up on error paths 2018-11-14 15:26:27 -05:00
Witold Kręcicki
9c8fead6d8 qname minimization: issue a warning only if the server is really broken 2018-11-14 19:55:10 +00:00
Ondřej Surý
e9a939841d Add min-cache-ttl and min-ncache-ttl keywords
Sometimes it is useful to set a 'floor' on the TTL for records
to be cached.  Some sites like to use ridiculously low TTLs for
some reason, and that often is not compatible with slow links.

Signed-off-by: Michael Milligan <milli@acmeps.com>
Signed-off-by: LaMont Jones <lamont@debian.org>
2018-11-14 18:24:53 +01:00
Ondřej Surý
23fff6c569 Hint the compiler with ISC_UNREACHABLE(); that code after INSIST(0); cannot be reached 2018-11-08 12:22:17 +07:00
Ondřej Surý
b2b43fd235 Turn (int & flag) into (int & flag) != 0 when implicitly typed to bool 2018-11-08 12:21:53 +07:00
Mark Andrews
99b25eb379 two dns_name_dup calls were not checked 2018-11-05 14:46:08 -08:00
Witold Kręcicki
2d0a33208c Cleanup fctx->finds before sending 'final' query after qname minimization.
At the beginning of qname minimization we get fctx->finds filled with what's
in the cache at this point, in worst case root servers. After doing full
run querying for NSes at different levels we need to clean it and refill
it with proper values from cache.
2018-11-05 09:57:11 +00:00
Witold Kręcicki
37df3ca8b6 Style nits 2018-10-29 19:22:10 +00:00
Witold Kręcicki
08460c8cb2 Don't do qname minimization when forwarding; Avoid some intermittent errors in qmin tests caused by timing 2018-10-29 19:22:10 +00:00
Ondřej Surý
b98ac2593c Add generic hashed message authentication code API (isc_hmac) to replace specific HMAC functions hmacmd5/hmacsha1/hmacsha2... 2018-10-25 08:15:42 +02:00
Mark Andrews
ba85bb1a85 whitespace 2018-10-23 12:15:04 +00:00
Mark Andrews
2b3b626cc1 set fctx->client to NULL 2018-10-23 12:15:04 +00:00
Mark Andrews
23766ff690 checkpoint 2018-10-23 12:15:04 +00:00
Witold Kręcicki
86246c7431 Initialize adbname->client properly; check for loops 2018-10-23 12:15:04 +00:00
Mark Andrews
2f36a62d16 use RUNTIME_CHECK 2018-10-23 12:15:04 +00:00
Mark Andrews
1a2a19c693 address fctx reference count leaks; style 2018-10-23 12:15:04 +00:00
Witold Kręcicki
f2af336dc4 Fix looping issues 2018-10-23 12:15:04 +00:00
Witold Kręcicki
70a1ba20ec QNAME miminimization should create a separate fetch context for each fetch -
this makes the cache more efficient and eliminates duplicates queries.
2018-10-23 12:15:04 +00:00
Mark Andrews
615ebc39e3 remove EDNS workarounds, update legacy test 2018-08-30 21:17:00 -07:00
Michał Kępień
24b9ec555a Do not treat a referral with a non-empty ANSWER section as an error
As part of resquery_response() refactoring [1], a goto statement was
replaced [2] with a call to a new function - originally called
rctx_delegation(), now folded into rctx_answer_none() - extracted from
existing code.  However, one call site of that refactored function does
not reset the "result" variable, causing a referral with a non-empty
ANSWER section to be inadvertently treated as an error, which prevents
resolution of names reliant on servers sending such responses.  Fix by
resetting the "result" variable to ISC_R_SUCCESS when a response
containing a non-empty ANSWER section can be treated as a delegation.

[1] see RT #45362

[2] see commit e1380a16741a3b4a57e54d7a9ce09dd12691522f
2018-08-22 10:14:37 +02:00
Witold Kręcicki
5cdb38c2c7 Remove unthreaded support 2018-08-16 17:18:52 +02:00
Evan Hunt
3f907b8bee caclulate nlabels and set *chainingp correctly 2018-08-08 14:33:19 -07:00
Evan Hunt
cac3978af2 explicit DNAME query could trigger a crash if deny-answer-aliases was set 2018-08-08 14:33:19 -07:00
Ondřej Surý
994e656977 Replace custom isc_boolean_t with C standard bool type 2018-08-08 09:37:30 +02:00
Ondřej Surý
cb6a185c69 Replace custom isc_u?intNN_t types with C99 u?intNN_t types 2018-08-08 09:37:28 +02:00
Ondřej Surý
64fe6bbaf2 Replace ISC_PRINT_QUADFORMAT with inttypes.h format constants 2018-08-08 09:36:44 +02:00
Evan Hunt
dde66b8012 nits
- capitalize QNAME in the doc
- regenerate options/docbook
- whitespace
2018-06-12 09:20:13 +02:00
Witold Kręcicki
265052df49 qname-minimization: Some post-review style/minor fixes 2018-06-12 09:20:12 +02:00
Witold Kręcicki
058ce1e732 qname minimization: log how many qmin steps were taken 2018-06-12 09:18:47 +02:00
Witold Kręcicki
c04784c144 Disable qname minimization if we encounter a bad server 2018-06-12 09:18:47 +02:00
Evan Hunt
c8015eb33b style nits (mostly line length) 2018-06-12 09:18:47 +02:00
Evan Hunt
2ea47c7f34 rename test to qmin; add it to conf.sh.in and Makefile.in; fix copyrights 2018-06-12 09:18:47 +02:00
Witold Kręcicki
dd7bb617be - qname minimization:
- make qname-minimization option tristate {strict,relaxed,disabled}
 - go straight for the record if we hit NXDOMAIN in relaxed mode
 - go straight for the record after 3 labels without new delegation or 7 labels total

- use start of fetch (and not time of response) as 'now' time for querying cache for
  zonecut when following delegation.
2018-06-12 09:18:46 +02:00
Witold Kręcicki
0698158eb0 QNAME minimization 2018-06-12 09:18:46 +02:00
Witold Kręcicki
cb3208aa43 Don't fetch DNSKEY when fuzzing resolver 2018-06-06 15:06:23 +02:00
Tony Finch
abfbedc0b1 Move NSID logging to its own category
It is very verbose, so it is useful to be able to filter it out.
2018-06-05 12:10:37 +10:00
Ondřej Surý
99ba29bc52 Change isc_random() to be just PRNG, and add isc_nonce_buf() that uses CSPRNG
This commit reverts the previous change to use system provided
entropy, as (SYS_)getrandom is very slow on Linux because it is
a syscall.

The change introduced in this commit adds a new call isc_nonce_buf
that uses CSPRNG from cryptographic library provider to generate
secure data that can be and must be used for generating nonces.
Example usage would be DNS cookies.

The isc_random() API has been changed to use fast PRNG that is not
cryptographically secure, but runs entirely in user space.  Two
contestants have been considered xoroshiro family of the functions
by Villa&Blackman and PCG by O'Neill.  After a consideration the
xoshiro128starstar function has been used as uint32_t random number
provider because it is very fast and has good enough properties
for our usage pattern.

The other change introduced in the commit is the more extensive usage
of isc_random_uniform in places where the usage pattern was
isc_random() % n to prevent modulo bias.  For usage patterns where
only 16 or 8 bits are needed (DNS Message ID), the isc_random()
functions has been renamed to isc_random32(), and isc_random16() and
isc_random8() functions have been introduced by &-ing the
isc_random32() output with 0xffff and 0xff.  Please note that the
functions that uses stripped down bit count doesn't pass our
NIST SP 800-22 based random test.
2018-05-29 22:58:21 +02:00
Evan Hunt
e324449349 remove the experimental authoritative ECS support from named
- mark the 'geoip-use-ecs' option obsolete; warn when it is used
  in named.conf
- prohibit 'ecs' ACL tags in named.conf; note that this is a fatal error
  since simply ignoring the tags could make ACLs behave unpredictably
- re-simplify the radix and iptable code
- clean up dns_acl_match(), dns_aclelement_match(), dns_acl_allowed()
  and dns_geoip_match() so they no longer take ecs options
- remove the ECS-specific unit and system test cases
- remove references to ECS from the ARM
2018-05-25 08:21:25 -07:00