Fix nonsensical stale TTL values in cache dump

When introducing change 5149, "rndc dumpdb" started to print a line
above a stale RRset, indicating how long the data will be retained.

At that time, I thought it should also be possible to load
a cache from file. But if a TTL has a value of 0 (because it is stale),
stale entries wouldn't be loaded from file. So, I added the
'max-stale-ttl' to TTL values, and adjusted the $DATE accordingly.

Since we actually don't have a "load cache from file" feature, this
is premature and is causing confusion at operators. This commit
changes the 'max-stale-ttl' adjustments.

A check in the serve-stale system test is added for a non-stale
RRset (longttl.example) to make sure the TTL in cache is sensible.

Also, the comment above stale RRsets could have nonsensical
values. A possible reason why this may happen is when the RRset was
marked a stale but the 'max-stale-ttl' has passed (and is actually an
RRset awaiting cleanup). This would lead to the "will be retained"
value to be negative (but since it is stored in an uint32_t, you would
get a nonsensical value (e.g. 4294362497).

To mitigate against this, we now also check if the header is not
ancient. In addition we check if the stale_ttl would be negative, and
if so we set it to 0. Most likely this will not happen because the
header would already have been marked ancient, but there is a possible
race condition where the 'rdh_ttl + serve_stale_ttl' has passed,
but the header has not been checked for staleness.

(cherry picked from commit 2a5e0232ed)
This commit is contained in:
Matthijs Mekking
2021-03-10 12:09:16 +01:00
parent 7c2b5495e0
commit dcf6e3e58a
5 changed files with 35 additions and 14 deletions

View File

@@ -108,9 +108,10 @@ sleep 2
echo_i "sending queries for tests $((n+1))-$((n+4))..."
$DIG -p ${PORT} @10.53.0.1 data.example TXT > dig.out.test$((n+1)) &
$DIG -p ${PORT} @10.53.0.1 othertype.example CAA > dig.out.test$((n+2)) &
$DIG -p ${PORT} @10.53.0.1 nodata.example TXT > dig.out.test$((n+3)) &
$DIG -p ${PORT} @10.53.0.1 nxdomain.example TXT > dig.out.test$((n+4))
$DIG -p ${PORT} @10.53.0.1 longttl.example TXT > dig.out.test$((n+2)) &
$DIG -p ${PORT} @10.53.0.1 othertype.example CAA > dig.out.test$((n+3)) &
$DIG -p ${PORT} @10.53.0.1 nodata.example TXT > dig.out.test$((n+4)) &
$DIG -p ${PORT} @10.53.0.1 nxdomain.example TXT > dig.out.test$((n+5))
wait
@@ -136,6 +137,15 @@ awk '/; answer/ { x=$0; getline; print x, $0}' ns1/named_dump.db.test$n |
if [ $ret != 0 ]; then echo_i "failed"; fi
status=$((status+ret))
n=$((n+1))
echo_i "check non-stale longttl.example ($n)"
ret=0
grep "status: NOERROR" dig.out.test$n > /dev/null || ret=1
grep "ANSWER: 1," dig.out.test$n > /dev/null || ret=1
grep "longttl\.example\..*59[0-9].*IN.*TXT.*A text record with a 600 second ttl" dig.out.test$n > /dev/null || ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=$((status+ret))
n=$((n+1))
echo_i "check stale othertype.example ($n)"
ret=0