Allow re-run of the shotgun jobs to reduce false positives

The false positive rate is about 10-20 % when evaluating shotgun results
from a single run. Attempt to reduce the false positive rate by allowing
a re-run of failed jobs.

While there is a slight risk that barely noticeable decreases in
performance might slip by more easily in MRs, they'd still likely pop up
during nightly or pre-release testing.
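
The trade-off described above can be estimated with a little arithmetic.
The following is only a rough sketch, not part of the change itself. It
assumes that each attempt of a job fails independently with the same
probability, which real runs may not satisfy (noise on a shared runner
can be correlated). The function name and the 60 % detection figure are
hypothetical and chosen purely for illustration. With `retry: 1` a job
gets two attempts and is reported as failed only if both fail.

```python
def job_failure_rate(per_run_rate: float, retries: int = 1) -> float:
    """Probability that a job is still failed after all of its attempts.

    ASSUMPTION: every attempt fails independently with the same
    probability. This is a simplification, not a measured property.
    """
    attempts = 1 + retries  # "retry: 1" means one extra attempt
    return per_run_rate ** attempts


# False positives: single-run rates quoted in the commit message.
for rate in (0.10, 0.20):
    print(f"single run {rate:.0%} -> with one retry {job_failure_rate(rate):.1%}")

# Real but borderline regression: HYPOTHETICAL 60 % chance of being
# flagged in any single run. It is reported only if both attempts flag it.
detection = 0.60
print(f"detection {detection:.0%} -> with one retry {job_failure_rate(detection):.0%}")
```

Under these assumptions a 10-20 % false positive rate drops to roughly
1-4 %, while a regression that is flagged only some of the time becomes
correspondingly less likely to fail the MR pipeline, which is the risk
accepted above.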

Also increase the tolerance threshold for DoH latency comparisons, as
those tests often experience increased jitter in the tail end latencies.

(cherry picked from commit 5eab352478)
Nicki Křížek
2025-03-12 17:24:05 +01:00
parent 61443486bb
commit cb81260e4a

@@ -374,6 +374,9 @@ stages:
SHOTGUN_ROUNDS: 3
- &shotgun_rule_other
if: '$CI_PIPELINE_SOURCE =~ /^(api|pipeline|schedule|trigger|web)$/'
+# when using data from a single run, the overall instability of the results
+# causes quite a high false positive rate; rerun the test to attempt to reduce those
+retry: 1
script:
- if [ -z "$BASELINE" ]; then export BASELINE=$BIND_BASELINE_VERSION; fi # this dotenv variable can't be set in the rules section, because rules are evaluated before any jobs run
- PIPELINE_ID=$(curl -s -X POST --fail
@@ -1607,9 +1610,6 @@ respdiff-third-party:
# Performance tests
-# Run shotgun:udp right away, but delay other shotgun jobs slightly in order to
-# allow re-use of the built container image. Otherwise, the jobs would do the
-# same builds in parallel rather than re-use the already built image.
shotgun:udp:
<<: *shotgun_job
variables:
@@ -1641,7 +1641,7 @@ shotgun:doh-get:
variables:
SHOTGUN_SCENARIO: doh-get
SHOTGUN_TRAFFIC_MULTIPLIER: 3
-SHOTGUN_EVAL_THRESHOLD_LATENCY_PCTL_MAX: 0.3 # bump from the default due to increased tail-end jitter
+SHOTGUN_EVAL_THRESHOLD_LATENCY_PCTL_MAX: 0.4 # bump from the default due to increased tail-end jitter
rules: *shotgun_rules_manual_mr
.stress-test: &stress_test