[LNT] r238965 - Improved regression detection with min of diffs
Chris Matthews via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 2 11:49:03 PDT 2015
No I don’t think so, it makes the regression algorithm less vulnerable to both spikes and drops. The old approach with mins would ignore a spike, but trigger on a single point drop. Now a single sample that is largely different is completely ignored. Large improvements are also detected, except in the case where the data is bimodal, and the lesser mode is near the upper mode’s improvement.
> On Sep 2, 2015, at 10:33 AM, James Molloy <james at jamesmolloy.co.uk> wrote:
> Also, doesn't this make the regression detection algorithm very vulnerable to transient spikes (perhaps due to wonky data) or large compiler improvements? A one-off difference shouldn't affect the difference computed on aggregate.
> On Wed, 2 Sep 2015 at 17:28 Daniel Dunbar via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
> Hi Chris,
> This change has some questionable behavior and works pretty poorly when dealing with instances that have reasonable sample counts per run (e.g., 10).
> Given this ComparisonResult:
> ComparisonResult(safe_min, False, False, [0.873164, 0.822148, 0.877115, 0.830048, 0.830452, 0.897407, 0.817558, 0.842067, 0.836447, 0.875904], [0.695299, 0.798883, 0.676503, 0.671114, 0.648465, 0.653936, 0.673405, 0.646322, 0.654594, 0.667564], 0.05, False, )
> where all the previous samples are from the immediately preceding run (not from multiple runs), then this ends up computing:
> cr.current == 0.8176
> cr.previous == 0.6463
> cr.delta == 0.0187
> cr.pct_delta == 2.34%
> which doesn't make any sense to the user who is just seeing those values and wondering where the strange delta value came from.
> If we are going to keep this behavior, then I think that:
> a. The cr.previous value should be updated to be the selected value, so that all the values are at least consistent in the UI.
> b. This behavior should only be enabled when using prior samples across runs.
> I would also prefer this was modeled as a different "aggregation function" that shows in the UI, and maybe make that an per-instance default.
> - Daniel
> On Wed, Jun 3, 2015 at 2:01 PM, Chris Matthews <cmatthews5 at apple.com <mailto:cmatthews5 at apple.com>> wrote:
> Author: cmatthews
> Date: Wed Jun 3 16:01:23 2015
> New Revision: 238965
> URL: http://llvm.org/viewvc/llvm-project?rev=238965&view=rev <https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_viewvc_llvm-2Dproject-3Frev-3D238965-26view-3Drev&d=BQMFaQ&c=eEvniauFctOgLOKGJOplqw&r=sCYjGfqxzUYZGf3nWUM6D263xFvlkMzkQU_bdBSE1DY&m=wsyRVVznnryv8J12BgkYv1yGstMzJAc2yYTE6zHmSk0&s=rqEVThyTpkHLe5uAH7n1zuRy9iFYSbtA61VMWihwlrU&e=>
> Improved regression detection with min of diffs
> This change reduces false positives in LNT's regression detection
> algorithm by having it use more past data. Instead of calculating the
> delta as the difference of min(current)-min(prev), we use the smallest
> difference between min(current) and all prevs. When the data has noise
> or is multimodal this reduces our chances of flagging a false positive
> Modified: lnt/trunk/lnt/server/reporting/analysis.py
> URL: http://llvm.org/viewvc/llvm-project/lnt/trunk/lnt/server/reporting/analysis.py?rev=238965&r1=238964&r2=238965&view=diff <https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_viewvc_llvm-2Dproject_lnt_trunk_lnt_server_reporting_analysis.py-3Frev-3D238965-26r1-3D238964-26r2-3D238965-26view-3Ddiff&d=BQMFaQ&c=eEvniauFctOgLOKGJOplqw&r=sCYjGfqxzUYZGf3nWUM6D263xFvlkMzkQU_bdBSE1DY&m=wsyRVVznnryv8J12BgkYv1yGstMzJAc2yYTE6zHmSk0&s=_6z2OunP4brgSkGgYGq-qd0mVVW2v1aMLysorzU_QBg&e=>
> --- lnt/trunk/lnt/server/reporting/analysis.py (original)
> +++ lnt/trunk/lnt/server/reporting/analysis.py Wed Jun 3 16:01:23 2015
> @@ -15,6 +15,21 @@ UNCHANGED_FAIL = 'UNCHANGED_FAIL'
> MIN_VALUE_PRECISION = 0.0001
> +def absmin_diff(current, prevs):
> + """Min of differences between current sample and all previous samples.
> + Given more than one min, use the last one detected which is probably a
> + newer value. Returns (difference, prev used)
> + """
> + diffs = [abs(current-prev) for prev in prevs]
> + smallest_pos = 0
> + smallest = diffs
> + for i, diff in enumerate(diffs):
> + if diff <= smallest:
> + smallest = diff
> + smallest_pos = i
> + return current-prevs[smallest_pos], prevs[smallest_pos]
> def calc_geomean(run_values):
> # NOTE Geometric mean applied only to positive values, so fix it by
> # adding MIN_VALUE to each value and substract it from the result.
> @@ -48,8 +63,8 @@ class ComparisonResult:
> # Compute the comparison status for the test value.
> if self.current and self.previous and self.previous != 0:
> - self.delta = self.current - self.previous
> - self.pct_delta = self.delta / self.previous
> + self.delta, value = absmin_diff(self.current, prev_samples)
> + self.pct_delta = self.delta / value
> self.delta = 0
> self.pct_delta = 0.0
> Modified: lnt/trunk/tests/server/reporting/analysis.py
> URL: http://llvm.org/viewvc/llvm-project/lnt/trunk/tests/server/reporting/analysis.py?rev=238965&r1=238964&r2=238965&view=diff <https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_viewvc_llvm-2Dproject_lnt_trunk_tests_server_reporting_analysis.py-3Frev-3D238965-26r1-3D238964-26r2-3D238965-26view-3Ddiff&d=BQMFaQ&c=eEvniauFctOgLOKGJOplqw&r=sCYjGfqxzUYZGf3nWUM6D263xFvlkMzkQU_bdBSE1DY&m=wsyRVVznnryv8J12BgkYv1yGstMzJAc2yYTE6zHmSk0&s=f4bpeqVu2SpwaN3ZuMAiORuuUkL9oa7aH4pZqLoDCy8&e=>
> --- lnt/trunk/tests/server/reporting/analysis.py (original)
> +++ lnt/trunk/tests/server/reporting/analysis.py Wed Jun 3 16:01:23 2015
> @@ -2,8 +2,10 @@
> # RUN: python %s
> import unittest
> from lnt.server.reporting.analysis import ComparisonResult, REGRESSED, IMPROVED
> from lnt.server.reporting.analysis import UNCHANGED_PASS, UNCHANGED_FAIL
> +from lnt.server.reporting.analysis import absmin_diff
> FLAT_LINE = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
> 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
> @@ -309,5 +311,18 @@ class ComparisonResultTest(unittest.Test
> # Fixme
> # self.assertEquals(slow.get_value_status(), IMPROVED)
> +class AbsMinTester(unittest.TestCase):
> + def test_absmin(self):
> + """Test finding smallest difference."""
> + self.assertEqual(absmin_diff(1, [2, 2, 3]), (-1, 2))
> + self.assertEqual(absmin_diff(1, [1, 2, 3]), (0, 1))
> + self.assertEqual(absmin_diff(1, ), (-1, 2))
> + self.assertEqual(absmin_diff(1.5, [1, 4, 4]), (0.5, 1))
> + self.assertEqual(absmin_diff(5, [1, 2, 1]), (3, 2))
> + self.assertEqual(absmin_diff(1, [2, 0, 3]), (1, 0))
> if __name__ == '__main__':
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits <https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.cs.uiuc.edu_mailman_listinfo_llvm-2Dcommits&d=BQMFaQ&c=eEvniauFctOgLOKGJOplqw&r=sCYjGfqxzUYZGf3nWUM6D263xFvlkMzkQU_bdBSE1DY&m=wsyRVVznnryv8J12BgkYv1yGstMzJAc2yYTE6zHmSk0&s=9pJrOn6nX0Xi_wSfTb722WQfcw1euEyRZFzG-AG8MCQ&e=>
> llvm-commits mailing list
> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.llvm.org_cgi-2Dbin_mailman_listinfo_llvm-2Dcommits&d=BQMFaQ&c=eEvniauFctOgLOKGJOplqw&r=sCYjGfqxzUYZGf3nWUM6D263xFvlkMzkQU_bdBSE1DY&m=wsyRVVznnryv8J12BgkYv1yGstMzJAc2yYTE6zHmSk0&s=wQRSiZEjIWuBqIAOhNk8wcFE9H6GsHnydq6CLXqLQc8&e=>
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-commits