[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

Sean Silva chisophugis at gmail.com
Wed May 20 15:31:28 PDT 2015


I found an interesting datapoint:

In the last 10,000 revisions of LLVM+Clang, only 10 revisions actually
caused the binary of MultiSource/Benchmarks/BitBench/five11 to change. So
if we just store a hash of the binary in the database, we should be able to
pool all samples we have collected while the binary is the same as it
currently is, which will let us use significantly more datapoints for the
reference.

Also, we can trivially eliminate running the regression detection algorithm
if the binary hasn't changed.
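
To make the idea concrete, here is a rough sketch in Python. It is not
LNT's actual schema or API: the names `binary_hash`, `SampleStore`, and
`needs_regression_check` are hypothetical, and it assumes the build is
deterministic, so that an unchanged binary really does hash to the same
value.

```python
import hashlib
from collections import defaultdict


def binary_hash(binary_contents: bytes) -> str:
    """Return a hex digest identifying one exact build of a benchmark binary."""
    return hashlib.sha256(binary_contents).hexdigest()


class SampleStore:
    """Hypothetical in-memory stand-in for the samples table in the database."""

    def __init__(self):
        # (benchmark name, binary hash) -> list of timing samples
        self._samples = defaultdict(list)

    def add(self, benchmark: str, bin_hash: str, timing: float) -> None:
        self._samples[(benchmark, bin_hash)].append(timing)

    def reference_pool(self, benchmark: str, bin_hash: str) -> list:
        # Every sample collected while the binary was byte-identical,
        # regardless of which revision produced it.
        return list(self._samples.get((benchmark, bin_hash), []))


def needs_regression_check(previous_hash: str, current_hash: str) -> bool:
    # If the binary did not change, any timing difference is noise,
    # so the regression detection algorithm can be skipped entirely.
    return previous_hash != current_hash
```

With something like this, the reference for five11 would be
`store.reference_pool("five11", current_hash)` rather than only the
samples from the previous run.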

-- Sean Silva

On Mon, May 18, 2015 at 9:02 PM, Chris Matthews <chris.matthews at apple.com>
wrote:

> The reruns flag already does that.  It helps a bit, but only as long as
> the benchmark is flagged as regressed.
>
>
> On May 18, 2015, at 8:28 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
>
> On Mon, May 18, 2015 at 11:24 AM, Mikhail Zolotukhin <
> mzolotukhin at apple.com> wrote:
>
>> Hi Chris and others!
>>
>> I totally support any work in this direction.
>>
>> In its current state, LNT’s regression detection system is too noisy,
>> which makes it almost impossible to use in some cases. If after each run a
>> developer gets a dozen ‘regressions’, none of which happens to be real,
>> he/she won’t care about such reports after a while. We clearly need to
>> filter out as much noise as we can - and as it turns out, even the simplest
>> techniques could help here. For example, the technique I used (which you
>> mentioned earlier) takes ~15 lines of code to implement and filters out
>> almost all noise in our internal data-sets. It’d be really cool to have
>> something more scientifically-proven though:)
>>
>> One thing to add from me - I think we should try to do our best under the
>> assumption that we don’t have enough samples. Of course, the more data we
>> have, the better, but in many cases we can’t (or don’t want to) increase
>> the number of samples, since that dramatically increases testing time.
>>
>
> Why not just start out with only a few samples, then collect more for
> benchmarks that appear to have changed?
>
> -- Sean Silva
>
>
>> That’s not to discourage anyone from increasing the number of samples, or
>> adding techniques relying on a significant number of samples, but rather to
>> try mining as many ‘samples’ as possible from the data we have - e.g. I
>> absolutely agree with your idea to pass more than 1 previous run.
>>
>> Thanks,
>> Michael
>>