[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

Philip Reames listmail at philipreames.com
Thu May 28 10:14:21 PDT 2015


I'd love to see this tool contributed, even if it isn't used for regression 
detection work.  I've got a couple of hacked-up scripts which do similar 
things, and having a robust tool available for this would be very useful.

Philip

On 05/26/2015 09:53 AM, Smith, Kevin B wrote:
>
> Intel has a binary comparator tool that we have been using for several 
> years for comparing output binaries to see if the code within them is 
> considered identical.  We use it to eliminate runs (and therefore some 
> performance noise) from our own performance tracking tools.
>
> We are willing to contribute the source code for this to the LLVM 
> community if there is interest.
>
> There are two programs involved:  getdep, which displays the list of 
> DLL/.so dependencies of the image in question, and cmpimage itself, 
> which does the comparison ignoring the parts not contributed by the 
> compiler.  The cmpimage program is also almost completely derived from 
> the published object format descriptions.
>
> Let me know if there is interest in these pieces of tooling, and if 
> so, what you think next steps should be.
>
> Kevin B. Smith
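
To make the "compare while ignoring the parts not contributed by the compiler" idea concrete, here is a minimal sketch. It is not cmpimage and does not parse any object format: the function names and the `ignore` byte ranges are hypothetical, and a real tool would derive those ranges from the object format description (for example, header timestamp fields).

```python
def masked(data: bytes, ignore: list) -> bytes:
    """Return a copy of data with every half-open (start, end) range zeroed."""
    out = bytearray(data)
    for start, end in ignore:
        end = min(end, len(out))  # clamp ranges that run past the end
        if start < end:
            out[start:end] = bytes(end - start)
    return bytes(out)


def same_code(a: bytes, b: bytes, ignore: list) -> bool:
    """True if the two images match everywhere outside the ignored ranges."""
    return len(a) == len(b) and masked(a, ignore) == masked(b, ignore)
```

With this, two images that differ only inside an ignored range (say, an embedded timestamp) compare as identical, while any difference elsewhere is still reported.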
>
> *From:*llvmdev-bounces at cs.uiuc.edu 
> [mailto:llvmdev-bounces at cs.uiuc.edu] *On Behalf Of *Sean Silva
> *Sent:* Thursday, May 21, 2015 2:14 PM
> *To:* Chris Matthews
> *Cc:* LLVM Developers Mailing List
> *Subject:* Re: [LLVMdev] Proposal: change LNT’s regression detection 
> algorithm and how it is used to reduce false positives
>
> On Thu, May 21, 2015 at 11:24 AM, Chris Matthews 
> <chris.matthews at apple.com> wrote:
>
> I agree this is a great idea.  I think it needs to be fleshed out a 
> little though.
>
> It would still be wise to run the regression detection algorithm, 
> because the test suite changes and the machines change, and the 
> algorithm is not perfect yet. It would be a valuable source of 
> information though.
>
> How would running it as part of regular testing change anything? 
> Presumably the only purpose it would serve is retrospectively going 
> back and seeing false-positives in the aggregate. But if we are 
> already doing offline analysis, we can run the regression detection 
> algorithm (or any prospective ones) offline on the raw data; it 
> doesn't take that long.
>
>
>     This is not a small change to how LNT works, so I think some due
>     diligence is necessary.  Is clang *really* that deterministic,
>     especially over successive revs?
>
> Yes. Actually, google's build system depends on this for its caching 
> strategy to work and so the google guys are usually on top of any 
> issues in this respect (thanks google guys!).
>
>     I know it is supposed to be.  Does anyone have any data to show
>     this is going to be an effective approach?  It seems like there
>     are benchmarks in the test-suite which use __DATE__ and __TIME__
>     in them. I assume that will be a problem?
>
> __DATE__ and __TIME__ should be easy to solve by modifying the 
> benchmark, or teaching clang to always return a fixed value for them 
> (maybe we already have this? IIRC google's build system does something 
> like this; or maybe they do it at the OS level).
>
> -- Sean Silva
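
One possible way to pin `__DATE__` and `__TIME__` without touching the benchmark sources is to redefine them on the command line and silence the resulting warning. The sketch below only builds the argument list; the helper names and the chosen fixed values are assumptions for illustration, and the thread does not confirm that any particular build system does it this way.

```python
def deterministic_macro_flags(date="Jan  1 1970", time="00:00:00"):
    """Extra compiler arguments that pin __DATE__ and __TIME__ to fixed strings."""
    return [
        # Redefining a builtin macro warns by default; suppress that warning.
        "-Wno-builtin-macro-redefined",
        f'-D__DATE__="{date}"',
        f'-D__TIME__="{time}"',
    ]


def compile_command(source, output, compiler="clang"):
    """Argument vector (no shell quoting needed) for one compile step."""
    return [compiler, *deterministic_macro_flags(), "-c", source, "-o", output]
```

The list is meant to be passed directly to something like `subprocess.run`, so the embedded double quotes reach the compiler unchanged and the macros expand to string literals.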
>
>
>     > On May 21, 2015, at 1:43 AM, Renato Golin
>     > <renato.golin at linaro.org> wrote:
>     >
>     > On 20 May 2015 at 23:31, Sean Silva <chisophugis at gmail.com> wrote:
>     >> In the last 10,000 revisions of LLVM+Clang, only 10 revisions actually
>     >> caused the binary of MultiSource/Benchmarks/BitBench/five11 to change.
>     >> So if we just store a hash of the binary in the database, we should be
>     >> able to pool all samples we have collected while the binary is the same
>     >> as it currently is, which will let us use significantly more datapoints
>     >> for the reference.
>     >
>     > +1
>     >
>     >
>     >> Also, we can trivially eliminate running the regression detection
>     >> algorithm if the binary hasn't changed.
>     >
>     > +2!
>     >
>     > --renato
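
The hash-and-pool idea above can be sketched roughly as follows. This is not LNT code: `SampleStore` is a hypothetical in-memory stand-in for the results database, and SHA-256 over the whole file is an assumed choice of hash.

```python
import hashlib
from collections import defaultdict


def binary_hash(data: bytes) -> str:
    """Return a hex digest identifying one build of a benchmark binary."""
    return hashlib.sha256(data).hexdigest()


class SampleStore:
    """Hypothetical in-memory stand-in for a results database."""

    def __init__(self):
        # (benchmark name, binary hash) -> timing samples in seconds
        self._samples = defaultdict(list)
        # benchmark name -> hash of the most recently recorded binary
        self._latest = {}

    def record(self, benchmark: str, binary: bytes, seconds: float) -> bool:
        """Store a sample; return True if the binary differs from the last run."""
        digest = binary_hash(binary)
        previous = self._latest.get(benchmark)
        self._latest[benchmark] = digest
        self._samples[(benchmark, digest)].append(seconds)
        return previous is not None and previous != digest

    def pooled(self, benchmark: str) -> list:
        """All samples collected while the binary matched the current one."""
        digest = self._latest[benchmark]
        return list(self._samples[(benchmark, digest)])
```

A caller would run regression detection only when `record` returns True, and would use `pooled` as the reference set otherwise, which grows with every run in which the binary is unchanged.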
>
>     > _______________________________________________
>     > LLVM Developers mailing list
>     > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>     http://llvm.cs.uiuc.edu
>     > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
