[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives
Philip Reames
listmail at philipreames.com
Tue Jun 2 14:24:11 PDT 2015
Personally, I would prefer this either live in its own repository, or
llvm/tools/. None of my use cases will likely involve the test-suite.
p.s. If this is going to end up an llvm tool, it will need to follow
LLVM style.
p.p.s. We should probably start a new thread with the proposed addition
since I imagine many folks are ignoring this one by now given how deep
it's gotten.
Philip
On 06/02/2015 12:04 PM, Chris Matthews wrote:
> I like that idea!
>
>
>> On Jun 2, 2015, at 12:00 PM, Smith, Kevin B <kevin.b.smith at intel.com
>> <mailto:kevin.b.smith at intel.com>> wrote:
>>
>> The code for cmpimage and getdep consists of five source files, with
>> the following sizes
>>
>> $ wc *
>>   5912  20353 191869 cmpimage.cpp
>>    290   1328  10668 elf.h
>>   1496   5006  41691 getdep.cpp
>>    233    959   7692 macho.h
>>    403   1831  18394 pecoff.h
>>   8334  29477 270314 total
>>
>> Building each of them is a simple compilation with whatever C++
>> compiler you happen to be using (clang, icc, cl, g++):
>>
>> $(CXX) -o cmpimage -O2 cmpimage.cpp
>>
>> $(CXX) -o getdep -O2 getdep.cpp
>>
>> This seems like it would fit rather easily into test-suite/tools,
>> which already exists and has a Makefile that the commands to build
>> these could be integrated into.
>>
>> This is my best guess/opinion based on a cursory look over the
>> test-suite directory structure.
>>
>> Kevin
>>
>> *From:*Chris Matthews [mailto:chris.matthews at apple.com]
>> *Sent:* Thursday, May 28, 2015 1:02 PM
>> *To:* Smith, Kevin B
>> *Cc:* Philip Reames; Sean Silva; LLVM Developers Mailing List
>> *Subject:* Re: [LLVMdev] Proposal: change LNT’s regression detection
>> algorithm and how it is used to reduce false positives
>>
>> Where is the best place to keep this?
>>
>> - As third party tool we all use?
>>
>> - Contribute as new project?
>>
>> - Lives in test-suite/utils?
>>
>> - Lives in llvm/utils?
>>
>> On May 28, 2015, at 11:22 AM, Smith, Kevin B
>> <kevin.b.smith at intel.com <mailto:kevin.b.smith at intel.com>> wrote:
>>
>> OK, there is interest from at least a couple of people. What
>> should next steps be?
>>
>> Kevin
>>
>> *From:*Chris Matthews [mailto:chris.matthews at apple.com]
>> *Sent:* Thursday, May 28, 2015 10:57 AM
>> *To:* Philip Reames
>> *Cc:* Smith, Kevin B; Sean Silva; LLVM Developers Mailing List
>> *Subject:* Re: [LLVMdev] Proposal: change LNT’s regression
>> detection algorithm and how it is used to reduce false positives
>>
>> I agree. I think there are a lot of exciting uses for this tool.
>> A stage 3 build bot would be another one.
>>
>> On May 28, 2015, at 10:14 AM, Philip Reames
>> <listmail at philipreames.com
>> <mailto:listmail at philipreames.com>> wrote:
>>
>> I'd love to see this tool contributed, even if it isn't used for
>> regression detection work. I've got a couple of hacked up
>> scripts which do similar things and having a robust tool
>> available for this would be very useful.
>>
>> Philip
>>
>> On 05/26/2015 09:53 AM, Smith, Kevin B wrote:
>>
>> Intel has a binary comparator tool that we have been
>> using for several years for comparing output binaries
>> to see if the code within them is considered identical.
>> We use it to eliminate runs (and therefore some
>> performance noise) from our own performance tracking tools.
>>
>> We are willing to contribute the source code for this to
>> the LLVM community if there is interest.
>>
>> There are two programs involved: getdep, which displays
>> the list of DLL/.so dependencies of the image in
>> question, and cmpimage itself, which does the comparison
>> ignoring the parts not contributed by the compiler. The
>> cmpimage program is also almost completely derived from
>> the published object format descriptions.
>>
>> Let me know if there is interest in these pieces of
>> tooling, and if so, what you think next steps should be.
>>
>> Kevin B. Smith
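
The thread does not describe how cmpimage decides which bytes to ignore, so the following is only a rough sketch of the general idea: compare two images while masking out byte ranges that do not come from the compiler. The function name `images_match` and the offsets in the example are made up for illustration and are not taken from the Intel tool.

```python
def images_match(a: bytes, b: bytes, ignored_ranges=()):
    """Compare two images, treating bytes inside `ignored_ranges` as equal.

    `ignored_ranges` holds half-open (start, end) offsets, for example a
    header timestamp field. Real object formats would need a parser to
    locate such fields; here the caller supplies them directly.
    """
    if len(a) != len(b):
        return False
    masked_a, masked_b = bytearray(a), bytearray(b)
    for start, end in ignored_ranges:
        # Zero the ignored span in both copies so it cannot cause a mismatch.
        masked_a[start:end] = bytes(len(masked_a[start:end]))
        masked_b[start:end] = bytes(len(masked_b[start:end]))
    return masked_a == masked_b


# Hypothetical layout: 3-byte header, 4-byte timestamp, then the code.
first = b"HDR" + b"\x01\x02\x03\x04" + b"CODE"
second = b"HDR" + b"\x09\x09\x09\x09" + b"CODE"
third = b"HDR" + b"\x01\x02\x03\x04" + b"CODX"

same_ignoring_timestamp = images_match(first, second, [(3, 7)])
same_without_mask = images_match(first, second)
same_with_code_change = images_match(first, third, [(3, 7)])
```

With the timestamp range masked, `first` and `second` compare equal, while a difference in the code bytes is still reported.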
>>
>> *From:*llvmdev-bounces at cs.uiuc.edu
>> <mailto:llvmdev-bounces at cs.uiuc.edu>
>> [mailto:llvmdev-bounces at cs.uiuc.edu] *On Behalf Of *Sean
>> Silva
>> *Sent:* Thursday, May 21, 2015 2:14 PM
>> *To:* Chris Matthews
>> *Cc:* LLVM Developers Mailing List
>> *Subject:* Re: [LLVMdev] Proposal: change LNT’s
>> regression detection algorithm and how it is used to
>> reduce false positives
>>
>> On Thu, May 21, 2015 at 11:24 AM, Chris Matthews
>> <chris.matthews at apple.com
>> <mailto:chris.matthews at apple.com>> wrote:
>>
>> I agree this is a great idea. I think it needs to be
>> fleshed out a little though.
>>
>> It would still be wise to run the regression detection
>> algorithm, because the test suite changes and the
>> machines change, and the algorithm is not perfect yet.
>> It would be a valuable source of information though.
>>
>> How would running it as part of regular testing change
>> anything? Presumably the only purpose it would serve is
>> retrospectively going back and seeing false-positives in
>> the aggregate. But if we are already doing offline
>> analysis, we can run the regression detection algorithm
>> (or any prospective ones) offline on the raw data; it
>> doesn't take that long.
>>
>>
>> This is not a small change to how LNT works, so I
>> think some due diligence is necessary. Is clang
>> *really* that deterministic, especially over
>> successive revs?
>>
>> Yes. Actually, google's build system depends on this for
>> its caching strategy to work and so the google guys are
>> usually on top of any issues in this respect (thanks
>> google guys!).
>>
>> I know it is supposed to be. Does anyone have any
>> data to show this is going to be an effective
>> approach? It seems like there are benchmarks in the
>> test-suite which use __DATE__ and __TIME__ in them. I
>> assume that will be a problem?
>>
>> __DATE__ and __TIME__ should be easy to solve by
>> modifying the benchmark, or teaching clang to always
>> return a fixed value for them (maybe we already have
>> this? IIRC google's build system does something like
>> this; or maybe they do it at the OS level).
>>
>> -- Sean Silva
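
Before modifying benchmarks, it helps to know which ones use these macros at all. A minimal sketch of such a scan follows, assuming a plain textual search is good enough; it does not skip comments or string literals, and the helper name `find_timestamp_macros` is hypothetical.

```python
import re

# Matches the two predefined macros mentioned above.
TIMESTAMP_MACRO = re.compile(r"\b__(?:DATE|TIME)__\b")


def find_timestamp_macros(source: str):
    """Return (line_number, macro) pairs for each use found in `source`."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TIMESTAMP_MACRO.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


example = 'int main() {\n  puts("built " __DATE__ " " __TIME__);\n  return 0;\n}\n'
hits = find_timestamp_macros(example)
```

Running this over each benchmark's sources would give a list of files whose binaries cannot be expected to hash identically from build to build.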
>>
>>
>> > On May 21, 2015, at 1:43 AM, Renato Golin
>> <renato.golin at linaro.org
>> <mailto:renato.golin at linaro.org>> wrote:
>> >
>> > On 20 May 2015 at 23:31, Sean Silva
>> <chisophugis at gmail.com
>> <mailto:chisophugis at gmail.com>> wrote:
>> >> In the last 10,000 revisions of LLVM+Clang, only 10 revisions actually
>> >> caused the binary of MultiSource/Benchmarks/BitBench/five11 to change. So if
>> >> we just store a hash of the binary in the database, we should be able to pool
>> >> all samples we have collected while the binary is the same as it
>> >> currently is, which will let us use significantly more datapoints for the
>> >> reference.
>> >
>> > +1
>> >
>> >
>> >> Also, we can trivially eliminate running the regression detection algorithm
>> >> if the binary hasn't changed.
>> >
>> > +2!
>> >
>> > --renato
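
The two ideas quoted above (pool samples while the binary hash is unchanged, and skip regression detection when it is unchanged) can be sketched roughly as follows. This is not LNT code: the function names, the choice of SHA-256, and the in-memory data layout are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def binary_hash(data: bytes) -> str:
    """Return a hex digest identifying a benchmark binary's contents."""
    return hashlib.sha256(data).hexdigest()


def pool_by_hash(runs):
    """Group timing samples by binary hash.

    `runs` is an iterable of (digest, samples) pairs, one per revision.
    Revisions that produced an identical binary share one pool.
    """
    pools = defaultdict(list)
    for digest, samples in runs:
        pools[digest].extend(samples)
    return dict(pools)


def needs_regression_check(previous_hash: str, current_hash: str) -> bool:
    """Only run the detector when the binary actually changed."""
    return previous_hash != current_hash


# Stand-in byte strings; a real setup would hash the built executable.
old = binary_hash(b"binary built at r1000")
same = binary_hash(b"binary built at r1000")  # later revision, identical output
new = binary_hash(b"binary built at r1002")

pools = pool_by_hash([(old, [1.00, 1.02]), (same, [0.99]), (new, [1.10])])
```

Here the first two revisions land in one pool of three samples, and `needs_regression_check(old, same)` is false, so the detector would only run for the third revision.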
>>
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>> http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>