<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jul 21, 2015, at 3:50 PM, Xinliang David Li <<a href="mailto:xinliangli@gmail.com" class="">xinliangli@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">reg_values is a file static variable in cselib.c -- so you might be able to reproduce the issue with a smaller reproducible.</div></div></blockquote>The problem here is that we haven’t defined the issue:) The original question was “does GlobalsModRef give us anything, or can we just turn it off?”, so my recent reply shows that turning it off hurts performance on 403.gcc.</div><div><br class=""></div><div>Michael</div><div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class="">David</div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Tue, Jul 21, 2015 at 3:43 PM, Michael Zolotukhin <span dir="ltr" class=""><<a href="mailto:mzolotukhin@apple.com" target="_blank" class="">mzolotukhin@apple.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br class="">
> On Jul 21, 2015, at 3:34 PM, Daniel Berlin <<a href="mailto:dberlin@dberlin.org" class="">dberlin@dberlin.org</a>> wrote:<br class="">
><br class="">
> Based on function names and structures, this is some version of GCC :)<br class="">
</span>Yep, it’s 403.gcc from SPEC2006 (I thought I mentioned it - but probably I only did that on IRC).<br class="">
<span class=""><br class="">
> Any way you can post the entire .ll file?<br class="">
</span>It’s an LTO build, so that would be troublesome. I tried to print the module in lldb, but after several minutes it still hadn’t finished printing the globals (which I assume come at the very beginning).<br class="">
<span class=""><br class="">
><br class="">
> Because it's globalsmodref, it's hard to debug without the other<br class="">
> functions, since it goes over all the functions to determine address<br class="">
> takenness, etc :)<br class="">
</span>Yep, I understand that. I’m still in the debugger though, so if you’re interested in some particular data, I can try collecting it. I can try to dump the module too, but that might not be practical in the end :)<br class="">
<span class="HOEnZb"><font color="#888888" class=""><br class="">
Michael<br class="">
</font></span><div class="HOEnZb"><div class="h5">><br class="">
><br class="">
> On Tue, Jul 21, 2015 at 3:23 PM, Michael Zolotukhin<br class="">
> <<a href="mailto:mzolotukhin@apple.com" class="">mzolotukhin@apple.com</a>> wrote:<br class="">
>> Hi Chandler,<br class="">
>><br class="">
>> We observed some regressions in our regular testing (although I saw nothing<br class="">
>> suspicious in my own runs). I investigated carefully and was able to<br class="">
>> reproduce and track down the regression.<br class="">
>> I found the exact request to GlobalsModRef that results in the performance<br class="">
>> loss (I added a limit on the number of requests into the implementation and<br class="">
>> bisected that number to find the interesting request).<br class="">
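The bisection described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names bisectQueryLimit, RegressionTest and demoRegresses are hypothetical and do not come from the LLVM tree, and the sketch assumes the regression is absent at the lower limit, present at the upper limit, and flips exactly once in between.<br class="">
<pre class="">
```cpp
// Hypothetical predicate: returns true if the regression shows up when the
// analysis is allowed to answer only the first Limit requests.
using RegressionTest = bool (*)(unsigned Limit);

// Binary search for the smallest limit at which the regression appears.
// Assumes Regresses(Lo) is false, Regresses(Hi) is true, and the answer
// changes exactly once between Lo and Hi.
unsigned bisectQueryLimit(unsigned Lo, unsigned Hi, RegressionTest Regresses) {
  while (Hi - Lo > 1) {
    unsigned Mid = Lo + (Hi - Lo) / 2;
    if (Regresses(Mid))
      Hi = Mid; // regression already visible: the culprit is at or below Mid
    else
      Lo = Mid; // still fine: the culprit is above Mid
  }
  return Hi;
}

// Stand-in predicate for demonstration only: pretends request number 37 is
// the one that matters.
bool demoRegresses(unsigned Limit) { return Limit >= 37; }
```
</pre>
In a real investigation the predicate would rebuild and rerun the benchmark with the given limit, so each probe is expensive. The binary search keeps the number of probes logarithmic in the total number of requests.<br class="">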
>><br class="">
>> Here are the details:<br class="">
>><br class="">
>> We’re checking the following two locations:<br class="">
>><br class="">
>> (lldb) p ((llvm::Instruction*)(LocA.Ptr))->dump()<br class="">
>> %arrayidx.i = getelementptr inbounds [1 x %struct.elt_list*], [1 x<br class="">
>> %struct.elt_list*]* %te.i, i64 0, i64 %indvars.iv.i<br class="">
>> (lldb) p ((llvm::Instruction*)(LocB.Ptr))->dump()<br class="">
>> @reg_values = internal unnamed_addr global %struct.varray_head_tag* null,<br class="">
>> align 8<br class="">
>><br class="">
>> and the function in question is “cselib_init”:<br class="">
>> (lldb) p<br class="">
>> ((llvm::Instruction*)(LocA.Ptr))->getParent()->getParent()->getName()<br class="">
>> (llvm::StringRef) $3 = (Data = "cselib_init", Length = 11)<br class="">
>><br class="">
>> Corresponding underlying values:<br class="">
>> (lldb) p UV2->dump()<br class="">
>> @reg_values = internal unnamed_addr global %struct.varray_head_tag* null,<br class="">
>> align 8<br class="">
>> (lldb) p UV1->dump()<br class="">
>> %32 = load %struct.varray_head_tag*, %struct.varray_head_tag** @reg_values,<br class="">
>> align 8, !tbaa !2<br class="">
>><br class="">
>> Backtrace:<br class="">
>> (lldb) bt<br class="">
>> * thread #1: tid = 0x120baaf, 0x00000001038b752a libLTO.dylib`(anonymous<br class="">
>> namespace)::GlobalsModRef::alias(this=0x000000010eba5c10,<br class="">
>> LocA=0x00007fff5fbf4198, LocB=0x00007fff5fbf6268) + 570 at<br class="">
>> GlobalsModRef.cpp:519, queue = 'com.apple.main-thread', stop reason = step<br class="">
>> over<br class="">
>> * frame #0: 0x00000001038b752a libLTO.dylib`(anonymous<br class="">
>> namespace)::GlobalsModRef::alias(this=0x000000010eba5c10,<br class="">
>> LocA=0x00007fff5fbf4198, LocB=0x00007fff5fbf6268) + 570 at<br class="">
>> GlobalsModRef.cpp:519<br class="">
>> frame #1: 0x00000001038b82f7 libLTO.dylib`non-virtual thunk to<br class="">
>> (anonymous namespace)::GlobalsModRef::alias(this=0x000000010eba5c30,<br class="">
>> LocA=0x00007fff5fbf4198, LocB=0x00007fff5fbf6268) + 55 at<br class="">
>> GlobalsModRef.cpp:562<br class="">
>> frame #2: 0x00000001038d6aa8<br class="">
>> libLTO.dylib`llvm::AliasAnalysis::getModRefInfo(this=0x000000010eba5c30,<br class="">
>> S=0x000000010a1a22e0, Loc=0x00007fff5fbf6268) + 120 at AliasAnalysis.cpp:288<br class="">
>> frame #3: 0x0000000103a0a814<br class="">
>> libLTO.dylib`llvm::MemoryDependenceAnalysis::getPointerDependencyFrom(this=0x000000010e6cf0c0,<br class="">
>> MemLoc=0x00007fff5fbf6268, isLoad=true, ScanIt=llvm::BasicBlock::iterator at<br class="">
>> 0x00007fff5fbf4390, BB=0x000000010a19ffa0, QueryInst=0x000000010a1a20c8) +<br class="">
>> 1908 at MemoryDependenceAnalysis.cpp:570<br class="">
>> frame #4: 0x0000000103a0ffa5<br class="">
>> libLTO.dylib`llvm::MemoryDependenceAnalysis::GetNonLocalInfoForBlock(this=0x000000010e6cf0c0,<br class="">
>> QueryInst=0x000000010a1a20c8, Loc=0x00007fff5fbf6268, isLoad=true,<br class="">
>> BB=0x000000010a19ffa0, Cache=0x0000000100f9d568, NumSortedEntries=0) + 2165<br class="">
>> at MemoryDependenceAnalysis.cpp:965<br class="">
>> frame #5: 0x0000000103a0e3a9<br class="">
>> libLTO.dylib`llvm::MemoryDependenceAnalysis::getNonLocalPointerDepFromBB(this=0x000000010e6cf0c0,<br class="">
>> QueryInst=0x000000010a1a20c8, Pointer=0x00007fff5fbf62a8,<br class="">
>> Loc=0x00007fff5fbf6268, isLoad=true, StartBB=0x000000010a19ffa0,<br class="">
>> Result=0x00007fff5fbf6bf0, Visited=0x00007fff5fbf6208, SkipFirstBlock=false)<br class="">
>> + 5897 at MemoryDependenceAnalysis.cpp:1200<br class="">
>> frame #6: 0x0000000103a0cb3b<br class="">
>> libLTO.dylib`llvm::MemoryDependenceAnalysis::getNonLocalPointerDependency(this=0x000000010e6cf0c0,<br class="">
>> QueryInst=0x000000010a1a20c8, Result=0x00007fff5fbf6bf0) + 635 at<br class="">
>> MemoryDependenceAnalysis.cpp:911<br class="">
>> frame #7: 0x000000010340c5b5 libLTO.dylib`(anonymous<br class="">
>> namespace)::GVN::processNonLocalLoad(this=0x000000010e6ce680,<br class="">
>> LI=0x000000010a1a20c8) + 101 at GVN.cpp:1706<br class="">
>> frame #8: 0x0000000103408eef libLTO.dylib`(anonymous<br class="">
>> namespace)::GVN::processLoad(this=0x000000010e6ce680, L=0x000000010a1a20c8)<br class="">
>> + 1551 at GVN.cpp:1905<br class="">
>> frame #9: 0x00000001034080fd libLTO.dylib`(anonymous<br class="">
>> namespace)::GVN::processInstruction(this=0x000000010e6ce680,<br class="">
>> I=0x000000010a1a20c8) + 397 at GVN.cpp:2220<br class="">
>> frame #10: 0x0000000103407d1b libLTO.dylib`(anonymous<br class="">
>> namespace)::GVN::processBlock(this=0x000000010e6ce680,<br class="">
>> BB=0x000000010a19ffa0) + 251 at GVN.cpp:2394<br class="">
>> frame #11: 0x0000000103401755 libLTO.dylib`(anonymous<br class="">
>> namespace)::GVN::iterateOnFunction(this=0x000000010e6ce680,<br class="">
>> F=0x00000001085f69f8) + 1541 at GVN.cpp:2677<br class="">
>> frame #12: 0x0000000103400fef libLTO.dylib`(anonymous<br class="">
>> namespace)::GVN::runOnFunction(this=0x000000010e6ce680,<br class="">
>> F=0x00000001085f69f8) + 623 at GVN.cpp:2352<br class="">
>> frame #13: 0x00000001027cd05b<br class="">
>> libLTO.dylib`llvm::FPPassManager::runOnFunction(this=0x000000010eba6810,<br class="">
>> F=0x00000001085f69f8) + 427 at LegacyPassManager.cpp:1520<br class="">
>> frame #14: 0x00000001027cd375<br class="">
>> libLTO.dylib`llvm::FPPassManager::runOnModule(this=0x000000010eba6810,<br class="">
>> M=0x000000010115c5f0) + 117 at LegacyPassManager.cpp:1540<br class="">
>> frame #15: 0x00000001027cdda1 libLTO.dylib`(anonymous<br class="">
>> namespace)::MPPassManager::runOnModule(this=0x000000010e6cbaf0,<br class="">
>> M=0x000000010115c5f0) + 1409 at LegacyPassManager.cpp:1596<br class="">
>> frame #16: 0x00000001027cd636<br class="">
>> libLTO.dylib`llvm::legacy::PassManagerImpl::run(this=0x000000010e6cb740,<br class="">
>> M=0x000000010115c5f0) + 310 at LegacyPassManager.cpp:1698<br class="">
>> frame #17: 0x00000001027ce521<br class="">
>> libLTO.dylib`llvm::legacy::PassManager::run(this=0x00007fff5fbf82b8,<br class="">
>> M=0x000000010115c5f0) + 33 at LegacyPassManager.cpp:1729<br class="">
>><br class="">
>><br class="">
>> The function body is in the attached file.<br class="">
>><br class="">
>><br class="">
>><br class="">
>> GlobalsModRef reports NoAlias for this pair, here:<br class="">
>> if (GV1 || GV2) {<br class="">
>> // If the global's address is taken, pretend we don't know it's a pointer to<br class="">
>> // the global.<br class="">
>> if (GV1 && !NonAddressTakenGlobals.count(GV1))<br class="">
>> GV1 = nullptr;<br class="">
>> if (GV2 && !NonAddressTakenGlobals.count(GV2))<br class="">
>> GV2 = nullptr;<br class="">
>><br class="">
>> // If the two pointers are derived from two different non-addr-taken<br class="">
>> // globals, or if one is and the other isn't, we know these can't alias.<br class="">
>> if ((GV1 || GV2) && GV1 != GV2)<br class="">
>> return NoAlias;<br class="">
>><br class="">
>> // Otherwise if they are both derived from the same addr-taken global, we<br class="">
>> // can't know the two accesses don't overlap.<br class="">
>> }<br class="">
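For readers without the LLVM sources at hand, the decision quoted above can be restated as a small standalone sketch. The types Global and AliasResult and the function globalsAliasSketch are simplified, hypothetical stand-ins rather than the real LLVM classes, and the AddressTaken flag replaces the lookup in NonAddressTakenGlobals.<br class="">
<pre class="">
```cpp
// Simplified stand-ins for the LLVM types; all names here are hypothetical.
enum class AliasResult { NoAlias, MayAlias };

struct Global {
  const char *Name;
  bool AddressTaken; // false models membership in NonAddressTakenGlobals
};

// GV1 and GV2 are the globals the two pointers were traced back to, or
// nullptr when the underlying object is not a global (for example a load).
AliasResult globalsAliasSketch(const Global *GV1, const Global *GV2) {
  if (GV1 || GV2) {
    // An address-taken global tells us nothing, so forget about it.
    if (GV1 && GV1->AddressTaken)
      GV1 = nullptr;
    if (GV2 && GV2->AddressTaken)
      GV2 = nullptr;

    // Two different non-address-taken globals, or one such global against
    // anything else, are reported as not aliasing.
    if ((GV1 || GV2) && GV1 != GV2)
      return AliasResult::NoAlias;
  }
  // Otherwise this check has no answer; model the fall-through as MayAlias.
  return AliasResult::MayAlias;
}
```
</pre>
In the query from this thread, one underlying value is a load (so no global on that side) and the other is the non-address-taken global reg_values, which is the "one is and the other isn't" case that yields NoAlias.<br class="">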
>><br class="">
>><br class="">
>> Thanks,<br class="">
>> Michael<br class="">
>><br class="">
>> On Jul 17, 2015, at 12:18 PM, Chandler Carruth <<a href="mailto:chandlerc@gmail.com" class="">chandlerc@gmail.com</a>> wrote:<br class="">
>><br class="">
>> On Fri, Jul 17, 2015 at 9:13 AM Evgeny Astigeevich<br class="">
>> <<a href="mailto:evgeny.astigeevich@arm.com" class="">evgeny.astigeevich@arm.com</a>> wrote:<br class="">
>>><br class="">
>>> It’s Dhrystone.<br class="">
>><br class="">
>> Dhrystone has historically not been a good indicator of real-world<br class="">
>> performance fluctuations, especially at this small of a shift.<br class="">
>><br class="">
>> I'd like to see if we see any fluctuation on larger and more realistic<br class="">
>> application benchmarks. One advantage of the flag being set is that we<br class="">
>> should get runs from folks who have automatic builds and runs periodically<br class="">
>> from trunk. Those should help give an accurate picture.<br class="">
>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> From: Chandler Carruth [mailto:<a href="mailto:chandlerc@gmail.com" class="">chandlerc@gmail.com</a>]<br class="">
>>> Sent: 17 July 2015 16:10<br class="">
>>><br class="">
>>><br class="">
>>> To: Evgeny Astigeevich; Chandler Carruth<br class="">
>>> Cc: LLVM Developers Mailing List<br class="">
>>><br class="">
>>> Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Can you say what Benchmark or give a test case so we understand the nature<br class="">
>>> of the regression? As Gerolf said, that will be important to understand what<br class="">
>>> is best to do.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> On Fri, Jul 17, 2015, 06:43 Evgeny Astigeevich<br class="">
>>> <<a href="mailto:Evgeny.Astigeevich@arm.com" class="">Evgeny.Astigeevich@arm.com</a>> wrote:<br class="">
>>><br class="">
>>> Yes, the regression is stable. I double checked this. A full benchmark run<br class="">
>>> consists of at least 10 sub-runs to validate the score.<br class="">
>>><br class="">
>>> I also checked whether this benchmark regressed across<br class="">
>>> different ARM hardware versions. All regressions of this benchmark<br class="">
>>> were in the range 1.6%-2%.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Kind regards,<br class="">
>>><br class="">
>>> Evgeny Astigeevich<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> From: Chandler Carruth [mailto:<a href="mailto:chandlerc@gmail.com" class="">chandlerc@gmail.com</a>]<br class="">
>>> Sent: 17 July 2015 07:52<br class="">
>>> To: Evgeny Astigeevich; Chandler Carruth<br class="">
>>> Cc: LLVM Developers Mailing List; Michael Zolotukhin<br class="">
>>><br class="">
>>><br class="">
>>> Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Hey, thanks for benchmarking.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> How stable is the 2% regression?<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Michael ran some benchmarks with GlobalsModRef completely disabled and the<br class="">
>>> only differences were in the noise. This was a complete spec2k6 run along<br class="">
>>> with some others. Based on the number of benchmarks run there, I'm going to<br class="">
>>> go ahead and submit these patches, but if you can clarify the impact here,<br class="">
>>> we can look at potentially some other tradeoff. I'm not particularly set on<br class="">
>>> one set of defaults, etc, I just don't want to keep patches held up based on<br class="">
>>> that. We can flip the default back and forth as new data arrives.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> On Thu, Jul 16, 2015 at 12:23 PM Evgeny Astigeevich<br class="">
>>> <<a href="mailto:evgeny.astigeevich@arm.com" class="">evgeny.astigeevich@arm.com</a>> wrote:<br class="">
>>><br class="">
>>> Hi Chandler,<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I ran a couple of benchmarks with LTO turned on and your patches on ARM<br class="">
>>> hardware.<br class="">
>>><br class="">
>>> There was no performance degradation on one benchmark and a 2% slowdown on<br class="">
>>> another benchmark.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Kind regards,<br class="">
>>><br class="">
>>> Evgeny Astigeevich<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> From: <a href="mailto:llvmdev-bounces@cs.uiuc.edu" class="">llvmdev-bounces@cs.uiuc.edu</a> [mailto:<a href="mailto:llvmdev-bounces@cs.uiuc.edu" class="">llvmdev-bounces@cs.uiuc.edu</a>] On<br class="">
>>> Behalf Of Evgeny Astigeevich<br class="">
>>> Sent: 15 July 2015 15:12<br class="">
>>><br class="">
>>><br class="">
>>> To: 'Chandler Carruth'; Gerolf Hoflehner<br class="">
>>> Cc: LLVM Developers Mailing List<br class="">
>>> Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Hi Chandler,<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I would like to run some benchmarks on ARM hardware and look at the impact<br class="">
>>> of your patches on LTO.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Kind regards,<br class="">
>>><br class="">
>>> Evgeny Astigeevich<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> From: <a href="mailto:llvmdev-bounces@cs.uiuc.edu" class="">llvmdev-bounces@cs.uiuc.edu</a> [mailto:<a href="mailto:llvmdev-bounces@cs.uiuc.edu" class="">llvmdev-bounces@cs.uiuc.edu</a>] On<br class="">
>>> Behalf Of Chandler Carruth<br class="">
>>><br class="">
>>><br class="">
>>> Sent: 15 July 2015 10:45<br class="">
>>> To: Chandler Carruth; Gerolf Hoflehner<br class="">
>>> Cc: LLVM Developers Mailing List<br class="">
>>> Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I've fixed the obvious bugs I spotted in r242281. These should be pure<br class="">
>>> correctness improvements.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I've sent the two patches I'm imagining to address the core issue here:<br class="">
>>><br class="">
>>> <a href="http://reviews.llvm.org/D11213" rel="noreferrer" target="_blank" class="">http://reviews.llvm.org/D11213</a><br class="">
>>><br class="">
>>> <a href="http://reviews.llvm.org/D11214" rel="noreferrer" target="_blank" class="">http://reviews.llvm.org/D11214</a><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Currently, I have the unsafe alias results disabled by default, but with a<br class="">
>>> flag that can re-enable them if needed. I don't feel really strongly about<br class="">
>>> which way the default is set -- but that may be because I don't have lots of<br class="">
>>> users relying on LTO. I'll let others indicate which way they would be most<br class="">
>>> comfortable.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Some IRC conversations indicated that early benchmark results with GMR<br class="">
>>> completely disabled weren't showing really significant swings, so maybe this<br class="">
>>> relatively small reduction in power of GMR won't be too problematic for<br class="">
>>> folks. Either way, I'm open to the different approaches. It's D11214 that I<br class="">
>>> care a lot about. =]<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> Thanks for all the thoughts here!<br class="">
>>><br class="">
>>> -Chandler<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> On Tue, Jul 14, 2015 at 11:25 PM Chandler Carruth <<a href="mailto:chandlerc@gmail.com" class="">chandlerc@gmail.com</a>><br class="">
>>> wrote:<br class="">
>>><br class="">
>>> Replying here, but several of the questions raised boil down to "couldn't<br class="">
>>> you make the usage of GetUnderlyingObject conservatively correct?". I'll try<br class="">
>>> and address that.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I think this *is* the right approach, but I think it is very hard to do<br class="">
>>> without effectively disabling this part of GlobalsModRef. That is, the easy<br class="">
>>> ways are likely to fire very frequently IMO.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> The core idea is to detect a "no information" state coming out of<br class="">
>>> GetUnderlyingObject (likely by providing a custom version just for<br class="">
>>> GlobalsModRef and tailored to its use cases). This is particularly effective<br class="">
>>> at avoiding the problems with the recursion limit. But let's look at what<br class="">
>>> cases we *wouldn't* return that. Here are the cases I see when I thought<br class="">
>>> about this last night with Hal, roughly in descending likelihood I would<br class="">
>>> guess:<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> 1) We detect some global or an alloca. In that case, even BasicAA would be<br class="">
>>> sufficient to provide no-alias. GMR shouldn't be relevant.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> 2) We detect a phi, select, or inttoptr, and stop there.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> 3) We detect a load and stop there.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> 4) We detect a return from a function.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> 5) We detect an argument to the function.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I strongly suspect the vast majority of queries hit #1. That's why BasicAA<br class="">
>>> is *so* effective. Both #4 and #5 I think are actually reasonable places for<br class="">
>>> GMR to potentially say "no-alias" and provide useful definitive information.<br class="">
>>> But I also suspect these are the least common.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> So let's look at #2 and #3 because I think they're interesting. For these,<br class="">
>>> I think it is extremely hard to return "no-alias". It seems extremely easy<br class="">
>>> for a reasonable and innocuous change to the IR to introduce a phi or a<br class="">
>>> select into one side of the GetUnderlyingObject but not the other. If that<br class="">
>>> ever happens, we can't return "no-alias" for #2, or we need to add really<br class="">
>>> expensive updates. It also seems reasonable (if not as likely) to want<br class="">
>>> adding a store and load to the IR to not trigger a miscompile. If it is<br class="">
>>> possible for a valid optimization pass to do reg2mem on an SSA value, then<br class="">
>>> that could happen to only one side of the paired GetUnderlyingObject and<br class="">
>>> break GMR with #3. If that seems like an unreasonable thing to do, consider<br class="">
>>> loop re-rolling or other transformations which may need to take things in<br class="">
>>> SSA form and move them out of SSA form. Even if we then try immediately to<br class="">
>>> put it back *into* SSA form, before we do that we create a point where GMR<br class="">
>>> cannot correctly return no-alias.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> So ultimately, I don't think we want to rely on GMR returning "no-alias"<br class="">
>>> for either #2 or #3 because of the challenge of actually updating it in all<br class="">
>>> of the circumstances that could break them. That means that *only* #4 and #5<br class="">
>>> are going to return "no-alias" usefully. And even then, function inlining<br class="">
>>> and function outlining both break #4 and #5, so you have to preclude those<br class="">
>>> transforms while GMR is active. And I have serious doubts about these<br class="">
>>> providing enough value to be worth the cost.<br class="">
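The five cases above and the conclusion drawn from them can be summarized in a small table-like sketch. The enum and function names (UnderlyingKind, GMRVerdict, gmrVerdictFor) are hypothetical and exist only to restate the argument of this message; they are not part of LLVM.<br class="">
<pre class="">
```cpp
// What a GetUnderlyingObject-style walk stopped at (cases 1 to 5 above).
enum class UnderlyingKind {
  GlobalOrAlloca,      // case 1
  PhiSelectOrIntToPtr, // case 2
  Load,                // case 3
  CallReturn,          // case 4
  Argument             // case 5
};

// How much GlobalsModRef could contribute, according to the argument above.
enum class GMRVerdict {
  LeaveToBasicAA,  // BasicAA already suffices; GMR adds nothing
  NoInformation,   // too fragile under IR changes to claim no-alias
  MayProveNoAlias  // plausible place for GMR to give a definitive answer
};

GMRVerdict gmrVerdictFor(UnderlyingKind K) {
  switch (K) {
  case UnderlyingKind::GlobalOrAlloca:
    return GMRVerdict::LeaveToBasicAA;
  case UnderlyingKind::PhiSelectOrIntToPtr:
  case UnderlyingKind::Load:
    return GMRVerdict::NoInformation;
  case UnderlyingKind::CallReturn:
  case UnderlyingKind::Argument:
    return GMRVerdict::MayProveNoAlias;
  }
  return GMRVerdict::NoInformation; // conservative default
}
```
</pre>
Note that even the last two cases are only sound while inlining and outlining are precluded, as the message points out.<br class="">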
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I think the better way to approach this is the other way around. Rather<br class="">
>>> than doing a backwards analysis to see if one location reaches a global<br class="">
>>> and the other location doesn't reach a global, I think it would be much more<br class="">
>>> effective to re-cast this as a forward analysis that determines all the<br class="">
>>> memory locations in a function that come from outside the function, and use<br class="">
>>> that to drive the no-alias responses.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> On Tue, Jul 14, 2015 at 12:12 PM Gerolf Hoflehner <<a href="mailto:ghoflehner@apple.com" class="">ghoflehner@apple.com</a>><br class="">
>>> wrote:<br class="">
>>><br class="">
>>> I wouldn’t be willing to give up performance for hypothetical issues.<br class="">
>>> Please protect all your changes with options. For some of your concerns it<br class="">
>>> is probably hard to provide a test case that shows an/the actual issue.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I certainly agree that it will be very hard to provide a test case and<br class="">
>>> extremely rare to see this in the wild for most of these issues. As long as<br class="">
>>> I can remove the problematic update API we currently have (which, as it's an<br class="">
>>> API change, can't really be put behind flags), I'm happy to have flags<br class="">
>>> control whether or not GMR uses the unsound / stale information to try to<br class="">
>>> answer alias queries. Do you have any opinion about what the default value<br class="">
>>> of the flags should be?<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> I'll go ahead and prepare the patches, as it seems like we're all ending<br class="">
>>> up in the same position, and just wondering about the precise tradeoffs we<br class="">
>>> want to settle on.<br class="">
>>><br class="">
>>><br class="">
>>><br class="">
>>> _______________________________________________<br class="">
>>> LLVM Developers mailing list<br class="">
>>><br class="">
>>> <a href="mailto:LLVMdev@cs.uiuc.edu" class="">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu/" rel="noreferrer" target="_blank" class="">http://llvm.cs.uiuc.edu</a><br class="">
>>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" rel="noreferrer" target="_blank" class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br class="">
>>><br class="">
>><br class="">
>><br class="">
<br class="">
<br class="">
</div></div></blockquote></div><br class=""></div>
</div></blockquote></div><br class=""></body></html>