<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 30, 2015 at 6:24 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">----- Original Message -----<br>

> From: "Bjarke Roune" <<a href="mailto:broune@google.com">broune@google.com</a>><br>

</span><span class="">> To: "Jingyue Wu" <<a href="mailto:jingyue@google.com">jingyue@google.com</a>><br>

> Cc: <a href="mailto:llvmdev@cs.uiuc.edu">llvmdev@cs.uiuc.edu</a><br>

> Sent: Tuesday, June 30, 2015 8:16:13 PM<br>

> Subject: Re: [LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution<br>

><br>

> Hi Adam,<br>

><br>

> Jingyue is right. We need to keep things in 32 bits because 64 bit<br>

> arithmetic is more expensive and because one 64 bit register<br>

> consumes two 32 bit registers.<br>

><br>

<br>

</span>What benefit do you get from listing i64 as a legal integer width in the DataLayout for NVPTX?<br>

<br>

 -Hal<br></blockquote><div><br></div><div>That's a good call. I had misunderstood what DataLayout::isLegalInteger means: I thought that not listing i64 as legal would prevent codegen from emitting i64 at all. We will see how things go with i64 removed from the datalayout string. </div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class=""><br>

><br>

><br>

> To add a bit more background: we would often emit worse code if we<br>

> widened in indvars and then narrowed in the NVPTX backend later<br>

> because we often would not be able to narrow AFAICT. Consider this<br>

> general pattern where everything is 32 bit:<br>

><br>

><br>

> for (int i = a; i < b; i += s) {<br>

> // ...<br>

> }<br>

><br>

><br>

> Suppose we widen i to be 64 bit:<br>

><br>

><br>

><br>

> for (int64 i = a; i < b; i += s) {<br>

> // ...<br>

> }<br>

><br>

><br>

> As an example, suppose a = 0, b = INT_MAX, s = 2. The final value of<br>

> i that makes the loop terminate would then be INT_MAX+1, so we<br>

> cannot narrow i to 32 bits. To narrow in general, we have to prove<br>

> that a, b and s take on only values where narrowing is sound. That's<br>

> often not possible. I suppose an alternative would be to issue an<br>

> assume intrinsic restricting the range of i, though I prefer making<br>

> scalar evolution more powerful since that should be more generally<br>

> useful.<br>
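<br>
To make the example above concrete, here is a minimal C++ sketch (not from the original thread) that computes the exit value of the widened induction variable in closed form and checks whether it still fits in 32 bits. The helper names final_widened_iv and fits_in_i32 are hypothetical, and the sketch assumes s > 0.<br>

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper: exit value of i for the widened loop
//   for (int64 i = a; i < b; i += s) { ... }
// computed in closed form instead of iterating. Assumes s > 0.
int64_t final_widened_iv(int32_t a, int32_t b, int32_t s) {
  assert(s > 0);
  if (a >= b) return a;                         // loop body never runs
  int64_t span = static_cast<int64_t>(b) - a;   // distance to cover, in 64 bits
  int64_t trips = (span + s - 1) / s;           // ceil(span / s) iterations
  return a + trips * s;                         // first value with i >= b
}

// Hypothetical helper: true if v is representable as a signed 32-bit value,
// i.e. narrowing the induction variable back to i32 would not lose the value.
bool fits_in_i32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}
```

With a = 0, b = INT_MAX, s = 2 the exit value is INT_MAX + 1, so fits_in_i32 returns false and the variable cannot be narrowed. With a = 0, b = 10, s = 3 the exit value is 12 and narrowing would be sound.<br>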

><br>

><br>

> Bjarke<br>

><br>

><br>

><br>

><br>

><br>

> On Mon, Jun 29, 2015 at 8:57 PM, Jingyue Wu < <a href="mailto:jingyue@google.com">jingyue@google.com</a> ><br>

> wrote:<br>

><br>

><br>

><br>

> Hi Adam,<br>

><br>

><br>

> Indvar widening can sometimes be harmful for architectures (e.g.<br>

> NVPTX and AMDGPU) where wider integer operations are more expensive<br>

</span>> ( <a href="https://llvm.org/bugs/show_bug.cgi?id=21148" rel="noreferrer" target="_blank">https://llvm.org/bugs/show_bug.cgi?id=21148</a> ). For this reason, we<br>

<div class="HOEnZb"><div class="h5">> disabled indvar widening in NVPTX in <a href="http://reviews.llvm.org/D6196" rel="noreferrer" target="_blank">http://reviews.llvm.org/D6196</a> .<br>

><br>

><br>

> Hope it helps.<br>

><br>

><br>

> Jingyue<br>

><br>

><br>

><br>

><br>

> On Mon, Jun 29, 2015 at 11:59 AM Adam Nemet < <a href="mailto:anemet@apple.com">anemet@apple.com</a> ><br>

> wrote:<br>

><br>

><br>

><br>

> > On Jun 26, 2015, at 4:01 PM, Bjarke Roune < <a href="mailto:broune@google.com">broune@google.com</a> ><br>

> > wrote:<br>

> ><br>

> > *** Summary<br>

> > I'd like to propose (and implement) functionality in LLVM to<br>

> > determine when a poison value from an instruction is guaranteed to<br>

> > produce undefined behavior. I want to use that to improve handling<br>

> > of nsw, inbounds etc. flags in scalar evolution and LSR. I imagine<br>

> > that there would be other uses for it. I'd like feedback on this<br>

> > idea before I proceed with it.<br>

> ><br>

> ><br>

> > *** Details<br>

> > Poison values do produce undefined behavior if the poison becomes<br>

> > externally observable. A load or store to a poison address value<br>

> > is externally observable and I'd like to use that in a simple<br>

> > analysis pass to derive guarantees that certain overflows would<br>

> > produce undefined behavior, not just poison.<br>

> ><br>

> > Scalar evolution (and hence LSR) cannot currently make much use of<br>

> > the nsw and similar flags on instructions. That is because two<br>

> > instructions can map to the same scev even if one instruction has<br>

> > the nsw flag and the other one does not. If we applied the nsw<br>

> > flag to the scev, the scev for the instruction without the nsw<br>

> > flag would then incorrectly have the nsw flag.<br>

> ><br>

> > Scalar evolution would be able to use the nsw flag from an<br>

> > instruction for recurrences when the loop header dominates the<br>

> > entire loop, the instruction with nsw post-dominates the loop<br>

> > header and undefined behavior is guaranteed on wrap via the poison<br>

> > value analysis pass that I'd like to write.<br>

> ><br>

> > What do you think? Do we already have something similar to this?<br>

> ><br>

> > Bjarke<br>

> ><br>

> ><br>

> ><br>

> > *** PS: What got me thinking about this:<br>

> > My immediate motivation is that I'd like LSR to be able to create<br>

> > induction variables for expressions like &ptr[i + offset] where i<br>

> > and offset are 32 bit integers, ptr is a loop-invariant 64 bit<br>

> > pointer, i is an induction variable and offset is loop-invariant.<br>

> > For that to happen, scalar evolution needs to propagate the nsw<br>

> > flag from i + offset to the scev so that it can transform<br>

> ><br>

> > ((4 * (sext i32 {%offset,+,1}<nw><%loop> to i64))<nsw> + %ptr)<nsw><br>

> ><br>

> > to<br>

> ><br>

> > {((4 * (sext i32 %offset to i64)) + %ptr),+,4}<nsw><%loop><br>

><br>

> I guess what I am missing here is why indvars does not create an i64<br>

> induction variable for this?<br>

><br>

> Adam<br>

><br>

><br>

> ><br>

> > Currently the inner <nsw> is actually <nw>, which blocks the<br>

> > transformation (the outer two nsw's shouldn't currently be there<br>

> > either, as it's the same issue for inbounds on GEP: see llvm bug<br>

> > 23527)<br>
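<br>
A minimal C++ sketch (not from the original thread) of why the nsw flag matters for the rewrite above: sign extension distributes over a 32-bit add only when that add does not wrap. The function names are hypothetical, and the wrapping add is done in unsigned arithmetic so the sketch itself has no undefined behavior.<br>

```cpp
#include <cassert>
#include <cstdint>

// 32-bit add with two's-complement wrap-around. The unsigned add is well
// defined; the cast back to int32_t is modular in C++20 and on mainstream
// compilers before that.
int32_t wrapping_add_i32(int32_t x, int32_t y) {
  return static_cast<int32_t>(static_cast<uint32_t>(x) +
                              static_cast<uint32_t>(y));
}

// Byte offset as computed in the original form: 4 * sext(i + offset),
// where the add happens in 32 bits before the sign extension.
int64_t offset_narrow_add(int32_t i, int32_t offset) {
  return 4 * static_cast<int64_t>(wrapping_add_i32(i, offset));
}

// Byte offset after the rewrite: 4 * (sext(i) + sext(offset)),
// where both operands are extended first and the add happens in 64 bits.
int64_t offset_wide_add(int32_t i, int32_t offset) {
  return 4 * (static_cast<int64_t>(i) + static_cast<int64_t>(offset));
}
```

For i = 5 and offset = 7 both forms give 48. For i = INT32_MAX and offset = 1 the 32-bit add wraps, so the narrow form gives 4 * INT32_MIN while the wide form gives 4 * 2^31. The two forms agree only when the wrap is ruled out.<br>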

> > _______________________________________________<br>

> > LLVM Developers mailing list<br>

> > <a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" rel="noreferrer" target="_blank">http://llvm.cs.uiuc.edu</a><br>

> > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" rel="noreferrer" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

><br>

><br>


><br>

><br>


><br>

<br>

</div></div><div class="HOEnZb"><div class="h5">--<br>

Hal Finkel<br>

Assistant Computational Scientist<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</div></div></blockquote></div><br></div></div>