<div dir="ltr"><a class="GWVZpf gW" id="IloFPc-1" href="mailto:jholewinski@nvidia.com" tabindex="-1">+Justin Holewinski</a> <br><br><div class="gmail_quote"><div dir="ltr">On Thu, Jul 2, 2015 at 4:13 AM Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">----- Original Message -----<br>
> From: "Bjarke Roune" <<a href="mailto:broune@google.com" target="_blank">broune@google.com</a>><br>
> To: "Hal Finkel" <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>><br>
> Cc: "LLVM Developers Mailing List" <<a href="mailto:llvmdev@cs.uiuc.edu" target="_blank">llvmdev@cs.uiuc.edu</a>>, "Jingyue Wu" <<a href="mailto:jingyue@google.com" target="_blank">jingyue@google.com</a>><br>
> Sent: Wednesday, July 1, 2015 8:02:10 PM<br>
> Subject: Re: [LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution<br>
><br>
><br>
><br>
><br>
> On Wed, Jul 1, 2015 at 3:07 PM, Hal Finkel < <a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a> > wrote:<br>
><br>
><br>
> ----- Original Message -----<br>
> > From: "Bjarke Roune" < <a href="mailto:broune@google.com" target="_blank">broune@google.com</a> ><br>
><br>
><br>
> > To: "Hal Finkel" < <a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a> ><br>
> > Cc: <a href="mailto:llvmdev@cs.uiuc.edu" target="_blank">llvmdev@cs.uiuc.edu</a> , "Jingyue Wu" < <a href="mailto:jingyue@google.com" target="_blank">jingyue@google.com</a> ><br>
> > Sent: Wednesday, July 1, 2015 2:27:59 PM<br>
> > Subject: Re: [LLVMdev] Deriving undefined behavior from<br>
> > nsw/inbounds/poison for scalar evolution<br>
> ><br>
> ><br>
> > On Tue, Jun 30, 2015 at 6:24 PM, Hal Finkel < <a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a> ><br>
> > wrote: [...]<br>
> > What benefit to you get from listing i64 as a legal integer width<br>
> > in<br>
> > the DataLayout for NVPTX?<br>
> ><br>
> > -Hal<br>
> ><br>
> > LSR only considers legal widths, so I think that then we could not<br>
> > generate a 64 bit induction variable to get a pointer induction<br>
> > variable if we make 64 bit illegal. From IVUsers.cpp:<br>
><br>
> I understand, but LSR is run very late, and already considers various<br>
> target-specific costs (addressing modes, etc.). LSR can be fixed<br>
> easily if that's the only relevant issue (and maybe LSR should be<br>
> doing this anyway for 64-bit types on 32-bit systems?). Do you get<br>
> any benefit from the 'canonicalizing' transformation passes from<br>
> having i64 being listed as a legal integer width in DataLayout?<br>
><br>
> I agree that LSR should be doing that anyway. I think we originally<br>
> made i64 legal because "illegal" sounds more serious than just "not<br>
> the ideal size for this hardware" and PTX does have an i64 type,<br>
> it's just slower and takes more register space after compilation to<br>
> SASS. I did a quick review of all the calls to isLegalInteger,<br>
> fitsInLegalInteger, getLargestLegalIntTypeSize,<br>
> getSmallestLegalIntType. Here's the parts that jumped out at me and<br>
> my hunch about whether its better for i64 to be legal or illegal in<br>
> that place for PTX:<br>
><br>
> 1) Strongly favors legal: LSR only considers legal induction<br>
> variables.<br>
> 2) Favors legal: In FindLoopCounter from IndVarSimplify, I think it<br>
> should be ok to find an i64 loop counter if there is nothing better<br>
> to be found. 3) Favors legal: In SROA, we should probably prefer i64<br>
> to non-virtual-register alternatives that are harder to analyze.<br>
> 4) Favors illegal: InstCombineCasts folds casts into PHIs in some<br>
> limited circumstances for legal types. It's probably better not to<br>
> do that for i64 for PTX.<br>
><br>
> 5) Slightly favors legal: For<br>
> TargetTransformInfoImplBase::getOperationCost, IntToPtr and PtrToInt<br>
> and truncate should probably still be free for i64, since the<br>
> pointer is 64 bit too.<br>
> 6) Slightly favors legal: In combineLoadToOperationType from<br>
> InstCombineLoadStoreAlloca.cpp, there's really no reason that i64<br>
> cannot be used, since it would be replacing another 64-bit type<br>
> (e.g. double) that also takes 2 registers.<br>
><br>
><br>
> On the whole, having i64 legal seems better as things stand, though<br>
> if some of these (especially 1) were changed to deal differently<br>
> with illegal types, then making i64 illegal might be marginally<br>
> better due to 4.<br>
><br>
<br>
Ah, this is very interesting.<br>
<br>
My underlying view here is that since i64 is not really a legal type in the hardware, we should not mark it as a legal type. What differs here from the normal situation is where the legalization is done. Normally we have the SDAG do this, but in this case, the legalization is done by some other part of the software stack post-LLVM. However, since the IR level optimizers should not contain explicit knowledge of how legalization is done later, this difference should be irrelevant at the IR level.<br>
<br>
I think that the correct approach here is to "fix" the few things you've identified so that they work reasonably for 32-bit architectures with 64-bit pointers, and then remove i64 from the set of legal integer types in DataLayout for NVPTX. I suspect this will yield the best end result for everyone.<br>
<br>
Thanks again,<br>
Hal<br>
<br>
><br>
><br>
> Bjarke<br>
><br>
<br>
--<br>
Hal Finkel<br>
Assistant Computational Scientist<br>
Leadership Computing Facility<br>
Argonne National Laboratory<br>
</blockquote></div></div>