Thanks for letting me know. I noticed that and fixed it in r221773 last night. <div><br></div><div>Jingyue<br><br><div class="gmail_quote">On Tue Nov 11 2014 at 11:14:04 PM NAKAMURA Takumi <<a href="mailto:geek4civic@gmail.com">geek4civic@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It fails, w/o any warnings, if TARGETS_TO_BUILD doesn't have NVPTX.<br>
<br>
2014-11-12 15:58 GMT+09:00 Jingyue Wu <<a href="mailto:jingyue@google.com" target="_blank">jingyue@google.com</a>>:<br>
> Author: jingyue<br>
> Date: Wed Nov 12 00:58:45 2014<br>
> New Revision: 221772<br>
><br>
> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=221772&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=221772&view=rev</a><br>
> Log:<br>
> Disable indvar widening if arithmetics on the wider type are more expensive<br>
><br>
> Summary:<br>
> IndVarSimplify should not widen an indvar if arithmetics on the wider<br>
> indvar are more expensive than those on the narrower indvar. For<br>
> instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is<br>
> twice as expensive as that on i32, because the hardware needs to<br>
> simulate a 64-bit integer using two 32-bit integers.<br>
><br>
> Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo.<br>
><br>
> Fixes PR21148.<br>
><br>
> Test Plan:<br>
> Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics<br>
> on the wider type are more expensive.<br>
><br>
> Reviewers: jholewinski, eliben, meheff, atrick<br>
><br>
> Reviewed By: atrick<br>
><br>
> Subscribers: jholewinski, llvm-commits<br>
><br>
> Differential Revision: <a href="http://reviews.llvm.org/D6196" target="_blank">http://reviews.llvm.org/D6196</a><br>
><br>
> Added:<br>
> llvm/trunk/test/Transforms/IndVarSimplify/no-widen-expensive.ll<br>
> Modified:<br>
> llvm/trunk/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp<br>
> llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp<br>
><br>
> Modified: llvm/trunk/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp?rev=221772&r1=221771&r2=221772&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp?rev=221772&r1=221771&r2=221772&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp (original)<br>
> +++ llvm/trunk/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp Wed Nov 12 00:58:45 2014<br>
> @@ -36,12 +36,14 @@ void initializeNVPTXTTIPass(PassRegistry<br>
> namespace {<br>
><br>
> class NVPTXTTI final : public ImmutablePass, public TargetTransformInfo {<br>
> + const NVPTXTargetLowering *TLI;<br>
> public:<br>
> - NVPTXTTI() : ImmutablePass(ID) {<br>
> + NVPTXTTI() : ImmutablePass(ID), TLI(nullptr) {<br>
> llvm_unreachable("This pass cannot be directly constructed");<br>
> }<br>
><br>
> - NVPTXTTI(const NVPTXTargetMachine *TM) : ImmutablePass(ID) {<br>
> + NVPTXTTI(const NVPTXTargetMachine *TM)<br>
> + : ImmutablePass(ID), TLI(TM->getSubtargetImpl()->getTargetLowering()) {<br>
> initializeNVPTXTTIPass(*PassRegistry::getPassRegistry());<br>
> }<br>
><br>
> @@ -63,6 +65,12 @@ public:<br>
><br>
> bool hasBranchDivergence() const override;<br>
><br>
> + unsigned getArithmeticInstrCost(<br>
> + unsigned Opcode, Type *Ty, OperandValueKind Opd1Info = OK_AnyValue,<br>
> + OperandValueKind Opd2Info = OK_AnyValue,<br>
> + OperandValueProperties Opd1PropInfo = OP_None,<br>
> + OperandValueProperties Opd2PropInfo = OP_None) const override;<br>
> +<br>
> /// @}<br>
> };<br>
><br>
> @@ -78,3 +86,32 @@ llvm::createNVPTXTargetTransformInfoPass<br>
> }<br>
><br>
> bool NVPTXTTI::hasBranchDivergence() const { return true; }<br>
> +<br>
> +unsigned NVPTXTTI::getArithmeticInstrCost(<br>
> + unsigned Opcode, Type *Ty, OperandValueKind Opd1Info,<br>
> + OperandValueKind Opd2Info, OperandValueProperties Opd1PropInfo,<br>
> + OperandValueProperties Opd2PropInfo) const {<br>
> + // Legalize the type.<br>
> + std::pair<unsigned, MVT> LT = TLI->getTypeLegalizationCost(Ty);<br>
> +<br>
> + int ISD = TLI->InstructionOpcodeToISD(Opcode);<br>
> +<br>
> + switch (ISD) {<br>
> + default:<br>
> + return TargetTransformInfo::getArithmeticInstrCost(<br>
> + Opcode, Ty, Opd1Info, Opd2Info, Opd1PropInfo, Opd2PropInfo);<br>
> + case ISD::ADD:<br>
> + case ISD::MUL:<br>
> + case ISD::XOR:<br>
> + case ISD::OR:<br>
> + case ISD::AND:<br>
> + // The machine code (SASS) simulates an i64 with two i32. Therefore, we<br>
> + // estimate that arithmetic operations on i64 are twice as expensive as<br>
> + // those on types that can fit into one machine register.<br>
> + if (LT.second.SimpleTy == MVT::i64)<br>
> + return 2 * LT.first;<br>
> + // Delegate other cases to the basic TTI.<br>
> + return TargetTransformInfo::getArithmeticInstrCost(<br>
> + Opcode, Ty, Opd1Info, Opd2Info, Opd1PropInfo, Opd2PropInfo);<br>
> + }<br>
> +}<br>
><br>
> Modified: llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp?rev=221772&r1=221771&r2=221772&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp?rev=221772&r1=221771&r2=221772&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp (original)<br>
> +++ llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp Wed Nov 12 00:58:45 2014<br>
> @@ -31,6 +31,7 @@<br>
> #include "llvm/Analysis/LoopInfo.h"<br>
> #include "llvm/Analysis/LoopPass.h"<br>
> #include "llvm/Analysis/ScalarEvolutionExpander.h"<br>
> +#include "llvm/Analysis/TargetTransformInfo.h"<br>
> #include "llvm/IR/BasicBlock.h"<br>
> #include "llvm/IR/CFG.h"<br>
> #include "llvm/IR/Constants.h"<br>
> @@ -69,11 +70,12 @@ static cl::opt<bool> ReduceLiveIVs("liv-<br>
><br>
> namespace {<br>
> class IndVarSimplify : public LoopPass {<br>
> - LoopInfo *LI;<br>
> - ScalarEvolution *SE;<br>
> - DominatorTree *DT;<br>
> - const DataLayout *DL;<br>
> - TargetLibraryInfo *TLI;<br>
> + LoopInfo *LI;<br>
> + ScalarEvolution *SE;<br>
> + DominatorTree *DT;<br>
> + const DataLayout *DL;<br>
> + TargetLibraryInfo *TLI;<br>
> + const TargetTransformInfo *TTI;<br>
><br>
> SmallVector<WeakVH, 16> DeadInsts;<br>
> bool Changed;<br>
> @@ -661,7 +663,7 @@ namespace {<br>
> /// extended by this sign or zero extend operation. This is used to determine<br>
> /// the final width of the IV before actually widening it.<br>
> static void visitIVCast(CastInst *Cast, WideIVInfo &WI, ScalarEvolution *SE,<br>
> - const DataLayout *DL) {<br>
> + const DataLayout *DL, const TargetTransformInfo *TTI) {<br>
> bool IsSigned = Cast->getOpcode() == Instruction::SExt;<br>
> if (!IsSigned && Cast->getOpcode() != Instruction::ZExt)<br>
> return;<br>
> @@ -671,6 +673,19 @@ static void visitIVCast(CastInst *Cast,<br>
> if (DL && !DL->isLegalInteger(Width))<br>
> return;<br>
><br>
> + // Cast is either an sext or zext up to this point.<br>
> + // We should not widen an indvar if arithmetics on the wider indvar are more<br>
> + // expensive than those on the narrower indvar. We check only the cost of ADD<br>
> + // because at least an ADD is required to increment the induction variable. We<br>
> + // could compute more comprehensively the cost of all instructions on the<br>
> + // induction variable when necessary.<br>
> + if (TTI &&<br>
> + TTI->getArithmeticInstrCost(Instruction::Add, Ty) ><br>
> + TTI->getArithmeticInstrCost(Instruction::Add,<br>
> + Cast->getOperand(0)->getType())) {<br>
> + return;<br>
> + }<br>
> +<br>
> if (!WI.WidestNativeType) {<br>
> WI.WidestNativeType = SE->getEffectiveSCEVType(Ty);<br>
> WI.IsSigned = IsSigned;<br>
> @@ -1187,14 +1202,16 @@ namespace {<br>
> class IndVarSimplifyVisitor : public IVVisitor {<br>
> ScalarEvolution *SE;<br>
> const DataLayout *DL;<br>
> + const TargetTransformInfo *TTI;<br>
> PHINode *IVPhi;<br>
><br>
> public:<br>
> WideIVInfo WI;<br>
><br>
> IndVarSimplifyVisitor(PHINode *IV, ScalarEvolution *SCEV,<br>
> - const DataLayout *DL, const DominatorTree *DTree):<br>
> - SE(SCEV), DL(DL), IVPhi(IV) {<br>
> + const DataLayout *DL, const TargetTransformInfo *TTI,<br>
> + const DominatorTree *DTree)<br>
> + : SE(SCEV), DL(DL), TTI(TTI), IVPhi(IV) {<br>
> DT = DTree;<br>
> WI.NarrowIV = IVPhi;<br>
> if (ReduceLiveIVs)<br>
> @@ -1202,7 +1219,9 @@ namespace {<br>
> }<br>
><br>
> // Implement the interface used by simplifyUsersOfIV.<br>
> - void visitCast(CastInst *Cast) override { visitIVCast(Cast, WI, SE, DL); }<br>
> + void visitCast(CastInst *Cast) override {<br>
> + visitIVCast(Cast, WI, SE, DL, TTI);<br>
> + }<br>
> };<br>
> }<br>
><br>
> @@ -1236,7 +1255,7 @@ void IndVarSimplify::SimplifyAndExtend(L<br>
> PHINode *CurrIV = LoopPhis.pop_back_val();<br>
><br>
> // Information about sign/zero extensions of CurrIV.<br>
> - IndVarSimplifyVisitor Visitor(CurrIV, SE, DL, DT);<br>
> + IndVarSimplifyVisitor Visitor(CurrIV, SE, DL, TTI, DT);<br>
><br>
> Changed |= simplifyUsersOfIV(CurrIV, SE, &LPM, DeadInsts, &Visitor);<br>
><br>
> @@ -1895,6 +1914,7 @@ bool IndVarSimplify::runOnLoop(Loop *L,<br>
> DataLayoutPass *DLP = getAnalysisIfAvailable<DataLayoutPass>();<br>
> DL = DLP ? &DLP->getDataLayout() : nullptr;<br>
> TLI = getAnalysisIfAvailable<TargetLibraryInfo>();<br>
> + TTI = getAnalysisIfAvailable<TargetTransformInfo>();<br>
><br>
> DeadInsts.clear();<br>
> Changed = false;<br>
><br>
> Added: llvm/trunk/test/Transforms/IndVarSimplify/no-widen-expensive.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/IndVarSimplify/no-widen-expensive.ll?rev=221772&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/IndVarSimplify/no-widen-expensive.ll?rev=221772&view=auto</a><br>
> ==============================================================================<br>
> --- llvm/trunk/test/Transforms/IndVarSimplify/no-widen-expensive.ll (added)<br>
> +++ llvm/trunk/test/Transforms/IndVarSimplify/no-widen-expensive.ll Wed Nov 12 00:58:45 2014<br>
> @@ -0,0 +1,37 @@<br>
> +; RUN: opt < %s -indvars -S | FileCheck %s<br>
> +<br>
> +target triple = "nvptx64-unknown-unknown"<br>
> +<br>
> +; For the nvptx64 architecture, the cost of an arithmetic instruction on a<br>
> +; 64-bit integer is twice as expensive as that on a 32-bit integer, because the<br>
> +; hardware needs to simulate a 64-bit integer using two 32-bit integers.<br>
> +; Therefore, in this particular architecture, we should not widen induction<br>
> +; variables to 64-bit integers even though i64 is a legal type in the 64-bit<br>
> +; PTX ISA.<br>
> +<br>
> +define void @indvar_32_bit(i32 %n, i32* nocapture %output) {<br>
> +; CHECK-LABEL: @indvar_32_bit<br>
> +entry:<br>
> + %cmp5 = icmp sgt i32 %n, 0<br>
> + br i1 %cmp5, label %for.body.preheader, label %for.end<br>
> +<br>
> +for.body.preheader: ; preds = %entry<br>
> + br label %for.body<br>
> +<br>
> +for.body: ; preds = %for.body.preheader, %for.body<br>
> + %i.06 = phi i32 [ 0, %for.body.preheader ], [ %add, %for.body ]<br>
> +; CHECK: phi i32<br>
> + %mul = mul nsw i32 %i.06, %i.06<br>
> + %0 = sext i32 %i.06 to i64<br>
> + %arrayidx = getelementptr inbounds i32* %output, i64 %0<br>
> + store i32 %mul, i32* %arrayidx, align 4<br>
> + %add = add nsw i32 %i.06, 3<br>
> + %cmp = icmp slt i32 %add, %n<br>
> + br i1 %cmp, label %for.body, label %for.end.loopexit<br>
> +<br>
> +for.end.loopexit: ; preds = %for.body<br>
> + br label %for.end<br>
> +<br>
> +for.end: ; preds = %for.end.loopexit, %entry<br>
> + ret void<br>
> +}<br>
><br>
><br>
> _______________________________________________<br>
> llvm-commits mailing list<br>
> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
</blockquote></div></div>