[llvm] r192445 - [NVPTX] Switch from StrongPHIElimination to PHIElimination in NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc

Fri Oct 11 08:55:47 PDT 2013

No complaints here.  As far as I know, StrongPHIElimination is only
available as an llc command-line option.

On 10/11/13 11:50 AM, Rafael Espíndola wrote:
> With this StrongPHIElimination is now effectively dead, no? Should we remove it?
>
> On 11 October 2013 08:39, Justin Holewinski <jholewinski at nvidia.com> wrote:
>> Author: jholewinski
>> Date: Fri Oct 11 07:39:39 2013
>> New Revision: 192445
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=192445&view=rev
>> Log:
>> [NVPTX] Switch from StrongPHIElimination to PHIElimination in NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc
>>
>> Fixes PR17529
>>
>> Added:
>>      llvm/trunk/test/CodeGen/NVPTX/pr17529.ll
>> Modified:
>>      llvm/trunk/lib/Target/NVPTX/NVPTXTargetMachine.cpp
>>
>> Modified: llvm/trunk/lib/Target/NVPTX/NVPTXTargetMachine.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/NVPTX/NVPTXTargetMachine.cpp?rev=192445&r1=192444&r2=192445&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/NVPTX/NVPTXTargetMachine.cpp (original)
>> +++ llvm/trunk/lib/Target/NVPTX/NVPTXTargetMachine.cpp Fri Oct 11 07:39:39 2013
>> @@ -154,10 +154,30 @@ FunctionPass *NVPTXPassConfig::createTar
>>
>>   void NVPTXPassConfig::addFastRegAlloc(FunctionPass *RegAllocPass) {
>>     assert(!RegAllocPass && "NVPTX uses no regalloc!");
>> -  addPass(&StrongPHIEliminationID);
>> +  addPass(&PHIEliminationID);
>> +  addPass(&TwoAddressInstructionPassID);
>>   }
>>
>>   void NVPTXPassConfig::addOptimizedRegAlloc(FunctionPass *RegAllocPass) {
>>     assert(!RegAllocPass && "NVPTX uses no regalloc!");
>> -  addPass(&StrongPHIEliminationID);
>> +
>> +  addPass(&ProcessImplicitDefsID);
>> +  addPass(&LiveVariablesID);
>> +  addPass(&MachineLoopInfoID);
>> +  addPass(&PHIEliminationID);
>> +
>> +  addPass(&TwoAddressInstructionPassID);
>> +  addPass(&RegisterCoalescerID);
>> +
>> +  // PreRA instruction scheduling.
>> +  if (addPass(&MachineSchedulerID))
>> +    printAndVerify("After Machine Scheduling");
>> +
>> +
>> +  addPass(&StackSlotColoringID);
>> +
>> +  // FIXME: Needs physical registers
>> +  //addPass(&PostRAMachineLICMID);
>> +
>> +  printAndVerify("After StackSlotColoring");
>>   }
>>
>> Added: llvm/trunk/test/CodeGen/NVPTX/pr17529.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/NVPTX/pr17529.ll?rev=192445&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/NVPTX/pr17529.ll (added)
>> +++ llvm/trunk/test/CodeGen/NVPTX/pr17529.ll Fri Oct 11 07:39:39 2013
>> @@ -0,0 +1,38 @@
>> +; RUN: llc < %s -march=nvptx -mcpu=sm_20 | FileCheck %s
>> +
>> +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
>> +target triple = "nvptx64-nvidia-cuda"
>> +
>> +; Function Attrs: nounwind
>> +; CHECK: .func kernelgen_memcpy
>> +define ptx_device void @kernelgen_memcpy(i8* nocapture %dst) #0 {
>> +entry:
>> +  br i1 undef, label %for.end, label %vector.body
>> +
>> +vector.body:                                      ; preds = %vector.body, %entry
>> +  %index = phi i64 [ %index.next, %vector.body ], [ 0, %entry ]
>> +  %scevgep9 = getelementptr i8* %dst, i64 %index
>> +  %scevgep910 = bitcast i8* %scevgep9 to <4 x i8>*
>> +  store <4 x i8> undef, <4 x i8>* %scevgep910, align 1
>> +  %index.next = add i64 %index, 4
>> +  %0 = icmp eq i64 undef, %index.next
>> +  br i1 %0, label %middle.block, label %vector.body
>> +
>> +middle.block:                                     ; preds = %vector.body
>> +  br i1 undef, label %for.end, label %for.body.preheader1
>> +
>> +for.body.preheader1:                              ; preds = %middle.block
>> +  %scevgep2 = getelementptr i8* %dst, i64 0
>> +  br label %for.body
>> +
>> +for.body:                                         ; preds = %for.body, %for.body.preheader1
>> +  %lsr.iv3 = phi i8* [ %scevgep2, %for.body.preheader1 ], [ %scevgep4, %for.body ]
>> +  store i8 undef, i8* %lsr.iv3, align 1
>> +  %scevgep4 = getelementptr i8* %lsr.iv3, i64 1
>> +  br label %for.body
>> +
>> +for.end:                                          ; preds = %middle.block, %entry
>> +  ret void
>> +}
>> +
>> +attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-realign-stack" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
>>
>>

-- 
Thanks,

Justin Holewinski
