[llvm-commits] [llvm] r161152 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/PeepholeOptimizer.cpp lib/Target/X86/X86InstrInfo.cpp lib/Target/X86/X86InstrInfo.h test/CodeGen/X86/2012-05-19-avx2-store.ll test/CodeGen/X86/break-sse-dep.ll test/CodeGen/X86/fold-load.ll test/CodeGen/X86/fold-pcmpeqd-1.ll test/CodeGen/X86/sse-minmax.ll test/CodeGen/X86/vec_compare.ll

Manman Ren mren at apple.com
Thu Aug 2 11:47:05 PDT 2012


On Aug 2, 2012, at 10:14 AM, Jakob Stoklund Olesen wrote:

> 
> On Aug 1, 2012, at 5:56 PM, Manman Ren <mren at apple.com> wrote:
> 
>> Author: mren
>> Date: Wed Aug  1 19:56:42 2012
>> New Revision: 161152
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=161152&view=rev
>> Log:
>> X86 Peephole: fold loads to the source register operand if possible.
> 
>> ==============================================================================
>> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
>> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Wed Aug  1 19:56:42 2012
>> @@ -14,6 +14,7 @@
>> #ifndef LLVM_TARGET_TARGETINSTRINFO_H
>> #define LLVM_TARGET_TARGETINSTRINFO_H
>> 
>> +#include "llvm/ADT/SmallSet.h"
>> #include "llvm/MC/MCInstrInfo.h"
>> #include "llvm/CodeGen/DFAPacketizer.h"
>> #include "llvm/CodeGen/MachineFunction.h"
>> @@ -693,6 +694,16 @@
>>    return false;
>>  }
>> 
>> +  /// optimizeLoadInstr - Try to remove the load by folding it to a register
>> +  /// operand at the use. We fold the load instruction if and only if the
>> +  /// def and use are in the same BB.
>> +  virtual MachineInstr* optimizeLoadInstr(MachineInstr *MI,
>> +                        const MachineRegisterInfo *MRI,
>> +                        unsigned &FoldAsLoadDefReg,
>> +                        MachineInstr *&DefMI) const {
>> +    return 0;
>> +  }
> 
> This interface is much better than before, but you have to explain what the arguments do. It should be possible to implement this hook based on information in the header file alone.
Will add more comments here to explain the arguments.
> 
>> ==============================================================================
>> --- llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp Wed Aug  1 19:56:42 2012
> 
>> +/// isLoadFoldable - Check whether MI is a candidate for folding into a later
>> +/// instruction. We only fold loads whose destination is a virtual
>> +/// register with a single use.
>> +bool PeepholeOptimizer::isLoadFoldable(MachineInstr *MI,
>> +                                       unsigned &FoldAsLoadDefReg) {
>> +  if (MI->canFoldAsLoad()) {
>> +    const MCInstrDesc &MCID = MI->getDesc();
>> +    if (MCID.getNumDefs() == 1) {
>> +      unsigned Reg = MI->getOperand(0).getReg();
>> +      // To reduce compilation time, we check MRI->hasOneUse here when
>> +      // recording the load; it must be re-checked when processing uses of
>> +      // the load, since uses can be removed during peephole optimization.
>> +      if (!MI->getOperand(0).getSubReg() &&
>> +          TargetRegisterInfo::isVirtualRegister(Reg) &&
>> +          MRI->hasOneUse(Reg)) {
>> +        FoldAsLoadDefReg = Reg;
>> +        return true;
>> +      }
>> +    }
>> +  }
>> +  return false;
>> +}
> 
> Please use early returns to reduce nesting.
Will do.
> 
> 
>> @@ -34,8 +34,7 @@
>> define double @squirt(double* %x) nounwind {
>> entry:
>> ; CHECK: squirt:
>> -; CHECK: movsd ([[A0]]), %xmm0
>> -; CHECK: sqrtsd %xmm0, %xmm0
>> +; CHECK: sqrtsd ([[A0]]), %xmm0
>>  %z = load double* %x
>>  %t = call double @llvm.sqrt.f64(double %z)
>>  ret double %t
> 
> See the comment on the function hasPartialRegUpdate() in X86InstrInfo.cpp and its callers.
> 
>> ; CHECK:      ogt_x:
>> -; CHECK-NEXT: xorp{{[sd]}} %xmm1, %xmm1
>> -; CHECK-NEXT: maxsd %xmm1, %xmm0
>> +; CHECK-NEXT: maxsd LCP{{.*}}(%rip), %xmm0
>> ; CHECK-NEXT: ret
> 
> I am not sure this is a good idea. We set canFoldAsLoad on V_SET0 because it can be turned into a constant pool load. That is a good thing when the register allocator is running low on xmm registers, but we don't want to insert constant pool loads when there are plenty of registers.
Will not fold V_SET0 here.

Thanks a lot for reviewing and providing valuable comments :)
Manman
> 
> /jakob
> 




More information about the llvm-commits mailing list