[llvm-dev] optimisation issue in an llvm IR pass

Craig Topper via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 4 11:25:55 PDT 2019


Yeah it will only be for 9.0

Looks like something like this will work on older versions, but it uses an
X86 specific intrinsic so isn't portable to other targets.

define void @foo(i32* %x) {
%a = load i32, i32* %x
%b = tail call { i8, i32 } @llvm.x86.addcarry.32(i8 0, i32 %a, i32 1)
%c = extractvalue { i8, i32 } %b, 0
%d = extractvalue { i8, i32 } %b, 1
%e = zext i8 %c to i32
%f = add i32 %d, %e
store i32 %f, i32* %x
ret void
}
declare { i8, i32 } @llvm.x86.addcarry.32(i8 , i32, i32)

~Craig


On Thu, Jul 4, 2019 at 2:07 AM Marc <mh at mh-sec.de> wrote:

> Hello Craig,
>
> thanks for filing a patch!
> I would have hoped that the issue is on my side and not an issue in llvm
> though ;)
>
> Is that patch for llvm 9 (only)?
>
> Regards,
> Marc
>
> On 04.07.19 02:28, Craig Topper wrote:
> > This patch fixes the issue. I haven't checked our lit tests yet to see
> > what effect this has and if we can commit it.
> >
> > lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> >
> > @@ -2967,7 +2967,8 @@SDValue DAGCombiner::visitADDCARRYLike(SDValue N0,
> > SDValue N1, SDValue CarryIn,
> >
> >    // Iff the flag result is dead:
> >
> >    // (addcarry (add|uaddo X, Y), 0, Carry) -> (addcarry X, Y, Carry)
> >
> >    if ((N0.getOpcode() == ISD::ADD ||
> >
> > -       (N0.getOpcode() == ISD::UADDO && N0.getResNo() == 0)) &&
> >
> > +       (N0.getOpcode() == ISD::UADDO && N0.getResNo() == 0 &&
> >
> > +        N0.getValue(1) != CarryIn)) &&
> >
> >        isNullConstant(N1) && !N->hasAnyUseOfValue(1))
> >
> >      return DAG.getNode(ISD::ADDCARRY, SDLoc(N), N->getVTList(),
> >
> >                         N0.getOperand(0), N0.getOperand(1), CarryIn);
> >
> >
> > ~Craig
> >
> >
> > On Wed, Jul 3, 2019 at 9:40 AM Marc <mh at mh-sec.de <mailto:mh at mh-sec.de>>
> > wrote:
> >
> >     Hi Craig,
> >
> >     On 03.07.19 17:33, Craig Topper wrote:
> >     > Don't the CreateICmp calls return a Value* with an i1 type? But
> then
> >     > they are added to an i8 type? Not sure that works.
> >
> >     I had that initially:
> >      auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> >      auto carry = IRB.CreateZExt(cf, Int8Ty);
> >      Incr = IRB.CreateAdd(Incr, carry);
> >     it makes no difference to the generated assembly
> >
> >
> >     > Have you tried using the llvm.uadd.with.overflow.i8 intrinsic?
> >
> >     we have tried this:
> >
> >      CallInst *AddOv =
> >     IRB.CreateBinaryIntrinsic(Intrinsic::uadd_with_overflow, Counter,
> >     ConstantInt::get(Int8Ty, 1));
> >      AddOv->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,
> None));
> >      Value *SumWithOverflowBit = AddOv;
> >      Incr = IRB.CreateAdd(IRB.CreateExtractValue(SumWithOverflowBit, 0),
> >      IRB.CreateZExt(IRB.CreateExtractValue(SumWithOverflowBit, 1),
> Int8Ty));
> >
> >     but that generated exactly the same as
> >
> >      auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> >      Incr = IRB.CreateAdd(Incr, cf);
> >
> >
> >     All the above create (with llvm 6.0, 7 and 8):
> >
> >      mov    dl,BYTE PTR [rsi+rdi*1]
> >      mov    ecx,edx // <- not needed
> >      add    cl,0x1  // <- should be done to dl instead
> >      adc    dl,0x1
> >      mov    BYTE PTR [rsi+rdi*1],dl
> >
> >
> >     Thanks!
> >
> >     Regards,
> >     Marc
> >
> >     >
> >     > ~Craig
> >     >
> >     >
> >     > On Wed, Jul 3, 2019 at 4:01 AM Marc via llvm-dev
> >     > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> >     <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>>
> >     wrote:
> >     >
> >     >     Hello,
> >     >
> >     >     I have an optimisation issue in an llvm IR pass - the issue
> >     being that
> >     >     unnecessary instructions are generated in the final assembly
> >     (with -O3).
> >     >
> >     >     I want to create the following assembly snippet:
> >     >
> >     >       mov    dl,BYTE PTR [rsi+rdi*1]
> >     >       add    dl,0x1
> >     >       adc    dl,0x0
> >     >       mov    BYTE PTR [rsi+rdi*1],dl
> >     >
> >     >     however what is created is (variant #1):
> >     >
> >     >       mov    dl,BYTE PTR [rsi+rdx*1]
> >     >       add    dl,0x1
> >     >       cmp    dl,0x1 // <- not needed
> >     >       adc    dl,0x0
> >     >       mov    BYTE PTR [rsi+rdi*1],dl
> >     >
> >     >     in variant #2 the following is created:
> >     >
> >     >      mov    dl,BYTE PTR [rsi+rdi*1]
> >     >      mov    ecx,edx // <- not needed
> >     >      add    cl,0x1  // <- should be done to dl instead
> >     >      adc    dl,0x1
> >     >      mov    BYTE PTR [rsi+rdi*1],dl
> >     >
> >     >     Far below are both variants with the full code around it,
> >     however the
> >     >     difference in both variants is this:
> >     >
> >     >     //variant1
> >     >     auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));
> >     >     Incr = IRB.CreateAdd(Incr, cf);
> >     >
> >     >     //variant2
> >     >     auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> >     >     Incr = IRB.CreateAdd(Incr, cf);
> >     >
> >     >     //interestingly this totally different approach creates the
> same
> >     >     instructions as variant2
> >     >     CallInst *AddOv =
> >     IRB.CreateBinaryIntrinsic(Intrinsic::uadd_with_over
> >     >     flow, Counter, ConstantInt::get(Int8Ty, 1));
> >     >     AddOv->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,
> >     None));
> >     >     Value *SumWithOverflowBit = AddOv;
> >     >     Incr =
> >     IRB.CreateAdd(IRB.CreateExtractValue(SumWithOverflowBit, 0),
> >     >     IRB.CreateZExt(IRB.CreateExtractValue(SumWithOverflowBit, 1),
> >     Int8Ty));
> >     >
> >     >     So - How do I have to write the code that the target code has
> >     a chance
> >     >     of being generated?
> >     >     For me its the same result on LLVM 6.0 and 7.
> >     >
> >     >     Alternatively
> >     >       add    BYTE PTR [rsi+rdi*1],0x1
> >     >       adc    BYTE PTR [rsi+rdi*1],0x0
> >     >      would work as well.
> >     >
> >     >     Thank you very much!
> >     >
> >     >     Regards,
> >     >     Marc
> >     >
> >     >     -> Below now is the full llvm IR pass code
> >     >
> >     >     // vvv same code before both variants
> >     >     LoadInst *PrevLoc = IRB.CreateLoad(AFLPrevLoc);
> >     >     PrevLoc->setMetadata(M.getMDKindID("nosanitize"),
> >     MDNode::get(C, None));
> >     >     Value *PrevLocCasted = IRB.CreateZExt(PrevLoc,
> IRB.getInt32Ty());
> >     >
> >     >      LoadInst *MapPtr = IRB.CreateLoad(AFLMapPtr);
> >     >     MapPtr->setMetadata(M.getMDKindID("nosanitize"),
> >     MDNode::get(C, None));
> >     >     Value *MapPtrIdx = IRB.CreateGEP(MapPtr,
> >     IRB.CreateXor(PrevLocCasted,
> >     >     CurLoc));
> >     >
> >     >     LoadInst *Counter = IRB.CreateLoad(MapPtrIdx);
> >     >     Counter->setMetadata(M.getMDKindID("nosanitize"),
> >     MDNode::get(C, None));
> >     >
> >     >     Value *Incr = IRB.CreateAdd(Counter, ConstantInt::get(Int8Ty,
> 1));
> >     >     // ^^^ same code before both variants
> >     >
> >     >     // CODE FOR VARIANT #1
> >     >     auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));
> >     >     Incr = IRB.CreateAdd(Incr, cf);
> >     >
> >     >     // CODE FOR VARIANT #2
> >     >     auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> >     >     Incr = IRB.CreateAdd(Incr, cf);
> >     >
> >     >     // vvv same code after both variants
> >     >     IRB.CreateStore(Incr,
> >     >     MapPtrIdx)->setMetadata(M.getMDKindID("nosanitize"),
> >     MDNode::get(C,
> >     >     None));
> >     >     StoreInst *Store = IRB.CreateStore(ConstantInt::get(Int32Ty,
> >     cur_loc >>
> >     >     1), AFLPrevLoc);
> >     >     Store->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,
> >     None));
> >     >     // ^^^ same code after both variants
> >     >
> >     >
> >     >     --
> >     >     Marc Heuse
> >     >     www.mh-sec.de <http://www.mh-sec.de> <http://www.mh-sec.de>
> >     >
> >     >     PGP: AF3D 1D4C D810 F0BB 977D  3807 C7EE D0A0 6BE9 F573
> >     >     _______________________________________________
> >     >     LLVM Developers mailing list
> >     >     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> >     <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>
> >     >     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >     >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190704/cb7a5ae1/attachment-0001.html>


More information about the llvm-dev mailing list