[llvm-dev] optimisation issue in an llvm IR pass
Craig Topper via llvm-dev
llvm-dev at lists.llvm.org
Thu Jul 4 11:25:55 PDT 2019
Yeah it will only be for 9.0
Looks like something like this will work on older versions, but it uses an
X86 specific intrinsic so isn't portable to other targets.
define void @foo(i32* %x) {
%a = load i32, i32* %x
%b = tail call { i8, i32 } @llvm.x86.addcarry.32(i8 0, i32 %a, i32 1)
%c = extractvalue { i8, i32 } %b, 0
%d = extractvalue { i8, i32 } %b, 1
%e = zext i8 %c to i32
%f = add i32 %d, %e
store i32 %f, i32* %x
ret void
}
declare { i8, i32 } @llvm.x86.addcarry.32(i8 , i32, i32)
~Craig
On Thu, Jul 4, 2019 at 2:07 AM Marc <mh at mh-sec.de> wrote:
> Hello Craig,
>
> thanks for filing a patch!
> I would have hoped that the issue is on my side and not an issue in llvm
> though ;)
>
> Is that patch for llvm 9 (only)?
>
> Regards,
> Marc
>
> On 04.07.19 02:28, Craig Topper wrote:
> > This patch fixes the issue. I haven't checked our lit tests yet to see
> > what effect this has and if we can commit it.
> >
> > lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> >
> > @@ -2967,7 +2967,8 @@SDValue DAGCombiner::visitADDCARRYLike(SDValue N0,
> > SDValue N1, SDValue CarryIn,
> >
> > // Iff the flag result is dead:
> >
> > // (addcarry (add|uaddo X, Y), 0, Carry) -> (addcarry X, Y, Carry)
> >
> > if ((N0.getOpcode() == ISD::ADD ||
> >
> > - (N0.getOpcode() == ISD::UADDO && N0.getResNo() == 0)) &&
> >
> > + (N0.getOpcode() == ISD::UADDO && N0.getResNo() == 0 &&
> >
> > + N0.getValue(1) != CarryIn)) &&
> >
> > isNullConstant(N1) && !N->hasAnyUseOfValue(1))
> >
> > return DAG.getNode(ISD::ADDCARRY, SDLoc(N), N->getVTList(),
> >
> > N0.getOperand(0), N0.getOperand(1), CarryIn);
> >
> >
> > ~Craig
> >
> >
> > On Wed, Jul 3, 2019 at 9:40 AM Marc <mh at mh-sec.de <mailto:mh at mh-sec.de>>
> > wrote:
> >
> > Hi Craig,
> >
> > On 03.07.19 17:33, Craig Topper wrote:
> > > Don't the CreateICmp calls return a Value* with an i1 type? But
> then
> > > they are added to an i8 type? Not sure that works.
> >
> > I had that initially:
> > auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> > auto carry = IRB.CreateZExt(cf, Int8Ty);
> > Incr = IRB.CreateAdd(Incr, carry);
> > it makes no difference to the generated assembly
> >
> >
> > > Have you tried using the llvm.uadd.with.overflow.i8 intrinsic?
> >
> > we have tried this:
> >
> > CallInst *AddOv =
> > IRB.CreateBinaryIntrinsic(Intrinsic::uadd_with_overflow, Counter,
> > ConstantInt::get(Int8Ty, 1));
> > AddOv->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,
> None));
> > Value *SumWithOverflowBit = AddOv;
> > Incr = IRB.CreateAdd(IRB.CreateExtractValue(SumWithOverflowBit, 0),
> > IRB.CreateZExt(IRB.CreateExtractValue(SumWithOverflowBit, 1),
> Int8Ty));
> >
> > but that generated exactly the same as
> >
> > auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> > Incr = IRB.CreateAdd(Incr, cf);
> >
> >
> > All the above create (with llvm 6.0, 7 and 8):
> >
> > mov dl,BYTE PTR [rsi+rdi*1]
> > mov ecx,edx // <- not needed
> > add cl,0x1 // <- should be done to dl instead
> > adc dl,0x1
> > mov BYTE PTR [rsi+rdi*1],dl
> >
> >
> > Thanks!
> >
> > Regards,
> > Marc
> >
> > >
> > > ~Craig
> > >
> > >
> > > On Wed, Jul 3, 2019 at 4:01 AM Marc via llvm-dev
> > > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> > <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>>
> > wrote:
> > >
> > > Hello,
> > >
> > > I have an optimisation issue in an llvm IR pass - the issue
> > being that
> > > unnecessary instructions are generated in the final assembly
> > (with -O3).
> > >
> > > I want to create the following assembly snippet:
> > >
> > > mov dl,BYTE PTR [rsi+rdi*1]
> > > add dl,0x1
> > > adc dl,0x0
> > > mov BYTE PTR [rsi+rdi*1],dl
> > >
> > > however what is created is (variant #1):
> > >
> > > mov dl,BYTE PTR [rsi+rdx*1]
> > > add dl,0x1
> > > cmp dl,0x1 // <- not needed
> > > adc dl,0x0
> > > mov BYTE PTR [rsi+rdi*1],dl
> > >
> > > in variant #2 the following is created:
> > >
> > > mov dl,BYTE PTR [rsi+rdi*1]
> > > mov ecx,edx // <- not needed
> > > add cl,0x1 // <- should be done to dl instead
> > > adc dl,0x1
> > > mov BYTE PTR [rsi+rdi*1],dl
> > >
> > > Far below are both variants with the full code around it,
> > however the
> > > difference in both variants is this:
> > >
> > > //variant1
> > > auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));
> > > Incr = IRB.CreateAdd(Incr, cf);
> > >
> > > //variant2
> > > auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> > > Incr = IRB.CreateAdd(Incr, cf);
> > >
> > > //interestingly this totally different approach creates the
> same
> > > instructions as variant2
> > > CallInst *AddOv =
> > IRB.CreateBinaryIntrinsic(Intrinsic::uadd_with_over
> > > flow, Counter, ConstantInt::get(Int8Ty, 1));
> > > AddOv->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,
> > None));
> > > Value *SumWithOverflowBit = AddOv;
> > > Incr =
> > IRB.CreateAdd(IRB.CreateExtractValue(SumWithOverflowBit, 0),
> > > IRB.CreateZExt(IRB.CreateExtractValue(SumWithOverflowBit, 1),
> > Int8Ty));
> > >
> > > So - How do I have to write the code that the target code has
> > a chance
> > > of being generated?
> > > For me its the same result on LLVM 6.0 and 7.
> > >
> > > Alternatively
> > > add BYTE PTR [rsi+rdi*1],0x1
> > > adc BYTE PTR [rsi+rdi*1],0x0
> > > would work as well.
> > >
> > > Thank you very much!
> > >
> > > Regards,
> > > Marc
> > >
> > > -> Below now is the full llvm IR pass code
> > >
> > > // vvv same code before both variants
> > > LoadInst *PrevLoc = IRB.CreateLoad(AFLPrevLoc);
> > > PrevLoc->setMetadata(M.getMDKindID("nosanitize"),
> > MDNode::get(C, None));
> > > Value *PrevLocCasted = IRB.CreateZExt(PrevLoc,
> IRB.getInt32Ty());
> > >
> > > LoadInst *MapPtr = IRB.CreateLoad(AFLMapPtr);
> > > MapPtr->setMetadata(M.getMDKindID("nosanitize"),
> > MDNode::get(C, None));
> > > Value *MapPtrIdx = IRB.CreateGEP(MapPtr,
> > IRB.CreateXor(PrevLocCasted,
> > > CurLoc));
> > >
> > > LoadInst *Counter = IRB.CreateLoad(MapPtrIdx);
> > > Counter->setMetadata(M.getMDKindID("nosanitize"),
> > MDNode::get(C, None));
> > >
> > > Value *Incr = IRB.CreateAdd(Counter, ConstantInt::get(Int8Ty,
> 1));
> > > // ^^^ same code before both variants
> > >
> > > // CODE FOR VARIANT #1
> > > auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));
> > > Incr = IRB.CreateAdd(Incr, cf);
> > >
> > > // CODE FOR VARIANT #2
> > > auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));
> > > Incr = IRB.CreateAdd(Incr, cf);
> > >
> > > // vvv same code after both variants
> > > IRB.CreateStore(Incr,
> > > MapPtrIdx)->setMetadata(M.getMDKindID("nosanitize"),
> > MDNode::get(C,
> > > None));
> > > StoreInst *Store = IRB.CreateStore(ConstantInt::get(Int32Ty,
> > cur_loc >>
> > > 1), AFLPrevLoc);
> > > Store->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,
> > None));
> > > // ^^^ same code after both variants
> > >
> > >
> > > --
> > > Marc Heuse
> > > www.mh-sec.de <http://www.mh-sec.de> <http://www.mh-sec.de>
> > >
> > > PGP: AF3D 1D4C D810 F0BB 977D 3807 C7EE D0A0 6BE9 F573
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> > <mailto:llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>
> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190704/cb7a5ae1/attachment-0001.html>
More information about the llvm-dev
mailing list