<div dir="ltr">Yeah it will only be for 9.0<div><br></div><div>Looks like something like this will work on older versions, but it uses an X86 specific intrinsic so isn't portable to other targets.</div><div><br></div><div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)"><div><font face="courier new, monospace"><span style="color:rgb(0,0,255)">define</span> <span style="color:rgb(0,128,128)">void</span> <span style="color:rgb(0,17,136)">@foo</span>(<span style="color:rgb(0,128,128)">i32*</span> <span style="color:rgb(0,17,136)">%x</span>) {</font></div><div>  <font face="courier new, monospace"><span style="color:rgb(0,17,136)">%a</span> = load <span style="color:rgb(0,128,128)">i32</span>, <span style="color:rgb(0,128,128)">i32*</span> <span style="color:rgb(0,17,136)">%x</span></font></div><div>  <font face="courier new, monospace"><span style="color:rgb(0,17,136)">%b</span> = <span style="color:rgb(0,0,255)">tail</span> call { <span style="color:rgb(0,128,128)">i8</span>, <span style="color:rgb(0,128,128)">i32</span> } <span style="color:rgb(0,17,136)">@llvm.x86.addcarry.32</span>(<span style="color:rgb(0,128,128)">i8</span> <span style="color:rgb(9,136,90)">0</span>, <span style="color:rgb(0,128,128)">i32</span> <span style="color:rgb(0,17,136)">%a</span>, <span style="color:rgb(0,128,128)">i32</span> <span style="color:rgb(9,136,90)">1</span>)</font></div><div>  <font face="courier new, monospace"><span style="color:rgb(0,17,136)">%c</span> = extractvalue { <span style="color:rgb(0,128,128)">i8</span>, <span style="color:rgb(0,128,128)">i32</span> } <span style="color:rgb(0,17,136)">%b</span>, <span style="color:rgb(9,136,90)">0</span></font></div><div>  <font face="courier new, monospace"><span style="color:rgb(0,17,136)">%d</span> = extractvalue { <span style="color:rgb(0,128,128)">i8</span>, <span style="color:rgb(0,128,128)">i32</span> } <span style="color:rgb(0,17,136)">%b</span>, <span style="color:rgb(9,136,90)">1</span></font></div><div>  <font face="courier new, monospace"><span style="color:rgb(0,17,136)">%e</span> = zext <span style="color:rgb(0,128,128)">i8</span> <span style="color:rgb(0,17,136)">%c</span> <span style="color:rgb(0,0,255)">to</span> <span style="color:rgb(0,128,128)">i32</span></font></div><div>  <font face="courier new, monospace"><span style="color:rgb(0,17,136)">%f</span> = add <span style="color:rgb(0,128,128)">i32</span> <span style="color:rgb(0,17,136)">%d</span>, <span style="color:rgb(0,17,136)">%e</span></font></div><div><font face="courier new, monospace">  store <span style="color:rgb(0,128,128)">i32</span> <span style="color:rgb(0,17,136)">%f</span>, <span style="color:rgb(0,128,128)">i32*</span> <span style="color:rgb(0,17,136)">%x</span></font></div><div><font face="courier new, monospace">  ret <span style="color:rgb(0,128,128)">void</span></font></div><div><font face="courier new, monospace">}</font></div><div><font face="courier new, monospace"><span style="color:rgb(0,0,255)">declare</span> { <span style="color:rgb(0,128,128)">i8</span>, <span style="color:rgb(0,128,128)">i32</span> } <span style="color:rgb(0,17,136)">@llvm.x86.addcarry.32</span>(<span style="color:rgb(0,128,128)">i8</span> , <span style="color:rgb(0,128,128)">i32</span>, <span style="color:rgb(0,128,128)">i32</span>)</font></div></div></div><div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jul 4, 2019 at 2:07 AM Marc <<a href="mailto:mh@mh-sec.de">mh@mh-sec.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello Craig,<br>

<br>

thanks for filing a patch!<br>

I would have hoped that the issue is on my side and not an issue in llvm<br>

though ;)<br>

<br>

Is that patch for llvm 9 (only)?<br>

<br>

Regards,<br>

Marc<br>

<br>

On 04.07.19 02:28, Craig Topper wrote:<br>

> This patch fixes the issue. I haven't checked our lit tests yet to see<br>

> what effect this has and if we can commit it.<br>

> <br>

> lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>

> <br>

> @@ -2967,7 +2967,8 @@SDValue DAGCombiner::visitADDCARRYLike(SDValue N0,<br>

> SDValue N1, SDValue CarryIn,<br>

> <br>

>    // Iff the flag result is dead:<br>

> <br>

>    // (addcarry (add|uaddo X, Y), 0, Carry) -> (addcarry X, Y, Carry)<br>

> <br>

>    if ((N0.getOpcode() == ISD::ADD ||<br>

> <br>

> -       (N0.getOpcode() == ISD::UADDO && N0.getResNo() == 0)) &&<br>

> <br>

> +       (N0.getOpcode() == ISD::UADDO && N0.getResNo() == 0 &&<br>

> <br>

> +        N0.getValue(1) != CarryIn)) &&<br>

> <br>

>        isNullConstant(N1) && !N->hasAnyUseOfValue(1))<br>

> <br>

>      return DAG.getNode(ISD::ADDCARRY, SDLoc(N), N->getVTList(),<br>

> <br>

>                         N0.getOperand(0), N0.getOperand(1), CarryIn);<br>

> <br>

> <br>

> ~Craig<br>

> <br>

> <br>

> On Wed, Jul 3, 2019 at 9:40 AM Marc <<a href="mailto:mh@mh-sec.de" target="_blank">mh@mh-sec.de</a> <mailto:<a href="mailto:mh@mh-sec.de" target="_blank">mh@mh-sec.de</a>>><br>

> wrote:<br>

> <br>

>     Hi Craig,<br>

> <br>

>     On 03.07.19 17:33, Craig Topper wrote:<br>

>     > Don't the CreateICmp calls return a Value* with an i1 type? But then<br>

>     > they are added to an i8 type? Not sure that works. <br>

> <br>

>     I had that initially:<br>

>      auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));<br>

>      auto carry = IRB.CreateZExt(cf, Int8Ty);<br>

>      Incr = IRB.CreateAdd(Incr, carry);<br>

>     it makes no difference to the generated assembly<br>

> <br>

> <br>

>     > Have you tried using the llvm.uadd.with.overflow.i8 intrinsic?<br>

> <br>

>     we have tried this:<br>

> <br>

>      CallInst *AddOv =<br>

>     IRB.CreateBinaryIntrinsic(Intrinsic::uadd_with_overflow, Counter,<br>

>     ConstantInt::get(Int8Ty, 1));<br>

>      AddOv->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));<br>

>      Value *SumWithOverflowBit = AddOv;<br>

>      Incr = IRB.CreateAdd(IRB.CreateExtractValue(SumWithOverflowBit, 0),<br>

>      IRB.CreateZExt(IRB.CreateExtractValue(SumWithOverflowBit, 1), Int8Ty));<br>

> <br>

>     but that generated exactly the same as<br>

> <br>

>      auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));<br>

>      Incr = IRB.CreateAdd(Incr, cf);<br>

> <br>

> <br>

>     All the above create (with llvm 6.0, 7 and 8):<br>

> <br>

>      mov    dl,BYTE PTR [rsi+rdi*1]<br>

>      mov    ecx,edx // <- not needed<br>

>      add    cl,0x1  // <- should be done to dl instead<br>

>      adc    dl,0x1<br>

>      mov    BYTE PTR [rsi+rdi*1],dl<br>

> <br>

> <br>

>     Thanks!<br>

> <br>

>     Regards,<br>

>     Marc<br>

> <br>

>     ><br>

>     > ~Craig<br>

>     ><br>

>     ><br>

>     > On Wed, Jul 3, 2019 at 4:01 AM Marc via llvm-dev<br>

>     > <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>

>     <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>>><br>

>     wrote:<br>

>     ><br>

>     >     Hello,<br>

>     ><br>

>     >     I have an optimisation issue in an llvm IR pass - the issue<br>

>     being that<br>

>     >     unnecessary instructions are generated in the final assembly<br>

>     (with -O3).<br>

>     ><br>

>     >     I want to create the following assembly snippet:<br>

>     ><br>

>     >       mov    dl,BYTE PTR [rsi+rdi*1]<br>

>     >       add    dl,0x1<br>

>     >       adc    dl,0x0<br>

>     >       mov    BYTE PTR [rsi+rdi*1],dl<br>

>     ><br>

>     >     however what is created is (variant #1):<br>

>     ><br>

>     >       mov    dl,BYTE PTR [rsi+rdx*1]<br>

>     >       add    dl,0x1<br>

>     >       cmp    dl,0x1 // <- not needed<br>

>     >       adc    dl,0x0<br>

>     >       mov    BYTE PTR [rsi+rdi*1],dl<br>

>     ><br>

>     >     in variant #2 the following is created:<br>

>     ><br>

>     >      mov    dl,BYTE PTR [rsi+rdi*1]<br>

>     >      mov    ecx,edx // <- not needed<br>

>     >      add    cl,0x1  // <- should be done to dl instead<br>

>     >      adc    dl,0x1<br>

>     >      mov    BYTE PTR [rsi+rdi*1],dl<br>

>     ><br>

>     >     Far below are both variants with the full code around it,<br>

>     however the<br>

>     >     difference in both variants is this:<br>

>     ><br>

>     >     //variant1<br>

>     >     auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));<br>

>     >     Incr = IRB.CreateAdd(Incr, cf);<br>

>     ><br>

>     >     //variant2<br>

>     >     auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));<br>

>     >     Incr = IRB.CreateAdd(Incr, cf);<br>

>     ><br>

>     >     //interestingly this totally different approach creates the same<br>

>     >     instructions as variant2<br>

>     >     CallInst *AddOv =<br>

>     IRB.CreateBinaryIntrinsic(Intrinsic::uadd_with_over<br>

>     >     flow, Counter, ConstantInt::get(Int8Ty, 1));<br>

>     >     AddOv->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,<br>

>     None));<br>

>     >     Value *SumWithOverflowBit = AddOv;<br>

>     >     Incr =<br>

>     IRB.CreateAdd(IRB.CreateExtractValue(SumWithOverflowBit, 0),<br>

>     >     IRB.CreateZExt(IRB.CreateExtractValue(SumWithOverflowBit, 1),<br>

>     Int8Ty));<br>

>     ><br>

>     >     So - How do I have to write the code that the target code has<br>

>     a chance<br>

>     >     of being generated?<br>

>     >     For me its the same result on LLVM 6.0 and 7.<br>

>     ><br>

>     >     Alternatively<br>

>     >       add    BYTE PTR [rsi+rdi*1],0x1<br>

>     >       adc    BYTE PTR [rsi+rdi*1],0x0<br>

>     >      would work as well.<br>

>     ><br>

>     >     Thank you very much!<br>

>     ><br>

>     >     Regards,<br>

>     >     Marc<br>

>     ><br>

>     >     -> Below now is the full llvm IR pass code<br>

>     ><br>

>     >     // vvv same code before both variants<br>

>     >     LoadInst *PrevLoc = IRB.CreateLoad(AFLPrevLoc);<br>

>     >     PrevLoc->setMetadata(M.getMDKindID("nosanitize"),<br>

>     MDNode::get(C, None));<br>

>     >     Value *PrevLocCasted = IRB.CreateZExt(PrevLoc, IRB.getInt32Ty());<br>

>     ><br>

>     >      LoadInst *MapPtr = IRB.CreateLoad(AFLMapPtr);<br>

>     >     MapPtr->setMetadata(M.getMDKindID("nosanitize"),<br>

>     MDNode::get(C, None));<br>

>     >     Value *MapPtrIdx = IRB.CreateGEP(MapPtr,<br>

>     IRB.CreateXor(PrevLocCasted,<br>

>     >     CurLoc));<br>

>     ><br>

>     >     LoadInst *Counter = IRB.CreateLoad(MapPtrIdx);<br>

>     >     Counter->setMetadata(M.getMDKindID("nosanitize"),<br>

>     MDNode::get(C, None));<br>

>     ><br>

>     >     Value *Incr = IRB.CreateAdd(Counter, ConstantInt::get(Int8Ty, 1));<br>

>     >     // ^^^ same code before both variants<br>

>     ><br>

>     >     // CODE FOR VARIANT #1<br>

>     >     auto cf = IRB.CreateICmpEQ(Incr, ConstantInt::get(Int8Ty, 0));<br>

>     >     Incr = IRB.CreateAdd(Incr, cf);<br>

>     ><br>

>     >     // CODE FOR VARIANT #2<br>

>     >     auto cf = IRB.CreateICmpULT(Incr, ConstantInt::get(Int8Ty, 1));<br>

>     >     Incr = IRB.CreateAdd(Incr, cf);<br>

>     ><br>

>     >     // vvv same code after both variants<br>

>     >     IRB.CreateStore(Incr,<br>

>     >     MapPtrIdx)->setMetadata(M.getMDKindID("nosanitize"),<br>

>     MDNode::get(C,<br>

>     >     None));<br>

>     >     StoreInst *Store = IRB.CreateStore(ConstantInt::get(Int32Ty,<br>

>     cur_loc >><br>

>     >     1), AFLPrevLoc);<br>

>     >     Store->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C,<br>

>     None));<br>

>     >     // ^^^ same code after both variants<br>

>     ><br>

>     ><br>

>     >     --<br>

>     >     Marc Heuse<br>

>     >     <a href="http://www.mh-sec.de" rel="noreferrer" target="_blank">www.mh-sec.de</a> <<a href="http://www.mh-sec.de" rel="noreferrer" target="_blank">http://www.mh-sec.de</a>> <<a href="http://www.mh-sec.de" rel="noreferrer" target="_blank">http://www.mh-sec.de</a>><br>

>     ><br>

>     >     PGP: AF3D 1D4C D810 F0BB 977D  3807 C7EE D0A0 6BE9 F573<br>

>     >     _______________________________________________<br>

>     >     LLVM Developers mailing list<br>

>     >     <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>

>     <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> <mailto:<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>><br>

>     >     <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

>     ><br>

</blockquote></div>