<div dir="ltr">I'm going to revert r275571 to unblock us.<br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jul 18, 2016 at 5:21 PM, Jun Bum Lim via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Thanks for letting me know this. I will take a look.<br>
<br>
Best,<br>
Jun<br>
<div><div class="h5"><br>
-----Original Message-----<br>
From: Benjamin Kramer [mailto:<a href="mailto:benny.kra@gmail.com">benny.kra@gmail.com</a>]<br>
Sent: Sunday, July 17, 2016 10:15 AM<br>
To: Jun Bum Lim<br>
Cc: Chandler Carruth via llvm-commits<br>
Subject: Re: [llvm] r275571 - [DSE]Enhance shortening MemIntrinsic based on OverlapIntervals<br>
<br>
Looks like this caused <a href="https://llvm.org/bugs/show_bug.cgi?id=28588" rel="noreferrer" target="_blank">https://llvm.org/bugs/show_bug.cgi?id=28588</a>.<br>
Can you take a look?<br>
<br>
On Fri, Jul 15, 2016 at 6:14 PM, Jun Bum Lim via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br>
> Author: junbuml<br>
> Date: Fri Jul 15 11:14:34 2016<br>
> New Revision: 275571<br>
><br>
> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=275571&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=275571&view=rev</a><br>
> Log:<br>
> [DSE]Enhance shortening MemIntrinsic based on OverlapIntervals<br>
><br>
> Summary:<br>
> This change uses the overlap interval map built from partial overwrite tracking to shorten MemIntrinsics.<br>
> Add test cases for opportunities that were previously missed.<br>
><br>
> Reviewers: hfinkel, eeckstein, mcrosier<br>
><br>
> Subscribers: mcrosier, llvm-commits<br>
><br>
> Differential Revision: <a href="https://reviews.llvm.org/D21909" rel="noreferrer" target="_blank">https://reviews.llvm.org/D21909</a><br>
><br>
> Modified:<br>
>     llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp<br>
>     llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll<br>
>     llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll<br>
><br>
> Modified: llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp?rev=275571&r1=275570&r2=275571&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp?rev=275571&r1=275570&r2=275571&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp (original)<br>
</div></div>> +++ llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp Fri Jul 15 11:14:34 2016<br>
<span class="">> @@ -290,8 +290,8 @@ enum OverwriteResult {  };  }<br>
><br>
> -typedef DenseMap<Instruction *,<br>
> -                 std::map<int64_t, int64_t>> InstOverlapIntervalsTy;<br>
</span>> +typedef std::map<int64_t, int64_t> OverlapIntervalsTy;<br>
> +typedef DenseMap<Instruction *, OverlapIntervalsTy> InstOverlapIntervalsTy;<br>
<div><div class="h5">><br>
>  /// Return 'OverwriteComplete' if a store to the 'Later' location completely<br>
>  /// overwrites a store to the 'Earlier' location, 'OverwriteEnd' if the end of<br>
> @@ -438,9 +438,9 @@ static OverwriteResult isOverwrite(const<br>
>    //<br>
>    // In this case we may want to trim the size of earlier to avoid generating<br>
>    // writes to addresses which will definitely be overwritten later<br>
> -  if (LaterOff > EarlierOff &&<br>
> -      LaterOff < int64_t(EarlierOff + Earlier.Size) &&<br>
> -      int64_t(LaterOff + Later.Size) >= int64_t(EarlierOff + Earlier.Size))<br>
> +  if (!EnablePartialOverwriteTracking &&<br>
> +      (LaterOff > EarlierOff && LaterOff < int64_t(EarlierOff + Earlier.Size) &&<br>
> +       int64_t(LaterOff + Later.Size) >= int64_t(EarlierOff + Earlier.Size)))<br>
>      return OverwriteEnd;<br>
><br>
>    // Finally, we also need to check if the later store overwrites the beginning<br>
> @@ -452,9 +452,11 @@ static OverwriteResult isOverwrite(const<br>
>    // In this case we may want to move the destination address and trim the size<br>
>    // of earlier to avoid generating writes to addresses which will definitely<br>
>    // be overwritten later.<br>
> -  if (LaterOff <= EarlierOff && int64_t(LaterOff + Later.Size) > EarlierOff) {<br>
> -    assert (int64_t(LaterOff + Later.Size) < int64_t(EarlierOff + Earlier.Size)<br>
> -            && "Expect to be handled as OverwriteComplete" );<br>
> +  if (!EnablePartialOverwriteTracking &&<br>
> +      (LaterOff <= EarlierOff && int64_t(LaterOff + Later.Size) > EarlierOff)) {<br>
> +    assert(int64_t(LaterOff + Later.Size) <<br>
> +               int64_t(EarlierOff + Earlier.Size) &&<br>
> +           "Expect to be handled as OverwriteComplete");<br>
>      return OverwriteBegin;<br>
>    }<br>
>    // Otherwise, they don't completely overlap.<br>
> @@ -819,6 +821,119 @@ static bool handleEndBlock(BasicBlock &B<br>
>    return MadeChange;<br>
>  }<br>
><br>
> +static bool tryToShorten(Instruction *EarlierWrite, int64_t &EarlierOffset,<br>
> +                         int64_t &EarlierSize, int64_t LaterOffset,<br>
> +                         int64_t LaterSize, bool IsOverwriteEnd) {<br>
> +  // TODO: base this on the target vector size so that if the earlier<br>
> +  // store was too small to get vector writes anyway then its likely<br>
> +  // a good idea to shorten it<br>
> +  // Power of 2 vector writes are probably always a bad idea to optimize<br>
</div></div><span class="">> +  // as any store/memset/memcpy is likely using vector instructions so<br>
</span><span class="">> +  // shortening it to not vector size is likely to be slower<br>
> +  MemIntrinsic *EarlierIntrinsic = cast<MemIntrinsic>(EarlierWrite);<br>
> +  unsigned EarlierWriteAlign = EarlierIntrinsic->getAlignment();<br>
> +  if (!IsOverwriteEnd)<br>
> +    LaterOffset = int64_t(LaterOffset + LaterSize);<br>
> +<br>
> +  if (!(llvm::isPowerOf2_64(LaterOffset) && EarlierWriteAlign <= LaterOffset) &&<br>
> +      !((EarlierWriteAlign != 0) && LaterOffset % EarlierWriteAlign == 0))<br>
> +    return false;<br>
> +<br>
> +  DEBUG(dbgs() << "DSE: Remove Dead Store:\n  OW "<br>
> +               << (IsOverwriteEnd ? "END" : "BEGIN") << ": " << *EarlierWrite<br>
> +               << "\n  KILLER (offset " << LaterOffset << ", " << EarlierSize<br>
> +               << ")\n");<br>
> +<br>
> +  int64_t NewLength = IsOverwriteEnd<br>
> +                          ? LaterOffset - EarlierOffset<br>
</span>> +                          : EarlierSize - (LaterOffset - EarlierOffset);<br>
> +<br>
> +  Value *EarlierWriteLength = EarlierIntrinsic->getLength();<br>
> +  Value *TrimmedLength =<br>
<span class="">> +      ConstantInt::get(EarlierWriteLength->getType(), NewLength);<br>
> +  EarlierIntrinsic->setLength(TrimmedLength);<br>
> +<br>
> +  EarlierSize = NewLength;<br>
> +  if (!IsOverwriteEnd) {<br>
> +    int64_t OffsetMoved = (LaterOffset - EarlierOffset);<br>
> +    Value *Indices[1] = {<br>
> +        ConstantInt::get(EarlierWriteLength->getType(), OffsetMoved)};<br>
> +    GetElementPtrInst *NewDestGEP = GetElementPtrInst::CreateInBounds(<br>
> +        EarlierIntrinsic->getRawDest(), Indices, "", EarlierWrite);<br>
> +    EarlierIntrinsic->setDest(NewDestGEP);<br>
> +    EarlierOffset = EarlierOffset + OffsetMoved;<br>
> +  }<br>
> +  return true;<br>
> +}<br>
> +<br>
> +static bool tryToShortenEnd(Instruction *EarlierWrite,<br>
> +                            OverlapIntervalsTy &IntervalMap,<br>
</span>> +                            int64_t &EarlierStart, int64_t &EarlierSize) {<br>
<span class="">> +  if (IntervalMap.empty() || !isShortenableAtTheEnd(EarlierWrite))<br>
> +    return false;<br>
> +<br>
</span>> +  OverlapIntervalsTy::iterator OII = --IntervalMap.end();<br>
> +  int64_t LaterStart = OII->second;<br>
> +  int64_t LaterSize = OII->first - LaterStart;<br>
<span class="">> +<br>
> +  if (LaterStart > EarlierStart && LaterStart < EarlierStart + EarlierSize &&<br>
> +      LaterStart + LaterSize >= EarlierStart + EarlierSize) {<br>
> +    if (tryToShorten(EarlierWrite, EarlierStart, EarlierSize, LaterStart,<br>
> +                     LaterSize, true)) {<br>
> +      IntervalMap.erase(OII);<br>
> +      return true;<br>
> +    }<br>
> +  }<br>
> +  return false;<br>
> +}<br>
> +<br>
> +static bool tryToShortenBegin(Instruction *EarlierWrite,<br>
> +                              OverlapIntervalsTy &IntervalMap,<br>
</span>> +                              int64_t &EarlierStart, int64_t &EarlierSize) {<br>
<span class="">> +  if (IntervalMap.empty() || !isShortenableAtTheBeginning(EarlierWrite))<br>
> +    return false;<br>
> +<br>
</span>> +  OverlapIntervalsTy::iterator OII = IntervalMap.begin();<br>
> +  int64_t LaterStart = OII->second;<br>
> +  int64_t LaterSize = OII->first - LaterStart;<br>
<span class="">> +<br>
> +  if (LaterStart <= EarlierStart && LaterStart + LaterSize > EarlierStart) {<br>
> +    assert(LaterStart + LaterSize < EarlierStart + EarlierSize &&<br>
> +           "Should have been handled as OverwriteComplete");<br>
> +    if (tryToShorten(EarlierWrite, EarlierStart, EarlierSize, LaterStart,<br>
> +                     LaterSize, false)) {<br>
> +      IntervalMap.erase(OII);<br>
> +      return true;<br>
> +    }<br>
> +  }<br>
> +  return false;<br>
> +}<br>
> +<br>
> +static bool removePartiallyOverlappedStores(AliasAnalysis *AA,<br>
> +                                            const DataLayout &DL,<br>
</span>> +                                            InstOverlapIntervalsTy &IOL) {<br>
<span class="">> +  bool Changed = false;<br>
> +  for (auto OI : IOL) {<br>
> +    Instruction *EarlierWrite = OI.first;<br>
> +    MemoryLocation Loc = getLocForWrite(EarlierWrite, *AA);<br>
> +    assert(isRemovable(EarlierWrite) && "Expect only removable instruction");<br>
</span>> +    assert(Loc.Size != MemoryLocation::UnknownSize && "Unexpected mem loc");<br>
<span class="">> +<br>
> +    const Value *Ptr = Loc.Ptr->stripPointerCasts();<br>
> +    int64_t EarlierStart = 0;<br>
> +    int64_t EarlierSize = int64_t(Loc.Size);<br>
> +    GetPointerBaseWithConstantOffset(Ptr, EarlierStart, DL);<br>
> +    OverlapIntervalsTy &IntervalMap = OI.second;<br>
> +    Changed |=<br>
> +        tryToShortenEnd(EarlierWrite, IntervalMap, EarlierStart, EarlierSize);<br>
> +    if (IntervalMap.empty())<br>
> +      continue;<br>
> +    Changed |=<br>
</span>> +        tryToShortenBegin(EarlierWrite, IntervalMap, EarlierStart, EarlierSize);<br>
<div><div class="h5">> +  }<br>
> +  return Changed;<br>
> +}<br>
> +<br>
>  static bool eliminateNoopStore(Instruction *Inst, BasicBlock::iterator &BBI,<br>
>                                 AliasAnalysis *AA, MemoryDependenceResults *MD,<br>
>                                 const DataLayout &DL,<br>
> @@ -936,7 +1051,7 @@ static bool eliminateDeadStores(BasicBlo<br>
>          if (OR == OverwriteComplete) {<br>
>            DEBUG(dbgs() << "DSE: Remove Dead Store:\n  DEAD: "<br>
>                  << *DepWrite << "\n  KILLER: " << *Inst << '\n');<br>
> -<br>
> +          IOL.erase(DepWrite);<br>
>            // Delete the store and now-dead instructions that feed it.<br>
>            deleteDeadInstruction(DepWrite, &BBI, *MD, *TLI);<br>
>            ++NumFastStores;<br>
> @@ -948,48 +1063,14 @@ static bool eliminateDeadStores(BasicBlo<br>
>          } else if ((OR == OverwriteEnd && isShortenableAtTheEnd(DepWrite)) ||<br>
>                     ((OR == OverwriteBegin &&<br>
>                       isShortenableAtTheBeginning(DepWrite)))) {<br>
> -          // TODO: base this on the target vector size so that if the earlier<br>
> -          // store was too small to get vector writes anyway then its likely<br>
> -          // a good idea to shorten it<br>
> -          // Power of 2 vector writes are probably always a bad idea to optimize<br>
> -          // as any store/memset/memcpy is likely using vector instructions so<br>
> -          // shortening it to not vector size is likely to be slower<br>
> -          MemIntrinsic *DepIntrinsic = cast<MemIntrinsic>(DepWrite);<br>
> -          unsigned DepWriteAlign = DepIntrinsic->getAlignment();<br>
> +          assert(!EnablePartialOverwriteTracking && "Do not expect to perform "<br>
> +                                                    "when partial-overwrite "<br>
> +                                                    "tracking is enabled");<br>
> +          int64_t EarlierSize = DepLoc.Size;<br>
> +          int64_t LaterSize = Loc.Size;<br>
>            bool IsOverwriteEnd = (OR == OverwriteEnd);<br>
> -          if (!IsOverwriteEnd)<br>
> -            InstWriteOffset = int64_t(InstWriteOffset + Loc.Size);<br>
> -<br>
> -          if ((llvm::isPowerOf2_64(InstWriteOffset) &&<br>
> -               DepWriteAlign <= InstWriteOffset) ||<br>
> -              ((DepWriteAlign != 0) && InstWriteOffset % DepWriteAlign == 0)) {<br>
> -<br>
> -            DEBUG(dbgs() << "DSE: Remove Dead Store:\n  OW "<br>
> -                         << (IsOverwriteEnd ? "END" : "BEGIN") << ": "<br>
> -                         << *DepWrite << "\n  KILLER (offset "<br>
> -                         << InstWriteOffset << ", " << DepLoc.Size << ")"<br>
> -                         << *Inst << '\n');<br>
> -<br>
> -            int64_t NewLength =<br>
> -                IsOverwriteEnd<br>
> -                    ? InstWriteOffset - DepWriteOffset<br>
> -                    : DepLoc.Size - (InstWriteOffset - DepWriteOffset);<br>
> -<br>
> -            Value *DepWriteLength = DepIntrinsic->getLength();<br>
> -            Value *TrimmedLength =<br>
> -                ConstantInt::get(DepWriteLength->getType(), NewLength);<br>
> -            DepIntrinsic->setLength(TrimmedLength);<br>
> -<br>
> -            if (!IsOverwriteEnd) {<br>
> -              int64_t OffsetMoved = (InstWriteOffset - DepWriteOffset);<br>
> -              Value *Indices[1] = {<br>
> -                  ConstantInt::get(DepWriteLength->getType(), OffsetMoved)};<br>
> -              GetElementPtrInst *NewDestGEP = GetElementPtrInst::CreateInBounds(<br>
> -                  DepIntrinsic->getRawDest(), Indices, "", DepWrite);<br>
> -              DepIntrinsic->setDest(NewDestGEP);<br>
> -            }<br>
> -            MadeChange = true;<br>
> -          }<br>
> +          MadeChange = tryToShorten(DepWrite, DepWriteOffset, EarlierSize,<br>
</div></div>> +                                    InstWriteOffset, LaterSize, IsOverwriteEnd);<br>
<span class="">>          }<br>
>        }<br>
><br>
> @@ -1012,6 +1093,9 @@ static bool eliminateDeadStores(BasicBlo<br>
>      }<br>
>    }<br>
><br>
> +  if (EnablePartialOverwriteTracking)<br>
> +    MadeChange |= removePartiallyOverlappedStores(AA, DL, IOL);<br>
> +<br>
>    // If this block ends in a return, unwind, or unreachable, all allocas are<br>
>    // dead at its end, which means stores to them are also dead.<br>
>    if (BB.getTerminator()->getNumSuccessors() == 0)<br>
><br>
> Modified: llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll?rev=275571&r1=275570&r2=275571&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll?rev=275571&r1=275570&r2=275571&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll (original)<br>
</span>> +++ llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll Fri Jul 15 11:14:34 2016<br>
<span class="">> @@ -86,5 +86,23 @@ entry:<br>
>    ret void<br>
>  }<br>
><br>
> +define void @write8To15AndThen0To7(i64* nocapture %P) {<br>
> +entry:<br>
</span>> +; CHECK-LABEL: @write8To15AndThen0To7(<br>
> +; CHECK: [[GEP:%[0-9]+]] = getelementptr inbounds i8, i8* %mybase0, i64 16<br>
> +; CHECK: tail call void @llvm.memset.p0i8.i64(i8* [[GEP]], i8 0, i64 16, i32 8, i1 false)<br>
<span class="">> +<br>
> +  %base0 = bitcast i64* %P to i8*<br>
</span>> +  %mybase0 = getelementptr inbounds i8, i8* %base0, i64 0<br>
> +  tail call void @llvm.memset.p0i8.i64(i8* %mybase0, i8 0, i64 32, i32 8, i1 false)<br>
<span class="">> +<br>
> +  %base64_0 = getelementptr inbounds i64, i64* %P, i64 0<br>
> +  %base64_1 = getelementptr inbounds i64, i64* %P, i64 1<br>
> +<br>
> +  store i64 1, i64* %base64_1<br>
> +  store i64 2, i64* %base64_0<br>
> +  ret void<br>
> +}<br>
> +<br>
>  declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) nounwind<br>
><br>
><br>
> Modified: llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll?rev=275571&r1=275570&r2=275571&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll?rev=275571&r1=275570&r2=275571&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll (original)<br>
</span>> +++ llvm/trunk/test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll Fri Jul 15 11:14:34 2016<br>
<span class="">> @@ -93,3 +93,20 @@ entry:<br>
>    store i64 3, i64* %tf_trapno, align 8<br>
>    ret void<br>
>  }<br>
> +<br>
</span>> +define void @write16To23AndThen24To31(i64* nocapture %P, i64 %n64, i32 %n32, i16 %n16, i8 %n8) {<br>
> +entry:<br>
> +; CHECK-LABEL: @write16To23AndThen24To31(<br>
> +; CHECK: tail call void @llvm.memset.p0i8.i64(i8* %mybase0, i8 0, i64 16, i32 8, i1 false)<br>
<span class="">> +<br>
> +  %base0 = bitcast i64* %P to i8*<br>
</span>> +  %mybase0 = getelementptr inbounds i8, i8* %base0, i64 0<br>
> +  tail call void @llvm.memset.p0i8.i64(i8* %mybase0, i8 0, i64 32, i32 8, i1 false)<br>
<div class="HOEnZb"><div class="h5">> +<br>
> +  %base64_2 = getelementptr inbounds i64, i64* %P, i64 2<br>
> +  %base64_3 = getelementptr inbounds i64, i64* %P, i64 3<br>
> +<br>
> +  store i64 3, i64* %base64_2<br>
> +  store i64 3, i64* %base64_3<br>
> +  ret void<br>
> +}<br>
><br>
><br>
> _______________________________________________<br>
> llvm-commits mailing list<br>
> <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
<br>
</div></div></blockquote></div><br></div></div>