[llvm] r263121 - [SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out

Chandler Carruth via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 10 07:31:17 PST 2016


Author: chandlerc
Date: Thu Mar 10 09:31:17 2016
New Revision: 263121

URL: http://llvm.org/viewvc/llvm-project?rev=263121&view=rev
Log:
[SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out
of, and I misdiagnosed for months and months.

Andrea has had a patch for this forever, but I just couldn't see how
it was fixing the root cause of the problem. It didn't make sense to me,
even though the patch was perfectly good and the analysis of the actual
failure event was *fantastic*.

Well, I came back to it today because the patch has sat for *far* too
long and needs attention and decided I wouldn't let it go until I really
understood what was going on. After quite some time in the debugger,
I finally realized that in fact I had just missed an important case with
my previous attempt to fix PR22093 in r225149. Not only do we need to
handle loads that won't be split, but stores-of-loads that we won't
split. We *do* actually have enough logic in the presplitting to form
new slices for split stores.... *unless* we decided not to split them!

I'm so sorry that it took me this long to come to the realization that
this is the issue. It seems so obvious in hind sight (of course).
Anyways, the fix becomes *much* smaller and more focused. The fact that
we're left doing integer smashing is related to the FIXME in my original
commit: fundamentally, we're not aggressive about pre-splitting for
loads and stores to the same alloca. If we want to get aggressive about
this, it'll need both what Andrea had put into the proposed fix, but
also a *lot* more logic to essentially iteratively pre-split the alloca
until we can't do any more. As I said in that commit log, its really
unclear that this is the right call. Instead, the integer blending and
letting targets lower this to narrower stores seems slightly better. But
we definitely shouldn't really go down that path just to fix this bug.

Again, tons of thanks are owed to Andrea and others at Sony for working
on this bug. I really should have seen what was going on here and
re-directed them sooner. =////

Modified:
    llvm/trunk/lib/Transforms/Scalar/SROA.cpp
    llvm/trunk/test/Transforms/SROA/basictest.ll

Modified: llvm/trunk/lib/Transforms/Scalar/SROA.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SROA.cpp?rev=263121&r1=263120&r2=263121&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/SROA.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/SROA.cpp Thu Mar 10 09:31:17 2016
@@ -3372,11 +3372,15 @@ bool SROA::presplitLoadsAndStores(Alloca
   for (auto &P : AS.partitions()) {
     for (Slice &S : P) {
       Instruction *I = cast<Instruction>(S.getUse()->getUser());
-      if (!S.isSplittable() ||S.endOffset() <= P.endOffset()) {
-        // If this was a load we have to track that it can't participate in any
-        // pre-splitting!
+      if (!S.isSplittable() || S.endOffset() <= P.endOffset()) {
+        // If this is a load we have to track that it can't participate in any
+        // pre-splitting. If this is a store of a load we have to track that
+        // that load also can't participate in any pre-splitting.
         if (auto *LI = dyn_cast<LoadInst>(I))
           UnsplittableLoads.insert(LI);
+        else if (auto *SI = dyn_cast<StoreInst>(I))
+          if (auto *LI = dyn_cast<LoadInst>(SI->getValueOperand()))
+            UnsplittableLoads.insert(LI);
         continue;
       }
       assert(P.endOffset() > S.beginOffset() &&

Modified: llvm/trunk/test/Transforms/SROA/basictest.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SROA/basictest.ll?rev=263121&r1=263120&r2=263121&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/SROA/basictest.ll (original)
+++ llvm/trunk/test/Transforms/SROA/basictest.ll Thu Mar 10 09:31:17 2016
@@ -1633,3 +1633,39 @@ entry:
   %load = load i16, i16* %bc2
   ret i16 %load
 }
+
+%struct.STest = type { %struct.SPos, %struct.SPos }
+%struct.SPos = type { float, float }
+
+define void @PR25873(%struct.STest* %outData) {
+; CHECK-LABEL: @PR25873(
+; CHECK: store i32 1123418112
+; CHECK: store i32 1139015680
+; CHECK: %[[HIZEXT:.*]] = zext i32 1139015680 to i64
+; CHECK: %[[HISHL:.*]] = shl i64 %[[HIZEXT]], 32
+; CHECK: %[[HIMASK:.*]] = and i64 undef, 4294967295
+; CHECK: %[[HIINSERT:.*]] = or i64 %[[HIMASK]], %[[HISHL]]
+; CHECK: %[[LOZEXT:.*]] = zext i32 1123418112 to i64
+; CHECK: %[[LOMASK:.*]] = and i64 %[[HIINSERT]], -4294967296
+; CHECK: %[[LOINSERT:.*]] = or i64 %[[LOMASK]], %[[LOZEXT]]
+; CHECK: store i64 %[[LOINSERT]]
+entry:
+  %tmpData = alloca %struct.STest, align 8
+  %0 = bitcast %struct.STest* %tmpData to i8*
+  call void @llvm.lifetime.start(i64 16, i8* %0)
+  %x = getelementptr inbounds %struct.STest, %struct.STest* %tmpData, i64 0, i32 0, i32 0
+  store float 1.230000e+02, float* %x, align 8
+  %y = getelementptr inbounds %struct.STest, %struct.STest* %tmpData, i64 0, i32 0, i32 1
+  store float 4.560000e+02, float* %y, align 4
+  %m_posB = getelementptr inbounds %struct.STest, %struct.STest* %tmpData, i64 0, i32 1
+  %1 = bitcast %struct.STest* %tmpData to i64*
+  %2 = bitcast %struct.SPos* %m_posB to i64*
+  %3 = load i64, i64* %1, align 8
+  store i64 %3, i64* %2, align 8
+  %4 = bitcast %struct.STest* %outData to i8*
+  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %4, i8* %0, i64 16, i32 4, i1 false)
+  call void @llvm.lifetime.end(i64 16, i8* %0)
+  ret void
+}
+
+declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind




More information about the llvm-commits mailing list