[llvm-commits] Struct-handling patches

Sun Sep 28 12:04:54 PDT 2008

Hi all,

please find three patches attached. These patches fix some behaviour related
to struct transformations.

I was observing issues with struct alloca's not always being properly split up
by scalarrepl. Some other passes sometimes make changes that confuse
scalarrepl, preventing it from doing its work.

I don't have a clear-cut testcase for this, since the issues occured when
running our own ordering of LLVM passes. Using -std-compile-opts or llvm-gcc
mostly managed to mask thes bugs by doing transformations in a particular
order. I have include small testcases with each patch, though.

The first patch, preserve-struct.diff changes instcombine not to replace a
struct alloca by some other type in some cases. This is currently done if the
alloca is bitcasted to any type with a higher alignment (such as i32 and i64),
which greatly confuses scalarrepl. This patch is not without problems, since
it makes the llvm-gcc bug in PR2823 happen a bit more often. For more details
about this patch, see the thread at
http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-September/016870.html

The second patch, vector-gep.diff, allows scalarrepl to treat all GEP
instructions with all zero indices as bitcasts. Sometimes, all-zero bitcasts
are emitted where bitcasts would be conceptually more appropriate. Currently,
scalarrepl handles all-zero geps only when there are no vector types involved.
This patch removes this limitation.

The last patch, firstclass.diff, concerns instcombine. Instcombine can replace
memcpy calls for small values with a load and a store. This defaults to using
an integer type of appropriate size to load and store, unless a single value
type is copied (ie, a double, or a struct containing just a double, etc.). The
patch changes this to any first class type, since all first class types can be
loaded and stored directly. In particular, a memcpy of a small struct is now
replaced by a load and a store of the right struct type, instead of casting it
to integer first (confusing scalarrepl more).

This last patch is not completely useful by itself yet, since scalarrepl
doesn't know how to handle struct loads and stores yet :-) I'll have a look at
that tomorrow.


I'd like to have some feedback on these patches. In particular, I'm not
completely sure that the policies these patches implement are the right way to
go. In general, these patches make structs be handled as structs as often as
possible, which seems like a good idea IMHO (keeping as much type information
correct as long as possible).

Gr.

Matthijs
-------------- next part --------------
Index: test/Transforms/ScalarRepl/2008-09-22-vector-gep.ll
===================================================================

--- test/Transforms/ScalarRepl/2008-09-22-vector-gep.ll	(revision 0)
+++ test/Transforms/ScalarRepl/2008-09-22-vector-gep.ll	(revision 0)
@@ -0,0 +1,24 @@
+; This test checks to see if scalarrepl also works when a gep with all zeroes is
+; used instead of a bitcast to prepare a memmove pointer argument. Previously,
+; this would not work when there was a vector involved in the struct, preventing
+; scalarrepl from removing the alloca below.
+
+; RUN: llvm-as < %s | opt -scalarrepl | llvm-dis > %t
+; RUN: cat %t | not grep alloca
+
+%struct.two = type <{ < 2 x i8 >, i16 }>
+
+define void @main(%struct.two* %D, i16 %V) {
+entry:
+	%S = alloca %struct.two
+        %S.2 = getelementptr %struct.two* %S, i32 0, i32 1
+        store i16 %V, i16* %S.2
+        ; This gep is effectively a bitcast to i8*, but is sometimes generated
+        ; because the type of the first element in %struct.two is i8.
+	%tmpS = getelementptr %struct.two* %S, i32 0, i32 0, i32 0 
+	%tmpD = bitcast %struct.two* %D to i8*
+        call void @llvm.memmove.i32(i8* %tmpD, i8* %tmpS, i32 4, i32 1)
+        ret void
+}
+
+declare void @llvm.memmove.i32(i8*, i8*, i32, i32) nounwind
Index: lib/Transforms/Scalar/ScalarReplAggregates.cpp
===================================================================
--- lib/Transforms/Scalar/ScalarReplAggregates.cpp	(revision 56433)
+++ lib/Transforms/Scalar/ScalarReplAggregates.cpp	(working copy)
@@ -530,8 +530,9 @@
       return MarkUnsafe(Info);
     }
   }
+ 
+  bool hasVector = false;
   
-  
   // Walk through the GEP type indices, checking the types that this indexes
   // into.
   for (; I != E; ++I) {
@@ -539,22 +540,29 @@
     if (isa<StructType>(*I))
       continue;
     
-    // Don't SROA pointers into vectors.
-    if (isa<VectorType>(*I))
-      return MarkUnsafe(Info);
-    
-    // Otherwise, we must have an index into an array type.  Verify that this is
-    // an in-range constant integer.  Specifically, consider A[0][i].  We
-    // cannot know that the user isn't doing invalid things like allowing i to
-    // index an out-of-range subscript that accesses A[1].  Because of this, we
-    // have to reject SROA of any accesses into structs where any of the
-    // components are variables.
     ConstantInt *IdxVal = dyn_cast<ConstantInt>(I.getOperand());
     if (!IdxVal) return MarkUnsafe(Info);
-    if (IdxVal->getZExtValue() >= cast<ArrayType>(*I)->getNumElements())
+
+    // Are all indices still zero?
+    IsAllZeroIndices &= IdxVal->isZero();
+    
+    if (isa<ArrayType>(*I)) {
+      // Otherwise, we must have an index into an array type.  Verify that this is
+      // an in-range constant integer.  Specifically, consider A[0][i].  We
+      // cannot know that the user isn't doing invalid things like allowing i to
+      // index an out-of-range subscript that accesses A[1].  Because of this, we
+      // have to reject SROA of any accesses into structs where any of the
+      // components are variables.
+      if (IdxVal->getZExtValue() >= cast<ArrayType>(*I)->getNumElements())
+        return MarkUnsafe(Info);
+    }
+  
+    // Note if we've seen a vector type yet
+    hasVector |= isa<VectorType>(*I);
+    
+    // Don't SROA pointers into vectors, unless all indices are zero.
+    if (hasVector && !IsAllZeroIndices)
       return MarkUnsafe(Info);
-    
-    IsAllZeroIndices &= IdxVal->isZero();
   }
   
   // If there are any non-simple uses of this getelementptr, make sure to reject
@@ -661,6 +669,11 @@
       // It is likely that OtherPtr is a bitcast, if so, remove it.
       if (BitCastInst *BC = dyn_cast<BitCastInst>(OtherPtr))
         OtherPtr = BC->getOperand(0);
+      // All zero GEPs are effectively casts
+      if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(OtherPtr))
+        if (GEP->hasAllZeroIndices())
+          OtherPtr = GEP->getOperand(0);
+        
       if (ConstantExpr *BCE = dyn_cast<ConstantExpr>(OtherPtr))
         if (BCE->getOpcode() == Instruction::BitCast)
           OtherPtr = BCE->getOperand(0);
-------------- next part --------------
Index: test/Transforms/InstCombine/2008-09-22-preserve-struct-alloca.ll
===================================================================
--- test/Transforms/InstCombine/2008-09-22-preserve-struct-alloca.ll	(revision 0)
+++ test/Transforms/InstCombine/2008-09-22-preserve-struct-alloca.ll	(revision 0)
@@ -0,0 +1,27 @@
+; This checks if instcombine leaves struct alloca's intact. It used to change
+; the below alloca to an alloca i32, which doesn't really make sense (and can in
+; a lot of cases confuse other passes such as scalarrepl.
+; Now, instcombine only touches struct alloca's when all uses are bitcasts to
+; the same type.
+
+; RUN: llvm-as < %s | opt -instcombine | llvm-dis > %t
+; RUN: cat %t | grep {alloca .struct.two}
+
+%struct.two = type <{ i16, i16 }>
+
+; Dummy function to give %S another use below.
+declare void @fill(%struct.two*)
+
+define void @main(%struct.two* %D) {
+entry:
+	%S = alloca %struct.two
+        call void @fill (%struct.two* %S)
+        ; Copy %S to %D byt loading and storing an i32. This is written this way
+        ; to lure instcombine into replacing the alloca above with an alloca
+        ; i32.
+	%tmpS = bitcast %struct.two* %S to i32*
+	%tmpD = bitcast %struct.two* %D to i32*
+	%val = load i32* %tmpS	
+	store i32 %val, i32* %tmpD
+        ret void
+}
Index: lib/Transforms/Scalar/InstructionCombining.cpp
===================================================================
--- lib/Transforms/Scalar/InstructionCombining.cpp	(revision 56433)
+++ lib/Transforms/Scalar/InstructionCombining.cpp	(working copy)
@@ -6946,6 +6951,17 @@
   // increasing the alignment of the resultant allocation.  If we keep it the
   // same, we open the door to infinite loops of various kinds.
   if (!AI.hasOneUse() && CastElTyAlign == AllocElTyAlign) return 0;
+  
+  // If the allocation has a struct type and has other uses than casts to
+  // PTy, don't change the alloca. Changing it would prevent passes like
+  // scalarrepl from doing their work later on.
+  if (!AI.hasOneUse() && isa<StructType>(AllocElTy))
+    for (Value::use_iterator UI = AI.use_begin(), UE = AI.use_end();
+         UI != UE; ++UI) {
+      CastInst *CI = dyn_cast<CastInst>(UI);
+      if (!CI || CI->getDestTy() != PTy)
+        return 0;
+    }
 
   uint64_t AllocElTySize = TD->getABITypeSize(AllocElTy);
   uint64_t CastElTySize = TD->getABITypeSize(CastElTy);
-------------- next part --------------
Index: test/Transforms/InstCombine/2008-09-22-small-struct-memmove.ll
===================================================================
--- test/Transforms/InstCombine/2008-09-22-small-struct-memmove.ll	(revision 0)
+++ test/Transforms/InstCombine/2008-09-22-small-struct-memmove.ll	(revision 0)
@@ -0,0 +1,23 @@
+; This checks if instcombine replaces small struct memmoves with loads and
+; stores of a struct, instead of using a equally sized integer.
+
+; RUN: llvm-as < %s | opt -instcombine | llvm-dis > %t
+; Use . instead of % in grep, since %s in %struct gets replaced...
+; RUN: cat %t | grep {load .struct.two*}
+
+%struct.two = type <{ i16, i16 }>
+
+; Dummy function to give %S another use below.
+declare void @fill(%struct.two*)
+
+define void @main(%struct.two* %D) {
+entry:
+	%S = alloca %struct.two
+        call void @fill (%struct.two* %S)
+	%tmpS = bitcast %struct.two* %S to i8*
+	%tmpD = bitcast %struct.two* %D to i8*
+        call void @llvm.memmove.i32(i8* %tmpD, i8* %tmpS, i32 4, i32 1)
+        ret void
+}
+
+declare void @llvm.memmove.i32(i8*, i8*, i32, i32) nounwind
Index: lib/Transforms/Scalar/InstructionCombining.cpp
===================================================================
--- lib/Transforms/Scalar/InstructionCombining.cpp	(revision 56433)
+++ lib/Transforms/Scalar/InstructionCombining.cpp	(working copy)
@@ -8707,7 +8800,7 @@
           break;
       }
       
-      if (SrcETy->isSingleValueType())
+      if (SrcETy->isFirstClassType())
         NewPtrTy = PointerType::getUnqual(SrcETy);
     }
   }