[llvm] 188a7d6 - Add alloca size threshold for StackTagging initializer merging.

Mon Oct 19 13:44:26 PDT 2020

Author: Evgenii Stepanov
Date: 2020-10-19T13:44:07-07:00
New Revision: 188a7d671019247932b7242c7794960ca1986b5a

URL: https://github.com/llvm/llvm-project/commit/188a7d671019247932b7242c7794960ca1986b5a
DIFF: https://github.com/llvm/llvm-project/commit/188a7d671019247932b7242c7794960ca1986b5a.diff

LOG: Add alloca size threshold for StackTagging initializer merging.

Summary:
Initializer merging generates fairly inefficient code for large allocas,
and that code also happens to trigger an exponential algorithm somewhere
in the Machine Instruction Scheduler.
See https://bugs.llvm.org/show_bug.cgi?id=47867.

This change adds an upper limit on the alloca size for which initializer
merging is attempted. The default limit is chosen so that the worst-case
size of memtag-generated code is similar to the non-memtag case. Because
of ISA quirks, that worst case is reached at a different alloca size:
e.g. memset inlining triggers at sizes below 512, but stack tagging
instructions are 2x shorter, so the limit is approx. 256.
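
As a rough illustration of the size gate described above, the following
standalone sketch mirrors the condition added to tagAlloca in the diff
below. The helper name shouldMergeInitializers and the constant
kMergeInitSizeLimit are hypothetical and exist only for this example;
the value 272 is the cl::init default of
-stack-tagging-merge-init-size-limit taken from the diff.

```cpp
#include <cstdint>

// Hypothetical stand-in for the default of the new hidden option
// -stack-tagging-merge-init-size-limit (cl::init(272) in the diff).
constexpr std::uint64_t kMergeInitSizeLimit = 272;

// Hypothetical helper that mirrors the updated condition in tagAlloca:
// initializer merging is attempted only when it is enabled, the function
// is not optnone, the target is little-endian, and the alloca size in
// bytes is strictly below the limit.
bool shouldMergeInitializers(bool mergeInit, bool hasOptNone,
                             bool littleEndian, std::uint64_t sizeInBytes,
                             std::uint64_t limit = kMergeInitSizeLimit) {
  return mergeInit && !hasOptNone && littleEndian && sizeInBytes < limit;
}
```

Under these assumptions, a 256-byte alloca is still eligible for
merging, while the 1024-byte alloca in the new LargeAlloca test is not,
which matches the test expecting a plain settag followed by the
original memset.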

We could try harder to emit more compact code with initializer merging,
but that would only affect large, sparsely initialized allocas, and
those are doing fine already.

Reviewers: vitalybuka, pcc

Subscribers: llvm-commits

Added: 
    

Modified: 
    llvm/lib/Target/AArch64/AArch64StackTagging.cpp
    llvm/test/CodeGen/AArch64/stack-tagging-initializer-merge.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AArch64/AArch64StackTagging.cpp b/llvm/lib/Target/AArch64/AArch64StackTagging.cpp
index 1b8c6417be38..ac25a0aeb41b 100644

--- a/llvm/lib/Target/AArch64/AArch64StackTagging.cpp
+++ b/llvm/lib/Target/AArch64/AArch64StackTagging.cpp
@@ -73,6 +73,10 @@ static cl::opt<bool>
 static cl::opt<unsigned> ClScanLimit("stack-tagging-merge-init-scan-limit",
                                      cl::init(40), cl::Hidden);
 
+static cl::opt<unsigned>
+    ClMergeInitSizeLimit("stack-tagging-merge-init-size-limit", cl::init(272),
+                         cl::Hidden);
+
 static const Align kTagGranuleSize = Align(16);
 
 namespace {
@@ -434,7 +438,8 @@ void AArch64StackTagging::tagAlloca(AllocaInst *AI, Instruction *InsertBefore,
   bool LittleEndian =
       Triple(AI->getModule()->getTargetTriple()).isLittleEndian();
   // Current implementation of initializer merging assumes little endianness.
-  if (MergeInit && !F->hasOptNone() && LittleEndian) {
+  if (MergeInit && !F->hasOptNone() && LittleEndian &&
+      Size < ClMergeInitSizeLimit) {
     LLVM_DEBUG(dbgs() << "collecting initializers for " << *AI
                       << ", size = " << Size << "\n");
     InsertBefore = collectInitializers(InsertBefore, Ptr, Size, IB);

diff  --git a/llvm/test/CodeGen/AArch64/stack-tagging-initializer-merge.ll b/llvm/test/CodeGen/AArch64/stack-tagging-initializer-merge.ll
index 9dc08c192a01..f6081926743f 100644
--- a/llvm/test/CodeGen/AArch64/stack-tagging-initializer-merge.ll
+++ b/llvm/test/CodeGen/AArch64/stack-tagging-initializer-merge.ll
@@ -306,3 +306,17 @@ entry:
 ; CHECK:  call void @llvm.aarch64.stgp(i8* {{.*}}, i64 46360584388608, i64 0)
 ; CHECK:  call void @llvm.aarch64.stgp(i8* {{.*}}, i64 0, i64 3038287259199220266)
 ; CHECK:  ret void
+
+define void @LargeAlloca() sanitize_memtag {
+entry:
+  %x = alloca i32, i32 256, align 16
+  %0 = bitcast i32* %x to i8*
+  call void @llvm.memset.p0i8.i64(i8* nonnull align 16 %0, i8 42, i64 256, i1 false)
+  call void @use(i8* nonnull %0)
+  ret void
+}
+
+; CHECK-LABEL: define void @LargeAlloca(
+; CHECK:  call void @llvm.aarch64.settag(i8* {{.*}}, i64 1024)
+; CHECK:  call void @llvm.memset.p0i8.i64(i8* {{.*}}, i8 42, i64 256,
+; CHECK:  ret void