[PATCH] D11434: Allow value forwarding past release fences in EarlyCSE

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 26 15:36:13 PDT 2015


reames updated this revision to Diff 33260.
reames added a comment.

Adjust test comments to indicate where the negative tests are testing implementation rather than the memory model per review comments.


http://reviews.llvm.org/D11434

Files:
  lib/Transforms/Scalar/EarlyCSE.cpp
  test/Transforms/EarlyCSE/fence.ll

Index: test/Transforms/EarlyCSE/fence.ll
===================================================================
--- test/Transforms/EarlyCSE/fence.ll
+++ test/Transforms/EarlyCSE/fence.ll
@@ -1,4 +1,10 @@
 ; RUN: opt -S -early-cse < %s | FileCheck %s
+; NOTE: This file is testing the current implementation.  Some of
+; the transforms used as negative tests below would be legal, but 
+; only if reached through a chain of logic which EarlyCSE is incapable
+; of performing.  To say it differently, this file tests a conservative
+; version of the memory model.  If we want to extend EarlyCSE to be more
+; aggressive in the future, we may need to relax some of the negative tests.
 
 ; We can value forward across the fence since we can (semantically) 
 ; reorder the following load before the fence.
@@ -31,21 +37,28 @@
 ; We can not value forward across an acquire barrier since we might
 ; be syncronizing with another thread storing to the same variable
 ; followed by a release fence.  If this thread observed the release 
-; had happened, we must present a consistent view of memory at the fence.
+; had happened, we must present a consistent view of memory at the
+; fence.  Note that it would be legal to reorder '%a' after the fence
+; and then remove '%a2'.  The current implementation doesn't know how
+; to do this, but if it learned, this test will need revised.
 define i32 @test3(i32* noalias %addr.i, i32* noalias %otheraddr) {
 ; CHECK-LABEL: @test3
 ; CHECK: load
 ; CHECK: fence
 ; CHECK: load
+; CHECK: sub
 ; CHECK: ret
   %a = load i32, i32* %addr.i, align 4
   fence acquire
   %a2 = load i32, i32* %addr.i, align 4
   %res = sub i32 %a, %a2
-  ret i32 %a
+  ret i32 %res
 }
 
-; We can NOT dead store eliminate accross the fence
+; We can not dead store eliminate accross the fence.  We could in 
+; principal reorder the second store above the fence and then DSE either
+; store, but this is beyond the simple last-store DSE which EarlyCSE
+; implements.
 define void @test4(i32* %addr.i) {
 ; CHECK-LABEL: @test4
 ; CHECK: store
Index: lib/Transforms/Scalar/EarlyCSE.cpp
===================================================================
--- lib/Transforms/Scalar/EarlyCSE.cpp
+++ lib/Transforms/Scalar/EarlyCSE.cpp
@@ -613,6 +613,17 @@
       continue;
     }
 
+    // A release fence requires that all stores complete before it, but does
+    // not prevent the reordering of following loads 'before' the fence.  As a
+    // result, we don't need to consider it as writing to memory and don't need
+    // to advance the generation.  We do need to prevent DSE across the fence,
+    // but that's handled above.
+    if (FenceInst *FI = dyn_cast<FenceInst>(Inst))
+      if (FI->getOrdering() == Release) {
+        assert(Inst->mayReadFromMemory() && "relied on to prevent DSE above");
+        continue;
+      }
+
     // Okay, this isn't something we can CSE at all.  Check to see if it is
     // something that could modify memory.  If so, our available memory values
     // cannot be used so bump the generation count.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D11434.33260.patch
Type: text/x-patch
Size: 3057 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150826/e6ddeff1/attachment.bin>


More information about the llvm-commits mailing list