[PATCH] D11434: Allow value forwarding past release fences in EarlyCSE
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 26 15:36:13 PDT 2015
reames updated this revision to Diff 33260.
reames added a comment.
Adjust test comments to indicate where the negative tests are testing implementation rather than the memory model per review comments.
http://reviews.llvm.org/D11434
Files:
lib/Transforms/Scalar/EarlyCSE.cpp
test/Transforms/EarlyCSE/fence.ll
Index: test/Transforms/EarlyCSE/fence.ll
===================================================================
--- test/Transforms/EarlyCSE/fence.ll
+++ test/Transforms/EarlyCSE/fence.ll
@@ -1,4 +1,10 @@
; RUN: opt -S -early-cse < %s | FileCheck %s
+; NOTE: This file is testing the current implementation. Some of
+; the transforms used as negative tests below would be legal, but
+; only if reached through a chain of logic which EarlyCSE is incapable
+; of performing. To say it differently, this file tests a conservative
+; version of the memory model. If we want to extend EarlyCSE to be more
+; aggressive in the future, we may need to relax some of the negative tests.
; We can value forward across the fence since we can (semantically)
; reorder the following load before the fence.
@@ -31,21 +37,28 @@
; We can not value forward across an acquire barrier since we might
; be syncronizing with another thread storing to the same variable
; followed by a release fence. If this thread observed the release
-; had happened, we must present a consistent view of memory at the fence.
+; had happened, we must present a consistent view of memory at the
+; fence. Note that it would be legal to reorder '%a' after the fence
+; and then remove '%a2'. The current implementation doesn't know how
+; to do this, but if it learned, this test will need revised.
define i32 @test3(i32* noalias %addr.i, i32* noalias %otheraddr) {
; CHECK-LABEL: @test3
; CHECK: load
; CHECK: fence
; CHECK: load
+; CHECK: sub
; CHECK: ret
%a = load i32, i32* %addr.i, align 4
fence acquire
%a2 = load i32, i32* %addr.i, align 4
%res = sub i32 %a, %a2
- ret i32 %a
+ ret i32 %res
}
-; We can NOT dead store eliminate accross the fence
+; We can not dead store eliminate accross the fence. We could in
+; principal reorder the second store above the fence and then DSE either
+; store, but this is beyond the simple last-store DSE which EarlyCSE
+; implements.
define void @test4(i32* %addr.i) {
; CHECK-LABEL: @test4
; CHECK: store
Index: lib/Transforms/Scalar/EarlyCSE.cpp
===================================================================
--- lib/Transforms/Scalar/EarlyCSE.cpp
+++ lib/Transforms/Scalar/EarlyCSE.cpp
@@ -613,6 +613,17 @@
continue;
}
+ // A release fence requires that all stores complete before it, but does
+ // not prevent the reordering of following loads 'before' the fence. As a
+ // result, we don't need to consider it as writing to memory and don't need
+ // to advance the generation. We do need to prevent DSE across the fence,
+ // but that's handled above.
+ if (FenceInst *FI = dyn_cast<FenceInst>(Inst))
+ if (FI->getOrdering() == Release) {
+ assert(Inst->mayReadFromMemory() && "relied on to prevent DSE above");
+ continue;
+ }
+
// Okay, this isn't something we can CSE at all. Check to see if it is
// something that could modify memory. If so, our available memory values
// cannot be used so bump the generation count.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D11434.33260.patch
Type: text/x-patch
Size: 3057 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150826/e6ddeff1/attachment.bin>
More information about the llvm-commits
mailing list