[llvm-branch-commits] [llvm] b54ccef - [SDAG] fix miscompile from merging stores of different sizes

Tom Stellard via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Jun 9 15:03:41 PDT 2021


Author: Sanjay Patel
Date: 2021-06-09T15:02:51-07:00
New Revision: b54ccef144d2753a9742a3c0e75bcf377120fc6c

URL: https://github.com/llvm/llvm-project/commit/b54ccef144d2753a9742a3c0e75bcf377120fc6c
DIFF: https://github.com/llvm/llvm-project/commit/b54ccef144d2753a9742a3c0e75bcf377120fc6c.diff

LOG: [SDAG] fix miscompile from merging stores of different sizes

As shown in:
https://llvm.org/PR50623
...and the similar tests here, we were not accounting for
store merging of different sizes that do not cover the
entire range of the wide value to be stored.

This is the easy fix: just make sure that all of the
original stores are the same size, so when we calculate
the wide width, it's a simple N * M check.

This still allows all of the motivating optimizations from:
D86420 / 54a5dd485c4d
D87112 / 7a06b166b1af

We could enhance this code to track individual bytes and
allow merging multiple sizes.

(cherry picked from commit dd763ac79196b3d3bc0370b9dbd35e0c083e52a4)

Added: 
    

Modified: 
    llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
    llvm/test/CodeGen/X86/stores-merging.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 6a6f83827f72..7f2add81e80d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -7105,14 +7105,22 @@ SDValue DAGCombiner::mergeTruncStores(StoreSDNode *N) {
   if (LegalOperations)
     return SDValue();
 
-  // Collect all the stores in the chain.
-  SDValue Chain;
-  SmallVector<StoreSDNode *, 8> Stores;
-  for (StoreSDNode *Store = N; Store; Store = dyn_cast<StoreSDNode>(Chain)) {
-    // TODO: Allow unordered atomics when wider type is legal (see D66309)
-    EVT MemVT = Store->getMemoryVT();
-    if (!(MemVT == MVT::i8 || MemVT == MVT::i16 || MemVT == MVT::i32) ||
-        !Store->isSimple() || Store->isIndexed())
+  // We only handle merging simple stores of 1-4 bytes.
+  // TODO: Allow unordered atomics when wider type is legal (see D66309)
+  EVT MemVT = N->getMemoryVT();
+  if (!(MemVT == MVT::i8 || MemVT == MVT::i16 || MemVT == MVT::i32) ||
+      !N->isSimple() || N->isIndexed())
+    return SDValue();
+
+  // Collect all of the stores in the chain.
+  SDValue Chain = N->getChain();
+  SmallVector<StoreSDNode *, 8> Stores = {N};
+  while (auto *Store = dyn_cast<StoreSDNode>(Chain)) {
+    // All stores must be the same size to ensure that we are writing all of the
+    // bytes in the wide value.
+    // TODO: We could allow multiple sizes by tracking each stored byte.
+    if (Store->getMemoryVT() != MemVT || !Store->isSimple() ||
+        Store->isIndexed())
       return SDValue();
     Stores.push_back(Store);
     Chain = Store->getChain();

diff  --git a/llvm/test/CodeGen/X86/stores-merging.ll b/llvm/test/CodeGen/X86/stores-merging.ll
index d4857a6645af..f738710ab8f3 100644
--- a/llvm/test/CodeGen/X86/stores-merging.ll
+++ b/llvm/test/CodeGen/X86/stores-merging.ll
@@ -614,14 +614,15 @@ define dso_local void @be_i64_to_i32_order(i64 %x, i32* %p0) {
 }
 
 ; https://llvm.org/PR50623
-; FIXME:
 ; It is a miscompile to merge the stores if we are not
 ; writing all of the bytes from the source value.
 
 define void @merge_hole(i32 %x, i8* %p) {
 ; CHECK-LABEL: merge_hole:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    movl %edi, (%rsi)
+; CHECK-NEXT:    movb %dil, (%rsi)
+; CHECK-NEXT:    shrl $16, %edi
+; CHECK-NEXT:    movw %di, 2(%rsi)
 ; CHECK-NEXT:    retq
   %pcast = bitcast i8* %p to i16*
   %p2 = getelementptr inbounds i16, i16* %pcast, i64 1
@@ -678,15 +679,15 @@ define void @merge_hole3(i32 %x, i8* %p) {
 }
 
 ; Change offset.
-; FIXME:
 ; It is a miscompile to merge the stores if we are not
 ; writing all of the bytes from the source value.
 
 define void @merge_hole4(i32 %x, i8* %p) {
 ; CHECK-LABEL: merge_hole4:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    rorl $16, %edi
-; CHECK-NEXT:    movl %edi, (%rsi)
+; CHECK-NEXT:    movb %dil, 2(%rsi)
+; CHECK-NEXT:    shrl $16, %edi
+; CHECK-NEXT:    movw %di, (%rsi)
 ; CHECK-NEXT:    retq
   %pcast = bitcast i8* %p to i16*
   %p2 = getelementptr inbounds i8, i8* %p, i64 2


        


More information about the llvm-branch-commits mailing list