[llvm-bugs] [Bug 42123] New: [SelectionDAG] MergeConsecutiveStores loses non-temporal flag

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Jun 4 11:23:44 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=42123

            Bug ID: 42123
           Summary: [SelectionDAG] MergeConsecutiveStores loses
                    non-temporal flag
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: llvm-dev at redking.me.uk
                CC: andrea.dibiagio at gmail.com, craig.topper at gmail.com,
                    hfinkel at anl.gov, llvm-bugs at lists.llvm.org,
                    spatel+llvm at rotateright.com

https://godbolt.org/z/zWf3xk

Derived from the C++ source below (the IR is not a direct copy of what that
source produces - the alignment gets messed up):

#include <x86intrin.h>

void memcpy256_2_128_aligned(__m256 *src, __m256 *dst) {
    auto x = _mm_load_ps((float*)src + 0);
    auto y = _mm_load_ps((float*)src + 4);
    _mm_stream_ps((float*)dst + 0, x);
    _mm_stream_ps((float*)dst + 4, y);
}

define void @memcpy256_2_128_aligned(<8 x float>* noalias nocapture readonly,
<8 x float>* noalias nocapture) {
  %3 = bitcast <8 x float>* %0 to <4 x float>*
  %4 = load <4 x float>, <4 x float>* %3, align 32
  %5 = getelementptr inbounds <8 x float>, <8 x float>* %0, i64 0, i64 4
  %6 = bitcast float* %5 to <4 x float>*
  %7 = load <4 x float>, <4 x float>* %6, align 16
  %8 = bitcast <8 x float>* %1 to <4 x float>*
  store <4 x float> %4, <4 x float>* %8, align 32, !nontemporal !0
  %9 = getelementptr inbounds <8 x float>, <8 x float>* %1, i64 0, i64 4
  %10 = bitcast float* %9 to <4 x float>*
  store <4 x float> %7, <4 x float>* %10, align 16, !nontemporal !0
  ret void
}
!0 = !{i32 1}

llc -mcpu=btver2

memcpy256_2_128_aligned: # @memcpy256_2_128_aligned
  vmovaps (%rdi), %ymm0
  vmovaps %ymm0, (%rsi) <-- SHOULD BE VMOVNTPS
  retq

Several things need to be addressed:
1 - retain the nontemporal flag for merged stores
2 - don't merge stores if only some have a nontemporal flag
3 - only merge nontemporal stores if they are naturally aligned - unaligned
nt-stores are problematic (see [Bug #42026])
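
The three points above could be sketched as a single eligibility check. This is
only an illustration, not the actual MergeConsecutiveStores code: StoreInfo,
MergeKind and classifyMerge are hypothetical names, the stores are assumed to
be consecutive and sorted by address, and "naturally aligned" is assumed to
mean that the merged store's alignment is at least its total size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical summary of one candidate store (not an LLVM type).
struct StoreInfo {
  bool NonTemporal;    // store carries !nontemporal metadata
  uint64_t SizeBytes;  // number of bytes written
  uint64_t AlignBytes; // known alignment of the store address
};

enum class MergeKind { NoMerge, Regular, NonTemporal };

// Assumes Stores are consecutive in memory and sorted by address, so the
// merged store starts at the address of the first store.
MergeKind classifyMerge(const std::vector<StoreInfo> &Stores) {
  if (Stores.size() < 2)
    return MergeKind::NoMerge;

  bool AllNT = true;
  bool AnyNT = false;
  uint64_t TotalBytes = 0;
  for (const StoreInfo &S : Stores) {
    AllNT = AllNT && S.NonTemporal;
    AnyNT = AnyNT || S.NonTemporal;
    TotalBytes += S.SizeBytes;
  }

  // Point 2: don't merge if only some of the stores are nontemporal.
  if (AnyNT && !AllNT)
    return MergeKind::NoMerge;

  // No nontemporal stores involved: ordinary merge.
  if (!AnyNT)
    return MergeKind::Regular;

  // Point 3: only merge nontemporal stores when the merged store would be
  // naturally aligned (assumption: alignment >= merged size).
  if (Stores.front().AlignBytes < TotalBytes)
    return MergeKind::NoMerge;

  // Point 1: the merged store must keep the nontemporal flag.
  return MergeKind::NonTemporal;
}
```

For the IR in this report (two 16-byte nontemporal stores, the first aligned to
32 bytes), the sketch would classify the pair as a nontemporal merge, which
corresponds to the expected vmovntps.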
