[llvm-bugs] [Bug 39926] New: [X86] X86AvoidStoreForwardingBlocks creates incomplete copies

Sat Dec 8 15:56:54 PST 2018

https://bugs.llvm.org/show_bug.cgi?id=39926

            Bug ID: 39926
           Summary: [X86] X86AvoidStoreForwardingBlocks creates incomplete
                    copies
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: nikita.ppv at gmail.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, spatel+llvm at rotateright.com

Originally reported and diagnosed in
https://github.com/rust-lang/rust/issues/56618.

The X86AvoidStoreForwardingBlocks pass seems to drop the copy for some byte
ranges when it splits a larger copy into smaller ones. The following IR...

define i8 @test_offset(i8* %base) #0 {
entry:
  %z = alloca [128 x i8], align 16
  %gep0 = getelementptr inbounds i8, i8* %base, i64 7
  store volatile i8 0, i8* %gep0
  %gep1 = getelementptr inbounds i8, i8* %base, i64 5
  %bc1 = bitcast i8* %gep1 to i16*
  store volatile i16 0, i16* %bc1
  %gep2 = getelementptr inbounds i8, i8* %base, i64 1
  %bc2 = bitcast i8* %gep2 to i32*
  store volatile i32 0, i32* %bc2

  %y1 = getelementptr inbounds i8, i8* %base, i64 -4
  %y2 = bitcast [128 x i8]* %z to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %y2, i8* %y1, i64 16, i1 false)

  %gep4 = getelementptr inbounds [128 x i8], [128 x i8]* %z, i64 0, i64 4
  %ret = load i8, i8* %gep4
  ret i8 %ret
}

; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1) #1

attributes #0 = { "target-cpu"="core-avx2" }

...when run through llc yields...

        pushq   %rax
        .cfi_def_cfa_offset 16
        movb    $0, 7(%rdi)
        movw    $0, 5(%rdi)
        movl    $0, 1(%rdi)
        movzwl  -4(%rdi), %eax
        movw    %ax, -128(%rsp)
        movb    -2(%rdi), %al    # Copies -2..-1
        movb    %al, -126(%rsp)
        movl    1(%rdi), %eax    # Copies 1..5
        movl    %eax, -123(%rsp)
        movzwl  5(%rdi), %eax
        movw    %ax, -119(%rsp)
        movb    7(%rdi), %al
        movb    %al, -117(%rsp)
        movl    8(%rdi), %eax
        movl    %eax, -116(%rsp)
        movb    -124(%rsp), %al
        popq    %rcx
        .cfi_def_cfa_offset 8
        retq

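Notably, a copy of the range -1..1 (bytes -1 and 0 relative to %rdi) is
missing. The final load from -124(%rsp) corresponds to byte 0 of %base, so it
reads a stack byte that was never written.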
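
The gap can be double-checked mechanically. The following is a minimal Python
sketch, not part of the original report: it lists the (offset, size) pairs of
the loads in the assembly above, relative to %rdi, and reports which half-open
byte ranges of the 16-byte memcpy source (-4..12) no load covers. The helper
name find_gaps is hypothetical.

```python
def find_gaps(start, length, copies):
    """Return half-open [lo, hi) ranges of [start, start+length) not covered
    by any (offset, size) entry in copies."""
    end = start + length
    gaps = []
    cursor = start  # first byte not yet known to be covered
    for off, size in sorted(copies):
        if off > cursor:
            gaps.append((cursor, min(off, end)))
        cursor = max(cursor, off + size)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


# (displacement from %rdi, access size in bytes), read off the llc output.
COPIES = [(-4, 2), (-2, 1), (1, 4), (5, 2), (7, 1), (8, 4)]

# The memcpy source starts at %base - 4 and is 16 bytes long.
print(find_gaps(-4, 16, COPIES))  # prints [(-1, 1)]
```

The single reported gap, (-1, 1), matches the missing range noted above.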

MIR for the transform:
https://gist.github.com/nikic/61b5ac3390755b17dd542cd8131b926a

One possibly relevant aspect visible in the MIR is that the displacements of
the MOVs and the MMO offsets are out of sync:

  %4:gr8 = MOV8rm %0:gr64, 1, $noreg, -2, $noreg :: (load 1 from %ir.y1 + 2)
  MOV8mr %stack.0.z, 1, $noreg, 2, $noreg, killed %4:gr8 :: (store 1 into %ir.y2 + 2, align 16)
  %5:gr32 = MOV32rm %0:gr64, 1, $noreg, 1, $noreg :: (load 4 from %ir.y1 + 3, align 1)
  MOV32mr %stack.0.z, 1, $noreg, 5, $noreg, killed %5:gr32 :: (store 4 into %ir.y2 + 3, align 16)

The MOV displacement advances by 3 (from -2 to 1), while the MMO offset
advances by only 1 (from 2 to 3). Both should increase by the same amount.
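
To make the mismatch concrete, here is a small Python sketch (again not from
the original report) that compares each instruction's displacement with its
MMO offset. It assumes that %0 holds %base, so that %ir.y1 sits at
displacement -4 as in the IR above, and that %ir.y2 is the start of
%stack.0.z, i.e. displacement 0. The tuple layout and the helper name
check_mmo are hypothetical.

```python
# (opcode, displacement, MMO offset, displacement of the MMO base pointer)
ACCESSES = [
    ("MOV8rm", -2, 2, -4),   # load 1 from %ir.y1 + 2
    ("MOV8mr", 2, 2, 0),     # store 1 into %ir.y2 + 2
    ("MOV32rm", 1, 3, -4),   # load 4 from %ir.y1 + 3
    ("MOV32mr", 5, 3, 0),    # store 4 into %ir.y2 + 3
]


def check_mmo(accesses):
    """Return (opcode, expected MMO offset, actual MMO offset) for every
    access whose MMO offset disagrees with its displacement."""
    mismatches = []
    for opcode, disp, mmo_off, base_disp in accesses:
        expected = disp - base_disp
        if expected != mmo_off:
            mismatches.append((opcode, expected, mmo_off))
    return mismatches


for opcode, expected, actual in check_mmo(ACCESSES):
    print(f"{opcode}: MMO offset {actual}, displacement implies {expected}")
```

Under these assumptions the two 1-byte accesses are consistent, while both
4-byte accesses carry an MMO offset of 3 where the displacement implies 5.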
