[llvm-bugs] [Bug 39926] New: [X86] X86AvoidStoreForwardingBlocks creates incomplete copies
via llvm-bugs
llvm-bugs at lists.llvm.org
Sat Dec 8 15:56:54 PST 2018
https://bugs.llvm.org/show_bug.cgi?id=39926
Bug ID: 39926
Summary: [X86] X86AvoidStoreForwardingBlocks creates incomplete
copies
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: nikita.ppv at gmail.com
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, spatel+llvm at rotateright.com
Originally reported and diagnosed in
https://github.com/rust-lang/rust/issues/56618.
The X86AvoidStoreForwardingBlocks pass seems to drop copies for some byte
ranges under some circumstances. The following IR...
define i8 @test_offset(i8* %base) #0 {
entry:
%z = alloca [128 x i8], align 16
%gep0 = getelementptr inbounds i8, i8* %base, i64 7
store volatile i8 0, i8* %gep0
%gep1 = getelementptr inbounds i8, i8* %base, i64 5
%bc1 = bitcast i8* %gep1 to i16*
store volatile i16 0, i16* %bc1
%gep2 = getelementptr inbounds i8, i8* %base, i64 1
%bc2 = bitcast i8* %gep2 to i32*
store volatile i32 0, i32* %bc2
%y1 = getelementptr inbounds i8, i8* %base, i64 -4
%y2 = bitcast [128 x i8]* %z to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %y2, i8* %y1, i64 16, i1 false)
%gep4 = getelementptr inbounds [128 x i8], [128 x i8]* %z, i64 0, i64 4
%ret = load i8, i8* %gep4
ret i8 %ret
}
; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1) #1
attributes #0 = { "target-cpu"="core-avx2" }
...when run through llc yields...
pushq %rax
.cfi_def_cfa_offset 16
movb $0, 7(%rdi)
movw $0, 5(%rdi)
movl $0, 1(%rdi)
movzwl -4(%rdi), %eax
movw %ax, -128(%rsp)
movb -2(%rdi), %al # Copies -2..-1
movb %al, -126(%rsp)
movl 1(%rdi), %eax # Copies 1..5
movl %eax, -123(%rsp)
movzwl 5(%rdi), %eax
movw %ax, -119(%rsp)
movb 7(%rdi), %al
movb %al, -117(%rsp)
movl 8(%rdi), %eax
movl %eax, -116(%rsp)
movb -124(%rsp), %al
popq %rcx
.cfi_def_cfa_offset 8
retq
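Notably, no copy is emitted for source bytes -1 and 0 (the half-open range
-1..1), which correspond to -125(%rsp) and -124(%rsp). The final load from
-124(%rsp) therefore returns a byte that the copy never wrote.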
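The gap can be checked mechanically. The following is a minimal sketch, not part of the original report: it tabulates the source-side loads read off the assembly above and lists which of the 16 memcpy source bytes are never loaded. The names `loads` and `missing_bytes` are made up for illustration.

```python
# Source-side (displacement from %rdi, size in bytes) of each load in the
# emitted copy sequence, read off the assembly in the report.
loads = [(-4, 2), (-2, 1), (1, 4), (5, 2), (7, 1), (8, 4)]

# The memcpy reads 16 bytes starting at %base - 4.
MEMCPY_START, MEMCPY_LEN = -4, 16


def missing_bytes(loads, start, length):
    """Return the source byte offsets in [start, start + length) that no load covers."""
    covered = set()
    for off, size in loads:
        covered.update(range(off, off + size))
    return [b for b in range(start, start + length) if b not in covered]


gap = missing_bytes(loads, MEMCPY_START, MEMCPY_LEN)
print(gap)  # [-1, 0]
```

Source offset 0 maps to destination offset 4 in %z, which is exactly the byte that %ret loads.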
MIR for the transform:
https://gist.github.com/nikic/61b5ac3390755b17dd542cd8131b926a
One possibly relevant aspect visible in the MIR is that the displacements of
the MOVs and the MMO offsets are out of sync:
%4:gr8 = MOV8rm %0:gr64, 1, $noreg, -2, $noreg :: (load 1 from %ir.y1 + 2)
MOV8mr %stack.0.z, 1, $noreg, 2, $noreg, killed %4:gr8 :: (store 1 into %ir.y2 + 2, align 16)
%5:gr32 = MOV32rm %0:gr64, 1, $noreg, 1, $noreg :: (load 4 from %ir.y1 + 3, align 1)
MOV32mr %stack.0.z, 1, $noreg, 5, $noreg, killed %5:gr32 :: (store 4 into %ir.y2 + 3, align 16)
Between the two load/store pairs, the MOV displacement goes from -2 to 1 (a
step of 3), while the MMO offset goes from 2 to 3 (a step of 1). Both should
advance by the same amount.
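To make the mismatch concrete, here is a small sketch (an illustration, not taken from the pass) that recomputes the MMO offset each load would be expected to carry. It assumes the MMO offset is meant to be the byte offset from %ir.y1, which is %base - 4 in the IR above. The tuple layout and the helper name `mmo_mismatches` are hypothetical.

```python
# The two load/store pairs from the MIR excerpt:
# (load displacement from %base, store displacement into %z, MMO offset).
pairs = [(-2, 2, 2), (1, 5, 3)]

# %ir.y1 is %base - 4, so (assumption) a load at displacement d from %base
# should be annotated with MMO offset d - Y1_DISP relative to %ir.y1.
Y1_DISP = -4


def mmo_mismatches(pairs, y1_disp):
    """Return (load displacement, actual MMO offset, expected MMO offset) for each mismatch."""
    out = []
    for load_disp, _store_disp, mmo in pairs:
        expected = load_disp - y1_disp
        if mmo != expected:
            out.append((load_disp, mmo, expected))
    return out


mismatches = mmo_mismatches(pairs, Y1_DISP)
print(mismatches)  # [(1, 3, 5)]
```

Under that assumption the first pair is consistent, while the second load is annotated with offset 3 where 5 would be expected. The store displacements (2 and 5) do match the expected values.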