<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - [X86] X86AvoidStoreForwardingBlocks creates incomplete copies"
href="https://bugs.llvm.org/show_bug.cgi?id=39926">39926</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[X86] X86AvoidStoreForwardingBlocks creates incomplete copies
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>nikita.ppv@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com
</td>
</tr></table>
<p>
<div>
<pre>Originally reported and diagnosed in
<a href="https://github.com/rust-lang/rust/issues/56618">https://github.com/rust-lang/rust/issues/56618</a>.
Under some circumstances, the X86AvoidStoreForwardingBlocks pass drops copies
of some byte ranges of the copied region. The following IR...
define i8 @test_offset(i8* %base) #0 {
entry:
  %z = alloca [128 x i8], align 16
  %gep0 = getelementptr inbounds i8, i8* %base, i64 7
  store volatile i8 0, i8* %gep0
  %gep1 = getelementptr inbounds i8, i8* %base, i64 5
  %bc1 = bitcast i8* %gep1 to i16*
  store volatile i16 0, i16* %bc1
  %gep2 = getelementptr inbounds i8, i8* %base, i64 1
  %bc2 = bitcast i8* %gep2 to i32*
  store volatile i32 0, i32* %bc2
  %y1 = getelementptr inbounds i8, i8* %base, i64 -4
  %y2 = bitcast [128 x i8]* %z to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %y2, i8* %y1, i64 16, i1 false)
  %gep4 = getelementptr inbounds [128 x i8], [128 x i8]* %z, i64 0, i64 4
  %ret = load i8, i8* %gep4
  ret i8 %ret
}

; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1) #1

attributes #0 = { "target-cpu"="core-avx2" }
...when run through llc yields...
        pushq   %rax
        .cfi_def_cfa_offset 16
        movb    $0, 7(%rdi)
        movw    $0, 5(%rdi)
        movl    $0, 1(%rdi)
        movzwl  -4(%rdi), %eax
        movw    %ax, -128(%rsp)
        movb    -2(%rdi), %al          # Copies -2..-1
        movb    %al, -126(%rsp)
        movl    1(%rdi), %eax          # Copies 1..5
        movl    %eax, -123(%rsp)
        movzwl  5(%rdi), %eax
        movw    %ax, -119(%rsp)
        movb    7(%rdi), %al
        movb    %al, -117(%rsp)
        movl    8(%rdi), %eax
        movl    %eax, -116(%rsp)
        movb    -124(%rsp), %al
        popq    %rcx
        .cfi_def_cfa_offset 8
        retq
Notably, a copy of the range -1..1 is missing.
MIR for the transform:
<a href="https://gist.github.com/nikic/61b5ac3390755b17dd542cd8131b926a">https://gist.github.com/nikic/61b5ac3390755b17dd542cd8131b926a</a>

One possibly relevant aspect visible in the MIR is that the displacements of
the MOVs and the MMO offsets are out of sync:
%4:gr8 = MOV8rm %0:gr64, 1, $noreg, -2, $noreg :: (load 1 from %ir.y1 + 2)
MOV8mr %stack.0.z, 1, $noreg, 2, $noreg, killed %4:gr8 :: (store 1 into
%ir.y2 + 2, align 16)
%5:gr32 = MOV32rm %0:gr64, 1, $noreg, 1, $noreg :: (load 4 from %ir.y1 + 3,
align 1)
MOV32mr %stack.0.z, 1, $noreg, 5, $noreg, killed %5:gr32 :: (store 4 into
%ir.y2 + 3, align 16)
The MOV displacement goes from -2 to 1, while the MMO offset goes from 2 to 3.
Both should be incrementing by the same value.</pre>
</div>
</p>
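<p>The missing range can be verified mechanically. The following sketch (a
hypothetical illustration for this report, not code from the pass) lists the
source byte ranges, relative to %base, covered by the MOVs in the llc output
above, and reports any gaps in the 16-byte memcpy source range [-4, 12):</p>

```python
# (src_offset, size) pairs for the copy pieces emitted by the pass,
# transcribed from the llc output in this report.
pieces = [(-4, 2), (-2, 1), (1, 4), (5, 2), (7, 1), (8, 4)]

def uncovered(start, end, pieces):
    """Return the sub-ranges of [start, end) not covered by any piece."""
    gaps, pos = [], start
    for off, size in sorted(pieces):
        if off > pos:
            gaps.append((pos, off))  # bytes [pos, off) were never copied
        pos = max(pos, off + size)
    if pos < end:
        gaps.append((pos, end))
    return gaps

print(uncovered(-4, 12, pieces))  # prints [(-1, 1)]
```

<p>The single gap (-1, 1) matches the missing -1..1 copy noted above.</p>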
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>