[llvm-bugs] [Bug 33945] New: instruction combiner breaks alignment of store
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Jul 26 02:25:27 PDT 2017
https://bugs.llvm.org/show_bug.cgi?id=33945
Bug ID: 33945
Summary: instruction combiner breaks alignment of store
Product: new-bugs
Version: 4.0
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: gideon.smeding at 3ds.com
CC: llvm-bugs at lists.llvm.org
I came across the following problem, where the instruction combiner wrongly
changes a store, leading to alignment problems in the generated code. The
following program stores a 16 byte aligned <4 x float> vector to a 4 byte
aligned { float, float, float, float } struct:
source_filename = "packed_float3_cpu.ll"
target datalayout = "e-m:w-p:64:64:64-i8:8:8-i32:32:32-i64:64:64-f16:16:16-f32:32:32-v64:64:64-v96:128:128-v128:128:128-a:0:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.24215"
define void @save_to_packed_float4({ float, float, float, float }* %p, <4 x float> %v) {
body: ; preds = %entry
  %x = extractelement <4 x float> %v, i32 0
  %y = extractelement <4 x float> %v, i32 1
  %z = extractelement <4 x float> %v, i32 2
  %w = extractelement <4 x float> %v, i32 3
  %x___ = insertvalue { float, float, float, float } undef, float %x, 0
  %xy__ = insertvalue { float, float, float, float } %x___, float %y, 1
  %xyz_ = insertvalue { float, float, float, float } %xy__, float %z, 2
  %xyzw = insertvalue { float, float, float, float } %xyz_, float %w, 3
  store { float, float, float, float } %xyzw, { float, float, float, float }* %p ; unaligned store
  ret void
}
After running opt - -instcombine, this program looks as follows:
source_filename = "packed_float3_cpu.ll"
target datalayout = "e-m:w-p:64:64:64-i8:8:8-i32:32:32-i64:64:64-f16:16:16-f32:32:32-v64:64:64-v96:128:128-v128:128:128-a:0:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.24215"
define void @save_to_packed_float4({ float, float, float, float }* %p, <4 x float> %v) {
body:
  %0 = bitcast { float, float, float, float }* %p to <4 x float>*
  store <4 x float> %v, <4 x float>* %0, align 16
  ret void
}
The problem is that the unaligned store in the source becomes a 16-byte-aligned
store after the transformation, which is obviously not what I wanted to achieve
with a packed float4 in the first place.
Now I can explicitly set the alignment of the store in the source program (which
I'm doing from now on for all stores because of this issue), but I'm surprised
by this 'optimization'.
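For illustration, the workaround described above might look like the following
sketch. It is the same function as in the report with only "align 4" added to
the store; the function name save_to_packed_float4_align4 is hypothetical and
chosen here just to distinguish it from the original. The intent is that any
store derived from this one may assume no more than 4-byte alignment of %p.

```llvm
; Sketch only: the reporter's function with the alignment spelled out.
; The name @save_to_packed_float4_align4 is made up for this example.
target datalayout = "e-m:w-p:64:64:64-i8:8:8-i32:32:32-i64:64:64-f16:16:16-f32:32:32-v64:64:64-v96:128:128-v128:128:128-a:0:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.24215"

define void @save_to_packed_float4_align4({ float, float, float, float }* %p, <4 x float> %v) {
body:
  ; Unpack the vector into four scalars.
  %x = extractelement <4 x float> %v, i32 0
  %y = extractelement <4 x float> %v, i32 1
  %z = extractelement <4 x float> %v, i32 2
  %w = extractelement <4 x float> %v, i32 3
  ; Rebuild them as a struct value.
  %x___ = insertvalue { float, float, float, float } undef, float %x, 0
  %xy__ = insertvalue { float, float, float, float } %x___, float %y, 1
  %xyz_ = insertvalue { float, float, float, float } %xy__, float %z, 2
  %xyzw = insertvalue { float, float, float, float } %xyz_, float %w, 3
  ; Explicit alignment: %p is only known to be 4-byte aligned.
  store { float, float, float, float } %xyzw, { float, float, float, float }* %p, align 4
  ret void
}
```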
My expectation would be that a store without explicit alignment defaults to the
alignment of the type behind the pointer (4 bytes in this case). Instead, the
transformation just boldly assumes that the pointer has a 16-byte alignment,
which it does not.
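Put concretely, the expectation stated above would correspond to a result along
these lines. This is a hand-written sketch of the hoped-for form, not actual
opt output: it is the transformed function from the report with the store's
alignment reduced from 16 to 4, the alignment the struct type has under the
data layout shown.

```llvm
; Sketch of the expected result (hand-written, not produced by opt).
target datalayout = "e-m:w-p:64:64:64-i8:8:8-i32:32:32-i64:64:64-f16:16:16-f32:32:32-v64:64:64-v96:128:128-v128:128:128-a:0:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.24215"

define void @save_to_packed_float4({ float, float, float, float }* %p, <4 x float> %v) {
body:
  %0 = bitcast { float, float, float, float }* %p to <4 x float>*
  ; The vector store keeps the 4-byte alignment of the original struct store.
  store <4 x float> %v, <4 x float>* %0, align 4
  ret void
}
```

With "align 4" on the vector store, the backend would presumably have to pick
an instruction that tolerates unaligned addresses rather than one that requires
a 16-byte-aligned pointer.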