[llvm-bugs] [Bug 33945] New: instruction combiner breaks alignment of store

Wed Jul 26 02:25:27 PDT 2017

https://bugs.llvm.org/show_bug.cgi?id=33945

            Bug ID: 33945
           Summary: instruction combiner breaks alignment of store
           Product: new-bugs
           Version: 4.0
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: gideon.smeding at 3ds.com
                CC: llvm-bugs at lists.llvm.org

I came across a problem where the instruction combiner wrongly changes a
store, leading to alignment problems in the generated code. The following
program stores a 16-byte-aligned <4 x float> vector to a 4-byte-aligned
{ float, float, float, float } struct:

source_filename = "packed_float3_cpu.ll"
target datalayout = "e-m:w-p:64:64:64-i8:8:8-i32:32:32-i64:64:64-f16:16:16-f32:32:32-v64:64:64-v96:128:128-v128:128:128-a:0:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.24215"

define void @save_to_packed_float4({ float, float, float, float }* %p, <4 x float> %v) {

body:                                             ; preds = %entry
  %x = extractelement <4 x float> %v, i32 0
  %y = extractelement <4 x float> %v, i32 1
  %z = extractelement <4 x float> %v, i32 2
  %w = extractelement <4 x float> %v, i32 3
  %x___ = insertvalue { float, float, float, float } undef, float %x, 0
  %xy__ = insertvalue { float, float, float, float } %x___, float %y, 1
  %xyz_ = insertvalue { float, float, float, float } %xy__, float %z, 2
  %xyzw = insertvalue { float, float, float, float } %xyz_, float %w, 3
  store { float, float, float, float } %xyzw, { float, float, float, float }* %p ; unaligned store
  ret void
}

After running opt - -instcombine, this program looks as follows:

source_filename = "packed_float3_cpu.ll"
target datalayout = "e-m:w-p:64:64:64-i8:8:8-i32:32:32-i64:64:64-f16:16:16-f32:32:32-v64:64:64-v96:128:128-v128:128:128-a:0:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.24215"

define void @save_to_packed_float4({ float, float, float, float }* %p, <4 x float> %v) {
body:
  %0 = bitcast { float, float, float, float }* %p to <4 x float>*
  store <4 x float> %v, <4 x float>* %0, align 16
  ret void
}

The problem is that the store in the source, which carries no explicit
alignment, becomes a 16-byte-aligned store after the transformation, which is
obviously not what I wanted to achieve with a packed float4 in the first place.

Now I can explicitly set the alignment of the store in the source program
(which I'm doing from now on for all stores because of this issue), but I'm
surprised by this 'optimization'.
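
A minimal sketch of that workaround, assuming the same function as in the
reproducer above: the only change is the explicit "align 4" on the store, so
the alignment no longer has to be inferred from the stored type.

define void @save_to_packed_float4({ float, float, float, float }* %p, <4 x float> %v) {
body:
  %x = extractelement <4 x float> %v, i32 0
  %y = extractelement <4 x float> %v, i32 1
  %z = extractelement <4 x float> %v, i32 2
  %w = extractelement <4 x float> %v, i32 3
  %x___ = insertvalue { float, float, float, float } undef, float %x, 0
  %xy__ = insertvalue { float, float, float, float } %x___, float %y, 1
  %xyz_ = insertvalue { float, float, float, float } %xy__, float %z, 2
  %xyzw = insertvalue { float, float, float, float } %xyz_, float %w, 3
  ; explicit 4-byte alignment, matching the float members of the struct
  store { float, float, float, float } %xyzw, { float, float, float, float }* %p, align 4
  ret void
}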

My expectation would be that a store without explicit alignment defaults to the
alignment of the type behind the pointer (4 bytes in this case). Instead, the
transformation just boldly assumes that the pointer has a 16-byte alignment,
which it does not.
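
For illustration, the result that would match this expectation might look like
the sketch below. This is a hypothetical output, not something opt was
observed to produce: it assumes the combined vector store simply keeps the
4-byte alignment of the original struct store (f32:32:32 in the data layout)
rather than taking the 16-byte alignment of <4 x float> (v128:128:128).

define void @save_to_packed_float4({ float, float, float, float }* %p, <4 x float> %v) {
body:
  %0 = bitcast { float, float, float, float }* %p to <4 x float>*
  ; hypothetical: alignment carried over from the struct store, not the vector type
  store <4 x float> %v, <4 x float>* %0, align 4
  ret void
}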
