[llvm-dev] load/stores adjusted to align - prevent aggregate replacement/elimination

Uday Kumar Reddy B via llvm-dev llvm-dev at lists.llvm.org
Tue Sep 3 12:03:16 PDT 2019


On Wed, 4 Sep 2019 at 00:23, Uday Kumar Reddy B <uday at polymagelabs.com> wrote:
>
> Hello,
>
> This is a question regarding the replacement of malloc'ed single-element
> arrays by scalars, which LLVM's opt normally appears to perform well. Now,
> when there are arrays of vector types with elements of size 32 bytes
> or more (e.g. <8 x float> *), it's common to adjust loads/stores so that
> they align on element type boundaries (since GNU malloc would
> typically align only to 16-byte boundaries, say on x86-64). On such
> IR, I notice that the scalar replacement / register promotion of a
> malloc'ed vector element no longer works.
>
> https://godbolt.org/z/9KByAf

The short URL doesn't appear to be working, so I'm appending the
snippet below.

> (Commenting out the alignment adjustment makes it work perfectly.)
>
> Are there any attributes/hints that might be used in generating those
> alignment arithmetic instructions to help the optimizer here?
>
> Thanks,
> ~ Uday

--------------------------------------------------------------------------------------------
declare i8* @malloc(i64)

define <8 x float> @xyz(<8 x float>* %0) {
  %2 = call i8* @malloc(i64 63)
  %3 = bitcast i8* %2 to <8 x float>*
  %4 = ptrtoint <8 x float>* %3 to i64
  %5 = add i64 %4, 31
  %6 = udiv i64 %5, 32
  %7 = mul i64 %6, 32
  %8 = inttoptr i64 %7 to <8 x float>*
  %9 = getelementptr <8 x float>, <8 x float>* %8, i64 0
  ; uncomment this and comment out the one above to allow full scalar replacement
  ; %9 = getelementptr <8 x float>, <8 x float>* %3, i64 0
  store <8 x float> zeroinitializer, <8 x float>* %9
  br label %10

10: ; preds = %13, %1
  %11 = phi i64 [ %23, %13 ], [ 0, %1 ]
  %12 = icmp slt i64 %11, 100
  br i1 %12, label %13, label %24

13: ; preds = %10
  %14 = ptrtoint <8 x float>* %0 to i64
  %15 = add i64 %14, 31
  %16 = udiv i64 %15, 32
  %17 = mul i64 %16, 32
  %18 = inttoptr i64 %17 to <8 x float>*
  %19 = getelementptr <8 x float>, <8 x float>* %18, i64 %11
  %20 = load <8 x float>, <8 x float>* %19
  %21 = load <8 x float>, <8 x float>* %9
  %22 = fadd <8 x float> %20, %21
  store <8 x float> %22, <8 x float>* %9
  %23 = add i64 %11, 1
  br label %10

24: ; preds = %10
  %25 = load <8 x float>, <8 x float>* %9
  ret <8 x float> %25
}
-------------------------------
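
For readers less used to reading the pointer arithmetic in IR form: the
add 31 / udiv 32 / mul 32 sequence in the snippet rounds an address up to
the next 32-byte boundary, and the 63-byte malloc is 32 bytes for one
<8 x float> plus 31 bytes of slack for that adjustment. Below is a rough C
sketch of the same arithmetic. The function names are hypothetical and are
not taken from any front end or library; the mask variant is included only
because 32 is a power of two, not because it is known to change what opt
does with the IR.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to the next multiple of 32, written the same way as the
   add / udiv / mul sequence in the IR snippet. */
static uintptr_t align_up_32_divmul(uintptr_t addr) {
    return ((addr + 31) / 32) * 32;
}

/* Equivalent mask form; valid only because 32 is a power of two. */
static uintptr_t align_up_32_mask(uintptr_t addr) {
    return (addr + 31) & ~(uintptr_t)31;
}

/* Hypothetical helper: over-allocate by 31 bytes (32 + 31 = 63, matching
   the malloc call in the snippet) and return a 32-byte-aligned pointer
   into that block. *raw receives the pointer returned by malloc, which is
   the one that must later be passed to free(). */
static void *alloc_one_aligned_32(void **raw) {
    *raw = malloc(32 + 31);
    if (*raw == NULL) {
        return NULL;
    }
    void *aligned = (void *)align_up_32_divmul((uintptr_t)*raw);
    /* Sanity check: the adjusted address sits on a 32-byte boundary. */
    assert((uintptr_t)aligned % 32 == 0);
    return aligned;
}
```

The adjusted pointer always leaves room for a full 32-byte element inside
the 63-byte block, since the adjustment moves the address forward by at
most 31 bytes.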

