[llvm] [SLP]Improve final minbitwidth analysis attempt. (PR #87786)
Arthur Eubanks via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 17 15:07:57 PDT 2024
aeubanks wrote:
We have a test that started failing after this patch. I seem to have narrowed it down to the following diff in the IR after SLPVectorizer, without (good.ll) and with (bad.ll) this patch:
```
$ diff /tmp/good.ll /tmp/bad.ll
992,1011c992,1009
< %8 = zext <8 x i16> %0 to <8 x i32>
< %9 = zext <8 x i16> %1 to <8 x i32>
< %10 = sub nsw <8 x i32> %9, %8
< %11 = add nsw <8 x i32> %10, <i32 3329, i32 3329, i32 3329, i32 3329, i32 3329, i32 3329, i32 3329, i32 3329>
< %12 = insertelement <8 x i32> poison, i32 %zext795, i32 0
< %13 = shufflevector <8 x i32> %12, <8 x i32> poison, <8 x i32> zeroinitializer
< %14 = mul <8 x i32> %11, %13
< %15 = zext <8 x i32> %14 to <8 x i64>
< %16 = mul nuw nsw <8 x i64> %15, <i64 5039, i64 5039, i64 5039, i64 5039, i64 5039, i64 5039, i64 5039, i64 5039>
< %17 = lshr <8 x i64> %16, <i64 24, i64 24, i64 24, i64 24, i64 24, i64 24, i64 24, i64 24>
< %18 = trunc <8 x i64> %17 to <8 x i32>
< %19 = mul <8 x i32> %18, <i32 62207, i32 62207, i32 62207, i32 62207, i32 62207, i32 62207, i32 62207, i32 62207>
< %20 = add <8 x i32> %19, %14
< %21 = trunc <8 x i32> %20 to <8 x i16>
< %22 = add <8 x i16> %21, <i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329>
< %23 = icmp slt <8 x i16> %22, zeroinitializer
< %24 = select <8 x i1> %23, <8 x i16> %21, <8 x i16> zeroinitializer
< %25 = call <8 x i16> @llvm.smax.v8i16(<8 x i16> %22, <8 x i16> zeroinitializer)
< %26 = or <8 x i16> %24, %25
< store <8 x i16> %26, ptr %getelementptr797, align 2
---
> %8 = sub <8 x i16> %1, %0
> %9 = add <8 x i16> %8, <i16 3329, i16 3329, i16 3329, i16 3329, i16 3329, i16 3329, i16 3329, i16 3329>
> %10 = insertelement <8 x i32> poison, i32 %zext795, i32 0
> %11 = shufflevector <8 x i32> %10, <8 x i32> poison, <8 x i32> zeroinitializer
> %12 = trunc <8 x i32> %11 to <8 x i16>
> %13 = mul <8 x i16> %9, %12
> %14 = zext <8 x i16> %13 to <8 x i64>
> %15 = mul nuw nsw <8 x i64> %14, <i64 5039, i64 5039, i64 5039, i64 5039, i64 5039, i64 5039, i64 5039, i64 5039>
> %16 = lshr <8 x i64> %15, <i64 24, i64 24, i64 24, i64 24, i64 24, i64 24, i64 24, i64 24>
> %17 = trunc <8 x i64> %16 to <8 x i16>
> %18 = mul <8 x i16> %17, <i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329>
> %19 = add <8 x i16> %18, %13
> %20 = add <8 x i16> %19, <i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329, i16 -3329>
> %21 = icmp slt <8 x i16> %20, zeroinitializer
> %22 = select <8 x i1> %21, <8 x i16> %19, <8 x i16> zeroinitializer
> %23 = call <8 x i16> @llvm.smax.v8i16(<8 x i16> %20, <8 x i16> zeroinitializer)
> %24 = or <8 x i16> %22, %23
> store <8 x i16> %24, ptr %getelementptr797, align 2
```
However, I'm still trying to understand what's going wrong.
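To make the two sequences easier to compare, here is a rough scalar model of a single vector lane in Python. This is only a sketch: the function names (`good_lane`, `bad_lane`, `tail`) and the input values are hypothetical and not taken from the failing test, and it models only the wrapping integer arithmetic visible in the diff, not whatever the patch does internally.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def s16(x):
    """Interpret the low 16 bits of x as a signed i16."""
    x &= MASK16
    return x - 0x10000 if x & 0x8000 else x


def tail(r16):
    """Steps shared by both versions: add -3329, icmp slt, select, smax, or."""
    d = (r16 - 3329) & MASK16
    sel = r16 if s16(d) < 0 else 0   # select on (d < 0)
    mx = max(s16(d), 0)              # llvm.smax(d, 0)
    return (sel | mx) & MASK16


def good_lane(a, b, z):
    """One lane of good.ll: sub/add/mul are done in i32."""
    x = (((b - a) + 3329) * z) & MASK32       # zext, sub, add, mul in i32
    q = ((x * 5039) & MASK64) >> 24           # zext to i64, mul, lshr 24
    r = ((q & MASK32) * 62207 + x) & MASK32   # trunc to i32, mul, add
    return tail(r & MASK16)                   # trunc to i16


def bad_lane(a, b, z):
    """One lane of bad.ll: sub/add/mul are done in i16."""
    x = (((b - a) + 3329) * (z & MASK16)) & MASK16  # product wraps at 16 bits
    q = ((x * 5039) & MASK64) >> 24                 # zext i16 to i64, mul, lshr 24
    r = ((q & MASK16) * 62207 + x) & MASK16         # i16 -3329 has the same bits as 62207
    return tail(r)


if __name__ == "__main__":
    # Hypothetical inputs: %0 lane = 0, %1 lane = 0, %zext795 = 3329.
    print(good_lane(0, 0, 3329), bad_lane(0, 0, 3329))
```

With these made-up inputs the model gives 0 for the good sequence and 3328 for the bad one, while for `z = 1` both give 0. In this model the results diverge once the 32-bit product no longer fits in 16 bits, since the `zext` to i64 and `lshr` by 24 read bits that the i16 `mul` has already discarded. I have not checked whether the real test hits such values.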
https://github.com/llvm/llvm-project/pull/87786