[llvm] [instCombine][bugfix] Fix crash caused by using of cast in instCombineSVECmpNE (PR #102472)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 15 19:46:41 PDT 2024
wuxie2022 wrote:
> > > Fix looks good but the test case can be significantly reduced. It looks like you can just add to `llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll` by cloning `dupq_b_0` as `dupq_b_idx` whereby it takes an `%idx` function parameter that it uses when calling `sve.dupq.lane`.
> >
> >
> > Thank you for your review! I have simplified the case in 'sve-inst-combine-cmpne.ll'. But it can't be merged into sve-intrinsic-opts-cmpne.ll, because the compile command of my new test case must contain '-mtriple=aarch64-unknown-linux-gnu' and '-O2'.
>
> Are you sure? `sve-intrinsic-opts-cmpne.ll` already contains a `target triple` line and I think the only reason you need `-O2` is because you're relying on function inlining? I've just tried my suggestion on the current main branch and adding the following to `sve-intrinsic-opts-cmpne.ll` triggers the original failure for me. Do you not see the same?
>
> ```
> define <vscale x 16 x i1> @dupq_b_idx(i64 %idx) #0 {
> %1 = tail call <vscale x 16 x i1> @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
> %2 = tail call <vscale x 16 x i8> @llvm.vector.insert.nxv16i8.v16i8(<vscale x 16 x i8> undef,
> <16 x i8> <i8 0, i8 0, i8 0, i8 0, i8 0, i8 0, i8 0, i8 0,
> i8 0, i8 0, i8 0, i8 0, i8 0, i8 0, i8 0, i8 0>, i64 0)
> %3 = tail call <vscale x 16 x i8> @llvm.aarch64.sve.dupq.lane.nxv16i8(<vscale x 16 x i8> %2, i64 %idx)
> %4 = tail call <vscale x 2 x i64> @llvm.aarch64.sve.dup.x.nxv2i64(i64 0)
> %5 = tail call <vscale x 16 x i1> @llvm.aarch64.sve.cmpne.wide.nxv16i8(<vscale x 16 x i1> %1, <vscale x 16 x i8> %3, <vscale x 2 x i64> %4)
> ret <vscale x 16 x i1> %5
> }
> ```
When I tried your suggestion, the compiler did not generate any failure. I have also tested it on https://www.godbolt.org/ using 'opt (trunk)'. Besides, the original failure I want to fix is shown in:
https://github.com/llvm/llvm-project/pull/102472
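For reference, if the reduced case were merged into the existing test file as suggested, it would only need the InstCombine pass rather than '-O2'. A minimal lit header for such a test might look like the following sketch (the exact RUN line is an assumption; the header already present in `sve-intrinsic-opts-cmpne.ll` should be treated as authoritative):

```
; RUN: opt -S -passes=instcombine < %s | FileCheck %s
target triple = "aarch64-unknown-linux-gnu"
```

With a `target triple` line in the file itself, no '-mtriple' flag is needed on the opt command line.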