[PATCH] D86789: [DAGCombiner] Fold an AND of a masked load into a zext_masked_load

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 1 06:58:55 PDT 2020


RKSimon added inline comments.


================
Comment at: llvm/test/CodeGen/Thumb2/mve-zext-masked-load.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main-none-none-eabi -mattr=+mve.fp,+fp64 -verify-machineinstrs -o - %s | FileCheck %s
+
----------------
samtebbs wrote:
> samtebbs wrote:
> > samtebbs wrote:
> > > dmgreen wrote:
> > > > It can be good to show before and after in the tests, to make the differences clearer.
> > > I've added extra checks to show what was generated before.
> > It turns out I misunderstood. Here is the difference between codegen with and without this patch. {F12829520}
> ```
> diff --git a/llvm/test/CodeGen/Thumb2/mve-zext-masked-load.ll b/llvm/test/CodeGen/Thumb2/mve-zext-masked-load.ll
> index 5db6637ca81..9696827d846 100644
> --- a/llvm/test/CodeGen/Thumb2/mve-zext-masked-load.ll
> +++ b/llvm/test/CodeGen/Thumb2/mve-zext-masked-load.ll
> @@ -7,10 +7,8 @@ define arm_aapcs_vfpcc <4 x float> @foo_v4i16(<4 x i16>* nocapture readonly %pSr
>  ; CHECK-NEXT:    vmovlb.s16 q0, q0
>  ; CHECK-NEXT:    vpt.s32 lt, q0, zr
>  ; CHECK-NEXT:    vldrht.u32 q0, [r0]
> -; CHECK-NEXT:    vmovlb.u16 q0, q0
>  ; CHECK-NEXT:    vcvt.f32.u32 q0, q0
>  ; CHECK-NEXT:    bx lr
> -; CHECK-OLD-NEXT:    vmovlb.u16 q0, q0
>  entry:
>    %active.lane.mask = icmp slt <4 x i16> %a, zeroinitializer
>    %wide.masked.load = call <4 x i16> @llvm.masked.load.v4i16.p0v4i16(<4 x i16>* %pSrc, i32 2, <4 x i1> %active.lane.mask, <4 x i16> undef)
> @@ -24,10 +22,8 @@ define arm_aapcs_vfpcc <8 x half> @foo_v8i8(<8 x i8>* nocapture readonly %pSrc,
>  ; CHECK-NEXT:    vmovlb.s8 q0, q0
>  ; CHECK-NEXT:    vpt.s16 lt, q0, zr
>  ; CHECK-NEXT:    vldrbt.u16 q0, [r0]
> -; CHECK-NEXT:    vmovlb.u8 q0, q0
>  ; CHECK-NEXT:    vcvt.f16.u16 q0, q0
>  ; CHECK-NEXT:    bx lr
> -; CHECK-OLD-NEXT:    vmovlb.u8 q0, q0
>  entry:
>    %active.lane.mask = icmp slt <8 x i8> %a, zeroinitializer
>    %wide.masked.load = call <8 x i8> @llvm.masked.load.v8i8.p0v8i8(<8 x i8>* %pSrc, i32 1, <8 x i1> %active.lane.mask, <8 x i8> undef)
> @@ -39,15 +35,11 @@ define arm_aapcs_vfpcc <4 x float> @foo_v4i8(<4 x i8>* nocapture readonly %pSrc,
>  ; CHECK-LABEL: foo_v4i8:
>  ; CHECK:       @ %bb.0: @ %entry
>  ; CHECK-NEXT:    vmovlb.s8 q0, q0
> -; CHECK-NEXT:    vmov.i32 q1, #0xff
>  ; CHECK-NEXT:    vmovlb.s16 q0, q0
>  ; CHECK-NEXT:    vpt.s32 lt, q0, zr
>  ; CHECK-NEXT:    vldrbt.u32 q0, [r0]
> -; CHECK-NEXT:    vand q0, q0, q1
>  ; CHECK-NEXT:    vcvt.f32.u32 q0, q0
>  ; CHECK-NEXT:    bx lr
> -; CHECK-OLD-NEXT:    vmov.i32 q1, #0xff
> -; CHECK-OLD-NEXT:    vand q0, q0, q1
>  entry:
>    %active.lane.mask = icmp slt <4 x i8> %a, zeroinitializer
>    %wide.masked.load = call <4 x i8> @llvm.masked.load.v4i8.p0v4i8(<4 x i8>* %pSrc, i32 1, <4 x i1> %active.lane.mask, <4 x i8> undef)
> 
> 
> ```
Can you commit this test with trunk's current codegen, then rebase this patch so it shows the delta?
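
For context on the `foo_v4i8` hunk quoted above, here is a minimal Python sketch of why the `vmov.i32 q1, #0xff` / `vand` pair becomes redundant once the masked load itself zero-extends. It is a lane-level model only, not the DAGCombiner code: the function names are hypothetical, the "any-extending" load is assumed to leave unspecified upper bits in each active lane, and inactive lanes are modelled as `None` because the passthru operand in the test is `undef`.

```python
import random

LANE_MASK = 0xFF  # splat constant materialised by the removed `vmov.i32 q1, #0xff`


def anyext_masked_load(mem, mask, rng):
    """Hypothetical model: active lanes hold the byte in their low 8 bits
    and unspecified data in the upper 24 bits; inactive lanes are None."""
    lanes = []
    for byte, active in zip(mem, mask):
        if active:
            upper_bits = rng.getrandbits(24) << 8  # stands in for "don't care"
            lanes.append(upper_bits | byte)
        else:
            lanes.append(None)
    return lanes


def zext_masked_load(mem, mask):
    """Hypothetical model: active lanes hold the zero-extended byte."""
    return [byte if active else None for byte, active in zip(mem, mask)]


def and_splat(lanes, constant):
    """AND every defined lane with the same constant."""
    return [None if lane is None else lane & constant for lane in lanes]


mem = bytes([0x12, 0xFF, 0x80, 0x00])
mask = [True, False, True, True]
rng = random.Random(0)

before = and_splat(anyext_masked_load(mem, mask, rng), LANE_MASK)  # load + AND
after = zext_masked_load(mem, mask)                                # zext load only
assert before == after
```

The AND only clears bits that a zero-extending load never sets, so in this model both sequences agree on every active lane.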


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86789/new/

https://reviews.llvm.org/D86789


