[PATCH] D146683: [DAG] Conservatively check the VSELECT Operation Action in tryToFoldExtendSelectLoad

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 23 18:12:50 PDT 2023


xiangzhangllvm added inline comments.


================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12247
+      (N0->getOpcode() == ISD::VSELECT &&
+       TLI.getOperationAction(ISD::VSELECT, VT) != TargetLowering::Legal))
     return SDValue();
----------------
LuoYuanke wrote:
> LuoYuanke wrote:
> > Check "Level <= AfterLegalizeVectorOps" instead?
> Maybe changed as below.
> 
> ```
>   if ((N0->getOpcode() == ISD::VSELECT && Level > AfterLegalizeVectorOps &&
>        TLI.getOperationAction(ISD::VSELECT, VT) != TargetLowering::Legal
>        ))
>     return SDValue();
> ```
Yes, I also had a quick think about it, just need to pass Level here.


================
Comment at: llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:1043
+      assert(N->getValueType(0).getVectorElementType() != MVT::i16 &&
+             "We can't replace VSELECT with BLENDV in vxi16!");
       SDValue Blendv =
----------------
LuoYuanke wrote:
> LuoYuanke wrote:
> > vXi16?
> On the other hand, why can't use `VPBLENDVB ymm1, ymm2, ymm3/m256` (https://www.felixcloutier.com/x86/pblendvb) to vselect v16i16? We can bitcast v16i16 to v32i8.
Yes, but not only bitcast, it also need to bring new instructions to transfer value of ymm1.


================
Comment at: llvm/test/CodeGen/X86/vselect-post-combine.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-pc-windows-msvc -mattr=+avx2 | FileCheck %s --check-prefix=AVX2
+
----------------
pengfei wrote:
> This problem is not limited to Windows. Linux has the some problem if passing a ptr and load it to `<32 x i8>`.
Yes, I tended to rm windows then I found it is not triggered, duo to the previous optimization difference.


================
Comment at: llvm/test/CodeGen/X86/vselect-post-combine.ll:4
+
+define ptr @test_mul(<32 x i8> %vecinit.i.i.i.i.i92) {
+; AVX2-LABEL: test_mul:
----------------
LuoYuanke wrote:
> Can we simplify the variable name?
Sure


================
Comment at: llvm/test/CodeGen/X86/vselect-post-combine.ll:26
+  store <4 x i64> %3, ptr null, align 1
+  ret ptr null
+}
----------------
RKSimon wrote:
> You might be able to further simplify this test?
Let me have a try, in fact I spend a lot time to simplify the raw test to current status : )


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146683/new/

https://reviews.llvm.org/D146683



More information about the llvm-commits mailing list