[PATCH] D147996: [X86] combineConcatVectorOps - remove FADD/FSUB/FMUL handling (2-1)

Tue Apr 11 00:35:01 PDT 2023

xiangzhangllvm created this revision.
xiangzhangllvm added reviewers: RKSimon, LuoYuanke, pengfei.
Herald added a subscriber: hiraditya.
Herald added a project: All.
xiangzhangllvm requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

VADD, VSUB, and VMUL can execute on more ports than VINSERT, so this patch
removes the FADD/FSUB/FMUL handling from combineConcatVectorOps.
For more details, please refer to fb91f0 <https://reviews.llvm.org/rGfb91f0a298492ed7db8928e18cda35228474007e>

**Problems**
**1** This patch causes the affected tests to generate worse code. We need to fix that.

  PS: After [[ https://reviews.llvm.org/rGfb91f0a298492ed7db8928e18cda35228474007e | fb91f0 ]], another commit, [[ https://reviews.llvm.org/rG649b14928a67e016f3e01ac46499aaf1824c2d09 | 649b14 ]], optimized these tests.

**2** We may also need to root-cause why the middle-end vector optimization did not work for these cases.

  For example, the function widen_fadd_v2f32_v8f32 in llvm/test/CodeGen/X86/widen_fadd.ll
  was not optimized (in AVX mode)

**from**

  define void @widen_fadd_v2f32_v8f32(ptr %a0, ptr %b0, ptr %c0) {
    %a2 = getelementptr inbounds i8, ptr %a0, i64 8
    %b2 = getelementptr inbounds i8, ptr %b0, i64 8
    %c2 = getelementptr inbounds i8, ptr %c0, i64 8
    %a4 = getelementptr inbounds i8, ptr %a0, i64 16
    %b4 = getelementptr inbounds i8, ptr %b0, i64 16
    %c4 = getelementptr inbounds i8, ptr %c0, i64 16
    %a6 = getelementptr inbounds i8, ptr %a0, i64 24
    %b6 = getelementptr inbounds i8, ptr %b0, i64 24
    %c6 = getelementptr inbounds i8, ptr %c0, i64 24
    %va0 = load <2 x float>, ptr %a0, align 4
    %vb0 = load <2 x float>, ptr %b0, align 4
    %va2 = load <2 x float>, ptr %a2, align 4
    %vb2 = load <2 x float>, ptr %b2, align 4
    %va4 = load <2 x float>, ptr %a4, align 4
    %vb4 = load <2 x float>, ptr %b4, align 4
    %va6 = load <2 x float>, ptr %a6, align 4
    %vb6 = load <2 x float>, ptr %b6, align 4
    %vc0 = fadd <2 x float> %va0, %vb0
    %vc2 = fadd <2 x float> %va2, %vb2
    %vc4 = fadd <2 x float> %va4, %vb4
    %vc6 = fadd <2 x float> %va6, %vb6
    store <2 x float> %vc0, ptr %c0, align 4
    store <2 x float> %vc2, ptr %c2, align 4
    store <2 x float> %vc4, ptr %c4, align 4
    store <2 x float> %vc6, ptr %c6, align 4
    ret void
  }

**to**

  define void @widen_fadd_v2f32_v8f32(ptr %a0, ptr %b0, ptr %c0) {
    %va0 = load <8 x float>, ptr %a0, align 4
    %vb0 = load <8 x float>, ptr %b0, align 4
    %vc0 = fadd <8 x float> %va0, %vb0
    store <8 x float> %vc0, ptr %c0, align 4
    ret void
  }
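
To illustrate why the two IR forms above should be interchangeable, here is a minimal scalar sketch in C++. It is not part of the patch: the function names fadd_4x_v2f32, fadd_1x_v8f32 and formsAgree are hypothetical, and the model assumes the three buffers do not overlap. It only checks that four adjacent <2 x float> adds produce the same eight results as one <8 x float> add; it says nothing about which form the backend or the middle end actually picks.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of the "from" IR: four independent <2 x float> fadds on
// adjacent 8-byte chunks (byte offsets 0, 8, 16, 24).
// Assumption: a, b and c point to non-overlapping arrays of 8 floats.
void fadd_4x_v2f32(const float *a, const float *b, float *c) {
  assert(a && b && c);
  for (std::size_t chunk = 0; chunk < 4; ++chunk) {
    for (std::size_t lane = 0; lane < 2; ++lane) {
      const std::size_t i = chunk * 2 + lane;
      c[i] = a[i] + b[i];
    }
  }
}

// Scalar model of the "to" IR: a single <8 x float> fadd.
void fadd_1x_v8f32(const float *a, const float *b, float *c) {
  assert(a && b && c);
  for (std::size_t i = 0; i < 8; ++i)
    c[i] = a[i] + b[i];
}

// Runs both models on the same inputs and reports whether every lane matches.
bool formsAgree() {
  const std::array<float, 8> a = {1.0f, 2.5f, -3.0f, 4.25f,
                                  0.0f, 8.0f, -16.5f, 32.0f};
  const std::array<float, 8> b = {0.5f, 0.5f, 3.0f, -4.25f,
                                  1.0f, 2.0f, 16.5f, 0.125f};
  std::array<float, 8> narrow{};
  std::array<float, 8> wide{};
  fadd_4x_v2f32(a.data(), b.data(), narrow.data());
  fadd_1x_v8f32(a.data(), b.data(), wide.data());
  return narrow == wide;
}
```

Each lane is computed by the same single-precision add in both models, so the results match element by element. The open question in problem 2 is therefore not whether the merge is valid, but why it was not performed before instruction selection.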

https://reviews.llvm.org/D147996

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/widen_fadd.ll
  llvm/test/CodeGen/X86/widen_fmul.ll
  llvm/test/CodeGen/X86/widen_fsub.ll
