[all-commits] [llvm/llvm-project] 0b91de: [X86] Add X86FixupVectorConstantsPass to re-fold A...

Tue May 23 03:01:27 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 0b91de5ea32d40a25c609bf155426fea12c1e2fb
      https://github.com/llvm/llvm-project/commit/0b91de5ea32d40a25c609bf155426fea12c1e2fb
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2023-05-23 (Tue, 23 May 2023)

  Changed paths:
    M llvm/lib/Target/X86/CMakeLists.txt
    M llvm/lib/Target/X86/X86.h
    A llvm/lib/Target/X86/X86FixupVectorConstants.cpp
    M llvm/lib/Target/X86/X86InstrFoldTables.cpp
    M llvm/lib/Target/X86/X86InstrFoldTables.h
    M llvm/lib/Target/X86/X86TargetMachine.cpp
    M llvm/test/CodeGen/X86/avx512-calling-conv.ll
    M llvm/test/CodeGen/X86/avx512-ext.ll
    M llvm/test/CodeGen/X86/avx512-logic.ll
    M llvm/test/CodeGen/X86/avx512fp16-cvt-ph-w-vl-intrinsics.ll
    M llvm/test/CodeGen/X86/avx512vl-logic.ll
    M llvm/test/CodeGen/X86/bitcast-vector-bool.ll
    M llvm/test/CodeGen/X86/combine-and.ll
    M llvm/test/CodeGen/X86/combine-sdiv.ll
    M llvm/test/CodeGen/X86/dpbusd_const.ll
    M llvm/test/CodeGen/X86/dpbusd_i4.ll
    M llvm/test/CodeGen/X86/gfni-funnel-shifts.ll
    M llvm/test/CodeGen/X86/gfni-rotates.ll
    M llvm/test/CodeGen/X86/gfni-shifts.ll
    M llvm/test/CodeGen/X86/horizontal-reduce-smax.ll
    M llvm/test/CodeGen/X86/horizontal-reduce-smin.ll
    M llvm/test/CodeGen/X86/i64-to-float.ll
    M llvm/test/CodeGen/X86/icmp-pow2-diff.ll
    M llvm/test/CodeGen/X86/midpoint-int-vec-128.ll
    M llvm/test/CodeGen/X86/midpoint-int-vec-256.ll
    M llvm/test/CodeGen/X86/midpoint-int-vec-512.ll
    M llvm/test/CodeGen/X86/min-legal-vector-width.ll
    M llvm/test/CodeGen/X86/movmsk-cmp.ll
    M llvm/test/CodeGen/X86/opt-pipeline.ll
    M llvm/test/CodeGen/X86/paddus.ll
    M llvm/test/CodeGen/X86/prefer-avx256-lzcnt.ll
    M llvm/test/CodeGen/X86/prefer-avx256-mulo.ll
    M llvm/test/CodeGen/X86/prefer-avx256-shift.ll
    M llvm/test/CodeGen/X86/prefer-avx256-trunc.ll
    M llvm/test/CodeGen/X86/prefer-avx256-wide-mul.ll
    M llvm/test/CodeGen/X86/psubus.ll
    M llvm/test/CodeGen/X86/rotate-extract-vector.ll
    M llvm/test/CodeGen/X86/rotate_vec.ll
    M llvm/test/CodeGen/X86/sadd_sat_vec.ll
    M llvm/test/CodeGen/X86/srem-seteq-vec-nonsplat.ll
    M llvm/test/CodeGen/X86/ssub_sat_vec.ll
    M llvm/test/CodeGen/X86/usub_sat_vec.ll
    M llvm/test/CodeGen/X86/vec-strict-inttofp-128-fp16.ll
    M llvm/test/CodeGen/X86/vec-strict-inttofp-256-fp16.ll
    M llvm/test/CodeGen/X86/vec-strict-inttofp-256.ll
    M llvm/test/CodeGen/X86/vec-strict-inttofp-512-fp16.ll
    M llvm/test/CodeGen/X86/vector-fshl-128.ll
    M llvm/test/CodeGen/X86/vector-fshl-256.ll
    M llvm/test/CodeGen/X86/vector-fshl-512.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-128.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-256.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-128.ll
    M llvm/test/CodeGen/X86/vector-fshr-256.ll
    M llvm/test/CodeGen/X86/vector-fshr-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-128.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-256.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
    M llvm/test/CodeGen/X86/vector-idiv-sdiv-512.ll
    M llvm/test/CodeGen/X86/vector-idiv-udiv-512.ll
    M llvm/test/CodeGen/X86/vector-lzcnt-128.ll
    M llvm/test/CodeGen/X86/vector-lzcnt-256.ll
    M llvm/test/CodeGen/X86/vector-lzcnt-512.ll
    M llvm/test/CodeGen/X86/vector-mul.ll
    M llvm/test/CodeGen/X86/vector-pack-128.ll
    M llvm/test/CodeGen/X86/vector-pack-256.ll
    M llvm/test/CodeGen/X86/vector-pack-512.ll
    M llvm/test/CodeGen/X86/vector-pcmp.ll
    M llvm/test/CodeGen/X86/vector-reduce-add-mask.ll
    M llvm/test/CodeGen/X86/vector-reduce-or-bool.ll
    M llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
    M llvm/test/CodeGen/X86/vector-reduce-smax.ll
    M llvm/test/CodeGen/X86/vector-reduce-smin.ll
    M llvm/test/CodeGen/X86/vector-rotate-128.ll
    M llvm/test/CodeGen/X86/vector-rotate-256.ll
    M llvm/test/CodeGen/X86/vector-rotate-512.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-128.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-256.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-512.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-sub128.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-128.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-256.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-512.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-sub128.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-128.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-256.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-512.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-sub128.ll
    M llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
    M llvm/test/CodeGen/X86/vselect-pcmp.ll

  Log Message:
  -----------
  [X86] Add X86FixupVectorConstantsPass to re-fold AVX512 vector load folds as broadcast folds

This patch analyzes AVX512 instructions for full vector width folded loads from the constant pool and attempts to determine if it can be replaced with a smaller broadcast folded variant. Typically the broadcast opportunities were missed by type-width mismatches or mulituse limitations which have been removed in later passes.

As well as introducing broadcast fold tables (which can hopefully be extended/automated in the future), this also handles mismatches in the AND/ANDN/OR/XOR/TERNLOG type-widths, catching additional missed opportunities.

This is patch is pulled from the ongoing work based on D150143, but without removing the existing DAG constant broadcast lowering code - this patch is currently a late stage cleanup only.

The intention is to add additional broadcast/extension handling of constants in future patches, but it turned out that AVX512 broadcast handling was the easiest to start with.

Differential Revision: https://reviews.llvm.org/D150526