[PATCH] D134982: [X86] Add support for "light" AVX

Phoebe Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 10 23:48:47 PDT 2022


pengfei added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:2665
       if (Op.size() >= 32 && Subtarget.hasAVX() &&
-          (Subtarget.getPreferVectorWidth() >= 256)) {
+          (Subtarget.getPreferVectorWidth() >= 256 || EnableLightAVX)) {
         // Although this isn't a well-supported type for AVX1, we'll let
----------------
TokarIP wrote:
> pengfei wrote:
> > Here the check for 256 was intentional, coming from rG47272217 authored by @echristo.
> > It looks to me like this is the only difference between `prefer-128-bit` and `prefer-256-bit`. So I don't understand why one would use `-mattr=prefer-128-bit -x86-light-avx=true` rather than `prefer-256-bit`.
> I'm not sure I understand the question. Building everything with prefer-256-bit means getting, e.g., 256-bit FMA and the corresponding frequency penalty. I want 256-bit loads/stores because they are a free performance win, but not the "heavy" instructions.
Oh, I thought this was the only place where we compare PreferVectorWidth with 256. (There are two other places in X86TargetTransformInfo.cpp.)
I mentioned @echristo's patch because I guess making loads/stores the same size as PreferVectorWidth is beneficial in most cases. Also, do we need to consider the density of "heavy" instructions? E.g., suppose we load a 256-bit vector, split it into 2 x 128-bit halves to do a single FMA, and then recombine to 256-bit to do the store. Is that really better than 2 x 128-bit loads/stores that might be folded into the FMA instruction?
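[Editor's note] To make the trade-off above concrete, here is a minimal intrinsics sketch, not part of this patch: the function names, the fixed 8-float width, and the assumption that AVX+FMA are available are all illustrative. It contrasts the "light AVX" shape (256-bit load/store, FMA kept at 128-bit) with an all-128-bit version where the loads can be folded into the FMA's memory operand.

#include <immintrin.h>

// "Light AVX" shape: 256-bit load/store, with the FMA done on two 128-bit
// halves (vextractf128/vinsertf128 around two xmm FMAs).
void fma_light_avx(float *dst, const float *a, const float *b, const float *c) {
  __m256 va = _mm256_loadu_ps(a);   // 256-bit loads
  __m256 vb = _mm256_loadu_ps(b);
  __m256 vc = _mm256_loadu_ps(c);
  __m128 lo = _mm_fmadd_ps(_mm256_castps256_ps128(va),
                           _mm256_castps256_ps128(vb),
                           _mm256_castps256_ps128(vc));   // 128-bit FMA, low half
  __m128 hi = _mm_fmadd_ps(_mm256_extractf128_ps(va, 1),
                           _mm256_extractf128_ps(vb, 1),
                           _mm256_extractf128_ps(vc, 1)); // 128-bit FMA, high half
  // Recombine the halves and do a single 256-bit store.
  _mm256_storeu_ps(dst, _mm256_insertf128_ps(_mm256_castps128_ps256(lo), hi, 1));
}

// All-128-bit shape: no extract/insert shuffles, and each load can be folded
// into the FMA's memory operand.
void fma_128(float *dst, const float *a, const float *b, const float *c) {
  for (int i = 0; i < 8; i += 4) {
    __m128 r = _mm_fmadd_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i),
                            _mm_loadu_ps(c + i));
    _mm_storeu_ps(dst + i, r);
  }
}

Building both with something like `clang -O2 -mfma` and comparing the generated assembly is one way to gauge whether the extra vextractf128/vinsertf128 traffic outweighs the wider loads/stores in FMA-dense code.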


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134982/new/

https://reviews.llvm.org/D134982

