[PATCH] D134982: [X86] Add support for "light" AVX
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 10 23:48:47 PDT 2022
pengfei added inline comments.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:2665
if (Op.size() >= 32 && Subtarget.hasAVX() &&
- (Subtarget.getPreferVectorWidth() >= 256)) {
+ (Subtarget.getPreferVectorWidth() >= 256 || EnableLightAVX)) {
// Although this isn't a well-supported type for AVX1, we'll let
----------------
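(For context: `EnableLightAVX` in the diff above is presumably backed by a cl::opt, given the `-x86-light-avx=true` option mentioned below. A minimal sketch of how such a flag is conventionally declared in an LLVM target file; the exact name, default, and description in the patch may differ.)

#include "llvm/Support/CommandLine.h"
using namespace llvm;

// Hypothetical declaration; the option string is taken from the
// -x86-light-avx flag referenced in this review, everything else assumed.
static cl::opt<bool> EnableLightAVX(
    "x86-light-avx", cl::init(false), cl::Hidden,
    cl::desc("Allow 256-bit loads/stores even when the preferred vector "
             "width is 128 bits, without enabling heavy 256-bit ops"));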
TokarIP wrote:
> pengfei wrote:
> > Here the check for 256 was intentional; it comes from rG47272217, authored by @echristo.
> > It looks to me like this is the only difference between `prefer-128-bit` and `prefer-256-bit`, so I don't understand why one would use `-mattr=prefer-128-bit -x86-light-avx=true` rather than `prefer-256-bit`.
> I'm not sure I understand the question. Building everything with prefer-256-bit means getting, e.g., 256-bit FMA and the corresponding frequency penalty. I want to get 256-bit loads/stores because they are a free performance win, but not the "heavy" instructions.
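(To make that concrete: the check above appears to be in X86TargetLowering::getOptimalMemOpType, the hook that picks the vector type used to inline-expand memcpy/memset. A small illustration, not taken from the patch, of the kind of code it affects; with the 256-bit path enabled, the 32-byte copy below can be lowered to a single 256-bit load/store pair instead of two 128-bit pairs, with no 256-bit arithmetic involved.)

#include <cstring>

// Fixed-size copy that the backend expands inline using the type chosen
// by getOptimalMemOpType; 256-bit moves here are a "light" AVX use.
void copy32(void *dst, const void *src) {
  std::memcpy(dst, src, 32);
}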
Oh, I thought this was the only place where we compare PreferVectorWidth with 256. (There are two other places in X86TargetTransformInfo.cpp.)
I mentioned @echristo's patch because I guess that making loads/stores the same size as PreferVectorWidth is beneficial in most cases. And do we need to consider the density of "heavy" instructions? E.g., if we load a 256-bit vector, split it into two 128-bit halves to do the FMA, and then shuffle the results back into a 256-bit value for the store, is that really better than two 128-bit loads/stores that might be folded into the FMA instruction? A rough sketch of the two code shapes follows below.
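(Intrinsics-level approximation of the two shapes in question, illustrative only and not compiler output; assumes -mavx -mfma. Variant A models a 256-bit load/store with the FMA split into 128-bit halves; variant B models two 128-bit loads/stores where the loads can fold directly into the FMA memory operands.)

#include <immintrin.h>

// Variant A: 256-bit loads/store, FMA done as two 128-bit halves with
// extract/insert shuffles around the 256-bit values.
void fma_split256(float *a, const float *b, const float *c) {
  __m256 va = _mm256_loadu_ps(a);                        // 256-bit load
  __m256 vb = _mm256_loadu_ps(b);                        // 256-bit load
  __m256 vc = _mm256_loadu_ps(c);                        // 256-bit load
  __m128 lo = _mm_fmadd_ps(_mm256_castps256_ps128(vb),
                           _mm256_castps256_ps128(vc),
                           _mm256_castps256_ps128(va));  // low 128-bit FMA
  __m128 hi = _mm_fmadd_ps(_mm256_extractf128_ps(vb, 1),
                           _mm256_extractf128_ps(vc, 1),
                           _mm256_extractf128_ps(va, 1)); // high 128-bit FMA
  __m256 r = _mm256_insertf128_ps(_mm256_castps128_ps256(lo), hi, 1);
  _mm256_storeu_ps(a, r);                                // 256-bit store
}

// Variant B: two 128-bit loads/stores; the loads can fold into the FMAs
// as memory operands, so no extra shuffles are needed.
void fma_2x128(float *a, const float *b, const float *c) {
  __m128 lo = _mm_fmadd_ps(_mm_loadu_ps(b), _mm_loadu_ps(c), _mm_loadu_ps(a));
  __m128 hi = _mm_fmadd_ps(_mm_loadu_ps(b + 4), _mm_loadu_ps(c + 4),
                           _mm_loadu_ps(a + 4));
  _mm_storeu_ps(a, lo);
  _mm_storeu_ps(a + 4, hi);
}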
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134982/new/
https://reviews.llvm.org/D134982