[PATCH] D139627: clang/X86: Don't emit "min-legal-vector-width"="0"

Thu Dec 8 05:49:41 PST 2022

arsenm created this revision.
arsenm added reviewers: RKSimon, craig.topper, aaron.ballman, pengfei, serge-sans-paille, xbolva00, jdoerfert, nikic, FreddyYe.
Herald added subscribers: kosarev, StephenFan, arphaman, hiraditya, jvesely.
Herald added a project: All.
arsenm requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

This should be NFC as far as clang end user experience is concerned,
but does change X86's interpretation of IR without the attribute
explicitly specified.

This was clutter that's always annoyed me. I have no idea what this
does and just don't want to see it anymore. This was previously
attempted in D97116 <https://reviews.llvm.org/D97116>, which was reverted. It didn't attempt to deal
with the X86 backend's backwards treatment of the unset attribute
case. I tried out the reported regressing sample, and end to end I get
the same result before and after.

I have no idea what this attribute means and it seems to be very x86
specific spaghetti spread all around. It's constantly getting
recomputed. The IR documentation was only recently added, and makes no
sense to me. I have no idea what "legal" means in this context. At
first glance this corresponds to clang's min_vector_width attribute,
with a slightly different name. There seems to be an aggressive amount
of reinterpetation of this value.

The adventure begins in the calling convention
lowering. X86_64ABIInfo::classifyRegCallStructType, which computes a
"MaxVectorWidth", updating it based on the C types on certain
arguments. That is then used to initialize
CGFunctionInfo::setMaxVectorWidth.

CodeGenFunction::StartFunction then initializes LargestVectorWidth
based on MinVectorWidthAttr. CodeGenFunction::EmitCall computes a max
vector width again based on IR types for call sites? Yet another
reduction over the IR argument and return type is (redundantly?)
performed before final emission of the attribute. This is then clamped
to the value apparently computed during the initial calling convention
lowering?

In the middle end there's some additional insanity with attribute
propagation. There are additional argument type reductions. Finally,
the actual use modifies the X86Subtarget construction. It default
assumed UINT_MAX if the attribute isn't set, contrary to 0 this value
apparently used everywhere else. The point the subtarget is
constructed isn't well defined, so the underlying function's
min-legal-vector-width may have changed later during these attribute
propagation passes. Most of this code also consistently doesn't handle
vectors of pointers correctly.

The documentation and commit message for D123284 <https://reviews.llvm.org/D123284> make no sense to
me. It's implying it's an ABI attribute, but the fact that it changes
in so many places and that clang doesn't emit it on declarations
implies it can't be one.

This avoids it the most common case, but it's still emitted for any
function with a vector in the arguments or return type. It should
perhaps be moved into x86 specific attribute emission. It would be
reasonable to treat this as a generic attribute if it was a 1:1
mapping from the source level attribute, which merely passes through
to the backend. Is there a reason we can't just have this pass through
to let x86 do the IR argument type inspection later? To minimize test
churn, I added yet another reduction to handle unannotated functions
in the subtarget construction. I believe this should be removed and
the remaining tests updated.

The net effect of all this complexity seems to be pretty small, only
changing handling of very large vectors on subtargets with avx512. I
updated the tests by merely placing min-legal-vector-width=512
(although some of the test failures without this looked reasonable to
just update, although they looked like regressions).

https://reviews.llvm.org/D139627

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/test/CodeGen/aarch64-neon-ldst-one.c
  clang/test/CodeGen/aarch64-neon-scalar-x-indexed-elem.c
  clang/test/CodeGen/aarch64-poly128.c
  clang/test/CodeGen/aarch64-poly64.c
  clang/test/CodeGen/regcall2.c
  clang/test/CodeGenCXX/arm-generated-fn-attr.cpp
  clang/test/CodeGenCXX/dllexport-ctor-closure-nested.cpp
  clang/test/CodeGenCXX/dllexport-ctor-closure.cpp
  clang/test/CodeGenCXX/dllexport.cpp
  clang/test/OpenMP/amdgcn-attributes.cpp
  clang/test/OpenMP/irbuilder_safelen.cpp
  clang/test/OpenMP/irbuilder_safelen_order_concurrent.cpp
  clang/test/OpenMP/irbuilder_simd_aligned.cpp
  clang/test/OpenMP/irbuilder_simdlen.cpp
  clang/test/OpenMP/irbuilder_simdlen_safelen.cpp
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/test/Analysis/CostModel/X86/masked-interleaved-load-i16.ll
  llvm/test/Analysis/CostModel/X86/masked-interleaved-store-i16.ll
  llvm/test/Analysis/CostModel/X86/masked-intrinsic-cost-inseltpoison.ll
  llvm/test/Analysis/CostModel/X86/masked-intrinsic-cost.ll
  llvm/test/Analysis/CostModel/X86/powi.ll
  llvm/test/CodeGen/X86/avx512-calling-conv.ll
  llvm/test/CodeGen/X86/avx512bw-mask-op.ll
  llvm/test/CodeGen/X86/avx512fp16-subv-broadcast-fp16.ll
  llvm/test/CodeGen/X86/perm.avx512-false-deps.ll
  llvm/test/CodeGen/X86/pr47299.ll
  llvm/test/CodeGen/X86/pr48727.ll
  llvm/test/CodeGen/X86/vector-shuffle-avx512.ll
  llvm/test/CodeGen/X86/vector-trunc-usat.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D139627.481256.patch
Type: text/x-patch
Size: 110247 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20221208/049431e3/attachment-0001.bin>