[PATCH] D47401: [X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits where possible.
Craig Topper via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Sat May 26 18:03:56 PDT 2018
craig.topper added inline comments.
Comment at: cfe/trunk/lib/Headers/avx512fintrin.h:9855
+ __v8di __t6 = (__v8di)_mm512_##op(__t4, __t5); \
+ return __t6;
> Would it be dumb to allow VLX-capable CPUs to use 128/256 variants of VPMAXUQ etc.? Or is it better to focus on improving SimplifyDemandedElts to handle this (and many other reduction cases that all tend to keep to the original vector width)?
I'm not sure how to do that from clang. Should we be using a reduction intrinsic and doing custom lowering in the backend?
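
For reference, the width-narrowing pattern under discussion can be modelled in portable scalar C roughly as below. This is only a sketch: `reduce_max_epu64_model` is a hypothetical name, the array stands in for the eight 64-bit lanes of a 512-bit vector, and it is not the code in avx512fintrin.h. Each pass folds the upper half onto the lower half, so the live width goes 512 -> 256 -> 128 bits before the final scalar compare.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of an unsigned 64-bit max reduction over the
   8 lanes of a 512-bit vector. Illustration only, not the header's code. */
static uint64_t reduce_max_epu64_model(const uint64_t v[8]) {
    uint64_t lanes[8];
    for (size_t i = 0; i < 8; ++i)
        lanes[i] = v[i];

    /* width = 8 lanes (512 bits), then 4 (256 bits), then 2 (128 bits). */
    for (size_t width = 8; width > 1; width /= 2) {
        size_t half = width / 2;
        /* Fold the upper half onto the lower half; only the lower half
           is live afterwards, so later steps need a narrower vector. */
        for (size_t i = 0; i < half; ++i) {
            if (lanes[i + half] > lanes[i])
                lanes[i] = lanes[i + half];
        }
    }
    return lanes[0];
}
```

In the real header the folds would be vector max operations on extracted halves; whether the 128/256-bit forms of VPMAXUQ can be used there depends on VLX being available, which is the open question above.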