[PATCH] D80491: [LV] Clamp MaxVF to power of 2
Bjorn Pettersson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 1 07:28:20 PDT 2020
bjope added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5056
+ // Ensure MaxVF is a power of 2; the dependence distance bound may not be.
+ WidestRegister = PowerOf2Floor(WidestRegister);
+
----------------
This has caused some problems for us (downstream) and I'm not really sure how to deal with it.
Maybe this never happens for in-tree targets, but our target will return 160 for PowerOf2Floor. And I haven't really seen anything that says that TTI.getRegisterBitWidth(true) must return a power of 2 number.
Another, perhaps abnormal, thing is that our frontend support i40 types. So SmallestType/WidestType isn't guaranteed to be power-of-2 either.
Even if we actually want to scalarize operations using i40, doing PowerOf2Floor makes WidestRegister=128. And then we won't get a power-of-2 result from this method, since 128/40 isn't a power of 2 (considering that SmallestType/WidestType may end up not being a power-of-2). And then we trigger the new assert in computeMaxVF.
So in some sense this patch limits the types allowed in loops to be power-of-2-sized. And returning something that isn't a power-of-two from TTI.getRegisterBitWidth is now stupid.
Maybe there are other ways to ensure MaxVF is a power-of-2, or were these implications for getSmallestAndWidestTypes and getRegisterBitWidth intentional?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D80491/new/
https://reviews.llvm.org/D80491
More information about the llvm-commits
mailing list