[PATCH] D80491: [LV] Clamp MaxVF to power of 2

Mon Jun 1 07:28:20 PDT 2020

bjope added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5056
+  // Ensure MaxVF is a power of 2; the dependence distance bound may not be.
+  WidestRegister = PowerOf2Floor(WidestRegister);
+
----------------
This has caused some problems for us (downstream) and I'm not really sure how to deal with it.

Maybe this never happens for in-tree targets, but our target will return 160 for PowerOf2Floor. And I haven't really seen anything that says that TTI.getRegisterBitWidth(true) must return a power of 2 number.
Another, perhaps abnormal, thing is that our frontend support i40 types. So SmallestType/WidestType isn't guaranteed to be power-of-2 either.

Even if we actually want to scalarize operations using i40, doing PowerOf2Floor makes WidestRegister=128. And then we won't get a power-of-2 result from this method, since 128/40 isn't a power of 2 (considering that SmallestType/WidestType may end up not being a power-of-2). And then we trigger the new assert in computeMaxVF.

So in some sense this patch limits the types allowed in loops to be power-of-2-sized. And returning something that isn't a power-of-two from  TTI.getRegisterBitWidth is now stupid.

Maybe there are other ways to ensure MaxVF is a power-of-2, or were these implications for getSmallestAndWidestTypes and getRegisterBitWidth intentional?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80491/new/

https://reviews.llvm.org/D80491