[PATCH] D42946: Verify profile data confirms large loop trip counts.

Mircea Trofin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 5 22:59:35 PST 2018


mtrofin added inline comments.


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:8353
 
-  if (!HasExpectedTC && LoopVectorizeWithBlockFrequency) {
+  // ExpectedTC may be large because it's bound by a variable. Check
+  // profiling information to validate we should vectorize.
----------------
davidxl wrote:
> What is the ExpectedTC returned in this case?  Why does it not return CouldNotCompute  (or 0)?
For example, for:

```
for (unsigned i = 1; i < something; ++i) {
  ...
}
```

it returns 0xffff fffe - assuming `unsigned something`. That's correct - that's the maximum times the loop body would be executed, because `something` might at most be 0xffff ffff. If the step were 2 instead of 1, the ExpectedTC would reflect that accordingly (meaning, the trip count would be half). 
In contrast, CouldNotCompute is produced in cases like unsupported loops, for instance those with multiple exits.

When it comes to '0', the API behaves a bit unexpectedly, I'd argue: in cases like the ones in the current set of unit tests, it ends up adding 1 to the max taken back edges (which is 0xffff ffff there), which through overflow gets us to 0. That was the reason those tests were passing.

I'd love to hear others' thoughts are on this overflow behavior, and whether there's any reason we shouldn't refactor it, because relying on the overflow to represent "not having an expected trip count" falls through in cases like I just presented (e.g. non-unitary step, starting from something else than 0, etc)

This issue of determining "no expected trip counts" aside, I'd argue the issue addressed here can be addressed separately.


Repository:
  rL LLVM

https://reviews.llvm.org/D42946





More information about the llvm-commits mailing list