[PATCH] D57789: [CGP] form usub with overflow from sub+icmp
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 15 06:40:51 PDT 2019
spatel added a comment.
In D57789#1429714 <https://reviews.llvm.org/D57789#1429714>, @Carrot wrote:
> This patch causes a 5% regression in one of our Eigen benchmarks on Haswell.
>
> The problem arises when it combines a CMP in a hot block with a SUB in a cold block into a single SUB in the hot block. On a two-address architecture like x86, if the operand of the CMP has other uses, an extra COPY is needed before the original CMP, so there is one more instruction in the hot block.
>
> Another patch r355823 papered over the problem in our code, but it didn't fix the root cause.
>
> The regression is only observed on Haswell; it doesn't impact Skylake.
Thanks for letting me know. We could limit this transform based on profile metadata (or is there some other heuristic for determining that one block is hot and the other is cold?). In the absence of that information, I think this is the right theoretical optimization at this layer, as shown by the improvements in the tests with this patch. If you have a test that shows the problem, I can take a look.
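For readers following along, here is a rough source-level sketch of the pattern under discussion. It is not taken from the patch or from the benchmark. The function names `separate` and `combined` and the `fallback` parameter are made up for illustration, and which path is hot is only an assumption stated in the comments. The first function keeps the unsigned compare and the subtract in different blocks. The second uses the GCC/Clang builtin `__builtin_sub_overflow`, which for unsigned operands computes the difference and the borrow together, roughly what forming a usub-with-overflow produces.

```cpp
#include <cstdint>

// Hypothetical "before" shape: the compare guards an early exit (assumed to
// be the hot path) and the subtract runs only on the other path (assumed
// cold).
uint64_t separate(uint64_t a, uint64_t b, uint64_t fallback) {
  if (a < b)          // unsigned compare: true exactly when a - b would wrap
    return fallback;
  return a - b;       // subtract lives in a different block
}

// Hypothetical "after" shape: one operation yields both the difference and
// the overflow (borrow) flag, so the subtract now executes before the branch.
// On a two-address target such as x86 the SUB overwrites one of its inputs,
// so if `a` had other uses it would need a copy first, which is the extra
// instruction described in the quoted report.
uint64_t combined(uint64_t a, uint64_t b, uint64_t fallback) {
  uint64_t diff;
  bool borrow = __builtin_sub_overflow(a, b, &diff);  // GCC/Clang builtin
  return borrow ? fallback : diff;
}
```

Both functions return the same value for every input; the only difference is where the subtract is placed relative to the branch.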
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D57789/new/
https://reviews.llvm.org/D57789