[PATCH] D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 12 05:42:54 PDT 2018
dmgreen added a comment.
Yeah, that looks like similar IR to what I was looking at. The vectorised version on Skylake (https://godbolt.org/z/RBS2Os) has a lot of shuffling; perhaps that's deemed unprofitable on Goldmont?
I agree that 8 registers are hard to deal with. Can you explain what you mean by "promoting everything to 32-bits"? Do you mean essentially zexts/truncs around the whole max/max/xor/sub block? I gave that a try and the subs still seemed to be using bl. (It uses cmps rather than branches, though, which looks better to my untrained eyes.)
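For reference, the fold in the subject line can be sanity-checked exhaustively at 8 bits. This is only a rough sketch, not the patch's own test: it assumes signed min/max and wrapping i8 arithmetic, and the helper names (`to_signed`, `bit_not`, `fold_holds`) are made up for illustration.

```python
def to_signed(x, bits=8):
    """Wrap an integer to a signed two's-complement value of the given width."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >= (1 << (bits - 1)) else x


def bit_not(x, bits=8):
    """Bitwise NOT at the given width (the xor with -1 in the IR)."""
    return to_signed(~x, bits)


def fold_holds(bits=8):
    """Check ~A - min(~A, O) == max(A, ~O) - A, and the max/min twin,
    for every pair of signed values of the given width."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    for a in range(lo, hi):
        for o in range(lo, hi):
            na, no = bit_not(a, bits), bit_not(o, bits)
            # Min on the left becomes Max on the right.
            if to_signed(na - min(na, o), bits) != to_signed(max(a, no) - a, bits):
                return False
            # Max on the left becomes Min on the right.
            if to_signed(na - max(na, o), bits) != to_signed(min(a, no) - a, bits):
                return False
    return True
```

The identity holds because NOT reverses ordering, so min(~A, O) is ~max(A, ~O), and the two NOTs cancel in the subtraction.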
Repository:
rL LLVM
https://reviews.llvm.org/D52177