[PATCH] D48278: [SelectionDAG] Fold redundant masking operations of shifted value
Diogo N. Sampaio via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 13 01:00:06 PDT 2018
dnsampaio added a comment.
In https://reviews.llvm.org/D48278#1155793, @spatel wrote:
> In https://reviews.llvm.org/D48278#1155779, @spatel wrote:
>
> > In https://reviews.llvm.org/D48278#1155560, @dnsampaio wrote:
> >
> > > It reduces the number of computation operations, from 3 to 2, and the number of constants kept int constants for performing the masking, from 2 to 1.
> > > I don't see how it increases the latency. If you are going to perform the masking and the shift anyway.
> >
> >
> > Ah, I see that now. But I'm not convinced this is the right approach. Why are we waiting to optimize this in the backend? This is a universally good optimization, so it should be in IR:
> > https://rise4fun.com/Alive/O04
> >
> > I'm not sure exactly where that optimization belongs. Ie, is it EarlyCSE, GVN, somewhere else, or is it its own pass? But I don't see any benefit in waiting to do this in the DAG.
>
>
> This also raises a question that has come up in another review recently - https://reviews.llvm.org/D41233. If we reverse the canonicalization of shl+and, we would solve the most basic case that I showed above:
>
> define i32 @shl_first(i32 %a {
> %t2 = shl i32 %a, 8
> %t3 = and i32 %t2, 44032
> ret i32 %t3
> }
>
> define i32 @mask_first(i32 %a) {
> %a2 = and i32 %a, 172
> %a3 = shl i32 %a2, 8
> ret i32 %a3
> }
>
>
This "canonicalization" won't help to prevent even basic duplicated masked values when using a lshr:
%0 = sext i16 %a to i32
%1 = lshr i32 %0, 8
%2 = and i32 %1, 172
%3 = and i32 %0, 44032
And a simplest case, that it is already in the test case, that won't be handled in the IR level:
define i32 @ror(i32 %a) {
entry:
%m2 = and i32 %a, 3855
%shl = shl i32 %a, 24
%shr = lshr i32 %a, 8
%or = or i32 %shl, %shr
%m1 = and i32 %or, 251658255
%or2 = or i32 %m1, %m2
ret i32 %or2
}
The shl shr instructions become a ror that masks the same masked value.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:5986
+ const auto &MaskedValue = dyn_cast<SDNode>(MASKED);
+ for (SDNode *OtherUser : MaskedValue->uses()) {
+ if ((&(*OtherUser) == ShiftAmount) || (OtherUser->getOpcode() != ISD::AND))
----------------
efriedma wrote:
> Walking the uses of an SDNode can get expensive... is this O(N^2) in the number of uses of MaskedValue? If not, please add a comment explaining why it isn't.
Ok, will change it as to work only with MaskedValue used by a shift and a AND operations.
https://reviews.llvm.org/D48278
More information about the llvm-commits
mailing list