[PATCH] D46494: [DAGCombiner] Masked merge: enhance handling of 'andn' with immediates
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 7 07:38:35 PDT 2018
spatel added a subscriber: andreadb.
spatel added a comment.
In https://reviews.llvm.org/D46494#1089380, @lebedev.ri wrote:
> Fixed with the correct fold, updated mca diffs in the differential's description:
> F6120274: trunk-vs-patch.txt <https://reviews.llvm.org/F6120274> F6120277: prevpatch-vs-patch.txt <https://reviews.llvm.org/F6120277>
We need to be careful here (and maybe there's a way for mca to show/warn about this, cc @andreadb).
When you simulate these instructions:
andnl %edx, %edi, %eax
orl $42, %edx
andnl %edx, %eax, %eax
Notice that the output of the sequence (%eax) is not used again by any instruction in the sequence. So this is measuring ideal throughput in a vacuum - each simulated iteration proceeds independently. Maybe that's what you intended, but the original sequence that you're comparing does not have that property:
andl %edx, %edi
notl %edx
andl $42, %edx
orl %edi, %edx <--- output fed back as the input to first instruction
Each iteration depends on the previous one, so it's not fair to compare the stats for the 2 sequences as they're shown in the attached diff.
================
Comment at: test/CodeGen/X86/unfold-masked-merge-scalar-variablemask.ll:707-710
%notmask = xor i32 %mask, -1
%n0 = xor i32 %x, 42 ; %x
%n1 = and i32 %n0, %notmask
%r = xor i32 %n1, 42
----------------
I'd still prefer to put a comment on tests like this where we're testing a non-canonical form, so we have a reminder that we're testing this for completeness, but it's not expected. Ie, instcombine would improve this sequence to remove the inverted mask.
Repository:
rL LLVM
https://reviews.llvm.org/D46494
More information about the llvm-commits
mailing list