[PATCH] D108925: [AMDGPU] enable scalar compare in truncate selection
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 31 16:28:26 PDT 2021
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2112
+ (i1 (UniformUnaryFrag<trunc> i32:$a)),
+ (S_CMP_EQ_U32 (S_AND_B32 (i32 1), $a), (i32 1))
+>;
----------------
rampitec wrote:
> rampitec wrote:
> > alex-t wrote:
> > > rampitec wrote:
> > > > I think we can later fold it into
> > > > ```
> > > > s_bitcmp1_b32 $a, 0
> > > > ```
> > > For now, I would prefer to leave a TODO here. Any objections?
> > No objections. I also do not think we have to select it this way; we can combine it later instead.
> Interestingly, S_AND_B32 will produce SCC = 1 on a non-zero result just by itself! When you land this, we may experiment with removing S_CMP from this pattern completely, although I am not sure how well the rest of the BE is prepared for the lack of a compare, or what the pattern would look like after propagating SCC from moveToVALU().
D109031 will remove the compare, and it is possible to extend it to use s_bitcmp. The latter will be generally beneficial for any scalar code checking a bitfield, not just this pattern.
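To illustrate why the three forms discussed in this thread should agree for an i32-to-i1 truncate, here is a minimal C++ sketch that models only the SCC result of each instruction. The function names are hypothetical helpers written for this illustration and are not LLVM APIs; the modelled semantics are simplified assumptions (S_AND_B32 sets SCC when its result is non-zero, S_CMP_EQ_U32 sets SCC when its operands are equal, S_BITCMP1_B32 sets SCC when the selected bit is 1).
```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical models of the scalar instructions: only the
// resulting SCC bit is modelled, not the destination register.

// S_AND_B32: SCC = ((s0 & s1) != 0)
constexpr bool scc_after_s_and_b32(uint32_t s0, uint32_t s1) {
  return (s0 & s1) != 0;
}

// S_CMP_EQ_U32: SCC = (s0 == s1)
constexpr bool scc_after_s_cmp_eq_u32(uint32_t s0, uint32_t s1) {
  return s0 == s1;
}

// S_BITCMP1_B32: SCC = (bit s1[4:0] of s0 is 1)
constexpr bool scc_after_s_bitcmp1_b32(uint32_t s0, uint32_t s1) {
  return ((s0 >> (s1 & 31u)) & 1u) == 1u;
}

// Pattern from this patch: S_CMP_EQ_U32 (S_AND_B32 1, a), 1
constexpr bool trunc_via_and_cmp(uint32_t a) {
  return scc_after_s_cmp_eq_u32(1u & a, 1u);
}

// Compare removed: rely on the SCC produced by S_AND_B32 alone.
constexpr bool trunc_via_and_only(uint32_t a) {
  return scc_after_s_and_b32(1u, a);
}

// Possible later fold: s_bitcmp1_b32 a, 0
constexpr bool trunc_via_bitcmp1(uint32_t a) {
  return scc_after_s_bitcmp1_b32(a, 0u);
}

// True when all three forms yield the same SCC for the given input.
constexpr bool all_forms_agree(uint32_t a) {
  return trunc_via_and_cmp(a) == trunc_via_and_only(a) &&
         trunc_via_and_only(a) == trunc_via_bitcmp1(a);
}

// Compile-time spot checks on a few inputs.
static_assert(all_forms_agree(0u), "forms differ for 0");
static_assert(all_forms_agree(1u), "forms differ for 1");
static_assert(all_forms_agree(2u), "forms differ for 2");
static_assert(all_forms_agree(0xFFFFFFFFu), "forms differ for all-ones");
```
The AND-only form matches here only because the mask is the single bit 1, so a non-zero result implies the result equals the mask. With a multi-bit mask the S_AND_B32 SCC would no longer be equivalent to the equality compare.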
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D108925/new/
https://reviews.llvm.org/D108925