[PATCH] D108925: [AMDGPU] enable scalar compare in truncate selection

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 31 16:28:26 PDT 2021


rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2112
+  (i1 (UniformUnaryFrag<trunc> i32:$a)),
+  (S_CMP_EQ_U32 (S_AND_B32 (i32 1), $a), (i32 1))
+>;
----------------
rampitec wrote:
> rampitec wrote:
> > alex-t wrote:
> > > rampitec wrote:
> > > > I think we can later fold it into
> > > > ```
> > > > s_bitcmp1_b32 $a, 0
> > > > ```
> > > For now, I would prefer to leave a TODO here. Any objections?
> > No objections, I also do not think we have to select it this way, but rather combine later.
> Interestingly, S_AND_B32 by itself sets SCC = 1 on a non-zero result! Once you land this, we may experiment with removing S_CMP from this pattern completely, although I am not sure how well the rest of the backend is prepared for the lack of a compare, or what the pattern would look like after propagating SCC from moveToVALU().
D109031 will remove the compare, and it can be extended to use s_bitcmp. The latter will be generally beneficial for any scalar code that checks a bitfield, not just this pattern.
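
To illustrate why the three forms discussed in this thread should agree for a truncate to i1, here is a minimal Python model. It is only a sketch: the function names are hypothetical, and the SCC behavior is an assumption based on my reading of the ISA (S_AND_B32 sets SCC when the result is non-zero, S_CMP_EQ_U32 sets SCC on equality, S_BITCMP1_B32 sets SCC when the selected bit is 1). It is not LLVM code and does not show what D109031 actually emits.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit scalar registers


def s_and_b32(s0, s1):
    """Assumed model of S_AND_B32: returns (result, scc), scc = result != 0."""
    result = (s0 & s1) & MASK32
    return result, int(result != 0)


def s_cmp_eq_u32(s0, s1):
    """Assumed model of S_CMP_EQ_U32: returns scc = (s0 == s1)."""
    return int((s0 & MASK32) == (s1 & MASK32))


def s_bitcmp1_b32(s0, s1):
    """Assumed model of S_BITCMP1_B32: returns scc = (bit s1[4:0] of s0 == 1)."""
    bit = ((s0 & MASK32) >> (s1 & 31)) & 1
    return int(bit == 1)


def trunc_and_cmp(a):
    """Pattern in this patch: S_CMP_EQ_U32 (S_AND_B32 1, a), 1."""
    masked, _ = s_and_b32(1, a)
    return s_cmp_eq_u32(masked, 1)


def trunc_and_only(a):
    """Compare dropped: use the SCC produced by S_AND_B32 itself."""
    _, scc = s_and_b32(1, a)
    return scc


def trunc_bitcmp(a):
    """Possible later fold: s_bitcmp1_b32 a, 0."""
    return s_bitcmp1_b32(a, 0)
```

Under these assumptions all three helpers return the low bit of `a`, which is why dropping the compare or folding to a single bit test looks plausible for this pattern.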


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108925/new/

https://reviews.llvm.org/D108925


