[llvm] [AMDGPU] - Add constant folding for s_bitreplicate (PR #72366)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 15 02:12:48 PST 2023


jayfoad wrote:

> > > However, I don't think I can do the same for the `s_wqm` and `s_quadmask` intrinsics, because they implicitly set SCC?
> > 
> > 
> > No, that's not true. The intrinsics do not set SCC. It does not make sense for an intrinsic to be defined as changing the value of some physical register, because it would be impossible to make use of that behaviour.
> > The definition of the intrinsics is that they just do bit twiddling on their input value to produce a result. They can be constant-folded.
> 
> The intrinsics do not set SCC, but the instructions `s_wqm` and `s_quadmask` do, right? And the intrinsics themselves only lower to these instructions. It's true that the bit operations themselves can be constant folded, but by removing the instructions from the equation we lose the setting of SCC. At least, that's what I thought? If that is not an issue, implementing constant folding for the intrinsics is of course doable.

Yes, the instructions set SCC, but that is not part of the behaviour that is required to implement the intrinsic. It is just an inconvenience that codegen passes have to cope with.

It would be correct to implement each intrinsic in any of several ways:
- you could constant fold it (if the argument is a constant)
- you could expand it to a long sequence of VALU instructions (which would not touch SCC)
- you could emit the s_wqm instruction (which would set SCC)
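
To make the constant-folding option concrete, here is a minimal sketch of the value computation for the 32-bit cases. It assumes the per-quad semantics described in the ISA documentation (s_wqm sets all four bits of any quad that has at least one bit set; s_quadmask produces one result bit per quad). The names `foldWQM` and `foldQuadMask` are hypothetical helpers for illustration, not existing LLVM functions, and this is not the code from the PR.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs). They model only the value result of
// the intrinsics; SCC is deliberately not modelled, because it is not part
// of what the intrinsic is defined to produce.

// Whole quad mode: if any bit in a group of four is set, set all four.
constexpr uint32_t foldWQM(uint32_t Src) {
  uint32_t Result = 0;
  for (unsigned Quad = 0; Quad < 8; ++Quad) {
    uint32_t Mask = 0xFu << (Quad * 4);
    if (Src & Mask)
      Result |= Mask;
  }
  return Result;
}

// Quad mask: bit I of the result is the OR of source bits [4*I+3 : 4*I].
constexpr uint32_t foldQuadMask(uint32_t Src) {
  uint32_t Result = 0;
  for (unsigned Quad = 0; Quad < 8; ++Quad)
    if (Src & (0xFu << (Quad * 4)))
      Result |= 1u << Quad;
  return Result;
}
```

Both functions are pure: the result depends only on the argument, which is why folding a constant argument is a valid implementation regardless of what the hardware instruction does to SCC.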

Maybe the confusion here is that we never wrote down an exact specification for what the intrinsics do? There is an implied specification of "do whatever the corresponding instruction does", but that does not (should not) include side effects like setting SCC.

https://github.com/llvm/llvm-project/pull/72366

