[llvm] [AMDGPU] - Add constant folding for s_bitreplicate (PR #72366)

Jessica Del via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 15 02:21:23 PST 2023


OutOfCache wrote:

> Yes the instructions set SCC but that is not part of the behaviour that is required to implement the intrinsic. It is just an inconvenience that codegen passes have to cope with.
> 
> It would be correct to implement the intrinsics in various different ways:
> 
> * you could constant fold it (if the argument is a constant)
> * you could expand it to a long sequence of VALU instructions (which would not touch SCC)
> * you could emit the s_wqm instruction (which would set SCC)
> 
> Maybe the confusion here is that we never wrote down an exact specification for what the intrinsics do? There is an implied specification of "do whatever the corresponding instruction does", but that does not (should not) include side effects like setting SCC.

Oh, I see. So the behavior of the intrinsic is independent of the instruction's side effects, even though the two are linked? Is this possible because SCC is not modeled at the IR level, and the compiler only has to deal with SCC during lowering?
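
To check my understanding of the constant-folding option from the list above, here is a rough sketch of what the folds compute on a constant argument. The helper names `foldWqm32` and `foldBitreplicate` are made up for illustration and are not the functions in the PR. The semantics assumed are my reading of the ISA: whole quad mode sets all four bits of a quad if any bit in it is set, and bitreplicate duplicates each input bit into two adjacent output bits. Note that neither function produces anything like SCC; the result is just a value.

```cpp
#include <cstdint>

// Hypothetical helper: value computed by whole quad mode on a 32-bit mask.
// Assumption: each group of 4 bits is a quad; if any bit in the quad is
// set, all 4 bits of that quad are set in the result.
uint32_t foldWqm32(uint32_t mask) {
  uint32_t result = 0;
  for (unsigned quad = 0; quad < 8; ++quad) {
    uint32_t quadBits = 0xFu << (quad * 4);
    if (mask & quadBits)
      result |= quadBits;
  }
  return result;
}

// Hypothetical helper: value computed by bitreplicate on a 32-bit input.
// Assumption: input bit i is copied to output bits 2*i and 2*i + 1.
uint64_t foldBitreplicate(uint32_t value) {
  uint64_t result = 0;
  for (unsigned bit = 0; bit < 32; ++bit) {
    if (value & (1u << bit))
      result |= 3ull << (bit * 2);
  }
  return result;
}
```

If that is right, folding is just a pure function of the operand, which would explain why SCC does not have to appear anywhere in the intrinsic's definition.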

https://github.com/llvm/llvm-project/pull/72366

