[llvm] [RISCV] Custom promote s32 G_UDIV/UREM/SDIV on RV64. Promote SREM using G_SEXT. (PR #115402)

Craig Topper via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 8 13:00:40 PST 2024


topperc wrote:

> > as we can detect it with (srem (sexti32), (sexti32))
> 
> as a result of `clampScalar(0, sXLen, sDoubleXLen)`?

That's how the type is widened, but the comment was about the math.

The important thing is that the behavior of (srem (sexti32 X), (sexti32 Y)) is exactly equivalent to REMW for all values of X and Y. This follows from the comment in computeNumSignBits for srem:

```
    // The sign bit is the LHS's sign bit, except when the result of the         
    // remainder is zero. The magnitude of the result should be less than or     
    // equal to the magnitude of the LHS. Therefore, the result should have      
    // at least as many sign bits as the left hand side.
```

The result of (sexti32 X) has at least 33 sign bits so the result of (srem (sexti32 X), (sexti32 Y)) has at least 33 sign bits. REMW produces a value with at least 33 sign bits.
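To make the srem claim concrete, here is a minimal C++ sketch that models both forms and compares them on a few boundary values. The helper names (`sext32`, `modelRem`, `modelRemw`, `sremMatchesRemwOnSamples`, `checkSremSamples`) are made up for illustration and are not LLVM APIs. The divide-by-zero and overflow results are my reading of the RISC-V M extension, so treat them as assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an i32 to 64 bits (the "sexti32" in the text).
constexpr int64_t sext32(int32_t v) { return static_cast<int64_t>(v); }

// Assumed model of RV64 REM: the remainder of a division by zero is the
// dividend, and the overflow case (most negative value % -1) yields 0.
constexpr int64_t modelRem(int64_t a, int64_t b) {
  if (b == 0) return a;
  if (a == INT64_MIN && b == -1) return 0;
  return a % b;
}

// Assumed model of REMW: same rules applied to the low 32 bits, with the
// 32-bit result sign-extended to 64 bits.
constexpr int64_t modelRemw(int32_t x, int32_t y) {
  if (y == 0) return sext32(x);
  if (x == INT32_MIN && y == -1) return 0;
  return sext32(x % y);
}

// Compare (srem (sexti32 X), (sexti32 Y)) against REMW on boundary values.
// This is a spot check, not a proof over all inputs.
inline bool sremMatchesRemwOnSamples() {
  const int32_t samples[] = {INT32_MIN, INT32_MIN + 1, -7, -1, 0,
                             1,         2,             7,  INT32_MAX};
  for (int32_t x : samples)
    for (int32_t y : samples)
      if (modelRem(sext32(x), sext32(y)) != modelRemw(x, y))
        return false;
  return true;
}

// Aborts (in builds with assertions enabled) if any sampled pair disagrees.
inline void checkSremSamples() { assert(sremMatchesRemwOnSamples()); }
```

Note that the i32 overflow case INT32_MIN % -1 is not an overflow once the operands are sign extended to 64 bits. The 64-bit remainder is simply 0, which matches what the model assumes REMW returns.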

This doesn't work for (sdiv (sexti32 X), (sexti32 Y)): 0xffffffff80000000 / 0xffffffffffffffff is 0x0000000080000000, but DIVW produces 0xffffffff80000000 for that case. We use a custom opcode to remember that we started with i32, where 0x80000000 / 0xffffffff is immediate UB, so that case doesn't matter.
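The sdiv counterexample can be checked with a similar sketch. Again, `modelDiv`, `modelDivw`, and `checkSdivCounterexample` are hypothetical names. The special-case results (quotient of -1 for divide by zero, dividend returned on overflow) are my understanding of the RISC-V M extension rather than anything taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of RV64 DIV: divide by zero yields -1, and the overflow
// case (most negative value / -1) yields the dividend.
constexpr int64_t modelDiv(int64_t a, int64_t b) {
  if (b == 0) return -1;
  if (a == INT64_MIN && b == -1) return a;
  return a / b;
}

// Assumed model of DIVW: same rules on the low 32 bits, with the 32-bit
// quotient sign-extended to 64 bits.
constexpr int64_t modelDivw(int32_t x, int32_t y) {
  if (y == 0) return -1;
  if (x == INT32_MIN && y == -1) return static_cast<int64_t>(x);
  return static_cast<int64_t>(x / y);
}

// 64-bit division of the sign-extended operands: -2^31 / -1 = +2^31.
static_assert(modelDiv(INT32_MIN, -1) == 0x0000000080000000LL,
              "sdiv of sign-extended operands gives +2^31");

// DIVW on the same i32 inputs returns the sign-extended dividend.
static_assert(static_cast<uint64_t>(modelDivw(INT32_MIN, -1)) ==
                  0xffffffff80000000ULL,
              "DIVW keeps the result sign-extended");

// Runtime form of the same observation: the two results differ.
inline void checkSdivCounterexample() {
  assert(modelDiv(INT32_MIN, -1) != modelDivw(INT32_MIN, -1));
}
```

The two results agree in their low 32 bits and differ only in the upper 32, which is why the mismatch is harmless for inputs that started out as i32.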

(urem (zexti32 X), (zexti32 Y)) is not equivalent to REMUW because 0x0000000080000000 % 0x00000000ffffffff is 0x0000000080000000, but REMUW will produce 0xffffffff80000000. Since we started with an i32 type, the upper bits don't matter at the time of promotion, so we use a custom opcode to remember that. After promotion, later optimizations may become dependent on the G_REMUW producing a sign-extended value, though I haven't added that to computeNumSignBits yet for GISel.

(udiv (zexti32 X), (zexti32 Y)) is not equivalent to DIVUW because 0x0000000080000000 / 0x0000000000000001 is 0x0000000080000000, but DIVUW will return 0xffffffff80000000. Again, at promotion time the upper bits don't matter, so we use a custom opcode.
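The two unsigned cases can be sketched the same way. The names below (`signExtend32`, `modelRemu`, `modelDivu`, `modelRemuw`, `modelDivuw`, `checkUnsignedCounterexamples`) are illustrative only. The divide-by-zero results are assumed from the RISC-V M extension, and the key assumption is that the W forms sign-extend their 32-bit result even though the operation itself is unsigned.

```cpp
#include <cassert>
#include <cstdint>

// Copy bit 31 of a 32-bit value into bits 63..32 of a 64-bit register image.
constexpr uint64_t signExtend32(uint32_t v) {
  return (v & 0x80000000u) ? (0xffffffff00000000ULL | v) : v;
}

// Assumed models of the 64-bit unsigned ops (zero-extended operands are
// passed in directly as uint64_t values).
constexpr uint64_t modelRemu(uint64_t a, uint64_t b) {
  return b == 0 ? a : a % b;
}
constexpr uint64_t modelDivu(uint64_t a, uint64_t b) {
  return b == 0 ? ~0ULL : a / b;
}

// Assumed models of the W forms: unsigned 32-bit op, sign-extended result.
constexpr uint64_t modelRemuw(uint32_t x, uint32_t y) {
  return signExtend32(y == 0 ? x : x % y);
}
constexpr uint64_t modelDivuw(uint32_t x, uint32_t y) {
  return signExtend32(y == 0 ? 0xffffffffu : x / y);
}

// urem: the 64-bit op leaves the upper bits zero, REMUW sets them.
static_assert(modelRemu(0x80000000u, 0xffffffffu) == 0x0000000080000000ULL,
              "64-bit urem of zero-extended operands");
static_assert(modelRemuw(0x80000000u, 0xffffffffu) == 0xffffffff80000000ULL,
              "REMUW sign-extends its result");

// udiv: same pattern with a divisor of 1.
static_assert(modelDivu(0x80000000u, 1) == 0x0000000080000000ULL,
              "64-bit udiv of zero-extended operands");
static_assert(modelDivuw(0x80000000u, 1) == 0xffffffff80000000ULL,
              "DIVUW sign-extends its result");

// Runtime form: the low 32 bits still agree in both cases.
inline void checkUnsignedCounterexamples() {
  assert(static_cast<uint32_t>(modelRemu(0x80000000u, 0xffffffffu)) ==
         static_cast<uint32_t>(modelRemuw(0x80000000u, 0xffffffffu)));
  assert(static_cast<uint32_t>(modelDivu(0x80000000u, 1)) ==
         static_cast<uint32_t>(modelDivuw(0x80000000u, 1)));
}
```

Any result with bit 31 set shows the difference, so these are not isolated corner cases in the way the signed overflow is.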

https://github.com/llvm/llvm-project/pull/115402

