[PATCH] D59506: [ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()

Tue Mar 19 13:04:19 PDT 2019

nikic marked an inline comment as done.
nikic added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/smed3.ll:48
+; GCN: v_mov_b32_e32 v{{[0-9]+}}, 12
 define amdgpu_kernel void @v_test_smed3_r_i_i_constant_order_i32(i32 addrspace(1)* %out, i32 addrspace(1)* %aptr) #1 {
   %tid = call i32 @llvm.amdgcn.workitem.id.x()
----------------
lebedev.ri wrote:
> nikic wrote:
> > arsenm wrote:
> > > The test should be fixed so this is still testing what it intends to
> > From what I understand, the test is here to ensure that a "clamp" pattern with swapped constants is not incorrectly compiled to a `med3` instruction. After this change this kind of pattern is always constant folded by InstSimplify. From the debug log, in this case the simplification is caused by an EarlyCSE pass added by the AMDGPU target. I haven't found a way to disable it.
> > 
> > Do you have any ideas/pointers on how this pattern can be preserved all the way down to isel?
> The *test* needs to be modified, supposedly.
> I.e. just write another test that would still produce the "original output".
> 
Yes, I'm just not sure how to do that. What we need is to get something like

        t36: i32 = umax t44, Constant:i32<17>
      t40: i32 = umin t36, Constant:i32<12>

during isel without anything optimizing it away before that. For the unsigned case I can use something like

```
define amdgpu_kernel void @v_test_umed3_r_i_i_constant_order_i32(i32 addrspace(1)* %out, i32 addrspace(1)* %aptr) #1 {
  %tid = call i32 @llvm.amdgcn.workitem.id.x()
  %gep0 = getelementptr i32, i32 addrspace(1)* %aptr, i32 %tid
  %outgep = getelementptr i32, i32 addrspace(1)* %out, i32 %tid
  %a = load i32, i32 addrspace(1)* %gep0

  %icmp0 = icmp ugt i32 %a, 17
  %i0 = select i1 %icmp0, i32 %a, i32 17

  %sat = call i32 @llvm.uadd.sat(i32 %i0, i32 -13)

  store i32 %sat, i32 addrspace(1)* %outgep
  ret void
}
declare i32 @llvm.uadd.sat(i32, i32)
```

because uaddsat x, -13 will be expanded using a umax x, 12 as first instruction on AMDGPU. I don't have any ideas for the signed case though. And relying on expansion details seems like a bad idea anyway.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59506/new/

https://reviews.llvm.org/D59506