[PATCH] D70367: Fix for AMDGPU MUL_I24 known bits calculation

Sun Nov 17 23:26:38 PST 2019

ekuznetsov139 added a comment.

As for an example, compile the following file:
F10774762: test.ll <https://reviews.llvm.org/F10774762>

  llc -mtriple amdgcn-amd-amdhsa -mcpu=gfx900 -mattr=-code-object-v3 -O2 -amdgpu-function-calls=0 -filetype=asm -o test.isa test.ll

The IR code:

  %5 = tail call i32 @llvm.amdgcn.workitem.id.x() #28, !range !4
  %tid_y = lshr i32 %5, 4
  %tid_x = and i32 %5, 15

  %y_div_5 = sdiv i32 %tid_y, 5
  %6 = mul nsw i32 %y_div_5, -5
  %y_mod_5 = add nsw i32 %6, %tid_y
  %v1 = add nsw i32 %tid_x, %y_mod_5
  %7 = icmp sgt i32 %v1, -2
  %spec.select.i51 = select i1 %7, i32 %v1, i32 -2

The corresponding ISA:

  v_and_b32_e32 v15, 15, v0
  v_lshrrev_b32_e32 v14, 4, v0
  s_load_dwordx2 s[0:1], s[4:5], 0x0
  v_add3_u32 v0, v14, v15, -8
  v_max_i32_e32 v0, -2, v0
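
To make the discrepancy easier to see, here is a rough Python model of both snippets. It is only a sketch under a few assumptions: the function names are made up for illustration, the work-item id range 0..255 is a guess (the contents of !range !4 are not shown above), and the ISA excerpt is assumed to contain every instruction that feeds v0. Under those assumptions, the ISA adds a constant -8 where the IR computes -5 * (tid_y / 5).

```python
def to_i32(x):
    """Wrap an integer to a signed 32-bit value, like i32 arithmetic."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def ir_result(tid):
    """Model of the IR snippet (hypothetical helper name)."""
    tid_y = tid >> 4                       # lshr i32 %5, 4
    tid_x = tid & 15                       # and i32 %5, 15
    y_div_5 = tid_y // 5                   # sdiv; tid_y is non-negative here
    mul = to_i32(y_div_5 * -5)             # mul nsw i32 %y_div_5, -5
    y_mod_5 = to_i32(mul + tid_y)          # add nsw i32 %6, %tid_y
    v1 = to_i32(tid_x + y_mod_5)           # add nsw i32 %tid_x, %y_mod_5
    return max(v1, -2)                     # icmp sgt + select


def isa_result(tid):
    """Model of the ISA excerpt (hypothetical helper name)."""
    v15 = tid & 15                         # v_and_b32_e32 v15, 15, v0
    v14 = tid >> 4                         # v_lshrrev_b32_e32 v14, 4, v0
    v0 = to_i32(v14 + v15 - 8)             # v_add3_u32 v0, v14, v15, -8
    return max(-2, v0)                     # v_max_i32_e32 v0, -2, v0


# Assumed work-item id range; the real bound comes from !range !4.
mismatches = [t for t in range(256) if ir_result(t) != isa_result(t)]
print(len(mismatches), "of 256 inputs differ")
```

For instance, with a work-item id of 0x53 (tid_y = 5, tid_x = 3) the IR model gives 3 and the ISA model gives 0.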

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70367