[llvm] [WIP][AMDGPU] Split `isInlinableLiteral16` into three and call the specific version if possible (PR #81345)

Tue Feb 13 14:23:16 PST 2024

================
@@ -99,39 +99,39 @@ define i32 @inline_A_constraint_H1() {
 ; VI-LABEL: {{^}}inline_A_constraint_H2:
 ; VI: v_mov_b32 {{v[0-9]+}}, 0x3c00
 define i32 @inline_A_constraint_H2() {
-  %v0 = tail call i32 asm "v_mov_b32 $0, $1", "=v,A"(i16 bitcast (half 1.0 to i16))
+  %v0 = tail call i32 asm "v_mov_b32 $0, $1", "=v,A"(trunc i32 bitcast (float 1.0 to i32) to i16)
----------------
shiltian wrote:

TBH I don't think this test case is valid. The instruction accepts a `b32` operand, but we pass an `i16` that was bitcast from `fp16`. We "expect" the backend to treat it as an inline literal, but by then the value has already been cast to `i16`, so the backend only sees `0x3C00`. How is it supposed to know that this means `1.0`? The spec says `1.0` must be treated as an inline literal, but it doesn't state in which encoding. My understanding is that it should depend on the type the operand expects: if the operand expects `bf16`, then `0x3F80` is `1.0`; if it expects `fp16`, then `0x3C00` is.
The problem here is that if we don't treat it as an inline literal, the instruction will not be emitted.
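
To make the ambiguity concrete, here is a small standalone sketch (plain Python, not LLVM code; the helper names are made up for illustration) that computes the 16-bit pattern of `1.0` under each encoding and decodes each pattern under the other one. It assumes `bf16` is the upper 16 bits of an IEEE binary32 value and ignores rounding, which is exact for `1.0`.

```python
import struct

def fp16_bits(x: float) -> int:
    """Bit pattern of x encoded as IEEE 754 binary16 (fp16)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def bf16_bits(x: float) -> int:
    """Bit pattern of x as bfloat16: the top 16 bits of its binary32
    encoding (truncation, no rounding; an assumption of this sketch)."""
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

def fp16_value(bits: int) -> float:
    """Decode a 16-bit pattern as fp16."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def bf16_value(bits: int) -> float:
    """Decode a 16-bit pattern as bfloat16 by widening it to binary32."""
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]

if __name__ == "__main__":
    for name, bits in (("fp16 1.0", fp16_bits(1.0)), ("bf16 1.0", bf16_bits(1.0))):
        print(f"{name}: {bits:#06x} "
              f"(as fp16: {fp16_value(bits)}, as bf16: {bf16_value(bits)})")
```

The same 16 bits decode to different values depending on the operand type, so an untyped `i16` holding `0x3C00` cannot by itself tell the backend that `1.0` was meant.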

https://github.com/llvm/llvm-project/pull/81345