[PATCH] D137066: [AMDGPU] Add amdgcn_s_buffer_load_imm intrinsic

Thu Nov 3 00:58:44 PDT 2022

piotr added a comment.

In D137066#3903754 <https://reviews.llvm.org/D137066#3903754>, @arsenm wrote:

> In D137066#3902128 <https://reviews.llvm.org/D137066#3902128>, @piotr wrote:
>
>> In D137066#3902068 <https://reviews.llvm.org/D137066#3902068>, @nhaehnle wrote:
>>
>>> I think the question is really: what IR examples are there that could use scalar loads with immediate offsets but don't because instruction selection fails to extract the constant; and why does extracting the constant fail?
>>
>> All cases of isel not being able to extract the constant I looked at were due to the nodes being scattered over different basic blocks.
>
> This is the kind of case that CodeGenPrepare works around for addressing mode matching. I'd rather add that sort of optimization rather than changing the IR by adding new intrinsics to workaround this

I did consider extending CodeGenPrepare instead. The advantage of using the intrinsic is that we can run the code through the optimizer (CodeGenPrepare is run very late), which yields even better code.

While the GlobalISel may not need the new intrinsic, one could argue that adding the intrinsic is not a workaround and it should be implemented for gfx9 already alongside other changes. The new SMEM format features (including the separate immediate offset field) were originally omitted - see https://github.com/llvm/llvm-project/issues/38652. It was only about a couple of months ago, when @kosarev fixed this by adding support for the new format in a series of commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137066/new/

https://reviews.llvm.org/D137066