[PATCH] D94648: [amdgpu] Implement lower function LDS pass

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 19 18:48:11 PDT 2021


hsmhsm added a comment.

In D94648#2770088 <https://reviews.llvm.org/D94648#2770088>, @hsmhsm wrote:

> In D94648#2769780 <https://reviews.llvm.org/D94648#2769780>, @JonChesterfield wrote:
>
>> In D94648#2758386 <https://reviews.llvm.org/D94648#2758386>, @arsenm wrote:
>>
>>> In D94648#2757793 <https://reviews.llvm.org/D94648#2757793>, @JonChesterfield wrote:
>>>
>>>> 
>>>
>>>
>>>
>>>> edit: lowerFormalArguments is not called if there are no formal arguments to the kernel. Test case I started from does pass arguments to the kernel, but they were unused and eliminated.
>>>
>>> Not seeing how it isn't called with no arguments? It should still be called anyway
>>
>> I expected that too, but debug statements around allocateLDSGlobal didn't fire, and looking at the control flow around the sdag and globalisel paths lowerFormalArguments is called from within a loop `for (const Argument &Arg : F.args()) {}`. I may be missing something of course (didn't put a lot of time into chasing this), but it definitely looks like lowerFormalArguments doesn't get called when there are no arguments.
>
> Hi Jon,
>
> For the 0 argument kernel, did you put debug statement within SITargetLowering::LowerFormalArguments() and test whether it will hit or not?  My experimentation shows that it is indeed hit for 0 arg kernels too. So it is not problem with 0 arg kernel.
>
> Problem is within the function - AMDGPUMachineFunction::allocateModuleLDSGlobal() that you wrote.
>
> The statement
>
>   GlobalVariable *GV = M->getGlobalVariable("llvm.amdgcn.module.lds");
>
> is suppose to be replaced by
>
>   GlobalVariable *GV = M->getGlobalVariable("llvm.amdgcn.module.lds", true);
>
> My understanding is - For Module to search for a internal global variable successfully within the symbol table, we need to explicitly tell it to do so by passing "true" as an additional arg to getGlobalVariable(). Otherwise, the internal symbol won't be find.

Further, perhaps, you also need to add one lit test like below, call llc and make sure that @llvm.amdgcn.module.lds is allocated at address 0.

  @llvm.amdgcn.module.lds = internal unnamed_addr addrspace(3) global [16 x i8] undef, align 16
  
  define amdgpu_kernel void @kern() {
    %llvm.amdgcn.module.lds.bc = bitcast [16 x i8] addrspace(3)* @llvm.amdgcn.module.lds to i8 addrspace(3)*
     store i8 6, i8 addrspace(3)* %llvm.amdgcn.module.lds.bc, align 16
  
    ret void
  }

CMD:  llc -march=amdgcn -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 --amdhsa-code-object-version=4 lds-allocation2.ll -o tmp.s

assembly generated:

  v_mov_b32_e32 v0, 0
  v_mov_b32_e32 v1, 6
  ds_write_b8 v0, v1
  s_endpgm

We should probably CHECK for the instruction pattern - ```v_mov_b32_e32 v0, 0```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94648/new/

https://reviews.llvm.org/D94648



More information about the llvm-commits mailing list