[PATCH] D15873: AMDGPU: Implement readcyclecounter

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 5 06:53:48 PST 2016


arsenm added a comment.

In http://reviews.llvm.org/D15873#319491, @arsenm wrote:

> In http://reviews.llvm.org/D15873#319488, @nhaehnle wrote:
>
> > NAK on that. I agree that it feels odd, but...
> >
> > The documentation explicitly mentions that "s_memtime counts the same as an s_load_dwordx2" (in the publicly released AMD_GCN3_Instruction_Set_Architecture.pdf, page 39 of the PDF in section 4.4, where LGKM_CNT is discussed).
>
>
> OK, I didn't find this in the SI/CI manual (which seems to use the old names: "smrd_fetch_time counts the same as an smrd_fetch_2.")


Another question is where the waitcnt should be inserted, and whether one should be inserted immediately after the s_memtime to make the reading more precise. Right now, because of the attempt to reduce waits, the s_memtime ends up as the first instruction and shares a waitcnt with the kernel argument load in the testcase. This probably isn't what you want.
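
To make the placement question concrete, here is a toy model of the two options. It is only a sketch under stated assumptions, not the waitcnt insertion pass or real hardware behaviour. It assumes a scalar memory op adds 1 to the counter for a one-dword result and 2 for a wider one, with s_memtime counted like s_load_dwordx2 as quoted above. The helper name `outstanding_at_waits` and the one-dword kernel argument load are hypothetical.

```python
# Toy model of scalar-memory wait counting; NOT the LLVM pass or real hardware.
# Assumption: a one-dword result adds 1 to the counter, a wider result adds 2,
# and s_memtime is counted like s_load_dwordx2.
COST = {"s_load_dword": 1, "s_load_dwordx2": 2, "s_memtime": 2}


def outstanding_at_waits(program):
    """Return the counter value each s_waitcnt sees before draining it to 0."""
    pending = 0
    seen = []
    for op in program:
        if op == "s_waitcnt":
            seen.append(pending)
            pending = 0  # modelled as a full wait, i.e. lgkmcnt(0)
        else:
            pending += COST[op]
    return seen


# Current behaviour described above: one wait shared with the kernarg load
# (the load is shown as a hypothetical single-dword load).
shared = ["s_memtime", "s_load_dword", "s_waitcnt"]

# Alternative: a dedicated wait immediately after s_memtime.
dedicated = ["s_memtime", "s_waitcnt", "s_load_dword", "s_waitcnt"]

print(outstanding_at_waits(shared))
print(outstanding_at_waits(dedicated))
```

In the shared case the single wait also covers the argument load, so in this model the timestamp is not known to be available until that load has returned as well. In the dedicated case the first wait covers only the s_memtime.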


http://reviews.llvm.org/D15873




