[PATCH] D15873: AMDGPU: Implement readcyclecounter

Nicolai Hähnle via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 5 07:42:34 PST 2016


nhaehnle added a comment.

Ah, now I understand. The load after the s_memtime logically belongs to the store that follows it, so there is no reordering going on. Combining the waits is exactly what we want: when s_memtime is executed, the shader hardware immediately sends the corresponding request to some other hardware block (LDS? I'm not sure). The returned cycle count will be the correct one, no matter how late the corresponding s_waitcnt is scheduled.
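
To illustrate the point about late waits, here is a toy Python model, not real AMDGPU behaviour. The class ToyShader, its methods, and the run helper are all hypothetical names invented for this sketch. It assumes the counter value is captured at the moment the request is issued, and that a single wait drains every outstanding request.

```python
class ToyShader:
    """Toy model: a cycle counter plus a list of outstanding requests."""

    def __init__(self):
        self.cycle = 0
        self.pending = []  # (name, value) pairs issued but not yet waited on

    def tick(self, n=1):
        # Advance the simulated clock by n cycles.
        self.cycle += n

    def s_memtime(self):
        # Assumption: the request leaves immediately, so the value is
        # fixed at issue time, whatever happens afterwards.
        self.pending.append(("memtime", self.cycle))
        self.tick()

    def load(self, value):
        # A stand-in for an unrelated load that is also outstanding.
        self.pending.append(("load", value))
        self.tick()

    def s_waitcnt(self):
        # One combined wait returns the results of all outstanding requests.
        results, self.pending = dict(self.pending), []
        self.tick()
        return results


def run(delay_before_wait):
    gpu = ToyShader()
    gpu.tick(10)                 # some earlier work
    gpu.s_memtime()              # timestamp requested at cycle 10
    gpu.load(42)                 # the load that belongs to the later store
    gpu.tick(delay_before_wait)  # the scheduler moves the wait later
    return gpu.s_waitcnt()       # a single wait covers both requests


early = run(0)
late = run(100)
print(early["memtime"], late["memtime"])
```

In this model the timestamp is the same whether the wait comes right away or a hundred cycles later, and one wait is enough for both the timestamp and the load.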


http://reviews.llvm.org/D15873




