[PATCH] D91516: [AMDGPU] Support for device scope shared variables
Mahesha S via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 16 20:34:48 PST 2020
hsmhsm added a comment.
In D91516#2397323 <https://reviews.llvm.org/D91516#2397323>, @arsenm wrote:
> This is a bit different than the most recent proposal which I thought avoided the need to pass multiple arguments per kernel and allowed supporting indirect calls. I thought this was going to produce a table in constant memory containing the offsets which would be indexed instead.
We had not reached any general consensus on which approach to adopt. The following proposals were suggested:
(1) Function argument driven approach:
(2) Table driven approach:
(2.1) Table within global memory
(2.2) Table within shared memory
(2.3) Table within constant memory
Since this is a rarely used feature, and as Sam suggested in one of his early emails while discussing it, I have chosen approach (1) because I felt it is the comparatively simpler one. We can try the others only if this approach proves impractical, whether because of performance issues or for any other valid reason.
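To make the difference between approaches (1) and (2) concrete, here is a rough host-side C++ sketch. It is only an illustration, not the code in this patch: the names `Lds`, `read_via_argument`, `read_via_table`, `kOffsetTable`, `kernel_a` and `kernel_b`, the offsets 3 and 7, and the idea of indexing the table by a kernel id are all assumptions made up for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model: each kernel owns a block of shared (LDS) memory, and the
// device-scope shared variable sits at a kernel-specific offset inside it.
using Lds = std::array<std::uint32_t, 16>;

// Approach (1): the offset is threaded through as an extra function argument,
// so every function on the call path needs the additional parameter.
std::uint32_t read_via_argument(const Lds &lds, std::size_t var_offset) {
  return lds[var_offset];
}

// Approach (2): the callee looks the offset up in a table, here indexed by an
// assumed kernel id, so the callee's signature carries no per-variable offset.
constexpr std::array<std::size_t, 2> kOffsetTable = {3, 7};

std::uint32_t read_via_table(const Lds &lds, std::size_t kernel_id) {
  return lds[kOffsetTable[kernel_id]];
}

// Two hypothetical kernels that place the variable at different offsets and
// pass their own offset down, as in approach (1).
std::uint32_t kernel_a(const Lds &lds) { return read_via_argument(lds, 3); }
std::uint32_t kernel_b(const Lds &lds) { return read_via_argument(lds, 7); }
```

In this sketch, approach (1) changes the signature of every function that reaches the variable, which is where indirect calls become a concern. Approach (2) leaves signatures alone at the cost of an extra table lookup.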
> Needs test with indirect calls and stored function addresses.
Indirect calls are something I completely missed. Thanks for pointing that out. Let me check whether the current implementation handles indirect calls too.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D91516/new/
https://reviews.llvm.org/D91516