[PATCH] D91516: [AMDGPU] Support for device scope shared variables

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 16 20:34:48 PST 2020


hsmhsm added a comment.

In D91516#2397323 <https://reviews.llvm.org/D91516#2397323>, @arsenm wrote:

> This is a bit different than the most recent proposal which I thought avoided the need to pass multiple arguments per kernel and allowed supporting indirect calls. I thought this was going to produce a table in constant memory containing the offsets which would be indexed instead.

We had not arrived at any general consensus on which approach to stick to.  The following proposals were suggested:

(1)  Function argument driven approach:
(2)  Table driven approach:

  (2.1)  Table within global memory
  (2.2)  Table within shared memory
  (2.3)  Table within constant memory

Since this is a rarely used feature, and as Sam suggested in one of his early emails while discussing it, I have chosen approach (1) because I felt it is the comparatively simpler one. The others would be tried only if this approach does not work in practice, whether because of performance issues or for any other valid reason.
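
To make the difference between approaches (1) and (2) concrete, here is a rough host-side C++ sketch. It is only an illustration and not what the patch implements: `SharedMem`, `kOffsetTable`, `store_int`, `read_via_argument` and `read_via_table` are all hypothetical names, and a plain byte array stands in for a kernel's shared memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for one kernel's shared memory allocation.
using SharedMem = std::array<unsigned char, 64>;

// Helper for the sketch: write an int at a byte offset in shared memory.
void store_int(SharedMem &lds, std::size_t offset, int value) {
  assert(offset + sizeof(value) <= lds.size());
  std::memcpy(lds.data() + offset, &value, sizeof(value));
}

// Approach (1): the caller passes the variable's offset as an extra
// function argument, so every function on the call path needs it.
int read_via_argument(const SharedMem &lds, std::size_t offset) {
  assert(offset + sizeof(int) <= lds.size());
  int value = 0;
  std::memcpy(&value, lds.data() + offset, sizeof(value));
  return value;
}

// Approach (2): per-kernel offsets live in a table (assumed here to be
// constant data) and the callee indexes it by a kernel id instead of
// receiving the offset as an argument.
constexpr std::array<std::size_t, 2> kOffsetTable = {0, 16};

int read_via_table(const SharedMem &lds, std::size_t kernel_id) {
  assert(kernel_id < kOffsetTable.size());
  return read_via_argument(lds, kOffsetTable[kernel_id]);
}
```

In this sketch the table variant keeps the callee's signature independent of how many shared variables exist, which is the property that matters for indirect calls; the argument variant is simpler but widens every signature on the call path.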

> Needs test with indirect calls and stored function addresses.

Indirect calls are something I completely missed. Thanks for pointing that out.  Let me check whether the current implementation handles indirect calls too.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91516/new/

https://reviews.llvm.org/D91516


