[Openmp-commits] [PATCH] D64217: [OpenMP][NFCI] Cleanup the target state queue implementation

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Jul 18 08:57:48 PDT 2019

jdoerfert added a comment.

> It is handled by the PTX-to-SASS compiler.

If they are uninitialized in LLVM-IR, it doesn't matter what happens at the PTX level and lower, LLVM could simply get rid of the accesses. 
Why do you fight so hard against **zero-cost** initialization that *might be* needed for correctness?

> Do you really need it?

Do I need/want the code and results of a correct and faster implementation? Yes, please share the results and the code so we can improve things. I don't know why you think I would not want improvements?

>> Ideally, we can try to get rid of the target state queue by reducing the amount of information and using only shared memory. [...]
>  I would try to use lazy initialization for the cases where we need to track some ICVs. I did not try it yet but thought about it for some time.

These are great ideas that you should pursue. Though this patch does actually not change anything (important) of the queue implementation and I
think it would be best to open a ticket or email thread on future development instead.


This discussion still seems to go nowhere. Let me summarize:

- @ABataev has performance results for a correct and improved version of this queue. I'm looking forward to the patches!
- @Hahnfeld and @ABataev discussed options to eliminate the need for a statue queue all together. I suggest we start a separate discussion on this as it is obviously more involved.
- I will add initialization to the uninitialized class members.
- I am not aware of any test case / system configuration under which this patch causes problems. If it does, we split it in two, one to do the refactoring, one to remove volatile.
- @ABataev mentioned removing volatile could cause problems. I'm hoping for details on the system (Cuda, LLVM, Clang, version numbers) and maybe a reproducer for which it does.

  rG LLVM Github Monorepo



More information about the Openmp-commits mailing list