[Openmp-commits] [PATCH] D132532: [libomptarget] Avoid deleting the same entry multiple times

Guilherme Valarini via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Aug 24 07:39:03 PDT 2022


gValarini added a comment.

In D132532#3745645 <https://reviews.llvm.org/D132532#3745645>, @ye-luo wrote:

> In D132532#3745640 <https://reviews.llvm.org/D132532#3745640>, @jdoerfert wrote:
>
>> I thought about this but IIRC, this doesn't work either. Now you have no way to prevent races between the copy back and copy to since the entry is gone while the copy back runs.
>
> I thought about the racing. If race to write the same data, there is no harm. If race to write different data, the source of race is user program which needs to protect the race anyway.

I believe what @jdoerfert is trying to say is that the race will happen between a thread deleting the device data and another making the data transfer. Image the following:

- Thread A: is not the "owner" (`IsOwned == false`) of the entry, but will do a data transfer.
- Thread B: is the "owner" (`IsOwned == true`) of the entry and will delete it.

If A is switched out of execution right after `getTgtPtrBegin` returns, B might remove the data before A starts any data transfer. This will lead to a segfault on the device side. Although it seems to be a problem on the user program, this should lead to data corruption and not a segfault. I also believe this may happen on correct programs, but I cannot think of a simple example right now.

I was planning to send a patch with this same fix this week. It was quite similar to this one, but it also had a new reference counter designed to be used only by threads wanting to remove an entry. This way, multiple threads might get the entry, but only the last one would delete the buffer and the entry data.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132532/new/

https://reviews.llvm.org/D132532



More information about the Openmp-commits mailing list