[PATCH] D102107: [OpenMP] Codegen aggregate for outlined function captures
Johannes Doerfert via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Jul 9 09:29:28 PDT 2021
jdoerfert added a comment.
In D102107#2832740 <https://reviews.llvm.org/D102107#2832740>, @ABataev wrote:
> In D102107#2832286 <https://reviews.llvm.org/D102107#2832286>, @jdoerfert wrote:
>
>> In D102107#2824581 <https://reviews.llvm.org/D102107#2824581>, @ABataev wrote:
>>
>>> In D102107#2823706 <https://reviews.llvm.org/D102107#2823706>, @jdoerfert wrote:
>>>
>>>> In D102107#2821976 <https://reviews.llvm.org/D102107#2821976>, @ABataev wrote:
>>>>
>>>>> We used this kind of codegen initially but later found out that it causes a large overhead when gathering pointers into a record. What about a hybrid scheme where the first args are passed as arguments and the others (if any) are gathered into a record?
>>>>
>>>> I'm confused; maybe I misunderstand the problem. The parallel function arguments need to go from the main thread to the workers somehow, and I don't see how this is done without a record. This patch makes it explicit though.
>>>
>>> Pass it in a record for workers only? And use a hybrid scheme for all other parallel regions.
>>
>> I still do not follow. What does it mean for workers only? What is a hybrid scheme? And, probably most importantly, how would we not eventually put everything into a record anyway?
>
> On the host you don't need to put everything into a record, especially for small parallel regions. Pass the first few args in registers and gather only the remaining args into the record. For workers, just pass all args in the record.
Could you please respond to my question so we can make progress here? We *always* have to pass things in a record, do you agree? If we eventually pack everything to pass it to the workers, why would we not pack it right away and avoid the complexity? Passing varargs, then packing them later (on the same thread) into a record to hand it to the workers, arguably introduces cost. What is the benefit of a hybrid approach given that it is (theoretically) more costly and arguably more complex?
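
To make the two schemes under discussion concrete, here is a minimal sketch in plain C++. It is not the code Clang emits and not the OpenMP runtime interface: `Captures`, `outlined_body`, `fake_fork`, and `fake_fork_hybrid` are hypothetical names, and the "workers" are simulated by a sequential loop. It only illustrates the point that a fork entry which hands work to other threads through a single pointer needs a record either way, so the hybrid variant ends up repacking on the calling thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record holding the captured variables of a parallel region.
struct Captures {
  const int *in;
  int *out;
  int n;
};

// Hypothetical outlined-function signature: everything arrives via one pointer.
using OutlinedFn = void (*)(int tid, int nthreads, void *record);

// Outlined region body: each "thread" doubles a strided slice of the input.
static void outlined_body(int tid, int nthreads, void *record) {
  auto *c = static_cast<Captures *>(record);
  for (int i = tid; i < c->n; i += nthreads)
    c->out[i] = 2 * c->in[i];
}

// Stand-in for a runtime fork call (assumption: workers only ever receive a
// single pointer). Workers are simulated sequentially to keep this runnable.
static void fake_fork(OutlinedFn fn, void *record, int nthreads) {
  assert(fn && record);
  for (int tid = 0; tid < nthreads; ++tid)
    fn(tid, nthreads, record);
}

// Aggregate scheme: the caller packs the captures once, up front.
static void parallel_aggregate(const int *in, int *out, int n) {
  Captures rec{in, out, n};
  fake_fork(outlined_body, &rec, 4);
}

// Hybrid scheme: captures travel as individual arguments first...
static void fake_fork_hybrid(OutlinedFn fn, int nthreads,
                             const int *in, int *out, int n) {
  // ...but they are still packed here, on the same calling thread, before
  // anything can be handed to the workers.
  Captures rec{in, out, n};
  fake_fork(fn, &rec, nthreads);
}

static void parallel_hybrid(const int *in, int *out, int n) {
  fake_fork_hybrid(outlined_body, 4, in, out, n);
}

// Both schemes compute the same result; they differ only in where packing happens.
static bool schemes_agree() {
  std::vector<int> in{1, 2, 3, 4, 5}, a(5), b(5);
  parallel_aggregate(in.data(), a.data(), 5);
  parallel_hybrid(in.data(), b.data(), 5);
  return a == b && a == std::vector<int>{2, 4, 6, 8, 10};
}
```

In this sketch the hybrid path performs the same packing as the aggregate path, just one call later, which is the extra cost and complexity the question above is asking about. Whether a real host implementation can skip the record for small regions is exactly the open point in this thread.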
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D102107/new/
https://reviews.llvm.org/D102107