[PATCH] D129285: [OMPIRBuilder] Fill the Aggregate Struct in the serialized parallel region

Kiran Chandramohan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 7 09:45:56 PDT 2022


kiranchandramohan added inline comments.


================
Comment at: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:1054
+        Builder.CreateStore(StoredAggValue, CloneGEP);
+      }
       CI->removeFromParent();
----------------
jdoerfert wrote:
> kiranchandramohan wrote:
> > jdoerfert wrote:
> > > This is super fragile. What if there is no GEP for 0 offset or a GEP is reused. If we want something like this we should verify the conditions more closely. E.g., GEP->num_uses() == 1, all users of the CapturedStruct should be GEPs or other things we expect, etc.
> > > 
> > > Honestly, I would suggest to unify the handling rather than adding special cases. So, always outline, always do the same thing. In the serial case just omit the fork_call indirection.
> > > 
> > > Would that work?
> > I wasn't able to think of a way to unify the handling. The code to outline exists only in one place. So the extractor can be called only once unless we duplicate the parallel region code or modify the extractor to provide two copies or something like that. Were you suggesting something else?
> > 
> > > This is super fragile. What if there is no GEP for 0 offset or a GEP is reused. If we want something like this we should verify the conditions more closely. E.g., GEP->num_uses() == 1, all users of the CapturedStruct should be GEPs or other things we expect, etc.
> > 
> > Yeah, I agree that it is fragile and we should add the asserts. And there is always the case that something changes in the extractor's handling of aggregates and then all these will not work.
> I'm spitballing here: Could we not call the outlined function in the sequential case too, just without the fork_call indirection? If so, would that solve the issue?
> TBH, I'm not sure what the issue is w/o code to look at.
Both of them call the outlined function. I think that the extractor is only called once for the parallel region to outline and it fills the aggregate struct and creates the call to the outlined function at the place the extractor is called. For the serialized version we could previously just call the outlined function directly, but now because the arguments are passed in a structure, we have to fill that structure separately. Please correct me if this is not what actually happens.

I posted an example in D110114 yesterday to start a discussion but later thought can start a discussion in a patch.  Copying the example over here. The call to @sb_..omp_par from kmpc_fork_call in block omp_parallel has the aggregate filled in while the call to @sb_..omp_par in the serialized parallel region in block 6 does not have this. 

```
define void @sb_(ptr %0, ptr %1) !dbg !3 {
  %structArg = alloca { ptr }, align 8
  %tid.addr = alloca i32, align 4, !dbg !7
  %zero.addr = alloca i32, align 4, !dbg !7
  store i32 0, ptr %tid.addr, align 4, !dbg !7
  store i32 0, ptr %zero.addr, align 4, !dbg !7
  %3 = load i32, ptr %0, align 4, !dbg !7
  %4 = icmp ne i32 %3, 0, !dbg !7                 
  br label %entry, !dbg !7

entry:                                            ; preds = %2
  %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @1), !dbg !7
  br i1 %4, label %5, label %6

5:                                                ; preds = %entry
  br label %omp_parallel

omp_parallel:                                     ; preds = %5
  %gep_ = getelementptr { ptr }, ptr %structArg, i32 0, i32 0
  store ptr %1, ptr %gep_, align 8
  call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @1, i32 1, ptr @sb_..omp_par, ptr %structArg), !dbg !9
  br label %omp.par.outlined.exit

omp.par.outlined.exit:                            ; preds = %omp_parallel
  br label %omp.par.exit.split

omp.par.exit.split:                               ; preds = %omp.par.outlined.exit
  br label %7

6:                                                ; preds = %entry
  call void @__kmpc_serialized_parallel(ptr @1, i32 %omp_global_thread_num)
  call void @sb_..omp_par(ptr %tid.addr, ptr %zero.addr, ptr %structArg), !dbg !9
  call void @__kmpc_end_serialized_parallel(ptr @1, i32 %omp_global_thread_num)
  br label %7

7:                                                ; preds = %6, %omp.par.exit.split
  ret void, !dbg !10
}
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129285/new/

https://reviews.llvm.org/D129285



More information about the llvm-commits mailing list