[Openmp-dev] How to apply custom pass to OpenMP offloading code
Johannes Doerfert via Openmp-dev
openmp-dev at lists.llvm.org
Thu Nov 19 13:09:13 PST 2020
If you open the IR module you generated, you will see that it contains
both host and device code.
Short of manually applying all the steps the driver takes to build an
offload binary, you cannot run custom passes via opt. Load them via a
plugin instead; that is the proper way to add custom passes.
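
A minimal sketch of what "contains both host and device code" looks like, not taken from the thread. It assumes the .ll file is a text bundle in which clang's offload bundler separates the per-target modules with comment markers of the form held in kStart and kEnd below; that marker text is an assumption, so check it against your own file. The helper name splitBundle is hypothetical. If the assumption holds, the file is two IR modules back to back, which would explain why opt stops at a second source_filename line.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// ASSUMPTION: marker text used by clang's offload bundler for text files
// such as .ll. Not stated in the thread; verify against your own output.
static const std::string kStart = "; __CLANG_OFFLOAD_BUNDLE____START__ ";
static const std::string kEnd = "; __CLANG_OFFLOAD_BUNDLE____END__ ";

// Hypothetical helper: splits bundled text into (target, module text)
// pairs in file order. Lines outside any START/END pair are dropped.
std::vector<std::pair<std::string, std::string>>
splitBundle(const std::string &text) {
  std::vector<std::pair<std::string, std::string>> parts;
  std::istringstream in(text);
  std::string line;
  bool inside = false;
  while (std::getline(in, line)) {
    if (line.compare(0, kStart.size(), kStart) == 0) {
      // The rest of the marker line names the target of this module.
      parts.emplace_back(line.substr(kStart.size()), "");
      inside = true;
    } else if (line.compare(0, kEnd.size(), kEnd) == 0) {
      inside = false;
    } else if (inside) {
      assert(!parts.empty()); // a START marker always precedes this branch
      parts.back().second += line + "\n";
    }
  }
  return parts;
}
```

Calling splitBundle on the contents of test.ll should then yield one entry per target, each a complete module on its own. This only illustrates the file layout; it does not reproduce the remaining steps the driver takes to build the offload binary.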
~ Johannes
On 11/19/20 11:18 AM, Alok Mishra via Openmp-dev wrote:
> I'm trying to develop a custom pass and apply it to OpenMP target
> offloading code. But when using 'opt' I get the following error:
> opt: <stdin>:29:1: error: expected top-level entity
> source_filename = "test.cpp"
>
> I receive this error message even if I do not use my pass and just try to
> use opt with -O1.
>
> Sample code I tried to build:
> $ cat test.cpp
> int main()
> {
>   #pragma omp target teams distribute parallel for
>   for (int i = 0; i < 1000; i++);
>   return 0;
> }
>
> Commands used to build and run the code:
> $ clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target -march=sm_70 test.cpp
> $ nvprof ./a.out
> ==85306== NVPROF is profiling process 85306, command: ./a.out
> ==85306== Profiling application: ./a.out
> ==85306== Profiling result:
> Type             Time(%)      Time  Calls       Avg       Min       Max  Name
> GPU activities:   91.79%  30.752us      1  30.752us  30.752us  30.752us  __omp_offloading_2c_715c4b_main_l3
>                    4.68%  1.5680us      1  1.5680us  1.5680us  1.5680us  [CUDA memcpy DtoH]
>                    3.53%  1.1840us      1  1.1840us  1.1840us  1.1840us  [CUDA memcpy HtoD]
> API calls:        80.54%  362.46ms      1  362.46ms  362.46ms  362.46ms  cuDevicePrimaryCtxRetain
>                   17.76%  79.946ms      1  79.946ms  79.946ms  79.946ms  cuDevicePrimaryCtxRelease
>                    1.00%  4.4941ms      1  4.4941ms  4.4941ms  4.4941ms  cuModuleLoadDataEx
>                    0.54%  2.4175ms      1  2.4175ms  2.4175ms  2.4175ms  cuModuleUnload
>                    0.10%  446.26us     32  13.945us  2.0710us  174.72us  cuStreamCreate
>                    0.02%  111.17us     32  3.4730us  2.9480us  13.457us  cuStreamDestroy
>                    0.01%  64.949us      1  64.949us  64.949us  64.949us  cuStreamSynchronize
>                    0.01%  44.065us      1  44.065us  44.065us  44.065us  cuMemcpyDtoH
>                    0.01%  27.451us      1  27.451us  27.451us  27.451us  cuLaunchKernel
>                    0.00%  12.617us      1  12.617us  12.617us  12.617us  cuDeviceGetPCIBusId
>                    0.00%  9.4680us      1  9.4680us  9.4680us  9.4680us  cuMemcpyHtoD
>                    0.00%  5.9570us      1  5.9570us  5.9570us  5.9570us  cuModuleGetFunction
>                    0.00%  5.1610us      2  2.5800us  1.1090us  4.0520us  cuModuleGetGlobal
>                    0.00%  5.1190us      6     853ns     277ns  1.9740us  cuCtxSetCurrent
>                    0.00%  3.5270us      6     587ns     214ns  1.2310us  cuDeviceGetAttribute
>                    0.00%  3.4210us      3  1.1400us     465ns  2.4020us  cuDeviceGetCount
>                    0.00%  3.3210us      2  1.6600us  1.5970us  1.7240us  cuDeviceGet
>                    0.00%  1.6230us      1  1.6230us  1.6230us  1.6230us  cuFuncGetAttribute
>                    0.00%  1.1600us      1  1.1600us  1.1600us  1.1600us  cuDevicePrimaryCtxGetState
>                    0.00%     685ns      1     685ns     685ns     685ns  cuDevicePrimaryCtxSetFlags
>                    0.00%     346ns      1     346ns     346ns     346ns  cuCtxGetDevice
>
> This shows that my Clang and OpenMP are built properly and target
> offloading works.
>
> Next, I'm trying to apply a pass to this code, so I convert the code to
> LLVM IR and then apply -O1 to it using opt.
> $ clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target -march=sm_70 -emit-llvm -S test.cpp
> $ opt -O1 < test.ll
> opt: <stdin>:891:1: error: expected top-level entity
> source_filename = "test.cpp"
> ^
>
> I'm able to use opt for any non-target offloading code. I get this error
> only when I use OpenMP target offloading.
> Please advise what I am doing wrong here.
>
> --
> Thank You.
> Regards,
> Alok
> *'FOR THE GREATER GOOD'*