[Openmp-commits] [PATCH] D19879: Solve 'Too many args to microtask' problem

Sat May 7 17:28:31 PDT 2016

pawosm01 added a comment.

In http://reviews.llvm.org/D19879#420879, @jcownie wrote:

> On the microtask stuff: I have no objection to this, but I'm very surprised that it is needed, since I was under the impression that Clang/OpenMP only ever emits an outlined function for the parallel region that takes a single pointer argument, and then generates code that uses offsets from that to find all the actual arguments.
>
> It also seems very odd that most code is OK, but LULESH fails. (I know some SpecOMP codes have a lot of references to shared variables...)
>
> Have you tried test cases which simply pass large numbers of arguments?
>  (So, my concern here is that you may be fixing the wrong bug!)

Hi James,

This LULESH benchmark doesn't do anything unusual. It doesn't do any explicit tasking, only implicit tasks due to plain old 'parallel for' loops with a 'firstprivate' clause.
I added raise(11); at the place where "Too many args to microtask" is printed and started LULESH (built with -g) with 'ulimit -c unlimited' and 'OMP_NUM_THREADS=1'. Since it crashed on the raised signal, I could analyse the dumped core file in gdb:
(gdb) bt
#0  0x000003ff8f981450 in raise () from /lib64/libpthread.so.0
#1  0x000003ff8fa298d8 in __kmp_invoke_microtask () from /home/pawosm01/llvm/lib/libomp.so
#2  0x000003ff8fa097f4 in __kmp_fork_call () from /home/pawosm01/llvm/lib/libomp.so
#3  0x000003ff8fa00048 in __kmpc_fork_call () from /home/pawosm01/llvm/lib/libomp.so
#4  0x00000000004036a0 in CalcEnergyForElems (p_new=0x3ef85120, e_new=<optimized out>, q_new=<optimized out>, bvc=<optimized out>, pbvc=<optimized out>, p_old=<optimized out>, e_old=<optimized out>, q_old=<optimized out>, compHalfStep=<optimized out>, vnewc=0x3ed53de0, work=<optimized out>, delvc=<optimized out>, e_cut=<optimized out>, q_cut=<optimized out>, emin=<optimized out>, qq_old=<optimized out>, ql_old=<optimized out>, rho0=<optimized out>, length=1056463064, regElemList=<optimized out>, compression=<optimized out>, pmin=<optimized out>, p_cut=<optimized out>, eosvmax=<optimized out>) at lulesh.cc:2145

#5  EvalEOSForElems (domain=..., vnewc=<optimized out>, numElemReg=<optimized out>, regElemList=<optimized out>, rep=<optimized out>) at lulesh.cc:2318
#6  ApplyMaterialPropertiesForElems (domain=..., vnew=<optimized out>) at lulesh.cc:2424
#7  LagrangeElements (domain=..., numElem=<optimized out>) at lulesh.cc:2463
#8  LagrangeLeapFrog (domain=...) at lulesh.cc:2656
#9  main (argc=<optimized out>, argv=<optimized out>) at lulesh.cc:2774

lulesh.cc line 2145 is the start of a parallel region:

#pragma omp parallel for firstprivate(length, rho0, emin, e_cut)
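
A reduced loop of the same shape can serve as a test case that simply references a large number of outside variables. This is only a minimal sketch: the function name many_captures, the arrays a0..a12 and the arithmetic are hypothetical and mirror the variable count, not the actual LULESH code. Whether a given compiler passes each referenced variable as a separate microtask argument is an assumption here.

```c
/* Hypothetical reduced kernel: names and arithmetic are illustrative only. */
void many_captures(double *a0, const double *a1, const double *a2,
                   const double *a3, const double *a4, const double *a5,
                   const double *a6, const double *a7, const double *a8,
                   const double *a9, const double *a10, const double *a11,
                   const double *a12,
                   int length, double rho0, double emin, double e_cut)
{
    /* The region references 13 arrays plus 4 firstprivate scalars. */
#pragma omp parallel for firstprivate(length, rho0, emin, e_cut)
    for (int i = 0; i < length; ++i) {
        double e = a1[i] + a2[i] + a3[i] + a4[i] + a5[i] + a6[i]
                 + a7[i] + a8[i] + a9[i] + a10[i] + a11[i] + a12[i] + rho0;
        if (e < e_cut) e = 0.0;   /* cut off small values */
        if (e < emin)  e = emin;  /* clamp to the minimum */
        a0[i] = e;
    }
}
```

Built with OpenMP enabled (for example -fopenmp with Clang) the loop becomes a parallel region; without it the pragma is ignored and the loop runs serially with the same result.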

Inside the region, 13 arrays defined outside it are accessed, plus the 4 firstprivate variables in the clause, which are also defined outside the region. That gives 17 variables, which matches the count in the error message:

Running problem size 30^3 per domain until completion
Num processors: 1
Num threads: 6
Total number of elements: 27000

To run other sizes, use -s <integer>.
To run a fixed number of iterations, use -i <integer>.
To run a more or less balanced region set, use -b <integer>.
To change the relative costs of regions, use -c <integer>.
To print out progress, use -p
To write an output file for VisIt, use -v
See help (-h) for more options

Too many args to microtask: 17!
Too many args to microtask: 17!
Too many args to microtask: 17!
Too many args to microtask: 17!

I removed the uses of two of them, chosen arbitrarily (the least used ones), from the loop body, since the switch in __kmp_invoke_microtask() handles up to 15 params, and, as expected, the error message stopped appearing.
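
The limit comes from dispatching on the argument count with one switch case per supported count. The following is a simplified model of that pattern, not the libomp source: invoke_fixed, add_two and the limit of 3 are hypothetical (the switch discussed above handles up to 15), and this version returns 0 instead of aborting.

```c
#include <stdio.h>

typedef void (*generic_fn)(void);

/* Calls pkfn with argc pointer arguments taken from argv.
   Returns 1 on success, 0 if argc has no matching case. */
int invoke_fixed(generic_fn pkfn, int gtid, int tid, int argc, void **argv)
{
    switch (argc) {
    case 0:
        ((void (*)(int *, int *))pkfn)(&gtid, &tid);
        break;
    case 1:
        ((void (*)(int *, int *, void *))pkfn)(&gtid, &tid, argv[0]);
        break;
    case 2:
        ((void (*)(int *, int *, void *, void *))pkfn)(&gtid, &tid,
                                                       argv[0], argv[1]);
        break;
    case 3:
        ((void (*)(int *, int *, void *, void *, void *))pkfn)(
            &gtid, &tid, argv[0], argv[1], argv[2]);
        break;
    default:
        /* No case for this count: every supported count needs its own call. */
        fprintf(stderr, "Too many args to microtask: %d!\n", argc);
        return 0;
    }
    return 1;
}

/* Hypothetical two-argument microtask: adds *b into *a. */
void add_two(int *gtid, int *tid, void *a, void *b)
{
    (void)gtid;
    (void)tid;
    *(int *)a += *(int *)b;
}
```

Calling invoke_fixed((generic_fn)add_two, 0, 0, 2, args) with args pointing at two ints runs the microtask, while any argc above the highest case falls through to the error path. Raising the limit in this model means adding more cases or replacing the switch with a mechanism that handles an arbitrary count.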

Repository:
  rL LLVM

http://reviews.llvm.org/D19879