[PATCH] D26437: Use -fno-unit-at-a-time and -funit-at-a-time

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 11 13:42:15 PST 2016


rampitec added a comment.

The full testcase is rather big and requires a lot of setup, but I have created a really small one, sb1.cl:

  #include <opencl-c.h>
  
  __attribute__((overloadable, always_inline, convergent)) void
  sub_group_barrier(cl_mem_fence_flags flags, memory_scope scope)
  {
      if (flags)
          atomic_work_item_fence(flags, memory_order_acq_rel, scope);
  }

The file opencl-c.h comes from clang. Note that I added the convergent attribute because this header declares the function with it:

  #define __ovld __attribute__((overloadable))
  #define __conv __attribute__((convergent))
  void    __ovld __conv sub_group_barrier(cl_mem_fence_flags flags, memory_scope scope);

Now I compile it to produce a library .bc module:

  clang -o sb1.bc -c -emit-llvm -x cl -cl-std=CL2.0 -target amdgcn-amd-amdhsa-opencl -O3 -I <path to llvm>/llvm/tools/clang/lib/Headers sb1.cl

This is the result:

  ; Function Attrs: alwaysinline nounwind
  define void @_Z17sub_group_barrierj12memory_scope(i32 %flags, i32 %scope) local_unnamed_addr #0 {
  
  attributes #0 = { alwaysinline nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-features"="+fp64-denormals,-fp32-denormals" "unsafe-fp-math"="false" "use-soft-float"="false" }
  attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-features"="+fp64-denormals,-fp32-denormals" "unsafe-fp-math"="false" "use-soft-float"="false" }
  attributes #2 = { nounwind }

The convergent attribute is missing. You may argue that atomic_work_item_fence does not have the convergent attribute, but in fact it should not have it. Moreover, I can remove the body of sub_group_barrier completely, so that it does not call anything at all, and the result is still the same.

If I use the -fno-unit-at-a-time option, the attribute is in place:

  ; Function Attrs: alwaysinline convergent nounwind
  define void @_Z17sub_group_barrierj12memory_scope(i32 %flags, i32 %scope) #0 {

Also note that, besides the function attributes pass, PassManagerBuilder adds a lot of other passes in unit-at-a-time mode that are not expected to run in a non-whole-program mode: IPSCCP, GlobalOptimizer, DeadArgElimination, etc. None of these are supposed to run on a library module.


Repository:
  rL LLVM

https://reviews.llvm.org/D26437




