[llvm-bugs] [Bug 46450] New: More registers are used when multiple target regions are compiled together

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Jun 24 22:21:55 PDT 2020


https://bugs.llvm.org/show_bug.cgi?id=46450

            Bug ID: 46450
           Summary: More registers are used when multiple target regions
                    are compiled together
           Product: OpenMP
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Clang Compiler Support
          Assignee: unassignedclangbugs at nondot.org
          Reporter: xw111luoye at gmail.com
                CC: llvm-bugs at lists.llvm.org

I initially spotted this issue with AOMP, but it seems to come from upstream clang.
https://github.com/ROCm-Developer-Tools/aomp/issues/24

Reproducer:
git clone https://github.com/ye-luo/miniqmc
cd miniqmc/build
cmake -DCMAKE_CXX_COMPILER=clang++ -DENABLE_OFFLOAD=ON \
      -DUSE_OBJECT_TARGET=ON -DCMAKE_EXE_LINKER_FLAGS="-v" ..
make -j32 check_spo_batched

All 6 kernels use 254 registers.

Then I commented out "target teams" at 159, 311, and 405 and rebuilt:
make -j32 check_spo_batched
Now the 3 remaining kernels all use 243 registers.

If I add
-DCMAKE_CXX_FLAGS="-Xcuda-ptxas -v" to the cmake command to print the register
usage reported by ptxas, the three kernels take 146, 30, and 30 registers when
compiled.
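
To make the per-kernel numbers easier to compare, the ptxas output can be reduced to one line per kernel. This is only a sketch: the helper name ptxas_regs is made up for illustration, and it assumes that "ptxas -v" prints a "Compiling entry function '<name>' ..." line followed later by a "Used <N> registers" line for each kernel. The exact wording may differ between CUDA versions.

```shell
# Hypothetical helper (not part of miniqmc or clang): reads a build log on
# stdin and prints "<registers> <kernel name>" for each kernel.
# Assumes the ptxas -v line format described above.
ptxas_regs() {
  awk -F"'" '
    # Remember the most recent kernel name (the text between single quotes).
    /Compiling entry function/ { name = $2 }
    # Extract N from "Used N registers" and print it with that name.
    /Used [0-9]+ registers/ {
      match($0, /Used [0-9]+ registers/)
      split(substr($0, RSTART, RLENGTH), f, " ")
      print f[2], name
    }'
}
```

It could be used as "make -j32 check_spo_batched 2>&1 | ptxas_regs | sort -n", with stderr merged into stdout in case ptxas writes its report there.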

I think the register usage is fine when the kernels are compiled individually.
Somehow, at link time, every assembled kernel ends up with the worst register
usage among all the individual kernels.

It destroys performance completely.
