[llvm-bugs] [Bug 42172] New: x86-domain-reassignment pass causes dreadful compile time slowdown when generating code for skylake-avx512 architecture

via llvm-bugs llvm-bugs at lists.llvm.org
Thu Jun 6 14:25:35 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=42172

            Bug ID: 42172
           Summary: x86-domain-reassignment pass causes dreadful compile
                    time slowdown when generating code for skylake-avx512
                    architecture
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: Kevin.Harris at unisys.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, spatel+llvm at rotateright.com

Created attachment 22078
  --> https://bugs.llvm.org/attachment.cgi?id=22078&action=edit
the small case described above

At Unisys, we have an LLVM-based JIT that generates x86-64 code from instruction
sequences of one of our historical architectures.  We recently started
testing on servers with the skylake-avx512 architecture, and encountered a
shocking compile-time slowdown when compiling for this target.  Using pass
timings, we isolated the problem to the x86-domain-reassignment pass, and
confirmed it by disabling the pass and seeing a large compile-time speedup.  A
scatter plot of 31K+ examples shows that the slowdown is strongly related to
the size of the IR, which points to an n-squared algorithm as the culprit.  I
provide three examples, a small one, a medium-sized one, and a large one:

i131323820_f000203011542_1.bc - 712 object code bytes
i141043225_f400004013147_1.bc - 40376 object code bytes
i140409520_f400004002276_1.bc - 129424 object code bytes

The opt+llc pipeline that I ran for these cases showed the following slowdowns
when the x86-domain-reassignment pass is used (time with the pass disabled,
then time with the pass enabled):

small case: 0.013 secs to 0.017 secs
middle case: 1.043 secs to 7.981 secs
large case: 4.140 secs to 69.689 secs
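
To put the timings reported above in perspective, here is a minimal sketch that
computes the slowdown factor for each case and a rough growth exponent for the
extra time between the middle and large cases.  The function names are
hypothetical, and using object code bytes as a stand-in for IR size is an
assumption on top of the report; two data points can only hint at the growth
rate, not establish it.

```shell
# Data copied from the report.
# Columns: case, object code bytes, secs with pass disabled, secs with pass enabled.
report_data() {
  cat <<'EOF'
small 712 0.013 0.017
middle 40376 1.043 7.981
large 129424 4.140 69.689
EOF
}

# Slowdown factor per case: time with the pass / time without the pass.
slowdowns() {
  report_data | awk '{ printf "%s %.2fx\n", $1, $4 / $3 }'
}

# Rough growth exponent k, assuming extra_time ~ size^k between the middle
# and large cases (object size used as a proxy for IR size: an assumption).
growth_exponent() {
  report_data | awk '
    $1 == "middle" { s1 = $2; d1 = $4 - $3 }
    $1 == "large"  { s2 = $2; d2 = $4 - $3 }
    END { printf "%.2f\n", log(d2 / d1) / log(s2 / s1) }'
}

slowdowns
growth_exponent
```

The slowdown factors come out at roughly 1.3x, 7.7x, and 16.8x, and the
exponent at a little under 2, which is at least consistent with the n-squared
behavior suggested by the scatter plot.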

The opt+llc pipeline that I used for these comparisons looks like this:

$LLVMPATH/opt -O3 -enable-tbaa -mcpu=skylake-avx512 $INFILE |
  $LLVMPATH/llc -O3 -enable-tbaa -filetype=obj -o=out.o -mcpu=skylake-avx512 -

where $INFILE is one of the 3 files listed above.  These commands generated the
slow times noted above.  The fast times were obtained by adding the
-disable-x86-domain-reassignment option to the llc command above.
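
For anyone reproducing the comparison, the two runs described above can be
sketched as a small script.  The helper name run_pipeline is hypothetical; the
flags are the ones quoted in this report, and LLVMPATH and INFILE are assumed
to be set as in the command above.  The `time` keyword as used here assumes
bash.

```shell
# run_pipeline INFILE [extra llc options...]
# Runs the opt+llc pipeline from the report; extra options go to llc only.
run_pipeline() {
  infile=$1
  shift
  "$LLVMPATH/opt" -O3 -enable-tbaa -mcpu=skylake-avx512 "$infile" |
    "$LLVMPATH/llc" -O3 -enable-tbaa -filetype=obj -o=out.o \
      -mcpu=skylake-avx512 "$@" -
}

# Only run the comparison when the inputs are actually available.
if [ -n "${INFILE:-}" ] && [ -x "${LLVMPATH:-}/llc" ]; then
  echo "x86-domain-reassignment enabled (slow times):"
  time run_pipeline "$INFILE"

  echo "x86-domain-reassignment disabled (fast times):"
  time run_pipeline "$INFILE" -disable-x86-domain-reassignment
fi
```

Both runs write to out.o, so copy the file aside between runs if you also want
to compare the generated object code.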

I measured the object size for each of the 31K+ bitcode files that we used for
this experiment, and saw no change in object code size with or without the
-disable-x86-domain-reassignment option, so I'm presuming that this extra
compile time gives us no benefit.

Please let me know if I can provide any additional assistance in resolving this
problem.
