[llvm-bugs] [Bug 52040] New: compiler crash when optimizing 3 or more HADD/HSUB intrinsics

Sat Oct 2 06:37:10 PDT 2021

https://bugs.llvm.org/show_bug.cgi?id=52040

            Bug ID: 52040
           Summary: compiler crash when optimizing 3 or more HADD/HSUB
                    intrinsics
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: benjsith at gmail.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, pengfei.wang at intel.com,
                    spatel+llvm at rotateright.com

Created attachment 25315
  --> https://bugs.llvm.org/attachment.cgi?id=25315&action=edit
Compiler output with error log/callstack

A week ago, I reported https://bugs.llvm.org/show_bug.cgi?id=51974, which was
fixed in commit 468ff703e114599ce8fb7457bd3c7ef0b219e952. However, I have come
across a variant that occurs when at least 3 HSUB/HADD intrinsics are chained
together in a particular way. This variant also creates a cycle in the DAG,
even with the new fix applied.

The following code is a minimal repro: it triggers an assertion about a cycle
in the DAG in Debug builds and loops forever in Release builds

#include <immintrin.h>
__m128i do_stuff(__m128i I0) {
    __m128i A = _mm_hadd_epi16(I0, I0);
    __m128i B = _mm_hadd_epi16(A, A);
    __m128i C = _mm_hadd_epi16(B, A);
    return C;
}

when compiled with 'clang -O1 -mavx2'. The same issue occurs if all the
_mm_hadd_epi16 calls are changed to one of _mm_hadd_epi32, _mm_hsub_epi16, or
_mm_hsub_epi32.
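
For anyone who wants to check the output of a patched compiler against a known
answer, here is a rough scalar model of the repro above. It is only a sketch:
the names vec8x16, hadd_epi16_model and do_stuff_model are made up for
illustration and are not part of LLVM or immintrin.h. It assumes the usual
_mm_hadd_epi16 lane layout (adjacent pairs of the first operand fill lanes
0..3, adjacent pairs of the second operand fill lanes 4..7, with 16-bit
wrapping).

```c
#include <stdint.h>

/* Hypothetical stand-in for __m128i viewed as eight 16-bit lanes. */
typedef struct { int16_t lane[8]; } vec8x16;

/* Scalar model of _mm_hadd_epi16(a, b): sums of adjacent pairs of `a`
 * go to lanes 0..3, sums of adjacent pairs of `b` go to lanes 4..7.
 * The addition wraps modulo 2^16 (no saturation). */
static vec8x16 hadd_epi16_model(vec8x16 a, vec8x16 b) {
    vec8x16 r;
    for (int i = 0; i < 4; ++i) {
        uint16_t sa = (uint16_t)((uint16_t)a.lane[2 * i] +
                                 (uint16_t)a.lane[2 * i + 1]);
        uint16_t sb = (uint16_t)((uint16_t)b.lane[2 * i] +
                                 (uint16_t)b.lane[2 * i + 1]);
        /* Converting back to int16_t relies on the two's-complement
         * behaviour of common compilers. */
        r.lane[i]     = (int16_t)sa;
        r.lane[i + 4] = (int16_t)sb;
    }
    return r;
}

/* Same chain of three horizontal adds as do_stuff() in the repro. */
static vec8x16 do_stuff_model(vec8x16 i0) {
    vec8x16 a = hadd_epi16_model(i0, i0);
    vec8x16 b = hadd_epi16_model(a, a);
    vec8x16 c = hadd_epi16_model(b, a);
    return c;
}
```

With input lanes 1..8, this model gives A = {3,7,11,15,3,7,11,15},
B = {10,26,10,26,10,26,10,26} and C = {36,36,36,36,10,26,10,26}, which can be
compared lane by lane with what do_stuff() returns.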

I looked into the code in combineVectorHADDSUB() in X86ISelLowering.cpp, and
noticed the following code being hit:

SDValue LHS0 = LHS.getOperand(0);
SDValue RHS0 = LHS.getOperand(1);
SDValue LHS1 = RHS.getOperand(0);
SDValue RHS1 = RHS.getOperand(1);

This appears to be a typo, and changing it to 

SDValue LHS0 = LHS.getOperand(0);
SDValue LHS1 = LHS.getOperand(1);
SDValue RHS0 = RHS.getOperand(0);
SDValue RHS1 = RHS.getOperand(1);

avoids the DAG cycle and seems to optimize the function correctly. I don't
know what knock-on effects this change would have.

I have also attached the compiler's debug output for the test case above. It
shows the cycle in the DAG and the callstack of the assert.

I tested on the latest trunk (commit e420164f40a907643db40e65fff51a6041d40090)
and the issue is still present. I have observed it on both Windows and Linux.
