[llvm-bugs] [Bug 50305] New: Poor vector code generation for blend operation

Tue May 11 11:30:26 PDT 2021

https://bugs.llvm.org/show_bug.cgi?id=50305

            Bug ID: 50305
           Summary: Poor vector code generation for blend operation
           Product: clang
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: -New Bugs
          Assignee: unassignedclangbugs at nondot.org
          Reporter: binjimin at gmail.com
                CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org,
                    neeilans at live.com, richard-llvm at metafoo.co.uk

See https://godbolt.org/z/Ms9E4nPhM

Given the following code:

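#include <stdint.h>

typedef uint8_t u8;
typedef uint16_t u16;
typedef u8 u8x16 __attribute__((vector_size(16)));
typedef u16 u16x8 __attribute__((vector_size(16)));

typedef struct {
  u8x16 counter, shift;
} A;

void bad(A* a) {
  // Lanes whose counter is zero compare to all-ones (0xFF), the rest to 0.
  u8x16 active = a->counter == 0;
  // Decrement the counter only in inactive lanes.
  a->counter -= 1 & ~active;
  // Blend: shifted value in active lanes, original value elsewhere.
  a->shift = ((a->shift << 1) & active) | (a->shift & ~active);
}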

Clang seems to prefer to fold the blend into a per-lane variable shift in the
LLVM IR (see %11), which then cannot be lowered efficiently for x86 SSE3 or
Wasm, since neither has a per-lane variable 8-bit shift:

define dso_local void @_Z3badP1A(%struct.A* nocapture %0) local_unnamed_addr #0 !dbg !267 {
  call void @llvm.dbg.value(metadata %struct.A* %0, metadata !285, metadata !DIExpression()), !dbg !287
  %2 = getelementptr inbounds %struct.A, %struct.A* %0, i64 0, i32 0, !dbg !288
  %3 = load <16 x i8>, <16 x i8>* %2, align 16, !dbg !288, !tbaa !289
  %4 = icmp ne <16 x i8> %3, zeroinitializer, !dbg !292
  call void @llvm.dbg.value(metadata <16 x i8> undef, metadata !286, metadata !DIExpression()), !dbg !287
  %5 = sext <16 x i1> %4 to <16 x i8>, !dbg !293
  %6 = add <16 x i8> %3, %5, !dbg !294
  store <16 x i8> %6, <16 x i8>* %2, align 16, !dbg !294, !tbaa !289
  %7 = getelementptr inbounds %struct.A, %struct.A* %0, i64 0, i32 1, !dbg !295
  %8 = load <16 x i8>, <16 x i8>* %7, align 16, !dbg !295, !tbaa !289
  %9 = xor <16 x i1> %4, <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>, !dbg !296
  %10 = zext <16 x i1> %9 to <16 x i8>, !dbg !296
  %11 = shl <16 x i8> %8, %10, !dbg !296
  store <16 x i8> %11, <16 x i8>* %7, align 16, !dbg !297, !tbaa !289
  ret void, !dbg !298
}
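
For reference, the fold is value-preserving per lane; the cost lies only in
how the resulting shl is lowered. The following is a minimal scalar sketch,
not code from the report, and the helper names (blend_lane, shift_lane,
check_forms_agree) are hypothetical. It models one u8 lane and compares the
blend as written in bad() with the shift-by-mask form that appears as %11:

```c
#include <assert.h>
#include <stdint.h>

/* Blend as written in bad(): (x << 1) where the mask is set, x elsewhere.
   mask is 0xFF (active lane) or 0x00 (inactive lane), as produced by a
   vector comparison. */
static uint8_t blend_lane(uint8_t x, uint8_t mask) {
    return (uint8_t)((((unsigned)x << 1) & mask) | (x & (uint8_t)~mask));
}

/* Form seen in the IR: shift by 1 in active lanes and by 0 elsewhere. */
static uint8_t shift_lane(uint8_t x, uint8_t mask) {
    return (uint8_t)((unsigned)x << (mask & 1));
}

/* Exhaustively compare both forms for every lane value and both masks. */
static void check_forms_agree(void) {
    for (unsigned x = 0; x < 256; ++x) {
        assert(blend_lane((uint8_t)x, 0xFF) == shift_lane((uint8_t)x, 0xFF));
        assert(blend_lane((uint8_t)x, 0x00) == shift_lane((uint8_t)x, 0x00));
    }
}
```

Both forms agree for all 256 lane values, so the transformation is legal; the
complaint is that the shift-by-vector form is harder to lower on targets
without a per-lane 8-bit shift.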

Using the platform-specific vector intrinsics seems to avoid this issue.
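
As an illustration of what an intrinsic-based version might look like, here is
a sketch assuming an x86 target with SSE2. It is not the reporter's code; the
names A_sse and step_sse2 are hypothetical, and the report does not state
which intrinsics were used:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical intrinsic-based counterpart of struct A. */
typedef struct {
    __m128i counter, shift;
} A_sse;

/* Same logic as bad(), spelled with SSE2 intrinsics. */
static void step_sse2(A_sse *a) {
    const __m128i zero = _mm_setzero_si128();
    const __m128i one = _mm_set1_epi8(1);

    /* active = 0xFF in lanes where counter == 0, 0x00 elsewhere. */
    __m128i active = _mm_cmpeq_epi8(a->counter, zero);

    /* counter -= 1 & ~active; _mm_andnot_si128(x, y) computes (~x) & y. */
    a->counter = _mm_sub_epi8(a->counter, _mm_andnot_si128(active, one));

    /* SSE2 has no 8-bit shift; x + x equals x << 1 modulo 256. */
    __m128i shifted = _mm_add_epi8(a->shift, a->shift);

    /* Blend: shifted value in active lanes, original value elsewhere. */
    a->shift = _mm_or_si128(_mm_and_si128(active, shifted),
                            _mm_andnot_si128(active, a->shift));
}
```

Because the blend is expressed through explicit and/andnot/or intrinsics, the
compiler has less freedom to rewrite it as a variable shift, which is
consistent with the observation above.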
