[llvm-bugs] [Bug 50971] New: Messy code for insertelement on AVX2 vectors
via llvm-bugs
llvm-bugs at lists.llvm.org
Fri Jul 2 14:41:25 PDT 2021
https://bugs.llvm.org/show_bug.cgi?id=50971
Bug ID: 50971
Summary: Messy code for insertelement on AVX2 vectors
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: efriedma at quicinc.com
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, pengfei.wang at intel.com,
spatel+llvm at rotateright.com
Examples:
#include <immintrin.h>
#define IDX 3
__m256d float1(__m256d a, __m256d b) { a[IDX] = b[0]; return a; }
__m256d float2(__m256d a, double b) { a[IDX] = b; return a; }
__m256d float3(__m256d a, double b) { return _mm256_blend_epi32(a,
    _mm256_set1_pd(b), 3 << IDX * 2); }
__m256i int1(__m256i a, __m256i b) { a[IDX] = b[0]; return a; }
__m256i int2(__m256i a, long b) { a[IDX] = b; return a; }
__m256i int3(__m256i a, long b) { return _mm256_blend_epi32(a,
    _mm256_set1_epi64x(b), 3 << IDX * 2); }
__m256i int4(__m256i a, long *b) { a[IDX] = *b; return a; }
__m256i int5(__m256i a, long *b) { return _mm256_blend_epi32(a,
    _mm256_set1_epi64x(*b), 3 << IDX * 2); }
It looks like the lowering extracts a 128-bit vector, inserts into that, then
reconstructs the 256-bit vector. A broadcast+blend is almost always going to be
more efficient, I think.
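
For reference, the two strategies can be written out by hand for the double
case. This is only a sketch of the source-level shape, not the exact sequence
the backend emits. The helper names insert_via_half and insert_via_blend, the
pointer-based interface, and the target("avx") attribute are assumptions made
so the snippet builds without extra flags on GCC or Clang; it uses
_mm256_blend_pd with a one-bit-per-lane mask rather than the
_mm256_blend_epi32 form in the examples above.

```c
#include <immintrin.h>

#define IDX 3  /* same lane as in the examples above */

/* Hypothetical helper: extract the upper 128-bit half, insert into it,
   then rebuild the 256-bit vector. */
__attribute__((target("avx")))
void insert_via_half(double out[4], const double a[4], double b)
{
    __m256d v  = _mm256_loadu_pd(a);
    __m128d hi = _mm256_extractf128_pd(v, 1);   /* lanes 2 and 3 */
    hi = _mm_unpacklo_pd(hi, _mm_set_sd(b));    /* {lane 2, b} */
    v  = _mm256_insertf128_pd(v, hi, 1);        /* put the half back */
    _mm256_storeu_pd(out, v);
}

/* Hypothetical helper: broadcast b to every lane, then blend only
   lane IDX into a (one mask bit per 64-bit lane). */
__attribute__((target("avx")))
void insert_via_blend(double out[4], const double a[4], double b)
{
    __m256d v = _mm256_loadu_pd(a);
    v = _mm256_blend_pd(v, _mm256_set1_pd(b), 1 << IDX);
    _mm256_storeu_pd(out, v);
}
```

Both helpers should produce the same vector; the second needs only a broadcast
and a single blend instead of the extract/insert/reinsert chain.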