[llvm-bugs] [Bug 26301] New: [x86] suboptimal codegen for vector with string of set bits (-1)
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Jan 25 15:46:23 PST 2016
https://llvm.org/bugs/show_bug.cgi?id=26301
Bug ID: 26301
Summary: [x86] suboptimal codegen for vector with string of set bits (-1)
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: spatel+llvm at rotateright.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
I noticed this while looking at vmaskmov codegen because that (unnecessarily)
uses vector masks with elements of all ones or zeros:
define <4 x i32> @high_64_ones() {
ret <4 x i32><i32 0, i32 0, i32 -1, i32 -1>
}
define <2 x i64> @high_64_ones_alt() {
ret <2 x i64><i64 0, i64 -1>
}
$ ./llc high_ones.ll -o -
_high_64_ones: ## @high_64_ones
movaps LCPI0_0(%rip), %xmm0 ## xmm0 = [0,0,4294967295,4294967295]
retq
_high_64_ones_alt: ## @high_64_ones_alt
movq $-1, %rax
movd %rax, %xmm0
pslldq $8, %xmm0 ## xmm0 = zero,zero,zero,zero,zero,zero,zero,zero,xmm0[0,1,2,3,4,5,6,7]
retq
-----------------------------------------------------------------------------
Some might argue that the first case is fine: don't burden the vector units,
because loads (and memory space?) are cheap on big Intel cores. But I
think the second case would be better as:
pcmpeqd %xmm0, %xmm0 // splat 1 bits all the way across
pslldq $8, %xmm0 // shift in the zeros
...to avoid the move from integer to SSE register. And that's the codegen we
should produce by default for both cases.