[PATCH] D26521: [X86] Allow folding of reloads from stack slots when loading a subreg of the spilled reg
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 11 01:48:56 PST 2016
RKSimon added a comment.
Nice! I have an outstanding poor-codegen issue caused by the inability to split+fold scalarization cases. Do you think we could extend this patch in the future to handle those cases as well?
  __m128i popcnt1(__m128i *in) {
    return (__m128i) { __builtin_popcountll(in[0][0]), __builtin_popcountll(in[0][1]) };
  }
  popcnt1(long long __vector(2)*):
          vmovdqu (%rdi), %xmm0
          vmovq   %xmm0, %rax
          vpextrq $1, %xmm0, %rcx
          popcntq %rax, %rax
          popcntq %rcx, %rcx
          vmovq   %rcx, %xmm0
          vmovq   %rax, %xmm1
          vpunpcklqdq %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0],xmm0[0]
          retq
Which would be better as:
  popcnt1(long long __vector(2)*):
          popcntq (%rdi), %rax
          popcntq 8(%rdi), %rcx
          vmovq   %rcx, %xmm0
          vmovq   %rax, %xmm1
          vpunpcklqdq %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0],xmm0[0]
          retq
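For illustration only, here is a rough source-level sketch of the transformation I have in mind. It is not part of the patch. The `v2i64` typedef is a hypothetical stand-in for `__m128i` so the snippet builds without the x86 headers, and `popcnt_vec` / `popcnt_split` are made-up names. The second function loads each 64-bit lane directly from memory, which is the shape the folded `popcntq (%rdi)` / `popcntq 8(%rdi)` sequence corresponds to:

```c
#include <string.h>

/* Hypothetical stand-in for __m128i (GCC/Clang vector extension). */
typedef long long v2i64 __attribute__((vector_size(16)));

/* Current shape: load the whole vector, then extract each lane. */
v2i64 popcnt_vec(const v2i64 *in) {
  v2i64 v;
  memcpy(&v, in, sizeof v);              /* one 128-bit load */
  return (v2i64){ __builtin_popcountll(v[0]), __builtin_popcountll(v[1]) };
}

/* Desired shape: split the vector load into two scalar loads so that
   each one can be folded into the popcnt that consumes it. */
v2i64 popcnt_split(const v2i64 *in) {
  unsigned long long lo, hi;
  memcpy(&lo, (const char *)in, sizeof lo);      /* lane 0 at offset 0 */
  memcpy(&hi, (const char *)in + 8, sizeof hi);  /* lane 1 at offset 8 */
  return (v2i64){ __builtin_popcountll(lo), __builtin_popcountll(hi) };
}
```

Both functions should return the same value for any input. Whether a given compiler actually emits the folded loads for the second form is not something this sketch guarantees.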
================
Comment at: test/CodeGen/X86/partial-fold.ll:1
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s
+
----------------
Add i686 tests as well, and regenerate the checks with utils\update_llc_test_checks.py.
https://reviews.llvm.org/D26521