[PATCH] D26521: [X86] Allow folding of reloads from stack slots when loading a subreg of the spilled reg

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 11 01:48:56 PST 2016


RKSimon added a comment.

Nice! I have an outstanding codegen issue where we fail to split and fold loads in scalarization cases. Do you think this patch could be extended in the future to handle those cases as well?

  #include <emmintrin.h>

  __m128i popcnt1(__m128i *in) {
    return (__m128i) { __builtin_popcountll(in[0][0]), __builtin_popcountll(in[0][1]) };
  }
  
  popcnt1(long long __vector(2)*):
          vmovdqu (%rdi), %xmm0
          vmovq   %xmm0, %rax
          vpextrq $1, %xmm0, %rcx
          popcntq %rax, %rax
          popcntq %rcx, %rcx
          vmovq   %rcx, %xmm0
          vmovq   %rax, %xmm1
          vpunpcklqdq     %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0],xmm0[0]
          retq

This would be better with the two loads folded directly into the popcntq instructions:

  popcnt1(long long __vector(2)*):
          popcntq (%rdi), %rax
          popcntq 8(%rdi), %rcx
          vmovq   %rcx, %xmm0
          vmovq   %rax, %xmm1
          vpunpcklqdq     %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0],xmm0[0]
          retq
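
To illustrate why the folded form is equivalent, here is a minimal C sketch. It compares "load the full 16 bytes, then extract each lane" against "read each 64-bit lane straight from memory at offsets 0 and 8". The type vec2i64 and the functions popcnt_extract and popcnt_folded are hypothetical names used only for this illustration; they are not part of the patch. __builtin_popcountll is the GCC/Clang builtin already used above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 2 x i64 vector (illustration only). */
typedef struct { int64_t lane[2]; } vec2i64;

/* Current codegen shape: one 128-bit load, then per-lane extraction. */
static vec2i64 popcnt_extract(const void *in) {
    vec2i64 v;
    memcpy(&v, in, sizeof v);  /* models vmovdqu (%rdi), %xmm0 */
    vec2i64 r = { { __builtin_popcountll((unsigned long long)v.lane[0]),
                    __builtin_popcountll((unsigned long long)v.lane[1]) } };
    return r;
}

/* Desired codegen shape: each lane is read from memory on its own. */
static vec2i64 popcnt_folded(const void *in) {
    uint64_t lo, hi;
    memcpy(&lo, (const unsigned char *)in, sizeof lo);      /* models popcntq (%rdi)  */
    memcpy(&hi, (const unsigned char *)in + 8, sizeof hi);  /* models popcntq 8(%rdi) */
    vec2i64 r = { { __builtin_popcountll(lo), __builtin_popcountll(hi) } };
    return r;
}
```

Both functions return the same lanes for any 16-byte input, which is the property the split+fold transform would rely on.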



================
Comment at: test/CodeGen/X86/partial-fold.ll:1
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s
+
----------------
Please add i686 tests as well, and regenerate the checks with utils\update_llc_test_checks.py.


https://reviews.llvm.org/D26521




