[PATCH] D45733: [DAGCombiner] Unfold scalar masked merge if profitable
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 24 13:44:39 PDT 2018
lebedev.ri added a comment.
In https://reviews.llvm.org/D45733#1077183, @lebedev.ri wrote:
> It seems this has uncovered something.
> It does not look like a miscompilation to me (FIXME or is it?), but the produced code is certainly worse:
>
> ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
> ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+bmi | FileCheck %s
>
> define float @test_andnotps_scalar(float %a0, float %a1, float* %a2) {
> ; CHECK-LABEL: test_andnotps_scalar:
> ; CHECK: # %bb.0:
> -; CHECK-NEXT: movd %xmm0, %eax
> -; CHECK-NEXT: movd %xmm1, %ecx
> -; CHECK-NEXT: andnl %ecx, %eax, %eax
> -; CHECK-NEXT: movd {{.*#+}} xmm1 = mem[0],zero,zero,zero
> -; CHECK-NEXT: notl %eax
> -; CHECK-NEXT: movd %eax, %xmm0
> +; CHECK-NEXT: movd %xmm1, %eax
> +; CHECK-NEXT: movd {{.*#+}} xmm2 = mem[0],zero,zero,zero
> ; CHECK-NEXT: pand %xmm1, %xmm0
> +; CHECK-NEXT: movd %xmm0, %ecx
> +; CHECK-NEXT: notl %eax
> +; CHECK-NEXT: orl %ecx, %eax
> +; CHECK-NEXT: movd %eax, %xmm0
> +; CHECK-NEXT: pand %xmm2, %xmm0
> ; CHECK-NEXT: retq
> %tmp = bitcast float %a0 to i32
> %tmp1 = bitcast float %a1 to i32
> %tmp2 = xor i32 %tmp, -1
> %tmp3 = and i32 %tmp2, %tmp1
> %tmp4 = load float, float* %a2, align 16
> %tmp5 = bitcast float %tmp4 to i32
> %tmp6 = xor i32 %tmp3, -1
> %tmp7 = and i32 %tmp5, %tmp6
> %tmp8 = bitcast i32 %tmp7 to float
> ret float %tmp8
> }
>
>
> We **lost** `andnl`.
> Discovered accidentally because the same happened to `@test_andnotps`/`@test_andnotpd` in `test/CodeGen/X86/*-schedule.ll` (they are no longer lowered to `andnps`/`andnpd`).
And it happened because both `xor`s have the same (constant) operand: `-1`.
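To make the shared-operand observation concrete, here is a small sketch in Python. It assumes the fold in question matches the masked-merge shape `((x ^ y) & m) ^ y` and rewrites it to `(x & m) | (y & ~m)`; that shape is inferred from the patch title and the new assembly, not quoted from the patch. The function names and `MASK32` are made up for illustration. With `y = -1` and `m = %tmp1`, the two `xor`s in the test fit that shape, and the unfolded form corresponds to the new `pand` / `notl` / `orl` sequence.

```python
MASK32 = 0xFFFFFFFF  # model i32 values as unsigned 32-bit integers; also stands in for -1


def original(a0, a1, mem):
    """Mirror of the IR in @test_andnotps_scalar."""
    tmp2 = a0 ^ MASK32      # xor i32 %tmp, -1
    tmp3 = tmp2 & a1        # and i32 %tmp2, %tmp1
    tmp6 = tmp3 ^ MASK32    # xor i32 %tmp3, -1
    return mem & tmp6       # and i32 %tmp5, %tmp6


def masked_merge(x, y, m):
    """Assumed pattern shape: ((x ^ y) & m) ^ y."""
    return (((x ^ y) & m) ^ y) & MASK32


def unfolded(x, y, m):
    """Assumed rewrite: (x & m) | (y & ~m)."""
    return ((x & m) | (y & ~m)) & MASK32


def new_codegen(a0, a1, mem):
    """What the new assembly appears to compute: pand, notl, orl, pand."""
    merged = (a0 & a1) | (~a1 & MASK32)
    return mem & merged
```

Both forms compute the same value, which is consistent with this being a code-quality regression rather than a miscompilation: `~(~a0 & a1)` and `(a0 & a1) | ~a1` both reduce to `a0 | ~a1`.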
Repository:
rL LLVM
https://reviews.llvm.org/D45733