[llvm] aa0dcb3 - [X86] AMD Zen 3: same-reg SSE XMM XORPS is a 1-cycle(!) dep-breaking one-idiom
Roman Lebedev via llvm-commits
llvm-commits at lists.llvm.org
Thu May 13 14:07:03 PDT 2021
On Fri, May 14, 2021 at 12:04 AM Roman Lebedev via llvm-commits
<llvm-commits at lists.llvm.org> wrote:
>
>
> Author: Roman Lebedev
> Date: 2021-05-14T00:03:36+03:00
> New Revision: aa0dcb3ba4b93e4499208def080ced98f3a89ad5
>
> URL: https://github.com/llvm/llvm-project/commit/aa0dcb3ba4b93e4499208def080ced98f3a89ad5
> DIFF: https://github.com/llvm/llvm-project/commit/aa0dcb3ba4b93e4499208def080ced98f3a89ad5.diff
>
> LOG: [X86] AMD Zen 3: same-reg SSE XMM XORPS is a 1-cycle(!) dep-breaking one-idiom
That of course should have been "zero-idiom", not "one-idiom".
> While both the SOG and Agner insist that it is zero-cycle,
> I cannot confirm that claim. While it clearly breaks the dependency,
> I cannot come up with a snippet, or a measurement approach,
> that ends up with an IPC greater than 4, which, to me, means that it
> actually consumes an execution resource of an FP unit for a cycle.
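
As a rough sanity check of the IPC figures in the test diff below, here is a minimal sketch that recomputes IPC from the "Instructions" and "Total Cycles" values in the CHECK lines. The helper name `ipc` and the constant names are illustrative only and are not part of llvm-mca or of this commit; the numbers are copied from the diff.

```python
# Illustrative recomputation of the llvm-mca summary numbers quoted in the
# zero-idioms-sse-xmm.s test diff. Names here are hypothetical helpers,
# not anything provided by llvm-mca.

def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle, as reported in the llvm-mca summary view."""
    return instructions / cycles

INSTRUCTIONS = 20000      # "Instructions: 20000" (10000 iterations x 2 xorps)
CYCLES_BEFORE = 20003     # "Total Cycles" before the change
CYCLES_AFTER = 5004       # "Total Cycles" after the change

ipc_before = ipc(INSTRUCTIONS, CYCLES_BEFORE)
ipc_after = ipc(INSTRUCTIONS, CYCLES_AFTER)

print(f"IPC before: {ipc_before:.2f}")
print(f"IPC after:  {ipc_after:.2f}")
```

The rounded results agree with the "IPC: 1.00" and "IPC: 4.00" CHECK lines. A genuinely zero-cycle idiom that used no FP execution resource would be expected to let the modelled IPC rise above 4, which is the ceiling the log message describes.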
>
> Added:
>
>
> Modified:
> llvm/lib/Target/X86/X86ScheduleZnver3.td
> llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
>
> Removed:
>
>
>
> ################################################################################
> diff --git a/llvm/lib/Target/X86/X86ScheduleZnver3.td b/llvm/lib/Target/X86/X86ScheduleZnver3.td
> index 571aedf15d4c8..82233b6dda97a 100644
> --- a/llvm/lib/Target/X86/X86ScheduleZnver3.td
> +++ b/llvm/lib/Target/X86/X86ScheduleZnver3.td
> @@ -1536,6 +1536,9 @@ def : IsZeroIdiomFunction<[
> XOR64rr, XOR64rr_REV,
> SUB32rr, SUB32rr_REV,
> SUB64rr, SUB64rr_REV ], ZeroIdiomPredicate>,
> +
> + // SSE XMM Zero-idioms.
> + DepBreakingClass<[ XORPSrr ], ZeroIdiomPredicate>,
> ]>;
>
> def : IsDepBreakingFunction<[
>
> diff --git a/llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s b/llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
> index a7b848bd3a921..3eae26fdcab7e 100644
> --- a/llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
> +++ b/llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
> @@ -10,12 +10,12 @@ xorps %xmm0, %xmm1
>
> # CHECK: Iterations: 10000
> # CHECK-NEXT: Instructions: 20000
> -# CHECK-NEXT: Total Cycles: 20003
> +# CHECK-NEXT: Total Cycles: 5004
> # CHECK-NEXT: Total uOps: 20000
>
> # CHECK: Dispatch Width: 6
> -# CHECK-NEXT: uOps Per Cycle: 1.00
> -# CHECK-NEXT: IPC: 1.00
> +# CHECK-NEXT: uOps Per Cycle: 4.00
> +# CHECK-NEXT: IPC: 4.00
> # CHECK-NEXT: Block RThroughput: 0.5
>
> # CHECK: Instruction Info:
> @@ -31,13 +31,13 @@ xorps %xmm0, %xmm1
> # CHECK-NEXT: 1 1 0.25 xorps %xmm0, %xmm1
>
> # CHECK: Register File statistics:
> -# CHECK-NEXT: Total number of mappings created: 20000
> -# CHECK-NEXT: Max number of mappings used: 66
> +# CHECK-NEXT: Total number of mappings created: 10000
> +# CHECK-NEXT: Max number of mappings used: 37
>
> # CHECK: * Register File #1 -- Zn3FpPRF:
> # CHECK-NEXT: Number of physical registers: 160
> -# CHECK-NEXT: Total number of mappings created: 20000
> -# CHECK-NEXT: Max number of mappings used: 66
> +# CHECK-NEXT: Total number of mappings created: 10000
> +# CHECK-NEXT: Max number of mappings used: 37
>
> # CHECK: * Register File #2 -- Zn3IntegerPRF:
> # CHECK-NEXT: Number of physical registers: 192
> @@ -75,16 +75,16 @@ xorps %xmm0, %xmm1
>
> # CHECK: Resource pressure by instruction:
> # CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12.0] [12.1] [13] [14.0] [14.1] [14.2] [15.0] [15.1] [15.2] [16.0] [16.1] Instructions:
> -# CHECK-NEXT: - - - - - - - - - 0.50 - 0.50 - - - - - - - - - - - xorps %xmm1, %xmm1
> -# CHECK-NEXT: - - - - - - - - 0.50 - 0.50 - - - - - - - - - - - - xorps %xmm0, %xmm1
> +# CHECK-NEXT: - - - - - - - - - 0.50 0.25 0.25 - - - - - - - - - - - xorps %xmm1, %xmm1
> +# CHECK-NEXT: - - - - - - - - 0.50 - 0.25 0.25 - - - - - - - - - - - xorps %xmm0, %xmm1
>
> # CHECK: Timeline view:
> -# CHECK-NEXT: Index 0123456
> +# CHECK-NEXT: Index 01234
>
> -# CHECK: [0,0] DeER .. xorps %xmm1, %xmm1
> -# CHECK-NEXT: [0,1] D=eER.. xorps %xmm0, %xmm1
> -# CHECK-NEXT: [1,0] D==eER. xorps %xmm1, %xmm1
> -# CHECK-NEXT: [1,1] D===eER xorps %xmm0, %xmm1
> +# CHECK: [0,0] DeER. xorps %xmm1, %xmm1
> +# CHECK-NEXT: [0,1] D=eER xorps %xmm0, %xmm1
> +# CHECK-NEXT: [1,0] DeE-R xorps %xmm1, %xmm1
> +# CHECK-NEXT: [1,1] D=eER xorps %xmm0, %xmm1
>
> # CHECK: Average Wait times (based on the timeline view):
> # CHECK-NEXT: [0]: Executions
> @@ -93,6 +93,6 @@ xorps %xmm0, %xmm1
> # CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
>
> # CHECK: [0] [1] [2] [3]
> -# CHECK-NEXT: 0. 2 2.0 0.5 0.0 xorps %xmm1, %xmm1
> -# CHECK-NEXT: 1. 2 3.0 0.0 0.0 xorps %xmm0, %xmm1
> -# CHECK-NEXT: 2 2.5 0.3 0.0 <total>
> +# CHECK-NEXT: 0. 2 1.0 1.0 0.5 xorps %xmm1, %xmm1
> +# CHECK-NEXT: 1. 2 2.0 0.0 0.0 xorps %xmm0, %xmm1
> +# CHECK-NEXT: 2 1.5 0.5 0.3 <total>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits