[llvm] f0903de - [x86] Enable bypassing 64-bit division on generic x86-64
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 29 08:57:10 PDT 2020
Author: Simon Pilgrim
Date: 2020-04-29T16:55:48+01:00
New Revision: f0903de1aa7a77a1d1cf9ecdf17e0c16ca135eaa
URL: https://github.com/llvm/llvm-project/commit/f0903de1aa7a77a1d1cf9ecdf17e0c16ca135eaa
DIFF: https://github.com/llvm/llvm-project/commit/f0903de1aa7a77a1d1cf9ecdf17e0c16ca135eaa.diff
LOG: [x86] Enable bypassing 64-bit division on generic x86-64
This is currently enabled for Intel big cores from Sandy Bridge onward, as well as Atom, Silvermont, and KNL, because 64-bit division is so slow on those cores. AMD cores can do this in hardware (they use 32-bit division when the input operand width allows it), so it's not a win there. But since the majority of x86 CPUs benefit from this optimization, and since the potential upside is significantly greater than the downside, we should enable it for the generic x86-64 target.
Patch By: @atdt
Reviewed By: @craig.topper, @RKSimon
Differential Revision: https://reviews.llvm.org/D75567
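
For illustration, a minimal C sketch of the check the bypass emits (the function name and the standalone form are assumptions for this example; the real transform happens during instruction selection): when the upper 32 bits of both operands are zero, a cheap 32-bit unsigned divide produces the same result as the full 64-bit signed divide, which is what the orq/shrq/je/divl sequence in the updated test below implements.

/* Hypothetical C sketch of the slow-division bypass, not part of the patch:
 * if the top 32 bits of both operands are zero, a 32-bit unsigned divide
 * gives the same result as the 64-bit signed divide. */
#include <stdint.h>

int64_t div64_bypass(int64_t a, int64_t b) {
    if ((((uint64_t)a | (uint64_t)b) >> 32) == 0) {
        /* Fast path: corresponds to the xorl/divl block in the test diff. */
        return (int64_t)((uint32_t)a / (uint32_t)b);
    }
    /* Slow path: full-width cqto/idivq. */
    return a / b;
}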
Added:
Modified:
llvm/lib/Target/X86/X86.td
llvm/test/CodeGen/X86/bypass-slow-division-tune.ll
Removed:
################################################################################
diff --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index c7990ba5d55a..921c7793a6b2 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -1260,6 +1260,7 @@ def : ProcessorModel<"x86-64", SandyBridgeModel, [
FeatureNOPL,
Feature64Bit,
FeatureSlow3OpsLEA,
+ FeatureSlowDivide64,
FeatureSlowIncDec,
FeatureMacroFusion,
FeatureInsertVZEROUPPER
diff --git a/llvm/test/CodeGen/X86/bypass-slow-division-tune.ll b/llvm/test/CodeGen/X86/bypass-slow-division-tune.ll
index 75a00dd03a31..8369a44dcbad 100644
--- a/llvm/test/CodeGen/X86/bypass-slow-division-tune.ll
+++ b/llvm/test/CodeGen/X86/bypass-slow-division-tune.ll
@@ -66,9 +66,20 @@ define i64 @div64(i64 %a, i64 %b) {
; X64-LABEL: div64:
; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax
+; X64-NEXT: movq %rdi, %rcx
+; X64-NEXT: orq %rsi, %rcx
+; X64-NEXT: shrq $32, %rcx
+; X64-NEXT: je .LBB1_1
+; X64-NEXT: # %bb.2:
; X64-NEXT: cqto
; X64-NEXT: idivq %rsi
; X64-NEXT: retq
+; X64-NEXT: .LBB1_1:
+; X64-NEXT: # kill: def $eax killed $eax killed $rax
+; X64-NEXT: xorl %edx, %edx
+; X64-NEXT: divl %esi
+; X64-NEXT: # kill: def $eax killed $eax def $rax
+; X64-NEXT: retq
;
; SLM-LABEL: div64:
; SLM: # %bb.0: # %entry
@@ -178,9 +189,20 @@ define i64 @div64_hugews(i64 %a, i64 %b) {
; X64-LABEL: div64_hugews:
; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax
+; X64-NEXT: movq %rdi, %rcx
+; X64-NEXT: orq %rsi, %rcx
+; X64-NEXT: shrq $32, %rcx
+; X64-NEXT: je .LBB4_1
+; X64-NEXT: # %bb.2:
; X64-NEXT: cqto
; X64-NEXT: idivq %rsi
; X64-NEXT: retq
+; X64-NEXT: .LBB4_1:
+; X64-NEXT: # kill: def $eax killed $eax killed $rax
+; X64-NEXT: xorl %edx, %edx
+; X64-NEXT: divl %esi
+; X64-NEXT: # kill: def $eax killed $eax def $rax
+; X64-NEXT: retq
;
; SLM-LABEL: div64_hugews:
; SLM: # %bb.0: