[PATCH] D75748: [X86] Replace (most) X86ISD::SHLD/SHRD usage with ISD::FSHL/FSHR generic opcodes (PR39467)

Fri Mar 6 08:13:58 PST 2020

RKSimon created this revision.
RKSimon added reviewers: craig.topper, spatel, lebedev.ri.
Herald added a subscriber: hiraditya.
Herald added a project: LLVM.
RKSimon edited the summary of this revision.
RKSimon marked an inline comment as done.
RKSimon added inline comments.

================
Comment at: llvm/test/CodeGen/X86/clear-highbits.ll:6
 ; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,+tbm,+bmi2 < %s | FileCheck %s --check-prefixes=CHECK,X86,BMI2,X86-BMI2,FALLBACK3,X86-FALLBACK3
 ; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,-tbm,+bmi2 < %s | FileCheck %s --check-prefixes=CHECK,X86,BMI2,X86-BMI2,FALLBACK4,X86-FALLBACK4
 ; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=-bmi,-tbm,-bmi2 < %s | FileCheck %s --check-prefixes=CHECK,X64,NOBMI2,X64-NOBMI2,FALLBACK0,X64-FALLBACK0
----------------
@lebedev.ri Apart from the X86-FALLBACK0 case, can't we enable +cmov on these x86 targets? I can't think of a target that would support any BMI/TBM level without CMOV support.

For the i32 and i64 cases, X86ISD::SHLD/SHRD are close enough to ISD::FSHL/FSHR that we can use the generic opcodes directly; we just need to account for the operand commutation for SHRD.
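
For reference, a minimal sketch of the operand commutation, written as plain C++ models rather than actual DAG code. The function names (fshl32, fshr32, shld32, shrd32) are hypothetical and only illustrate the i32 semantics: SHLD takes its operands in the same order as FSHL, while SHRD needs them swapped relative to FSHR.

```cpp
#include <cassert>
#include <cstdint>

// Generic funnel shifts on i32: concatenate hi:lo, shift, keep 32 bits.
// The amount is taken modulo the bit width (32).
inline uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t amt) {
  amt &= 31;
  return amt == 0 ? hi : (hi << amt) | (lo >> (32 - amt));
}

inline uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t amt) {
  amt &= 31;
  return amt == 0 ? lo : (hi << (32 - amt)) | (lo >> amt);
}

// Simplified models of the 32-bit x86 instructions (flags ignored).
// SHLD: dst is shifted left, vacated low bits are filled from src.
inline uint32_t shld32(uint32_t dst, uint32_t src, uint32_t cnt) {
  cnt &= 31;
  return cnt == 0 ? dst : (dst << cnt) | (src >> (32 - cnt));
}

// SHRD: dst is shifted right, vacated high bits are filled from src.
inline uint32_t shrd32(uint32_t dst, uint32_t src, uint32_t cnt) {
  cnt &= 31;
  return cnt == 0 ? dst : (dst >> cnt) | (src << (32 - cnt));
}
```

Under this model, shld32(a, b, c) equals fshl32(a, b, c), whereas shrd32(a, b, c) equals fshr32(b, a, c).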

The i16 SHLD/SHRD case is annoying as the shift amount is modulo-32 (vs. modulo-16 for the generic funnel shift), so I've added X86ISD::FSHL/FSHR equivalents, which match the generic implementation in all other respects.
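
A small sketch of the amount mismatch, again as plain C++ and not the patch's code. The helper names are hypothetical. It assumes the usual ISA behaviour that a 16-bit SHLD/SHRD masks its count to 5 bits, so an amount such as 20 stays 20 on the instruction but wraps to 4 under the generic i16 funnel shift.

```cpp
#include <cassert>
#include <cstdint>

// Generic i16 funnel shift left: the amount is taken modulo 16.
inline uint16_t fshl16(uint16_t hi, uint16_t lo, uint32_t amt) {
  amt &= 15;
  if (amt == 0)
    return hi;
  // Widen to 32 bits before shifting, then truncate back to 16 bits.
  return static_cast<uint16_t>((uint32_t(hi) << amt) |
                               (uint32_t(lo) >> (16 - amt)));
}

// Effective amount seen by the generic opcode (modulo the bit width).
inline uint32_t genericAmount16(uint32_t amt) { return amt % 16; }

// Effective count seen by a 16-bit SHLD/SHRD (assumed: masked to 5 bits).
inline uint32_t x86Count16(uint32_t cnt) { return cnt % 32; }
```

For amounts in the range 16..31 the two effective amounts differ, which is why the i16 case cannot reuse the generic opcode unchanged.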

One thing I'm slightly concerned about is that ISD::FSHL/FSHR legality is controlled by the Subtarget.isSHLDSlow() feature flag. We don't normally use non-ISA features for this, but it allows the DAG combines to continue to operate after legalization.

The X86 clear-highbits.ll changes are all affected by the same issue: we now have a "FSHR(-1,-1,amt) -> ROTR(-1,amt) -> (-1)" simplification that reduces the dependencies enough for the branch fall-through code to mess up. I'm not sure how much of a patch-specific problem this is; tbh, if it weren't for the extra stack usage I wouldn't care much at all. Thoughts?
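
The simplification chain can be checked with a short C++ model (hypothetical helper names, i32 only, not the DAG combine itself): a funnel shift whose two inputs are the same value is a rotate, and rotating an all-ones value yields all-ones for any amount.

```cpp
#include <cassert>
#include <cstdint>

// Generic i32 funnel shift right: low 32 bits of (hi:lo) >> (amt % 32).
inline uint32_t funnelShiftRight32(uint32_t hi, uint32_t lo, uint32_t amt) {
  amt &= 31;
  return amt == 0 ? lo : (hi << (32 - amt)) | (lo >> amt);
}

// Rotate right by amt % 32.
inline uint32_t rotateRight32(uint32_t x, uint32_t amt) {
  amt &= 31;
  return amt == 0 ? x : (x >> amt) | (x << (32 - amt));
}

// FSHR(x, x, amt) == ROTR(x, amt); with x == -1 the result is always -1,
// so the shift amount no longer contributes to the value.
inline bool foldsToAllOnes(uint32_t amt) {
  return funnelShiftRight32(~0u, ~0u, amt) == ~0u;
}
```

Because the result is a constant, the shift amount drops out of the dependency chain, which is the reduction in dependencies described above.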

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D75748

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86ISelLowering.h
  llvm/lib/Target/X86/X86InstrCompiler.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86InstrShiftRotate.td
  llvm/test/CodeGen/X86/clear-highbits.ll
  llvm/test/CodeGen/X86/clear-lowbits.ll
  llvm/test/CodeGen/X86/extract-bits.ll
  llvm/test/CodeGen/X86/extract-lowbits.ll
  llvm/test/CodeGen/X86/fshl.ll
  llvm/test/CodeGen/X86/fshr.ll
  llvm/test/CodeGen/X86/shift-combine.ll
  llvm/test/CodeGen/X86/shift-parts.ll
