[PATCH] D52570: [X86] Don't generate BMI BEXTR from X86DAGToDAGISel::matchBEXTRFromAnd

Andrea Di Biagio via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 27 06:18:11 PDT 2018


andreadb added a comment.

In https://reviews.llvm.org/D52570#1247140, @craig.topper wrote:

> Forgot to mention that BEXTR breaks the two addressness of the shift+and pattern to avoid a copy which can also be beneficial. Unfortunately isel can't generally determine that a copy will be needed.
>
> I agree, I don't really want to add a new fast/slow flag either.


As Simon already wrote, BEXTR is very fast on AMD.
The throughput reported by llvm-mca matches what I see with perf.

What if at ISel stage we just select BEXTR, and then we have a later (peephole) pass that expands it based on the subtarget and properties of the machine instruction?

Basically, for BEXTR we could do something similar to what is done for LEA. We could have a "fixup" pass (or custom patterns matched by the MachineCombiner) to expand BEXTR when the shift-and sequence is more convenient. This would still require a subtarget target hook.

If the decision only depends on the subtarget, and (maybe) properties of the instructions, then we could have tablegen automatically expand a TargetSubtarget hook for us.

Something like this (just an example...):

  Index: lib/Target/X86/X86SchedPredicates.td
  ===================================================================
  --- lib/Target/X86/X86SchedPredicates.td        (revision 343198)
  +++ lib/Target/X86/X86SchedPredicates.td        (working copy)
  @@ -54,3 +54,5 @@
  // X86GenInstrInfo.
  def IsThreeOperandsLEAFn :
      TIIPredicate<"isThreeOperandsLEA", IsThreeOperandsLEABody>;
  +
  +def ShouldExpandBEXTRDecl : STIPredicateDecl<"shouldExpandBEXTR", TruePred, /* overrides */ 0, /* expandForMC */ 0>;
  Index: lib/Target/X86/X86ScheduleBtVer2.td
  ===================================================================
  --- lib/Target/X86/X86ScheduleBtVer2.td (revision 343198)
  +++ lib/Target/X86/X86ScheduleBtVer2.td (working copy)
  @@ -772,4 +772,9 @@
    ], ZeroIdiomPredicate>
  ]>;
  
  +def : STIPredicate<
  +  ShouldExpandBEXTRDecl,
  +  [ InstructionEquivalenceClass<[BEXTR32rr, BEXTR64rr], FalsePred> ]
  +>;
  +
  } // SchedModel

..that generates this:

  bool X86GenSubtargetInfo::shouldExpandBEXTR(const MachineInstr *MI) const {
    unsigned ProcessorID = getSchedModel().getProcessorID();
    switch(MI->getOpcode()) {
    default:
      break;
    case X86::BEXTR32rr:
    case X86::BEXTR64rr:
      if (ProcessorID == 4) {
        return false;
      }
      break;
    }
  
    return true;
  } // X86GenSubtargetInfo::shouldExpandBEXTR

As I wrote. This is just an idea...

Not sure if that helps.
-Andrea


https://reviews.llvm.org/D52570





More information about the llvm-commits mailing list