[PATCH] D59758: [DAGCombiner] Combine OR as ADD when no common bits are set

Bjorn Pettersson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 28 06:14:09 PDT 2019


bjope added a comment.

In D59758#1444441 <https://reviews.llvm.org/D59758#1444441>, @kparzysz wrote:

> The Hexagon testcase can be fixed---it's probably just a matter of changing the selection pattern for the instruction we're checking.


I've done some more analysis for llvm/test/CodeGen/Hexagon/subi-asl.ll.

The first DAG combine that makes a difference is that with this patch we fold

      t30: i32 = shl t11, Constant:i32<1>
    t31: i32 = sub Constant:i32<0>, t30
  t28: i32 = or t9, t31

into

    t30: i32 = shl t11, Constant:i32<1>
  t32: i32 = sub t9, t30

IMO that looks like a nice fold.

A few combines later we end up with the following before selection:

  Address tree balanced selection DAG:SelectionDAG has 27 nodes:
    t0: ch = EntryToken
      t44: i32 = HexagonISD::CONST32_GP TargetGlobalAddress:i32<i32* @this_insn_number> 0
    t11: i32,ch = load<(dereferenceable load 4 from @this_insn_number)> t0, t44, undef:i32
      t2: i32,ch = CopyFromReg t0, Register:i32 %1
    t41: i32,ch = load<(load 2 from %ir.cgep59, align 4), sext from i16> t0, t2, undef:i32
    t30: i32 = shl t11, Constant:i32<1>
      t18: ch = TokenFactor t41:1, t11:1
      t17: i32,ch = CopyFromReg t0, Register:i32 %0
    t23: ch,glue = CopyToReg t18, Register:i32 $r0, t17
          t37: i1 = setcc t41, Constant:i32<56>, seteq:ch
          t48: i32 = sub Constant:i32<1>, t30
          t47: i32 = sub Constant:i32<0>, t30
        t49: i32 = select t37, t48, t47
      t25: ch,glue = CopyToReg t23, Register:i32 $r1, t49, t23:1
    t27: ch,glue = HexagonISD::TC_RETURN t25, TargetGlobalAddress:i32<void (%struct.rtx_def*, i32)* @reg_is_born> 0, Register:i32 $r0, Register:i32 $r1, RegisterMask:Untyped

and since there now is two uses of t30 the patterns for selecting subi_asl won't trigger (they check that there only is one use of the shl).

Without the patch we instead would get

  Address tree balanced selection DAG:SelectionDAG has 27 nodes:
    t0: ch = EntryToken
      t40: i32 = HexagonISD::CONST32_GP TargetGlobalAddress:i32<i32* @this_insn_number> 0
    t11: i32,ch = load<(dereferenceable load 4 from @this_insn_number)> t0, t40, undef:i32
      t2: i32,ch = CopyFromReg t0, Register:i32 %1
    t38: i32,ch = load<(load 2 from %ir.cgep59, align 4), sext from i16> t0, t2, undef:i32
      t30: i32 = shl t11, Constant:i32<1>
    t31: i32 = sub Constant:i32<0>, t30
      t18: ch = TokenFactor t38:1, t11:1
      t17: i32,ch = CopyFromReg t0, Register:i32 %0
    t23: ch,glue = CopyToReg t18, Register:i32 $r0, t17
          t35: i1 = setcc t38, Constant:i32<56>, seteq:ch
          t41: i32 = or t31, Constant:i32<1>
        t42: i32 = select t35, t41, t31
      t25: ch,glue = CopyToReg t23, Register:i32 $r1, t42, t23:1
    t27: ch,glue = HexagonISD::TC_RETURN t25, TargetGlobalAddress:i32<void (%struct.rtx_def*, i32)* @reg_is_born> 0, Register:i32 $r0, Register:i32 $r1, RegisterMask:Untyped

So the height of the DAG (looking at the operands for the select) seem to be one less with this patch `(select ... (sub (shl)) ...)` instead of `(select ... (or (sub (shl ..))) ...)` .

If you think that the old codegen (using subi_asl) actually is superior here, then we might need a pattern doing

  (sub 1, (shl x, c)) => (setbit_i (subi_asl_ri 0, x, c), 0)

But I guess with a predicate to only do it if the shl has multiple uses, because otherwise

  (sub 1, (shl x, c)) => (subi_asl_ri 1, x, c)

would be better.
I actually think that we need even more predicates to distinguish when the setbit_i solution is better, because it does not look optimal (at least not with my limited knowledge about Hexagon).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59758/new/

https://reviews.llvm.org/D59758





More information about the llvm-commits mailing list