[PATCH] D96283: [DAGCombine] Do not remove masking argument to FP16_TO_FP for some targets

Mon Feb 8 11:11:50 PST 2021

nemanjai created this revision.
nemanjai added reviewers: vchuravy, spatel, PowerPC, olista01, t.p.northover.
Herald added subscribers: steven.zhang, kbarton, hiraditya, kristof.beyls.
nemanjai requested review of this revision.
Herald added a project: LLVM.

As of commit 284f2bffc9bc5, the DAG Combiner gets rid of the masking of the input to this node if the mask only keeps the bottom 16 bits. This is because the underlying library function does not use the high order bits. However, on PowerPC's ELFv2 ABI, it is the caller that is responsible for clearing the bits from the register. Therefore, the library implementation of `__gnu_h2f_ieee` will return an incorrect result if the bits aren't cleared.

This combine is desired for ARM (and possibly other targets) so this patch adds a query to Target Lowering to check if this zeroing needs to be kept.

I am hoping to be able to get this reviewed, updated if needed and merge the fix into 12.0.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=49092


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D96283

Files:
  llvm/include/llvm/CodeGen/TargetLowering.h
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/lib/Target/PowerPC/PPCISelLowering.h
  llvm/test/CodeGen/PowerPC/pr49092.ll


Index: llvm/test/CodeGen/PowerPC/pr49092.ll
===================================================================

--- /dev/null
+++ llvm/test/CodeGen/PowerPC/pr49092.ll
@@ -0,0 +1,39 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr8 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
+; RUN:   -mtriple=powerpc64le-unknown-unknown < %s | FileCheck %s
+; RUN: llc -mcpu=pwr9 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
+; RUN:   -mtriple=powerpc64le-unknown-unknown < %s | FileCheck %s \
+; RUN:   -check-prefix=CHECK-P9
+
+define dso_local half @test2(i64 %a, i64 %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test2:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    mflr r0
+; CHECK-NEXT:    std r0, 16(r1)
+; CHECK-NEXT:    stdu r1, -32(r1)
+; CHECK-NEXT:    add r3, r4, r3
+; CHECK-NEXT:    addi r3, r3, 11
+; CHECK-NEXT:    clrlwi r3, r3, 16
+; CHECK-NEXT:    bl __gnu_h2f_ieee
+; CHECK-NEXT:    nop
+; CHECK-NEXT:    addi r1, r1, 32
+; CHECK-NEXT:    ld r0, 16(r1)
+; CHECK-NEXT:    mtlr r0
+; CHECK-NEXT:    blr
+;
+; CHECK-P9-LABEL: test2:
+; CHECK-P9:       # %bb.0: # %entry
+; CHECK-P9-NEXT:    add r3, r4, r3
+; CHECK-P9-NEXT:    addi r3, r3, 11
+; CHECK-P9-NEXT:    clrlwi r3, r3, 16
+; CHECK-P9-NEXT:    mtfprwz f0, r3
+; CHECK-P9-NEXT:    xscvhpdp f1, f0
+; CHECK-P9-NEXT:    blr
+entry:
+  %add = add i64 %b, %a
+  %0 = trunc i64 %add to i16
+  %conv = add i16 %0, 11
+  %call = bitcast i16 %conv to half
+  ret half %call
+}
+attributes #0 = { nounwind }
Index: llvm/lib/Target/PowerPC/PPCISelLowering.h
===================================================================
--- llvm/lib/Target/PowerPC/PPCISelLowering.h
+++ llvm/lib/Target/PowerPC/PPCISelLowering.h
@@ -987,6 +987,11 @@
     shouldExpandBuildVectorWithShuffles(EVT VT,
                                         unsigned DefinedValues) const override;
 
+    // Keep the zero-extensions for arguments to libcalls.
+    bool shouldKeepZExtForLibCall() const override {
+      return true;
+    }
+
     /// createFastISel - This method returns a target-specific FastISel object,
     /// or null if the target does not support "fast" instruction selection.
     FastISel *createFastISel(FunctionLoweringInfo &FuncInfo,
Index: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -21174,7 +21174,7 @@
   SDValue N0 = N->getOperand(0);
 
   // fold fp16_to_fp(op & 0xffff) -> fp16_to_fp(op)
-  if (N0->getOpcode() == ISD::AND) {
+  if (!TLI.shouldKeepZExtForLibCall() && N0->getOpcode() == ISD::AND) {
     ConstantSDNode *AndConst = getAsNonOpaqueConstant(N0.getOperand(1));
     if (AndConst && AndConst->getAPIntValue() == 0xffff) {
       return DAG.getNode(ISD::FP16_TO_FP, SDLoc(N), N->getValueType(0),
Index: llvm/include/llvm/CodeGen/TargetLowering.h
===================================================================
--- llvm/include/llvm/CodeGen/TargetLowering.h
+++ llvm/include/llvm/CodeGen/TargetLowering.h
@@ -2789,6 +2789,12 @@
     return false;
   }
 
+  /// Does this target tolerate non-zero high-order bits in a register passed
+  /// to a library function.
+  virtual bool shouldKeepZExtForLibCall() const {
+    return false;
+  }
+
   //===--------------------------------------------------------------------===//
   // Runtime Library hooks
   //


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D96283.322172.patch
Type: text/x-patch
Size: 3462 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210208/d33c394c/attachment.bin>