[PATCH] D50222: [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case)

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 11 08:37:02 PDT 2018


RKSimon added inline comments.


================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3812
+  bool DivisorIsEven = (K != 0);
+  const APInt &D0 = D.lshr(K);
+
----------------
APInt::lshr returns a value, not a reference, so declare D0 as a plain APInt.


================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3831
+  unsigned W = D.getBitWidth();
+  const APInt &P = D0.zext(W + 1)
+                       .multiplicativeInverse(APInt::getHighBitsSet(W + 1, 1))
----------------
APInt::zext returns a value, not a reference.


================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3836
+  // Q = floor((2^W - 1) / D0)
+  const APInt &Q = APInt::getAllOnesValue(W).udiv(D0);
+
----------------
Again, APInt::udiv returns a value, not a reference.


================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3843
+  // (mul N, P)
+  SDValue Op1 = DAG.getNode(ISD::MUL, DL, REMVT, REMNode->getOperand(0), PVal);
+
----------------
You must ensure that MUL + ROTR are legal when necessary - pass an IsAfterLegalization flag into the function and check with isOperationLegalOrCustom/isOperationLegal - see TargetLowering::BuildSDIV.
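
For readers following along, the MUL + ROTR sequence implements the usual multiplicative-inverse divisibility test. Below is a minimal standalone sketch of that technique on plain uint32_t. It is not the patch's code: the helper names (inverseMod2_32, rotr32, isDivisible) are hypothetical, the divisor is assumed to be non-zero, and the bound is computed from the full divisor D, which is the formulation I am sure of.

```cpp
#include <cstdint>

// Modular inverse of an odd d0 modulo 2^32, via Newton's iteration.
// For odd d0, d0 * d0 == 1 (mod 8), so d0 is its own inverse to 3 bits;
// each step doubles the number of correct low bits.
uint32_t inverseMod2_32(uint32_t d0) {
  uint32_t p = d0;
  for (int i = 0; i < 5; ++i)
    p *= 2u - d0 * p;
  return p;
}

// Rotate right by k bits (k is reduced modulo 32).
uint32_t rotr32(uint32_t v, unsigned k) {
  k &= 31;
  return k == 0 ? v : (v >> k) | (v << (32 - k));
}

// Returns true iff x % d == 0. Assumes d != 0.
// Write d = d0 * 2^k with d0 odd, let p be the inverse of d0 mod 2^32
// and q = floor((2^32 - 1) / d). Then x is a multiple of d exactly when
// rotr(x * p, k) <= q.
bool isDivisible(uint32_t x, uint32_t d) {
  unsigned k = 0;
  uint32_t d0 = d;
  while ((d0 & 1u) == 0) { // strip the power-of-two factor
    d0 >>= 1;
    ++k;
  }
  uint32_t p = inverseMod2_32(d0);
  uint32_t q = UINT32_MAX / d;
  return rotr32(x * p, k) <= q;
}
```

In the DAG combine, p and q would be constants computed at compile time, leaving only the multiply, the rotate (needed for even divisors only) and the unsigned compare, which is why the legality of MUL and ROTR matters here.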


================
Comment at: test/CodeGen/X86/urem-seteq-vec-nonsplat.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse2 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-SSE2
+
----------------
Run with avx2 as well (e.g. a second RUN line using -mattr=+avx2) for better test coverage.


================
Comment at: test/CodeGen/X86/urem-seteq-vec-nonsplat.ll:6
+
+define <4 x i32> @test_urem_odd_div_nonsplat(<4 x i32> %X) nounwind readnone {
+; CHECK-LABEL: test_urem_odd_div_nonsplat:
----------------
You don't need nonsplat in the test name - it's in the filename.


================
Comment at: test/CodeGen/X86/urem-seteq-vec-splat.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse2 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-SSE2
+
----------------
Run with avx2 as well (e.g. a second RUN line using -mattr=+avx2) for better test coverage.


================
Comment at: test/CodeGen/X86/urem-seteq-vec-splat.ll:27
+; Like test_urem_odd_vec_i32, but with 4 x i16 vectors.
+define <4 x i16> @test_urem_odd_vec_i16(<4 x i16> %X) nounwind readnone {
+; CHECK-LABEL: test_urem_odd_vec_i16:
----------------
Use legal types - <8 x i16> etc. rather than <4 x i16>.


Repository:
  rL LLVM

https://reviews.llvm.org/D50222




