[PATCH] D50222: [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case)
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 11 08:37:02 PDT 2018
RKSimon added inline comments.
================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3812
+ bool DivisorIsEven = (K != 0);
+ const APInt &D0 = D.lshr(K);
+
----------------
APInt::lshr returns a value, not a reference; declare D0 as a plain APInt.
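A minimal sketch of the point, using a hypothetical stand-in type: Bits32 and oddPart are not LLVM names, and Bits32 only mimics a member that returns by value the way lshr does. Binding such a result to a const reference still compiles, because the temporary's lifetime is extended, but it hides the fact that a new object is created. A plain value states that directly.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a by-value big-integer type (NOT llvm::APInt).
struct Bits32 {
  uint32_t V;
  // Returns a new object by value, like the lshr call in the comment above.
  Bits32 lshr(unsigned K) const { return Bits32{V >> K}; }
};

// The call expression yields an object, not a reference.
static_assert(
    !std::is_reference<decltype(std::declval<const Bits32 &>().lshr(0))>::value,
    "lshr returns by value");

// Hypothetical helper: strips K trailing zero bits from D (K must be < 32).
uint32_t oddPart(Bits32 D, unsigned K) {
  // Preferred spelling: hold the result as a value.
  Bits32 D0 = D.lshr(K);
  // 'const Bits32 &D0 = D.lshr(K);' would also compile, but only via
  // lifetime extension of the temporary.
  return D0.V;
}
```

For example, oddPart(Bits32{24u}, 3) strips the three trailing zero bits of 24.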
================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3831
+ unsigned W = D.getBitWidth();
+ const APInt &P = D0.zext(W + 1)
+ .multiplicativeInverse(APInt::getHighBitsSet(W + 1, 1))
----------------
APInt::zext returns a value, not a reference.
================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3836
+ // Q = floor((2^W - 1) / D0)
+ const APInt &Q = APInt::getAllOnesValue(W).udiv(D0);
+
----------------
Again, this returns a value, not a reference.
================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3843
+ // (mul N, P)
+ SDValue Op1 = DAG.getNode(ISD::MUL, DL, REMVT, REMNode->getOperand(0), PVal);
+
----------------
You must ensure that MUL and ROTR are legal when necessary: pass an IsAfterLegalization flag into the function and check with isOperationLegalOrCustom/isOperationLegal. See TargetLowering::BuildSDIV.
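For readers following the thread, the MUL + ROTR sequence under discussion can be sketched on plain 32-bit integers. This is only an illustration under stated assumptions: it follows the Hacker's Delight formulation and compares against floor((2^32 - 1) / D), which is my assumption and not a statement of what the patch computes. The names isMultipleOf, inverseMod2_32 and rotr32 are hypothetical, and none of this is SelectionDAG code.

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd value modulo 2^32 (Newton iteration).
// For odd d, d*d == 1 (mod 8), so x = d is correct to 3 bits; each step
// doubles the number of correct low bits: 3 -> 6 -> 12 -> 24 -> 48.
static uint32_t inverseMod2_32(uint32_t d) {
  uint32_t x = d;
  for (int i = 0; i < 4; ++i)
    x *= 2u - d * x;
  return x;
}

// Rotate right by k bits; k == 0 is handled separately to avoid a
// shift by the full width.
static uint32_t rotr32(uint32_t v, unsigned k) {
  k &= 31u;
  return k == 0 ? v : (v >> k) | (v << (32u - k));
}

// Hypothetical helper: returns true iff x % d == 0, without a division
// that depends on x. Requires d != 0.
bool isMultipleOf(uint32_t x, uint32_t d) {
  // Split d = d0 * 2^k with d0 odd.
  unsigned k = 0;
  uint32_t d0 = d;
  while ((d0 & 1u) == 0) {
    d0 >>= 1;
    ++k;
  }
  uint32_t p = inverseMod2_32(d0); // depends only on the constant divisor
  uint32_t q = UINT32_MAX / d;     // assumption: floor((2^32 - 1) / d)
  // (mul x, p), then rotate right by k, then an unsigned compare.
  return rotr32(x * p, k) <= q;
}
```

With a constant divisor, p, k and q are all known at compile time, so only the multiply, the rotate and the compare remain at run time, which is why the legality of MUL and ROTR matters for the transform.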
================
Comment at: test/CodeGen/X86/urem-seteq-vec-nonsplat.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse2 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-SSE2
+
----------------
Run with avx2 as well for better test coverage.
================
Comment at: test/CodeGen/X86/urem-seteq-vec-nonsplat.ll:6
+
+define <4 x i32> @test_urem_odd_div_nonsplat(<4 x i32> %X) nounwind readnone {
+; CHECK-LABEL: test_urem_odd_div_nonsplat:
----------------
You don't need nonsplat in the test name - it's in the filename.
================
Comment at: test/CodeGen/X86/urem-seteq-vec-splat.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse2 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-SSE2
+
----------------
Run with avx2 as well for better test coverage.
================
Comment at: test/CodeGen/X86/urem-seteq-vec-splat.ll:27
+; Like test_urem_odd_vec_i32, but with 4 x i16 vectors.
+define <4 x i16> @test_urem_odd_vec_i16(<4 x i16> %X) nounwind readnone {
+; CHECK-LABEL: test_urem_odd_vec_i16:
----------------
Use legal types - 8 x i16 etc.
Repository:
rL LLVM
https://reviews.llvm.org/D50222