[PATCH] D48067: [InstCombine] Replacing X86-specific rounding intrinsics with generic floor-ceil
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 12 10:27:28 PDT 2018
craig.topper added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:582
+ InstCombiner::BuilderTy &Builder) {
+ int RoundControl;
+ Intrinsic::ID IntrinsicID = II.getIntrinsicID();
----------------
Why int and getSExtValue? Feels like it should be unsigned and getZExtValue.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:586
+ IntrinsicID == Intrinsic::x86_sse41_round_sd)
+ RoundControl = cast<ConstantInt>(II.getArgOperand(2))->getSExtValue();
+ else if (IntrinsicID == Intrinsic::x86_avx512_mask_rndscale_ss ||
----------------
Not sure if we should assume the rounding mode or SAE is a constant int. The clang frontend guarantees it, but handcrafted IR tests could break it.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:619
+ Value *Zero = Constant::getNullValue(I32Ty);
+ Mask = (MaskTy == I32Ty) ? II.getArgOperand(3)
+ : Builder.CreateZExt(II.getArgOperand(3), I32Ty);
----------------
Why would MaskTy vary here? It's fixed by the intrinsic isn't it?
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:621
+ : Builder.CreateZExt(II.getArgOperand(3), I32Ty);
+ Mask = Builder.CreateAnd(Mask, One);
+ Mask = Builder.CreateICmp(ICmpInst::ICMP_NE, Mask, Zero);
----------------
There's an overload of CreateAnd that accepts a uint64_t for RHS, so you can probably just pass 1 here without creating the ConstantInt yourself.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:652
+ if (Width < 8) {
+ auto *MaskIntTy = Builder.getIntNTy(Width);
+ Mask = Builder.CreateTrunc(Mask, MaskIntTy);
----------------
For less than 8, you should cast i8 to v8i1 first then extract the subvector. We don't want i4 and i2 types floating around.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:2575
+ }
+ V = simplifyX86round(*II, Builder);
+ if (V)
----------------
Why are we calling simplifyX86Round nested under SimplifyDemandedVectorElts? Simplifying should be completely orthogonal. If a simplification happened and the round intrinsic still exists, InstCombine will revisit here and SimplifyDemandedVectorElts will return nullptr. No need to try to handle every case in one shot.
Repository:
rL LLVM
https://reviews.llvm.org/D48067
More information about the llvm-commits
mailing list