[PATCH] D80683: [SelectionDAG] Fix up SimplifyDemandedBits for ppcf128
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 26 05:32:55 PDT 2020
RKSimon added a comment.
please pre-commit the pr45475.ll test case against trunk to show the current codegen and rebase the patch to show the codegen delta
================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:855
+ !TLO.BitsRotatedForPPCF128) {
+ TLO.BitsRotatedForPPCF128 = true;
+ DemandedBits = DemandedBits.rotl(64);
----------------
nemanjai wrote:
> RKSimon wrote:
> > This is going to have the issue that it doesn't reset the flag to its old state when you leave the function, and TLO by design needs to be passed by reference.
> >
> > Initially we might just have to bailout if we encounter MVT::ppcf128 until we can come up with a better solution?
> I took a look at the calls to this function and was fairly certain that every time we do a non-recursive call to this, we will create a new `TLO` (which would set the flag to its initial state of `false`). However, maybe that is not the case (or not guaranteed to be the case in the future).
>
> My initial thoughts on this were to reset the flag to `false` upon entering recursion (i.e. `Depth == 0`) but I ended up not adding that for the above reason.
>
> I believe that if I add something like the following above this condition:
> ```
> if (Depth == 0)
> TLO.BitsRotatedForPPCF128 = false;
> ```
> we will achieve the desired result. Leaving the bit set after leaving the function isn't really an issue - as long as it is reset when we enter the recursive call chain.
>
> Conversely, I can add another parameter to the function with a default value of `false` to signal that we have encountered a `ppcf128` and rotated the demanded bits. But I would favour the first approach I outlined above as it does not further complicate the API here.
I'm still worried that we're not correctly resetting the flag when we return from deeper SimplifyDemandedBits calls (at Depth > 0 - not a new instance) - if we then have another f128 node that we attempt to recurse into as part of the same SimplifyDemandedBits call tree then the demandedbits that time won't get flipped - e.g. select (i1, f128, f128)?
================
Comment at: llvm/test/CodeGen/PowerPC/pr45475.ll:42
+}
+define dso_local signext i32 @testMultNodes(ppc_fp128 %a, ppc_fp128 %b, i32 signext %c, i32 signext %d) {
+; CHECK-LABEL: testMultNodes:
----------------
do you need the dso_local ?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D80683/new/
https://reviews.llvm.org/D80683
More information about the llvm-commits
mailing list