[PATCH] D57243: GlobalISel: Allow bitcount ops to have different result type

Tue Feb 5 09:53:44 PST 2019

qcolombet added inline comments.

================
Comment at: lib/CodeGen/GlobalISel/LegalizerHelper.cpp:756
+      return Legalized;
+    }
+
----------------
dsanders wrote:
> rtereshin wrote:
> > arsenm wrote:
> > > arsenm wrote:
> > > > rtereshin wrote:
> > > > > arsenm wrote:
> > > > > > rtereshin wrote:
> > > > > > > Hi Matt @arsenm,
> > > > > > > 
> > > > > > > Looks like adding `widenScalarDst` call for type index 0 is not enough to adjust the widening to the new type constraints in full. All the code below it that now would handle type index 1 widening appears to be written under the assumption that destination and source types are the same, and it widens both, source and the destination. It was perfectly fine an assumption before, but not anymore. If the source needs to be widened, it generates incorrect code.
> > > > > > > 
> > > > > > > Here's a unit test that reproduces one of the possible issues (thanks Aditya @aditya_nandakumar for coming up with it!):
> > > > > > > 
> > > > > > > ```
> > > > > > > diff --git a/unittests/CodeGen/GlobalISel/LegalizerHelperTest.cpp b/unittests/CodeGen/GlobalISel/LegalizerHelperTest.cpp
> > > > > > > index 815f1e3d321..a5798693c08 100644
> > > > > > > --- a/unittests/CodeGen/GlobalISel/LegalizerHelperTest.cpp
> > > > > > > +++ b/unittests/CodeGen/GlobalISel/LegalizerHelperTest.cpp
> > > > > > > @@ -341,6 +341,31 @@ TEST_F(GISelMITest, WidenBitCountingCTPOP) {
> > > > > > >    EXPECT_TRUE(CheckMachineFunction(*MF, CheckStr)) << *MF;
> > > > > > >  }
> > > > > > >  
> > > > > > > +// CTPOP widening.
> > > > > > > +TEST_F(GISelMITest, WidenBitCountingCTPOP1) {
> > > > > > > +  if (!TM)
> > > > > > > +    return;
> > > > > > > +
> > > > > > > +  // Declare your legalization info
> > > > > > > +  DefineLegalizerInfo(A, {
> > > > > > > +    getActionDefinitionsBuilder(G_CTPOP).legalFor({{s16, s16}});
> > > > > > > +  });
> > > > > > > +  // Build
> > > > > > > +  // Trunc it to s8.
> > > > > > > +  LLT s8{LLT::scalar(8)};
> > > > > > > +  LLT s16{LLT::scalar(16)};
> > > > > > > +  auto MIBTrunc = B.buildTrunc(s8, Copies[0]);
> > > > > > > +  auto MIBCTPOP = B.buildInstr(TargetOpcode::G_CTPOP, {s16}, {MIBTrunc});
> > > > > > > +  AInfo Info(MF->getSubtarget());
> > > > > > > +  DummyGISelObserver Observer;
> > > > > > > +  LegalizerHelper Helper(*MF, Info, Observer, B);
> > > > > > > +  EXPECT_TRUE(Helper.widenScalar(*MIBCTPOP, 1, s16) ==
> > > > > > > +              LegalizerHelper::LegalizeResult::Legalized);
> > > > > > > +
> > > > > > > +
> > > > > > > +  MF->dump();
> > > > > > > +}
> > > > > > > +
> > > > > > >  // CTTZ_ZERO_UNDEF widening.
> > > > > > >  TEST_F(GISelMITest, WidenBitCountingCTTZ_ZERO_UNDEF) {
> > > > > > >    if (!TM)
> > > > > > > 
> > > > > > > ```
> > > > > > > 
> > > > > > > Could you please take a look at this?
> > > > > > Try r353102
> > > > > Hi Matt,
> > > > > 
> > > > > That's much better, thank you!
> > > > > 
> > > > > I feel a little bit divided though about the fact that in some cases widening one type index ends up changing the type of the other type index.
> > > > > 
> > > > > For instance, in this case widening type index 1 ends up also widening type index 0:
> > > > > ```lang=c++
> > > > > // CTLZ widening.
> > > > > TEST_F(GISelMITest, WidenBitCountingCTLZ) {
> > > > >   if (!TM)
> > > > >     return;
> > > > > 
> > > > >   // Declare your legalization info
> > > > >   DefineLegalizerInfo(A, {
> > > > >     getActionDefinitionsBuilder(G_CTLZ).legalFor({{s16, s16}});
> > > > >   });
> > > > >   // Build
> > > > >   // Trunc it to s8.
> > > > >   LLT s8{LLT::scalar(8)};
> > > > >   LLT s16{LLT::scalar(16)};
> > > > >   auto MIBTrunc = B.buildTrunc(s8, Copies[0]);
> > > > >   auto MIBCTLZ = B.buildInstr(TargetOpcode::G_CTLZ, {s8}, {MIBTrunc}); // We start with the type signature {s8, s8}
> > > > >   AInfo Info(MF->getSubtarget());
> > > > >   DummyGISelObserver Observer;
> > > > >   LegalizerHelper Helper(*MF, Info, Observer, B);
> > > > >   EXPECT_TRUE(Helper.widenScalar(*MIBCTLZ, 1, s16) ==  // We ask the API to widen type index 1 to s16, probably expecting type signature {s8, s16} as a result
> > > > >               LegalizerHelper::LegalizeResult::Legalized);
> > > > > 
> > > > >   auto CheckStr = R"(
> > > > >   CHECK: [[Trunc:%[0-9]+]]:_(s8) = G_TRUNC
> > > > >   CHECK: [[Zext:%[0-9]+]]:_(s16) = G_ZEXT [[Trunc]]
> > > > >   CHECK: [[Ctlz:%[0-9]+]]:_(s16) = G_CTLZ [[Zext]]  # The API ends up widening not just type index 1, but also type index 0 with the resulting {s16, s16} type signature
> > > > >   CHECK: [[Cst8:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
> > > > >   CHECK: [[Sub:%[0-9]+]]:_(s16) = G_SUB [[Ctlz]]:_, [[Cst8]]:_
> > > > >   CHECK: [[Trunc:%[0-9]+]]:_(s8) = G_TRUNC [[Sub]]
> > > > >   )";
> > > > > 
> > > > >   // Check
> > > > >   EXPECT_TRUE(CheckMachineFunction(*MF, CheckStr)) << *MF;
> > > > > }
> > > > > ```
> > > > > 
> > > > > In this other case widening type index 1 ends up also narrowing type index 0:
> > > > > ```lang=c++
> > > > > // Test a strange case where the result is wider than the source
> > > > > TEST_F(GISelMITest, WidenBitCountingCTPOP2) {
> > > > >   if (!TM)
> > > > >     return;
> > > > > 
> > > > >   // Declare your legalization info
> > > > >   DefineLegalizerInfo(A, {
> > > > >       getActionDefinitionsBuilder(G_CTPOP).legalFor({{s32, s16}});
> > > > >     });
> > > > > 
> > > > >   // Build
> > > > >   // Trunc it to s8.
> > > > >   LLT s8{LLT::scalar(8)};
> > > > >   LLT s16{LLT::scalar(16)};
> > > > >   LLT s32{LLT::scalar(32)};
> > > > >   auto MIBTrunc = B.buildTrunc(s8, Copies[0]);
> > > > >   auto MIBCTPOP = B.buildInstr(TargetOpcode::G_CTPOP, {s32}, {MIBTrunc}); // We start with type signature {s32, s8}
> > > > >   AInfo Info(MF->getSubtarget());
> > > > >   DummyGISelObserver Observer;
> > > > >   LegalizerHelper Helper(*MF, Info, Observer, B);
> > > > >   EXPECT_EQ(LegalizerHelper::LegalizeResult::Legalized,  // We ask the API to widen type index 1 to s16, probably expecting {s32, s16} as a result
> > > > >             Helper.widenScalar(*MIBCTPOP, 1, s16));
> > > > > 
> > > > >   auto CheckStr = R"(
> > > > >   CHECK: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC %0:_(s64)
> > > > >   CHECK: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[TRUNC]]:_(s8)
> > > > >   CHECK: [[CTPOP:%[0-9]+]]:_(s16) = G_CTPOP [[ZEXT]]  # API ends up narrowing the type index 0 in the process with {s16, s16} type signature as a result.
> > > > >   CHECK: [[COPY:%[0-9]+]]:_(s32) = G_ZEXT [[CTPOP]]:_(s16)
> > > > >   )";
> > > > > 
> > > > >   EXPECT_TRUE(CheckMachineFunction(*MF, CheckStr)) << *MF;
> > > > > }
> > > > > ```
> > > > > 
> > > > > I talked to @aditya_nandakumar, @volkan, and @qcolombet offline and it appears that we don't have any contract about what pre-defined operations like widenScalar, narrowScalar etc. are allowed to do with the type indices not included in the request, but I think it maybe worth introducing such a contract. Specifically the one that would state that it's guaranteed that these generic, pre-defined operations aren't allowed to change the types of the type indices except the one(s) explicitly included in the request.
> > > > > 
> > > > > I think not having these operations following such a contract makes their behavior quite unexpected for anyone who focuses on writing legalizations rules for their target w/o having an extensive knowledge about how exactly the implementation of the pre-defined API-provided operations looks like. In particular, I think it might make it easier for the users to accidentally introduce loops in legalization rules, and harder for them to understand where they are coming from.
> > > > > 
> > > > > Please let me know what do you think!
> > > > > 
> > > > > Thanks,
> > > > > Roman
> > > > In general I don't think it's in general possible to avoid changing other type indexes. For example, FewerElements for a select's condition type requires doing the same FewerElements for the select type. The same applies for any vector cast. The two type indexes won't match, but they still need to have the same number of elements.
> > > > 
> > > > I don't think these examples actually show this though. Note in these examples, these aren't producing the fully legalized operation. They're only doing one step, but need at least one more to legalize the operation
> > > A bigger problem I have is the order of legalization steps matters. The intermediate operations will be done in a different type if you specify the legalize actions for type index 1 before 0 vs. the other wya around. You wouldn't see the new operation with the narrower type if you legalized in the other order
> > > A bigger problem I have is the order of legalization steps matters. The intermediate operations will be done in a different type if you specify the legalize actions for type index 1 before 0 vs. the other wya around. You wouldn't see the new operation with the narrower type if you legalized in the other order
> > 
> > That's a very good point!
> I generally agree that the types should generally be modified independently. Not following that makes it fairly easy to get into infinite loops where one type is too big and one is too small. However...
> 
> There are some difficulties with such a rule though (at least with the current definitions of operations). Suppose you have:
>   %dst:(s4) = G_EXTRACT %src:(s16), %idx:s32
> and the legalizer is told to widen %dst to s32. G_EXTRACT currently expects all the result bits to originate from the input which isn't possible if the input stays fixed at s16. Arguably, G_EXTRACT should have a size operand anyway at which point we can treat it the excess bits as an any-extension.
> 
> There's also some performance costs to such a rule as it increases the number of hops needed to get from illegal to legal. I generally think targets should be allowed to leap directly to the right answer whenever possible. >From that point of view, the limitation that we can only specify one index and type as the target for a legalization step is a bit of a nuisance. Personally, I'd like to replace the LegalizationAction with a function pointer that just performs the desired change and have a standard library of actions that can be called for the common changes. This also has the nice side-benefit of removing the Custom action and the limitations that has (the opcode must always be changed to something that is Legal rather than Custom).
> For example, FewerElements for a select's condition type requires doing the same FewerElements for the select type. The same applies for any vector cast. The two type indexes won't match, but they still need to have the same number of elements.

That's a good example, but that doesn't mean we shouldn't come up with some kind of contract.
The thing I am after is a way to tell if we are creating loops in our legalization rules.

I haven't thought if this is practical, but we could imagine that the contract if we don't change the element type of the other operands. The number of elements can change but in a reasonable way (at least for the generic code, custom legalization does what it wants).

What do you think?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57243/new/

https://reviews.llvm.org/D57243