[PATCH] D61289: [globalisel] Add G_SEXT_INREG

Daniel Sanders via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 10 14:58:18 PDT 2019


dsanders marked 2 inline comments as done.
dsanders added inline comments.


================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:698
     return Legalized;
+  case TargetOpcode::G_SEXT_INREG: {
+    if (TypeIdx != 0)
----------------
Petar.Avramovic wrote:
> dsanders wrote:
> > Petar.Avramovic wrote:
> > > NarrowScalar is good candidate for separate patch. Reason is that artifact combiner currently can not handle test that would use  G_SEXT_INREG since it will have chained artifacts.  G_SEXT will be  G_UNMERGE-d, G_UNMERGE has to wait for: 
> > >  - G_SEXT to be combined into G_SEXT_INREG,
> > >  - and then for G_SEXT_INREG to be narrowScalared into sequence of instructions that will end with G_MERGE_VALUES. 
> > > Only at this point G_UNMERGE can be combined.
> > > 
> > > 
> > There's a test for narrowScalar of G_SEXT_INREG in this patch. Could you elaborate on what kind of test you're looking for? I think you're looking for a test with neighbouring instructions which are also narrowScalar'd
> Yes. For example with this test on aarch64, store is narrowScalared:
> 
> ```
>   define void @func(i8 %x, i128* %p) {
>   entry:
>     %conv = sext i8 %x to i128
>     store i128 %conv, i128* %p
>     ret void
>   }
> ```
> Legalizer is not able to produce something like this
> ```
>     %2:_(s32) = COPY $w0
>     %1:_(p0) = COPY $x1
>     %11:_(s64) = G_CONSTANT i64 63
>     %12:_(s64) = G_SEXT_INREG %2, 8
>     %13:_(s64) = G_ASHR %12, %11(s64)
>     G_STORE %12(s64), %1(p0) :: (store 8 into %ir.p, align 16)
>     %7:_(s64) = G_CONSTANT i64 8
>     %6:_(p0) = G_GEP %1, %7(s64)
>     G_STORE %13(s64), %6(p0) :: (store 8 into %ir.p + 8, align 16)
>     RET_ReallyLR
> ```
> which I assume is a desired output.
Ah ok. The issue there is that G_STORE hasn't implemented narrowScalar yet. I don't think that we should prevent any opcode from getting a narrowScalar implementation until all opcodes have one. I think we should be implementing them as we need them and gradually build up the full set over time.


================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:708
+    // don't need to break it apart.
+    if (NarrowTy.getScalarSizeInBits() >= SizeInBits) {
+      Observer.changingInstr(MI);
----------------
Petar.Avramovic wrote:
> dsanders wrote:
> > Petar.Avramovic wrote:
> > > What is the idea behind this block, we generate two instructions with types larger then NarrowTy? G_TRUNC and G_SEXT that are generated have to be narrowScalared and then we have a few more merge/unmerge combines to do.  Also LegalizerHelper does not know how to narrow scalar G_TRUNC and G_SEXT at the moment.
> > It's not possible to eliminate types larger than NarrowTy with a single narrowScalar. narrowScalar's job is to replace this instruction with one or more that work on NarrowTy sized types and some legalization artifacts (G_SEXT and G_TRUNC in this case) which are required to maintain type correctness. Those legalization artifacts are either combined away or further legalized according to their legalization rules.
> > 
> > This block aims to optimize the case where all but one of the narrowed components is a wholly a copy of the sign bit. It expects the G_TRUNC and G_SEXT to be either combined away by the artifact combiner or further legalized. For example: narrowing `%0:_(s32) = G_SEXT %1:_(s32), 8` down to s16 only requires us to preserve the component containing bits 0-15. We can forget about bits 16-31 and reconstruct them with the G_SEXT. This will help chains of instructions to be deleted as they are dead.
> I did not consider situation when G_SEXT(G_SEXT_INREG) could combine away. 
> Currently G_SEXT can only combine into G_TRUNC.  
> The use of this G_SEXT is something legal or G_MERGE that is waiting for G_UNMERGE. 
> We know this since uses are legalized before defs (G_SEXT is def in this context).
> Yet again if use is legal, then G_SEXT_INREG was legal or something else like lower but not narrow scalar.
> Are we talking about artifact combiner, or something that is not in tree?
> Could you provide a test where G_SEXT and G_TRUNC produced  with this narrowScalar combine away?
> 
It does look like the artifact combiner is missing some combines, for the `NarrowTy.getScalarSizeInBits() < SizeInBits` case the resulting:
    %6:_(s128) = G_MERGE_VALUES %0(s64), %12(s64)
    %7:_(s32) = G_TRUNC %6(s128)
ought to be:
    %7:_(s32) = G_TRUNC %0(s64)

and for the `NarrowTy.getScalarSizeInBits() >= SizeInBits` there's a:
    %10:_(s64) = G_SEXT_INREG %9, 32
    %6:_(s128) = G_SEXT %10(s64)
    %7:_(s32) = G_TRUNC %6(s128)
which firstly ought to be:
    %10:_(s64) = G_SEXT_INREG %9, 32
    %7:_(s32) = G_TRUNC %10(s64)
and secondly:
    %10:_(s64) = G_SEXT_INREG %9, 32
    %7:_(s32) = %9(s32)
the former of those two isn't really related to this patch (the G_SEXT_INREG isn't involved in the combine) but the latter is.


================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:756
+
+    // Do the operation on each small part.
+    for (int i = 0; i < NumParts; ++i) {
----------------
Petar.Avramovic wrote:
> dsanders wrote:
> > Petar.Avramovic wrote:
> > > This loop works for `NarrowTy.getScalarSizeInBits() >= SizeInBits` as well.
> > > `NarrowTy.getScalarSizeInBits()` -> `NarrowSize`
> > > 
> > It does but it potentially generates more instructions to do it. Consider `%0:_(s32) = G_SEXT %1:_(s32), 4` narrowScalar'd to s8. This method will break it into four components and will emit one G_SEXT_INREG and one G_ASHR. The other method will emit one G_SEXT_INREG and one G_SEXT which may be eliminated entirely or at worst, lower to one G_ASHR
> Could we then add a possibility for targets to chose how they want to narrow scalar that is to have:
> narrowScalar - always creates G_UNMERGE+...+G_MERGE
> narrowScalarExt - creates G_TRUNC+...+G_{S|Z|ANY}EXT
Assuming we fix the missing artifact combines, what would be the case where G_UNMERGE/G_MERGE is the better option?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61289/new/

https://reviews.llvm.org/D61289





More information about the llvm-commits mailing list