[PATCH] D91512: [AArch64][Isel] Avoid implicit zext for SIGN_EXTEND_INREG ( TRUNCATE )

guopeilin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 4 01:34:38 PST 2021


guopeilin added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.h:436
+
+  if (Opc == ISD::SIGN_EXTEND_INREG) {
+    SDValue Opr = N.getOperand(0);
----------------
guopeilin wrote:
> guopeilin wrote:
> > t.p.northover wrote:
> > > t.p.northover wrote:
> > > > t.p.northover wrote:
> > > > > guopeilin wrote:
> > > > > > t.p.northover wrote:
> > > > > > > Could we fix `isBitfieldExtractOpFromSExtInReg` to produce `SBFMWri` instead?
> > > > > > > 
> > > > > > > The other cases here seem to be nodes that we'd have to pessimize in the general case to ensure they zero the high bits on the off-chance they're needed, but for sign-extension it's just a quirk of the implementation.
> > > > > > I guess the main job of `isBitfieldExtractOpFromSExtInReg` is trying to change `Truncate + SBFMWri` into `SBFMXri + COPY`, So I think if we fix it to produce `SBFMWri`, this function may become meaningless.
> > > > > > And, I think maybe it's ok to use a pattern like `SBFMXri + COPY`. The problem is that COPY is likely to be erased after Register Coalescing. And I found a patch that modifies the function `shouleCoalesce()` (https://reviews.llvm.org/D85956) to avoid coalescing COPY in some cases. However, this solution will cause performance regression and cannot fix this case.
> > > > > > So I think the key point is how to keep the operation that `extract the lower bits` after Register Coalescing,  and I found this patch (https://reviews.llvm.org/D87771). So when we zero extend sign_exten_inreg, we need to look forward, if it comes from a `truncate`, then we should not use implicit zero extend.
> > > > > I don't think that'''s what `isBitfieldExtractOpFromSExtInReg` is trying to do. Any `COPY` it results in would be far too implicit, and as you note completely unrelable. We just can't rely on it being preserved for correctness and shouldn't try.
> > > > > 
> > > > > Instead, I see a function that tries to select the correct `SBFM` variant, but is tripped up by the truncate detection code changing `VT` under it. `N` itself produces an `i32` so `SBFMWri` is correct, but that's forgotten in the truncate case.
> > > > Oh hang on, I think I see what you mean. Let me think a bit more.
> > > OK, I'm convinced, but please add a comment explaining why this is a special case, and probably mentioning `isBitfieldExtractOpFromSExtInReg` explicitly as the source.
> > Ok, I will add this comment, and a simplifier case will be added later, please wait a bit more.
> a `isBitfieldExtractOpFromSExtInReg` related comment and a simpler case are added
If everything is ok, could you please approve this patch? Thanks a lot!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91512/new/

https://reviews.llvm.org/D91512



More information about the llvm-commits mailing list