[PATCH] D103427: [X86] Fix handling of maskmovdqu in X32

Sat Mar 26 21:02:57 PDT 2022

skan added inline comments.

================
Comment at: llvm/lib/Target/X86/X86InstrSSE.td:4032-4039
+           AdSize64;
+let Uses = [EDI], Predicates = [UseSSE2,In64BitMode] in
+def MASKMOVDQUX32 : PDI<0xF7, MRMSrcReg, (outs), (ins VR128:$src, VR128:$mask),
+           "addr32 maskmovdqu\t{$mask, $src|$src, $mask}",
+           [(int_x86_sse2_maskmov_dqu VR128:$src, VR128:$mask, EDI)]>,
+           AdSize32 {
+  let AsmVariantName = "NonParsable";
----------------
hvdijk wrote:
> skan wrote:
> > hvdijk wrote:
> > > skan wrote:
> > > > hvdijk wrote:
> > > > > hvdijk wrote:
> > > > > > craig.topper wrote:
> > > > > > > skan wrote:
> > > > > > > > craig.topper wrote:
> > > > > > > > > skan wrote:
> > > > > > > > > > craig.topper wrote:
> > > > > > > > > > > skan wrote:
> > > > > > > > > > > > BTW, why do we need add an instruction like `MASKMOVDQUX32` when we already have  `MASKMOVDQU`? They have the same encoding/decoding except a address-size prefix. We can definitly encode  a `0x67` during encoding and and print a "addr32" during decoding according to the mode w/o adding any new intrustion.  If removing `Not64BitMode` of `MASKMOVDQU` may cause a ISEL issue, we should fix it in ISEL.
> > > > > > > > > > > The address register is implicit in the instruction. Our addr32 emission is normally based on the register class of the address registers. But we don't have that here. Are you suggesting to hardcode a special case for MASKMOVDQU in the encoder?
> > > > > > > > > > Harcode is one of the solution. Another solution is that we can add `X86::IP_HAS_AD_SIZE` to `Flags` of `MCInst` when translating a `MachineInstr` to a `MCInst`, so that a 0x67 will be emitted. `MASKMOVDQU` accesses memory and has implicit use `EDI`,  and we can get such information from a `MachineInstr`.
> > > > > > > > > The implicit def will always be RDI because it's part of the tablegen Uses. I think. So SelectionDAG will always create the MachineInstr with it.
> > > > > > > > Do you mean MASKMOVDQU use `RDI` rather than `EDI` in MachineInstr?
> > > > > > > Oh nevermind, I should have looked more carefully. I didn't realize we had 3 instructions and you want to reduce to 2.
> > > > > > > We can definitly encode a `0x67` during encoding and and print a "addr32" during decoding according to the mode w/o adding any new intrustion.
> > > > > > 
> > > > > > The 0x67 during encoding is handled automatically if we mark the instruction AdSize32, but we also need to print the addr32 when printing in text form, that's what I had trouble with originally. However, trying again to reuse the existing `MASKMOVDQU` and `VMASKMOVDQU`, we can actually get that working with something along these lines instead:
> > > > > > 
> > > > > > ```
> > > > > > --- a/llvm/lib/Target/X86/MCTargetDesc/X86ATTInstPrinter.cpp
> > > > > > +++ b/llvm/lib/Target/X86/MCTargetDesc/X86ATTInstPrinter.cpp
> > > > > > @@ -69,8 +69,12 @@ void X86ATTInstPrinter::printInst(const MCInst *MI, uint64_t Address,
> > > > > >      OS << "\tdata32";
> > > > > >    }
> > > > > >    // Try to print any aliases first.
> > > > > > -  else if (!printAliasInstr(MI, Address, OS) && !printVecCompareInstr(MI, OS))
> > > > > > +  else if (!printAliasInstr(MI, Address, OS) && !printVecCompareInstr(MI, OS)) {
> > > > > > +    if ((MI->getOpcode() == X86::MASKMOVDQU || MI->getOpcode() == X86::VMASKMOVDQU) &&
> > > > > > +        STI.getFeatureBits()[X86::Is64Bit])
> > > > > > +      OS << "\taddr32\n";
> > > > > >      printInstruction(MI, Address, OS);
> > > > > > +  }
> > > > > >  
> > > > > >    // Next always print the annotation.
> > > > > >    printAnnotation(OS, Annot);
> > > > > > ```
> > > > > > 
> > > > > > Is that about what you had in mind too? I'll continue with this approach and see if it passes your and my tests.
> > > > > That should be handled in `printInstFlags` instead, which already checks whether `addr32` needs to be printed but does not handle this case. The important part of the question, however, is whether this should be a special hardcoded exception for (V)MASKMOVDQU, or whether there should be something to automatically detect the need for the prefix.
> > > > Exactly not... Let me propose a patch to illustrate that.
> > > Looking at your D122537: so they *do* need to be special cased, it's just that you moved the special casing into `X86MCInstLower::Lower`. I had already come up with that independently as well while cleaning up my own special casing. I have a little bit more than what you put in D122537 though, let me check whether that is still needed.
> > No. The two functions are in different stages.  The decoding for  `maskmovdqu/vmaskmovdqu` is already correct if we revert this change, no more fix is needed.  `printInstFlags` or `printInst` is used mostly for disassembler,  so we do need to touch it.  `X86MCInstLower::Lower` is used to tranlate MachineInstr to MCInst, which is used to add address-size prefix when we lowering from MIR.
> So, yes. You're saying exactly what I'm saying. I don't have a change in `printInstFlags` or `printInst` any longer either because like I said, I moved the special casing to `X86MCInstLower::Lower` exactly like you did.
> No. The two functions are in different stages.  The decoding for  `maskmovdqu/vmaskmovdqu` is already correct if we revert this change, no more fix is needed.  `printInstFlags` or `printInst` is used mostly for disassembler,  so we do need to touch it.  `X86MCInstLower::Lower` is used to tranlate MachineInstr to MCInst, which is used to add address-size prefix when we lowering from MIR.

so we do need to touch it -> so we don't need to touch it

================
Comment at: llvm/lib/Target/X86/X86InstrSSE.td:4032-4039
+           AdSize64;
+let Uses = [EDI], Predicates = [UseSSE2,In64BitMode] in
+def MASKMOVDQUX32 : PDI<0xF7, MRMSrcReg, (outs), (ins VR128:$src, VR128:$mask),
+           "addr32 maskmovdqu\t{$mask, $src|$src, $mask}",
+           [(int_x86_sse2_maskmov_dqu VR128:$src, VR128:$mask, EDI)]>,
+           AdSize32 {
+  let AsmVariantName = "NonParsable";
----------------
skan wrote:
> hvdijk wrote:
> > skan wrote:
> > > hvdijk wrote:
> > > > skan wrote:
> > > > > hvdijk wrote:
> > > > > > hvdijk wrote:
> > > > > > > craig.topper wrote:
> > > > > > > > skan wrote:
> > > > > > > > > craig.topper wrote:
> > > > > > > > > > skan wrote:
> > > > > > > > > > > craig.topper wrote:
> > > > > > > > > > > > skan wrote:
> > > > > > > > > > > > > BTW, why do we need add an instruction like `MASKMOVDQUX32` when we already have  `MASKMOVDQU`? They have the same encoding/decoding except a address-size prefix. We can definitly encode  a `0x67` during encoding and and print a "addr32" during decoding according to the mode w/o adding any new intrustion.  If removing `Not64BitMode` of `MASKMOVDQU` may cause a ISEL issue, we should fix it in ISEL.
> > > > > > > > > > > > The address register is implicit in the instruction. Our addr32 emission is normally based on the register class of the address registers. But we don't have that here. Are you suggesting to hardcode a special case for MASKMOVDQU in the encoder?
> > > > > > > > > > > Harcode is one of the solution. Another solution is that we can add `X86::IP_HAS_AD_SIZE` to `Flags` of `MCInst` when translating a `MachineInstr` to a `MCInst`, so that a 0x67 will be emitted. `MASKMOVDQU` accesses memory and has implicit use `EDI`,  and we can get such information from a `MachineInstr`.
> > > > > > > > > > The implicit def will always be RDI because it's part of the tablegen Uses. I think. So SelectionDAG will always create the MachineInstr with it.
> > > > > > > > > Do you mean MASKMOVDQU use `RDI` rather than `EDI` in MachineInstr?
> > > > > > > > Oh nevermind, I should have looked more carefully. I didn't realize we had 3 instructions and you want to reduce to 2.
> > > > > > > > We can definitly encode a `0x67` during encoding and and print a "addr32" during decoding according to the mode w/o adding any new intrustion.
> > > > > > > 
> > > > > > > The 0x67 during encoding is handled automatically if we mark the instruction AdSize32, but we also need to print the addr32 when printing in text form, that's what I had trouble with originally. However, trying again to reuse the existing `MASKMOVDQU` and `VMASKMOVDQU`, we can actually get that working with something along these lines instead:
> > > > > > > 
> > > > > > > ```
> > > > > > > --- a/llvm/lib/Target/X86/MCTargetDesc/X86ATTInstPrinter.cpp
> > > > > > > +++ b/llvm/lib/Target/X86/MCTargetDesc/X86ATTInstPrinter.cpp
> > > > > > > @@ -69,8 +69,12 @@ void X86ATTInstPrinter::printInst(const MCInst *MI, uint64_t Address,
> > > > > > >      OS << "\tdata32";
> > > > > > >    }
> > > > > > >    // Try to print any aliases first.
> > > > > > > -  else if (!printAliasInstr(MI, Address, OS) && !printVecCompareInstr(MI, OS))
> > > > > > > +  else if (!printAliasInstr(MI, Address, OS) && !printVecCompareInstr(MI, OS)) {
> > > > > > > +    if ((MI->getOpcode() == X86::MASKMOVDQU || MI->getOpcode() == X86::VMASKMOVDQU) &&
> > > > > > > +        STI.getFeatureBits()[X86::Is64Bit])
> > > > > > > +      OS << "\taddr32\n";
> > > > > > >      printInstruction(MI, Address, OS);
> > > > > > > +  }
> > > > > > >  
> > > > > > >    // Next always print the annotation.
> > > > > > >    printAnnotation(OS, Annot);
> > > > > > > ```
> > > > > > > 
> > > > > > > Is that about what you had in mind too? I'll continue with this approach and see if it passes your and my tests.
> > > > > > That should be handled in `printInstFlags` instead, which already checks whether `addr32` needs to be printed but does not handle this case. The important part of the question, however, is whether this should be a special hardcoded exception for (V)MASKMOVDQU, or whether there should be something to automatically detect the need for the prefix.
> > > > > Exactly not... Let me propose a patch to illustrate that.
> > > > Looking at your D122537: so they *do* need to be special cased, it's just that you moved the special casing into `X86MCInstLower::Lower`. I had already come up with that independently as well while cleaning up my own special casing. I have a little bit more than what you put in D122537 though, let me check whether that is still needed.
> > > No. The two functions are in different stages.  The decoding for  `maskmovdqu/vmaskmovdqu` is already correct if we revert this change, no more fix is needed.  `printInstFlags` or `printInst` is used mostly for disassembler,  so we do need to touch it.  `X86MCInstLower::Lower` is used to tranlate MachineInstr to MCInst, which is used to add address-size prefix when we lowering from MIR.
> > So, yes. You're saying exactly what I'm saying. I don't have a change in `printInstFlags` or `printInst` any longer either because like I said, I moved the special casing to `X86MCInstLower::Lower` exactly like you did.
> > No. The two functions are in different stages.  The decoding for  `maskmovdqu/vmaskmovdqu` is already correct if we revert this change, no more fix is needed.  `printInstFlags` or `printInst` is used mostly for disassembler,  so we do need to touch it.  `X86MCInstLower::Lower` is used to tranlate MachineInstr to MCInst, which is used to add address-size prefix when we lowering from MIR.
> 
> so we do need to touch it -> so we don't need to touch it
> Looking at your D122537: so they *do* need to be special cased, it's just that you moved the special casing into `X86MCInstLower::Lower`. I had already come up with that independently as well while cleaning up my own special casing. I have a little bit more than what you put in D122537 though, let me check whether that is still needed.

And I commented before, it's not necessary to add a special case for `maskmovdqu` b/c we can check the implicit operand of the instruction. However, there are so many existing speical cases in `X86MCInstLower::Lower`, I think it's okay to hardcode it.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103427/new/

https://reviews.llvm.org/D103427