[PATCH] D138639: [X86] Add In64BitMode for MOVSX64/MOVZX64 instructions

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 24 22:52:43 PST 2022


craig.topper added a comment.

In D138639#3950153 <https://reviews.llvm.org/D138639#3950153>, @skan wrote:

> In D138639#3950147 <https://reviews.llvm.org/D138639#3950147>, @craig.topper wrote:
>
>> In D138639#3950146 <https://reviews.llvm.org/D138639#3950146>, @skan wrote:
>>
>>> In D138639#3950140 <https://reviews.llvm.org/D138639#3950140>, @craig.topper wrote:
>>>
>>>> In D138639#3950133 <https://reviews.llvm.org/D138639#3950133>, @HaohaiWen wrote:
>>>>
>>>>>> Is it not possible to use the encoding information in TSFlags rather than going through the assembly parser? Your patches for schedtool seem very tightly coupled to the names of operand classes and other things. It looks like they will require frequent updates.
>>>>>
>>>>> The asm enumeration code and asm matcher patch, as well as the xed patch, are used to build a map between LLVM opcodes <-> Xed info <-> uops.info data / other scheduling-info data sources. Do you have any suggestions for building this map?
>>>>> The IsaSet in the Xed info can also be used to identify whether an LLVM opcode is supported by a specific target. LLVM predicates can't determine that precisely.
>>>>
>>>> I don’t know anything about xed. What does it require?
>>>>
>>>>> Apart from that, I think we'd better fix the incorrect predicates.
>>>>
>>>> Be careful using the word “wrong”. The current implementation was intentional, as it saves space in some generated tables.
>>>
>>> @craig.topper I'm not following. I also see that `mayLoad` and `mayStore` are omitted for x86 instructions with memory operands. When the first operand is memory, the instruction is assumed to both load and store; otherwise it's assumed to load only. Is that intentional, or just an oversight?
>>
>> Assumed where? In tablegen?
>
> llvm/lib/CodeGen/TargetInstrInfo.cpp
>
>     auto Flags = MachineMemOperand::MONone;
>     for (unsigned OpIdx : Ops)
>       Flags |= MI.getOperand(OpIdx).isDef() ? MachineMemOperand::MOStore
>                                             : MachineMemOperand::MOLoad;
>   
>   ...
>   
>       // Add a memory operand, foldMemoryOperandImpl doesn't do that.
>       assert((!(Flags & MachineMemOperand::MOStore) ||
>               NewMI->mayStore()) &&
>              "Folded a def to a non-store!");
>       assert((!(Flags & MachineMemOperand::MOLoad) ||
>               NewMI->mayLoad()) &&
>              "Folded a use to a non-load!");
>
> llvm/lib/Target/X86/X86InstrInfo.cpp
>
>   if (I != nullptr) {
>     unsigned Opcode = I->DstOp;
>     bool FoldedLoad =
>         isTwoAddrFold || (OpNum == 0 && I->Flags & TB_FOLDED_LOAD) || OpNum > 0;
>     bool FoldedStore =
>         isTwoAddrFold || (OpNum == 0 && I->Flags & TB_FOLDED_STORE);

The first does that because we haven't folded the memory operand yet, so the mayLoad/mayStore flags wouldn't be set. `MI` is a register-only instruction at that point. I believe that function is used for folding reloads and spills during register allocation, so using isDef is accurate.
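
To make that concrete, here's a rough MIR-style illustration (the opcodes, virtual registers, and stack slot are invented for the example, not taken from this review). Folding a use produces a reload (load); folding a def produces a spill (store):

  ; Folding a use of %1 yields a reload (MOLoad):
  %2 = ADD32rr %2, %1
    -->
  %2 = ADD32rm %2, %stack.0, 1, $noreg, 0, $noreg

  ; Folding the def %1 yields a spill (MOStore):
  %1 = MOV32rr %0
    -->
  MOV32mr %stack.0, 1, $noreg, 0, $noreg, %0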

The second is taking a shortcut for the isTwoAddrFold case and the OpNum > 0 case. We could add the TB_FOLDED_LOAD and TB_FOLDED_STORE flags to the X86InstrFoldTables.cpp tables. I think I added that code to avoid having to update all of the tables while fixing a bug in D89656 <https://reviews.llvm.org/D89656>. Though similar assumptions exist here:

  X86MemUnfoldTable() {
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable2Addr)
      // Index 0, folded load and store, no alignment requirement.
      addTableEntry(Entry, TB_INDEX_0 | TB_FOLDED_LOAD | TB_FOLDED_STORE);
  
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable0)
      // Index 0, mix of loads and stores.
      addTableEntry(Entry, TB_INDEX_0);
  
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable1)
      // Index 1, folded load
      addTableEntry(Entry, TB_INDEX_1 | TB_FOLDED_LOAD);
  
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable2)
      // Index 2, folded load
      addTableEntry(Entry, TB_INDEX_2 | TB_FOLDED_LOAD);
  
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable3)
      // Index 3, folded load
      addTableEntry(Entry, TB_INDEX_3 | TB_FOLDED_LOAD);
  
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable4)
      // Index 4, folded load
      addTableEntry(Entry, TB_INDEX_4 | TB_FOLDED_LOAD);
  
    // Broadcast tables.
    for (const X86MemoryFoldTableEntry &Entry : BroadcastFoldTable2)
      // Index 2, folded broadcast
      addTableEntry(Entry, TB_INDEX_2 | TB_FOLDED_LOAD | TB_FOLDED_BCAST);
  
    for (const X86MemoryFoldTableEntry &Entry : BroadcastFoldTable3)
      // Index 3, folded broadcast
      addTableEntry(Entry, TB_INDEX_3 | TB_FOLDED_LOAD | TB_FOLDED_BCAST);
  }

All of the flags could be moved into the tables and the second argument to addTableEntry could be removed.
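
For concreteness, a rough sketch of what that end state might look like (just a sketch, not an actual patch; the example table entry is hypothetical):

  // Each entry's Flags field would carry its own TB_INDEX_* and
  // TB_FOLDED_* bits, e.g. a MemoryFoldTable1 entry might read:
  //   { X86::MOVZX64rr8, X86::MOVZX64rm8, TB_INDEX_1 | TB_FOLDED_LOAD },
  X86MemUnfoldTable() {
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable2Addr)
      addTableEntry(Entry); // Flags already include TB_INDEX_0 |
                            // TB_FOLDED_LOAD | TB_FOLDED_STORE.
    for (const X86MemoryFoldTableEntry &Entry : MemoryFoldTable1)
      addTableEntry(Entry); // Flags already include TB_INDEX_1 | TB_FOLDED_LOAD.
    // ... likewise for the remaining fold and broadcast tables, with
    // addTableEntry reduced to a single-argument form.
  }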

My only concern is that it might increase the size of the file and maybe its compile time.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138639/new/

https://reviews.llvm.org/D138639


