[PATCH] D120887: The [2/2] Patch (Current Patch) revert 2 history commits:

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 3 03:10:12 PST 2022


xiangzhangllvm created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
xiangzhangllvm requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Base on [1/2] D120886 <https://reviews.llvm.org/D120886>

I encounter serval problems about using Global Value (GV) in inline asm.  ( both today and before)
I think it is may be a right time to discuss/fix it.

**Mangler problem with GV used in inline asm**
Currently (and before too) we directly use the GV’s name in inline asm for some cases (I don’t know the early reason, but that doesn’t make sense to me)

**For example** (Mangler problem)

  $cat t.c
  int GArr[]={0,1,3,};
  
  int foo(int x) {
    __asm {mov eax, 0};
    __asm {mov GArr[eax*2+2], 4};
  
    return GArr[x];
  }

Let’s compile it with clang -fasm-blocks -target i686-windows-msvc t.c -S

  15 _foo:                                   # @foo
  16 # %bb.0:                                # %entry
  …
  29         movl    $4, GArr+2(,%eax,2)       # inline asm
  ...
  34         movl    _GArr(,%eax,4), %eax   # read GArr[x] to eax
  ..
  37         retl
  …
  42 _GArr:
  43         .long   0                               # 0x0
  44         .long   1                               # 0x1
  45         .long   3                               # 0x3

Bug: The Mangler didn’t work on the GArr in inline asm, so it can’t rename to _GArr
(Reason: due to we didn’t connected the GArr with its real value in parse process, it just a string name)

**History fix** (build fail)
I find Shengchen and Pheobe has fix the relation problem before.
They once try to fix it by connecting the GArr with its real value in parse process, but quickly revert it. Because meet 2 fails:

**1st Fail**: We can’t build upper code in PIC mode.
Some history patches and relation can be find in D115225 <https://reviews.llvm.org/D115225> . 
because the GArr  in __asm {mov GArr[eax*2+2], 4}; and eax may conflict in index reg in PIC mode, or other similar errors.
For example (after the history fix):
$clang -fasm-blocks t-pic.c -fpic -S

  error: ambiguous operand size for instruction 'mov'
    __asm {mov GArr[eax*2+2], 4};
               ^
  1 error generated.

**Status of current LLVM**: (has relocation problem/ GCC too)
After we revert the history fix (D113096 <https://reviews.llvm.org/D113096>), LLVM will generate such asm for upper test t.c:
$clang -fasm-blocks t-pic.c -fpic -S

    6 foo:                                    # @foo
  …
  20         movq    GArr at GOTPCREL(%rip), %rax
  23         movl    $4, GArr+2(,%eax,2)                   # inline asm   (not right code, should accessed from GOT)
  26         movslq  -4(%rbp), %rcx
  27         movq    GArr at GOTPCREL(%rip), %rax
  28         movl    (%rax,%rcx,4), %eax                 # read GArr[x] to eax
  31         retq
  …
  40 GArr:
  41         .long   0                               # 0x0
  …

Though we successful generate out the code for inline asm, but it is not right code.
Because in PIC mode, the GArr should be accessed from GOT.  This will cause segment fault.

**2nd Fail**: We can't generate PC related code  for address with both Base reg and Index reg. (Even in non-PIC mode)
For example:
// clang -fasm-blocks f2.c -S
$cat f2.c

  int ARR[10];
  int main2() {
          int x;
          ARR[6] = 7;
      __asm {
          mov eax, DWORD PTR ARR [ecx + eax * 4]
      }
  }

will fail due to the RIP conflict with ecx.

**How to fix**
That is what I want to discuss.
For GV in inline asm, I prefer to keep the connection between the GV and its real value, (not just keep a GV name in inline asm).
So the GV can be seen in all the optimization pass.

So 1st step, I will revert the patch D115225 <https://reviews.llvm.org/D115225> and D116090 <https://reviews.llvm.org/D116090>.
And then 2nd step, I will fix the upper bug **Fail 2**.


https://reviews.llvm.org/D120887

Files:
  clang/test/CodeGen/ms-inline-asm-variables.c
  llvm/lib/MC/MCParser/AsmParser.cpp


Index: llvm/lib/MC/MCParser/AsmParser.cpp
===================================================================
--- llvm/lib/MC/MCParser/AsmParser.cpp
+++ llvm/lib/MC/MCParser/AsmParser.cpp
@@ -5944,6 +5944,54 @@
   llvm_unreachable("Unstable rewrite sort.");
 }
 
+// IntelExpr with [BaseReg + Index + ...]
+static bool isSpecialIntelExprAppend(SmallVector<AsmRewrite, 4> &AsmStrRewrites,
+                                     SmallVector<AsmRewrite, 4>::iterator It) {
+  if (It == AsmStrRewrites.end())
+    return false;
+
+  ++It;
+  if (It == AsmStrRewrites.end()) {
+    --It;
+    return false;
+  }
+
+  const AsmRewrite &AR = *It;
+  --It;
+  if (AR.Kind != AOK_IntelExpr)
+    return false;
+
+  if(!AR.IntelExp.hasBaseReg() || !AR.IntelExp.hasIndexReg())
+    return false;
+
+  const AsmRewrite &PreAR = *It;
+  char *Ptr1 = const_cast<char *>(PreAR.Loc.getPointer());
+  char *Ptr2 = const_cast<char *>(AR.Loc.getPointer());
+
+  Ptr1 += PreAR.Len;
+  assert (Ptr1 <= Ptr2 && "Unexpected Asm Sting Pointer!");
+
+  // In most case, there are ARR[..] or ARR [].
+  //                       Ptr1 ^    Ptr1  ^
+  //                            ^ Ptr2      ^ Ptr2
+  // So quickly return true.
+  if (Ptr1 + 1 >= Ptr2)
+    return true;
+
+  // The IntelExpr not append the Pre-Input if they are breaked by instruction(s).
+  // For example:
+  // mov     eax,ARR
+  // call    dword ptr[edx+eax*4]
+  // We can't see them as ARR[edx+eax*4].
+  while (++Ptr1 < Ptr2) {
+    char C = *Ptr1;
+    if (C != ' ' || C != '\t' || C != '\n')
+      return false;
+  }
+
+  return true;
+}
+
 bool AsmParser::parseMSInlineAsm(
     std::string &AsmString, unsigned &NumOutputs, unsigned &NumInputs,
     SmallVectorImpl<std::pair<void *, bool>> &OpDecls,
@@ -6150,7 +6198,12 @@
       OS << Ctx.getAsmInfo()->getPrivateLabelPrefix() << AR.Label;
       break;
     case AOK_Input:
-      OS << '$' << InputIdx++;
+      // The Symbol of IntelExpr input can not transfer to pc related
+      // mode, when the IntelExpr contrains both base reg and index reg.
+      if (isSpecialIntelExprAppend(AsmStrRewrites, it))
+        OS << "${" << InputIdx++ << ":P}";
+      else
+        OS << '$' << InputIdx++;
       break;
     case AOK_CallInput:
       OS << "${" << InputIdx++ << ":P}";
Index: clang/test/CodeGen/ms-inline-asm-variables.c
===================================================================
--- clang/test/CodeGen/ms-inline-asm-variables.c
+++ clang/test/CodeGen/ms-inline-asm-variables.c
@@ -11,7 +11,7 @@
   __asm add ebx, dword ptr gVar[271 - 82 + 81 + ebx]
   // CHECK: add dword ptr ${{[0-9]}}[ebx + $$828], ebx
   __asm add dword ptr [ebx + gVar + 828], ebx
-  // CHECK: add ecx, dword ptr ${{[0-9]}}[ecx + ecx * $$4 + $$4590]
+  // CHECK: add ecx, dword ptr ${{{[0-9]}}:P}[ecx + ecx * $$4 + $$4590]
   __asm add ecx, dword ptr gVar[4590 + ecx + ecx*4]
   // CHECK: add dword ptr ${{[0-9]}}[ecx + ecx * $$8 + $$73], ecx
   __asm add dword ptr [gVar + ecx + 45 + 23 - 53 + 60 - 2 + ecx*8], ecx


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D120887.412653.patch
Type: text/x-patch
Size: 3013 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220303/158ac85d/attachment.bin>


More information about the llvm-commits mailing list