[PATCH] D120887: The [2/2] Fix mangle problem when variable used in inline asm (non-rip for ARR[BaseReg+IndexReg+..])

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 6 18:52:13 PST 2022


xiangzhangllvm added inline comments.


================
Comment at: llvm/lib/MC/MCParser/AsmParser.cpp:5990
+      return false;
+  }
+
----------------
xiangzhangllvm wrote:
> pengfei wrote:
> > xiangzhangllvm wrote:
> > > pengfei wrote:
> > > > xiangzhangllvm wrote:
> > > > > efriedma wrote:
> > > > > > If I'm understanding correctly, this is trying to detect two adjacent expressions in a specific form, and then transform one of them so they can be concatenated?  This is way too much magic.  If we aren't parsing expressions correctly, please fix the initial parse; don't fudge the expressions afterwards.
> > > > > I think you understand a part of it.
> > > > > Here the instruction parsing are finished. The Operands info written into AsmStrRewrites[].
> > > > > Let me take a example:
> > > > > Both follow cases will get same Operands info into AsmStrRewrites[i] (ARR) and AsmStrRewrites[i+1] ([edx+eax*4])
> > > > > Case 1: 
> > > > > ```
> > > > >   mov     eax,ARR[edx+eax*4]
> > > > > ```
> > > > > Case 2:
> > > > > ```
> > > > >  mov     eax,ARR ;   call  dword ptr[edx+eax*4]
> > > > > ```
> > > > > So here code (line 5967-5990) is just to distinguish the 2 case.
> > > > > 
> > > > > For Case 1, we can not let ARR be a RIP related address, because it already has base reg and index reg. (We can't use 3 regs for a mem address)
> > > > > For Case 2, it is no problem let ARR use RIP, the [edx+eax*4] are in another instruction.
> > > > I agree with @efriedma and think the use of `:P` is hacky. I wrote an experiment patch which seems more concise D121072.
> > > I have explained that efriedma's understand is not correct. There is not **concatenated **them, just distinguish. PLS refer my previous explain.
> > > > and then transform one of them **so they can be concatenated**? This is way too much magic. If we ...
> > > 
> > > D121072 is still try to replace the Global name in IR/MIR, that is not good way to let follow process see it. (for example Mangler), and no reason do it on codegen. Using Modifer ":P", let all process see it, and let exsiting code to handle it, in my eye, is better.
> > I think you misunderstood efriedma's point. He meant you should distinguish during parsing instead of distinguishing again.
> > Besides, I think you missed the case when ARR inside [], e.g, `[gVar + ecx + ebx]`. I agreed with him the logic is magic and not easy to understand.
> > D121072 intends do the mangling at front end. No following process is needed. I see front end do mangling sometime, so I think we can do it there.
> > D121072 intends do the mangling at front end. No following process is needed. 
> We can't make sure "No following process is needed. "
> One problem I can see is a symbol both used for lea and call. Call may @plt (or some others), but lea not. 
> 
> In fact, D121072 is only the idea of do part print job at codegen. If it can be replaced in MIR why it can be directly repalced in IR ? Just try getSymbol (which will use mangler name)
> That is not a good arch for IR/MIR. All the operands should be visible to IR/MIR.
> 
> 
> Besides, I think you missed the case when ARR inside [], e.g, [gVar + ecx + ebx]
Let me refine the code to cover this case, 
BTW, that is not a good/standard way to get address in this way.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120887/new/

https://reviews.llvm.org/D120887



More information about the llvm-commits mailing list