[cfe-commits] r161703 - in /cfe/trunk: include/clang/AST/Stmt.h lib/AST/Stmt.cpp

Chad Rosier mcrosier at apple.com
Mon Aug 13 09:46:52 PDT 2012


On Aug 13, 2012, at 2:37 AM, Eli Friedman wrote:

> On Mon, Aug 13, 2012 at 2:24 AM, Enea Zaffanella <zaffanella at cs.unipr.it> wrote:
>> On 08/10/2012 11:36 PM, Chad Rosier wrote:
>>> Author: mcrosier
>>> Date: Fri Aug 10 16:36:25 2012
>>> New Revision: 161703
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=161703&view=rev
>>> Log:
>>> [ms-inline asm] Avoid extra allocations by making this an array of StringRefs.
>> [...]
>> 
>> 
>> Hello.
>> 
>> Since you are working on the ms-inline asm statements,
>> we would like to propose a change to the AST representation.
>> 
>> Currently, consecutive MS asm statements are merged into a single
>> MSAsmStmt node.
>> 
>> This choice negatively affects source code fidelity.
>> For instance, there is a single source location for just the first of
>> the possibly many consecutive __asm keywords and there is no faithful
>> tracking of braces.
>> Moreover, it seems to also complicate later code trying to reconstruct
>> the spacing and line-breaking around the asm tokens (namely, the use of
>> array LineEnds and of static function needSpaceAsmToken).
>> 
>> As things are now, when parsing the following:
>> 
>>   __asm {
>>     mov  eax, dword ptr [0]
>>     mov  edx, dword ptr [4]
>>   }
>> 
>> we obtain the string
>>   "mov eax, dword ptr[0]mov edx, dword ptr[4]"
>> with no line break.
>> 
>> 
>> Wouldn't thing be more simple if each __asm keyword is mapped onto its
>> own MSAsmStmt node?
>> (Side note: afawct, this is what happens for non-MS AsmStmt nodes.)
>> 
>> From the source code fidelity perspective, we would get a source
>> location for each __asm keyword and we will be able to properly tell if
>> it is using braces or not.
>> Moreover, the logic for producing the asm string would be simpler too:
>> there will be no need to use an array LineEnds, because you can query
>> each asm token with isAtStartOfLine().
>> As a matter of fact, token method hasLeadingSpace() could replace static
>> function needSpaceAsmToken() to add a white space if there was one in
>> the original sources.
>> 
>> For the sample above, we would thus obtain the string
>>   "mov eax, dword ptr [0]\nmov edx, dword ptr [4]"
>> with both the line break and the extra space before the "[".
> 
> Semantically, there's only one statement; CodeGen, for example, must
> produce a single LLVM IR inline asm.  A series of __asm lines and a
> single __asm{} block have the same meaning; they're basically just
> spelled differently in the source code.  I think it would
> substantially complicate the semantic analysis and CodeGen to be
> forced to handle a series of AST nodes as if they were one.

I strongly agree with Eli.

> Granted, if the current AST nodes don't provide enough source location
> information, we should do something, but I don't think splitting them
> is the right answer.

Again, I agree.

The ms-style inline assembly is still in the very early stages of development and we do intend on improving the support better source location tracking.

Patches are welcome..

 Chad

> 
> -Eli




More information about the cfe-commits mailing list