[LLVMdev] RFC building a target MCAsmParser

Tom Stellard tom at stellard.net
Wed Apr 15 12:58:19 PDT 2015


On Wed, Apr 15, 2015 at 02:22:54PM -0500, Colin LeMahieu wrote:
> One possibility for which we'd be interested in getting feedback is allowing
> a target to fully handle the parsing process.
> 

How many of the problems that you have encountered come from the C++ code that
is generated by TableGen?  If you ignore all the TableGen'd code, does the
MCAssembler interface give you enough freedom to do what you want?

-Tom

>  
> 
> We have a generated parser that can output other compiler IRs and this could
> be changed to output MCInsts.  If we could get the input text stream and an
> output MC stream we could have a target specific way of doing all parsing.
> Perhaps this would be useful to other targets that have difficulty?
> 
>  
> 
> A parser generator isn't distributed with the project so we could publish
> the parser generator input and output.
> 
>  
> 
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
> Behalf Of Colin LeMahieu
> Sent: Tuesday, April 14, 2015 12:59 PM
> To: 'LLVM Developers Mailing List'
> Subject: [LLVMdev] RFC building a target MCAsmParser
> 
>  
> 
> Hi everyone.  We're interested in contributing a Hexagon assembler to MC and
> we're looking for comments on a good way to integrate the grammar into the
> infrastructure.
> 
>  
> 
> We rely on having a robust assembler because we have a large base of
> developers that write in assembly due to low power requirements for mobile
> devices.  We put in some C-like concepts to make the syntax easier and this
> design is fairly well received by users.
> 
>  
> 
> The following is a list of grammar snippets we've had trouble integrating
> into the asm parser framework.
> 
>  
> 
> Instruction packets are optionally enclosed in braces.
> 
>     { r0 = add(r1, r2) r1 = add(r2, r0) }
> 
>  
> 
> A register can be the beginning of a statement.  Register transfers have no
> mnemonic.
> 
>     r0 = r1
> 
>  
> 
> Double registers have a colon in the middle, which can look like a label.
> 
>     r1:0 = add(r3:2, r5:4)
> 
>  
> 
> Predicated variants for many instructions
> 
>     if(p1) r0 = add(r1, r2)
> 
>  
> 
> Dense semantics for DSP applications.  Complex multiply optionally shifting
> result left by 1 with optional rounding and optional saturation
> 
>     r0 = cmpy(r1, r2):<<1:rnd:sat
> 
>  
> 
> Hardware loops ended by optional packet suffix
> 
>     { r0 = r1 }:endloop0:endloop1
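
The snippets above can largely be told apart at the lexer level. What follows is a minimal sketch in Python, not code from the proposal: the token names and regular expressions are assumptions made for illustration, not Hexagon's actual lexical rules. It tries a register-pair rule before the generic rules so that `r1:0` is not read as a label, and treats `:<<1`, `:rnd`, `:sat` and `:endloop0` as modifier tokens.

```python
import re

# Hypothetical token rules, tried in order; earlier rules win.
TOKEN_SPEC = [
    ("REGPAIR",  r"r\d+:\d+"),                # r1:0 is a double register
    ("REG",      r"[rp]\d+\b"),               # r0, p1
    ("MODIFIER", r":(?:<<\d+|[A-Za-z]\w*)"),  # :<<1, :rnd, :sat, :endloop0
    ("IDENT",    r"[A-Za-z_]\w*"),            # add, cmpy, if, label names
    ("NUMBER",   r"\d+"),
    ("PUNCT",    r"[{}()=,:#]"),
    ("SKIP",     r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, text) pairs, dropping whitespace."""
    pos, out = 0, []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            out.append((m.lastgroup, m.group()))
        pos = m.end()
    return out


def kinds(text):
    """Convenience helper: only the token kinds, in order."""
    return [kind for kind, _ in tokenize(text)]
```

In this sketch a modifier must follow its colon without whitespace, so `loop: r0 = r1` still yields a plain `:` after `loop`, while `loop:r0` would be mislexed. A real lexer would need a firmer rule for labels.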
> 
>  
> 
> We found the Hexagon grammar to be straightforward to implement using plain
> lex / parse but harder within the MCTargetAsmParser.
> 
>  
> 
> We were thinking a way to get the grammar to work would involve modifying
> tablegen and the main asm parser loop.  We'd have to make tablegen break
> down each instruction into a sequence of tokens and build a sorted
> matching table based on the set of these sequences.  The matching loop would
> bisect this sorted list looking for a match.  We think existing grammars
> would be unaffected; all existing instructions start with a mnemonic so
> their first token would be an identifier followed by the same sequence of
> tokens they currently have.
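
The sorted-table idea quoted above might look roughly like the sketch below. It is written in Python rather than TableGen-generated C++, and every name in it (the pattern list, the opcode labels such as `ADD_RR`, the operand classes `REG`, `PRED` and `IMM`) is hypothetical. Each instruction is a tuple of tokens, the table is sorted once, and lookup bisects it.

```python
import bisect
import re

# Hypothetical patterns: literal tokens plus operand classes REG/PRED/IMM.
PATTERNS = [
    (("REG", "=", "REG"), "TFR"),
    (("REG", "=", "add", "(", "REG", ",", "REG", ")"), "ADD_RR"),
    (("REG", "=", "add", "(", "REG", ",", "#", "IMM", ")"), "ADD_RI"),
    (("if", "(", "PRED", ")", "REG", "=", "add", "(", "REG", ",", "REG", ")"),
     "ADD_RR_PRED"),
    (("nop",), "NOP"),  # a mnemonic-first pattern fits the same table
]

TABLE = sorted(PATTERNS)           # sorted by token sequence
KEYS = [key for key, _ in TABLE]   # parallel list used for bisection


def classify(tok):
    """Map a concrete token to its operand class, or keep it literal."""
    if re.fullmatch(r"r\d+", tok):
        return "REG"
    if re.fullmatch(r"p\d+", tok):
        return "PRED"
    if tok.isdigit():
        return "IMM"
    return tok


def key_for(text):
    """Turn one statement into the token sequence used as the lookup key."""
    return tuple(classify(t) for t in re.findall(r"[A-Za-z_]\w*|\d+|\S", text))


def match(text):
    """Bisect the sorted table; return the opcode label or None."""
    key = key_for(text)
    i = bisect.bisect_left(KEYS, key)
    if i < len(KEYS) and KEYS[i] == key:
        return TABLE[i][1]
    return None
```

This sketch only handles exact matches after operand classification. Operand classes that overlap (for example an immediate that fits several ranges) would need a range search or a fallback scan rather than a single bisection.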
> 
>  
> 
> Let us know if we're likely to run into any issues making these changes or
> if there are other recommendations on what we could do.  Thanks!
> 
>  
> 
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
> a Linux Foundation Collaborative Project
> 
>  
> 
