[llvm-dev] Parse Instruction

Daniel Sanders via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 28 07:53:52 PDT 2015


Would getLexer().isNot(AsmToken::EndOfStatement) <http://code.woboq.org/llvm/llvm/include/llvm/MC/MCParser/MCAsmLexer.h.html#llvm::AsmToken::TokenKind::EndOfStatement> in that condition do the trick? The lexer is already splitting the input at spaces.
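
To make the idea concrete, here is a minimal standalone sketch of that loop. It does not use the real LLVM classes: TokKind, Token, MockLexer, ParsedInstruction and parseInstruction are all hypothetical stand-ins invented for this illustration, and the mock lexer only imitates the behaviour described above (whitespace is dropped, and each statement ends with an EndOfStatement token).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::AsmToken kinds; only the two needed here.
enum class TokKind { Identifier, EndOfStatement };

struct Token {
  TokKind Kind;
  std::string Text;
};

// Hypothetical mock lexer: splits one statement at whitespace, so spaces
// never appear as tokens, and appends an EndOfStatement marker.
class MockLexer {
  std::vector<Token> Toks;
  std::size_t Pos = 0;

public:
  explicit MockLexer(const std::string &Line) {
    std::istringstream In(Line);
    std::string Word;
    while (In >> Word)
      Toks.push_back({TokKind::Identifier, Word});
    Toks.push_back({TokKind::EndOfStatement, ""});
  }

  const Token &getTok() const {
    assert(Pos < Toks.size() && "token position out of range");
    return Toks[Pos];
  }
  bool isNot(TokKind K) const { return getTok().Kind != K; }
  // Advance to the next token, but never move past EndOfStatement.
  void Lex() {
    if (Pos + 1 < Toks.size())
      ++Pos;
  }
};

struct ParsedInstruction {
  std::string Mnemonic;
  std::vector<std::string> Operands;
};

// Collect operands until the end of the statement instead of looking for
// a separator token between them.
ParsedInstruction parseInstruction(const std::string &Line) {
  MockLexer Lexer(Line);
  ParsedInstruction Inst;
  if (Lexer.isNot(TokKind::EndOfStatement)) {
    Inst.Mnemonic = Lexer.getTok().Text;
    Lexer.Lex();
  }
  while (Lexer.isNot(TokKind::EndOfStatement)) {
    Inst.Operands.push_back(Lexer.getTok().Text);
    Lexer.Lex();
  }
  return Inst;
}
```

With this sketch, parseInstruction("add r1 r2 r3") yields the mnemonic "add" and three operands, and parseInstruction("nop") yields none, without the parser knowing the operand count in advance. In a real target each operand would go through the target's own operand parser rather than being stored as a string.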

________________________________
From: llvm-dev [llvm-dev-bounces at lists.llvm.org] on behalf of Sky Flyer via llvm-dev [llvm-dev at lists.llvm.org]
Sent: 28 September 2015 14:41
To: Pierre-Andre Saulais
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Parse Instruction

Hi Pierre-Andre

Thanks for your prompt reply.
What I mean is located at line 4192 (http://code.woboq.org/llvm/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp.html#4192).
It first has to parse the instruction, and then, based on the number of operands, it uses a pattern in MatchAndEmit.
My problem is: what would be a suitable substitute if the operands in the assembly code are space-separated rather than comma-separated? (As you know, spaces are removed automatically, so I cannot simply switch AsmToken::Comma to AsmToken::Space.)

Thanks a lot. :-)




On Mon, Sep 28, 2015 at 3:32 PM, Pierre-Andre Saulais <pierre-andre at codeplay.com> wrote:
Hi ES,

From what I understand, instruction parsing is divided into two parts:

- Parsing an operand list (XXXAsmParser::ParseInstruction)
- Turning the operand list into an actual instruction (XXXAsmParser::MatchAndEmitInstruction)

The second part does the validation (e.g. how many operands, what kind, etc.), while the first part only does the parsing. That's why I think that in the first part you have to handle all possible operand combinations (i.e. parse the first operand, and keep parsing operands as long as you see spaces). LLVM will reject instructions with too many operands (as defined in the .td files).
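
As a rough illustration of that split, here is a small standalone sketch. Nothing in it is LLVM API: parseOperandList, matchInstruction, the Forms table and the mnemonics "neg" and "add" are hypothetical names made up for the example, and the table merely stands in for what the .td files would describe.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Phase 1 (hypothetical parse step): collect the mnemonic and every
// whitespace-separated operand, without checking how many there are.
std::vector<std::string> parseOperandList(const std::string &Line) {
  std::istringstream In(Line);
  std::vector<std::string> Words;
  std::string Word;
  while (In >> Word)
    Words.push_back(Word);
  return Words;
}

// Phase 2 (hypothetical match step): validate the parsed list against the
// operand counts each mnemonic accepts. A mnemonic may have several forms.
bool matchInstruction(const std::vector<std::string> &Words) {
  // Made-up table for illustration only.
  static const std::map<std::string, std::set<std::size_t>> Forms = {
      {"neg", {1, 2}}, // accepts one or two operands
      {"add", {3}},    // accepts exactly three operands
  };
  if (Words.empty())
    return false;
  auto It = Forms.find(Words[0]);
  if (It == Forms.end())
    return false; // unknown mnemonic
  return It->second.count(Words.size() - 1) != 0;
}
```

Here "neg r1" and "neg r1 r2" both match, while "neg r1 r2 r3" is rejected in the second phase, so the parse step never needs a per-mnemonic operand count.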

Is this something that would work with your assembly syntax?

Cheers,
Pierre-Andre


On 28/09/15 14:21, Sky Flyer via llvm-dev wrote:
In practice I cannot use a function such as getMnemonicAcceptInfo (mnemonic as input, number of operands as output), because there are mnemonics that accept different numbers of operands! :-/

Any help is highly appreciated.

On Mon, Sep 28, 2015 at 10:53 AM, Sky Flyer <skylake007 at googlemail.com> wrote:
Hi all,

In most architectures, assembly operands are comma-separated.
I would like to parse assembly code that is space-separated, and I am having a bit of a problem.
In the ParseInstruction function, I don't know the easiest way to figure out how many operands a mnemonic is expected to have.
With comma-separated assembly code, it just consumes commas (while (getLexer().is(AsmToken::Comma))) and adds operands, but that doesn't work for spaces...

I have a dirty hack where I manually provide this information (the number of operands) in a function called, for example, getMnemonicAcceptInfo, and parse the operands with a for loop!!

What would you suggest for parsing space-separated assembly code when it comes to figuring out whether a mnemonic has one operand or two?

Cheers,
ES




_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


