[PATCH] D65205: [RISCV] Add Custom Parser for Atomic Memory Operands

Thu Jul 25 03:47:56 PDT 2019

lenary added a comment.
Herald added a subscriber: s.egerton.

In D65205#1599400 <https://reviews.llvm.org/D65205#1599400>, @jrtc27 wrote:

> In D65205#1599193 <https://reviews.llvm.org/D65205#1599193>, @lenary wrote:
>
> > In D65205#1599133 <https://reviews.llvm.org/D65205#1599133>, @jrtc27 wrote:
> >
> > > Yeah this looks like a nice way to do it, better than making an `InstAlias` for each one. GNU as will accept any expression that evaluates to the constant 0 in place of the 0 (i.e. it's just an imm0, same logic as all the other [su]immX fields, other than discarding it). Could we not therefore reuse part of the generic parser, just with a tweak to drop the immediate if parsed and 0? Something like:
> > >
> > > ... snip ...
> >
> >
> > I am worried about this approach, mostly because of the lines like 1268 in parseMemOpBaseReg, where they're pushing on `(`/`)` tokens, which aren't in the instruction operand strings (any more). It does seem like these might complicate the pushing/popping more than necessary. I am also not sure we want to support bare register operands.
>
>
> Is there a reason not to include them in the instruction operand string? It would mean you don't need the custom printer, no?

This is actually really complex. If I include the parens in the tablegen instruction definition, then we definitely cannot support `0(a0)` because the parser requires a `(` where the `0` is. If i don't include them in the instruction definition, then when parseMemBaseOp runs, it adds `(` tokens into the Operands, which means later the register operand is not in the expected place in Operands when operand matching happens.

> 
> 
>> That said, I do understand the value of using parseImmediate, to allow for any value that evaluates to zero.
>> 
>>> You could avoid the pushing and subsequent popping from the vector if you also split `RISCVAsmParser::parseImmediate` (and thus also `RISCVAsmParser::parseOperandWithModifier`) up into everything but the pushing.
>> 
>> This feels intrusive. At the moment I'm happy to refactor to using parseImmediate (but doing all my own register parsing to avoid paren issues), and I'll add a test to make sure it works with more complex expressions.
> 
> Yes, that's another sensible alternative.

Update: I looked into doing this, and it's a lot more complex than I hoped. parseImmediate accepts left parens, and we cannot backtrack through a parse. So, if you write `lr.w a0, (a1)`, parseImmediate tries to parse `(a1)` as an immediate, which succeeds, and then only fails when you try to check it's a constant. I would therefore have to duplicate my memory operand parsing code (which drops parens, because above) to attempt to parse a register first, and then if that fails, backtrack and attempt to parse an immediate then a register. There are other approaches (doing lookahead, and assuming that `(<token>)` must refer to a register, so skipping the immediate parsing if the third token is a right paren) but they all feel really hacky and not wonderful.

For the moment I'm going to leave the code approximately* how it is currently, because I do not see the added value in the extra complexity by using parseImmediate, compared to the added assembly constructs (computed offsets) that it supports.

*= I may have a more robust way of checking the immediate is zero, based on your code and other `isImm*` functions, which I want to polish and add to this patch.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65205/new/

https://reviews.llvm.org/D65205