[PATCH] Handle tied sub-operands in AsmMatcherEmitter
Jakob Stoklund Olesen
stoklund at 2pi.dk
Thu Mar 28 09:52:56 PDT 2013
Hi Ulrich,
Thanks for the explanation. Jim understands this code better than I do, so I'll let him comment.
I would like to see a solution that can be shared between ARM and PPC.
Thanks,
/jakob
On Mar 28, 2013, at 9:37 AM, Ulrich Weigand <ulrich.weigand at de.ibm.com> wrote:
>
> Jakob,
>
> this is another TableGen patch related to handling pre-inc instruction
> which I need as prerequisite for the PowerPC asm parser.
>
> The problem this patch attempts to fix is the handling of register tie
> constraints in AsmMatcherEmitter, where one operand is tied to a
> sub-operand of another operand. The typical scenario for this to happen is
> the tie between the "write-back" register of a pre-inc instruction, and the
> base register sub-operand of the memory address operand of that
> instruction.
>
> Now from reading the code, it seems that this scenario wasn't ever supposed
> to be properly supported by AsmMatcherEmitter. In fact, there seems to be
> code that at one time may have completely rejected such instructions, but
> now simply emits some warnings (and only when using debug mode). However,
> even while those instructions may no longer be rejected, the code that is
> generated doesn't seem to be correct either.
>
> One underlying cause of the problem is the way tied operands are handled:
> If operand I is tied to J (where I < J), then the actual operand is emitted
> when at location I, and when at location J a copy of what is already at
> location I is emitted. To ensure this can always be done,
> buildInstructionOperandReference "canonicalizes" the assembler string such
> that the "destination" operand (with smaller number) is always visited
> first:
>
> // If the named operand is tied, canonicalize it to the untied operand.
> // For example, something like:
> // (outs GPR:$dst), (ins GPR:$src)
> // with an asmstring of
> // "inc $src"
> // we want to canonicalize to:
> // "inc $dst"
> // so that we know how to provide the $dst operand when filling in the
> result.
>
> This has the effect, in the example given above, that when generating the
> output MI operand list, the assembler string operand originally found at
> the location indicated by "$src" in the pattern will be placed in the MI
> operand indicated by "$dst", and later on, in the MI operand indicated by
> "$src", a copy of "$dst" is placed.
>
> However, if the destination is only *part* of the source operand (e.g. with
> a tie like "$wb = $mem.ptrreg"), this code currently also "canonicalizes"
> the full $mem to $wb. This has the effect that in the output MI operand
> list, in the place of "$wb" *all* the sub-operands indicated by "$mem" are
> placed, and in the place of "$mem", only a single sub-operand is copied.
> This results in a wrong MI output list.
>
> On the other hand, it is necessary to do this canonicalization, since the
> MI operand list is created by appending one operand after the other, so it
> is not possible to perform a "forward tie": when encoding "$wb", we'd have
> to leave a slot open, and only after we've encoded "$mem", we can then go
> back and fill in the copy.
>
>
> One backend currently affected by this problem is ARM, where they work
> around it by either providing custom matchers for pre-inc instructions, or
> else just handling the whole instruction differently via special asm-parser
> only instruction patterns.
>
> To avoid having to duplicate hacks like this, I've come up with the
> attached patch, which fixes at least one part of the problem: if the
> multi-part operand can be handled piecewise by the asm parser, we can still
> do the "canonicalization" trick by moving up just the piece of the operand
> that's being tied. This is implemented by updating the canonicalization
> logic in buildInstructionOperandReference, and modifying
> buildInstructionOperandReference to allow ties to work on sub-operands.
>
> Now, those changes unfortunately affect some of the other cases that still
> cannot be implemented correctly, because the multi-part operand *cannot* be
> handled piecewise (as is the case in the ARM back-end). In those, you now
> see errors of the form "Instruction ... has operand '$wb' that doesn't
> appear in asm string." This is because $wb is no longer incorrectly
> canonicalized ... If those instructions would have actually been used by
> the asm parser, they would have generated wrong MI operand lists. However,
> they aren't (because the use custom matchers or are "shadowed" by
> asm-parser only alias instructions), those errors really are spurious.
>
> One of the cases is simple to fix: if we have a custom matcher, we don't
> need to run through buildInstructionOperandReference at all, since all the
> matching is done by the custom matcher anyway. Thus we can just skip the
> routine and avoid spurious errors.
>
> However, the other case (where the instruction is shadowed by a asm-parser
> only alias) cannot be detected automatically. To avoid a build break, the
> patch demotes the FatalError to a simple Warning, which will now show up on
> those handfull of ARM instruction patterns. To remove those warnings, we
> could either mark the instructions as "isCodeGenOnly" or implement proper
> custom matchers. I'd prefer to leave this up to the ARM back-end
> maintainers, though ...
>
>
> Does this approach look reasonable? I'd appreciate a review of the
> patch ...
>
>
> Thanks,
> Ulrich
>
>
> (See attached file: diff-llvm-tblgen-tie)<diff-llvm-tblgen-tie>
More information about the llvm-commits
mailing list