[llvm-dev] Using target-specific flags register directly in LLVM

Wed Oct 2 04:46:38 PDT 2019

Hi Taras,

On Wed, 2 Oct 2019 at 11:50, Taras Zakharko via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> However, how would one go about implementing something like this within LLVM? I suppose that one can use MachineBasicBlock to emit flag-setting instructions, but does one have a guarantee that LLVM won't insert an instruction afterwards that might change the flag value?

I think the dataflow would have to be represented explicitly somehow
to avoid these issues. I haven't thought in huge detail, but obvious
options are either a new LLVM type (e.g. i1flag) or a new
calling-convention with special behaviour around an i1 in returned
values. The "swifterror" attribute is another possible channel, but
IMO that's a bit of a hack from start to finish so I'd prefer to avoid
it.

Then function lowering could easily convert that into a (brcond
(X86ISD::SETCC C, (CallWithFlagsReturn ...)) TrueBB, FalseBB). The
LowerCall code would spot the special value and insert that SETCC
part, which then fits neatly into the existing LLVM type system at
that stage.

The new CallWithFlagsReturn looks like it would be needed so that
InstrEmitter.cpp (line 932 I think) knows to associate that explicit
dataflow in DAG with the call's first implicitly defined register in
MachineIR (which we'd make sure was EFLAGS)[*]. Once that link is
made, LLVM is pretty used to not clobbering special physical
registers.

Lowering a return would be the same kind of thing in reverse: a CMP on
the real value, feeding data into the RET so that it's recognized as
going through EFLAGS.

> Also, how would one test for the flag value and jump to a LLVM IR block?

With no optimizations from the above you'd get something like:

    call foo
    setc %al
    test %al, %al
    jz wherever

which is not good, but functional. Adding DAG (or other) combines to
optimize the sequence ought to be pretty easy though. After all it's
pretty much the process normal LLVM comparisons go through: start as
an i1 in DAG, get optimized to implicit flags-based dataflow.
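
To make the shape of that combine concrete, here is a toy sketch in
C++. It is not LLVM code and uses no LLVM APIs: instructions are plain
strings, and foldCarryBranch and operandAfter are hypothetical names
made up for illustration. It only shows the rewrite itself: a setc
into a register, a test of that register against itself, and a jz/jnz
collapse into a single jnc/jc on the carry flag. It assumes the
register is dead after the branch, which a real combine would have to
prove.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: instructions are plain strings, not LLVM MachineInstrs.
// Returns the text following `prefix` if `line` starts with it, else "".
static std::string operandAfter(const std::string &line,
                                const std::string &prefix) {
    if (line.compare(0, prefix.size(), prefix) == 0)
        return line.substr(prefix.size());
    return "";
}

// Collapse "setc R; test R, R; jz/jnz L" into "jnc/jc L".
// Assumption (not checked here): R is not read after the branch.
std::vector<std::string> foldCarryBranch(const std::vector<std::string> &in) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (i + 2 < in.size()) {
            const std::string reg = operandAfter(in[i], "setc ");
            const std::string jz = operandAfter(in[i + 2], "jz ");
            const std::string jnz = operandAfter(in[i + 2], "jnz ");
            const bool isTest =
                !reg.empty() && in[i + 1] == "test " + reg + ", " + reg;
            if (isTest && (!jz.empty() || !jnz.empty())) {
                // R == 0 means carry was clear, so jz becomes jnc
                // and jnz becomes jc.
                out.push_back(!jz.empty() ? "jnc " + jz : "jc " + jnz);
                i += 2;  // skip the test and the original branch
                continue;
            }
        }
        out.push_back(in[i]);
    }
    return out;
}
```

Fed the four-instruction sequence above, this leaves the call followed
by a single conditional jump on carry. In LLVM proper the equivalent
would more likely happen on DAG nodes before any of these instructions
exist, rather than as a textual peephole.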

I think Chandler had some other concerns he mentioned, but I'm afraid
I've forgotten what they were (or even whether they applied to this part of
the problem). So assume I might be doing some slightly more vigorous
hand-waving than is entirely justified.

Cheers.

Tim.

[*] Alternatively, since calls *clobber* flags anyway, we might just
get away with using a single call opcode. As long as we ensure it's
compatible with both uses the duplication is arguably just pedantry.