[llvm-dev] Specifying conditional blocks for the back end
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Mon Mar 13 10:29:08 PDT 2017
On 3/11/2017 4:36 PM, Alex Susu via llvm-dev wrote:
> Hello.
> I wanted to tell you that I managed to codegen correctly the LLVM
> VSELECT instruction by doing the steps described below.
> Can somebody help me with the problems with the
> PredicateInstruction() method I describe below at point 3? Although I
> managed to avoid using PredicateInstruction(), I am curious why it
> doesn't work.
>
> To codegen correctly the LLVM VSELECT instruction (I will be very
> explicit, so bear with me if you have similar issues):
> - 1. I declare in TableGen an instruction WHERE_EQ (I assume
> without loss of generality that VSELECT has a seteq predicate), which
> will implement the VSELECT in terms of my processor's WHERE blocks.
> - 2. in ISelLowering::Lower() I replace the VSELECT with
> WHERE_EQ. (note that before I was generating the entire list of
> MachineSDNode instructions equivalent to VSELECT in
> ISelLowering::Lower(), but the scheduler and the DCE (Dead Code
> Elimination) pass were messing up the order of instructions resulting
> in incorrect semantics). Note that I give to WHERE_EQ as inputs the
> SDNode operands of VSELECT, in order to be able to access them later
> in the PassCreateWhereBlocks pass mentioned below;
>
> - 3. I registered a pass PassCreateWhereBlocks in
> addInstSelector() in [Target]TargetMachine.cpp, which gets executed
> immediately after instruction selection followed by a first scheduling
> phase.
> Even if I predicate in PassCreateWhereBlocks the instructions
> inside the WHERE block, the method PredicateInstruction() fails by
> returning false, which means the method did not add a predicated flag
> to the instructions I wanted to.
PredicateInstruction is a virtual method, and the default implementation
always returns false; your target is supposed to override it.
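To illustrate the point about the virtual method, here is a minimal
self-contained sketch. It does not use the real LLVM headers: ToyInstr,
ToyInstrInfoBase, ToyTargetInstrInfo and the integer PredReg are
hypothetical stand-ins, and the real PredicateInstruction() takes
different argument types. It only shows the shape: a base hook that
reports failure as described above, and a target override that actually
records the predicate.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a machine instruction; NOT llvm::MachineInstr.
struct ToyInstr {
  std::string Opcode;
  bool Predicated = false; // set once a predicate has been attached
  int PredReg = -1;        // hypothetical predicate register number
};

// Toy stand-in for the target-independent instruction info class.
struct ToyInstrInfoBase {
  virtual ~ToyInstrInfoBase() = default;

  // Base hook as described in the reply: it changes nothing and
  // reports failure by returning false.
  virtual bool PredicateInstruction(ToyInstr &MI, int PredReg) const {
    (void)MI;
    (void)PredReg;
    return false;
  }
};

// Hypothetical target that overrides the hook and really predicates.
struct ToyTargetInstrInfo : ToyInstrInfoBase {
  bool PredicateInstruction(ToyInstr &MI, int PredReg) const override {
    assert(PredReg >= 0 && "expected a valid predicate register");
    if (MI.Predicated)
      return false; // already predicated, nothing changed
    MI.Predicated = true;
    MI.PredReg = PredReg;
    return true;
  }
};
```

A pass that calls the hook through the base class only sees a change
when the target supplies such an override; otherwise the false return
value is all it gets.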
> This results, as I said before, in incorrect program optimizations
> such as useful instructions being removed, because the compiler does
> not understand that the code in my WHERE blocks is predicated
> (conditional), so it assumes it is always executed. As a side
> note, I see the ARM and SystemZ back ends are overriding the
> PredicateInstruction() method, but their code is a bit complex and I
> did not bother much to understand how they manage to predicate their
> instructions e.g., for ARM Thumb2 "it" instruction - are there some
> links documenting their work?
Thumb2 models its predicated instructions the same way as non-Thumb ARM
does until very late in the backend. Basically, the predicate is just
an operand of the MachineInstr. But it's a bit simpler because we don't
predicate instructions until after register allocation.
> Therefore I started using bundles instead of making predicated
> instructions - as far as I can see DCE cannot be performed inside
> bundled instructions (see also
> http://llvm.org/docs/doxygen/html/DeadMachineInstructionElim_8cpp_source.html
> which does NOT treat bundles, which implies it is not looking at the
> instruction inside a bundle and can only see the "header" instruction
> of a bundle; therefore, I believe it is safe to bundle instructions to
> avoid DCE, as long as we can at least infer that the "header" instruction
> of the bundle is never going to be DCE-ed). Using bundles also prevents
> the scheduler from changing the order of the bundled instructions. To
> create the bundle I use MIBundleBuilder, since using directly in this
> pass (PassCreateWhereBlocks) the finalizeBundle() method results in an
> error like "llc: /llvm/lib/CodeGen/MachineInstrBundle.cpp:149: void
> llvm::finalizeBundle(llvm::MachineBasicBlock&,
> llvm::MachineBasicBlock::instr_iterator,
> llvm::MachineBasicBlock::instr_iterator): Assertion
> `TargetRegisterInfo::isPhysicalRegister(Reg)' failed."
> So I create for VSELECT pred, Vreg_true, Vreg_false an
> equivalent sequence of MachineInstr:
> // pred is computed before
> R31 = OR Rfalse, Rfalse // copy Rfalse to R31
> WHERE_EQ
> R31 = OR Rtrue, Rtrue // copy Rtrue to R31
> ENDWHERE
>
> Note that I create a physical register (R31, a vector
> register; I also reserve this register in
> [Target]RegisterInfo::getReservedRegs(), to avoid an error which
> sometimes happened due to MachineVerifier.cpp like "Bad machine code:
> Using an undefined physical register"). I cannot use instead of R31 a
> virtual register in PassCreateWhereBlocks (and ISelLowering::Lower())
> since I need to assign to it twice (for both the then and else
> branches of the VSELECT instruction) and virtual registers follow the
> SSA rule of single-assignment (so I get the following error if
> assigning twice to a virtual register: <<MachineRegisterInfo.cpp:339
> [...] "getVRegDef assumes a single definition or no definition"'
> failed.>>). Also I tried without success using
> MachineRegisterInfo::leaveSSA() to avoid this problem with
> single-assignment, but then other passes like MachineLICM will give an
> error in llc like <<MachineLICM.cpp:409: [...] Assertion
> `TargetRegisterInfo::isPhysicalRegister(Reg) && "Not expecting virtual
> register!"' failed.>>, because MachineRegisterInfo::isSSA() returns
> false, which makes the pass assume that register allocation has
> finished and we have only physical registers, which unfortunately is
> NOT the case.
The right way to model this in SSA form would be something like this:
Rresult1 = OR Rfalse, Rfalse
Rresult2 = WHERE_EQ_OR flags, Rresult1, Rtrue, Rtrue
You then tie the two virtual registers together so the register
allocator knows they have to be allocated to the same physical register
(something like `let Constraints = "$Rresult1 = $Rresult2"` in TableGen).
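To make the intended semantics concrete, here is a toy per-lane model in
plain C++, not LLVM code. The lane count, the Vec and Mask types and the
function names are assumptions for illustration. It also assumes that
WHERE_EQ_OR keeps the tied input in lanes where the flag is clear and
writes the OR of its two sources in lanes where the flag is set.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4; // hypothetical vector width
using Vec = std::array<std::uint32_t, kLanes>;
using Mask = std::array<bool, kLanes>;

// Unpredicated lane-wise OR, standing in for "OR Ra, Rb".
inline Vec vecOr(const Vec &A, const Vec &B) {
  Vec R{};
  for (std::size_t I = 0; I < kLanes; ++I)
    R[I] = A[I] | B[I];
  return R;
}

// Assumed semantics of WHERE_EQ_OR: PassThru models the tied input
// (Rresult1). Lanes with a clear flag keep it; lanes with a set flag
// receive A | B.
inline Vec whereEqOr(const Mask &Flags, const Vec &PassThru,
                     const Vec &A, const Vec &B) {
  Vec R = PassThru;
  for (std::size_t I = 0; I < kLanes; ++I)
    if (Flags[I])
      R[I] = A[I] | B[I];
  return R;
}

// The two-instruction SSA sequence from the reply, each value defined once.
inline Vec lowerVSelect(const Mask &Flags, const Vec &RTrue,
                        const Vec &RFalse) {
  const Vec RResult1 = vecOr(RFalse, RFalse);
  const Vec RResult2 = whereEqOr(Flags, RResult1, RTrue, RTrue);
  return RResult2;
}

// Small self-check: the sequence agrees with a plain per-lane select.
inline void selfCheck() {
  const Mask Flags = {{true, false, true, false}};
  const Vec RTrue = {{1, 2, 3, 4}};
  const Vec RFalse = {{10, 20, 30, 40}};
  const Vec R = lowerVSelect(Flags, RTrue, RFalse);
  for (std::size_t I = 0; I < kLanes; ++I)
    assert(R[I] == (Flags[I] ? RTrue[I] : RFalse[I]));
}
```

Because the pass-through value is an explicit operand, each virtual
register has exactly one definition, and the tie is what lets both end
up in one physical register later.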
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project